planning - 2025-04-23

VR-based Intervention for Perspective Change: A Case to Investigate Virtual Materiality

Authors:Ali Arya, Anthony Scavarelli, Dan Hawes, Luciara Nardon
Date:2025-04-22 16:53:21

This paper addresses the concept of materiality in virtual environments, which we define as being composed of objects that can influence user experience actively. Such virtual materiality is closely related to its physical counterpart, which is discussed in theoretical frameworks such as sociomateriality and actor-network theory. They define phenomena in terms of the entanglement of human and non-human elements. We report on an early investigation of virtual materiality within the context of reflection and perspective change in nature-based virtual environments. We considered the case of university students reflecting on the planning and management of their theses and major projects. Inspired by nature's known positive cognitive and affective effects and repeated questioning processes, we established a virtual reflection intervention to demonstrate the environmental mechanisms and material characteristics relevant to virtual materiality. Our work is a preliminary step toward understanding virtual materiality and its implications for research and the design of virtual environments.

Monocular inspection of spacecraft under illumination constraints and avoidance regions

Authors:Tochukwu Elijah Ogri, Muzaffar Qureshi, Zachary I. Bell, Matthew Longmire, Rushikesh Kamalapurkar
Date:2025-04-22 14:50:09

This paper presents an adaptive control approach to information-based guidance and control of a spacecraft carrying out on-orbit inspection by actively computing optimal policies for the spacecraft to achieve the best possible representation of objects within its orbital environment. Due to the complexity of navigating the space environment, it may be impossible to carry out on-orbit servicing to maintain space systems like satellites using a spacecraft equipped with controllers that cannot adapt to changing conditions. In particular, the presence of constraints such as illumination, field-of-view (FOV), minimal fuel, the use of visual-inertial navigation for improved localization, and the need for real-time computation of control policies render the spacecraft motion planning problem challenging. The control framework developed in this paper addresses these challenges by formulating the inspection task as a constrained optimization problem where the goal is to maximize information gained from the cameras, while navigating to the next best view, subject to illumination and FOV constraints. The developed architecture is analyzed using a Lyapunov-based stability analysis and the effectiveness of the planning algorithm is verified in simulation.

Bayesian sample size calculations for external validation studies of risk prediction models

Authors:Mohsen Sadatsafavi, Paul Gustafson, Solmaz Setayeshgar, Laure Wynants, Richard Riley
Date:2025-04-22 14:07:05

Summary: Contemporary sample size calculations for external validation of risk prediction models require users to specify fixed values of assumed model performance metrics alongside target precision levels (e.g., 95% CI widths). However, due to the finite samples of previous studies, our knowledge of true model performance in the target population is uncertain, and so choosing fixed values represents an incomplete picture. As well, for net benefit (NB) as a measure of clinical utility, the relevance of conventional precision-based inference is doubtful. In this work, we propose a general Bayesian algorithm for constructing the joint distribution of predicted risks and response values based on summary statistics of model performance in previous studies. For statistical metrics of performance, we propose sample size determination rules that either target desired expected precision, or a desired assurance probability that the precision criteria will be satisfied. For NB, we propose rules based on optimality assurance (the probability that the planned study correctly identifies the most beneficial strategy) and the Expected Value of Sample Information (EVSI), the expected gain in NB from the planned validation study. We showcase these developments in a case study on the validation of a risk prediction model for deterioration of hospitalized COVID-19 patients. Compared to the conventional sample size calculation methods, a Bayesian approach requires explicit quantification of uncertainty around model performance, but thereby enables various sample size rules based on expected precision, assurance probabilities, and value of information.

Bidirectional Task-Motion Planning Based on Hierarchical Reinforcement Learning for Strategic Confrontation

Authors:Qizhen Wu Lei Chen, Kexin Liu, Jinhu Lü
Date:2025-04-22 13:22:58

In swarm robotics, confrontation scenarios, including strategic confrontations, require efficient decision-making that integrates discrete commands and continuous actions. Traditional task and motion planning methods separate decision-making into two layers, but their unidirectional structure fails to capture the interdependence between these layers, limiting adaptability in dynamic environments. Here, we propose a novel bidirectional approach based on hierarchical reinforcement learning, enabling dynamic interaction between the layers. This method effectively maps commands to task allocation and actions to path planning, while leveraging cross-training techniques to enhance learning across the hierarchical framework. Furthermore, we introduce a trajectory prediction model that bridges abstract task representations with actionable planning goals. In our experiments, it achieves over 80\% in confrontation win rate and under 0.01 seconds in decision time, outperforming existing approaches. Demonstrations through large-scale tests and real-world robot experiments further emphasize the generalization capabilities and practical applicability of our method.

An Extended Horizon Tactical Decision-Making for Automated Driving Based on Monte Carlo Tree Search

Authors:Karim Essalmi, Fernando Garrido, Fawzi Nashashibi
Date:2025-04-22 13:11:16

This paper introduces COR-MCTS (Conservation of Resources - Monte Carlo Tree Search), a novel tactical decision-making approach for automated driving focusing on maneuver planning over extended horizons. Traditional decision-making algorithms are often constrained by fixed planning horizons, typically up to 6 seconds for classical approaches and 3 seconds for learning-based methods limiting their adaptability in particular dynamic driving scenarios. However, planning must be done well in advance in environments such as highways, roundabouts, and exits to ensure safe and efficient maneuvers. To address this challenge, we propose a hybrid method integrating Monte Carlo Tree Search (MCTS) with our prior utility-based framework, COR-MP (Conservation of Resources Model for Maneuver Planning). This combination enables long-term, real-time decision-making, significantly enhancing the ability to plan a sequence of maneuvers over extended horizons. Through simulations across diverse driving scenarios, we demonstrate that COR-MCTS effectively improves planning robustness and decision efficiency over extended horizons.

The 2nd MERCADO Workshop at IEEE VIS 2025: Multimodal Experiences for Remote Communication Around Data Online

Authors:Wolfgang Büschel, Gabriela Molina León, Arnaud Prouzeau, Mahmood Jasim, Christophe Hurter, Maxime Cordeil, Matthew Brehmer
Date:2025-04-22 12:54:54

We propose a half-day workshop at IEEE VIS 2025 on addressing the emerging challenges in data-rich multimodal remote collaboration. We focus on synchronous, remote, and hybrid settings where people take part in tasks such as data analysis, decision-making, and presentation. With this workshop, we continue successful prior work from the first MERCADO workshop at VIS 2023 and a 2024 Shonan Seminar that followed. Based on the findings of the earlier events, we invite research and ideas related to four themes of challenges: Tools & Technologies, Individual Differences & Interpersonal Dynamics, AI-assisted Collaboration, and Evaluation. With this workshop, we aim to broaden the community, foster new collaborations, and develop a research agenda to address these challenges in future research. Our planned workshop format is comprised of a keynote, short presentations, a breakout group session, and discussions organized around the identified challenges.

Dynamic Intent Queries for Motion Transformer-based Trajectory Prediction

Authors:Tobias Demmler, Lennart Hartung, Andreas Tamke, Thao Dang, Alexander Hegai, Karsten Haug, Lars Mikelsons
Date:2025-04-22 10:20:35

In autonomous driving, accurately predicting the movements of other traffic participants is crucial, as it significantly influences a vehicle's planning processes. Modern trajectory prediction models strive to interpret complex patterns and dependencies from agent and map data. The Motion Transformer (MTR) architecture and subsequent work define the most accurate methods in common benchmarks such as the Waymo Open Motion Benchmark. The MTR model employs pre-generated static intention points as initial goal points for trajectory prediction. However, the static nature of these points frequently leads to misalignment with map data in specific traffic scenarios, resulting in unfeasible or unrealistic goal points. Our research addresses this limitation by integrating scene-specific dynamic intention points into the MTR model. This adaptation of the MTR model was trained and evaluated on the Waymo Open Motion Dataset. Our findings demonstrate that incorporating dynamic intention points has a significant positive impact on trajectory prediction accuracy, especially for predictions over long time horizons. Furthermore, we analyze the impact on ground truth trajectories which are not compliant with the map data or are illegal maneuvers.

Comparative Analysis of Evolutionary Algorithms for Energy-Aware Production Scheduling

Authors:Sascha C Burmeister, Till N Rogalski, Guido Schryen
Date:2025-04-22 07:54:05

The energy transition is driving rapid growth in renewable energy generation, creating the need to balance energy supply and demand with energy price awareness. One such approach for manufacturers to balance their energy demand with available energy is energyaware production planning. Through energy-aware production planning, manufacturers can align their energy demand with dynamic grid conditions, supporting renewable energy integration while benefiting from lower prices and reduced emissions. Energy-aware production planning can be modeled as a multi-criteria scheduling problem, where the objectives extend beyond traditional metrics like makespan or required workers to also include minimizing energy costs and emissions. Due to market dynamics and the NP-hard multi-objective nature of the problem, evolutionary algorithms are widely used for energy-aware scheduling. However, existing research focuses on the design and analysis of single algorithms, with limited comparisons between different approaches. In this study, we adapt NSGA-III, HypE, and $\theta$-DEA as memetic metaheuristics for energy-aware scheduling to minimize makespan, energy costs, emissions, and the number of workers, within a real-time energy market context. These adapted metaheuristics present different approaches for environmental selection. In a comparative analysis, we explore differences in solution efficiency and quality across various scenarios which are based on benchmark instances from the literature and real-world energy market data. Additionally, we estimate upper bounds on the distance between objective values obtained with our memetic metaheuristics and reference sets obtained via an exact solver.

Exploring Inevitable Waypoints for Unsolvability Explanation in Hybrid Planning Problems

Authors:Mir Md Sajid Sarwar, Rajarshi Ray
Date:2025-04-22 07:45:30

Explaining unsolvability of planning problems is of significant research interest in Explainable AI Planning. AI planning literature has reported several research efforts on generating explanations of solutions to planning problems. However, explaining the unsolvability of planning problems remains a largely open and understudied problem. A widely practiced approach to plan generation and automated problem solving, in general, is to decompose tasks into sub-problems that help progressively converge towards the goal. In this paper, we propose to adopt the same philosophy of sub-problem identification as a mechanism for analyzing and explaining unsolvability of planning problems in hybrid systems. In particular, for a given unsolvable planning problem, we propose to identify common waypoints, which are universal obstacles to plan existence; in other words, they appear on every plan from the source to the planning goal. This work envisions such waypoints as sub-problems of the planning problem and the unreachability of any of these waypoints as an explanation for the unsolvability of the original planning problem. We propose a novel method of waypoint identification by casting the problem as an instance of the longest common subsequence problem, a widely popular problem in computer science, typically considered as an illustrative example for the dynamic programming paradigm. Once the waypoints are identified, we perform symbolic reachability analysis on them to identify the earliest unreachable waypoint and report it as the explanation of unsolvability. We present experimental results on unsolvable planning problems in hybrid domains.

An ACO-MPC Framework for Energy-Efficient and Collision-Free Path Planning in Autonomous Maritime Navigation

Authors:Yaoze Liu, Zhen Tian, Qifan Zhou, Zixuan Huang, Hongyu Sun
Date:2025-04-22 06:09:54

Automated driving on ramps presents significant challenges due to the need to balance both safety and efficiency during lane changes. This paper proposes an integrated planner for automated vehicles (AVs) on ramps, utilizing an unsatisfactory level metric for efficiency and arrow-cluster-based sampling for safety. The planner identifies optimal times for the AV to change lanes, taking into account the vehicle's velocity as a key factor in efficiency. Additionally, the integrated planner employs arrow-cluster-based sampling to evaluate collision risks and select an optimal lane-changing curve. Extensive simulations were conducted in a ramp scenario to verify the planner's efficient and safe performance. The results demonstrate that the proposed planner can effectively select an appropriate lane-changing time point and a safe lane-changing curve for AVs, without incurring any collisions during the maneuver.

Antenna Technology Readiness for the Black Hole Explorer (BHEX) Mission

Authors:T. K. Sridharan, R. Lehmensiek, S. Schwarz, D. P. Marrone
Date:2025-04-22 03:30:32

The Black Hole Explorer (BHEX) will be the first sub-mm wavelength Space Very-Long-Baseline Interferometry (VLBI) mission. It targets astronomical imaging with the highest ever spatial resolution to enable detection of the photon ring of a supermassive black hole. BHEX is being proposed for launch in 2031 as a NASA Small Explorers mission. BHEX science goals and mission opportunity require a high precision lightweight spaceborne antenna. A survey of the technology landscape for realizing such an antenna is presented. Technology readiness (TRL) for the antenna is discussed and assessed to be at TRL 5. An update on our technology maturation efforts is provided. Design studies leading to the conceptual design of a metallized carbon fiber reinforced plastic (CFRP) technology based antenna with a mass of only $\dim 50$ kg, incorporating a 3.4 m primary reflector with a surface precision of < 40 $\mu$m to allow efficient operation up to 320 GHz are outlined. Current plans anticipate attaining TRL6 in 2026 for the BHEX antenna. Completed design studies point to a large margin in surface precision which opens up opportunities for applications beyond BHEX, at significantly higher THz frequencies.

MRTA-Sim: A Modular Simulator for Multi-Robot Allocation, Planning, and Control in Open-World Environments

Authors:Victoria Marie Tuck, Hardik Parwana, Pei-Wei Chen, Georgios Fainekos, Bardh Hoxha, Hideki Okamoto, S. Shankar Sastry, Sanjit A. Seshia
Date:2025-04-21 20:03:07

This paper introduces MRTA-Sim, a Python/ROS2/Gazebo simulator for testing approaches to Multi-Robot Task Allocation (MRTA) problems on simulated robots in complex, indoor environments. Grid-based approaches to MRTA problems can be too restrictive for use in complex, dynamic environments such in warehouses, department stores, hospitals, etc. However, approaches that operate in free-space often operate at a layer of abstraction above the control and planning layers of a robot and make an assumption on approximate travel time between points of interest in the system. These abstractions can neglect the impact of the tight space and multi-agent interactions on the quality of the solution. Therefore, MRTA solutions should be tested with the navigation stacks of the robots in mind, taking into account robot planning, conflict avoidance between robots, and human interaction and avoidance. This tool connects the allocation output of MRTA solvers to individual robot planning using the NAV2 stack and local, centralized multi-robot deconfliction using Control Barrier Function-Quadrtic Programs (CBF-QPs), creating a platform closer to real-world operation for more comprehensive testing of these approaches. The simulation architecture is modular so that users can swap out methods at different levels of the stack. We show the use of our system with a Satisfiability Modulo Theories (SMT)-based approach to dynamic MRTA on a fleet of indoor delivery robots.

Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs

Authors:Chun-Hsiao Yeh, Chenyu Wang, Shengbang Tong, Ta-Ying Cheng, Rouyu Wang, Tianzhe Chu, Yuexiang Zhai, Yubei Chen, Shenghua Gao, Yi Ma
Date:2025-04-21 17:59:53

Multi-view understanding, the ability to reconcile visual information across diverse viewpoints for effective navigation, manipulation, and 3D scene comprehension, is a fundamental challenge in Multi-Modal Large Language Models (MLLMs) to be used as embodied agents. While recent MLLMs have shown impressive advances in high-level reasoning and planning, they frequently fall short when confronted with multi-view geometric consistency and cross-view correspondence. To comprehensively evaluate the challenges of MLLMs in multi-view scene reasoning, we propose All-Angles Bench, a benchmark of over 2,100 human carefully annotated multi-view question-answer pairs across 90 diverse real-world scenes. Our six tasks (counting, attribute identification, relative distance, relative direction, object manipulation, and camera pose estimation) specifically test model's geometric correspondence and the capacity to align information consistently across views. Our extensive experiments, benchmark on 27 representative MLLMs including Gemini-2.0-Flash, Claude-3.7-Sonnet, and GPT-4o against human evaluators reveals a substantial performance gap, indicating that current MLLMs remain far from human-level proficiency. Through in-depth analysis, we show that MLLMs are particularly underperforming under two aspects: (1) cross-view correspondence for partially occluded views and (2) establishing the coarse camera poses. These findings highlight the necessity of domain-specific refinements or modules that embed stronger multi-view awareness. We believe that our All-Angles Bench offers valuable insights and contribute to bridging the gap between MLLMs and human-level multi-view understanding. The project and benchmark are publicly available at https://danielchyeh.github.io/All-Angles-Bench/.

Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction

Authors:Vaishnavh Nagarajan, Chen Henry Wu, Charles Ding, Aditi Raghunathan
Date:2025-04-21 17:47:46

We design a suite of minimal algorithmic tasks that are a loose abstraction of open-ended real-world tasks. This allows us to cleanly and controllably quantify the creative limits of the present-day language model. Much like real-world tasks that require a creative, far-sighted leap of thought, our tasks require an implicit, open-ended stochastic planning step that either (a) discovers new connections in an abstract knowledge graph (like in wordplay, drawing analogies, or research) or (b) constructs new patterns (like in designing math problems or new proteins). In these tasks, we empirically and conceptually argue how next-token learning is myopic and memorizes excessively; comparatively, multi-token approaches, namely teacherless training and diffusion models, excel in producing diverse and original output. Secondly, in our tasks, we find that to elicit randomness from the Transformer without hurting coherence, it is better to inject noise right at the input layer (via a method we dub hash-conditioning) rather than defer to temperature sampling from the output layer. Thus, our work offers a principled, minimal test-bed for analyzing open-ended creative skills, and offers new arguments for going beyond next-token learning and softmax-based sampling. We make part of the code available under https://github.com/chenwu98/algorithmic-creativity

Scalable Discrete Event Simulation Tool for Large-Scale Cyber-Physical Energy Systems: Advancing System Efficiency and Scalability

Authors:Khandaker Akramul Haque, Shining Sun, Xiang Huo, Ana E. Goulart, Katherine R. Davis
Date:2025-04-21 16:14:45

Modern power systems face growing risks from cyber-physical attacks, necessitating enhanced resilience due to their societal function as critical infrastructures. The challenge is that defense of large-scale systems-of-systems requires scalability in their threat and risk assessment environment for cyber physical analysis including cyber-informed transmission planning, decision-making, and intrusion response. Hence, we present a scalable discrete event simulation tool for analysis of energy systems, called DESTinE. The tool is tailored for largescale cyber-physical systems, with a focus on power systems. It supports faster-than-real-time traffic generation and models packet flow and congestion under both normal and adversarial conditions. Using three well-established power system synthetic cases with 500, 2000, and 10,000 buses, we overlay a constructed cyber network employing star and radial topologies. Experiments are conducted to identify critical nodes within a communication network in response to a disturbance. The findings are incorporated into a constrained optimization problem to assess the impact of the disturbance on a specific node and its cascading effects on the overall network. Based on the solution of the optimization problem, a new hybrid network topology is also derived, combining the strengths of star and radial structures to improve network resilience. Furthermore, DESTinE is integrated with a virtual server and a hardware-in-the-loop (HIL) system using Raspberry Pi 5.

Neural ATTF: A Scalable Solution to Lifelong Multi-Agent Path Planning

Authors:Kushal Shah, Jihyun Park, Seung-Kyum Choi
Date:2025-04-21 14:25:32

Multi-Agent Pickup and Delivery (MAPD) is a fundamental problem in robotics, particularly in applications such as warehouse automation and logistics. Existing solutions often face challenges in scalability, adaptability, and efficiency, limiting their applicability in dynamic environments with real-time planning requirements. This paper presents Neural ATTF (Adaptive Task Token Framework), a new algorithm that combines a Priority Guided Task Matching (PGTM) Module with Neural STA* (Space-Time A*), a data-driven path planning method. Neural STA* enhances path planning by enabling rapid exploration of the search space through guided learned heuristics and ensures collision avoidance under dynamic constraints. PGTM prioritizes delayed agents and dynamically assigns tasks by prioritizing agents nearest to these tasks, optimizing both continuity and system throughput. Experimental evaluations against state-of-the-art MAPD algorithms, including TPTS, CENTRAL, RMCA, LNS-PBS, and LNS-wPBS, demonstrate the superior scalability, solution quality, and computational efficiency of Neural ATTF. These results highlight the framework's potential for addressing the critical demands of complex, real-world multi-agent systems operating in high-demand, unpredictable settings.

A General Infrastructure and Workflow for Quadrotor Deep Reinforcement Learning and Reality Deployment

Authors:Kangyao Huang, Hao Wang, Yu Luo, Jingyu Chen, Jintao Chen, Xiangkui Zhang, Xiangyang Ji, Huaping Liu
Date:2025-04-21 14:25:23

Deploying robot learning methods to a quadrotor in unstructured outdoor environments is an exciting task. Quadrotors operating in real-world environments by learning-based methods encounter several challenges: a large amount of simulator generated data required for training, strict demands for real-time processing onboard, and the sim-to-real gap caused by dynamic and noisy conditions. Current works have made a great breakthrough in applying learning-based methods to end-to-end control of quadrotors, but rarely mention the infrastructure system training from scratch and deploying to reality, which makes it difficult to reproduce methods and applications. To bridge this gap, we propose a platform that enables the seamless transfer of end-to-end deep reinforcement learning (DRL) policies. We integrate the training environment, flight dynamics control, DRL algorithms, the MAVROS middleware stack, and hardware into a comprehensive workflow and architecture that enables quadrotors' policies to be trained from scratch to real-world deployment in several minutes. Our platform provides rich types of environments including hovering, dynamic obstacle avoidance, trajectory tracking, balloon hitting, and planning in unknown environments, as a physical experiment benchmark. Through extensive empirical validation, we demonstrate the efficiency of proposed sim-to-real platform, and robust outdoor flight performance under real-world perturbations. Details can be found from our website https://emnavi.tech/AirGym/.

Optimal Behavior Planning for Implicit Communication using a Probabilistic Vehicle-Pedestrian Interaction Model

Authors:Markus Amann, Malte Probst, Raphael Wenzel, Thomas H. Weisswange, Miguel Ángel Sotelo
Date:2025-04-21 13:38:37

In interactions between automated vehicles (AVs) and crossing pedestrians, modeling implicit vehicle communication is crucial. In this work, we present a combined prediction and planning approach that allows to consider the influence of the planned vehicle behavior on a pedestrian and predict a pedestrian's reaction. We plan the behavior by solving two consecutive optimal control problems (OCPs) analytically, using variational calculus. We perform a validation step that assesses whether the planned vehicle behavior is adequate to trigger a certain pedestrian reaction, which accounts for the closed-loop characteristics of prediction and planning influencing each other. In this step, we model the influence of the planned vehicle behavior on the pedestrian using a probabilistic behavior acceptance model that returns an estimate for the crossing probability. The probabilistic modeling of the pedestrian reaction facilitates considering the pedestrian's costs, thereby improving cooperative behavior planning. We demonstrate the performance of the proposed approach in simulated vehicle-pedestrian interactions with varying initial settings and highlight the decision making capabilities of the planning approach.

Robust Planning and Control of Omnidirectional MRAVs for Aerial Communications in Wireless Networks

Authors:Giuseppe Silano, Daniel Bonilla Licea, Hajar El Hammouti, Mounir Ghogho, and Martin Saska
Date:2025-04-21 13:23:48

A new class of Multi-Rotor Aerial Vehicles (MRAVs), known as omnidirectional MRAVs (o-MRAVs), has gained attention for their ability to independently control 3D position and orientation. This capability enhances robust planning and control in aerial communication networks, enabling more adaptive trajectory planning and precise antenna alignment without additional mechanical components. These features are particularly valuable in uncertain environments, where disturbances such as wind and interference affect communication stability. This paper examines o-MRAVs in the context of robust aerial network planning, comparing them with the more common under-actuated MRAVs (u-MRAVs). Key applications, including physical layer security, optical communications, and network densification, are highlighted, demonstrating the potential of o-MRAVs to improve reliability and efficiency in dynamic communication scenarios.

Expected Free Energy-based Planning as Variational Inference

Authors:Bert de Vries, Wouter Nuijten, Thijs van de Laar, Wouter Kouw, Sepideh Adamiat, Tim Nisslbeck, Mykola Lukashchuk, Hoang Minh Huu Nguyen, Marco Hidalgo Araya, Raphael Tresor, Thijs Jenneskens, Ivana Nikoloska, Raaja Subramanian, Bart van Erp, Dmitry Bagaev, Albert Podusenko
Date:2025-04-21 07:09:05

We address the problem of planning under uncertainty, where an agent must choose actions that not only achieve desired outcomes but also reduce uncertainty. Traditional methods often treat exploration and exploitation as separate objectives, lacking a unified inferential foundation. Active inference, grounded in the Free Energy Principle, offers such a foundation by minimizing Expected Free Energy (EFE), a cost function that combines utility with epistemic drives like ambiguity resolution and novelty seeking. However, the computational burden of EFE minimization has remained a major obstacle to its scalability. In this paper, we show that EFE-based planning arises naturally from minimizing a variational free energy functional on a generative model augmented with preference and epistemic priors. This result reinforces theoretical consistency with the Free Energy Principle, by casting planning itself as variational inference. Our formulation yields optimal policies that jointly support goal achievement and information gain, while incorporating a complexity term that accounts for bounded computational resources. This unifying framework connects and extends existing methods, enabling scalable, resource-aware implementations of active inference agents.

Never too Cocky to Cooperate: An FIM and RL-based USV-AUV Collaborative System for Underwater Tasks in Extreme Sea Conditions

Authors:Jingzehua Xu, Guanwen Xie, Jiwei Tang, Yimian Ding, Weiyi Liu, Shuai Zhang, Yi Li
Date:2025-04-21 06:47:46

This paper develops a novel unmanned surface vehicle (USV)-autonomous underwater vehicle (AUV) collaborative system designed to enhance underwater task performance in extreme sea conditions. The system integrates a dual strategy: (1) high-precision multi-AUV localization enabled by Fisher information matrix-optimized USV path planning, and (2) reinforcement learning-based cooperative planning and control method for multi-AUV task execution. Extensive experimental evaluations in the underwater data collection task demonstrate the system's operational feasibility, with quantitative results showing significant performance improvements over baseline methods. The proposed system exhibits robust coordination capabilities between USV and AUVs while maintaining stability in extreme sea conditions. To facilitate reproducibility and community advancement, we provide an open-source simulation toolkit available at: https://github.com/360ZMEM/USV-AUV-colab .

FERMI: Flexible Radio Mapping with a Hybrid Propagation Model and Scalable Autonomous Data Collection

Authors:Yiming Luo, Yunfei Wang, Hongming Chen, Chengkai Wu, Ximin Lyu, Jinni Zhou, Jun Ma, Fu Zhang, Boyu Zhou
Date:2025-04-21 05:03:19

Communication is fundamental for multi-robot collaboration, with accurate radio mapping playing a crucial role in predicting signal strength between robots. However, modeling radio signal propagation in large and occluded environments is challenging due to complex interactions between signals and obstacles. Existing methods face two key limitations: they struggle to predict signal strength for transmitter-receiver pairs not present in the training set, while also requiring extensive manual data collection for modeling, making them impractical for large, obstacle-rich scenarios. To overcome these limitations, we propose FERMI, a flexible radio mapping framework. FERMI combines physics-based modeling of direct signal paths with a neural network to capture environmental interactions with radio signals. This hybrid model learns radio signal propagation more efficiently, requiring only sparse training data. Additionally, FERMI introduces a scalable planning method for autonomous data collection using a multi-robot team. By increasing parallelism in data collection and minimizing robot travel costs between regions, overall data collection efficiency is significantly improved. Experiments in both simulation and real-world scenarios demonstrate that FERMI enables accurate signal prediction and generalizes well to unseen positions in complex environments. It also supports fully autonomous data collection and scales to different team sizes, offering a flexible solution for creating radio maps. Our code is open-sourced at https://github.com/ymLuo1214/Flexible-Radio-Mapping.

Statistical Design and Planning of an Adaptive Trial using Hierarchical Composite Outcomes: A Practical example

Authors:Krishna Padmanabhan, Cyrus Mehta
Date:2025-04-20 21:41:29

Hierarchical composite endpoints, such as those analyzed using the Finkelstein-Schoenfeld (FS) statistic, are increasingly used in clinical trials for their ability to incorporate clinically prioritized outcomes. However, adaptive design methods for these endpoints remain underdeveloped. This paper presents a practical framework for implementing sample size re-estimation (SSR) in trials using hierarchical composites, motivated by a cardiovascular trial with mortality, hospitalization, and a functional response as prioritized endpoints. We use a two-stage adaptive design with a single interim analysis for illustration. The interim analysis incorporates predictive probabilities to determine whether the trial should stop for futility, continue as planned, or increase the sample size to maintain power. The decision framework is based on predefined zones for predictive probability, with corresponding adjustments to the stage 2 sample size. Simulation studies across various treatment scenarios demonstrate strong type I error control and increased power compared to a fixed design, particularly for treatment effects that are clinically relevant but lower than the alternative hypothesis. We also explore an alternative conditional power approach for SSR, offering further sample size optimization. Our results support the use of SSR with hierarchical composite outcomes using an FS statistic, enhancing trial efficiency.

Adaptive Field Effect Planner for Safe Interactive Autonomous Driving on Curved Roads

Authors:Qinghao Li, Zhen Tian, Xiaodan Wang, Jinming Yang, Zhihao Lin
Date:2025-04-20 21:41:19

Autonomous driving has garnered significant attention for its potential to improve safety, traffic efficiency, and user convenience. However, the dynamic and complex nature of interactive driving poses significant challenges, including the need to navigate non-linear road geometries, handle dynamic obstacles, and meet stringent safety and comfort requirements. Traditional approaches, such as artificial potential fields (APF), often fall short in addressing these complexities independently, necessitating the development of integrated and adaptive frameworks. This paper presents a novel approach to autonomous vehicle navigation that integrates artificial potential fields, Frenet coordinates, and improved particle swarm optimization (IPSO). A dynamic risk field, adapted from traditional APF, is proposed to ensure interactive safety by quantifying risks and dynamically adjusting lane-changing intentions based on surrounding vehicle behavior. Frenet coordinates are utilized to simplify trajectory planning on non-straight roads, while an enhanced quintic polynomial trajectory generator ensures smooth and comfortable path transitions. Additionally, an IPSO algorithm optimizes trajectory selection in real time, balancing safety and user comfort within a feasible input range. The proposed framework is validated through extensive simulations and real-world scenarios, demonstrating its ability to navigate complex traffic environments, maintain safety margins, and generate smooth, dynamically feasible trajectories.

Med-2D SegNet: A Light Weight Deep Neural Network for Medical 2D Image Segmentation

Authors:Md. Sanaullah Chowdhury, Salauddin Tapu, Noyon Kumar Sarkar, Ferdous Bin Ali, Lameya Sabrin
Date:2025-04-20 19:04:43

Accurate and efficient medical image segmentation is crucial for advancing clinical diagnostics and surgical planning, yet remains a complex challenge due to the variability in anatomical structures and the demand for low-complexity models. In this paper, we introduced Med-2D SegNet, a novel and highly efficient segmentation architecture that delivers outstanding accuracy while maintaining a minimal computational footprint. Med-2D SegNet achieves state-of-the-art performance across multiple benchmark datasets, including KVASIR-SEG, PH2, EndoVis, and GLAS, with an average Dice similarity coefficient (DSC) of 89.77% across 20 diverse datasets. Central to its success is the compact Med Block, a specialized encoder design that incorporates dimension expansion and parameter reduction, enabling precise feature extraction while keeping model parameters to a low count of just 2.07 million. Med-2D SegNet excels in cross-dataset generalization, particularly in polyp segmentation, where it was trained on KVASIR-SEG and showed strong performance on unseen datasets, demonstrating its robustness in zero-shot learning scenarios, even though we acknowledge that further improvements are possible. With top-tier performance in both binary and multi-class segmentation, Med-2D SegNet redefines the balance between accuracy and efficiency, setting a new benchmark for medical image analysis. This work paves the way for developing accessible, high-performance diagnostic tools suitable for clinical environments and resource-constrained settings, making it a step forward in the democratization of advanced medical technology.

Exposing the Copycat Problem of Imitation-based Planner: A Novel Closed-Loop Simulator, Causal Benchmark and Joint IL-RL Baseline

Authors:Hui Zhou, Shaoshuai Shi, Hongsheng Li
Date:2025-04-20 18:51:26

Machine learning (ML)-based planners have recently gained significant attention. They offer advantages over traditional optimization-based planning algorithms. These advantages include fewer manually selected parameters and faster development. Within ML-based planning, imitation learning (IL) is a common algorithm. It primarily learns driving policies directly from supervised trajectory data. While IL has demonstrated strong performance on many open-loop benchmarks, it remains challenging to determine if the learned policy truly understands fundamental driving principles, rather than simply extrapolating from the ego-vehicle's initial state. Several studies have identified this limitation and proposed algorithms to address it. However, these methods often use original datasets for evaluation. In these datasets, future trajectories are heavily dependent on initial conditions. Furthermore, IL often overfits to the most common scenarios. It struggles to generalize to rare or unseen situations. To address these challenges, this work proposes: 1) a novel closed-loop simulator supporting both imitation and reinforcement learning, 2) a causal benchmark derived from the Waymo Open Dataset to rigorously assess the impact of the copycat problem, and 3) a novel framework integrating imitation learning and reinforcement learning to overcome the limitations of purely imitative approaches. The code for this work will be released soon.

A Complete and Bounded-Suboptimal Algorithm for a Moving Target Traveling Salesman Problem with Obstacles in 3D

Authors:Anoop Bhat, Geordan Gutow, Bhaskar Vundurthy, Zhongqiang Ren, Sivakumar Rathinam, Howie Choset
Date:2025-04-20 16:51:29

The moving target traveling salesman problem with obstacles (MT-TSP-O) seeks an obstacle-free trajectory for an agent that intercepts a given set of moving targets, each within specified time windows, and returns to the agent's starting position. Each target moves with a constant velocity within its time windows, and the agent has a speed limit no smaller than any target's speed. We present FMC*-TSP, the first complete and bounded-suboptimal algorithm for the MT-TSP-O, and results for an agent whose configuration space is $\mathbb{R}^3$. Our algorithm interleaves a high-level search and a low-level search, where the high-level search solves a generalized traveling salesman problem with time windows (GTSP-TW) to find a sequence of targets and corresponding time windows for the agent to visit. Given such a sequence, the low-level search then finds an associated agent trajectory. To solve the low-level planning problem, we develop a new algorithm called FMC*, which finds a shortest path on a graph of convex sets (GCS) via implicit graph search and pruning techniques specialized for problems with moving targets. We test FMC*-TSP on 280 problem instances with up to 40 targets and demonstrate its smaller median runtime than a baseline based on prior work.

Virtual Reality for Urban Walkability Assessment

Authors:Viet Hung Pham, Malte Wagenfeld, Regina Bernhaupt
Date:2025-04-20 12:05:16

Traditional urban planning methodologies often fail to capture the complexity of contemporary urbanization and environmental sustainability challenges. This study investigates the integration of Generative Design, Virtual Reality (VR), and Digital Twins (DT) to enhance walkability in urban planning. VR provides distinct benefits over conventional approaches, including 2D maps, static renderings, and physical models, by allowing stakeholders to engage with urban designs more intuitively, identify walkability challenges, and suggest iterative improvements. Preliminary findings from structured interviews with Eindhoven residents provide critical insights into pedestrian preferences and walkability considerations. The next phase of the study involves the development of VR-DT integrated prototypes to simulate urban environments, assess walkability, and explore the role of Generative Design in generating adaptive urban planning solutions. The objective is to develop a decision-support tool that enables urban planners to incorporate diverse stakeholder perspectives, optimize pedestrian-oriented urban design, and advance regenerative development principles. By leveraging these emerging technologies, this research contributes to the evolution of data-driven, participatory urban planning frameworks aimed at fostering sustainable and walkable cities.

SkyNetPredictor: Network Performance Prediction in Avionic Communication using AI

Authors:Hind Mukhtar, Raymond Schaub, Melike Erol-Kantarci
Date:2025-04-20 01:28:04

Satellite-based communication systems are integral to delivering high-speed data services in aviation, particularly for business aviation operations requiring global connectivity. These systems, however, are challenged by a multitude of interdependent factors such as satellite handovers, congestion, flight maneuvers and seasonal trends, making network performance prediction a complex task. No established methodologies currently exist for network performance prediction in avionic communication systems. This paper addresses the gap by proposing machine learning (ML)-based approaches for pre-flight network performance predictions. The proposed models predict performance along a given flight path, taking as input positional and network-related information and outputting the predicted performance for each position. In business aviation, flight crews typically have multiple flight plans to choose from for each city pair, allowing them to select the most optimal option. This approach enables proactive decision-making, such as selecting optimal flight paths prior to departure.

Toward Generation of Test Cases from Task Descriptions via History-aware Planning

Authors:Duy Cao, Phu Nguyen, Vy Le, Tien N. Nguyen, Vu Nguyen
Date:2025-04-19 16:03:03

In automated web testing, generating test scripts from natural language task descriptions is crucial for enhancing the test generation process. This activity involves creating the correct sequences of actions to form test scripts for future testing activities. Current state-of-the-art approaches are limited in generating these action sequences, as they either demand substantial manual effort for human demonstrations or fail to consider the history of previous web content and actions to decide the next action. In this paper, we introduce HxAgent, an iterative large language model agent planning approach that determines the next action based on: 1) observations of the current contents and feasible actions, 2) short-term memory of previous web states and actions, and 3) long-term experience with (in)correct action sequences. The agent generates a sequence of actions to perform a given task, which is effectively an automated test case to verify the task. We conducted an extensive empirical evaluation of HxAgent using two datasets. On the MiniWoB++ dataset, our approach achieves 97% exact-match accuracy that is comparable to the best baselines while eliminating the need for human demonstrations required by those methods. For complex tasks requiring navigation through multiple actions and screens, HxAgent achieves an average 82% exact-match. On the second dataset, comprising 350 task instances across seven popular websites, including YouTube, LinkedIn, Facebook, and Google, HxAgent achieves high performance, with 87% of the action sequences exactly matching the ground truth and a prefix-match of 93%, outperforming the baseline by 59%.