planning - 2025-03-08

Granular mortality modelling with temperature and epidemic shocks: a three-state regime-switching approach

Authors:Jens Robben, Karim Barigou, Torsten Kleinow
Date:2025-03-06 16:01:09

This paper develops a granular regime-switching framework to model mortality deviations from seasonal baseline trends driven by temperature- and epidemic-related shocks. The framework features three states: (1) a baseline state that captures observed seasonal mortality patterns, (2) an environmental shock state for heat waves, and (3) a respiratory shock state that addresses mortality deviations caused by strong outbreaks of respiratory diseases due to influenza and COVID-19. Transition probabilities between states are modeled using covariate-dependent multinomial logit functions. These functions incorporate, among others, lagged temperature and influenza incidence rates as predictors, allowing dynamic adjustments to evolving shocks. Calibrated on weekly mortality data across 21 French regions and six age groups, the regime-switching framework accounts for spatial and demographic heterogeneity. Under various projection scenarios for temperature and influenza, we quantify uncertainty in mortality forecasts through prediction intervals constructed using an extensive bootstrap approach. These projections can guide healthcare providers and hospitals in managing risks and planning resources for potential future shocks.
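
The covariate-dependent multinomial logit mentioned above can be illustrated with a minimal sketch. It shows one row of a transition matrix, meaning the probabilities of moving to each of the three states from a single origin state. The coefficient values, the two covariates (a lagged temperature anomaly and a lagged influenza incidence rate), and the choice of the baseline state as reference category are assumptions made for illustration and are not taken from the paper.

```python
import math

STATES = ("baseline", "environmental_shock", "respiratory_shock")

def transition_probs(coefs, covariates):
    """Multinomial logit: softmax over one linear score per destination state.

    coefs[state] = [intercept, beta_1, ..., beta_k]; the reference state
    has all coefficients fixed at zero (hypothetical parameterisation).
    """
    scores = []
    for state in STATES:
        c = coefs[state]
        scores.append(c[0] + sum(b * x for b, x in zip(c[1:], covariates)))
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return {state: e / total for state, e in zip(STATES, exps)}

# Hypothetical coefficients, not calibrated values.
coefs = {
    "baseline": [0.0, 0.0, 0.0],
    "environmental_shock": [-3.0, 0.25, 0.0],
    "respiratory_shock": [-3.0, 0.0, 0.5],
}
# Covariates: [lagged temperature anomaly, lagged influenza incidence]
probs = transition_probs(coefs, [8.0, 1.0])
calm = transition_probs(coefs, [0.0, 0.0])
```

In this sketch, a larger temperature anomaly raises the score of the environmental shock state only, so its transition probability grows relative to the calm scenario while the three probabilities still sum to one.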

The nexus between disease surveillance, adaptive human behavior and epidemic containment

Authors:Baltazar Espinoza, Roger Sanchez, Jimmy Calvo-Monge, Fabio Sanchez
Date:2025-03-06 15:14:57

Epidemics exhibit interconnected processes that operate at multiple time and organizational scales, a hallmark of complex adaptive systems. Modern epidemiological modeling frameworks incorporate feedback between individual-level behavioral choices and centralized interventions. Nonetheless, the realistic operational course for disease detection, planning, and response is often overlooked. Disease detection is a dynamic challenge, shaped by the interplay between surveillance efforts and transmission characteristics. It serves as a tipping point that triggers emergency declarations, information dissemination, adaptive behavioral responses, and the deployment of public health interventions. Evaluating the impact of disease surveillance systems as triggers for adaptive behavior and public health interventions is key to designing effective control policies. We examine the multiple behavioral and epidemiological dynamics generated by the feedback between disease surveillance and the intertwined dynamics of information and disease propagation. Specifically, we study the intertwined dynamics between: $(i)$ disease surveillance triggering health emergency declarations, $(ii)$ risk information dissemination producing decentralized behavioral responses, and $(iii)$ centralized interventions. Our results show that robust surveillance systems that quickly detect a disease outbreak can trigger an early response from the population, leading to large epidemic sizes. The key result is that the response scenarios that minimize the final epidemic size are determined by the trade-off between the risk information dissemination and disease transmission, with the triggering effect of surveillance mediating this trade-off. Finally, our results confirm that behavioral adaptation can create a hysteresis-like effect on the final epidemic size.

SeGMan: Sequential and Guided Manipulation Planner for Robust Planning in 2D Constrained Environments

Authors:Cankut Bora Tuncer, Dilruba Sultan Haliloglu, Ozgur S. Oguz
Date:2025-03-06 13:05:25

In this paper, we present SeGMan, a hybrid motion planning framework that integrates sampling-based and optimization-based techniques with a guided forward search to address complex, constrained sequential manipulation challenges, such as pick-and-place puzzles. SeGMan incorporates an adaptive subgoal selection method that adjusts the granularity of subgoals, enhancing overall efficiency. Furthermore, the proposed generalizable heuristics guide the forward search in a more targeted manner. Extensive evaluations in maze-like tasks populated with numerous objects and obstacles demonstrate that SeGMan not only generates consistent and computationally efficient manipulation plans but also outperforms state-of-the-art approaches.

Robust design of bicycle infrastructure networks

Authors:Christoph Steinacker, Mads Paulsen, Malte Schröder, Jeppe Rich
Date:2025-03-06 11:45:12

Promoting active mobility like cycling relies on the availability of well-connected, high-quality bicycle networks. However, expanding these networks over an extended planning horizon presents one of the most complex challenges in transport science. This complexity arises from the intricate interactions between infrastructure availability and usage, such as network spillover effects and mode choice substitutions. In this paper, we approach the problem from two perspectives: direct optimization methods, which generate near-optimal solutions using operations research techniques, and conceptual heuristics, which offer intuitive and scalable algorithms grounded in network science. Specifically, we compare direct welfare optimization with an inverse network percolation approach to planning cycle superhighway extensions in Copenhagen. Interestingly, while the more complex optimization models yield better overall welfare results, the improvements over simpler methods are small. More importantly, we demonstrate that the increased complexity of planning approaches generally makes them more vulnerable to input uncertainty, reflecting the bias-variance tradeoff. This issue is particularly relevant in the context of long-term planning, where conditions change during the implementation of the planned infrastructure expansions. Therefore, while planning bicycle infrastructure is important and renders exceptionally high benefit-cost ratios, considerations of robustness and ease of implementation may justify the use of more straightforward network-based methods.

Shaken, Not Stirred: A Novel Dataset for Visual Understanding of Glasses in Human-Robot Bartending Tasks

Authors:Lukáš Gajdošech, Hassan Ali, Jan-Gerrit Habekost, Martin Madaras, Matthias Kerzel, Stefan Wermter
Date:2025-03-06 10:51:04

Datasets for object detection often do not account for enough variety of glasses, due to their transparent and reflective properties. Specifically, open-vocabulary object detectors, widely used in embodied robotic agents, fail to distinguish subclasses of glasses. This scientific gap poses an issue to robotic applications that suffer from accumulating errors between detection, planning, and action execution. The paper introduces a novel method for the acquisition of real-world data from RGB-D sensors that minimizes human effort. We propose an auto-labeling pipeline that generates labels for all the acquired frames based on the depth measurements. We provide a novel real-world glass object dataset that was collected on the Neuro-Inspired COLlaborator (NICOL), a humanoid robot platform. The data set consists of 7850 images recorded from five different cameras. We show that our trained baseline model outperforms state-of-the-art open-vocabulary approaches. In addition, we deploy our baseline model in an embodied agent approach to the NICOL platform, on which it achieves a success rate of 81% in a human-robot bartending scenario.

On the Connection Between Magnetic-Field Odometry Aided Inertial Navigation and Magnetic-Field SLAM

Authors:Isaac Skog, Manon Kok, Gustaf Hendeby, Chuan Huang, Thomas Edridge
Date:2025-03-06 10:14:36

Magnetic-field simultaneous localization and mapping (SLAM) using consumer-grade inertial and magnetometer sensors offers a scalable, cost-effective solution for indoor localization. However, the rapid error accumulation in the inertial navigation process limits the feasible exploratory phases of these systems. Advances in magnetometer array processing have demonstrated that odometry information, i.e., displacement and rotation information, can be extracted from local magnetic field variations and used to create magnetic-field odometry-aided inertial navigation systems. The error growth rate of these systems is significantly lower than that of standalone inertial navigation systems. This study seeks an answer to whether a magnetic-field SLAM system fed with measurements from a magnetometer array can indirectly extract odometry information -- without requiring algorithmic modifications -- and thus sustain longer exploratory phases. The theoretical analysis and simulation results show that such a system can extract odometry information and indirectly create a magnetic field odometry-aided inertial navigation system during the exploration phases. However, practical challenges related to map resolution and computational complexity remain significant.

RCRank: Multimodal Ranking of Root Causes of Slow Queries in Cloud Database Systems

Authors:Biao Ouyang, Yingying Zhang, Hanyin Cheng, Yang Shu, Chenjuan Guo, Bin Yang, Qingsong Wen, Lunting Fan, Christian S. Jensen
Date:2025-03-06 09:35:20

With the continued migration of storage to cloud database systems, the impact of slow queries in such systems on services and user experience is increasing. Root-cause diagnosis plays an indispensable role in facilitating slow-query detection and revision. This paper proposes a method capable of both identifying possible root cause types for slow queries and ranking these according to their potential for accelerating slow queries. This enables prioritizing root causes with the highest impact, in turn improving slow-query revision effectiveness. To enable more accurate and detailed diagnoses, we propose the multimodal Ranking for the Root Causes of slow queries (RCRank) framework, which formulates root cause analysis as a multimodal machine learning problem and leverages multimodal information from query statements, execution plans, execution logs, and key performance indicators. To obtain expressive embeddings from its heterogeneous multimodal input, RCRank integrates self-supervised pre-training that enhances cross-modal alignment and task relevance. Next, the framework integrates root-cause-adaptive cross Transformers that enable adaptive fusion of multimodal features with varying characteristics. Finally, the framework offers a unified model that features an impact-aware training objective for identifying and ranking root causes. We report on experiments on real and synthetic datasets, finding that RCRank is capable of consistently outperforming the state-of-the-art methods at root cause identification and ranking according to a range of metrics.

Simulation-based Analysis Of Highway Trajectory Planning Using High-Order Polynomial For Highly Automated Driving Function

Authors:Milin Patel, Marzana Khatun, Rolf Jung, Michael Glaß
Date:2025-03-06 07:23:17

One of the fundamental tasks of autonomous driving is safe trajectory planning, the task of deciding where the vehicle needs to drive while avoiding obstacles, obeying safety rules, and respecting the fundamental limits of the road. Real-world application of such a method involves consideration of surrounding environment conditions and maneuvers such as Lane Change, collision avoidance, and lane merge. The focus of the paper is to develop and implement a safe, collision-free highway Lane Change trajectory using a high-order polynomial for a Highly Automated Driving Function (HADF). Planning is often considered a higher-level process than control. A Behavior Planning Module (BPM) is designed that plans high-level driving actions, such as the Lane Change maneuver, to safely achieve the functionality of transverse guidance, ensuring the safety of the vehicle using motion planning in a scenario that includes the environmental situation. Based on the recommendation received from the BPM, the function generates a desired corresponding trajectory. The proposed planning system is situation-specific, with a polynomial-based algorithm for a same-direction two-lane highway scenario. To support the trajectory system, a polynomial curve can be used to reduce overall complexity and thereby allow rapid computation. The proposed Lane Change scenario is modeled, and the results have been analyzed (verified and validated) in the MATLAB simulation environment. The method proposed in this paper has achieved a significant improvement in the safety and stability of the Lane Change maneuver.
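
As an illustration of a polynomial lane-change trajectory, the sketch below uses a quintic (fifth-order) lateral profile with zero lateral velocity and acceleration at both ends. The abstract only says "high-order polynomial", so the quintic form, the function names, and the numeric values (3.5 m lane width, 5 s duration) are assumptions and not the authors' exact method, which is implemented in MATLAB.

```python
def lateral_offset(t, duration, lane_width):
    """Lateral position (m) at time t for a quintic lane-change profile.

    Satisfies y(0) = 0, y(T) = lane_width, and zero lateral velocity and
    acceleration at both endpoints.
    """
    s = min(max(t / duration, 0.0), 1.0)  # normalised time in [0, 1]
    return lane_width * (10 * s**3 - 15 * s**4 + 6 * s**5)

def lateral_velocity(t, duration, lane_width):
    """Time derivative of lateral_offset (m/s)."""
    s = min(max(t / duration, 0.0), 1.0)
    return lane_width / duration * (30 * s**2 - 60 * s**3 + 30 * s**4)

# Hypothetical scenario: 3.5 m lane width, 5 s manoeuvre, sampled every 0.5 s.
T, W = 5.0, 3.5
samples = [(0.5 * k, lateral_offset(0.5 * k, T, W)) for k in range(11)]
```

Because the whole profile is a closed-form expression in t, evaluating it at each planning step costs only a handful of multiplications, which is consistent with the rapid computation the abstract attributes to polynomial curves.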

Robust Computer-Vision based Construction Site Detection for Assistive-Technology Applications

Authors:Junchi Feng, Giles Hamilton-Fletcher, Nikhil Ballem, Michael Batavia, Yifei Wang, Jiuling Zhong, Maurizio Porfiri, John-Ross Rizzo
Date:2025-03-06 06:35:19

Navigating urban environments poses significant challenges for people with disabilities, particularly those with blindness and low vision. Environments with dynamic and unpredictable elements like construction sites are especially challenging. Construction sites introduce hazards like uneven surfaces, obstructive barriers, hazardous materials, and excessive noise, and they can alter routing, complicating safe mobility. Existing assistive technologies are limited, as navigation apps do not account for construction sites during trip planning, and detection tools that attempt hazard recognition struggle to address the extreme variability of construction paraphernalia. This study introduces a novel computer vision-based system that integrates open-vocabulary object detection, a YOLO-based scaffolding-pole detection model, and an optical character recognition (OCR) module to comprehensively identify and interpret construction site elements for assistive navigation. In static testing across seven construction sites, the system achieved an overall accuracy of 88.56\%, reliably detecting objects from 2m to 10m within a 0$^\circ$ -- 75$^\circ$ angular offset. At closer distances (2--4m), the detection rate was 100\% at all tested angles. At

Fractional Correspondence Framework in Detection Transformer

Authors:Masoumeh Zareapoor, Pourya Shamsolmoali, Huiyu Zhou, Yue Lu, Salvador García
Date:2025-03-06 05:29:20

The Detection Transformer (DETR), by incorporating the Hungarian algorithm, has significantly simplified the matching process in object detection tasks. This algorithm facilitates optimal one-to-one matching of predicted bounding boxes to ground-truth annotations during training. While effective, this strict matching process does not inherently account for the varying densities and distributions of objects, leading to suboptimal correspondences such as failing to handle multiple detections of the same object or missing small objects. To address this, we propose the Regularized Transport Plan (RTP). RTP introduces a flexible matching strategy that captures the cost of aligning predictions with ground truths to find the most accurate correspondences between these sets. By utilizing the differentiable Sinkhorn algorithm, RTP allows for soft, fractional matching rather than strict one-to-one assignments. This approach enhances the model's capability to manage varying object densities and distributions effectively. Our extensive evaluations on the MS-COCO and VOC benchmarks demonstrate the effectiveness of our approach. RTP-DETR surpasses the performance of Deform-DETR and the recently introduced DINO-DETR, achieving absolute gains in mAP of +3.8% and +1.7%, respectively.
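
To make the contrast between one-to-one and fractional matching concrete, here is a minimal sketch of generic entropy-regularised Sinkhorn iterations on a small cost matrix. It is not the RTP formulation itself: the uniform marginals, the regularisation strength eps, the iteration count, and the toy costs (rows as ground truths, columns as predictions) are all assumptions for illustration.

```python
import math

def sinkhorn_plan(cost, eps=0.5, iters=200):
    """Entropy-regularised transport plan with uniform marginals.

    cost[i][j] is the cost of matching ground truth i to prediction j.
    Returns a matrix whose rows sum to 1/n and whose columns sum to 1/m.
    """
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n  # mass assigned to each ground truth
    b = [1.0 / m] * m  # mass assigned to each prediction
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy costs: prediction 0 fits ground truth 0, prediction 1 fits ground
# truth 1, and prediction 2 is equally plausible for both.
cost = [[0.1, 0.9, 0.5],
        [0.8, 0.2, 0.5]]
plan = sinkhorn_plan(cost)
```

In the resulting plan, the ambiguous third prediction splits its mass between the two ground truths instead of being forced onto one of them, which is the kind of soft correspondence a strict Hungarian assignment cannot express.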

Compositional Structures as Substrates for Human-AI Co-creation Environment: A Design Approach and A Case Study

Authors:Yining Cao, Yiyi Huang, Anh Truong, Hijung Valentina Shin, Haijun Xia
Date:2025-03-06 05:23:38

It has been increasingly recognized that effective human-AI co-creation requires more than prompts and results: it requires an environment with empowering structures that facilitate exploration, planning, and iteration, as well as control and inspection of AI generation. Yet, a concrete design approach to such an environment has not been established. Our literature analysis highlights that compositional structures, which organize and visualize individual elements into meaningful wholes, are highly effective in granting creators control over the essential aspects of their content. However, efficiently aggregating and connecting these structures to support the full creation process remains challenging. Therefore, we propose a design approach of leveraging compositional structures as the substrates and infusing AI within and across these structures to enable a controlled and fluid creation process. We evaluate this approach through a case study of developing a video co-creation environment. User evaluation shows that such an environment allowed users to stay oriented in their creation activity, remain aware and in control of AI's generation, and enabled flexible human-AI collaborative workflows.

H3O: Hyper-Efficient 3D Occupancy Prediction with Heterogeneous Supervision

Authors:Yunxiao Shi, Hong Cai, Amin Ansari, Fatih Porikli
Date:2025-03-06 03:27:14

3D occupancy prediction has recently emerged as a new paradigm for holistic 3D scene understanding and provides valuable information for downstream planning in autonomous driving. Most existing methods, however, are computationally expensive, requiring costly attention-based 2D-3D transformation and 3D feature processing. In this paper, we present a novel 3D occupancy prediction approach, H3O, which features highly efficient architecture designs that incur a significantly lower computational cost as compared to the current state-of-the-art methods. In addition, to compensate for the ambiguity in ground-truth 3D occupancy labels, we advocate leveraging auxiliary tasks to complement the direct 3D supervision. In particular, we integrate multi-camera depth estimation, semantic segmentation, and surface normal estimation via differentiable volume rendering, supervised by corresponding 2D labels that introduce rich and heterogeneous supervision signals. We conduct extensive experiments on the Occ3D-nuScenes and SemanticKITTI benchmarks that demonstrate the superiority of our proposed H3O.

Planning and Control for Deformable Linear Object Manipulation

Authors:Burak Aksoy, John Wen
Date:2025-03-06 01:44:36

Manipulating a deformable linear object (DLO) such as wire, cable, and rope is a common yet challenging task due to their high degrees of freedom and complex deformation behaviors, especially in an environment with obstacles. Existing local control methods are efficient but prone to failure in complex scenarios, while precise global planners are computationally intensive and difficult to deploy. This paper presents an efficient, easy-to-deploy framework for collision-free DLO manipulation using mobile manipulators. We demonstrate the effectiveness of leveraging standard planning tools for high-dimensional DLO manipulation without requiring custom planners or extensive data-driven models. Our approach combines an off-the-shelf global planner with a real-time local controller. The global planner approximates the DLO as a series of rigid links connected by spherical joints, enabling rapid path planning without the need for problem-specific planners or large datasets. The local controller employs control barrier functions (CBFs) to enforce safety constraints, maintain the DLO integrity, prevent overstress, and handle obstacle avoidance. It compensates for modeling inaccuracies by using a state-of-the-art position-based dynamics technique that approximates physical properties like Young's and shear moduli. We validate our framework through extensive simulations and real-world demonstrations. In complex obstacle scenarios -- including tent pole transport, corridor navigation, and tasks requiring varied stiffness -- our method achieves a 100% success rate over thousands of trials, with significantly reduced planning times compared to state-of-the-art techniques. Real-world experiments include transportation of a tent pole and a rope using mobile manipulators. We share our ROS-based implementation to facilitate adoption in various applications.

Enhancing Autonomous Driving Safety with Collision Scenario Integration

Authors:Zi Wang, Shiyi Lan, Xinglong Sun, Nadine Chang, Zhenxin Li, Zhiding Yu, Jose M. Alvarez
Date:2025-03-05 23:08:43

Autonomous vehicle safety is crucial for the successful deployment of self-driving cars. However, most existing planning methods rely heavily on imitation learning, which limits their ability to leverage collision data effectively. Moreover, collecting collision or near-collision data is inherently challenging, as it involves risks and raises ethical and practical concerns. In this paper, we propose SafeFusion, a training framework to learn from collision data. Instead of over-relying on imitation learning, SafeFusion integrates safety-oriented metrics during training to enable collision avoidance learning. In addition, to address the scarcity of collision data, we propose CollisionGen, a scalable data generation pipeline to generate diverse, high-quality scenarios using natural language prompts, generative models, and rule-based filtering. Experimental results show that our approach improves planning performance in collision-prone scenarios by 56\% over previous state-of-the-art planners while maintaining effectiveness in regular driving situations. Our work provides a scalable and effective solution for advancing the safety of autonomous driving systems.

CREStE: Scalable Mapless Navigation with Internet Scale Priors and Counterfactual Guidance

Authors:Arthur Zhang, Harshit Sikchi, Amy Zhang, Joydeep Biswas
Date:2025-03-05 21:42:46

We address the long-horizon mapless navigation problem: enabling robots to traverse novel environments without relying on high-definition maps or precise waypoints that specify exactly where to navigate. Achieving this requires overcoming two major challenges -- learning robust, generalizable perceptual representations of the environment without pre-enumerating all possible navigation factors and forms of perceptual aliasing and utilizing these learned representations to plan human-aligned navigation paths. Existing solutions struggle to generalize due to their reliance on hand-curated object lists that overlook unforeseen factors, end-to-end learning of navigation features from scarce large-scale robot datasets, and handcrafted reward functions that scale poorly to diverse scenarios. To overcome these limitations, we propose CREStE, the first method that learns representations and rewards for addressing the full mapless navigation problem without relying on large-scale robot datasets or manually curated features. CREStE leverages visual foundation models trained on internet-scale data to learn continuous bird's-eye-view representations capturing elevation, semantics, and instance-level features. To utilize learned representations for planning, we propose a counterfactual-based loss and active learning procedure that focuses on the most salient perceptual cues by querying humans for counterfactual trajectory annotations in challenging scenes. We evaluate CREStE in kilometer-scale navigation tasks across six distinct urban environments. CREStE significantly outperforms all state-of-the-art approaches with 70% fewer human interventions per mission, including a 2-kilometer mission in an unseen environment with just 1 intervention; showcasing its robustness and effectiveness for long-horizon mapless navigation. For videos and additional materials, see https://amrl.cs.utexas.edu/creste .

GO-VMP: Global Optimization for View Motion Planning in Fruit Mapping

Authors:Allen Isaac Jose, Sicong Pan, Tobias Zaenker, Rohit Menon, Sebastian Houben, Maren Bennewitz
Date:2025-03-05 21:25:03

Automating labor-intensive tasks such as crop monitoring with robots is essential for enhancing production and conserving resources. However, autonomously monitoring horticulture crops remains challenging due to their complex structures, which often result in fruit occlusions. Existing view planning methods attempt to reduce occlusions but either struggle to achieve adequate coverage or incur high robot motion costs. We introduce a global optimization approach for view motion planning that aims to minimize robot motion costs while maximizing fruit coverage. To this end, we leverage coverage constraints derived from the set covering problem (SCP) within a shortest Hamiltonian path problem (SHPP) formulation. While both SCP and SHPP are well-established, their tailored integration enables a unified framework that computes a global view path with minimized motion while ensuring full coverage of selected targets. Given the NP-hard nature of the problem, we employ a region-prior-based selection of coverage targets and a sparse graph structure to achieve effective optimization outcomes within a limited time. Experiments in simulation demonstrate that our method detects more fruits, enhances surface coverage, and achieves higher volume accuracy than the motion-efficient baseline with a moderate increase in motion cost, while significantly reducing motion costs compared to the coverage-focused baseline. Real-world experiments further confirm the practical applicability of our approach.

Predicting Cislunar Orbit Lifetimes from Initial Orbital Elements

Authors:Denvir Higgins, Travis Yeager, Peter McGill, James Buchanan, Tara Grice, Alexx Perloff, Michael Schneider
Date:2025-03-05 20:49:53

Cislunar space is the volume between Earth's geosynchronous orbit and beyond the Moon, including the lunar Lagrange points. Understanding the stability of orbits within this space is crucial for the successful planning and execution of space missions. Orbits in cislunar space are influenced by the gravitational forces of the Sun, Earth, Moon, and other Solar System planets leading to typically unpredictable and chaotic behavior. It is therefore difficult to predict the stability of an orbit from a set of initial orbital elements. We simulate one million cislunar orbits and use a self-organizing map (SOM) to cluster the orbits into families based on how long they remain stable within the cislunar regime. Utilizing Lawrence Livermore National Laboratory's (LLNL) High Performance Computers (HPC) we develop a highly adaptable SOM capable of efficiently characterizing observations from individual events. We are able to predict the lifetime from the initial three line element (TLE) to within 10 percent for 8 percent of the test dataset, within 50 percent for 43 percent of the dataset, and within 100 percent for 75 percent of the dataset. The fractional absolute deviation peaks at 1 for all lifetimes. Multi-modal clustering in the SOM suggests that a variety of orbital morphologies have similar lifetimes. The trained SOMs use an average of 2.73 milliseconds of computational time to produce an orbital stability prediction. The outcomes of this research enhance our understanding of cislunar orbital dynamics and also provide insights for mission planning, enabling the rapid identification of stable orbital regions and pathways for future space exploration. As demonstrated in this study, an SOM can generate orbital lifetime estimates from minimal observational data, such as a single TLE, making it essential for early warning systems and large-scale sensor network operations.

Learning to Negotiate via Voluntary Commitment

Authors:Shuhui Zhu, Baoxiang Wang, Sriram Ganapathi Subramanian, Pascal Poupart
Date:2025-03-05 19:55:10

The partial alignment and conflict of autonomous agents lead to mixed-motive scenarios in many real-world applications. However, agents may fail to cooperate in practice even when cooperation yields a better outcome. One well-known reason for this failure comes from non-credible commitments. To facilitate commitments among agents for better cooperation, we define Markov Commitment Games (MCGs), a variant of commitment games, where agents can voluntarily commit to their proposed future plans. Based on MCGs, we propose a learnable commitment protocol via policy gradients. We further propose incentive-compatible learning to accelerate convergence to equilibria with better social welfare. Experimental results in challenging mixed-motive tasks demonstrate faster empirical convergence and higher returns for our method compared with its counterparts. Our code is available at https://github.com/shuhui-zhu/DCL.

Constrained Gaussian Wasserstein Optimal Transport with Commutative Covariance Matrices

Authors:Jun Chen, Jia Wang, Ruibin Li, Han Zhou, Wei Dong, Huan Liu, Yuanhao Yu
Date:2025-03-05 18:56:48

Optimal transport has found widespread applications in signal processing and machine learning. Among its many equivalent formulations, optimal transport seeks to reconstruct a random variable/vector with a prescribed distribution at the destination while minimizing the expected distortion relative to a given random variable/vector at the source. However, in practice, certain constraints may render the optimal transport plan infeasible. In this work, we consider three types of constraints: rate constraints, dimension constraints, and channel constraints, motivated by perception-aware lossy compression, generative principal component analysis, and deep joint source-channel coding, respectively. Special attention is given to the setting termed Gaussian Wasserstein optimal transport, where both the source and reconstruction variables are multivariate Gaussian, and the end-to-end distortion is measured by the mean squared error. We derive explicit results for the minimum achievable mean squared error under the three aforementioned constraints when the covariance matrices of the source and reconstruction variables commute.
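
For context, the unconstrained version of this problem has a well-known closed form, sketched below. When two zero-mean Gaussians have commuting covariance matrices, the matrices share an eigenbasis, and the minimum mean squared error (the squared 2-Wasserstein distance) reduces to a sum over paired eigenvalues. This is the standard unconstrained result only. It does not reproduce the rate-, dimension-, or channel-constrained results derived in the paper, and the function name and inputs are hypothetical.

```python
import math

def gaussian_w2_squared_commuting(source_eigs, recon_eigs):
    """Squared 2-Wasserstein distance between zero-mean Gaussians whose
    covariances commute.

    source_eigs[i] and recon_eigs[i] are the eigenvalues of the two
    covariance matrices along the same shared eigenvector i.
    """
    if len(source_eigs) != len(recon_eigs):
        raise ValueError("eigenvalue lists must have the same length")
    return sum((math.sqrt(s) - math.sqrt(r)) ** 2
               for s, r in zip(source_eigs, recon_eigs))

# Source variances (4, 1) and reconstruction variances (1, 1) along a
# shared eigenbasis: only the first direction contributes distortion.
mse = gaussian_w2_squared_commuting([4.0, 1.0], [1.0, 1.0])
```

The pairing of eigenvalues by shared eigenvector is what makes the commuting case tractable. Without commutation the expression involves a matrix square root of a product of the two covariances.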

CHOP: Mobile Operating Assistant with Constrained High-frequency Optimized Subtask Planning

Authors:Yuqi Zhou, Shuai Wang, Sunhao Dai, Qinglin Jia, Zhaocheng Du, Zhenhua Dong, Jun Xu
Date:2025-03-05 18:56:16

The advancement of visual language models (VLMs) has enhanced mobile device operations, allowing simulated human-like actions to address user requirements. Current VLM-based mobile operating assistants can be structured into three levels: task, subtask, and action. The subtask level, linking high-level goals with low-level executable actions, is crucial for task completion but faces two challenges: ineffective subtasks that the lower-level agent cannot execute and inefficient subtasks that fail to contribute to the completion of the higher-level task. These challenges stem from the VLM's lack of experience in decomposing subtasks within GUI scenarios in a multi-agent architecture. To address these, we propose a new mobile assistant architecture with constrained high-frequency optimized planning (CHOP). Our approach overcomes the VLM's deficiency in GUI scenario planning by using human-planned subtasks as the basis vector. We evaluate our architecture in both English and Chinese contexts across 20 Apps, demonstrating significant improvements in both effectiveness and efficiency. Our dataset and code are available at https://github.com/Yuqi-Zhou/CHOP

A modeling framework to support the electrification of private transport in African cities: a case study of Addis Ababa

Authors:Jérémy Dumoulin, Dawit Gebremeskel, Kanchwodia Gashaw, Ingeborg Graabak, Noémie Jeannin, Alejandro Pena-Bello, Christophe Ballif, Nicolas Wyrsch
Date:2025-03-05 17:07:49

The electrification of road transport, as the predominant mode of transportation in Africa, represents a great opportunity to reduce greenhouse gas emissions and dependence on costly fuel imports. However, it introduces major challenges for local energy infrastructures, including the deployment of charging stations and the impact on often fragile electricity grids. Despite its importance, research on electric mobility planning in Africa remains limited, while existing planning tools rely on detailed local mobility data that is often unavailable, especially for privately owned passenger vehicles. In this study, we introduce a novel framework designed to support private vehicle electrification in data-scarce regions and apply it to Addis Ababa, simulating the mobility patterns and charging needs of 100,000 electric vehicles. Our analysis indicates that these vehicles generate a daily charging demand of approximately 350 MWh and emphasizes the significant influence of the charging location on the spatial and temporal distribution of this demand. Notably, charging at public places can help smooth the charging demand throughout the day, mitigating peak charging loads on the electricity grid. We also estimate charging station requirements, finding that workplace charging requires approximately one charging point per three electric vehicles, while public charging requires only one per thirty. Finally, we demonstrate that photovoltaic energy can cover a substantial share of the charging needs, emphasizing the potential for renewable energy integration. This study lays the groundwork for electric mobility planning in Addis Ababa while offering a transferable framework for other African cities.

Motion Planning and Control with Unknown Nonlinear Dynamics through Predicted Reachability

Authors:Zhiquan Zhang, Gokul Puthumanaillam, Manav Vora, Melkior Ornik
Date:2025-03-05 16:14:36

Autonomous motion planning under unknown nonlinear dynamics presents significant challenges. An agent needs to continuously explore the system dynamics to acquire its properties, such as reachability, in order to guide system navigation adaptively. In this paper, we propose a hybrid planning-control framework designed to compute a feasible trajectory toward a target. Our approach involves partitioning the state space and approximating the system by a piecewise affine (PWA) system with constrained control inputs. By abstracting the PWA system into a directed weighted graph, we incrementally update the existence of its edges via affine system identification and reach control theory, introducing a predictive reachability condition by exploiting prior information of the unknown dynamics. Heuristic weights are assigned to edges based on whether their existence is certain or remains indeterminate. Consequently, we propose a framework that adaptively collects and analyzes data during mission execution, continually updates the predictive graph, and synthesizes a controller online based on the graph search outcomes. We demonstrate the efficacy of our approach through simulation scenarios involving a mobile robot operating in unknown terrains, with its unknown dynamics abstracted as a single integrator model.
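
The idea of searching a graph whose edges are either confirmed or still indeterminate can be sketched with a plain Dijkstra search. This is an illustrative sketch, not the authors' planner: the two cost constants, the edge labels, and the function name are assumptions, and the reachability tests that decide an edge's status are not modeled here.

```python
import heapq

CERTAIN_COST = 1.0    # hypothetical weight for edges whose existence is confirmed
UNCERTAIN_COST = 5.0  # hypothetical penalty for edges that are still indeterminate


def shortest_path(edges, start, goal):
    """Dijkstra over a directed graph.

    edges maps (u, v) -> "certain" or "uncertain"; edges known not to exist
    are simply absent. Returns (path, cost), or (None, inf) if unreachable.
    """
    adjacency = {}
    for (u, v), status in edges.items():
        cost = CERTAIN_COST if status == "certain" else UNCERTAIN_COST
        adjacency.setdefault(u, []).append((v, cost))

    best = {start: 0.0}
    parent = {}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, cost in adjacency.get(node, []):
            new_dist = dist + cost
            if new_dist < best.get(nxt, float("inf")):
                best[nxt] = new_dist
                parent[nxt] = node
                heapq.heappush(queue, (new_dist, nxt))

    if goal not in best:
        return None, float("inf")
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1], best[goal]
```

In an online setting, one would update the edge dictionary as new data confirm or rule out transitions between cells and then rerun the search; the penalty on indeterminate edges biases the plan toward transitions already known to be feasible.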

A Generative System for Robot-to-Human Handovers: from Intent Inference to Spatial Configuration Imagery

Authors:Hanxin Zhang, Abdulqader Dhafer, Zhou Daniel Hao, Hongbiao Dong
Date:2025-03-05 15:13:54

We propose a novel system for robot-to-human object handover that emulates human coworker interactions. Unlike most existing studies, which focus primarily on grasping strategies and motion planning, our system focuses on (1) inferring human handover intents and (2) imagining the spatial handover configuration. The first component integrates multimodal perception, combining visual and verbal cues, to infer human intent. The second uses a diffusion-based model to generate the handover configuration, which involves the spatial relationship among the robot's gripper, the object, and the human hand, thereby mimicking the cognitive process of motor imagery. Experimental results demonstrate that our approach effectively interprets human cues and achieves fluent, human-like handovers, offering a promising solution for collaborative robotics. Code, videos, and data are available at: https://i3handover.github.io.

A Benchmark for Optimal Multi-Modal Multi-Robot Multi-Goal Path Planning with Given Robot Assignment

Authors:Valentin N. Hartmann, Tirza Heinle, Stelian Coros
Date:2025-03-05 13:57:05

In many industrial robotics applications, multiple robots are working in a shared workspace to complete a set of tasks as quickly as possible. Such settings can be treated as multi-modal multi-robot multi-goal path planning problems, where each robot has to reach an ordered sequence of goals. Existing approaches to this type of problem rely on prioritization or assume synchronous completion of tasks, and are thus neither optimal nor complete. We formalize this problem as a single path planning problem and introduce a benchmark encompassing a diverse range of problem instances including scenarios with various robots, planning horizons, and collaborative tasks such as handovers. Along with the benchmark, we adapt an RRT* and a PRM* planner to serve as a baseline for the planning problems. Both planners work in the composite space of all robots, with the changes required to work in our setting. Unlike existing approaches, our planners and formulation are not restricted to discretized 2D workspaces, support a changing environment, and work for heterogeneous robot teams over multiple modes with different constraints and multiple goals. Videos and code for the benchmark and the planners are available at https://vhartman.github.io/mrmg-planning/.

Generative Artificial Intelligence in Robotic Manipulation: A Survey

Authors:Kun Zhang, Peng Yun, Jun Cen, Junhao Cai, Didi Zhu, Hangjie Yuan, Chao Zhao, Tao Feng, Michael Yu Wang, Qifeng Chen, Jia Pan, Bo Yang, Hua Chen
Date:2025-03-05 12:54:54

This survey provides a comprehensive review of recent advancements of generative learning models in robotic manipulation, addressing key challenges in the field. Robotic manipulation faces critical bottlenecks, including insufficient data and inefficient data acquisition, long-horizon and complex task planning, and the multi-modal reasoning ability needed for robust policy learning across diverse environments. To tackle these challenges, this survey introduces several generative model paradigms, including Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), diffusion models, probabilistic flow models, and autoregressive models, highlighting their strengths and limitations. The applications of these models are categorized into three hierarchical layers: the Foundation Layer, focusing on data generation and reward generation; the Intermediate Layer, covering language, code, visual, and state generation; and the Policy Layer, emphasizing grasp generation and trajectory generation. Each layer is explored in detail, along with notable works that have advanced the state of the art. Finally, the survey outlines future research directions and challenges, emphasizing the need for improved efficiency in data utilization, better handling of long-horizon tasks, and enhanced generalization across diverse robotic scenarios. All the related resources, including research papers, open-source data, and projects, are collected for the community at https://github.com/GAI4Manipulation/AwesomeGAIManipulation

Active Learning for Deep Learning-Based Hemodynamic Parameter Estimation

Authors:Patryk Rygiel, Julian Suk, Kak Khee Yeung, Christoph Brune, Jelmer M. Wolterink
Date:2025-03-05 12:35:54

Hemodynamic parameters such as pressure and wall shear stress play an important role in diagnosis, prognosis, and treatment planning in cardiovascular diseases. These parameters can be accurately computed using computational fluid dynamics (CFD), but CFD is computationally intensive. Hence, deep learning methods have been adopted as a surrogate to rapidly estimate CFD outcomes. A drawback of such data-driven models is the need for time-consuming reference CFD simulations for training. In this work, we introduce an active learning framework to reduce the number of CFD simulations required for the training of surrogate models, lowering the barriers to their deployment in new applications. We propose three distinct querying strategies to determine for which unlabeled samples CFD simulations should be obtained. These querying strategies are based on geometrical variance, ensemble uncertainty, and adherence to the physics governing fluid dynamics. We benchmark these methods on velocity field estimation in synthetic coronary artery bifurcations and find that they allow for substantial reductions in annotation cost. Notably, we find that our strategies reduce the number of samples required by up to 50% and make the trained models more robust to difficult cases. Our results show that active learning is a feasible strategy to increase the potential of deep learning-based CFD surrogates.
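
Of the three querying strategies named above, ensemble uncertainty is the easiest to illustrate. The sketch below is a simplified assumption, not the authors' method: each ensemble member is taken to output one scalar per unlabeled sample (the paper estimates full velocity fields), disagreement is measured by the population variance across members, and the function name is hypothetical.

```python
from statistics import pvariance


def select_queries(ensemble_preds, budget):
    """Pick the unlabeled samples on which the ensemble disagrees most.

    ensemble_preds[m][i] is model m's scalar prediction for sample i.
    Returns the indices of the `budget` samples with the highest variance
    across models, most uncertain first (ties keep their original order).
    """
    n_samples = len(ensemble_preds[0])
    disagreement = [
        pvariance([model[i] for model in ensemble_preds])
        for i in range(n_samples)
    ]
    ranked = sorted(range(n_samples), key=lambda i: disagreement[i], reverse=True)
    return ranked[:budget]


if __name__ == "__main__":
    # Three hypothetical surrogate models, three unlabeled geometries.
    preds = [
        [0.0, 1.0, 2.0],
        [0.0, 3.0, 2.5],
        [0.0, 2.0, 2.0],
    ]
    print(select_queries(preds, budget=2))
```

The selected indices would then be sent to the CFD solver for labeling, and the surrogates retrained on the enlarged training set.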

Top-K Maximum Intensity Projection Priors for 3D Liver Vessel Segmentation

Authors:Xiaotong Zhang, Alexander Broersen, Gonnie CM van Erp, Silvia L. Pintea, Jouke Dijkstra
Date:2025-03-05 10:43:01

Liver-vessel segmentation is an essential task in the pre-operative planning of liver resection. State-of-the-art 2D or 3D convolution-based methods focus on liver-vessel segmentation on 2D CT cross-sectional views and do not take into account the global liver-vessel topology. To maintain this global vessel topology, we rely on the underlying physics used in the CT reconstruction process, and apply this to liver-vessel segmentation. Concretely, we introduce the concept of top-k maximum intensity projections, which mimic the CT reconstruction by replacing the integral along each projection direction with the top-k maxima along that direction. We use these top-k maximum projections to condition a diffusion model and generate 3D liver-vessel trees. We evaluate our 3D liver-vessel segmentation on the 3D-ircadb-01 dataset, and achieve the highest Dice coefficient, intersection-over-union (IoU), and Sensitivity scores compared to prior work.
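
A top-k maximum intensity projection can be sketched in a few lines. This is an illustrative version under simplifying assumptions, not the authors' implementation: the volume is a nested list indexed as volume[z][y][x], the projection is taken along the z axis only, and the function name is hypothetical.

```python
import heapq


def top_k_mip(volume, k):
    """Top-k maximum intensity projection along the first (z) axis.

    volume[z][y][x] holds intensities. The result has shape [k][y][x]:
    slice 0 is the classic maximum intensity projection, slice 1 holds
    the second-largest value along each ray, and so on.
    """
    depth = len(volume)
    height = len(volume[0])
    width = len(volume[0][0])
    k = min(k, depth)  # a ray cannot yield more maxima than it has voxels
    projection = [[[0.0] * width for _ in range(height)] for _ in range(k)]
    for y in range(height):
        for x in range(width):
            ray = [volume[z][y][x] for z in range(depth)]
            for rank, value in enumerate(heapq.nlargest(k, ray)):
                projection[rank][y][x] = value
    return projection


if __name__ == "__main__":
    # Tiny volume: 3 slices, 1 row, 2 columns.
    volume = [[[1, 5]], [[4, 2]], [[3, 9]]]
    print(top_k_mip(volume, 2))
```

With k = 1 this reduces to the ordinary maximum intensity projection; larger k keeps additional bright structures that a single maximum would hide behind the brightest voxel on each ray.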

SEAL: Safety Enhanced Trajectory Planning and Control Framework for Quadrotor Flight in Complex Environments

Authors:Yiming Wang, Jianbin Ma, Junda Wu, Huizhe Li, Zhexuan Zhou, Youmin Gong, Jie Mei, Guangfu Ma
Date:2025-03-05 10:15:56

For quadrotors, achieving safe and autonomous flight in complex environments with wind disturbances and dynamic obstacles still poses significant challenges. Most existing methods address wind disturbances in either trajectory planning or control, which may lead to hazardous situations during flight. The emergence of dynamic obstacles would further worsen the situation. Therefore, we propose an efficient and reliable framework for quadrotors that incorporates wind disturbance estimations during both the planning and control phases via a generalized proportional integral observer. First, we develop a real-time adaptive spatial-temporal trajectory planner that utilizes Hamilton-Jacobi (HJ) reachability analysis for error dynamics resulting from wind disturbances. By considering the propagation of forward reachable sets on a Euclidean Signed Distance Field (ESDF) map, safety is guaranteed. Additionally, a Nonlinear Model Predictive Control (NMPC) controller considering wind disturbance compensation is implemented for robust trajectory tracking. Simulation and real-world experiments verify the effectiveness of our framework. The video and supplementary material will be available at https://github.com/Ma29-HIT/SEAL/.

Navigating Intelligence: A Survey of Google OR-Tools and Machine Learning for Global Path Planning in Autonomous Vehicles

Authors:Alexandre Benoit, Pedram Asef
Date:2025-03-05 10:12:22

We offer a new in-depth investigation of global path planning (GPP) for unmanned ground vehicles, focusing on an autonomous mining sampling robot named ROMIE. GPP is essential for ROMIE's optimal performance and translates into solving the traveling salesman problem, a complex graph theory challenge that is crucial for determining the most effective route to cover all sampling locations in a mining field. This problem is central to enhancing ROMIE's operational efficiency and competitiveness against human labor by optimizing cost and time. The primary aim of this research is to advance GPP by developing, evaluating, and improving a cost-efficient software and web application. We delve into an extensive comparison and analysis of Google operations research (OR)-Tools optimization algorithms. Our study is driven by the goal of applying and testing the limits of OR-Tools capabilities by integrating Reinforcement Learning techniques for the first time. This enables us to compare these methods with OR-Tools, assessing their computational effectiveness and real-world application efficiency. Our analysis seeks to provide insights into the effectiveness and practical application of each technique. Our findings indicate that Q-Learning stands out as the optimal strategy, demonstrating superior efficiency by deviating only 1.2% on average from the optimal solutions across our datasets.
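
A figure such as "1.2% from the optimal solutions" is an optimality gap: the relative excess of a method's tour length over the best possible tour. The sketch below shows how such a gap can be computed on a tiny instance. It uses neither OR-Tools nor Q-Learning; a brute-force search stands in for the optimum and a nearest-neighbour heuristic stands in for the method under test, and all function names are hypothetical.

```python
from itertools import permutations
from math import dist


def tour_length(points, order):
    """Length of the closed tour that visits points in the given order."""
    n = len(order)
    return sum(dist(points[order[i]], points[order[(i + 1) % n]]) for i in range(n))


def optimal_tour(points):
    """Exact optimum by brute force; only practical for a handful of points."""
    rest = range(1, len(points))
    candidates = ((0,) + perm for perm in permutations(rest))
    return min(candidates, key=lambda order: tour_length(points, order))


def nearest_neighbour_tour(points):
    """Greedy baseline: always move to the closest unvisited point."""
    unvisited = set(range(1, len(points)))
    order = [0]
    while unvisited:
        current = points[order[-1]]
        nxt = min(sorted(unvisited), key=lambda j: dist(current, points[j]))
        order.append(nxt)
        unvisited.remove(nxt)
    return tuple(order)


def gap_percent(candidate_length, optimal_length):
    """Relative deviation from the optimum, in percent."""
    return 100.0 * (candidate_length - optimal_length) / optimal_length


if __name__ == "__main__":
    sites = [(0, 0), (4, 0), (1, 1), (4, 3), (0, 3)]  # hypothetical sampling locations
    best = tour_length(sites, optimal_tour(sites))
    greedy = tour_length(sites, nearest_neighbour_tour(sites))
    print(f"gap: {gap_percent(greedy, best):.2f}%")
```

Averaging this gap over many instances gives a single summary number of the kind reported in the abstract.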

IoT Integration Protocol for Enhanced Hospital Care

Authors:Ellie Zontou, Antonia Kyprioti
Date:2025-03-05 10:07:48

This paper introduces the "IoT Integration Protocol for Enhanced Hospital Care", a comprehensive framework designed to leverage Internet of Things (IoT) technology to enhance patient care, improve operational efficiency, and ensure data security in hospital settings. With the growing emphasis on utilizing advanced technologies in healthcare, this protocol aims to harness the potential of IoT devices to optimize patient monitoring, enable remote care, and support clinical decision-making. By integrating IoT seamlessly into nursing workflows and patient care plans, hospitals can achieve higher levels of patient-centric care and real-time data insights, leading to better treatment outcomes and resource allocation. This paper outlines the protocol's objectives, key components, and expected benefits, while emphasizing the importance of ethical considerations and ongoing evaluation to ensure successful implementation.