planning - 2025-04-10

Neural Motion Simulator: Pushing the Limit of World Models in Reinforcement Learning

Authors:Chenjie Hao, Weyl Lu, Yifan Xu, Yubei Chen
Date:2025-04-09 17:59:32

An embodied system must not only model the patterns of the external world but also understand its own motion dynamics. A motion dynamic model is essential for efficient skill acquisition and effective planning. In this work, we introduce the neural motion simulator (MoSim), a world model that predicts the future physical state of an embodied system based on current observations and actions. MoSim achieves state-of-the-art performance in physical state prediction and provides competitive performance across a range of downstream tasks. This work shows that when a world model is accurate enough and performs precise long-horizon predictions, it can facilitate efficient skill acquisition in imagined worlds and even enable zero-shot reinforcement learning. Furthermore, MoSim can transform any model-free reinforcement learning (RL) algorithm into a model-based approach, effectively decoupling physical environment modeling from RL algorithm development. This separation allows for independent advancements in RL algorithms and world modeling, significantly improving sample efficiency and enhancing generalization capabilities. Our findings highlight that world models for motion dynamics are a promising direction for developing more versatile and capable embodied systems.
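The interface such a world model exposes, one-step prediction of the next physical state from the current state and action, iterated for long-horizon rollouts, can be sketched as follows; MoSim's actual architecture is not described in the abstract, so the linear dynamics here are a toy stand-in:

```python
import numpy as np

class ToyWorldModel:
    """Toy stand-in for a learned motion-dynamics model: maps
    (state, action) -> next state. MoSim's real architecture is not
    specified in the abstract; this only illustrates the interface."""

    def __init__(self, dim):
        rng = np.random.default_rng(0)
        # Hypothetical fixed "learned" dynamics: x' = A x + B u
        self.A = np.eye(dim) + 0.01 * rng.standard_normal((dim, dim))
        self.B = 0.1 * rng.standard_normal((dim, dim))

    def step(self, state, action):
        return self.A @ state + self.B @ action

    def rollout(self, state, actions):
        """Long-horizon prediction: iterate the one-step model."""
        states = [state]
        for a in actions:
            states.append(self.step(states[-1], a))
        return np.stack(states)

model = ToyWorldModel(dim=4)
traj = model.rollout(np.zeros(4), [np.ones(4)] * 10)
print(traj.shape)  # (11, 4): initial state plus 10 predicted steps
```

An RL agent can then be trained entirely against `rollout`, which is what decouples environment modeling from the choice of RL algorithm.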

AssistanceZero: Scalably Solving Assistance Games

Authors:Cassidy Laidlaw, Eli Bronstein, Timothy Guo, Dylan Feng, Lukas Berglund, Justin Svegliato, Stuart Russell, Anca Dragan
Date:2025-04-09 17:59:03

Assistance games are a promising alternative to reinforcement learning from human feedback (RLHF) for training AI assistants. Assistance games resolve key drawbacks of RLHF, such as incentives for deceptive behavior, by explicitly modeling the interaction between assistant and user as a two-player game where the assistant cannot observe their shared goal. Despite their potential, assistance games have only been explored in simple settings. Scaling them to more complex environments is difficult because it requires both solving intractable decision-making problems under uncertainty and accurately modeling human users' behavior. We present the first scalable approach to solving assistance games and apply it to a new, challenging Minecraft-based assistance game with over $10^{400}$ possible goals. Our approach, AssistanceZero, extends AlphaZero with a neural network that predicts human actions and rewards, enabling it to plan under uncertainty. We show that AssistanceZero outperforms model-free RL algorithms and imitation learning in the Minecraft-based assistance game. In a human study, our AssistanceZero-trained assistant significantly reduces the number of actions participants take to complete building tasks in Minecraft. Our results suggest that assistance games are a tractable framework for training effective AI assistants in complex environments. Our code and models are available at https://github.com/cassidylaidlaw/minecraft-building-assistance-game.

Self-Steering Language Models

Authors:Gabriel Grand, Joshua B. Tenenbaum, Vikash K. Mansinghka, Alexander K. Lew, Jacob Andreas
Date:2025-04-09 17:54:22

While test-time reasoning enables language models to tackle complex tasks, searching or planning in natural language can be slow, costly, and error-prone. But even when LMs struggle to emulate the precise reasoning steps needed to solve a problem, they often excel at describing its abstract structure--both how to verify solutions and how to search for them. This paper introduces DisCIPL, a method for "self-steering" LMs where a Planner model generates a task-specific inference program that is executed by a population of Follower models. Our approach equips LMs with the ability to write recursive search procedures that guide LM inference, enabling new forms of verifiable and efficient reasoning. When instantiated with a small Follower (e.g., Llama-3.2-1B), DisCIPL matches (and sometimes outperforms) much larger models, including GPT-4o and o1, on challenging constrained generation tasks. In decoupling planning from execution, our work opens up a design space of highly-parallelized Monte Carlo inference strategies that outperform standard best-of-N sampling, require no finetuning, and can be implemented automatically by existing LMs.
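For contrast with DisCIPL's planner-written inference programs, the best-of-N baseline it is compared against is simple to sketch; `sample` and `verify` below are hypothetical stand-ins for Follower calls and a task-specific constraint checker:

```python
import random

def best_of_n(sample, verify, n=16, seed=0):
    """Baseline best-of-N: draw n candidates from a generator and
    return the first that passes the verifier (None if all fail).
    DisCIPL instead has a Planner write the search procedure itself;
    here `sample`/`verify` are hypothetical stand-ins for Follower
    model calls and a task-specific checker."""
    rng = random.Random(seed)
    for _ in range(n):
        cand = sample(rng)
        if verify(cand):
            return cand
    return None

# Toy constrained-generation task: a 5-letter string with no vowels.
sample = lambda rng: "".join(rng.choice("abcdefg") for _ in range(5))
verify = lambda s: all(c not in "aeiou" for c in s)
result = best_of_n(sample, verify)
print(result)
```

The paper's Monte Carlo strategies generalize this loop: candidates are proposed and scored in parallel, with the verifier itself generated by the Planner rather than fixed in advance.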

Microlensing at Cosmological Distances: Event Rate Predictions in the Warhol Arc of MACS 0416

Authors:J. M. Palencia, J. M. Diego, L. Dai, M. Pascale, R. Windhorst, A. M. Koekemoer, Sung Kei Li, B. J. Kavanagh, Fengwu Sun, Amruth Alfred, Ashish K. Meena, Thomas J. Broadhurst, Patrick L. Kelly, Derek Perera, Hayley Williams, Adi Zitrin
Date:2025-04-09 16:54:22

Highly magnified stars ($\mu$ $>$ 100) are now routinely identified as transient events at cosmological distances thanks to microlensing by intra-cluster stars near the critical curves of galaxy clusters. Using the {\it James Webb} Space Telescope (JWST) in combination with the {\it Hubble} Space Telescope (HST), we outline here an analytical framework that is applied to the Warhol arc (at $z=0.94$) in the MACS 0416 galaxy cluster (at $z=0.396$) where over a dozen microlensed stars have been detected to date. This method is general and can be applied to other lensed arcs. Within this lensed galaxy we fit the spatially resolved SED spanned by eight JWST-NIRCam filters combined with three ACS filters, for accurate lensed star predictions in 2D. With this tool we can generate 2D maps of microlensed stars for well resolved arcs in general, including dependence on wavelength and limiting apparent magnitude, for comparison with planned cadenced campaigns for JWST and Hubble, for constraining directly the IMF and the level of dark matter substructure.

UAV Position Estimation using a LiDAR-based 3D Object Detection Method

Authors:Uthman Olawoye, Jason N. Gross
Date:2025-04-09 16:43:59

This paper explores the use of a deep learning approach for 3D object detection to compute the relative position of an Unmanned Aerial Vehicle (UAV) from an Unmanned Ground Vehicle (UGV) equipped with a LiDAR sensor in a GPS-denied environment. This was achieved by evaluating the LiDAR sensor's data through a 3D detection algorithm (PointPillars). The PointPillars algorithm incorporates a column voxel point-cloud representation and a 2D Convolutional Neural Network (CNN) to generate distinctive point-cloud features representing the object to be identified, in this case, the UAV. The current localization method utilizes point-cloud segmentation, Euclidean clustering, and predefined heuristics to obtain the relative position of the UAV. Results from the two methods were then compared to a reference truth solution.

Leveraging GCN-based Action Recognition for Teleoperation in Daily Activity Assistance

Authors:Thomas M. Kwok, Jiaan Li, Yue Hu
Date:2025-04-09 16:14:55

Caregiving of older adults is an urgent global challenge, with many older adults preferring to age in place rather than enter residential care. However, providing adequate home-based assistance remains difficult, particularly in geographically vast regions. Teleoperated robots offer a promising solution, but conventional motion-mapping teleoperation imposes unnatural movement constraints on operators, leading to muscle fatigue and reduced usability. This paper presents a novel teleoperation framework that leverages action recognition to enable intuitive remote robot control. Using our simplified Spatio-Temporal Graph Convolutional Network (S-ST-GCN), the system recognizes human actions and executes corresponding preset robot trajectories, eliminating the need for direct motion synchronization. A finite-state machine (FSM) is integrated to enhance reliability by filtering out misclassified actions. Our experiments demonstrate that the proposed framework enables effortless operator movement while ensuring accurate robot execution. This proof-of-concept study highlights the potential of teleoperation with action recognition for enabling caregivers to remotely assist older adults during activities of daily living (ADLs). Future work will focus on improving the S-ST-GCN's recognition accuracy and generalization, integrating advanced motion planning techniques to further enhance robotic autonomy in older adult care, and conducting a user study to evaluate the system's telepresence and ease of control.
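The FSM's filtering role can be illustrated with a minimal debounce machine that commits to an action only after `k` consecutive identical recognition results (the paper's actual state design is not given in the abstract):

```python
class ActionFSM:
    """Minimal finite-state machine that forwards a recognized action
    to the robot only after it has been seen for `k` consecutive
    frames, filtering out transient misclassifications. Illustrative
    sketch; not the paper's exact FSM."""

    def __init__(self, k=3):
        self.k = k
        self.candidate = None
        self.count = 0
        self.active = None  # action currently being executed

    def update(self, label):
        if label == self.candidate:
            self.count += 1
        else:
            self.candidate, self.count = label, 1
        if self.count >= self.k and label != self.active:
            self.active = label  # commit: trigger preset trajectory
        return self.active

fsm = ActionFSM(k=3)
stream = ["idle", "wave", "idle", "wave", "wave", "wave", "reach"]
states = [fsm.update(s) for s in stream]
print(states)  # spurious single-frame "wave"/"idle" flips are ignored
```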

Optimal promotions of new products on networks

Authors:Gadi Fibich, Amit Golan
Date:2025-04-09 15:23:10

We present a novel methodology for analyzing the optimal promotion in the Bass model for the spreading of new products on networks. For general networks with $M$ nodes, the optimal promotion is the solution of $2^M-1$ nonlinearly-coupled boundary-value problems. On structured networks, however, the number of equations can be reduced to a manageable size which is amenable to simulations and analysis. This enables us to gain insight into the effect of the network structure on optimal promotions. We find that the optimal advertising strategy decreases with time, whereas the optimal boosting of peer effects increases from zero and then decreases. In low-degree networks, it is optimal to prioritize advertising over boosting peer effects, but this relation is flipped in high-degree networks. When the planning horizon is finite, the optimal promotion continues until the last minute, as opposed to an infinite planning horizon where the optimal promotion decays to zero. Finally, promotions with short planning horizons can yield an order-of-magnitude larger increase in profits, compared to those with long planning horizons.
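In the fully homogeneous (mean-field) limit, the networked Bass model reduces to a single ODE that is easy to integrate numerically; the advertising control `a(t)` below is a hypothetical stand-in for the optimal promotion schedule the paper derives:

```python
import numpy as np

def bass_adoption(p, q, T=10.0, dt=0.01, advertising=None):
    """Euler integration of the mean-field Bass model
    dF/dt = (p + a(t) + q F)(1 - F), where F is the adopter fraction,
    p external influence, and q peer influence. The control a(t) is a
    hypothetical time-varying advertising boost; the paper's networked,
    optimally controlled version reduces to this only in the fully
    homogeneous limit."""
    steps = int(T / dt)
    F = np.zeros(steps + 1)
    for i in range(steps):
        a = advertising(i * dt) if advertising else 0.0
        F[i + 1] = F[i] + dt * (p + a + q * F[i]) * (1 - F[i])
    return F

# A decreasing advertising effort, the shape the abstract reports as optimal:
F = bass_adoption(p=0.03, q=0.4, advertising=lambda t: 0.05 * np.exp(-t))
print(round(float(F[-1]), 3))  # adopter fraction at the horizon
```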

Understanding The Effects of Geotechnical Properties on Viscous Erosion Rate from Plume Surface Interactions

Authors:B. Dotson, A. St. John, R. Hall, D. Sapkota, D. Britt, P. Metzger
Date:2025-04-09 14:51:21

With humans returning to the Moon under the Artemis program, understanding and mitigating effects from Plume Surface Interactions (PSI) will be essential for the protection of personnel and equipment on the Moon. To help characterize the underlying mechanics associated with viscous erosion and crater formation, experimental measurements using regolith simulants and subsonic, non-reacting flows were completed using compressed air in a splitter plate, plume cratering setup. More specifically, these investigations examined the underlying effects of bulk density, cohesion, and exhaust flow characteristics on viscous erosion rates and crater formation using Lunar highlands simulant (LHS-1), Lunar mare simulant (LMS-1), LHS-1D (Dust) simulants, and 40-80 $\mu$m glass beads in atmosphere. Results show that particle size distribution can ultimately influence crater shapes and erosion rates, likely owing to the internal angle of friction. Measurements show that increasing bulk density, especially from an uncompacted to a slightly compacted state, decreases erosion rate by as much as 50%. While cohesion of granular material can mitigate erosion rates to some extent, higher levels of cohesion above 1,000 Pa may actually increase viscous erosion rates due to particle clumping. A modified version of Metzger's (2024a) equation for volumetric erosion rate is presented, with its limitations discussed. These modified viscous erosion equations show that geotechnical properties play an important role in viscous erosion and should be considered in PSI computer models for future mission planning.

Longitudinal Assessment of Lung Lesion Burden in CT

Authors:Tejas Sudharshan Mathai, Benjamin Hou, Ronald M. Summers
Date:2025-04-09 14:30:43

In the U.S., lung cancer is the second leading cause of death. Early detection of suspicious lung nodules is crucial for patient treatment planning, management, and improving outcomes. Many approaches for lung nodule segmentation and volumetric analysis have been proposed, but few have looked at longitudinal changes in total lung tumor burden. In this work, we trained two 3D models (nnUNet) with and without anatomical priors to automatically segment lung lesions and quantified total lesion burden for each patient. The 3D model without priors significantly outperformed ($p < .001$) the model trained with anatomy priors. For detecting clinically significant lesions $>$ 1cm, a precision of 71.3\%, sensitivity of 68.4\%, and F1-score of 69.8\% were achieved. For segmentation, a Dice score of 77.1 $\pm$ 20.3 and Hausdorff distance error of 11.7 $\pm$ 24.1 mm were obtained. The median lesion burden was 6.4 cc (IQR: 2.1, 18.1) and the median volume difference between manual and automated measurements was 0.02 cc (IQR: -2.8, 1.2). Agreements were also evaluated with linear regression and Bland-Altman plots. The proposed approach can produce a personalized evaluation of the total tumor burden for a patient and facilitate interval change tracking over time.
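The two reported metrics, Dice overlap and total lesion burden in cc, are straightforward to compute from binary masks; the masks and voxel spacing below are synthetic, not from the paper's data:

```python
import numpy as np

def dice(pred, gt):
    """Dice overlap between two binary masks (in [0, 1])."""
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * inter / denom if denom else 1.0

def lesion_burden_cc(mask, spacing_mm=(1.0, 1.0, 1.0)):
    """Total lesion burden in cc: segmented voxel count times voxel
    volume. Illustrative only; the paper's nnUNet pipeline is not
    reproduced here."""
    voxel_mm3 = np.prod(spacing_mm)
    return mask.sum() * voxel_mm3 / 1000.0  # mm^3 -> cc

gt = np.zeros((10, 10, 10), dtype=bool); gt[2:6, 2:6, 2:6] = True
pred = np.zeros_like(gt); pred[3:6, 2:6, 2:6] = True
print(round(dice(pred, gt), 3), lesion_burden_cc(gt, (2.0, 2.0, 2.0)))
```

Interval change tracking then amounts to differencing `lesion_burden_cc` between a patient's scans.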

Combining high-contrast imaging with high-resolution spectroscopy: Actual on-sky MIRI/MRS results compared to expectations

Authors:S. Martos, A. Bidot, A. Carlotti, D. Mouillet
Date:2025-04-09 13:51:20

CONTEXT: Combining high-contrast imaging with high-resolution spectroscopy offers a powerful way to detect and characterize exoplanets around nearby stars, despite challenges linked to their faintness. Instruments like VLT/SPHERE are state of the art in high-contrast imaging, but their spectral resolution (R=50) limits them to basic characterization of close companions. These systems can detect planets down to 5-10 Mjup at 10 AU from their stars. Detection limits are mainly constrained by speckle noise, which dominates over photon and detector noise at short separations, even with advanced differential imaging. Space-based high-contrast imaging is also limited by image stability. Speckle noise can, however, be mitigated through molecular mapping, a technique that leverages high-resolution spectroscopic data. AIMS: We aim to predict detection limits in spectro-imaging after molecular mapping, analyzing how photon and detector noise propagate and comparing predictions with real data to assess performance losses from instrumental effects. We also propose mitigation strategies and validate our model using observations. METHODS: We analyzed JWST/MIRI/MRS data with FastCurves, a numerical tool, and compared results to outputs from the MIRI simulator. We also applied principal component analysis (PCA) to identify and isolate systematic effects, with and without molecular mapping. RESULTS: We studied various systematic effects and their impacts on signal and noise. PCA helped highlight and reduce straylight, fringes, and aliasing. We further compared observed and modeled companion spectra. CONCLUSIONS: FastCurves was improved to account for systematics and validated with real data. In high-flux regimes, systematics impose contrast limits even with molecular mapping. Our approach could benefit other instruments and inform the planning of future facilities like ELT/ANDES and ELT/PCS.

A Dataset of Software Bill of Materials for Evaluating SBOM Consumption Tools

Authors:Rio Kishimoto, Tetsuya Kanda, Yuki Manabe, Katsuro Inoue, Shi Qiu, Yoshiki Higo
Date:2025-04-09 13:35:02

A Software Bill of Materials (SBOM) is becoming an essential tool for effective software dependency management. An SBOM is a list of components used in software, including details such as component names, versions, and licenses. Using SBOMs, developers can quickly identify software components and assess whether their software depends on vulnerable libraries. Numerous tools support software dependency management through SBOMs, which can be broadly categorized into two types: tools that generate SBOMs and tools that utilize SBOMs. A substantial collection of accurate SBOMs is required to evaluate tools that utilize SBOMs. However, there is no publicly available dataset specifically designed for this purpose, and research on SBOM consumption tools remains limited. In this paper, we present a dataset of SBOMs to address this gap. The dataset we constructed comprises 46 SBOMs generated from real-world Java projects, with plans to expand it to include a broader range of projects across various programming languages. Accurate and well-structured SBOMs enable researchers to evaluate the functionality of SBOM consumption tools and identify potential issues. We collected 3,271 Java projects from GitHub and generated SBOMs for 798 of them using Maven with an open-source SBOM generation tool. These SBOMs were refined through both automatic and manual corrections to ensure accuracy, currently resulting in 46 SBOMs that comply with the SPDX Lite profile, which defines minimal requirements tailored to practical workflows in industry. This process also revealed issues with the SBOM generation tools themselves. The dataset is publicly available on Zenodo (DOI: 10.5281/zenodo.14233415).

Inducing Programmatic Skills for Agentic Tasks

Authors:Zora Zhiruo Wang, Apurva Gandhi, Graham Neubig, Daniel Fried
Date:2025-04-09 12:25:37

To succeed in common digital tasks such as web navigation, agents must carry out a variety of specialized tasks such as searching for products or planning a travel route. To tackle these tasks, agents can bootstrap themselves by learning task-specific skills online through interaction with the web environment. In this work, we demonstrate that programs are an effective representation for skills. We propose agent skill induction (ASI), which allows agents to adapt themselves by inducing, verifying, and utilizing program-based skills on the fly. We start with an evaluation on the WebArena agent benchmark and show that ASI outperforms the static baseline agent and its text-skill counterpart by 23.5% and 11.3% in success rate, mainly thanks to the programmatic verification guarantee during the induction phase. ASI also improves efficiency by reducing 10.7-15.3% of the steps over baselines, by composing primitive actions (e.g., click) into higher-level skills (e.g., search product). We then highlight the efficacy of ASI in remaining efficient and accurate under scaled-up web activities. Finally, we examine the generalizability of induced skills when transferring between websites, and find that ASI can effectively reuse common skills, while also updating incompatible skills to accommodate website changes.
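What a program-based skill looks like can be sketched with hypothetical web primitives; the actual ASI action space and induction procedure are defined by the paper, not reproduced here:

```python
# Sketch of a program-based skill composed from primitive web actions.
# The primitives, selectors, and the induced skill below are all
# hypothetical illustrations of the idea.

trace = []

def click(selector):
    trace.append(("click", selector))

def type_text(selector, text):
    trace.append(("type", selector, text))

def search_product(query):
    """An induced higher-level skill: one call replaces several
    primitive steps, which is where the step-count savings come from."""
    click("#search-box")
    type_text("#search-box", query)
    click("#search-submit")

search_product("usb-c cable")
print(len(trace))  # 3 primitive actions issued by one skill call
```

Because the skill is a program, it can be verified by execution before being added to the agent's library, which is the "programmatic verification guarantee" the abstract credits for the success-rate gains.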

How do Copilot Suggestions Impact Developers' Frustration and Productivity?

Authors:Emanuela Guglielmi, Venera Arnoudova, Gabriele Bavota, Rocco Oliveto, Simone Scalabrino
Date:2025-04-09 11:55:22

Context. AI-based development tools, such as GitHub Copilot, are transforming the software development process by offering real-time code suggestions. These tools promise to improve productivity by reducing cognitive load and speeding up task completion. Previous exploratory studies, however, show that developers sometimes perceive the automatic suggestions as intrusive. As a result, they feel like their productivity decreased. Theory. We propose two theories on the impact of automatic suggestions on frustration and productivity. First, we hypothesize that experienced developers are frustrated by automatic suggestions (mostly by irrelevant ones), and this also negatively impacts their productivity. Second, we conjecture that novice developers benefit from automatic suggestions, which reduce the frustration caused by being stuck on a technical problem and thus increase their productivity. Objective. We plan to conduct a quasi-experimental study to test our theories. The empirical evidence we will collect will allow us to either corroborate or reject our theories. Method. We will involve at least 32 developers, both experts and novices. We will ask each of them to complete two software development tasks, one with automatic suggestions enabled and one with them disabled, allowing for within-subject comparisons. We will measure independent and dependent variables by monitoring developers' actions through an IDE plugin and screen recording. Besides, we will collect physiological data through a wearable device. We will use statistical hypothesis tests to study the effects of the treatments (i.e., automatic suggestions enabled/disabled) on the outcomes (frustration and productivity).

Compatibility of Missing Data Handling Methods across the Stages of Producing Clinical Prediction Models

Authors:Antonia Tsvetanova, Matthew Sperrin, David A. Jenkins, Niels Peek, Iain Buchan, Stephanie Hyland, Marcus Taylor, Angela Wood, Richard D. Riley, Glen P. Martin
Date:2025-04-09 11:45:10

Missing data is a challenge when developing, validating and deploying clinical prediction models (CPMs). Traditionally, decisions concerning missing data handling during CPM development and validation haven't accounted for whether missingness is allowed at deployment. We hypothesised that the missing data approach used during model development should optimise model performance upon deployment, whilst the approach used during model validation should yield unbiased predictive performance estimates upon deployment; we term this compatibility. We aimed to determine which combinations of missing data handling methods across the CPM life cycle are compatible. We considered scenarios where CPMs are intended to be deployed with missing data allowed or not, and we evaluated the impact of that choice on earlier modelling decisions. Through a simulation study and an empirical analysis of thoracic surgery data, we compared CPMs developed and validated using combinations of complete case analysis, mean imputation, single regression imputation, multiple imputation, and pattern sub-modelling. If planning to deploy a CPM without allowing missing data, then development and validation should use multiple imputation when required. Where missingness is allowed at deployment, the same imputation method must be used during development and validation. Commonly used combinations of missing data handling methods result in biased predictive performance estimates.
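The compatibility requirement, reusing the same fitted imputation model across development, validation, and deployment, can be sketched as follows; simple column-mean imputation is used only for brevity, whereas the paper recommends multiple imputation in most settings:

```python
import numpy as np

def fit_imputer(X_train):
    """Learn column means on development data. Reusing these exact
    statistics at validation and deployment is what keeps the missing
    data handling consistent across the CPM life cycle."""
    return np.nanmean(X_train, axis=0)

def impute(X, means):
    """Fill NaNs with the stored development-time column means."""
    X = X.copy()
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = np.take(means, cols)
    return X

X_dev = np.array([[1.0, np.nan], [3.0, 4.0]])
means = fit_imputer(X_dev)                        # [2.0, 4.0]
X_new = impute(np.array([[np.nan, 5.0]]), means)  # deployment-time row
print(X_new)  # [[2. 5.]]
```

An incompatible pipeline would, for example, refit the imputer on validation data or switch to complete case analysis at deployment, which is the kind of mismatch the paper shows biases performance estimates.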

Adaptive Human-Robot Collaborative Missions using Hybrid Task Planning

Authors:Gricel Vázquez, Alexandros Evangelidis, Sepeedeh Shahbeigi, Simos Gerasimou
Date:2025-04-09 10:07:15

Producing robust task plans in human-robot collaborative missions is a critical activity in order to increase the likelihood of these missions completing successfully. Despite the broad research body in the area, which considers different classes of constraints and uncertainties, its applicability is confined to relatively simple problems that can be comfortably addressed by the underpinning mathematically-based or heuristic-driven solver engines. In this paper, we introduce a hybrid approach that effectively solves the task planning problem by decomposing it into two intertwined parts, starting with the identification of a feasible plan and followed by its uncertainty augmentation and verification yielding a set of Pareto optimal plans. To enhance its robustness, adaptation tactics are devised for the evolving system requirements and agents' capabilities. We demonstrate our approach through an industrial case study involving workers and robots undertaking activities within a vineyard, showcasing the benefits of our hybrid approach both in the generation of feasible solutions and scalability compared to native planners.

nnLandmark: A Self-Configuring Method for 3D Medical Landmark Detection

Authors:Alexandra Ertl, Shuhan Xiao, Stefan Denner, Robin Peretzke, David Zimmerer, Peter Neher, Fabian Isensee, Klaus Maier-Hein
Date:2025-04-09 09:53:39

Landmark detection plays a crucial role in medical imaging tasks that rely on precise spatial localization, including specific applications in diagnosis, treatment planning, image registration, and surgical navigation. However, manual annotation is labor-intensive and requires expert knowledge. While deep learning shows promise in automating this task, progress is hindered by limited public datasets, inconsistent benchmarks, and non-standardized baselines, restricting reproducibility, fair comparisons, and model generalizability. This work introduces nnLandmark, a self-configuring deep learning framework for 3D medical landmark detection, adapting nnU-Net to perform heatmap-based regression. By leveraging nnU-Net's automated configuration, nnLandmark eliminates the need for manual parameter tuning, offering out-of-the-box usability. It achieves state-of-the-art accuracy across two public datasets, with a mean radial error (MRE) of 1.5 mm on the Mandibular Molar Landmark (MML) dental CT dataset and 1.2 mm for anatomical fiducials on a brain MRI dataset (AFIDs), where nnLandmark aligns with the inter-rater variability of 1.5 mm. With its strong generalization, reproducibility, and ease of deployment, nnLandmark establishes a reliable baseline for 3D landmark detection, supporting research in anatomical localization and clinical workflows that depend on precise landmark identification. The code will be available soon.
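Heatmap-based landmark regression decodes a predicted score volume into a coordinate; a minimal argmax decoder (the abstract does not detail nnLandmark's decoding, and sub-voxel refinement around the peak is a common variant) looks like:

```python
import numpy as np

def heatmap_to_landmark(heatmap, spacing_mm=(1.0, 1.0, 1.0)):
    """Decode a predicted 3D heatmap into a landmark coordinate (mm)
    by taking the argmax voxel and scaling by voxel spacing. Argmax is
    the simplest choice; nnLandmark's exact decoding is not given in
    the abstract."""
    idx = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return np.array(idx) * np.array(spacing_mm)

# Synthetic Gaussian heatmap peaked at voxel (8, 12, 4):
zz, yy, xx = np.mgrid[0:16, 0:16, 0:16]
hm = np.exp(-((zz - 8)**2 + (yy - 12)**2 + (xx - 4)**2) / 8.0)
print(heatmap_to_landmark(hm, spacing_mm=(0.5, 0.5, 0.5)))  # [4. 6. 2.]
```

The mean radial error reported in the abstract is then the average Euclidean distance between such decoded points and expert annotations.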

Optimal Execution and Macroscopic Market Making

Authors:Ivan Guo, Shijia Jin
Date:2025-04-09 09:18:37

We propose a stochastic game modelling the strategic interaction between market makers and traders of optimal execution type. For traders, the permanent price impact commonly attributed to them is replaced by quoting strategies implemented by market makers. For market makers, order flows become endogenous, driven by tactical traders rather than assumed exogenously. Using the forward-backward stochastic differential equation (FBSDE) characterization of Nash equilibria, we establish a local well-posedness result for the general game. In the specific Almgren-Chriss-Avellaneda-Stoikov model, a decoupling approach guarantees the global well-posedness of the FBSDE system via the well-posedness of an associated backward stochastic Riccati equation. Finally, by introducing small diffusion terms into the inventory processes, global well-posedness is achieved for the approximation game.

SHiP experiment at the SPS Beam Dump Facility

Authors:SHiP Collaboration, HI-ECN3 Project Team
Date:2025-04-09 08:53:20

In 2024, the SHiP experiment, together with the associated Beam Dump Facility (BDF) under the auspices of the High Intensity ECN3 (HI-ECN3) project, was selected for the future physics exploitation of the ECN3 experimental facility at the SPS. The SHiP experiment is a general-purpose intensity-frontier setup designed to search for physics beyond the Standard Model in the domain of Feebly Interacting Particles at the GeV-scale. It comprises a multi-system apparatus that provides discovery sensitivity to both decay and scattering signatures of models with feebly interacting particles, such as dark-sector mediators, both elastic and inelastic light dark matter, as well as millicharged particles. The experiment will also be able to perform both Standard Model measurements and Beyond Standard Model searches with neutrino interactions. In particular, it will have access to unprecedented statistics of tau and anti-tau neutrinos. The construction plan foresees commissioning of the facility and detector, and start of operation in advance of Long Shutdown 4, with a programme of exploration for 15 years of data taking. By exploring unique regions of parameter space for feebly interacting particles in the GeV/c$^2$ mass range, the SHiP experiment will complement ongoing searches at the LHC and searches at future colliders.

RAMBO: RL-augmented Model-based Optimal Control for Whole-body Loco-manipulation

Authors:Jin Cheng, Dongho Kang, Gabriele Fadini, Guanya Shi, Stelian Coros
Date:2025-04-09 07:53:09

Loco-manipulation -- coordinated locomotion and physical interaction with objects -- remains a major challenge for legged robots due to the need for both accurate force interaction and robustness to unmodeled dynamics. While model-based controllers provide interpretable dynamics-level planning and optimization, they are limited by model inaccuracies and computational cost. In contrast, learning-based methods offer robustness while struggling with precise modulation of interaction forces. We introduce RAMBO -- RL-Augmented Model-Based Optimal Control -- a hybrid framework that integrates model-based reaction force optimization using a simplified dynamics model and a feedback policy trained with reinforcement learning. The model-based module generates feedforward torques by solving a quadratic program, while the policy provides feedback residuals to enhance robustness in control execution. We validate our framework on a quadruped robot across a diverse set of real-world loco-manipulation tasks -- such as pushing a shopping cart, balancing a plate, and holding soft objects -- in both quadrupedal and bipedal walking. Our experiments demonstrate that RAMBO enables precise manipulation while achieving robust and dynamic locomotion, surpassing the performance of policies trained with an end-to-end scheme. In addition, our method enables a flexible trade-off between end-effector tracking accuracy and compliance.

Domain-Conditioned Scene Graphs for State-Grounded Task Planning

Authors:Jonas Herzog, Jiangpin Liu, Yue Wang
Date:2025-04-09 07:51:46

Recent robotic task planning frameworks have integrated large multimodal models (LMMs) such as GPT-4V. To address grounding issues of such models, it has been suggested to split the pipeline into perceptional state grounding and subsequent state-based planning. As we show in this work, the state grounding ability of LMM-based approaches is still limited by weaknesses in granular, structured, domain-specific scene understanding. To address this shortcoming, we develop a more structured state grounding framework that features a domain-conditioned scene graph as its scene representation. We show that such a representation is actionable in nature as it is directly mappable to a symbolic state in classical planning languages such as PDDL. We provide an instantiation of our state grounding framework where the domain-conditioned scene graph generation is implemented with a lightweight vision-language approach that classifies domain-specific predicates on top of domain-relevant object detections. Evaluated across three domains, our approach achieves significantly higher state estimation accuracy and task planning success rates compared to the previous LMM-based approaches.

Overcoming Dynamic Environments: A Hybrid Approach to Motion Planning for Manipulators

Authors:Ho Minh Quang Ngo, Dac Dang Khoa Nguyen, Dinh Tung Le, Gavin Paul
Date:2025-04-09 05:46:41

Robotic manipulators operating in dynamic and uncertain environments require efficient motion planning to navigate obstacles while maintaining smooth trajectories. Velocity Potential Field (VPF) planners offer real-time adaptability but struggle with complex constraints and local minima, leading to suboptimal performance in cluttered spaces. Traditional approaches rely on pre-planned trajectories, but frequent recomputation is computationally expensive. This study proposes a hybrid motion planning approach, integrating an improved VPF with a Sampling-Based Motion Planner (SBMP). The SBMP ensures optimal path generation, while VPF provides real-time adaptability to dynamic obstacles. This combination enhances motion planning efficiency, stability, and computational feasibility, addressing key challenges in uncertain environments such as warehousing and surgical robotics.
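The VPF side of the hybrid can be sketched as a single velocity command combining an attractive term toward the goal and a classic repulsive term near obstacles; the gains and repulsion law below are textbook choices, not the paper's improved formulation:

```python
import numpy as np

def vpf_velocity(pos, goal, obstacles, k_att=1.0, k_rep=0.5, rho0=1.0):
    """One velocity-potential-field step: attractive pull toward the
    goal plus repulsive push from obstacles within influence radius
    rho0. Textbook sketch; the paper's improved VPF and the
    sampling-based global planner it is paired with are not
    reproduced here."""
    v = k_att * (goal - pos)
    for obs in obstacles:
        d = pos - obs
        rho = np.linalg.norm(d)
        if 1e-9 < rho < rho0:
            # classic repulsive gradient, vanishing at the radius rho0
            v += k_rep * (1.0 / rho - 1.0 / rho0) / rho**2 * (d / rho)
    return v

pos, goal = np.zeros(2), np.array([2.0, 0.0])
v_free = vpf_velocity(pos, goal, obstacles=[])
v_block = vpf_velocity(pos, goal, obstacles=[np.array([0.5, 0.1])])
print(v_free, v_block)  # the obstacle deflects the commanded velocity
```

The hybrid scheme then uses a sampling-based planner to supply waypoints that steer this local controller away from the local minima pure potential fields are prone to.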

CAFE-AD: Cross-Scenario Adaptive Feature Enhancement for Trajectory Planning in Autonomous Driving

Authors:Junrui Zhang, Chenjie Wang, Jie Peng, Haoyu Li, Jianmin Ji, Yu Zhang, Yanyong Zhang
Date:2025-04-09 05:16:29

Imitation learning based planning tasks on the nuPlan dataset have gained great interest due to their potential to generate human-like driving behaviors. However, open-loop training on the nuPlan dataset tends to cause causal confusion during closed-loop testing, and the dataset also presents a long-tail distribution of scenarios. These issues introduce challenges for imitation learning. To tackle these problems, we introduce CAFE-AD, a Cross-Scenario Adaptive Feature Enhancement for Trajectory Planning in Autonomous Driving method, designed to enhance feature representation across various scenario types. We develop an adaptive feature pruning module that ranks feature importance to capture the most relevant information while reducing the interference of noisy information during training. Moreover, we propose a cross-scenario feature interpolation module that enhances scenario information to introduce diversity, enabling the network to alleviate over-fitting in dominant scenarios. We evaluate our method CAFE-AD on the challenging public nuPlan Test14-Hard closed-loop simulation benchmark. The results demonstrate that CAFE-AD outperforms state-of-the-art methods including rule-based and hybrid planners, and exhibits potential for mitigating the impact of long-tail distribution within the dataset. Additionally, we further validate its effectiveness in real-world environments. The code and models will be made available at https://github.com/AlniyatRui/CAFE-AD.

Hydrodynamic Modelling of Early Peaks in Type Ibc Supernovae with Shock Cooling Emission from Circumstellar Matter

Authors:Ryotaro Chiba, Takashi J. Moriya
Date:2025-04-08 21:33:00

Recent high-cadence transient surveys have uncovered a rare subclass of Type Ibc supernovae (SNe) that exhibit an early, blue peak lasting a few days before the main, radioactively powered peak. Since progenitors of Type Ibc SNe are typically compact and lack an extended envelope, this early peak is commonly attributed to shock cooling emission from circumstellar matter (CSM) surrounding the progenitor star. As such, these SNe provide a unique opportunity to constrain the pre-explosion activity of Type Ibc SN progenitors. We present the first systematic study of this Type Ibc SN population that uses hydrodynamic modelling. We simulated Type Ibc SNe exploding in a CSM using the multi-group radiation-hydrodynamics code STELLA, exploring a range of SN and CSM properties. By comparing the resulting theoretical multi-band light curves to a sample of seven Type Ibc SNe with early peaks, we constrained their CSM properties. Assuming a wind-like density distribution for the CSM, we found CSM masses of $10^{-2} - 10^{-1} \ \mathrm{M}_\odot$ and CSM radii of $(1 - 5) \times 10^3 \ \mathrm{R}_\odot$. While the masses were roughly consistent with a previous estimate obtained using an analytical model, the radii were significantly different, likely due to our assumption of spatially spread out CSM. We infer that the progenitors could have created CSM via late-time binary mass transfer or pulsational pair instability. We also estimate that, in the planned ULTRASAT high-cadence survey, $\sim 30$ shock cooling peaks from Type Ibc SNe will be observed.
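The "wind-like density distribution" assumed for the CSM is presumably the standard steady-wind profile; in that convention the density and the enclosed CSM mass between the progenitor radius and the CSM outer edge would be (symbols $\dot{M}$, $v_{\mathrm{wind}}$, $R_{*}$ denote the usual mass-loss rate, wind velocity, and progenitor radius; the paper's exact parameterization may differ):

```latex
\rho_{\mathrm{CSM}}(r) = \frac{\dot{M}}{4\pi r^{2} v_{\mathrm{wind}}}
  \quad (R_{*} \le r \le R_{\mathrm{CSM}}),
\qquad
M_{\mathrm{CSM}} = \int_{R_{*}}^{R_{\mathrm{CSM}}} 4\pi r^{2}\,\rho_{\mathrm{CSM}}(r)\,dr
  = \frac{\dot{M}}{v_{\mathrm{wind}}}\left(R_{\mathrm{CSM}} - R_{*}\right).
```

Under this profile the mass scales linearly with the outer radius, which is why the assumption of spatially spread-out CSM can shift the inferred radii relative to analytical models while leaving the mass estimates roughly consistent.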

Bayesian estimation for conditional probabilities associated to directed acyclic graphs: study of hospitalization of severe influenza cases

Authors:Lesly Acosta, Carmen Armero
Date:2025-04-08 21:19:32

This paper presents a Bayesian inferential framework for estimating joint, conditional, and marginal probabilities in directed acyclic graphs (DAGs) applied to the study of the progression of hospitalized patients with severe influenza. Using data from the PIDIRAC retrospective cohort study in Catalonia, we model patient pathways from admission through different stages of care until discharge, death, or transfer to a long-term care facility. Direct transition probabilities are estimated through a Bayesian approach combining conjugate Dirichlet-multinomial inferential processes, while posterior distributions associated with absorbing-state or inverse probabilities are assessed via simulation techniques. Bayesian methodology quantifies uncertainty through posterior distributions, providing insights into disease progression and improving hospital resource planning during seasonal influenza peaks. These results support more effective patient management and decision making in healthcare systems. Keywords: Confirmed influenza hospitalization; Directed acyclic graphs (DAGs); Dirichlet-multinomial Bayesian inferential process; Healthcare decision-making; Transition probabilities.
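The Dirichlet-multinomial conjugacy at the core of this approach is compact enough to sketch. The state names and counts below are invented for illustration (the actual estimates come from the PIDIRAC cohort), but the update rule, posterior prior plus counts, and the simulation-based interval are the standard mechanics the abstract describes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical transition counts out of one ward state, e.g. to
# (discharge, ICU, death); illustrative numbers only
counts = np.array([120, 25, 5])
prior = np.ones_like(counts)           # uniform Dirichlet(1, 1, 1) prior

# Conjugacy: Dirichlet prior + multinomial counts -> Dirichlet posterior
posterior_alpha = prior + counts

# Posterior mean of each direct transition probability
post_mean = posterior_alpha / posterior_alpha.sum()

# Simulation-based summaries for derived quantities, as in the paper's
# treatment of absorbing-state and inverse probabilities
draws = rng.dirichlet(posterior_alpha, size=10_000)
ci_low, ci_high = np.percentile(draws[:, 0], [2.5, 97.5])
```

Quantities that are not conjugate, such as the probability of eventually reaching an absorbing state, can then be computed per draw over the sampled transition matrices rather than in closed form.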

Extended Version: Multi-Robot Motion Planning with Cooperative Localization

Authors:Anne Theurkauf, Nisar Ahmed, Morteza Lahijanian
Date:2025-04-08 20:58:19

We consider the uncertain multi-robot motion planning (MRMP) problem with cooperative localization (CL-MRMP), under both motion and measurement noise, where each robot can act as a sensor for its nearby teammates. We formalize CL-MRMP as a chance-constrained motion planning problem, and propose a safety-guaranteed algorithm that explicitly accounts for robot-robot correlations. Our approach extends a sampling-based planner to solve CL-MRMP while preserving probabilistic completeness. To improve efficiency, we introduce novel biasing techniques. We evaluate our method across diverse benchmarks, demonstrating its effectiveness in generating motion plans, with significant performance gains from biasing strategies.
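A chance constraint of the kind used here admits a deterministic equivalent when the state estimate is Gaussian. The sketch below checks a single linear half-plane constraint against a Gaussian position belief; the function name, the 2D example, and the 5% risk bound are illustrative, not the paper's formulation.

```python
import math
from statistics import NormalDist

def satisfies_chance_constraint(mu, Sigma, a, b, delta):
    """Check P(a . x <= b) >= 1 - delta for Gaussian x ~ N(mu, Sigma).
    Deterministic equivalent: a.mu + z_{1-delta} * sqrt(a' Sigma a) <= b."""
    z = NormalDist().inv_cdf(1 - delta)
    mean = sum(ai * mi for ai, mi in zip(a, mu))
    var = sum(a[i] * Sigma[i][j] * a[j]
              for i in range(len(a)) for j in range(len(a)))
    return mean + z * math.sqrt(var) <= b

# Robot believed at (1, 0) with isotropic covariance 0.01; wall at x = 1.5;
# require the constraint x <= 1.5 to hold with 95% probability
mu = [1.0, 0.0]
Sigma = [[0.01, 0.0], [0.0, 0.01]]
ok = satisfies_chance_constraint(mu, Sigma, a=[1.0, 0.0], b=1.5, delta=0.05)
too_close = satisfies_chance_constraint(mu, Sigma, a=[1.0, 0.0], b=1.1, delta=0.05)
```

What cooperative localization changes is Sigma itself: robot-robot measurements introduce cross-correlations that shrink the joint covariance, which is why the planner must track those correlations explicitly rather than treating each robot's belief independently.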

Low Rank Learning for Offline Query Optimization

Authors:Zixuan Yi, Yao Tian, Zachary G. Ives, Ryan Marcus
Date:2025-04-08 19:41:19

Recent deployments of learned query optimizers use expensive neural networks and ad-hoc search policies. To address these issues, we introduce \textsc{LimeQO}, a framework for offline query optimization leveraging low-rank learning to efficiently explore alternative query plans with minimal resource usage. By modeling the workload as a partially observed, low-rank matrix, we predict unobserved query plan latencies using purely linear methods, significantly reducing computational overhead compared to neural networks. We formalize offline exploration as an active learning problem, and present simple heuristics that reduce a 3-hour workload to 1.5 hours after just 1.5 hours of exploration. Additionally, we propose a transductive Tree Convolutional Neural Network (TCNN) that, despite higher computational costs, achieves the same workload reduction with only 0.5 hours of exploration. Unlike previous approaches that place expensive neural networks directly in the query processing ``hot'' path, our approach offers a low-overhead solution and a no-regressions guarantee, all without making assumptions about the underlying DBMS. The code is available in \href{https://github.com/zixy17/LimeQO}{https://github.com/zixy17/LimeQO}.
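The core idea, completing a partially observed latency matrix with purely linear methods, can be sketched with alternating least squares. The toy workload below (a synthetic rank-2 matrix of query-by-hint-set latencies, 70% observed) is an illustrative assumption, not LimeQO's actual data or code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy workload: latency matrix (queries x hint-sets), truly rank-2
n_q, n_h, rank = 30, 8, 2
true = rng.uniform(1, 2, (n_q, rank)) @ rng.uniform(1, 2, (rank, n_h))

mask = rng.random((n_q, n_h)) < 0.7     # which (query, plan) pairs were timed
U = rng.standard_normal((n_q, rank))
V = rng.standard_normal((n_h, rank))
lam = 1e-3                              # small ridge term for stability

# Alternating least squares: fix V, solve each row of U by ridge regression
# on that query's observed latencies, then symmetrically update V
for _ in range(50):
    for i in range(n_q):
        Vi = V[mask[i]]
        U[i] = np.linalg.solve(Vi.T @ Vi + lam * np.eye(rank),
                               Vi.T @ true[i, mask[i]])
    for j in range(n_h):
        Uj = U[mask[:, j]]
        V[j] = np.linalg.solve(Uj.T @ Uj + lam * np.eye(rank),
                               Uj.T @ true[mask[:, j], j])

pred = U @ V.T                          # fills in the unobserved latencies
err = np.abs(pred[~mask] - true[~mask]).mean() / true[~mask].mean()
```

Each update is a tiny linear solve, which is the source of the low overhead relative to a neural cost model: the expensive part of exploration is timing plans, not fitting the factors.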

Towards practicable Machine Learning development using AI Engineering Blueprints

Authors:Nicolas Weeger, Annika Stiehl, Jóakim vom Kistowski, Stefan Geißelsöder, Christian Uhl
Date:2025-04-08 19:28:05

The implementation of artificial intelligence (AI) in business applications holds considerable promise for significant improvements. The development of AI systems is becoming increasingly complex, thereby underscoring the growing importance of AI engineering and MLOps techniques. Small and medium-sized enterprises (SMEs) face considerable challenges when implementing AI in their products or processes. These enterprises often lack the necessary resources and expertise to develop, deploy, and operate AI systems that are tailored to address their specific problems. Given the lack of studies on the application of AI engineering practices, particularly in the context of SMEs, this paper proposes a research plan designed to develop blueprints for the creation of proprietary machine learning (ML) models using AI engineering and MLOps practices. These blueprints enable SMEs to develop, deploy, and operate AI systems by providing reference architectures and suitable automation approaches for different types of ML. The efficacy of the blueprints is assessed through their application to a series of field projects. This process gives rise to further requirements and additional development loops for the purpose of generalization. The benefits of using the blueprints for organizations are demonstrated by observing the process of developing ML models and by conducting interviews with the developers.

Review, Definition and Challenges of Electrical Energy Hubs

Authors:Giacomo Bastianel, Jan Kircheis, Merijn Van Deyck, Dongyeong Lee, Geraint Chaffey, Marta Vanin, Hakan Ergun, Jef Beerten, Dirk Van Hertem
Date:2025-04-08 18:44:37

To transition towards a carbon-neutral power system, considerable amounts of renewable energy generation capacity are being installed in the North Sea area. Consequently, projects aggregating many gigawatts of power generation capacity and transmitting renewable energy to the main load centers are being developed. Given the electrical challenges arising from having bulk power capacity in a compact geographical area with several connections to the main grid, and the lack of a robust definition identifying the type of system under study, this paper proposes a general technical definition of such projects introducing the term Electrical Energy Hub (EEH). The concept, purpose, and functionalities of EEHs are introduced in the text, emphasizing the importance of a clear technical definition for future planning procedures, grid codes, regulations, and support schemes for EEHs and multiterminal HVDC (MTDC) grids in general. Furthermore, the unique electrical challenges associated with integrating EEHs into the power system are discussed. Three research areas of concern are identified, namely control, planning, and protection. Through this analysis, insights are provided into the effective implementation of multi-GW scale EEH projects and their integration into the power grid through multiple interconnections. Finally, a list of ongoing and planned grid development projects is evaluated to assess whether they fall within the EEH category.

Underwater Robotic Simulators Review for Autonomous System Development

Authors:Sara Aldhaheri, Yang Hu, Yongchang Xie, Peng Wu, Dimitrios Kanoulas, Yuanchang Liu
Date:2025-04-08 17:43:48

The increasing complexity of underwater robotic systems has led to a surge in simulation platforms designed to support perception, planning, and control tasks in marine environments. However, selecting the most appropriate underwater robotic simulator (URS) remains a challenge due to wide variations in fidelity, extensibility, and task suitability. This paper presents a comprehensive review and comparative analysis of five state-of-the-art, ROS-compatible, open-source URSs: Stonefish, DAVE, HoloOcean, MARUS, and UNav-Sim. Each simulator is evaluated across architectural design, sensor fidelity and physics modeling, environmental realism, task capabilities, sim-to-real capabilities, and research impact. Additionally, we discuss ongoing challenges in sim-to-real transfer and highlight the need for standardization and benchmarking in the field. Our findings aim to guide practitioners in selecting effective simulation environments and inform future development of more robust and transferable URSs.

Exploring Adversarial Obstacle Attacks in Search-based Path Planning for Autonomous Mobile Robots

Authors:Adrian Szvoren, Jianwei Liu, Dimitrios Kanoulas, Nilufer Tuptuk
Date:2025-04-08 15:48:26

Path planning algorithms, such as the search-based A*, are a critical component of autonomous mobile robotics, enabling robots to navigate from a starting point to a destination efficiently and safely. We investigated the resilience of the A* algorithm in the face of potential adversarial interventions known as obstacle attacks. The adversary's goal is to delay the robot's timely arrival at its destination by introducing obstacles along its original path. We developed malicious software to execute the attacks and conducted experiments to assess their impact, both in simulation using TurtleBot in Gazebo and in real-world deployment with the Unitree Go1 robot. In simulation, the attacks resulted in an average delay of 36\%, with the most significant delays occurring in scenarios where the robot was forced to take substantially longer alternative paths. In real-world experiments, the delays were even more pronounced, with all attacks successfully rerouting the robot and causing measurable disruptions. These results highlight that the algorithm's robustness is not solely an attribute of its design but is significantly influenced by the operational environment. For example, in constrained environments like tunnels, the delays were maximized due to the limited availability of alternative routes.
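The attack pattern studied here, block a cell on the robot's current path and force a costlier re-plan, can be demonstrated with a small grid A*. The corridor world below mirrors the tunnel observation: when alternatives are scarce, one well-placed obstacle imposes a large detour. The grid layout and the attack heuristic (block the gap the current path uses) are illustrative assumptions, not the authors' attack software.

```python
import heapq
import itertools

def a_star(grid, start, goal):
    """A* over a 4-connected grid; grid[r][c] == 1 marks an obstacle."""
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    tie = itertools.count()                 # tiebreaker so heap never compares nodes
    frontier = [(h(start), 0, next(tie), start, None)]
    parent, best_g = {}, {start: 0}
    while frontier:
        _, g, _, cur, par = heapq.heappop(frontier)
        if cur in parent:
            continue                        # already expanded via a cheaper route
        parent[cur] = par
        if cur == goal:                     # walk parents back to the start
            path = [cur]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])
                    and grid[nxt[0]][nxt[1]] == 0
                    and g + 1 < best_g.get(nxt, float("inf"))):
                best_g[nxt] = g + 1
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, next(tie), nxt, cur))
    return None

# Corridor world: a wall on column 2 with gaps only at the top and bottom rows
grid = [[0] * 5 for _ in range(5)]
for r in (1, 2, 3):
    grid[r][2] = 1

start, goal = (1, 0), (1, 4)
original = a_star(grid, start, goal)        # short route through the top gap

# Attack: block the gap the current path relies on, forcing the long detour
gap = next(p for p in original if p[1] == 2)
grid[gap[0]][gap[1]] = 1
attacked = a_star(grid, start, goal)
delay = len(attacked) - len(original)       # extra moves caused by one obstacle
```

In this toy corridor a single blocked cell adds four moves to a six-move path, a 67% delay; on an open grid the same attack would often cost nothing, which matches the paper's point that robustness depends heavily on the operational environment.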