planning - 2025-08-21

Virtual Community: An Open World for Humans, Robots, and Society

Authors:Qinhong Zhou, Hongxin Zhang, Xiangye Lin, Zheyuan Zhang, Yutian Chen, Wenjun Liu, Zunzhe Zhang, Sunli Chen, Lixing Fang, Qiushi Lyu, Xinyu Sun, Jincheng Yang, Zeyuan Wang, Bao Chi Dang, Zhehuan Chen, Daksha Ladia, Jiageng Liu, Chuang Gan
Date:2025-08-20 17:59:32

The rapid progress in AI and Robotics may lead to a profound societal transformation, as humans and robots begin to coexist within shared communities, introducing both opportunities and challenges. To explore this future, we present Virtual Community, an open-world platform for humans, robots, and society, built on a universal physics engine and grounded in real-world 3D scenes. With Virtual Community, we aim to study embodied social intelligence at scale: 1) How robots can intelligently cooperate or compete; 2) How humans develop social relations and build community; 3) More importantly, how intelligent robots and humans can coexist in an open world. To support these goals, Virtual Community features: 1) An open-source multi-agent physics simulator that supports robots, humans, and their interactions within a society; 2) A large-scale, real-world-aligned community generation pipeline, including vast outdoor spaces, diverse indoor scenes, and a community of grounded agents with rich characters and appearances. Leveraging Virtual Community, we propose two novel challenges. The Community Planning Challenge evaluates multi-agent reasoning and planning ability in open-world settings, such as cooperating to help agents with daily activities and efficiently connecting with other agents. The Community Robot Challenge requires multiple heterogeneous robots to collaborate in solving complex open-world tasks. We evaluate various baselines on these tasks and demonstrate the challenges in both high-level open-world task planning and low-level cooperative control. We hope that Virtual Community will unlock further study of human-robot coexistence within open-world environments.

Safe and Transparent Robots for Human-in-the-Loop Meat Processing

Authors:Sagar Parekh, Casey Grothoff, Ryan Wright, Robin White, Dylan P. Losey
Date:2025-08-20 15:10:01

Labor shortages have severely affected the meat processing sector. Automated technology has the potential to support the meat industry, assist workers, and enhance job quality. However, existing automation in meat processing is highly specialized, inflexible, and cost-intensive. Instead of forcing manufacturers to buy a separate device for each step of the process, our objective is to develop general-purpose robotic systems that work alongside humans to perform multiple meat processing tasks. Through a recently conducted survey of industry experts, we identified two main challenges associated with integrating these collaborative robots alongside human workers. First, there must be measures to ensure the safety of human coworkers; second, the coworkers need to understand what the robot is doing. This paper addresses both challenges by introducing a safety and transparency framework for general-purpose meat processing robots. For safety, we implement a hand-detection system that continuously monitors nearby humans. This system can halt the robot in situations where a human comes into close proximity to the operating robot. We also develop an instrumented knife equipped with a force sensor that can differentiate contact with materials such as meat, bone, or fixtures. For transparency, we introduce a method that detects the robot's uncertainty about its performance and uses an LED interface to communicate that uncertainty to the human. Additionally, we design a graphical interface that displays the robot's plans and allows the human to provide feedback on the planned cut. Overall, our framework can ensure safe operation while keeping human workers in the loop about the robot's actions, which we validate through a user study.

Design of high-efficiency UHV loading of nanodiamonds into a Paul trap: Towards Matter-Wave Interferometry with Massive Objects

Authors:Rafael Benjaminov, Sela Liran, Or Dobkowski, Yaniv Bar-Haim, Michael Averbukh, Ron Folman
Date:2025-08-20 14:01:50

Quantum mechanics (QM) and General relativity (GR), also known as the theory of gravity, are the two pillars of modern physics. A matter-wave interferometer with a massive particle can test numerous fundamental ideas, including the spatial superposition principle - a foundational concept in QM - in completely new regimes, as well as the interface between QM and GR, e.g., testing the quantization of gravity. Consequently, there exists an intensive effort to realize such an interferometer. While several paths are being pursued, we focus on utilizing nanodiamonds (NDs) as our particle, with a spin embedded in the ND together with Stern-Gerlach forces, to achieve a closed loop in space-time. There is a growing community of groups pursuing this path [1]. We are posting this technical note (as part of a series of seven such notes) to highlight our plans and solutions concerning various challenges in this ambitious endeavor, hoping this will support this growing community. In this work, we review current methods for loading nanodiamonds into a Paul trap, and their capabilities and limitations regarding our application. We also present our experiments on loading and launching nanodiamonds using a vibrating piezoelectric element and by electrical forces. Finally, we present our design of a novel nanodiamond loading method for ultra-high-vacuum experiments. As the production of highly accurate, high-purity nanodiamonds with a single nitrogen-vacancy (NV) center required for interferometric measurements is expected to be expensive, we put emphasis on achieving high loading efficiency while loading the charged ND into a Paul trap in ultra-high vacuum.

Rule-based Key-Point Extraction for MR-Guided Biomechanical Digital Twins of the Spine

Authors:Robert Graf, Tanja Lerchl, Kati Nispel, Hendrik Möller, Matan Atad, Julian McGinnis, Julius Maria Watrinet, Johannes Paetzold, Daniel Rueckert, Jan S. Kirschke
Date:2025-08-20 13:31:40

Digital twins offer a powerful framework for subject-specific simulation and clinical decision support, yet their development often hinges on accurate, individualized anatomical modeling. In this work, we present a rule-based approach for subpixel-accurate key-point extraction from MRI, adapted from prior CT-based methods. Our approach incorporates robust image alignment and vertebra-specific orientation estimation to generate anatomically meaningful landmarks that serve as boundary conditions and force application points, such as muscle and ligament insertions, in biomechanical models. These models enable the simulation of spinal mechanics considering the subject's individual anatomy, and thus support the development of tailored approaches in clinical diagnostics and treatment planning. By leveraging MR imaging, our method is radiation-free and well-suited for large-scale studies and use in underrepresented populations. This work contributes to the digital twin ecosystem by bridging the gap between precise medical image analysis and biomechanical simulation, and aligns with key themes in personalized modeling for healthcare.

Trapping and cooling of nanodiamonds in a Paul trap under ultra-high vacuum: Towards matter-wave interferometry with massive objects

Authors:Omer Feldman, Ben Baruch Shultz, Maria Muretova, Or Dobkowski, Yonathan Japha, David Grosswasser, Ron Folman
Date:2025-08-20 13:05:23

Quantum mechanics (QM) and General relativity (GR), also known as the theory of gravity, are the two pillars of modern physics. A matter-wave interferometer with a massive particle can test numerous fundamental ideas, including the spatial superposition principle - a foundational concept in QM - in previously unexplored regimes. It also opens the possibility of probing the interface between QM and GR, such as testing the quantization of gravity. Consequently, there exists an intensive effort to realize such an interferometer. While several approaches are being explored, we focus on utilizing nanodiamonds with embedded spins as test particles which, in combination with Stern-Gerlach forces, enable the realization of a closed-loop matter-wave interferometer in space-time. There is a growing community of groups pursuing this path [1]. We are posting this technical note (as part of a series of seven such notes) to highlight our plans and solutions concerning various challenges in this ambitious endeavor, hoping this will support this growing community. In this work, we detail the trapping of a nanodiamond at 10^-8 mbar, which is sufficient for the realization of a short-duration Stern-Gerlach interferometer. We describe in detail the cooling we have performed to sub-Kelvin temperatures, and demonstrate that the nanodiamond remains confined within the trap even under high-intensity 1560 nm laser illumination. We would be happy to make available more details upon request.

An Informative Planning Framework for Target Tracking and Active Mapping in Dynamic Environments with ASVs

Authors:Sanjeev Ramkumar Sudha, Marija Popović, Erlend M. Coates
Date:2025-08-20 11:44:30

Mobile robot platforms are increasingly being used to automate information gathering tasks such as environmental monitoring. Efficient target tracking in dynamic environments is critical for applications such as search and rescue and pollutant cleanups. In this letter, we study active mapping of floating targets that drift due to environmental disturbances such as wind and currents. This is a challenging problem as it involves predicting both spatial and temporal variations in the map due to changing conditions. We propose an informative path planning framework to map an arbitrary number of moving targets with initially unknown positions in dynamic environments. A key component of our approach is a spatiotemporal prediction network that predicts target position distributions over time. We propose an adaptive planning objective for target tracking that leverages these predictions. Simulation experiments show that our proposed planning objective improves target tracking performance compared to existing methods that consider only entropy reduction as the planning objective. Finally, we validate our approach in field tests using an autonomous surface vehicle, showcasing its ability to track targets in real-world monitoring scenarios.
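
For illustration only, here is a minimal sketch of an informative-planning utility that leverages predicted target position distributions alongside map entropy, in the spirit of the adaptive objective described above; the function, weighting, and grid representation are assumptions, not the paper's formulation.

import numpy as np

def path_utility(path_cells, map_entropy, target_probs, w_track=1.0):
    """Hypothetical planning objective: reward expected map-entropy reduction plus the
    predicted probability mass of targets observed along a candidate path.

    path_cells   : indices of grid cells observed along the candidate path
    map_entropy  : per-cell entropy of the current map belief (1D array)
    target_probs : per-cell predicted probability of containing a target at the time
                   each cell would be observed (1D array, from a prediction network)
    """
    path_cells = np.asarray(path_cells)
    return map_entropy[path_cells].sum() + w_track * target_probs[path_cells].sum()

A sampling-based planner would score candidate paths with such a utility and replan as new measurements update both the map belief and the target predictions.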

TRUST-Planner: Topology-guided Robust Trajectory Planner for AAVs with Uncertain Obstacle Spatial-temporal Avoidance

Authors:Junzhi Li, Teng Long, Jingliang Sun, Jianxin Zhong
Date:2025-08-20 10:52:28

Despite extensive developments in motion planning of autonomous aerial vehicles (AAVs), existing frameworks face the challenges of local minima and deadlock in complex dynamic environments, leading to increased collision risks. To address these challenges, we present TRUST-Planner, a topology-guided hierarchical planning framework for robust spatial-temporal obstacle avoidance. In the frontend, a dynamic enhanced visible probabilistic roadmap (DEV-PRM) is proposed to rapidly explore topological paths for global guidance. The backend utilizes a uniform terminal-free minimum control polynomial (UTF-MINCO) and a dynamic distance field (DDF) to enable efficient predictive obstacle avoidance and fast parallel computation. Furthermore, an incremental multi-branch trajectory management framework is introduced to enable spatio-temporal topological decision-making, while efficiently leveraging historical information to reduce replanning time. Simulation results show that TRUST-Planner outperforms baseline competitors, achieving a 96% success rate and millisecond-level computation efficiency in the tested complex environments. Real-world experiments further validate the feasibility and practicality of the proposed method.

Multi-Tier UAV Edge Computing for Low Altitude Networks Towards Long-Term Energy Stability

Authors:Yufei Ye, Shijian Gao, Xinhu Zheng, Liuqing Yang
Date:2025-08-20 10:37:22

This paper presents a novel multi-tier UAV-assisted edge computing system designed for low-altitude networks. The system comprises vehicle users, lightweight Low-Tier UAVs (L-UAVs), and a High-Tier UAV (H-UAV). L-UAVs function as small-scale edge servers positioned closer to vehicle users, while the H-UAV, equipped with a more powerful server and a larger-capacity battery, serves as a mobile backup server to address the limitations in endurance and computing resources of the L-UAVs. The primary objective is to minimize task execution delays while ensuring long-term energy stability for the L-UAVs. To address this challenge, the problem is first decoupled into a series of deterministic problems, one for each time slot, using Lyapunov optimization. The priorities of task delay and energy consumption for the L-UAVs are adaptively adjusted based on real-time energy status. The optimization covers task assignment, computing-resource allocation, and trajectory planning for both the L-UAVs and the H-UAV. Simulation results demonstrate that the proposed approach achieves a reduction of at least 26% in transmission energy for the L-UAVs and exhibits superior energy stability compared to existing benchmarks.
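
As a minimal sketch of the standard Lyapunov drift-plus-penalty decoupling alluded to above (the virtual-queue definition, trade-off weight V, and variable names are illustrative assumptions, not the paper's exact formulation):

# Drift-plus-penalty sketch: each L-UAV i keeps a virtual energy-deficit queue Q[i];
# per slot, candidate decisions trade off task delay (penalty) against queue-weighted energy.

def per_slot_objective(delay, energy, Q, V):
    """Score one candidate per-slot decision (task assignment / resources / trajectory).

    delay  : total task execution delay under this decision
    energy : per-L-UAV energy expenditures under this decision
    Q      : current virtual energy-deficit queue backlogs
    V      : trade-off weight (larger V favors low delay over energy stability)
    """
    return V * delay + sum(q * e for q, e in zip(Q, energy))

def update_queues(Q, energy, budget):
    """Virtual queue update: backlog grows when a UAV exceeds its per-slot energy budget."""
    return [max(q + e - b, 0.0) for q, e, b in zip(Q, energy, budget)]

# Per slot: optimize per_slot_objective over feasible decisions, execute the minimizer,
# then call update_queues with the realized energies before the next slot.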

EAROL: Environmental Augmented Perception-Aware Planning and Robust Odometry via Downward-Mounted Tilted LiDAR

Authors:Xinkai Liang, Yigu Ge, Yangxi Shi, Haoyu Yang, Xu Cao, Hao Fang
Date:2025-08-20 09:16:29

To address the challenges of localization drift and perception-planning coupling in unmanned aerial vehicles (UAVs) operating in open-top scenarios (e.g., collapsed buildings, roofless mazes), this paper proposes EAROL, a novel framework with a downward-mounted tilted LiDAR configuration (20° inclination), integrating a LiDAR-Inertial Odometry (LIO) system and a hierarchical trajectory-yaw optimization algorithm. The hardware innovation enables constraint enhancement via dense ground point cloud acquisition and forward environmental awareness for dynamic obstacle detection. A tightly-coupled LIO system, empowered by an Iterative Error-State Kalman Filter (IESKF) with dynamic motion compensation, achieves high 6-DoF localization accuracy in feature-sparse environments. The environment-augmented planner balances environmental exploration, target tracking precision, and energy efficiency. Physical experiments demonstrate an 81% tracking error reduction, a 22% improvement in perceptual coverage, and near-zero vertical drift across indoor maze and 60-meter-scale outdoor scenarios. This work proposes a hardware-algorithm co-design paradigm, offering a robust solution for UAV autonomy in post-disaster search and rescue missions. We will release our software and hardware as an open-source package for the community. Video: https://youtu.be/7av2ueLSiYw.

Transforming Next-generation Network Planning assisted by Data Acquisition of Top Three Spanish MNOs

Authors:M. Umar Khan
Date:2025-08-20 06:02:04

In this paper, we address the need for mobile traffic data from the legacy infrastructure in order to extract useful information and perform network dimensioning for 5G. These data can help us achieve a more efficient network planning design, especially in terms of topology and cost. To that end, a real open database of the top three Spanish mobile network operators (MNOs) is used to estimate the traffic and to identify the area of highest user density for the deployment of new services. We propose a data acquisition procedure to clean the database, extract meaningful traffic information, and visualize traffic density patterns for new gNB deployments. We present the state of the art in network data, describe the considered network database in detail, and explain the Network Data Acquisition entity along with the proposed procedure. The corresponding results are discussed, followed by the conclusions.

Online Incident Response Planning under Model Misspecification through Bayesian Learning and Belief Quantization

Authors:Kim Hammar, Tao Li
Date:2025-08-20 03:25:59

Effective responses to cyberattacks require fast decisions, even when information about the attack is incomplete or inaccurate. However, most decision-support frameworks for incident response rely on a detailed system model that describes the incident, which restricts their practical utility. In this paper, we address this limitation and present an online method for incident response planning under model misspecification, which we call MOBAL: Misspecified Online Bayesian Learning. MOBAL iteratively refines a conjecture about the model through Bayesian learning as new information becomes available, which facilitates model adaptation as the incident unfolds. To determine effective responses online, we quantize the conjectured model into a finite Markov model, which enables efficient response planning through dynamic programming. We prove that Bayesian learning is asymptotically consistent with respect to the information feedback. Additionally, we establish bounds on misspecification and quantization errors. Experiments on the CAGE-2 benchmark show that MOBAL outperforms the state of the art in terms of adaptability and robustness to model misspecification.
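
To make the two ingredients named above concrete, here is a minimal sketch of Bayesian learning over a finite set of candidate models combined with dynamic programming on a finite Markov model; the posterior-mean conjecture, reward shape, and variable names are illustrative assumptions, not MOBAL's actual implementation.

import numpy as np

def bayes_update(prior, likelihoods):
    """Posterior over a finite set of candidate models, given the likelihood each model
    assigns to the newest observation from the incident."""
    post = prior * likelihoods
    return post / post.sum()

def value_iteration(P, R, gamma=0.95, iters=200):
    """Dynamic programming on a finite Markov model.
    P: (A, S, S) transition tensor with P[a, s, s'], R: (S, A) rewards."""
    A, S, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = R + gamma * np.einsum("asx,x->sa", P, V)  # Q[s, a] = R[s, a] + gamma * E[V(s')]
        V = Q.max(axis=1)
    return Q.argmax(axis=1)  # greedy response policy

# Conjectured model as the posterior mean over candidate transition tensors, replanned
# whenever new information feedback arrives:
#   P_hat = np.tensordot(posterior, P_candidates, axes=1)
#   policy = value_iteration(P_hat, R)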

FiReFly: Fair Distributed Receding Horizon Planning for Multiple UAVs

Authors:Nicole Fronda, Bardh Hoxha, Houssam Abbas
Date:2025-08-20 03:21:44

We propose injecting notions of fairness into multi-robot motion planning. When robots have competing interests, it is important to optimize for some kind of fairness in their usage of resources. In this work, we explore how the robots' energy expenditures might be fairly distributed among them, while maintaining mission success. We formulate a distributed fair motion planner and integrate it with safe controllers in an algorithm called FiReFly. For simulated reach-avoid missions, FiReFly produces fairer trajectories and improves mission success rates over a non-fair planner. We find that real-time performance is achievable for up to 15 UAVs, and that scaling up to 50 UAVs is possible with trade-offs between runtime and fairness improvements.
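
One common way to quantify how fairly energy expenditure is spread across robots is Jain's fairness index, sketched below purely as an illustration; the paper's own fairness notion may differ.

def jains_index(energies):
    """Jain's fairness index over per-robot energy expenditures: 1.0 when all robots
    spend equal energy, approaching 1/N when one robot bears all of the cost."""
    n = len(energies)
    s = sum(energies)
    return (s * s) / (n * sum(e * e for e in energies)) if s > 0 else 1.0

# Example: jains_index([10, 10, 10]) == 1.0, while jains_index([30, 0, 0]) == 1/3.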

Fair-CoPlan: Negotiated Flight Planning with Fair Deconfliction for Urban Air Mobility

Authors:Nicole Fronda, Phil Smith, Bardh Hoxha, Yash Pant, Houssam Abbas
Date:2025-08-20 03:21:19

Urban Air Mobility (UAM) is an emerging transportation paradigm in which Uncrewed Aerial Systems (UAS) autonomously transport passengers and goods in cities. The UAS have different operators with different, sometimes competing goals, yet must share the airspace. We propose a negotiated, semi-distributed flight planner that optimizes UAS flight lengths in a fair manner. Current flight planners might result in some UAS being given disproportionately shorter flight paths at the expense of others. We introduce Fair-CoPlan, a planner in which operators and a Provider of Service to the UAM (PSU) together compute fair flight paths. Fair-CoPlan has three steps: First, the PSU constrains take-off and landing choices for flights based on capacity at and around vertiports. Then, operators plan independently under these constraints. Finally, the PSU resolves any conflicting paths, optimizing for path length fairness. By fairly spreading the cost of deconfliction, Fair-CoPlan encourages wider participation in UAM, ensures safety of the airspace and the areas below it, and promotes greater operator flexibility. We demonstrate Fair-CoPlan through simulation experiments and find fairer outcomes than a non-fair planner, with minor delays as a trade-off.

Action-Constrained Imitation Learning

Authors:Chia-Han Yeh, Tse-Sheng Nan, Risto Vuorio, Wei Hung, Hung-Yen Wu, Shao-Hua Sun, Ping-Chun Hsieh
Date:2025-08-20 03:19:07

Policy learning under action constraints plays a central role in ensuring safe behaviors in various robot control and resource allocation applications. In this paper, we study a new problem setting termed Action-Constrained Imitation Learning (ACIL), where an action-constrained imitator aims to learn from a demonstrative expert with a larger action space. The fundamental challenge of ACIL lies in the unavoidable mismatch of occupancy measure between the expert and the imitator caused by the action constraints. We tackle this mismatch through trajectory alignment and propose DTWIL, which replaces the original expert demonstrations with a surrogate dataset that follows similar state trajectories while adhering to the action constraints. Specifically, we recast trajectory alignment as a planning problem and solve it via Model Predictive Control, which aligns the surrogate trajectories with the expert trajectories based on the Dynamic Time Warping (DTW) distance. Through extensive experiments, we demonstrate that learning from the dataset generated by DTWIL significantly enhances performance across multiple robot control tasks and outperforms various benchmark imitation learning algorithms in terms of sample efficiency. Our code is publicly available at https://github.com/NYCU-RL-Bandits-Lab/ACRL-Baselines.
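
Since the alignment cost is the Dynamic Time Warping distance, a minimal textbook DTW sketch over state trajectories may help (generic implementation with Euclidean state distance, not the authors' code):

import numpy as np

def dtw_distance(traj_a, traj_b):
    """Dynamic Time Warping distance between two state trajectories of shapes
    (T_a, d) and (T_b, d), using Euclidean distance between states."""
    Ta, Tb = len(traj_a), len(traj_b)
    D = np.full((Ta + 1, Tb + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, Ta + 1):
        for j in range(1, Tb + 1):
            cost = np.linalg.norm(traj_a[i - 1] - traj_b[j - 1])
            # extend the cheapest of: match, insertion, deletion
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[Ta, Tb]

In DTWIL this distance would serve as the alignment cost that the MPC-generated, action-constrained surrogate trajectories minimize against the expert trajectories.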

TCFNet: Bidirectional face-bone transformation via a Transformer-based coarse-to-fine point movement network

Authors:Runshi Zhang, Bimeng Jie, Yang He, Junchen Wang
Date:2025-08-20 03:02:16

Computer-aided surgical simulation is a critical component of orthognathic surgical planning, where accurately simulating face-bone shape transformations is significant. Traditional biomechanical simulation methods are limited by long computation times, labor-intensive data processing, and low accuracy. Recently, deep learning-based simulation methods have been proposed that view this problem as a point-to-point transformation between skeletal and facial point clouds. However, these approaches cannot process large-scale point sets, have limited receptive fields that lead to noisy points, and employ complex registration-based preprocessing and postprocessing operations. These shortcomings limit the performance and widespread applicability of such methods. Therefore, we propose a Transformer-based coarse-to-fine point movement network (TCFNet) to learn unique, complicated correspondences at the patch and point levels for dense face-bone point cloud transformations. This end-to-end framework adopts a Transformer-based network and a local information aggregation network (LIA-Net) in the first and second stages, respectively, which reinforce each other to generate precise point movement paths. LIA-Net can effectively compensate for the neighborhood precision loss of the Transformer-based network by modeling local geometric structures (edges, orientations, and relative position features). The previous global features are employed to guide the local displacement using a gated recurrent unit. Inspired by deformable medical image registration, we propose an auxiliary loss that can utilize expert knowledge for reconstructing critical organs. Compared with the existing state-of-the-art (SOTA) methods on the gathered datasets, TCFNet achieves outstanding evaluation metrics and visualization results. The code is available at https://github.com/Runshi-Zhang/TCFNet.

Generative AI Against Poaching: Latent Composite Flow Matching for Wildlife Conservation

Authors:Lingkai Kong, Haichuan Wang, Charles A. Emogor, Vincent Börsch-Supan, Lily Xu, Milind Tambe
Date:2025-08-20 01:35:51

Poaching poses significant threats to wildlife and biodiversity. A valuable step in reducing poaching is to forecast poacher behavior, which can inform patrol planning and other conservation interventions. Existing poaching prediction methods based on linear models or decision trees lack the expressivity to capture complex, nonlinear spatiotemporal patterns. Recent advances in generative modeling, particularly flow matching, offer a more flexible alternative. However, training such models on real-world poaching data faces two central obstacles: imperfect detection of poaching events and limited data. To address imperfect detection, we integrate flow matching with an occupancy-based detection model and train the flow in latent space to infer the underlying occupancy state. To mitigate data scarcity, we adopt a composite flow initialized from a linear-model prediction rather than random noise, the standard initialization in diffusion models, thereby injecting prior knowledge and improving generalization. Evaluations on datasets from two national parks in Uganda show consistent gains in predictive accuracy.
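
For readers unfamiliar with flow matching, here is a minimal training step in which the source samples are drawn around a linear-model prediction instead of pure Gaussian noise, in the spirit of the composite flow described above; the toy 2-D latent, the network, and the noise scale are illustrative assumptions, not the paper's model.

import torch
import torch.nn as nn

# Toy velocity network: input (x_t, t) in R^3, output velocity in R^2 (dimensions are illustrative).
model = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def flow_matching_step(x1, x_lin, sigma=0.1):
    """One conditional flow-matching step.
    x1    : target samples, shape (B, 2)
    x_lin : linear-model predictions used as an informative prior, shape (B, 2)
    """
    x0 = x_lin + sigma * torch.randn_like(x_lin)   # source centred on the linear prior, not N(0, I)
    t = torch.rand(x1.size(0), 1)
    xt = (1 - t) * x0 + t * x1                     # straight-line probability path
    v_target = x1 - x0                             # constant velocity along that path
    v_pred = model(torch.cat([xt, t], dim=1))
    loss = ((v_pred - v_target) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()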

Tooth-Diffusion: Guided 3D CBCT Synthesis with Fine-Grained Tooth Conditioning

Authors:Said Djafar Said, Torkan Gholamalizadeh, Mostafa Mehdipour Ghazi
Date:2025-08-19 21:21:35

Despite the growing importance of dental CBCT scans for diagnosis and treatment planning, generating anatomically realistic scans with fine-grained control remains a challenge in medical image synthesis. In this work, we propose a novel conditional diffusion framework for 3D dental volume generation, guided by tooth-level binary attributes that allow precise control over tooth presence and configuration. Our approach integrates wavelet-based denoising diffusion, FiLM conditioning, and masked loss functions to focus learning on relevant anatomical structures. We evaluate the model across diverse tasks, such as tooth addition, removal, and full dentition synthesis, using both paired and distributional similarity metrics. Results show strong fidelity and generalization with low FID scores, robust inpainting performance, and SSIM values above 0.91 even on unseen scans. By enabling realistic, localized modification of dentition without rescanning, this work opens opportunities for surgical planning, patient communication, and targeted data augmentation in dental AI workflows. The codes are available at: https://github.com/djafar1/tooth-diffusion.
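
The masked-loss idea mentioned above can be sketched generically as a reconstruction loss restricted to a binary anatomy mask; the exact masking and weighting in the paper may differ.

import torch

def masked_mse(pred, target, mask, eps=1e-8):
    """Voxel-wise MSE restricted to a binary mask of the relevant anatomy (e.g., teeth
    and jaw), so that background voxels do not dominate the training objective.
    pred, target, mask are tensors of the same shape; mask is 0/1."""
    diff2 = (pred - target) ** 2 * mask
    return diff2.sum() / (mask.sum() + eps)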

Strong Confinement of a Nanoparticle in a Needle Paul Trap: Towards Matter-Wave Interferometry with Nanodiamonds

Authors:Peter Skakunenko, Daniel Folman, Yaniv Bar-Haim, Ron Folman
Date:2025-08-19 21:02:57

Quantum mechanics (QM) and General relativity (GR), also known as the theory of gravity, are the two pillars of modern physics. A matter-wave interferometer with a massive particle can test numerous fundamental ideas, including the spatial superposition principle - a foundational concept in QM - in completely new regimes, as well as the interface between QM and GR, e.g., testing the quantization of gravity. Consequently, there exists an intensive effort to realize such an interferometer. While several paths are being pursued, we focus on utilizing nanodiamonds as our particle, with a spin embedded in the nanodiamond together with Stern-Gerlach forces, to achieve a closed loop in space-time. There is a growing community of groups pursuing this path [1]. We are posting this technical note (as part of a series of seven such notes) to highlight our plans and solutions concerning various challenges in this ambitious endeavor, hoping this will support this growing community. In this work, we achieve strong confinement of a levitated particle, which is crucial for angular confinement, precise positioning, and perhaps also advantageous for deep cooling. We designed a needle Paul trap with a controllable distance between the electrodes, giving rise to a strong electric gradient. By combining it with an effective charging method - electrospray - we reach a trap frequency of up to 40 kHz, which is more than twice the state of the art. We believe that the designed trap could become a significant tool in the hands of the community working towards massive matter-wave interferometry. We would be happy to make more details available upon request.

SLAM-based Safe Indoor Exploration Strategy

Authors:Omar Mostafa, Nikolaos Evangeliou, Anthony Tzes
Date:2025-08-19 19:50:24

This paper suggests a 2D exploration strategy for a planar space cluttered with obstacles. Rather than using point robots capable of adjusting their position and altitude instantly, this research is tailored to classical agents with circular footprints that cannot instantly control their pose. Here, a self-balanced dual-wheeled differential drive system is used to explore the place. The system is equipped with linear accelerometers and angular gyroscopes, a 3D-LiDAR, and a forward-facing RGB-D camera. The system performs RTAB-SLAM using the IMU and the LiDAR, while the camera is used for loop closures. The mobile agent explores the planar space using a safe skeleton approach that places the agent as far as possible from the static obstacles. During exploration, the heading is directed towards any detected openings in the space. This exploration strategy gives highest priority to the agent's safety in avoiding obstacles, followed by the exploration of undetected space. Experimental studies with a ROS-enabled mobile agent are presented, demonstrating the path planning strategy while exploring the space.

New Insights into Automatic Treatment Planning for Cancer Radiotherapy Using Explainable Artificial Intelligence

Authors:Md Mainul Abrar, Xun Jia, Yujie Chi
Date:2025-08-19 19:38:16

Objective: This study aims to uncover the opaque decision-making process of an artificial intelligence (AI) agent for automatic treatment planning.

Approach: We examined a previously developed AI agent based on the Actor-Critic with Experience Replay (ACER) network, which automatically tunes treatment planning parameters (TPPs) for inverse planning in prostate cancer intensity modulated radiotherapy. We selected multiple checkpoint ACER agents from different stages of training and applied an explainable AI (EXAI) method to analyze the attribution from dose-volume histogram (DVH) inputs to TPP-tuning decisions. We then assessed each agent's planning efficacy and efficiency and evaluated their policy and final TPP-tuning spaces. Combining these analyses, we systematically examined how ACER agents generated high-quality treatment plans in response to different DVH inputs.

Results: Attribution analysis revealed that ACER agents progressively learned to identify dose-violation regions from DVH inputs and promote appropriate TPP-tuning actions to mitigate them. Organ-wise similarities between DVH attributions and dose-violation reductions ranged from 0.25 to 0.5 across the tested agents. Agents with stronger attribution-violation similarity required fewer tuning steps (~12-13 vs. 22), exhibited a more concentrated TPP-tuning space with lower entropy (~0.3 vs. 0.6), converged on adjusting only a few TPPs, and showed smaller discrepancies between practical and theoretical tuning steps. Taken together, these findings indicate that high-performing ACER agents can effectively identify dose violations from DVH inputs and employ a global tuning strategy to achieve high-quality treatment planning, much like skilled human planners.

Significance: Better interpretability of the agent's decision-making process may enhance clinician trust and inspire new strategies for automatic treatment planning.
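
The attribution-violation similarity reported in the Results could be computed, for example, as a cosine similarity per organ; the specific similarity measure below is an assumption for illustration, not necessarily the one used in the paper.

import numpy as np

def attribution_violation_similarity(attribution_profile, violation_reduction_profile):
    """Cosine similarity, for a single organ, between the DVH-input attribution profile
    and the corresponding observed dose-violation reductions (illustrative metric)."""
    a = np.asarray(attribution_profile, dtype=float)
    v = np.asarray(violation_reduction_profile, dtype=float)
    return float(a @ v / (np.linalg.norm(a) * np.linalg.norm(v) + 1e-12))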

Design, development, and commissioning of a flexible test setup for the AXIS prototype detector

Authors:Abigail Y. Pan, Haley R. Stueber, Tanmoy Chattopadhyay, Steven W. Allen, Marshall W. Bautz, Kevan Donlon, Catherine E. Grant, Sven Hermann, Beverly LaMarr, Andrew Malonis, Eric D. Miller, Glenn Morris, Peter Orel, Artem Poliszczuk, Gregory Prigozhin, Dan Wilkins
Date:2025-08-19 18:01:22

The Advanced X-ray Imaging Satellite (AXIS) is one of two candidate mission concepts selected for Phase-A study for the new NASA Astrophysics Probe Explorer (APEX) mission class, with a planned launch in 2032. The X-ray camera for AXIS is under joint development by the X-ray Astronomy and Observational Cosmology (XOC) Group at Stanford, the MIT Kavli Institute (MKI), and MIT Lincoln Laboratory (MIT-LL). To accelerate development efforts and meet the AXIS mission requirements, XOC has developed a twin beamline testing system, capable of providing the necessary performance, flexibility, and robustness. We present design details, simulations, and performance results for the newer of the two beamlines, constructed and optimized to test and characterize the first full-size MIT-LL AXIS prototype detectors, operating with the Stanford-developed Multi-Channel Readout Chip (MCRC) integrated readout electronics system. The XOC X-ray beamline design is forward-looking and flexible, with a modular structure adaptable to a wide range of detector technologies identified by the Great Observatories Maturation Program (GOMAP) that span the X-ray to near-infrared wavelengths.

Ground calibration plans for the AXIS high speed camera

Authors:Catherine E. Grant, Eric D. Miller, Marshall W. Bautz, Jill Juneau, Beverly J. LaMarr, Andrew Malonis, Gregory Y. Prigozhin, Christopher W. Leitz, Steven W. Allen, Tanmoy Chattopadhyay, Sven Herrmann, R. Glenn Morris, Abigail Y. Pan, Artem Poliszczuk, Haley R. Stueber, Daniel R. Wilkins
Date:2025-08-19 18:01:06

The Advanced X-ray Imaging Satellite (AXIS), an astrophysics NASA probe mission currently in phase A, will provide high-throughput, high-spatial-resolution X-ray imaging in the 0.3 to 10 keV band. We report on the notional ground calibration plan for the High Speed Camera on AXIS, which is being developed at the MIT Kavli Institute for Astrophysics and Space Research using state-of-the-art CCDs provided by MIT Lincoln Laboratory in combination with an integrated, high-speed ASIC readout chip from Stanford University. AXIS camera ground calibration draws on previous experience with X-ray CCD focal planes, in particular Chandra/ACIS and Suzaku/XIS, utilizing mono-energetic X-ray line sources to measure spectral resolution and quantum efficiency. Relative quantum efficiency of the CCDs will be measured against an sCMOS device with known absolute calibration from synchrotron measurements. We walk through the envisioned CCD calibration pipeline and discuss the observatory-level science and calibration requirements and how they inform the camera calibration.

ResPlan: A Large-Scale Vector-Graph Dataset of 17,000 Residential Floor Plans

Authors:Mohamed Abouagour, Eleftherios Garyfallidis
Date:2025-08-19 17:07:47

We introduce ResPlan, a large-scale dataset of 17,000 detailed, structurally rich, and realistic residential floor plans, created to advance spatial AI research. Each plan includes precise annotations of architectural elements (walls, doors, windows, balconies) and functional spaces (such as kitchens, bedrooms, and bathrooms). ResPlan addresses key limitations of existing datasets such as RPLAN (Wu et al., 2019) and MSD (van Engelenburg et al., 2024) by offering enhanced visual fidelity and greater structural diversity, reflecting realistic and non-idealized residential layouts. Designed as a versatile, general-purpose resource, ResPlan supports a wide range of applications including robotics, reinforcement learning, generative AI, virtual and augmented reality, simulations, and game development. Plans are provided in both geometric and graph-based formats, enabling direct integration into simulation engines and fast 3D conversion. A key contribution is an open-source pipeline for geometry cleaning, alignment, and annotation refinement. Additionally, ResPlan includes structured representations of room connectivity, supporting graph-based spatial reasoning tasks. Finally, we present comparative analyses with existing benchmarks and outline several open benchmark tasks enabled by ResPlan. Ultimately, ResPlan offers a significant advance in scale, realism, and usability, providing a robust foundation for developing and benchmarking next-generation spatial intelligence systems.

Self-Supervised Sparse Sensor Fusion for Long Range Perception

Authors:Edoardo Palladin, Samuel Brucker, Filippo Ghilotti, Praveen Narayanan, Mario Bijelic, Felix Heide
Date:2025-08-19 16:40:29

Outside of urban hubs, autonomous cars and trucks have to master driving on intercity highways. Safe, long-distance highway travel at speeds exceeding 100 km/h demands perception distances of at least 250 m, about five times the 50-100 m typically addressed in city driving, to allow sufficient planning and braking margins. Increasing the perception range also makes it possible to extend autonomy from light two-ton passenger vehicles to large-scale forty-ton trucks, which need a longer planning horizon due to their high inertia. However, most existing perception approaches focus on shorter ranges and rely on Bird's Eye View (BEV) representations, which incur quadratic increases in memory and compute costs as distance grows. To overcome this limitation, we build on a sparse representation and introduce an efficient 3D encoding of multi-modal and temporal features, along with a novel self-supervised pre-training scheme that enables large-scale learning from unlabeled camera-LiDAR data. Our approach extends perception distances to 250 meters and, compared to existing methods, achieves a 26.6% improvement in mAP for object detection and a 30.5% reduction in Chamfer Distance for LiDAR forecasting at ranges up to 250 meters. Project Page: https://light.princeton.edu/lrs4fusion/
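
The quadratic BEV cost referred to above is easy to see with a back-of-the-envelope cell count; the 0.5 m cell size and square ego-centred grid below are assumed values for illustration, not the paper's configuration.

# Dense BEV grids grow quadratically with perception range.
def bev_cells(range_m, cell_m=0.5):
    side = 2 * range_m / cell_m          # square grid centred on the ego vehicle
    return side * side

base = bev_cells(50)
for r in (50, 100, 250):
    cells = bev_cells(r)
    print(f"{r:4d} m -> {int(cells):>9,} cells ({cells / base:.0f}x the 50 m grid)")
# 250 m requires 25x the cells (and memory/compute) of a 50 m grid, motivating sparse encodings.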

The Social Context of Human-Robot Interactions

Authors:Sydney Thompson, Kate Candon, Marynel Vázquez
Date:2025-08-19 16:15:58

The Human-Robot Interaction (HRI) community often highlights the social context of an interaction as a key consideration when designing, implementing, and evaluating robot behavior. Unfortunately, researchers use the term "social context" in varied ways. This can lead to miscommunication, making it challenging to draw connections between related work on understanding and modeling the social contexts of human-robot interactions. To address this gap, we survey the HRI literature for existing definitions and uses of the term "social context". Then, we propose a conceptual model for describing the social context of a human-robot interaction. We apply this model to existing work, and we discuss a range of attributes of social contexts that can help researchers plan for interactions, develop behavior models for robots, and gain insights after interactions have taken place. We conclude with a discussion of open research questions in relation to understanding and modeling the social contexts of human-robot interactions.

Multi-User Contextual Cascading Bandits for Personalized Recommendation

Authors:Jiho Park, Huiwen Jia
Date:2025-08-19 16:14:33

We introduce the Multi-User Contextual Cascading Bandit (MCCB) model, a new combinatorial bandit framework that captures realistic online advertising scenarios where multiple users interact with sequentially displayed items simultaneously. Unlike classical contextual bandits, MCCB integrates three key structural elements: (i) cascading feedback based on sequential arm exposure, (ii) parallel context sessions enabling selective exploration, and (iii) heterogeneous arm-level rewards. We first propose Upper Confidence Bound with Backward Planning (UCBBP), a UCB-style algorithm tailored to this setting, and prove that it achieves a regret bound of $\widetilde{O}(\sqrt{THN})$ over $T$ episodes, $H$ session steps, and $N$ contexts per episode. Motivated by the fact that many users interact with the system simultaneously, we introduce a second algorithm, termed Active Upper Confidence Bound with Backward Planning (AUCBBP), which shows a strict efficiency improvement in context scaling, i.e., user scaling, with a regret bound of $\widetilde{O}(\sqrt{T+HN})$. We validate our theoretical findings via numerical experiments, demonstrating the empirical effectiveness of both algorithms under various settings.
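
To illustrate the cascading-feedback structure only, here is a generic LinUCB-style slate selector with cascade updates; it is not the paper's UCBBP/AUCBBP, and the exploration bonus and update rule are standard textbook choices.

import numpy as np

class CascadingLinUCB:
    """Generic cascade bandit with linear rewards (illustrative sketch)."""
    def __init__(self, dim, alpha=1.0):
        self.A = np.eye(dim)      # regularized design matrix
        self.b = np.zeros(dim)
        self.alpha = alpha

    def select(self, features, k):
        """features: (n_arms, dim) context vectors; return the top-k slate by UCB score."""
        theta = np.linalg.solve(self.A, self.b)
        A_inv = np.linalg.inv(self.A)
        bonus = np.sqrt(np.einsum("nd,de,ne->n", features, A_inv, features))
        ucb = features @ theta + self.alpha * bonus
        return np.argsort(-ucb)[:k]

    def update(self, features, slate, click_pos):
        """Cascading feedback: arms ranked before the click are observed as 0,
        the clicked arm as 1, and arms after the click are unobserved."""
        last = click_pos if click_pos is not None else len(slate) - 1
        for rank, arm in enumerate(slate[: last + 1]):
            r = 1.0 if (click_pos is not None and rank == click_pos) else 0.0
            x = features[arm]
            self.A += np.outer(x, x)
            self.b += r * x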

Real-Time, Population-Based Reconstruction of 3D Bone Models via Very-Low-Dose Protocols

Authors:Yiqun Lin, Haoran Sun, Yongqing Li, Rabia Aslam, Lung Fung Tse, Tiange Cheng, Chun Sing Chui, Wing Fung Yau, Victorine R. Le Meur, Meruyert Amangeldy, Kiho Cho, Yinyu Ye, James Zou, Wei Zhao, Xiaomeng Li
Date:2025-08-19 15:36:58

Patient-specific bone models are essential for designing surgical guides and preoperative planning, as they enable the visualization of intricate anatomical structures. However, traditional CT-based approaches for creating bone models are limited to preoperative use due to the low flexibility and high radiation exposure of CT and time-consuming manual delineation. Here, we introduce Semi-Supervised Reconstruction with Knowledge Distillation (SSR-KD), a fast and accurate AI framework to reconstruct high-quality bone models from biplanar X-rays in 30 seconds, with an average error under 1.0 mm, eliminating the dependence on CT and manual work. Additionally, high tibial osteotomy simulation was performed by experts on reconstructed bone models, demonstrating that bone models reconstructed from biplanar X-rays have comparable clinical applicability to those annotated from CT. Overall, our approach accelerates the process, reduces radiation exposure, enables intraoperative guidance, and significantly improves the practicality of bone models, offering transformative applications in orthopedics.

Development of a defacing algorithm to protect the privacy of head and neck cancer patients in publicly-accessible radiotherapy datasets

Authors:Kayla O'Sullivan-Steben, Luc Galarneau, John Kildea
Date:2025-08-19 15:14:16

Introduction: The rise in public medical imaging datasets has raised concerns about patient reidentification from head CT scans. However, existing defacing algorithms often remove or distort Organs at Risk (OARs) and Planning Target Volumes (PTVs) in head and neck cancer (HNC) patients, and ignore DICOM-RT Structure Set and Dose data. Therefore, we developed and validated a novel automated defacing algorithm that preserves these critical structures while removing identifiable features from HNC CTs and DICOM-RT data.

Methods: Eye contours were used as landmarks to automate the removal of CT pixels above the inferior-most eye slice and anterior to the eye midpoint. Pixels within PTVs were retained if they intersected with the removed region. The body contour and dose map were reshaped to reflect the defaced image. We validated our approach on 829 HNC CTs from 622 patients. Privacy protection was evaluated by applying the FaceNet512 facial recognition algorithm before and after defacing on 3D-rendered CT pairs from 70 patients. Research utility was assessed by examining the impact of defacing on autocontouring performance using LimbusAI and analyzing PTV locations relative to the defaced regions.

Results: Before defacing, FaceNet512 matched 97% of patients' CTs. After defacing, this rate dropped to 4%. LimbusAI effectively autocontoured organs in the defaced CTs, with perfect Dice scores of 1 for OARs below the defaced region, and excellent scores exceeding 0.95 for OARs on the same slices as the crop. We found that 86% of PTVs were entirely below the cropped region, 9.1% were on the same slice as the crop without overlap, and only 4.9% extended into the cropped area.

Conclusions: We developed a novel defacing algorithm that anonymizes HNC CT scans and related DICOM-RT data while preserving essential structures, enabling the sharing of HNC imaging datasets for Big Data and AI.
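
A simplified numpy rendering of the cropping rule described in the Methods follows; the array layout, axis orientations, and fill value are assumptions for illustration, not the authors' implementation.

import numpy as np

def deface(ct, eye_mask, ptv_mask, air_hu=-1000):
    """Remove voxels superior to the inferior-most eye slice AND anterior to the eye
    midpoint, while retaining any removed voxel that falls inside a PTV.
    Arrays are (z, y, x) with z increasing superiorly and y increasing anteriorly
    (assumed orientation); eye_mask and ptv_mask are boolean volumes."""
    zs, ys, _ = np.nonzero(eye_mask)
    z_inferior = zs.min()                  # inferior-most slice containing the eye contours
    y_mid = int(round(ys.mean()))          # anterior-posterior midpoint of the eyes
    remove = np.zeros_like(ct, dtype=bool)
    remove[z_inferior:, y_mid:, :] = True  # above the eyes and anterior to their midpoint
    remove &= ~ptv_mask                    # keep pixels that intersect a PTV
    out = ct.copy()
    out[remove] = air_hu                   # replace removed tissue with air
    return out

The body contour and dose map would then be reshaped to match the defaced image, as described above.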

Toward Deployable Multi-Robot Collaboration via a Symbolically-Guided Decision Transformer

Authors:Rathnam Vidushika Rasanji, Jin Wei-Kocsis, Jiansong Zhang, Dongming Gan, Ragu Athinarayanan, Paul Asunda
Date:2025-08-19 14:42:18

Reinforcement learning (RL) has demonstrated great potential in robotic operations. However, its data-intensive nature and reliance on the Markov Decision Process (MDP) assumption limit its practical deployment in real-world scenarios involving complex dynamics and long-term temporal dependencies, such as multi-robot manipulation. Decision Transformers (DTs) have emerged as a promising offline alternative by leveraging causal transformers for sequence modeling in RL tasks. However, their application to multi-robot manipulation remains underexplored. To address this gap, we propose a novel framework, the Symbolically-Guided Decision Transformer (SGDT), which integrates a neuro-symbolic mechanism with a causal transformer to enable deployable multi-robot collaboration. In the proposed SGDT framework, a neuro-symbolic planner generates a high-level task-oriented plan composed of symbolic subgoals. Guided by these subgoals, a goal-conditioned decision transformer (GCDT) performs low-level sequential decision-making for multi-robot manipulation. This hierarchical architecture enables structured, interpretable, and generalizable decision making in complex multi-robot collaboration tasks. We evaluate the performance of SGDT across a range of task scenarios, including zero-shot and few-shot settings. To our knowledge, this is the first work to explore DT-based technology for multi-robot manipulation.

Towards Agent-based Test Support Systems: An Unsupervised Environment Design Approach

Authors:Collins O. Ogbodo, Timothy J. Rogers, Mattia Dal Borgo, David J. Wagg
Date:2025-08-19 12:43:32

Modal testing plays a critical role in structural analysis by providing essential insights into dynamic behaviour across a wide range of engineering industries. In practice, designing an effective modal test campaign involves complex experimental planning, comprising a series of interdependent decisions that significantly influence the final test outcome. Traditional approaches to test design are typically static, focusing only on global tests without accounting for evolving test campaign parameters or the impact of such changes on previously established decisions, such as sensor configurations, which have been found to significantly influence test outcomes. These rigid methodologies often compromise test accuracy and adaptability. To address these limitations, this study introduces an agent-based decision support framework for adaptive sensor placement across dynamically changing modal test environments. The framework formulates the problem as an underspecified partially observable Markov decision process, enabling the training of a generalist reinforcement learning agent through a dual-curriculum learning strategy. A detailed case study on a steel cantilever structure demonstrates the efficacy of the proposed method in optimising sensor locations across frequency segments, validating its robustness and real-world applicability in experimental settings.