planning - 2025-09-23

Strategic Coordination for Evolving Multi-agent Systems: A Hierarchical Reinforcement and Collective Learning Approach

Authors:Chuhao Qin, Evangelos Pournaras
Date:2025-09-22 17:58:45

Decentralized combinatorial optimization in evolving multi-agent systems poses significant challenges, requiring agents to balance long-term decision-making and short-term optimized collective outcomes while preserving the autonomy of interactive agents under unanticipated changes. Reinforcement learning offers a way to model sequential decision-making through dynamic programming to anticipate future environmental changes. However, applying multi-agent reinforcement learning (MARL) to decentralized combinatorial optimization problems remains an open challenge due to the exponential growth of the joint state-action space, high communication overhead, and privacy concerns in centralized training. To address these limitations, this paper proposes Hierarchical Reinforcement and Collective Learning (HRCL), a novel approach that leverages both MARL and decentralized collective learning based on a hierarchical framework. Agents select high-level strategies using MARL to group possible plans for action space reduction and to constrain agent behavior for Pareto optimality. Meanwhile, the low-level collective learning layer ensures efficient and decentralized coordinated decisions among agents with minimal communication. Extensive experiments in a synthetic scenario and real-world smart city application models, including energy self-management and drone swarm sensing, demonstrate that HRCL significantly improves performance, scalability, and adaptability compared to the standalone MARL and collective learning approaches, achieving a win-win synthesis solution.

V2V-GoT: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multimodal Large Language Models and Graph-of-Thoughts

Authors:Hsu-kuang Chiu, Ryo Hachiuma, Chien-Yi Wang, Yu-Chiang Frank Wang, Min-Hung Chen, Stephen F. Smith
Date:2025-09-22 17:27:29

Current state-of-the-art autonomous vehicles could face safety-critical situations when their local sensors are occluded by large nearby objects on the road. Vehicle-to-vehicle (V2V) cooperative autonomous driving has been proposed as a means of addressing this problem, and one recently introduced framework for cooperative autonomous driving has further adopted an approach that incorporates a Multimodal Large Language Model (MLLM) to integrate cooperative perception and planning processes. However, despite the potential benefit of applying graph-of-thoughts reasoning to the MLLM, this idea has not been considered by previous cooperative autonomous driving research. In this paper, we propose a novel graph-of-thoughts framework specifically designed for MLLM-based cooperative autonomous driving. Our graph-of-thoughts includes our proposed novel ideas of occlusion-aware perception and planning-aware prediction. We curate the V2V-GoT-QA dataset and develop the V2V-GoT model for training and testing the cooperative driving graph-of-thoughts. Our experimental results show that our method outperforms other baselines in cooperative perception, prediction, and planning tasks.

The STAR-XAI Protocol: An Interactive Framework for Inducing Second-Order Agency in AI Agents

Authors:Antoni Guasch, Maria Isabel Valdez
Date:2025-09-22 16:24:17

Current Large Reasoning Models (LRMs) exhibit significant limitations in reliability and transparency, often showing a collapse in reasoning capabilities when faced with high-complexity, long-horizon tasks. This "illusion of thinking" is frequently an artifact of non-agentic, black-box evaluation paradigms that fail to cultivate robust problem-solving processes. In response, we introduce The STAR-XAI Protocol (Socratic, Transparent, Agentic, Reasoning - for eXplainable Artificial Intelligence), a novel methodology for training and operating verifiably reliable AI agents. Our method reframes the human-AI interaction as a structured, Socratic dialogue, governed by an explicit and evolving rulebook, the Consciousness Transfer Package (CTP). Through an interactive Gameplay Cycle that enforces ante-hoc strategic justification and a state-locking Checksum that prevents error accumulation, the protocol transforms a powerful but opaque LRM into a disciplined "Clear Box" agent. We demonstrate the efficacy of this method through an exhaustive 25-move case study in the complex strategic game "Caps i Caps". The agent not only solved the high-complexity puzzle but also demonstrated Second-Order Agency, identifying flaws in its own supervisor-approved plans and adapting its core integrity protocols mid-task. The STAR-XAI Protocol offers a practical pathway to creating AI agents that are not just high-performing, but also transparent, auditable, and trustworthy by design.

SmaRT: Style-Modulated Robust Test-Time Adaptation for Cross-Domain Brain Tumor Segmentation in MRI

Authors:Yuanhan Wang, Yifei Chen, Shuo Jiang, Wenjing Yu, Mingxuan Liu, Beining Wu, Jinying Zong, Feiwei Qin, Changmiao Wang, Qiyuan Tian
Date:2025-09-22 15:50:59

Reliable brain tumor segmentation in MRI is indispensable for treatment planning and outcome monitoring, yet models trained on curated benchmarks often fail under domain shifts arising from scanner and protocol variability as well as population heterogeneity. Such gaps are especially severe in low-resource and pediatric cohorts, where conventional test-time or source-free adaptation strategies often suffer from instability and structural inconsistency. We propose SmaRT, a style-modulated robust test-time adaptation framework that enables source-free cross-domain generalization. SmaRT integrates style-aware augmentation to mitigate appearance discrepancies, a dual-branch momentum strategy for stable pseudo-label refinement, and structural priors enforcing consistency, integrity, and connectivity. This synergy ensures both adaptation stability and anatomical fidelity under extreme domain shifts. Extensive evaluations on sub-Saharan Africa and pediatric glioma datasets show that SmaRT consistently outperforms state-of-the-art methods, with notable gains in Dice accuracy and boundary precision. Overall, SmaRT bridges the gap between algorithmic advances and equitable clinical applicability, supporting robust deployment of MRI-based neuro-oncology tools in diverse clinical environments. Our source code is available at https://github.com/baiyou1234/SmaRT.

Improving Zero-shot Sentence Decontextualisation with Content Selection and Planning

Authors:Zhenyun Deng, Yulong Chen, Andreas Vlachos
Date:2025-09-22 15:47:07

Extracting individual sentences from a document as evidence or reasoning steps is commonly done in many NLP tasks. However, extracted sentences often lack context necessary to make them understood, e.g., coreference and background information. To this end, we propose a content selection and planning framework for zero-shot decontextualisation, which determines what content should be mentioned and in what order for a sentence to be understood out of context. Specifically, given a potentially ambiguous sentence and its context, we first segment it into basic semantically-independent units. We then identify potentially ambiguous units from the given sentence, and extract relevant units from the context based on their discourse relations. Finally, we generate a content plan to rewrite the sentence by enriching each ambiguous unit with its relevant units. Experimental results demonstrate that our approach is competitive for sentence decontextualisation, producing sentences that exhibit better semantic integrity and discourse coherence, outperforming existing methods.

Improving After-sales Service: Deep Reinforcement Learning for Dynamic Time Slot Assignment with Commitments and Customer Preferences

Authors:Xiao Mao, Albert H. Schrotenboer, Guohua Wu, Willem van Jaarsveld
Date:2025-09-22 15:09:39

Problem definition: For original equipment manufacturers (OEMs), high-tech maintenance is a strategic component in after-sales services, involving close coordination between customers and service engineers. Each customer suggests several time slots for their maintenance task, from which the OEM must select one. This decision needs to be made promptly to support customers' planning. At the end of each day, routes for service engineers are planned to fulfill the tasks scheduled for the following day. We study this hierarchical and sequential decision-making problem, the Dynamic Time Slot Assignment Problem with Commitments and Customer Preferences (DTSAP-CCP), in this paper. Methodology/results: Two distinct approaches are proposed: 1) an attention-based deep reinforcement learning approach with rollout execution (ADRL-RE) and 2) a scenario-based planning approach (SBP). The ADRL-RE combines a well-trained attention-based neural network with a rollout framework for online trajectory simulation. To support the training, we develop a neural heuristic solver that provides rapid route planning solutions, enabling efficient learning in complex combinatorial settings. The SBP approach samples several scenarios to guide the time slot assignment. Numerical experiments demonstrate the superiority of ADRL-RE and the stability of SBP compared to both rule-based and rollout-based approaches. Furthermore, the strong practicality of ADRL-RE is verified in a case study of after-sales service for large medical equipment. Implications: This study provides OEMs with practical decision-support tools for dynamic maintenance scheduling, balancing customer preferences and operational efficiency. In particular, our ADRL-RE shows strong real-world potential, supporting timely and customer-aligned maintenance scheduling.

SocialTraj: Two-Stage Socially-Aware Trajectory Prediction for Autonomous Driving via Conditional Diffusion Model

Authors:Xiao Zhou, Zengqi Peng, Jun Ma
Date:2025-09-22 14:42:51

Accurate trajectory prediction of surrounding vehicles (SVs) is crucial for autonomous driving systems to avoid misguided decisions and potential accidents. However, achieving reliable predictions in highly dynamic and complex traffic scenarios remains a significant challenge. One of the key impediments lies in the limited effectiveness of current approaches to capture the multi-modal behaviors of drivers, which leads to predicted trajectories that deviate from actual future motions. To address this issue, we propose SocialTraj, a novel trajectory prediction framework integrating social psychology principles through social value orientation (SVO). By utilizing Bayesian inverse reinforcement learning (IRL) to estimate the SVO of SVs, we obtain the critical social context to infer the future interaction trend. To ensure modal consistency in predicted behaviors, the estimated SVOs of SVs are embedded into a conditional denoising diffusion model that aligns generated trajectories with historical driving styles. Additionally, the planned future trajectory of the ego vehicle (EV) is explicitly incorporated to enhance interaction modeling. Extensive experiments on NGSIM and HighD datasets demonstrate that SocialTraj is capable of adapting to highly dynamic and interactive scenarios while generating socially compliant and behaviorally consistent trajectory predictions, outperforming existing baselines. Ablation studies demonstrate that dynamic SVO estimation and explicit ego-planning components notably improve prediction accuracy and substantially reduce inference time.

Remote Sensing-Oriented World Model

Authors:Yuxi Lu, Biao Wu, Zhidong Li, Kunqi Li, Chenya Huang, Huacan Wang, Qizhen Lan, Ronghao Chen, Ling Chen, Bin Liang
Date:2025-09-22 14:02:39

World models have shown potential in artificial intelligence by predicting and reasoning about world states beyond direct observations. However, existing approaches are predominantly evaluated in synthetic environments or constrained scene settings, limiting their validation in real-world contexts with broad spatial coverage and complex semantics. Meanwhile, remote sensing applications urgently require spatial reasoning capabilities for disaster response and urban planning. This paper bridges these gaps by introducing the first framework for world modeling in remote sensing. We formulate remote sensing world modeling as direction-conditioned spatial extrapolation, where models generate semantically consistent adjacent image tiles given a central observation and directional instruction. To enable rigorous evaluation, we develop RSWISE (Remote Sensing World-Image Spatial Evaluation), a benchmark containing 1,600 evaluation tasks across four scenarios: general, flood, urban, and rural. RSWISE combines visual fidelity assessment with instruction compliance evaluation using GPT-4o as a semantic judge, ensuring models genuinely perform spatial reasoning rather than simple replication. Afterwards, we present RemoteBAGEL, a unified multimodal model fine-tuned on remote sensing data for spatial extrapolation tasks. Extensive experiments demonstrate that RemoteBAGEL consistently outperforms state-of-the-art baselines on RSWISE.

Towards Learning Boulder Excavation with Hydraulic Excavators

Authors:Jonas Gruetter, Lorenzo Terenzi, Pascal Egli, Marco Hutter
Date:2025-09-22 12:27:06

Construction sites frequently require removing large rocks before excavation or grading can proceed. Human operators typically extract these boulders using only standard digging buckets, avoiding time-consuming tool changes to specialized grippers. This task demands manipulating irregular objects with unknown geometries in harsh outdoor environments where dust, variable lighting, and occlusions hinder perception. The excavator must adapt to varying soil resistance (dragging along hard-packed surfaces or penetrating soft ground) while coordinating multiple hydraulic joints to secure rocks using a shovel. Current autonomous excavation focuses on continuous media (soil, gravel) or uses specialized grippers with detailed geometric planning for discrete objects. These approaches either cannot handle large irregular rocks or require impractical tool changes that interrupt workflow. We train a reinforcement learning policy in simulation using rigid-body dynamics and analytical soil models. The policy processes sparse LiDAR points (just 20 per rock) from vision-based segmentation and proprioceptive feedback to control standard excavator buckets. The learned agent discovers different strategies based on soil resistance: dragging along the surface in hard soil and penetrating directly in soft conditions. Field tests on a 12-ton excavator achieved 70% success across varied rocks (0.4-0.7m) and soil types, compared to 83% for human operators. This demonstrates that standard construction equipment can learn complex manipulation despite sparse perception and challenging outdoor conditions.

Comparator Loss: An Ordinal Contrastive Loss to Derive a Severity Score for Speech-based Health Monitoring

Authors:Jacob J Webber, Oliver Watts, Lovisa Wihlborg, David Wheatley, Johnny Tam, Christine Weaver, Suvankar Pal, Siddharthan Chandran, Cassia Valentini-Botinhao
Date:2025-09-22 12:03:58

Monitoring the progression of neurodegenerative disease has important applications in the planning of treatment and the evaluation of future medications. Whereas much of the state-of-the-art in health monitoring from speech has been focused on classifying patients versus healthy controls, or predicting real-world health metrics, we propose here a novel measure of disease progression: the severity score. This score is derived from a model trained to minimize what we call the comparator loss. The comparator loss ensures scores follow an ordering relation, which can be based on diagnosis, clinically annotated scores, or simply the chronological order of the recordings. In addition to giving a more detailed picture than a simple discrete classification, the proposed comparator loss-based system has the potential to incorporate information from disparate health metrics, which is critical for making full use of small health-related datasets. We evaluated our proposed models based on their ability to affirmatively track the progression of patients with motor neuron disease (MND), the correlation of their output with clinical annotations such as ALSFRS-R, as well as their ability to distinguish between subjects with MND and healthy controls.
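
The abstract above does not give the functional form of the comparator loss, so the following is only a minimal sketch of the general idea of training scores to respect an ordering relation. It substitutes a standard pairwise logistic ranking loss; the names `softplus` and `pairwise_ordinal_loss` are hypothetical and not taken from the paper.

```python
import math
from typing import Sequence


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pairwise_ordinal_loss(scores: Sequence[float], ranks: Sequence[float]) -> float:
    """Mean logistic ranking loss over all ordered pairs.

    ASSUMPTION: this is a generic stand-in, not the authors' comparator loss.
    `ranks` encodes the ordering relation (for example recording date or a
    clinical score); a pair (i, j) with ranks[i] < ranks[j] is penalized
    when scores[i] is not clearly below scores[j].
    """
    total, count = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if ranks[i] < ranks[j]:
                # Small when scores[j] exceeds scores[i], large otherwise.
                total += softplus(scores[i] - scores[j])
                count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    ranks = [0, 1, 2]  # e.g. chronological order of three recordings
    print(pairwise_ordinal_loss([0.0, 1.0, 2.0], ranks))  # scores agree with the order
    print(pairwise_ordinal_loss([2.0, 1.0, 0.0], ranks))  # scores contradict the order
```

Scores that agree with the ordering yield a lower loss than scores that contradict it, which is the property an ordering-based objective needs; items with equal rank contribute no pairs.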

GPS Denied IBVS-Based Navigation and Collision Avoidance of UAV Using a Low-Cost RGB Camera

Authors:Xiaoyu Wang, Yan Rui Tan, William Leong, Sunan Huang, Rodney Teo, Cheng Xiang
Date:2025-09-22 07:26:40

This paper proposes an image-based visual servoing (IBVS) framework for UAV navigation and collision avoidance using only an RGB camera. While UAV navigation has been extensively studied, it remains challenging to apply IBVS in missions involving multiple visual targets and collision avoidance. The proposed method achieves navigation without explicit path planning, and collision avoidance is realized through AI-based monocular depth estimation from RGB images. Unlike approaches that rely on stereo cameras or external workstations, our framework runs fully onboard a Jetson platform, ensuring a self-contained and deployable system. Experimental results validate that the UAV can navigate across multiple AprilTags and avoid obstacles effectively in GPS-denied environments.

In vivo and predictive interplay evaluation methodology for lung and esophageal cancer patients treated in free breathing with IMPT

Authors:Giorgio Cartechini, Esther Kneepkens, Gloria Vilches-Freixas, Indra Lubken, Marije Velders, Sebastiaan Nijsten, Mirko Unipan, Ilaria Rinaldi
Date:2025-09-22 07:24:05

Purpose: Pencil beam scanning proton therapy is sensitive to respiratory motion, leading to potential dose inhomogeneities due to interplay effects. We developed and validated a predictive framework to assess interplay and motion robustness in lung and esophageal cancer patients treated under free-breathing conditions. Methods: A synthetic-breathing-based predictive model was implemented in the RayStation treatment planning system (TPS) and validated against an in vivo approach using patient-specific respiratory traces and machine log files. Both were benchmarked against the Monte Carlo engine FRED. To demonstrate the methodology, the framework was applied to two clinical cases treated on the Mevion S250i system without rescanning. Dose accumulation incorporated respiratory phase, range (±3%), and setup (±5 mm) uncertainties. Results: Excellent agreement was observed between TPS and FRED (<1% mean dose difference) and between predictive and in vivo models (<2% across DVH metrics). Cumulative dose distributions for the primary CTV converged after five fractions, confirming robust delivery. Conclusions: This automated, clinically integrated framework enables pre-treatment prediction and in vivo validation of interplay and motion robustness in PBS plans. Preliminary results support its clinical utility, especially for hypofractionation and targets with large motion (>2 cm). A larger cohort study is ongoing and will be reported separately.

Fast Trajectory Planner with a Reinforcement Learning-based Controller for Robotic Manipulators

Authors:Yongliang Wang, Hamidreza Kasaei
Date:2025-09-22 06:45:21

Generating obstacle-free trajectories for robotic manipulators in unstructured and cluttered environments remains a significant challenge. Existing motion planning methods often require additional computational effort to generate the final trajectory by solving kinematic or dynamic equations. This paper highlights the strong potential of model-free reinforcement learning methods over model-based approaches for obstacle-free trajectory planning in joint space. We propose a fast trajectory planning system for manipulators that combines vision-based path planning in task space with reinforcement learning-based obstacle avoidance in joint space. We divide the framework into two key components. The first introduces an innovative vision-based trajectory planner in task space, leveraging the large-scale fast segment anything (FSA) model in conjunction with basis spline (B-spline)-optimized kinodynamic path searching. The second component enhances the proximal policy optimization (PPO) algorithm by integrating action ensembles (AE) and policy feedback (PF), which greatly improve precision and stability in goal-reaching and obstacle avoidance within the joint space. These PPO enhancements increase the algorithm's adaptability across diverse robotic tasks, ensuring consistent execution of commands from the first component by the manipulator, while also enhancing both obstacle avoidance efficiency and reaching accuracy. The experimental results demonstrate the effectiveness of PPO enhancements, as well as simulation-to-simulation (Sim-to-Sim) and simulation-to-reality (Sim-to-Real) transfer, in improving model robustness and planner efficiency in complex scenarios. These enhancements allow the robot to perform obstacle avoidance and real-time trajectory planning in obstructed environments. Project page available at: https://sites.google.com/view/ftp4rm/home

Quantum Noise Reduction in the Space-based Gravitational Wave Antenna DECIGO Using Optical Springs and Homodyne Detection scheme

Authors:Kenji Tsuji, Tomohiro Ishikawa, Kentaro Komori, Yutaro Enomoto, Yuta Michimura, Kurumi Umemura, Shoki Iwaguchi, Keiko Kokeyama, Seiji Kawamura
Date:2025-09-22 06:23:00

The DECi-hertz Interferometer Gravitational-wave Observatory (DECIGO) is a planned space-based, next-generation gravitational wave detector aimed at observing primordial gravitational waves originating from cosmic inflation. This work focuses on reducing the quantum noise in the instrument's observation band of 0.1 to 10 Hz by employing optical springs and a homodyne detection scheme. Although detuning 1000 km long arm cavities was previously considered ineffective due to quantum state degradation from diffraction losses, we revisit this problem by formulating a new, rigorous model for the quantum state of light that accounts for the vacuum state mixing caused by diffraction losses. This work shows that high sensitivities can be achieved by employing optimal configurations of optical springs and homodyne detection schemes even with diffraction losses. These improvements alone are still not sufficient to achieve the sensitivity needed to detect primordial gravitational waves, as other technical noises limit further improvement.

Institutional Research Computing Capabilities in Australia: 2024

Authors:Slava Kitaeff, Luc Betbeder-Matibet, Jake Carroll, Stephen Giugni, David Abramson, John Zaitseff, Sarah Walters, David Powell, Chris Bording, Trung Nguyen, Angus Macoustra, Fabien Voisin, Bowen Chen, Jarrod Hurley
Date:2025-09-22 04:28:29

Institutional research computing infrastructure plays a vital role in Australia's research ecosystem, complementing and extending national facilities. This paper analyses research computing capabilities across Australian universities and organisations, showing how institutional systems support research excellence through local compute resources, specialised hardware, and cluster solutions. Our study finds that nearly 112,258 CPU cores and 2,241 GPUs serve over 6,000 researchers as essential bridges between desktops and national facilities, enabling workflows from development to large-scale computations. The estimated replacement value of this infrastructure is $144M AUD. Drawing on detailed data from multiple institutions, we identify key patterns in deployment, utilisation, and strategic alignment with research priorities. Institutional resources provide critical support for data-intensive projects, facilitate training and higher-degree student research, enable prototyping and development, and ensure data sovereignty compliance when required. The analysis shows how these facilities leverage national investments while addressing institution-specific needs that national systems cannot meet. We present evidence that strategic investment in institutional capabilities yields significant returns through greater research productivity, enhanced graduate training, and improved outcomes. The study offers insights for organisations planning computing strategies and highlights the importance of maintaining robust institutional resources alongside national facilities.

AERO-MPPI: Anchor-Guided Ensemble Trajectory Optimization for Agile Mapless Drone Navigation

Authors:Xin Chen, Rui Huang, Longbin Tang, Lin Zhao
Date:2025-09-22 03:21:51

Agile mapless navigation in cluttered 3D environments poses significant challenges for autonomous drones. Conventional mapping-planning-control pipelines incur high computational cost and propagate estimation errors. We present AERO-MPPI, a fully GPU-accelerated framework that unifies perception and planning through an anchor-guided ensemble of Model Predictive Path Integral (MPPI) optimizers. Specifically, we design a multi-resolution LiDAR point-cloud representation that rapidly extracts spatially distributed "anchors" as look-ahead intermediate endpoints, from which we construct polynomial trajectory guides to explore distinct homotopy path classes. At each planning step, we run multiple MPPI instances in parallel and evaluate them with a two-stage multi-objective cost that balances collision avoidance and goal reaching. Implemented entirely with NVIDIA Warp GPU kernels, AERO-MPPI achieves real-time onboard operation and mitigates the local-minima failures of single-MPPI approaches. Extensive simulations in forests, verticals, and inclines demonstrate sustained reliable flight above 7 m/s, with success rates above 80% and smoother trajectories compared to state-of-the-art baselines. Real-world experiments on a LiDAR-equipped quadrotor with NVIDIA Jetson Orin NX 16G confirm that AERO-MPPI runs in real time onboard and consistently achieves safe, agile, and robust flight in complex cluttered environments. The code will be open-sourced upon acceptance of the paper.
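
For readers unfamiliar with MPPI, the sketch below shows one update of a single, plain MPPI optimizer on a toy 1-D problem. It is an illustration under stated assumptions only: the anchor guidance, the ensemble of parallel instances, the two-stage cost, and the GPU implementation described in the abstract are not reproduced. The dynamics, the cost, and the names `rollout_cost` and `mppi_step` are hypothetical.

```python
import math
import random
from typing import List


def rollout_cost(x0: float, controls: List[float], goal: float, dt: float = 0.1) -> float:
    """Cost of a control sequence on a toy model x <- x + u * dt (an assumption)."""
    x, cost = x0, 0.0
    for u in controls:
        x += u * dt
        cost += (x - goal) ** 2  # squared distance to the goal at every step
    return cost


def mppi_step(x0: float, controls: List[float], goal: float, rng: random.Random,
              num_samples: int = 64, sigma: float = 1.0, lam: float = 1.0) -> List[float]:
    """One MPPI update: sample noisy control sequences, weight them by cost, average."""
    horizon = len(controls)
    noises = [[rng.gauss(0.0, sigma) for _ in range(horizon)] for _ in range(num_samples)]
    costs = [rollout_cost(x0, [u + e for u, e in zip(controls, eps)], goal)
             for eps in noises]
    best = min(costs)  # subtracted for numerical stability of the exponentials
    weights = [math.exp(-(c - best) / lam) for c in costs]
    norm = sum(weights)
    # New nominal controls: old controls plus the cost-weighted mean perturbation.
    return [u + sum(w * eps[t] for w, eps in zip(weights, noises)) / norm
            for t, u in enumerate(controls)]


if __name__ == "__main__":
    rng = random.Random(0)
    controls = [0.0] * 10
    print("initial cost:", rollout_cost(0.0, controls, 1.0))
    for _ in range(20):
        controls = mppi_step(0.0, controls, 1.0, rng)
    print("cost after 20 updates:", rollout_cost(0.0, controls, 1.0))
```

Because every sample is drawn around a single nominal control sequence, one optimizer of this kind explores only the neighborhood of that sequence, which is consistent with the local-minima failures of single-MPPI approaches that the abstract mentions.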

UIPro: Unleashing Superior Interaction Capability For GUI Agents

Authors:Hongxin Li, Jingran Su, Jingfan Chen, Zheng Ju, Yuntao Chen, Qing Li, Zhaoxiang Zhang
Date:2025-09-22 03:04:53

Building autonomous agents that perceive and operate graphical user interfaces (GUIs) like humans has long been a vision in the field of artificial intelligence. Central to these agents is the capability for GUI interaction, which involves GUI understanding and planning capabilities. Existing methods have tried developing GUI agents based on the multi-modal comprehension ability of vision-language models (VLMs). However, the limited scenario, insufficient size, and heterogeneous action spaces hinder the progress of building generalist GUI agents. To resolve these issues, this paper proposes UIPro, a novel generalist GUI agent trained with extensive multi-platform and multi-task GUI interaction data, coupled with a unified action space. We first curate a comprehensive dataset encompassing 20.6 million GUI understanding tasks to pre-train UIPro, granting it a strong GUI grounding capability, which is key to downstream GUI agent tasks. Subsequently, we establish a unified action space to harmonize heterogeneous GUI agent task datasets and produce a merged dataset to foster the action prediction ability of UIPro via continued fine-tuning. Experimental results demonstrate UIPro's superior performance across multiple GUI task benchmarks on various platforms, highlighting the effectiveness of our approach.

Physics-Informed Operator Learning for Hemodynamic Modeling

Authors:Ryan Chappell, Chayan Banerjee, Kien Nguyen, Clinton Fookes
Date:2025-09-22 00:24:44

Accurate modeling of personalized cardiovascular dynamics is crucial for non-invasive monitoring and therapy planning. State-of-the-art physics-informed neural network (PINN) approaches employ deep, multi-branch architectures with adversarial or contrastive objectives to enforce partial differential equation constraints. While effective, these enhancements introduce significant training and implementation complexity, limiting scalability and practical deployment. We investigate physics-informed neural operator learning models as efficient supervisory signals for training simplified architectures through knowledge distillation. Our approach pre-trains a physics-informed DeepONet (PI-DeepONet) on high-fidelity cuffless blood pressure recordings to learn operator mappings from raw wearable waveforms to beat-to-beat pressure signals under embedded physics constraints. This pre-trained operator serves as a frozen supervisor in a lightweight knowledge-distillation pipeline, guiding streamlined base models that eliminate complex adversarial and contrastive learning components while maintaining performance. We characterize the role of physics-informed regularization in operator learning and demonstrate its effectiveness for supervisory guidance. Through extensive experiments, our operator-supervised approach achieves performance parity with complex baselines (correlation: 0.766 vs. 0.770, RMSE: 4.452 vs. 4.501), while dramatically reducing architectural complexity from eight critical hyperparameters to a single regularization coefficient and decreasing training overhead by 4%. Our results demonstrate that operator-based supervision effectively replaces intricate multi-component training strategies, offering a more scalable and interpretable approach to physiological modeling with reduced implementation burden.

Agentic AI for Multi-Stage Physics Experiments at a Large-Scale User Facility Particle Accelerator

Authors:Thorsten Hellert, Drew Bertwistle, Simon C. Leemann, Antonin Sulc, Marco Venturini
Date:2025-09-21 22:11:03

We present the first language-model-driven agentic artificial intelligence (AI) system to autonomously execute multi-stage physics experiments on a production synchrotron light source. Implemented at the Advanced Light Source particle accelerator, the system translates natural language user prompts into structured execution plans that combine archive data retrieval, control-system channel resolution, automated script generation, controlled machine interaction, and analysis. In a representative machine physics task, we show that preparation time was reduced by two orders of magnitude relative to manual scripting even for a system expert, while operator-standard safety constraints were strictly upheld. Core architectural features (plan-first orchestration, bounded tool access, and dynamic capability selection) enable transparent, auditable execution with fully reproducible artifacts. These results establish a blueprint for the safe integration of agentic AI into accelerator experiments and demanding machine physics studies, as well as routine operations, with direct portability across accelerators worldwide and, more broadly, to other large-scale scientific infrastructures.

Seeing is Deceiving: Mirror-Based LiDAR Spoofing for Autonomous Vehicle Deception

Authors:Selma Yahia, Ildi Alla, Girija Bangalore Mohan, Daniel Rau, Mridula Singh, Valeria Loscri
Date:2025-09-21 22:05:36

Autonomous vehicles (AVs) rely heavily on LiDAR sensors for accurate 3D perception. We show a novel class of low-cost, passive LiDAR spoofing attacks that exploit mirror-like surfaces to inject or remove objects from an AV's perception. Using planar mirrors to redirect LiDAR beams, these attacks require no electronics or custom fabrication and can be deployed in real settings. We define two adversarial goals: Object Addition Attacks (OAA), which create phantom obstacles, and Object Removal Attacks (ORA), which conceal real hazards. We develop geometric optics models, validate them with controlled outdoor experiments using a commercial LiDAR and an Autoware-equipped vehicle, and implement a CARLA-based simulation for scalable testing. Experiments show mirror attacks corrupt occupancy grids, induce false detections, and trigger unsafe planning and control behaviors. We discuss potential defenses (thermal sensing, multi-sensor fusion, light-fingerprinting) and their limitations.
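
The geometry behind a mirror-induced phantom can be illustrated with the textbook law of reflection. A sensor that assumes straight-line propagation places a return at the mirror image of the object the redirected beam actually hit. The sketch below computes that image point for an ideal planar mirror; it is not the authors' geometric optics model, and the function name `mirror_image` is hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def mirror_image(point: Vec3, plane_point: Vec3, unit_normal: Vec3) -> Vec3:
    """Reflect `point` across the plane through `plane_point` with normal `unit_normal`.

    ASSUMPTION: an ideal, infinitely large planar mirror; `unit_normal` must
    already have length 1.
    """
    # Signed distance from the point to the mirror plane.
    d = sum((p - q) * n for p, q, n in zip(point, plane_point, unit_normal))
    # Move the point twice that distance along the normal, through the plane.
    x, y, z = (p - 2.0 * d * n for p, n in zip(point, unit_normal))
    return (x, y, z)


if __name__ == "__main__":
    # Mirror in the plane x = 0; a real object sits 1 m in front of it.
    real_object = (1.0, 2.0, 3.0)
    phantom = mirror_image(real_object, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
    print(phantom)  # apparent position behind the mirror
```

Reflecting the phantom again returns the original point, which is a quick sanity check on the sign conventions.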

Radial Velocity Strategies for the Orbital Refinement of Exoplanet Direct Imaging Targets

Authors:Zhexing Li, Stephen R. Kane, Sarah Blunt, Caleb K. Harada
Date:2025-09-21 17:34:47

Many potential direct imaging candidates suffer from large orbital period uncertainties, leading to challenges in accurate predictions of future orbital positions and imprecise direct imaging measurements of planetary parameters. To improve the precision in orbital properties, precursor radial velocity (RV) follow-up observations for selected candidates are essential. This study examines the impact of three variables on the orbital period uncertainties of long-period giant planets: the number of future observations, the temporal gap between past and future data, and the temporal coverage of upcoming observations. Our simulations indicate that the orbital phases at which future RV observations are acquired play a significant role in reducing period uncertainties. Additionally, observing too frequently within a given time frame adds limited value to the program once a certain number of observations has been achieved. The temporal gap proves to be the most important factor when there is no strict end time to the observing campaign. However, if a strict end time is set, starting observations earlier yields improved reductions in orbital period uncertainty. These insights offer practical guidance for planning efficient RV follow-up campaigns to maximize the science yield of future space-based direct imaging missions.

Time Series Forecasting Using a Hybrid Deep Learning Method: A Bi-LSTM Embedding Denoising Auto Encoder Transformer

Authors:Sahar Koohfar, Wubeshet Woldemariam
Date:2025-09-21 17:16:32

Time series data is a prevalent form of data found in various fields. It consists of a series of measurements taken over time. Forecasting is a crucial application of time series models, where future values are predicted based on historical data. Accurate forecasting is essential for making well-informed decisions across industries. When it comes to electric vehicles (EVs), precise predictions play a key role in planning infrastructure development, load balancing, and energy management. This study introduces a Bi-LSTM embedding denoising autoencoder model (BDM) designed to address time series problems, focusing on short-term EV charging load prediction. The performance of the proposed model is evaluated by comparing it with benchmark models like Transformer, CNN, RNN, LSTM, and GRU. Based on the results of the study, the proposed model outperforms the benchmark models in four of the five time steps, demonstrating its effectiveness for time series forecasting. This research makes a significant contribution to enhancing time series forecasting, thereby improving decision-making processes.

A community-driven optimization framework for redrawing school attendance boundaries

Authors:Hongzhao Guan, Paul Riggins, Tyler Simko, Jasmine Mangat, Cassandra Moe, Urooj Haider, Frank Pantano, Effie G. McMillian, Genevieve Siegel-Hawley, Pascal Van Hentenryck, Nabeel Gillani
Date:2025-09-21 15:42:50

The vast majority of US public school districts use school attendance boundaries to determine which student addresses are assigned to which schools. Existing work shows how redrawing boundaries can be a powerful policy lever for increasing access and opportunity for historically disadvantaged groups, even while maintaining other priorities like minimizing driving distances and preserving existing social ties between students and families. This study introduces a multi-objective algorithmic school rezoning framework and applies it to a large-scale rezoning effort impacting over 50,000 students through an ongoing researcher-school district partnership. The framework is designed to incorporate feedback from community members and policymakers, both by deciding which goals are optimized and by placing differential "importance" on goals through weights from community surveys. Empirical results reveal the framework's ability to surface school redistricting plans that simultaneously advance a number of objectives often thought to be in competition with one another, including socioeconomic integration, transportation efficiency, and stable feeder patterns (transitions) between elementary, middle, and high schools. The paper also highlights how local education policymakers navigate several practical challenges, like building political will to make change in a polarized policy climate. The framework is built using open-source tools and publicly released to support school districts in exploring and implementing new policies to improve educational access and opportunity in the coming years.
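
One common way to place differential importance on goals is weighted-sum scalarization. The sketch below assumes that form for illustration only; the abstract does not state the paper's actual formulation. The objective names, plan scores, and survey weights are all made up, and every score is assumed to be scaled to [0, 1] with higher meaning better.

```python
def normalize_weights(raw_weights):
    """Scale raw survey weights so they sum to 1."""
    total = sum(raw_weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {goal: w / total for goal, w in raw_weights.items()}

def weighted_score(plan_scores, weights):
    """Weighted sum of per-goal scores for one candidate plan."""
    return sum(weights[goal] * plan_scores[goal] for goal in weights)

def best_plan(plans, raw_weights):
    """Name of the plan with the highest weighted score."""
    weights = normalize_weights(raw_weights)
    return max(plans, key=lambda name: weighted_score(plans[name], weights))

# Hypothetical candidate plans and survey outcomes.
plans = {
    "A": {"integration": 0.9, "travel": 0.4, "feeder": 0.5},
    "B": {"integration": 0.5, "travel": 0.9, "feeder": 0.8},
}
integration_first = {"integration": 3, "travel": 1, "feeder": 1}
travel_first = {"integration": 1, "travel": 3, "feeder": 1}
choice_integration = best_plan(plans, integration_first)
choice_travel = best_plan(plans, travel_first)
```

Changing the survey weights changes which plan surfaces, which is the sense in which community input steers the optimization in this sketch.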

RISE: Adaptive music playback for Realtime Intensity Synchronization with Exercise

Authors:Alexander Wang, Chris Donahue, Dhruv Jain
Date:2025-09-21 15:07:47

We propose a system to adapt a user's music to their exercise by aligning high-energy music segments with intense intervals of the workout. Listening to music during exercise can boost motivation and performance. However, the structure of the music may be different from the user's natural phases of rest and work, causing users to rest longer than needed while waiting for a motivational section, or lose motivation mid-work if the section ends too soon. To address this, our system, called RISE, automatically estimates the intense segments in music and uses component-based music rearrangement techniques to dynamically extend and shorten different segments of the user's song to fit the ongoing exercise routine. Our system takes as input the rest and work durations to guide adaptation. Currently, these durations are determined either via a pre-defined plan or manual input during the workout. We evaluated RISE with 12 participants, comparing it to a non-adaptive music baseline while they exercised in our lab. Participants found that our rearrangements keep intensity estimation accurate, and many recalled moments when intensity alignment helped them push through their workout.

$\texttt{DiffSyn}$: A Generative Diffusion Approach to Materials Synthesis Planning

Authors:Elton Pan, Soonhyoung Kwon, Sulin Liu, Mingrou Xie, Alexander J. Hoffman, Yifei Duan, Thorben Prein, Killian Sheriff, Yuriy Roman-Leshkov, Manuel Moliner, Rafael Gomez-Bombarelli, Elsa Olivetti
Date:2025-09-21 14:19:47

The synthesis of crystalline materials, such as zeolites, remains a significant challenge due to a high-dimensional synthesis space, intricate structure-synthesis relationships and time-consuming experiments. Considering the one-to-many relationship between structure and synthesis, we propose $\texttt{DiffSyn}$, a generative diffusion model trained on over 23,000 synthesis recipes spanning 50 years of literature. $\texttt{DiffSyn}$ generates probable synthesis routes conditioned on a desired zeolite structure and an organic template. $\texttt{DiffSyn}$ achieves state-of-the-art performance by capturing the multi-modal nature of structure-synthesis relationships. We apply $\texttt{DiffSyn}$ to differentiate among competing phases and generate optimal synthesis routes. As a proof of concept, we synthesize a UFI material using $\texttt{DiffSyn}$-generated synthesis routes. These routes, rationalized by density functional theory binding energies, resulted in the successful synthesis of a UFI material with a high Si/Al$_{\text{ICP}}$ of 19.0, which is expected to improve thermal stability and is higher than that of any previously recorded.

CoPlanner: An Interactive Motion Planner with Contingency-Aware Diffusion for Autonomous Driving

Authors:Ruiguo Zhong, Ruoyu Yao, Pei Liu, Xiaolong Chen, Rui Yang, Jun Ma
Date:2025-09-21 13:54:26

Accurate trajectory prediction and motion planning are crucial for autonomous driving systems to navigate safely in complex, interactive environments characterized by multimodal uncertainties. However, current generation-then-evaluation frameworks typically construct multiple plausible trajectory hypotheses but ultimately adopt a single most likely outcome, leading to overconfident decisions and a lack of fallback strategies that are vital for safety in rare but critical scenarios. Moreover, the usual decoupling of prediction and planning modules could result in socially inconsistent or unrealistic joint trajectories, especially in highly interactive traffic. To address these challenges, we propose a contingency-aware diffusion planner (CoPlanner), a unified framework that jointly models multi-agent interactive trajectory generation and contingency-aware motion planning. Specifically, the pivot-conditioned diffusion mechanism anchors trajectory sampling on a validated, shared short-term segment to preserve temporal consistency, while stochastically generating diverse long-horizon branches that capture multimodal motion evolutions. In parallel, we design a contingency-aware multi-scenario scoring strategy that evaluates candidate ego trajectories across multiple plausible long-horizon evolution scenarios, balancing safety, progress, and comfort. This integrated design preserves feasible fallback options and enhances robustness under uncertainty, leading to more realistic interaction-aware planning. Extensive closed-loop experiments on the nuPlan benchmark demonstrate that CoPlanner consistently surpasses state-of-the-art methods on both Val14 and Test14 datasets, achieving significant improvements in safety and comfort under both reactive and non-reactive settings. Code and model will be made publicly available upon acceptance.
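
The idea of scoring a candidate across several plausible futures, rather than only the most likely one, can be sketched as follows. This is an illustrative assumption, not CoPlanner's scoring rule: the blend of expected and worst-case score, the `risk_aversion` parameter, and all numbers are hypothetical.

```python
def scenario_score(metrics, weights):
    """Weighted sum of safety, progress, and comfort for one scenario."""
    return sum(weights[k] * metrics[k] for k in weights)

def contingency_score(per_scenario_metrics, probs, weights, risk_aversion=0.5):
    """Blend expected and worst-case score across scenarios (hypothetical rule)."""
    scores = [scenario_score(m, weights) for m in per_scenario_metrics]
    expected = sum(p * s for p, s in zip(probs, scores))
    worst = min(scores)
    return (1.0 - risk_aversion) * expected + risk_aversion * worst

weights = {"safety": 0.6, "progress": 0.3, "comfort": 0.1}
probs = [0.8, 0.2]  # e.g. the other vehicle yields / does not yield

# An aggressive candidate: excellent if the other vehicle yields, poor otherwise.
aggressive = [
    {"safety": 0.9, "progress": 1.0, "comfort": 0.8},
    {"safety": 0.1, "progress": 0.2, "comfort": 0.2},
]
# A cautious candidate: moderate in both scenarios.
cautious = [
    {"safety": 0.9, "progress": 0.5, "comfort": 0.7},
    {"safety": 0.8, "progress": 0.4, "comfort": 0.6},
]

score_aggressive = contingency_score(aggressive, probs, weights)
score_cautious = contingency_score(cautious, probs, weights)
```

With `risk_aversion=0` the rule reduces to the expected score, where the aggressive candidate wins; weighting the worst case favors the candidate that keeps a usable fallback.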

Lensed quasars in CatNorth I. Wide-separation candidates

Authors:Di Wu, Zizhao He, Nan Li, Shenzhe Cui, Yuming Fu, XueBing Wu, Dan Qiu
Date:2025-09-21 13:06:17

Wide-separation lensed quasars (WSLQs) are a rare subclass of strongly lensed quasars produced by massive galaxy clusters. They provide valuable probes of dark-matter halos and quasar host galaxies. However, only about ten WSLQ systems are currently known, which limits further studies. To enlarge the sample from wide-area surveys, we developed a catalog-based pipeline and applied it to the CatNorth database, a catalog of quasar candidates constructed from Gaia DR3. CatNorth contains 1,545,514 quasar candidates with about 90% purity and a Gaia G-band limiting magnitude of roughly 21. The pipeline has three stages. First, we identify groups with separations between 10 and 72 arcsec using a HEALPix grid with 25.6 arcsec spacing and a friends-of-friends search. We then filter by intra-group color and spectral similarity, reducing the 1,545,514 sources to 14,244 groups while retaining all known, discoverable WSLQs. Finally, a visual check, guided by image geometry and the presence of likely foreground lenses, yields the candidate list with quality labels. We identify 333 new WSLQ candidates with separations from 10 to 56.8 arcsec. Using available SDSS DR16 and DESI DR1 spectroscopy, we uncover two new candidate systems; the remaining 331 candidates lack sufficient spectra and are labeled as 45 grade A, 98 grade B, and 188 grade C. We also compile 29 confirmed dual quasars as a by-product. When feasible, we plan follow-up spectroscopy and deeper imaging to confirm WSLQs among these candidates and enable the related science.
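
The first pipeline stage can be sketched with a brute-force friends-of-friends search. This is a minimal illustration under stated assumptions: it compares every pair directly instead of using a HEALPix grid, it assumes that pairs are linked when their separation falls within the 10 to 72 arcsec window, and the function names and coordinates are made up.

```python
import math
from itertools import combinations

def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcsec (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600

def friends_of_friends(sources, min_sep=10.0, max_sep=72.0):
    """Group (ra, dec) sources linked by chains of pairs within the window."""
    parent = list(range(len(sources)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # O(n^2) pair check; a spatial grid would replace this at catalog scale.
    for i, j in combinations(range(len(sources)), 2):
        sep = angular_sep_arcsec(*sources[i], *sources[j])
        if min_sep <= sep <= max_sep:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(sources)):
        groups.setdefault(find(i), []).append(i)
    return sorted(g for g in groups.values() if len(g) > 1)

arcsec = 1.0 / 3600.0
sources = [
    (10.0, 0.0),
    (10.0 + 50 * arcsec, 0.0),   # 50 arcsec from source 0
    (10.0 + 100 * arcsec, 0.0),  # 100 arcsec from source 0, 50 from source 1
    (50.0, 0.0),
    (50.0 + 5 * arcsec, 0.0),    # closer than the 10 arcsec lower bound
    (120.0, 30.0),               # isolated
]
groups = friends_of_friends(sources)
```

Sources 0 and 2 are farther apart than the upper bound but still share a group because both are linked to source 1, which is the defining property of a friends-of-friends search.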

From domain-landmark graph learning to problem-landmark graph generation

Authors:Cristian Pérez-Corral, Antonio Garrido, Laura Sebastia
Date:2025-09-21 12:41:56

Landmarks have long played a pivotal role in automated planning, serving as crucial elements for improving the planning algorithms. The main limitation of classical landmark extraction methods is their sensitivity to specific planning tasks. This results in landmarks fully tailored to individual instances, thereby limiting their applicability across other instances of the same planning domain. We propose a novel approach that learns landmark relationships from multiple planning tasks of a planning domain. This leads to the creation of a \textit{probabilistic lifted ordering graph}, a structure that captures weighted abstractions of relationships between parameterized landmarks. Although these orderings are not 100% true (they are probabilistic), they can still be very useful in planning. Next, given a new planning task for that domain, we instantiate the relationships from that graph to this particular instance. This instantiation operates in two phases. First, it generates two graphs: the former instantiating information from the initial state and the latter from the goal state. Second, it combines these two graphs into one unified graph by searching for equivalences to extract landmark orderings. We evaluate the precision and recall of the information found by our approach over well-known planning domains.

Orchestrate, Generate, Reflect: A VLM-Based Multi-Agent Collaboration Framework for Automated Driving Policy Learning

Authors:Zengqi Peng, Yusen Xie, Yubin Wang, Rui Yang, Qifeng Chen, Jun Ma
Date:2025-09-21 11:43:25

The advancement of foundation models fosters new initiatives for policy learning in achieving safe and efficient autonomous driving. However, a critical bottleneck lies in the manual engineering of reward functions and training curricula for complex and dynamic driving tasks, which is a labor-intensive and time-consuming process. To address this problem, we propose OGR (Orchestrate, Generate, Reflect), a novel automated driving policy learning framework that leverages vision-language model (VLM)-based multi-agent collaboration. Our framework capitalizes on advanced reasoning and multimodal understanding capabilities of VLMs to construct a hierarchical agent system. Specifically, a centralized orchestrator plans high-level training objectives, while a generation module employs a two-step analyze-then-generate process for efficient generation of reward-curriculum pairs. A reflection module then facilitates iterative optimization based on the online evaluation. Furthermore, a dedicated memory module endows the VLM agents with the capabilities of long-term memory. To enhance robustness and diversity of the generation process, we introduce a parallel generation scheme and a human-in-the-loop technique for augmentation of the reward observation space. Through efficient multi-agent cooperation and leveraging rich multimodal information, OGR enables the online evolution of reinforcement learning policies to acquire interaction-aware driving skills. Extensive experiments in the CARLA simulator demonstrate the superior performance, robust generalizability across distinct urban scenarios, and strong compatibility with various RL algorithms. Further real-world experiments highlight the practical viability and effectiveness of our framework. The source code will be available upon acceptance of the paper.

End2Race: Efficient End-to-End Imitation Learning for Real-Time F1Tenth Racing

Authors:Zhijie Qiao, Haowei Li, Zhong Cao, Henry X. Liu
Date:2025-09-21 03:08:51

F1Tenth is a widely adopted reduced-scale platform for developing and testing autonomous racing algorithms, hosting annual competitions worldwide. With high operating speeds, dynamic environments, and head-to-head interactions, autonomous racing requires algorithms that diverge from those in classical autonomous driving. Training such algorithms is particularly challenging: the need for rapid decision-making at high speeds severely limits model capacity. To address this, we propose End2Race, a novel end-to-end imitation learning algorithm designed for head-to-head autonomous racing. End2Race leverages a Gated Recurrent Unit (GRU) architecture to capture continuous temporal dependencies, enabling both short-term responsiveness and long-term strategic planning. We also adopt a sigmoid-based normalization function that transforms raw LiDAR scans into spatial pressure tokens, facilitating effective model training and convergence. The algorithm is extremely efficient, achieving an inference time of less than 0.5 milliseconds on a consumer-class GPU. Experiments in the F1Tenth simulator demonstrate that End2Race achieves a 94.2% safety rate across 2,400 overtaking scenarios, each with an 8-second time limit, and successfully completes overtakes in 59.2% of cases. This surpasses previous methods and establishes ours as a leading solution for the F1Tenth racing testbed. Code is available at https://github.com/michigan-traffic-lab/End2Race.
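
A sigmoid-based normalization of LiDAR ranges might look like the sketch below. The abstract does not give the exact function, so the form, the `midpoint` and `steepness` parameters, and their values are assumptions chosen for illustration: nearby obstacles map to values close to 1 (high "pressure") and distant ones to values close to 0.

```python
import numpy as np

def lidar_to_pressure(ranges, midpoint=2.0, steepness=1.5):
    """Map raw ranges in meters to (0, 1); closer returns give higher values.

    Hypothetical parameters: `midpoint` is the range mapped to 0.5 and
    `steepness` controls how sharply the value changes around it.
    """
    r = np.asarray(ranges, dtype=float)
    return 1.0 / (1.0 + np.exp(steepness * (r - midpoint)))

scan = [0.3, 1.0, 2.0, 5.0, 30.0]  # illustrative ranges in meters
tokens = lidar_to_pressure(scan)
```

Bounding the input this way keeps very long ranges from dominating the model's input scale while preserving resolution near the vehicle, which is one plausible reason such a normalization helps training.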