planning - 2025-10-21

Robobench: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models as Embodied Brain

Authors:Yulin Luo, Chun-Kai Fan, Menghang Dong, Jiayu Shi, Mengdi Zhao, Bo-Wen Zhang, Cheng Chi, Jiaming Liu, Gaole Dai, Rongyu Zhang, Ruichuan An, Kun Wu, Zhengping Che, Shaoxuan Xie, Guocai Yao, Zhongxia Zhao, Pengwei Wang, Guang Liu, Zhongyuan Wang, Tiejun Huang, Shanghang Zhang

Date:2025-10-20 17:59:03

Building robots that can perceive, reason, and act in dynamic, unstructured environments remains a core challenge. Recent embodied systems often adopt a dual-system paradigm, where System 2 handles high-level reasoning while System 1 executes low-level control. In this work, we refer to System 2 as the embodied brain, emphasizing its role as the cognitive core for reasoning and decision-making in manipulation tasks. Given this role, systematic evaluation of the embodied brain is essential. Yet existing benchmarks emphasize execution success, or when targeting high-level reasoning, suffer from incomplete dimensions and limited task realism, offering only a partial picture of cognitive capability. To bridge this gap, we introduce RoboBench, a benchmark that systematically evaluates multimodal large language models (MLLMs) as embodied brains. Motivated by the critical roles across the full manipulation pipeline, RoboBench defines five dimensions-instruction comprehension, perception reasoning, generalized planning, affordance prediction, and failure analysis-spanning 14 capabilities, 25 tasks, and 6092 QA pairs. To ensure realism, we curate datasets across diverse embodiments, attribute-rich objects, and multi-view scenes, drawing from large-scale real robotic data. For planning, RoboBench introduces an evaluation framework, MLLM-as-world-simulator. It evaluate embodied feasibility by simulating whether predicted plans can achieve critical object-state changes. Experiments on 14 MLLMs reveal fundamental limitations: difficulties with implicit instruction comprehension, spatiotemporal reasoning, cross-scenario planning, fine-grained affordance understanding, and execution failure diagnosis. RoboBench provides a comprehensive scaffold to quantify high-level cognition, and guide the development of next-generation embodied MLLMs. The project page is in https://robo-bench.github.io.

Enterprise Deep Research: Steerable Multi-Agent Deep Research for Enterprise Analytics

Authors:Akshara Prabhakar, Roshan Ram, Zixiang Chen, Silvio Savarese, Frank Wang, Caiming Xiong, Huan Wang, Weiran Yao

Date:2025-10-20 17:55:11

As information grows exponentially, enterprises face increasing pressure to transform unstructured data into coherent, actionable insights. While autonomous agents show promise, they often struggle with domain-specific nuances, intent alignment, and enterprise integration. We present Enterprise Deep Research (EDR), a multi-agent system that integrates (1) a Master Planning Agent for adaptive query decomposition, (2) four specialized search agents (General, Academic, GitHub, LinkedIn), (3) an extensible MCP-based tool ecosystem supporting NL2SQL, file analysis, and enterprise workflows, (4) a Visualization Agent for data-driven insights, and (5) a reflection mechanism that detects knowledge gaps and updates research direction with optional human-in-the-loop steering guidance. These components enable automated report generation, real-time streaming, and seamless enterprise deployment, as validated on internal datasets. On open-ended benchmarks including DeepResearch Bench and DeepConsult, EDR outperforms state-of-the-art agentic systems without any human steering. We release the EDR framework and benchmark trajectories to advance research on multi-agent reasoning applications. Code at https://github.com/SalesforceAIResearch/enterprise-deep-research and Dataset at https://huggingface.co/datasets/Salesforce/EDR-200

TCLK Must Stay! CAMAC Must Go! How Does Fermilab Move Forward

Authors:M. R. Austin, L. Carmichael, D. McArthur, E. Milton, A. Quilty

Date:2025-10-20 17:06:30

The current Timing System at Fermilab has been around for 40 years and currently relies on 7 CAMAC crates and over 100 CAMAC cards to produce the Tevatron Clock (TCLK). Thanks to the ingenuity of those before us, this has allowed Fermilab the flexibility to change the timing and Events for its accelerator as beamlines and projects have changed over the years. With the advent of the Proton Improvement Plan-II (PIP-II), the Timing System at Fermilab is being reimagined into a single chassis with even greater flexibility and functionality for decades to come while tackling the ever-challenging task of maintaining backwards compatibility.

The Marked Edge Walk: A Novel MCMC Algorithm for Sampling of Graph Partitions

Authors:Atticus McWhorter, Daryl DeFord

Date:2025-10-20 16:28:42

Novel Markov Chain Monte Carlo (MCMC) methods have enabled the generation of large ensembles of redistricting plans through graph partitioning. However, existing algorithms such as Reversible Recombination (RevReCom) and Metropolized Forest Recombination (MFR) are constrained to sampling from distributions related to spanning trees. We introduce the marked edge walk (MEW), a novel MCMC algorithm for sampling from the space of graph partitions under a tunable distribution. The walk operates on the space of spanning trees with marked edges, allowing for calculable transition probabilities for use in the Metropolis-Hastings algorithm. Empirical results on real-world dual graphs show convergence under target distributions unrelated to spanning trees. For this reason, MEW represents an advancement in flexible ensemble generation.

A Mimamsa Inspired Framework For Instruction Sequencing In AI Agents

Authors:Bama Srinivasan

Date:2025-10-20 16:06:53

This paper presents a formal framework for sequencing instructions in AI agents, inspired by the Indian philosophical system of Mimamsa. The framework formalizes sequencing mechanisms through action object pairs in three distinct ways: direct assertion (Srutikrama) for temporal precedence, purpose driven sequencing (Arthakrama) for functional dependencies, and iterative procedures (Pravrittikrama) for distinguishing between parallel and sequential execution in repetitive tasks. It introduces the syntax and semantics of an action object imperative logic, extending the MIRA formalism (Srinivasan and Parthasarathi, 2021) with explicit deduction rules for sequencing. The correctness of instruction sequencing is established through a validated theorem, which is based on object dependencies across successive instructions. This is further supported by proofs of soundness and completeness. This formal verification enables reliable instruction sequencing, impacting AI applications across areas like task planning and robotics by addressing temporal reasoning and dependency modeling.

OG-Rank: Learning to Rank Fast and Slow with Uncertainty and Reward-Trend Guided Adaptive Exploration

Authors:Praphul Singh, Corey Barrett, Sumana Srivasta, Irfan Bulu, Sri Gadde, Krishnaram Kenthapadi

Date:2025-10-20 15:00:02

Clinicians need ranking systems that work in real time and still justify their choices. Motivated by the need for a low-latency, decoder-based reranker, we present OG-Rank, a single-decoder approach that pairs a pooled first-token scoring signal with an uncertainty-gated explanation step. The model scores all candidates in one pass and generates a brief, structured rationale only when the list is genuinely ambiguous, keeping latency predictable. Trained with a curriculum that concentrates effort on hard cases, OG-Rank delivers strong effectiveness on encounter-scoped order selection (fast path: Recall@1~0.45, nDCG@20~0.625) and improves further when the gate activates (Recall@1~0.56, nDCG@20~0.699 at a 45\% gate rate), while compact backbones show similar gains under the same policy. Encoder baselines trail in both effectiveness and flexibility. The result is a practical recipe: rank fast by default and explain when it helps, a pattern that applies broadly to decision tasks where selective generation buys accuracy at acceptable cost. The single-policy design simplifies deployment and budget planning, and the curriculum principle (spend more on the hard cases, less on the easy ones) readily transfers beyond clinical order selection.

Towards Optimal Control and Algorithmic Structure of Decompression Schedules

Authors:Benjamin Marsh

Date:2025-10-20 14:00:51

We formalise decompression planning as an optimal control problem with gas feasibility windows (ppO$_2$, END), affine ceilings, and convex penalties in normalised oversaturation. We prove existence, a monotone no re-descent structure and bang-bang ascents under a mild monotonicity assumption on inert fraction, and establish dwell time KKT conditions. We give pseudo-polynomial DP and label-setting algorithms with a priori error bounds, derive Lipschitz regularity of the online value function, and discuss multi-species extensions. The efficient frontier is continuous and generally nonconvex. We provide the first formal existence and bang-bang structure proof under mixed gas feasibility windows.

Distributed Spatial-Temporal Trajectory Optimization for Unmanned-Aerial-Vehicle Swarm

Authors:Xiaobo Zheng, Pan Tang, Defu Lin, Shaoming He

Date:2025-10-20 13:45:50

Swarm trajectory optimization problems are a well-recognized class of multi-agent optimal control problems with strong nonlinearity. However, the heuristic nature of needing to set the final time for agents beforehand and the time-consuming limitation of the significant number of iterations prohibit the application of existing methods to large-scale swarm of Unmanned Aerial Vehicles (UAVs) in practice. In this paper, we propose a spatial-temporal trajectory optimization framework that accomplishes multi-UAV consensus based on the Alternating Direction Multiplier Method (ADMM) and uses Differential Dynamic Programming (DDP) for fast local planning of individual UAVs. The introduced framework is a two-level architecture that employs Parameterized DDP (PDDP) as the trajectory optimizer for each UAV, and ADMM to satisfy the local constraints and accomplish the spatial-temporal parameter consensus among all UAVs. This results in a fully distributed algorithm called Distributed Parameterized DDP (D-PDDP). In addition, an adaptive tuning criterion based on the spectral gradient method for the penalty parameter is proposed to reduce the number of algorithmic iterations. Several simulation examples are presented to verify the effectiveness of the proposed algorithm.

Volumetric Non-Invasive Cardiac Mapping for Accessible Global Arrhythmia Characterization

Authors:Jorge Vicente-Puig, Judit Chamorro-Servent, Ernesto Zacur, Inés Llorente-Lipe, Marta Martínez, Jorge Sanchez, Jana Reventós, Ivo Roca-Luque, Lluis Mont, Felipe Atienza, Andreu M. Climent, Maria S. Guillem, Ismael Hernández-Romero

Date:2025-10-20 13:42:47

Cardiac arrhythmias are a major cause of morbidity and mortality increasing the risk of stroke, heart failure, and sudden cardiac death. Imageless electrocardiographic imaging (ECGI) provides a non invasive alternative to electrical mapping from body surface potentials, but conventional ECGI is confined to epicardial reconstructions and can miss arrhythmias originating in deeper myocardium. We address this by reconstructing three dimensional cardiac activity with a volumetric formulation that solves an inverse source problem via Green's functions, enabling full volume activation mapping and improved localization in anatomically complex regions. We evaluate the approach on simulated premature ventricular beats and on four challenging patient cases, a right ventricular outflow tract premature ventricular contraction, a left bundle branch block, a ventricular tachycardia, and Wolff Parkinson White, and additionally assess performance on an open source myocardial infarction dataset. Results show that volumetric ECGI recovers 3D activation and sharpens arrhythmia origin localization, achieving a 59.3% reduction in geodesic error between estimated and simulated origins relative to surface only methods; in patient cases, activation patterns align with clinical diagnoses. Overall, imageless volumetric ECGI offers accessible, non invasive 3D activation mapping that overcomes a core limitation of surface restricted techniques and may improve preprocedural planning, ablation target guidance, and selection or optimization of cardiac resynchronization therapy.

HumanMPC - Safe and Efficient MAV Navigation among Humans

Authors:Simon Schaefer, Helen Oleynikova, Sandra Hirche, Stefan Leutenegger

Date:2025-10-20 13:24:36

Safe and efficient robotic navigation among humans is essential for integrating robots into everyday environments. Most existing approaches focus on simplified 2D crowd navigation and fail to account for the full complexity of human body dynamics beyond root motion. We present HumanMPC, a Model Predictive Control (MPC) framework for 3D Micro Air Vehicle (MAV) navigation among humans that combines theoretical safety guarantees with data-driven models for realistic human motion forecasting. Our approach introduces a novel twist to reachability-based safety formulation that constrains only the initial control input for safety while modeling its effects over the entire planning horizon, enabling safe yet efficient navigation. We validate HumanMPC in both simulated experiments using real human trajectories and in the real-world, demonstrating its effectiveness across tasks ranging from goal-directed navigation to visual servoing for human tracking. While we apply our method to MAVs in this work, it is generic and can be adapted by other platforms. Our results show that the method ensures safety without excessive conservatism and outperforms baseline approaches in both efficiency and reliability.

SparseWorld: A Flexible, Adaptive, and Efficient 4D Occupancy World Model Powered by Sparse and Dynamic Queries

Authors:Chenxu Dang, Haiyan Liu, Guangjun Bao, Pei An, Xinyue Tang, Jie Ma, Bingchuan Sun, Yan Wang

Date:2025-10-20 12:26:25

Semantic occupancy has emerged as a powerful representation in world models for its ability to capture rich spatial semantics. However, most existing occupancy world models rely on static and fixed embeddings or grids, which inherently limit the flexibility of perception. Moreover, their ``in-place classification" over grids exhibits a potential misalignment with the dynamic and continuous nature of real scenarios.In this paper, we propose SparseWorld, a novel 4D occupancy world model that is flexible, adaptive, and efficient, powered by sparse and dynamic queries. We propose a Range-Adaptive Perception module, in which learnable queries are modulated by the ego vehicle states and enriched with temporal-spatial associations to enable extended-range perception. To effectively capture the dynamics of the scene, we design a State-Conditioned Forecasting module, which replaces classification-based forecasting with regression-guided formulation, precisely aligning the dynamic queries with the continuity of the 4D environment. In addition, We specifically devise a Temporal-Aware Self-Scheduling training strategy to enable smooth and efficient training. Extensive experiments demonstrate that SparseWorld achieves state-of-the-art performance across perception, forecasting, and planning tasks. Comprehensive visualizations and ablation studies further validate the advantages of SparseWorld in terms of flexibility, adaptability, and efficiency. The code is available at https://github.com/MSunDYY/SparseWorld.

Active Inference for an Intelligent Agent in Autonomous Reconnaissance Missions

Authors:Johan Schubert, Farzad Kamrani, Tove Gustavi

Date:2025-10-20 11:35:46

We develop an active inference route-planning method for the autonomous control of intelligent agents. The aim is to reconnoiter a geographical area to maintain a common operational picture. To achieve this, we construct an evidence map that reflects our current understanding of the situation, incorporating both positive and "negative" sensor observations of possible target objects collected over time, and diffusing the evidence across the map as time progresses. The generative model of active inference uses Dempster-Shafer theory and a Gaussian sensor model, which provides input to the agent. The generative process employs a Bayesian approach to update a posterior probability distribution. We calculate the variational free energy for all positions within the area by assessing the divergence between a pignistic probability distribution of the evidence map and a posterior probability distribution of a target object based on the observations, including the level of surprise associated with receiving new observations. Using the free energy, we direct the agents' movements in a simulation by taking an incremental step toward a position that minimizes the free energy. This approach addresses the challenge of exploration and exploitation, allowing agents to balance searching extensive areas of the geographical map while tracking identified target objects.

Diverse Planning with Simulators via Linear Temporal Logic

Authors:Mustafa F. Abdelwahed, Alice Toniolo, Joan Espasa, Ian P. Gent

Date:2025-10-20 10:59:09

Autonomous agents rely on automated planning algorithms to achieve their objectives. Simulation-based planning offers a significant advantage over declarative models in modelling complex environments. However, relying solely on a planner that produces a single plan may not be practical, as the generated plans may not always satisfy the agent's preferences. To address this limitation, we introduce $\texttt{FBI}_\texttt{LTL}$, a diverse planner explicitly designed for simulation-based planning problems. $\texttt{FBI}_\texttt{LTL}$ utilises Linear Temporal Logic (LTL) to define semantic diversity criteria, enabling agents to specify what constitutes meaningfully different plans. By integrating these LTL-based diversity models directly into the search process, $\texttt{FBI}_\texttt{LTL}$ ensures the generation of semantically diverse plans, addressing a critical limitation of existing diverse planning approaches that may produce syntactically different but semantically identical solutions. Extensive evaluations on various benchmarks consistently demonstrate that $\texttt{FBI}_\texttt{LTL}$ generates more diverse plans compared to a baseline approach. This work establishes the feasibility of semantically-guided diverse planning in simulation-based environments, paving the way for innovative approaches in realistic, non-symbolic domains where traditional model-based approaches fail.

Quantitative Stability in Discrete Optimal Transport

Authors:William Ford

Date:2025-10-20 10:49:19

This work investigates several aspects related to quantitative stability in optimal transport, as well as uniqueness of the dual transport problem. Our main contributions are as follows. Chapter 1: Observations regarding the quantitative stability of optimal transport plans with respect to Wasserstein distance on the product space. Chapter 2: Extention of strong convexity inequalities for the Kantorovich functional to a larger class of source measures, using glueing arguments recently used for the quantitative stability of optimal transport maps. Chapters 3/4: A qualitative description of the behaviour of the fully discrete transport problem under perturbation of the support positions, as well as quantitative stability under uniqueness assumptions. Chapter 5: Extention of known uniqueness criteria for the dual transport problem. We show that when one marginal measure has Lipschitz-path connected support and the other has bounded support, the values of dual optimisers are unique up to a constant for a large family of costs, including $p$-costs for all $p>1$.

SmartSustain Recommender System: Navigating Sustainability Trade-offs in Personalized City Trip Planning

Authors:Ashmi Banerjee, Melih Mert Aksoy, Wolfgang Wörndl

Date:2025-10-20 09:56:47

Tourism is a major contributor to global carbon emissions and over-tourism, creating an urgent need for recommender systems that not only inform but also gently steer users toward more sustainable travel decisions. Such choices, however, often require balancing complex trade-offs between environmental impact, cost, convenience, and personal interests. To address this, we present the SmartSustain Recommender, a web application designed to nudge users toward eco-friendlier options through an interactive, user-centric interface. The system visualizes the broader consequences of travel decisions by combining CO2e emissions, destination popularity, and seasonality with personalized interest matching. It employs mechanisms such as interactive city cards for quick comparisons, dynamic banners that surface sustainable alternatives in specific trade-off scenarios, and real-time impact feedback using animated environmental indicators. A preliminary user study with 21 participants indicated strong usability and perceived effectiveness. The system is accessible at https://smartsustainrecommender.web.app.

Implicit State Estimation via Video Replanning

Authors:Po-Chen Ko, Jiayuan Mao, Yu-Hsiang Fu, Hsien-Jeng Yeh, Chu-Rong Chen, Wei-Chiu Ma, Yilun Du, Shao-Hua Sun

Date:2025-10-20 09:02:25

Video-based representations have gained prominence in planning and decision-making due to their ability to encode rich spatiotemporal dynamics and geometric relationships. These representations enable flexible and generalizable solutions for complex tasks such as object manipulation and navigation. However, existing video planning frameworks often struggle to adapt to failures at interaction time due to their inability to reason about uncertainties in partially observed environments. To overcome these limitations, we introduce a novel framework that integrates interaction-time data into the planning process. Our approach updates model parameters online and filters out previously failed plans during generation. This enables implicit state estimation, allowing the system to adapt dynamically without explicitly modeling unknown state variables. We evaluate our framework through extensive experiments on a new simulated manipulation benchmark, demonstrating its ability to improve replanning performance and advance the field of video-based decision-making.

High-Level Multi-Robot Trajectory Planning And Spurious Behavior Detection

Authors:Fernando Salanova, Jesús Roche, Cristian Mahuela, Eduardo Montijano

Date:2025-10-20 07:47:51

The reliable execution of high-level missions in multi-robot systems with heterogeneous agents, requires robust methods for detecting spurious behaviors. In this paper, we address the challenge of identifying spurious executions of plans specified as a Linear Temporal Logic (LTL) formula, as incorrect task sequences, violations of spatial constraints, timing inconsis- tencies, or deviations from intended mission semantics. To tackle this, we introduce a structured data generation framework based on the Nets-within-Nets (NWN) paradigm, which coordinates robot actions with LTL-derived global mission specifications. We further propose a Transformer-based anomaly detection pipeline that classifies robot trajectories as normal or anomalous. Experi- mental evaluations show that our method achieves high accuracy (91.3%) in identifying execution inefficiencies, and demonstrates robust detection capabilities for core mission violations (88.3%) and constraint-based adaptive anomalies (66.8%). An ablation experiment of the embedding and architecture was carried out, obtaining successful results where our novel proposition performs better than simpler representations.

An adaptive hierarchical control framework for quadrupedal robots in planetary exploration

Authors:Franek Stark, Rohit Kumar, Shubham Vyas, Hannah Isermann, Jonas Haack, Mihaela Popescu, Jakob Middelberg, Dennis Mronga, Frank Kirchner

Date:2025-10-20 07:37:49

Planetary exploration missions require robots capable of navigating extreme and unknown environments. While wheeled rovers have dominated past missions, their mobility is limited to traversable surfaces. Legged robots, especially quadrupeds, can overcome these limitations by handling uneven, obstacle-rich, and deformable terrains. However, deploying such robots in unknown conditions is challenging due to the need for environment-specific control, which is infeasible when terrain and robot parameters are uncertain. This work presents a modular control framework that combines model-based dynamic control with online model adaptation and adaptive footstep planning to address uncertainties in both robot and terrain properties. The framework includes state estimation for quadrupeds with and without contact sensing, supports runtime reconfiguration, and is integrated into ROS 2 with open-source availability. Its performance was validated on two quadruped platforms, multiple hardware architectures, and in a volcano field test, where the robot walked over 700 m.

SimpleVSF: VLM-Scoring Fusion for Trajectory Prediction of End-to-End Autonomous Driving

Authors:Peiru Zheng, Yun Zhao, Zhan Gong, Hong Zhu, Shaohua Wu

Date:2025-10-20 06:09:57

End-to-end autonomous driving has emerged as a promising paradigm for achieving robust and intelligent driving policies. However, existing end-to-end methods still face significant challenges, such as suboptimal decision-making in complex scenarios. In this paper,we propose SimpleVSF (Simple VLM-Scoring Fusion), a novel framework that enhances end-to-end planning by leveraging the cognitive capabilities of Vision-Language Models (VLMs) and advanced trajectory fusion techniques. We utilize the conventional scorers and the novel VLM-enhanced scorers. And we leverage a robust weight fusioner for quantitative aggregation and a powerful VLM-based fusioner for qualitative, context-aware decision-making. As the leading approach in the ICCV 2025 NAVSIM v2 End-to-End Driving Challenge, our SimpleVSF framework demonstrates state-of-the-art performance, achieving a superior balance between safety, comfort, and efficiency.

DiffVLA++: Bridging Cognitive Reasoning and End-to-End Driving through Metric-Guided Alignment

Authors:Yu Gao, Yiru Wang, Anqing Jiang, Heng Yuwen, Wang Shuo, Sun Hao, Wang Jijun

Date:2025-10-20 04:49:14

Conventional end-to-end (E2E) driving models are effective at generating physically plausible trajectories, but often fail to generalize to long-tail scenarios due to the lack of essential world knowledge to understand and reason about surrounding environments. In contrast, Vision-Language-Action (VLA) models leverage world knowledge to handle challenging cases, but their limited 3D reasoning capability can lead to physically infeasible actions. In this work we introduce DiffVLA++, an enhanced autonomous driving framework that explicitly bridges cognitive reasoning and E2E planning through metric-guided alignment. First, we build a VLA module directly generating semantically grounded driving trajectories. Second, we design an E2E module with a dense trajectory vocabulary that ensures physical feasibility. Third, and most critically, we introduce a metric-guided trajectory scorer that guides and aligns the outputs of the VLA and E2E modules, thereby integrating their complementary strengths. The experiment on the ICCV 2025 Autonomous Grand Challenge leaderboard shows that DiffVLA++ achieves EPDMS of 49.12.

Decentralized Real-Time Planning for Multi-UAV Cooperative Manipulation via Imitation Learning

Authors:Shantnav Agarwal, Javier Alonso-Mora, Sihao Sun

Date:2025-10-20 04:35:54

Existing approaches for transporting and manipulating cable-suspended loads using multiple UAVs along reference trajectories typically rely on either centralized control architectures or reliable inter-agent communication. In this work, we propose a novel machine learning based method for decentralized kinodynamic planning that operates effectively under partial observability and without inter-agent communication. Our method leverages imitation learning to train a decentralized student policy for each UAV by imitating a centralized kinodynamic motion planner with access to privileged global observations. The student policy generates smooth trajectories using physics-informed neural networks that respect the derivative relationships in motion. During training, the student policies utilize the full trajectory generated by the teacher policy, leading to improved sample efficiency. Moreover, each student policy can be trained in under two hours on a standard laptop. We validate our method in both simulation and real-world environments to follow an agile reference trajectory, demonstrating performance comparable to that of centralized approaches.

Optimizing Transmission FLASH Radiotherapy for Large-Field Post-Mastectomy Breast Treatment

Authors:Ahmal Jawad Zafar, Sunil William Dutta, Matthew Joseph Case, Zachary Diamond, Duncan Bohannon, Reshma Jagsi, Xiaofeng Yang, Jun Zhou

Date:2025-10-19 23:17:07

We investigated the effects of scanning speed, beam configuration, and dose-rate modeling on the FLASH effect in post-mastectomy proton transmission-beam (TB) planning and evaluated whether optimizing the spot-scanning path can enhance FLASH. Five left-sided post-mastectomy patients (32 Gy in 5 fractions) were replanned with single-energy (249 MeV) tangential TBs plus a clinical en face background beam. FLASH was evaluated with two models: Krieger's FLASH effectiveness model (FEM) and Folkerts' average dose-rate (ADR) framework. Plans used conventional pencil-beam scanning, split-field delivery, and GA-optimized spot sequences, with vertical scan speeds varied from 10 to 20 mm/ms. FLASH in normal tissues was defined as the percentage of voxels meeting the threshold (>= 4 Gy at >= 40 Gy/s); once a voxel met the criterion, a dose-adjustment factor of 0.67 was applied. The FLASH effect was highly sensitive to scanning pattern and model choice. Increasing vertical scan speed from 10 to 20 mm/ms increased FLASH in the CTV by 22% (ADR) and 12% (FEM); in skin it rose from 41.4% to 58.8% (ADR) and from 8.4% to 13.1% (FEM). Split-field delivery increased the temporal separation between vertical spot columns and yielded superior FLASH, including up to a 9.2 Gy reduction in CTV Dmean with ADR. GA-based optimization shortened scan time and achieved FLASH comparable to split-field delivery, with a CTV Dmean reduction of 7.87 Gy (ADR-GA) and skin Dmean reductions of 2-3 Gy. These findings indicate that FLASH outcomes depend strongly on scanning trajectory, scan speed, and model selection. In addition, path-minimizing spot-delivery optimization (e.g., GA) can further improve dose-rate distributions in healthy voxels.

Proactive and Fair Epidemic Resource Allocation Through an Integrated Supply Chain Framework: Insights from a COVID-19 Study

Authors:Kimiya Jozani, Nihal A. Sageer, Hode Eldardiry, Sait Tunc, Esra Buyuktahtakin Toy

Date:2025-10-19 19:16:09

Timely and effective decision-making is critical during epidemics to reduce preventable infections and deaths. This demands integrated models that jointly capture disease dynamics, vaccine distribution, regional disparities, and behavioral responses. However, most existing approaches decouple epidemic forecasting from logistics planning, hindering adaptive and regionally responsive interventions. We propose a novel epidemiological-optimization framework that jointly models epidemic progression and a multiscale vaccine supply chain. The model incorporates spatio-temporally varying effective infection rates to reflect regional policy and behavioral dynamics. It supports coordinated, data-driven decision-making across spatial scales through two formulations: a multi-objective Gini-based model and a knapsack-based model that leverages regional vulnerability indicators for tractability and improved mitigation. To address computational complexity, we design two scalable heuristic decomposition algorithms inspired by the Benders decomposition. The model is validated using COVID-19 data in the U.S.. We introduce SARIMA-based forecasting as a novel approach for validating epidemic-optimization models under data limitations. The results show that our approach can prevent more than 2 million infections and 30,000 deaths in just six months while significantly improving the accessibility of vaccines in underserved regions. Our framework demonstrates that integrating fairness and epidemic dynamics with vaccine logistics leads to superior outcomes compared to traditional myopic policies. Fairness improves overall efficiency in the long term by prioritizing the most vulnerable populations, leading to better long-term public health outcomes. The model offers policymakers a scalable and operationally relevant tool to strengthen preparedness and ensure a more effective and equitable response to epidemics.

Quantile Regression, Variational Autoencoders, and Diffusion Models for Uncertainty Quantification: A Spatial Analysis of Sub-seasonal Wind Speed Prediction

Authors:Ganglin Tian, Anastase Alexandre Charantonis, Camille Le Coz, Alexis Tantet, Riwal Plougonven

Date:2025-10-19 18:26:46

This study aims to improve the spatial representation of uncertainties when regressing surface wind speeds from large-scale atmospheric predictors for sub-seasonal forecasting. Sub-seasonal forecasting often relies on large-scale atmospheric predictors such as 500 hPa geopotential height (Z500), which exhibit higher predictability than surface variables and can be downscaled to obtain more localised information. Previous work by Tian et al. (2024) demonstrated that stochastic perturbations based on model residuals can improve ensemble dispersion representation in statistical downscaling frameworks, but this method fails to represent spatial correlations and physical consistency adequately. More sophisticated approaches are needed to capture the complex relationships between large-scale predictors and local-scale predictands while maintaining physical consistency. Probabilistic deep learning models offer promising solutions for capturing complex spatial dependencies. This study evaluates three probabilistic methods with distinct uncertainty quantification mechanisms: Quantile Regression Neural Network that directly models distribution quantiles, Variational Autoencoders that leverage latent space sampling, and Diffusion Models that utilise iterative denoising. These models are trained on ERA5 reanalysis data and applied to ECMWF sub-seasonal hindcasts to regress probabilistic wind speed ensembles. Our results show that probabilistic downscaling approaches provide more realistic spatial uncertainty representations compared to simpler stochastic methods, with each probabilistic model offering different strengths in terms of ensemble dispersion, deterministic skill, and physical consistency. These findings establish probabilistic downscaling as an effective enhancement to operational sub-seasonal wind forecasts for renewable energy planning and risk assessment.

Real-time MRI-based fetal femur length measurement

Authors:Johannes Barcsay, Sara Neves Silva, Jordina Aviles Verdera, Charline Bradshaw, Mary Rutherford, Susanne Schulz-Heise, Jana Hutter

Date:2025-10-19 16:34:41

Purpose: To develop and evaluate a real-time method for automatic planning and measurement of fetal femur length - an important indicator of antenatal growth - during MRI. While routinely assessed by ultrasound, MRI-based femur length measurements remain challenging due to bone-slice misalignment, fetal motion, and the need for manual assessment. Methods: A low-latency 3D U-Net was trained on 59 scans acquired at 0.55T (gestational age 18-40 weeks) to localise the proximal and distal endpoints of the femur. These coordinates were employed to automatically adapt a 30-second EPI sequence for in-plane femur coverage in real-time. Retrospective evaluation was performed in 72 scans (19-39 weeks, including 19 pathological cases), and real-time testing in 24 cases (17-39 weeks). Automated results were compared with manual expert annotations and, in 57/72 cases, with matched clinical ultrasound measurements. Precision was evaluated against inter-observer variability and analysed across gestational age and maternal BMI. Furthermore, bilateral femur length consistency was evaluated. Results: Retrospective analysis demonstrated a mean endpoint localisation error of 5.5 mm and femur length deviation of 3.0 mm. Among the 49/72 cases with both femora automatically extracted, the left-right difference was 2.8+-2.7 mm. Precision was unaffected by maternal BMI (p>0.05), but correlated with gestational age (p<0.05). Real-time planning and assessment were successful in 22/24 cases, with a mean deviation of 4.2 mm. Conclusions: A real-time, automated method for MRI-based femur length measurement was developed, enabling precise, motion-robust, and operator-independent growth assessment, supporting future large-scale evaluation in growth-compromised pregnancies.

Agentic Inequality

Authors:Matthew Sharp, Omer Bilgin, Iason Gabriel, Lewis Hammond

Date:2025-10-19 14:32:46

Autonomous AI agents, capable of complex planning and action, represent a significant technological evolution beyond current generative tools. As these systems become integrated into political and economic life, their distribution and capabilities will be highly consequential. This paper introduces and explores "agentic inequality" - the potential disparities in power, opportunity, and outcomes stemming from differential access to, and capabilities of, AI agents. We analyse the dual potential of this technology, exploring how agents could both exacerbate existing divides and, under the right conditions, serve as a powerful equalising force. To this end, the paper makes three primary contributions. First, it establishes an analytical framework by delineating the three core dimensions through which this inequality can manifest: disparities in the availability, quality, and quantity of agents. Second, it argues that agentic inequality is distinct from prior technological divides. Unlike tools that primarily augment human abilities, agents act as autonomous delegates, creating novel power asymmetries through scalable goal delegation and direct agent-to-agent competition that are poised to reshape outcomes across economic and socio-political spheres. Finally, it provides a systematic analysis of the technical and socioeconomic drivers - from model release strategies to market incentives - that will shape the distribution of agentic power, concluding with a research agenda for navigating the complex governance challenges ahead.

DiRAC - Distributed Robot Awareness and Consensus

Authors:Uday Gopan, Manjari Kulkarni, Lakshasri S, Kashish Mittal, Sriram Radhakrishna, Aditya Naskar, Rameshwar DL

Date:2025-10-19 14:21:03

DiRAC is a scalable, distributed framework designed to enable efficient task assignment and path planning in very large robotic swarms. It introduces a novel zone-partitioned architecture with dynamically elected leaders and a tick-synchronized consensus protocol that yields strong consistency and deterministic outcomes. For path planning, DiRAC uses a novel algorithm, a force-based decentralized planner for real-time collision resolution. Validated within ROS 2 middleware through preliminary simulation, DiRAC demonstrates architectural scalability and modular efficiency in simulated warehouse environments, laying the groundwork for real-world deployment in large-scale industrial and logistics domains.

See or Say Graphs: Agent-Driven Scalable Graph Understanding with Vision-Language Models

Authors:Shuo Han, Yukun Cao, Zezhong Ding, Zengyi Gao, S Kevin Zhou, Xike Xie

Date:2025-10-19 09:20:44

Vision-language models (VLMs) have shown promise in graph understanding, but remain limited by input-token constraints, facing scalability bottlenecks and lacking effective mechanisms to coordinate textual and visual modalities. To address these challenges, we propose GraphVista, a unified framework that enhances both scalability and modality coordination in graph understanding. For scalability, GraphVista organizes graph information hierarchically into a lightweight GraphRAG base, which retrieves only task-relevant textual descriptions and high-resolution visual subgraphs, compressing redundant context while preserving key reasoning elements. For modality coordination, GraphVista introduces a planning agent that routes tasks to the most suitable modality-using the text modality for simple property reasoning and the visual modality for local and structurally complex reasoning grounded in explicit topology. Extensive experiments demonstrate that GraphVista scales to large graphs, up to $200\times$ larger than those used in existing benchmarks, and consistently outperforms existing textual, visual, and fusion-based methods, achieving up to $4.4\times$ quality improvement over the state-of-the-art baselines by fully exploiting the complementary strengths of both modalities.

Bayesian reliability acceptance sampling plans for competing risks data under interval censoring

Authors:Biswabrata Pradhan, Rathin Das

Date:2025-10-19 07:33:03

We obtain a reliability acceptance sampling plan for independent competing risk data under interval censoring schemes using the Bayesian approach. At first, the Bayesian reliability acceptance sampling plan is obtained where the decision criteria of accepting a lot is pre-fixed. For large samples, computing Bayes risk is computationally intensive. Therefore, an approximate Bayes risk is obtained using the asymptotic properties of the maximum likelihood estimators. Lastly, the Bayesian reliability acceptance sampling plan is obtained, where the decision function is arbitrary. The manufacturer can derive an optimal decision function by minimizing the Bayes risk among all decision functions. This optimal decision function is known as Bayes decision function. The optimal sampling plan is obtained by minimizing the Bayes risk. The algorithms are provided for the computation of optimum Bayesian reliability acceptance sampling plan. Numerical results are provided and comparisons between the Bayesian reliability acceptance sampling plans are carried out.

Vision-Centric 4D Occupancy Forecasting and Planning via Implicit Residual World Models

Authors:Jianbiao Mei, Yu Yang, Xuemeng Yang, Licheng Wen, Jiajun Lv, Botian Shi, Yong Liu

Date:2025-10-19 06:45:37

End-to-end autonomous driving systems increasingly rely on vision-centric world models to understand and predict their environment. However, a common ineffectiveness in these models is the full reconstruction of future scenes, which expends significant capacity on redundantly modeling static backgrounds. To address this, we propose IR-WM, an Implicit Residual World Model that focuses on modeling the current state and evolution of the world. IR-WM first establishes a robust bird's-eye-view representation of the current state from the visual observation. It then leverages the BEV features from the previous timestep as a strong temporal prior and predicts only the "residual", i.e., the changes conditioned on the ego-vehicle's actions and scene context. To alleviate error accumulation over time, we further apply an alignment module to calibrate semantic and dynamic misalignments. Moreover, we investigate different forecasting-planning coupling schemes and demonstrate that the implicit future state generated by world models substantially improves planning accuracy. On the nuScenes benchmark, IR-WM achieves top performance in both 4D occupancy forecasting and trajectory planning.