planning - 2026-04-03

When to ASK: Uncertainty-Gated Language Assistance for Reinforcement Learning

Authors:Juarez Monteiro, Nathan Gavenski, Gianlucca Zuin, Adriano Veloso

Date:2026-04-02 16:19:20

Reinforcement learning (RL) agents often struggle with out-of-distribution (OOD) scenarios, leading to high uncertainty and random behavior. While language models (LMs) contain valuable world knowledge, larger ones incur high computational costs, hindering real-time use, and exhibit limitations in autonomous planning. We introduce Adaptive Safety through Knowledge (ASK), which combines smaller LMs with trained RL policies to enhance OOD generalization without retraining. ASK employs Monte Carlo Dropout to assess uncertainty and queries the LM for action suggestions only when uncertainty exceeds a set threshold. This selective use preserves the efficiency of existing policies while leveraging the language model's reasoning in uncertain situations. In experiments on the FrozenLake environment, ASK shows no improvement in-domain, but demonstrates robust navigation in transfer tasks, achieving a reward of 0.95. Our findings indicate that effective neuro-symbolic integration requires careful orchestration rather than simple combination, highlighting the need for sufficient model scale and effective hybridization mechanisms for successful OOD generalization.
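
The gating rule in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy linear policy, dropout applied to the inputs, the normalized-entropy uncertainty measure, the threshold of 0.5, and the `ask_language_model` stub are all assumptions made for illustration.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def policy_forward(weights, obs, rng, drop_p):
    """One stochastic pass of a toy linear policy with dropout on its inputs."""
    kept = [0.0 if rng.random() < drop_p else x / (1.0 - drop_p) for x in obs]
    logits = [sum(w * x for w, x in zip(row, kept)) for row in weights]
    return softmax(logits)

def mc_dropout_uncertainty(weights, obs, rng, n_samples=30, drop_p=0.2):
    """Average the action distribution over several dropout passes.

    Returns (mean_probs, entropy normalized to [0, 1]).
    """
    n_actions = len(weights)
    mean = [0.0] * n_actions
    for _ in range(n_samples):
        probs = policy_forward(weights, obs, rng, drop_p)
        mean = [m + p / n_samples for m, p in zip(mean, probs)]
    entropy = -sum(p * math.log(p) for p in mean if p > 0.0)
    return mean, entropy / math.log(n_actions)

def ask_language_model(obs):
    """Hypothetical stand-in for a small LM query; returns a fixed action."""
    return 2

def select_action(weights, obs, rng, threshold=0.5):
    """Use the policy when it is confident, otherwise defer to the LM stub."""
    mean, uncertainty = mc_dropout_uncertainty(weights, obs, rng)
    if uncertainty > threshold:
        return ask_language_model(obs), "lm", uncertainty
    best = max(range(len(mean)), key=mean.__getitem__)
    return best, "policy", uncertainty
```

Because the LM is only called on the high-uncertainty branch, the cost of the trained policy is unchanged on states it already handles well; the threshold trades off query cost against coverage of unfamiliar states.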

Visual Decoding Operators: Towards a Compositional Theory of Visualization Perception

Authors:Sheng Long, Remco Chang, Eugene Wu, Alex Kale, Matthew Kay

Date:2026-04-02 16:11:53

Prior work on perceptual effectiveness has decomposed visualizations into smaller common units (e.g., channels such as angle, position, and length) to establish rankings. While useful, these decompositions lack the computational structure to predict performance for new visualization $\times$ task combinations, requiring new experiments for each. We propose an alternative unit of analysis: operationalizing quantitative visualization interpretation as sequences of composable visual decoding operators. Using probability density function (PDF) and cumulative distribution function (CDF) charts, we examine how chart-specific tasks can be decomposed into reusable, chart-agnostic perceptual operations and characterize their error profiles through hierarchical Bayesian modeling. We then test generalizability by composing learned operators to predict performance on a structurally different task: Moritz et al.'s [35] scatterplot mean-estimation experiment, where the chart type, chart dimensions, and analytic goal all differ from the learning conditions. With a pre-registered analysis plan, we compose operators under six candidate strategies and evaluate each against empirical data with no parameters fit to the response data. One strategy captures both bias and variance of observed responses; five alternatives fail in distinguishable ways. We argue that this decoding-operator-oriented approach to empirical visualization research and theory-building lays the groundwork for generative models that can predict a distribution of likely interpretations under different viewing conditions, new chart types, and new tasks. Free copy of this paper and supplemental materials: https://osf.io/prtfq; experiment interface: https://gleaming-dolphin-799fda.netlify.app/vis-decode-slider.

UniDriveVLA: Unifying Understanding, Perception, and Action Planning for Autonomous Driving

Authors:Yongkang Li, Lijun Zhou, Sixu Yan, Bencheng Liao, Tianyi Yan, Kaixin Xiong, Long Chen, Hongwei Xie, Bing Wang, Guang Chen, Hangjun Ye, Wenyu Liu, Haiyang Sun, Xinggang Wang

Date:2026-04-02 15:48:45

Vision-Language-Action (VLA) models have recently emerged in autonomous driving, with the promise of leveraging rich world knowledge to improve the cognitive capabilities of driving systems. However, adapting such models for driving tasks currently faces a critical dilemma between spatial perception and semantic reasoning. Consequently, existing VLA systems are forced into suboptimal compromises: directly adopting 2D Vision-Language Models yields limited spatial perception, whereas enhancing them with 3D spatial representations often impairs the native reasoning capacity of VLMs. We argue that this dilemma largely stems from the coupled optimization of spatial perception and semantic reasoning within shared model parameters. To overcome this, we propose UniDriveVLA, a Unified Driving Vision-Language-Action model based on Mixture-of-Transformers that addresses the perception-reasoning conflict via expert decoupling. Specifically, it comprises three experts for driving understanding, scene perception, and action planning, which are coordinated through masked joint attention. In addition, we combine a sparse perception paradigm with a three-stage progressive training strategy to improve spatial perception while maintaining semantic reasoning capability. Extensive experiments show that UniDriveVLA achieves state-of-the-art performance in open-loop evaluation on nuScenes and closed-loop evaluation on Bench2Drive. Moreover, it demonstrates strong performance across a broad range of perception, prediction, and understanding tasks, including 3D detection, online mapping, motion forecasting, and driving-oriented VQA, highlighting its broad applicability as a unified model for autonomous driving. Code and model have been released at https://github.com/xiaomi-research/unidrivevla

Auction-Based Online Policy Adaptation for Evolving Objectives

Authors:Guruprerana Shabadi, Kaushik Mallik

Date:2026-04-02 15:17:20

We consider multi-objective reinforcement learning problems where objectives come from an identical family -- such as the class of reachability objectives -- and may appear or disappear at runtime. Our goal is to design adaptive policies that can efficiently adjust their behaviors as the set of active objectives changes. To solve this problem, we propose a modular framework where each objective is supported by a selfish local policy, and coordination is achieved through a novel auction-based mechanism: policies bid for the right to execute their actions, with bids reflecting the urgency of the current state. The highest bidder selects the action, enabling a dynamic and interpretable trade-off among objectives. Going back to the original adaptation problem, when objectives change, the system adapts by simply adding or removing the corresponding policies. Moreover, as objectives arise from the same family, identical copies of a parameterized policy can be deployed, facilitating immediate adaptation at runtime. We show how the selfish local policies can be computed by turning the problem into a general-sum game, where the policies compete against each other to fulfill their own objectives. To succeed, each policy must not only optimize its own objective, but also reason about the presence of other goals and learn to produce calibrated bids that reflect relative priority. In our implementation, the policies are trained concurrently using proximal policy optimization (PPO). We evaluate on Atari Assault and a gridworld-based path-planning task with dynamic targets. Our method achieves substantially better performance than monolithic policies trained with PPO.
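
The auction mechanism can be sketched as follows. This is only an illustration under stated assumptions: in the paper the bids are learned with PPO, whereas the `BiddingPolicy` class here uses a hand-written distance-based urgency heuristic, and the gridworld actions and names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

State = Tuple[int, int]

@dataclass
class BiddingPolicy:
    """Local policy for one reachability objective: returns (action, bid)."""
    target: State

    def act(self, state: State) -> Tuple[str, float]:
        dx = self.target[0] - state[0]
        dy = self.target[1] - state[1]
        distance = abs(dx) + abs(dy)
        # Assumed urgency heuristic: the closer the target, the higher the bid.
        bid = 1.0 / (1.0 + distance)
        if distance == 0:
            return "stay", bid
        if abs(dx) >= abs(dy):
            return ("right" if dx > 0 else "left"), bid
        return ("up" if dy > 0 else "down"), bid

def auction_step(policies: Dict[str, BiddingPolicy], state: State) -> Tuple[str, str]:
    """Run one auction and return (winning objective, action to execute)."""
    offers = {name: policy.act(state) for name, policy in policies.items()}
    winner = max(offers, key=lambda name: offers[name][1])
    return winner, offers[winner][0]
```

Adapting to a changed objective set is then a dictionary update: adding or deleting an entry in `policies` changes who can bid, with no retraining of the remaining policies.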

PRO-SPECT: Probabilistically Safe Scalable Planning for Energy-Aware Coordinated UAV-UGV Teams in Stochastic Environments

Authors:Roger Fowler, Cahit Ikbal Er, Benjamin Johnsenberg, Yasin Yazicioglu

Date:2026-04-02 15:13:40

We consider energy-aware planning for an unmanned aerial vehicle (UAV) and unmanned ground vehicle (UGV) team operating in a stochastic environment. The UAV must visit a set of air points in minimum time while respecting energy constraints, relying on the UGV as a mobile charging station. Unlike prior work that assumed deterministic travel times or used fixed robustness margins, we model travel times as random variables and bound the probability of failure (energy depletion) across the entire mission to a user-specified risk level. We formulate the problem as a Mixed-Integer Program and propose PRO-SPECT, a polynomial-time algorithm that generates risk-bounded plans. The algorithm supports both offline planning and online re-planning, enabling the team to adapt to disturbances while preserving the risk bound. We provide theoretical results on solution feasibility and time complexity. We also demonstrate the performance of our method via numerical comparisons and simulations.

LatentUM: Unleashing the Potential of Interleaved Cross-Modal Reasoning via a Latent-Space Unified Model

Authors:Jiachun Jin, Zetong Zhou, Xiao Yang, Hao Zhang, Pengfei Liu, Jun Zhu, Zhijie Deng

Date:2026-04-02 14:22:29

Unified models (UMs) hold promise for their ability to understand and generate content across heterogeneous modalities. Compared to merely generating visual content, the use of UMs for interleaved cross-modal reasoning is more promising and valuable, e.g., for solving understanding problems that require dense visual thinking, improving visual generation through self-reflection, or modeling visual dynamics of the physical world guided by stepwise action interventions. However, existing UMs necessitate pixel decoding as a bridge due to their disjoint visual representations for understanding and generation, which is both ineffective and inefficient. In this paper, we introduce LatentUM, a novel unified model that represents all modalities within a shared semantic latent space, eliminating the need for pixel-space mediation between visual understanding and generation. This design naturally enables flexible interleaved cross-modal reasoning and generation. Beyond improved computational efficiency, the shared representation substantially alleviates codec bias and strengthens cross-modal alignment, allowing LatentUM to achieve state-of-the-art performance on the Visual Spatial Planning benchmark, push the limits of visual generation through self-reflection, and support world modeling by predicting future visual states within the shared semantic latent space.

The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

Authors:Xinlei Yu, Zhangquan Chen, Yongbo He, Tianyu Fu, Cheng Yang, Chengming Xu, Yue Ma, Xiaobin Hu, Zhe Cao, Jie Xu, Guibin Zhang, Jiale Tao, Jiayi Zhang, Siyuan Ma, Kaituo Feng, Haojie Huang, Youxing Li, Ronghao Chen, Huacan Wang, Chenglin Wu, Zikun Su, Xiaogang Xu, Kelu Yao, Kun Wang, Chen Gao, Yue Liao, Ruqi Huang, Tao Jin, Cheng Tan, Jiangning Zhang, Wenqi Ren, Yanwei Fu, Yong Liu, Yu Wang, Xiangyu Yue, Yu-Gang Jiang, Shuicheng Yan

Date:2026-04-02 13:36:37

Latent space is rapidly emerging as a native substrate for language-based models. While modern systems are still commonly understood through explicit token-level generation, an increasing body of work shows that many critical internal processes are more naturally carried out in continuous latent space than in human-readable verbal traces. This shift is driven by the structural limitations of explicit-space computation, including linguistic redundancy, discretization bottlenecks, sequential inefficiency, and semantic loss. This survey aims to provide a unified and up-to-date landscape of latent space in language-based models. We organize the survey into five sequential perspectives: Foundation, Evolution, Mechanism, Ability, and Outlook. We begin by delineating the scope of latent space, distinguishing it from explicit or verbal space and from the latent spaces commonly studied in generative visual models. We then trace the field's evolution from early exploratory efforts to the current large-scale expansion. To organize the technical landscape, we examine existing work through the complementary lenses of mechanism and ability. From the perspective of Mechanism, we identify four major lines of development: Architecture, Representation, Computation, and Optimization. From the perspective of Ability, we show how latent space supports a broad capability spectrum spanning Reasoning, Planning, Modeling, Perception, Memory, Collaboration, and Embodiment. Beyond consolidation, we discuss the key open challenges, and outline promising directions for future research. We hope this survey serves not only as a reference for existing work, but also as a foundation for understanding latent space as a general computational and systems paradigm for next-generation intelligence.

Bridging Discrete Planning and Continuous Execution for Redundant Robot

Authors:Teng Yan, Yue Yu, Yihan Liu, Bingzhuo Zhong

Date:2026-04-02 13:23:54

Voxel-grid reinforcement learning is widely adopted for path planning in redundant manipulators due to its simplicity and reproducibility. However, direct execution through point-wise numerical inverse kinematics on 7-DoF arms often yields step-size jitter, abrupt joint transitions, and instability near singular configurations. This work proposes a bridging framework between discrete planning and continuous execution without modifying the discrete planner itself. On the planning side, step-normalized 26-neighbor Cartesian actions and a geometric tie-breaking mechanism are introduced to suppress unnecessary turns and eliminate step-size oscillations. On the execution side, a task-priority damped least-squares (TP-DLS) inverse kinematics layer is implemented. This layer treats end-effector position as a primary task, while posture and joint centering are handled as subordinate tasks projected into the null space, combined with trust-region clipping and joint velocity constraints. On a 7-DoF manipulator in random sparse, medium, and dense environments, this bridge raises planning success in dense scenes from about 0.58 to 1.00, shortens representative path length from roughly 1.53 m to 1.10 m, and reduces peak joint accelerations by over an order of magnitude while keeping end-effector error below 1 mm, substantially improving the continuous execution quality of voxel-based RL paths on redundant manipulators.
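
A task-priority damped least-squares step can be sketched on a toy problem. The planar 3-link arm, the damping value, the joint-centering gain, and the step limit below are assumptions chosen for illustration; the paper works with a 7-DoF manipulator and its exact formulation is not given in the abstract. The primary task uses the damped pseudo-inverse J# = J^T (J J^T + d^2 I)^-1, and the secondary task is projected through I - J# J.

```python
import math

def forward(q, lengths):
    """End-effector position of a planar 3-link arm."""
    cum = [sum(q[:i + 1]) for i in range(3)]
    return (sum(l * math.cos(c) for l, c in zip(lengths, cum)),
            sum(l * math.sin(c) for l, c in zip(lengths, cum)))

def jacobian(q, lengths):
    """2x3 position Jacobian of the planar arm."""
    cum = [sum(q[:i + 1]) for i in range(3)]
    J = [[0.0] * 3, [0.0] * 3]
    for j in range(3):
        J[0][j] = -sum(lengths[k] * math.sin(cum[k]) for k in range(j, 3))
        J[1][j] = sum(lengths[k] * math.cos(cum[k]) for k in range(j, 3))
    return J

def tp_dls_step(q, target, lengths, damping=0.05, k_center=0.1, max_step=0.1):
    """One IK update: position is primary, joint centering is secondary."""
    J = jacobian(q, lengths)
    x, y = forward(q, lengths)
    e = (target[0] - x, target[1] - y)
    # A = J J^T + damping^2 I is 2x2, so invert it in closed form.
    a = sum(v * v for v in J[0]) + damping ** 2
    b = sum(u * v for u, v in zip(J[0], J[1]))
    d = sum(v * v for v in J[1]) + damping ** 2
    det = a * d - b * b
    A_inv = [[d / det, -b / det], [-b / det, a / det]]
    # Damped pseudo-inverse J# = J^T A^-1 (3x2).
    J_pinv = [[J[0][i] * A_inv[0][c] + J[1][i] * A_inv[1][c] for c in range(2)]
              for i in range(3)]
    primary = [J_pinv[i][0] * e[0] + J_pinv[i][1] * e[1] for i in range(3)]
    # Secondary task (assumed): pull every joint toward zero.
    secondary = [-k_center * qi for qi in q]
    JJ = [[sum(J_pinv[i][r] * J[r][j] for r in range(2)) for j in range(3)]
          for i in range(3)]
    projected = [secondary[i] - sum(JJ[i][j] * secondary[j] for j in range(3))
                 for i in range(3)]
    dq = [p + s for p, s in zip(primary, projected)]
    # Trust-region clipping: bound the norm of the joint step.
    norm = math.sqrt(sum(v * v for v in dq))
    scale = min(1.0, max_step / norm) if norm > 0.0 else 1.0
    return [qi + scale * v for qi, v in zip(q, dq)]
```

The damping term keeps the step bounded near singular configurations, at the price of a small leak of the secondary task into the position error; the clipping bounds how far a single waypoint can move the joints.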

World Action Verifier: Self-Improving World Models via Forward-Inverse Asymmetry

Authors:Yuejiang Liu, Fan Feng, Lingjing Kong, Weifeng Lu, Jinzhou Tang, Kun Zhang, Kevin Murphy, Chelsea Finn, Yilun Du

Date:2026-04-02 12:48:36

General-purpose world models promise scalable policy evaluation, optimization, and planning, yet achieving the required level of robustness remains challenging. Unlike policy learning, which primarily focuses on optimal actions, a world model must be reliable over a much broader range of suboptimal actions, which are often insufficiently covered by action-labeled interaction data. To address this challenge, we propose World Action Verifier (WAV), a framework that enables world models to identify their own prediction errors and self-improve. The key idea is to decompose action-conditioned state prediction into two factors -- state plausibility and action reachability -- and verify each separately. We show that these verification problems can be substantially easier than predicting future states due to two underlying asymmetries: the broader availability of action-free data and the lower dimensionality of action-relevant features. Leveraging these asymmetries, we augment a world model with (i) a diverse subgoal generator obtained from video corpora and (ii) a sparse inverse model that infers actions from a subset of state features. By enforcing cycle consistency among generated subgoals, inferred actions, and forward rollouts, WAV provides an effective verification mechanism in under-explored regimes, where existing methods typically fail. Across nine tasks spanning MiniGrid, RoboMimic, and ManiSkill, our method achieves 2x higher sample efficiency while improving downstream policy performance by 18%.

Optimizing Relational Queries over Array-Valued Data in Columnar Systems

Authors:Maroua Zeblah, Etienne Couritas, Sarah Chlyah, Pierre Genevès, Nils Gesbert, Nabil Layaïda

Date:2026-04-02 12:30:55

Modern analytical workloads increasingly combine relational data with array-valued attributes. While columnar database systems efficiently process such workloads, their ability to optimize queries that interleave relational operators with array manipulations remains limited. This paper introduces A3D-RA, an extended relational algebra supporting array-valued attributes, together with a comprehensive framework for algebraic reasoning and optimization. We formalize its data model and semantics, develop a complete set of equivalence-preserving transformation rules capturing pairwise interactions between relational and array operators, and propose a plan enumeration strategy with an optimality guarantee that remains polynomial in all non-join operators. We design A3D-RA as a modular, backend-independent optimization layer that can be instantiated over existing analytical database systems. Experimental results across three high-performance engines on a real-world workload show consistent performance gains enabled by the proposed algebraic optimization layer.

Set-Theoretic Receding Horizon Control for Obstacle Avoidance and Overtaking in Autonomous Highway Driving

Authors:Gianni Cario, Valentino Carriuolo, Alessandro Casavola, Gianfranco Gagliardi, Marco Lupia, Franco Angelo Torchiaro

Date:2026-04-02 08:56:18

This article addresses obstacle avoidance motion planning for autonomous vehicles, specifically focusing on highway overtaking maneuvers. The control design challenge is handled by considering a mathematical vehicle model that captures both lateral and longitudinal dynamics. Unlike existing numerical optimization methods that suffer from significant online computational overhead, this work extends the state-of-the-art by leveraging a fast set-theoretic ellipsoidal Model Predictive Control (Fast-MPC) technique. While originally restricted to stabilization tasks, the proposed framework is successfully adapted to handle motion planning for vehicles modeled as uncertain polytopic discrete-time linear systems. The control action is computed online via a set-membership evaluation against a structured sequence of nested inner ellipsoidal approximations of the exact one-step ahead controllable set within a receding horizon framework. A six-degrees-of-freedom (6-DOF) nonlinear model characterizes the vehicle dynamics, while a polytopic embedding approximates the nonlinearities within a linear framework with parameter uncertainties. Finally, to assess performance and real-time feasibility, comparative co-simulations against a baseline Non-Linear MPC (NLMPC) were conducted. Using the high-fidelity CARLA 3D simulator, results demonstrate that the proposed approach seamlessly rejects dynamic traffic disturbances while reducing online computational time by over 90% compared to standard optimization-based approaches.

Bridging Deep Learning and Integer Linear Programming: A Predictive-to-Prescriptive Framework for Supply Chain Analytics

Authors:Khai Banh Nghiep, Duc Nguyen Minh, Lan Hoang Thi

Date:2026-04-02 08:41:02

Although demand forecasting is a critical component of supply chain planning, actual retail data can exhibit irreconcilable seasonality, irregular spikes, and noise, rendering precise projections nearly unattainable. This paper proposes a three-step analytical framework that combines forecasting and operational analytics. The first stage consists of exploratory data analysis, where delivery-tracked data from 180,519 transactions are partitioned, and long-term trends, seasonality, and delivery-related attributes are examined. In the second stage, the forecasting performance of a statistical time series decomposition model, MSTL, is compared with that of two recent deep learning architectures, N-BEATS and N-HiTS. Both N-BEATS and N-HiTS outperformed the statistical benchmark to a large extent, and N-BEATS was selected as the best model because it had the lowest forecasting error. In the third and final stage, its forecast for the next 4 weeks (1918 units) is passed to a deterministic integer linear program that minimizes the total delivery time subject to budget, capacity, and service constraints. The resulting allocation provided a feasible and cost-optimal shipping plan. Overall, the study provides a compelling example of the practical impact of precise forecasting and simple, highly interpretable model optimization in logistics.

Improving Operational Feasibility in Large-Scale Power System Planning

Authors:Gereon Recht, Oussama Alaya, Benedikt Jahn, Karl-Kiên Cao, Hendrik Lens

Date:2026-04-02 08:39:16

Large-scale power system planning mostly uses linearized, active power only approximations of the power flow equations, ignores many operational constraints, and tests the operational feasibility of the resulting systems only under strongly simplifying assumptions. We propose an approach to obtain solutions to large instances of the alternating current capacity expansion problem via redispatch and reinforcement of an initial solution. The problem formulation considers simultaneous expansion of generators, reactive compensation devices, storage systems, and transmission. Furthermore, it includes operational constraints via startup procedures and capability curves of power sources and simplified stability limits via constraints on voltage angle differences and voltage magnitudes. To obtain initial solutions, we test several established and partly modified power flow approximations and integrate them into an approach for iterative transmission expansion planning, thereby obtaining convex formulations. We demonstrate the approach on large problem instances covering the islands of Great Britain and Ireland at the transmission level, for which we extend the open data source to model reactive power. We find that including transmission losses to determine the initial solution is most decisive, as the amount of redispatch and reinforcements necessary to obtain an alternating current feasible solution is reduced, whereas incorporating reactive power constraints did not lead to further improvements. Our approach ensures an alternating current feasible system under weak assumptions, thus guaranteeing steady-state voltage stability and allowing subsequent dynamic grid simulations, which is instrumental for planning stable future inverter-dominated power systems.

DriveDreamer-Policy: A Geometry-Grounded World-Action Model for Unified Generation and Planning

Authors:Yang Zhou, Xiaofeng Wang, Hao Shao, Letian Wang, Guosheng Zhao, Jiangnan Shao, Jiagang Zhu, Tingdong Yu, Zheng Zhu, Guan Huang, Steven L. Waslander

Date:2026-04-02 08:33:18

Recently, world-action models (WAM) have emerged to bridge vision-language-action (VLA) models and world models, unifying their reasoning and instruction-following capabilities and spatio-temporal world modeling. However, existing WAM approaches often focus on modeling 2D appearance or latent representations, with limited geometric grounding, an essential element for embodied systems operating in the physical world. We present DriveDreamer-Policy, a unified driving world-action model that integrates depth generation, future video generation, and motion planning within a single modular architecture. The model employs a large language model to process language instructions, multi-view images, and actions, followed by three lightweight generators that produce depth, future video, and actions. By learning a geometry-aware world representation and using it to guide both future prediction and planning within a unified framework, the proposed model produces more coherent imagined futures and more informed driving actions, while maintaining modularity and controllable latency. Experiments on the Navsim v1 and v2 benchmarks demonstrate that DriveDreamer-Policy achieves strong performance on both closed-loop planning and world generation tasks. In particular, our model reaches 89.2 PDMS on Navsim v1 and 88.7 EPDMS on Navsim v2, outperforming existing world-model-based approaches while producing higher-quality future video and depth predictions. Ablation studies further show that explicit depth learning provides complementary benefits to video imagination and improves planning robustness.

Grounding AI-in-Education Development in Teachers' Voices: Findings from a National Survey in Indonesia

Authors:Nurul Aisyah, Muhammad Dehan Al Kautsar, Arif Hidayat, Fajri Koto

Date:2026-04-02 05:17:00

Despite emerging use in Indonesian classrooms, there is limited large-scale, teacher-centred evidence on how AI is used in practice and what support teachers need, hindering the development of context-appropriate AI systems and policies. To address this gap, we conduct a nationwide survey of 349 K-12 teachers across elementary, junior high, and senior high schools. We find increasing use of AI for pedagogy, content development, and teaching media, although adoption remains uneven. Elementary teachers report more consistent use, while senior high teachers engage less; mid-career teachers assign higher importance to AI, and teachers in Eastern Indonesia perceive greater value. Across levels, teachers primarily use AI to reduce instructional preparation workload (e.g., assessment, lesson planning, and material development). However, generic outputs, infrastructure constraints, and limited contextual alignment continue to hinder effective classroom integration.

Smooth Feedback Motion Planning with Reduced Curvature

Authors:Aref Amiri, Steven M. LaValle

Date:2026-04-02 04:48:55

Feedback motion planning over cell decompositions provides a robust method for generating collision-free robot motion with formal guarantees. However, existing algorithms often produce paths with unnecessary bending, leading to slower motion and higher control effort. This paper presents a computationally efficient method to mitigate this issue for a given simplicial decomposition. A heuristic is introduced that systematically aligns and assigns local vector fields to produce more direct trajectories, complemented by a novel geometric algorithm that constructs a maximal star-shaped chain of simplexes around the goal. This creates a large ``funnel'' in which an optimal, direct-to-goal control law can be safely applied. Simulations demonstrate that our method generates measurably more direct paths, reducing total bending by an average of 91.40\% and LQR control effort by an average of 45.47\%. Furthermore, comparative analysis against sampling-based and optimization-based planners confirms the time efficacy and robustness of our approach. While the proposed algorithms work over any finite-dimensional simplicial complex embedded in the collision-free subset of the configuration space, the practical application focuses on low-dimensional ($d\le3$) configuration spaces, where simplicial decomposition is computationally tractable.

Robust Autonomous Control of a Magnetic Millirobot in In Vitro Cardiac Flow

Authors:Anuruddha Bhattacharjee, Xinhao Chen, Lamar O. Mair, Suraj Raval, Yancy Diaz-Mercado, Axel Krieger

Date:2026-04-02 01:48:04

Untethered magnetic millirobots offer significant potential for minimally invasive cardiac therapies; however, achieving reliable autonomous control in pulsatile cardiac flow remains challenging. This work presents a vision-guided control framework enabling precise autonomous navigation of a magnetic millirobot in an in vitro heart phantom under physiologically relevant flow conditions. The system integrates UNet-based localization, A* path planning, and a sliding mode controller with a disturbance observer (SMC-DOB) designed for multi-coil electromagnetic actuation. Although drag forces are estimated using steady-state CFD simulations, the controller compensates for transient pulsatile disturbances during closed-loop operation. In static fluid, the SMC-DOB achieved sub-millimeter accuracy (root-mean-square error, RMSE = 0.49 mm), outperforming PID and MPC baselines. Under moderate pulsatile flow (7 cm/s peak, 20 cP), it reduced RMSE by 37% and peak error by 2.4$\times$ compared to PID. It further maintained RMSE below 2 mm (0.27 body lengths) under elevated pulsatile flow (10 cm/s peak, 20 cP) and under low-viscosity conditions (4.3 cP, 7 cm/s peak), where baseline controllers exhibited unstable or failed tracking. These results demonstrate robust closed-loop magnetic control under time-varying cardiac flow disturbances and support the feasibility of autonomous millirobot navigation for targeted drug delivery.

Soft MPCritic: Amortized Model Predictive Value Iteration

Authors:Thomas Banker, Nathan P. Lawrence, Ali Mesbah

Date:2026-04-01 23:35:00

Reinforcement learning (RL) and model predictive control (MPC) offer complementary strengths, yet combining them at scale remains computationally challenging. We propose soft MPCritic, an RL-MPC framework that learns in (soft) value space while using sample-based planning for both online control and value target generation. soft MPCritic instantiates MPC through model predictive path integral control (MPPI) and trains a terminal Q-function with fitted value iteration, aligning the learned value function with the planner and implicitly extending the effective planning horizon. We introduce an amortized warm-start strategy that recycles planned open-loop action sequences from online observations when computing batched MPPI-based value targets. This makes soft MPCritic computationally practical, while preserving solution quality. soft MPCritic plans in a scenario-based fashion with an ensemble of dynamic models trained for next-step prediction accuracy. Together, these ingredients enable soft MPCritic to learn effectively through robust, short-horizon planning on classic and complex control tasks. These results establish soft MPCritic as a practical and scalable blueprint for synthesizing MPC policies in settings where policy extraction and direct, long-horizon planning may fail.
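
The interplay of MPPI, a terminal value function, and warm starting can be sketched on a toy problem. Everything specific here is an assumption: the scalar system x' = x + u*dt, the quadratic stage cost, the hand-written terminal cost standing in for a learned Q-function, and the log-sum-exp value estimate. The abstract's model ensemble and fitted value iteration are omitted.

```python
import math
import random

def mppi_plan(x0, nominal, terminal_value, rng,
              n_samples=512, sigma=1.0, lam=1.0, dt=0.1):
    """One MPPI update around a nominal open-loop control sequence.

    Returns (updated control sequence, soft estimate of the cost-to-go at x0).
    """
    horizon = len(nominal)
    costs, noises = [], []
    for _ in range(n_samples):
        eps = [rng.gauss(0.0, sigma) for _ in range(horizon)]
        x, cost = x0, 0.0
        for u, e in zip(nominal, eps):
            x = x + (u + e) * dt          # toy dynamics (assumption)
            cost += x * x * dt            # toy stage cost (assumption)
        cost += terminal_value(x)         # terminal term closes the short horizon
        costs.append(cost)
        noises.append(eps)
    best = min(costs)
    raw = [math.exp(-(c - best) / lam) for c in costs]
    total = sum(raw)
    weights = [w / total for w in raw]
    # Exponentially weighted average of the sampled perturbations.
    updated = [u + sum(w * eps[t] for w, eps in zip(weights, noises))
               for t, u in enumerate(nominal)]
    # Soft minimum of the sampled costs: -lam * log(mean(exp(-cost / lam))).
    soft_value = best - lam * math.log(total / n_samples)
    return updated, soft_value

def warm_start(plan):
    """Shift the previous plan one step forward to seed the next solve."""
    return plan[1:] + [plan[-1]]
```

In a training loop, `soft_value` would serve as a regression target for the value function, and `warm_start` would let batched target computations reuse plans from online control rather than optimizing from scratch.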

Leveraging the Value of Information in POMDP Planning

Authors:Zakariya Laouar, Qi Heng Ho, Zachary Sunberg

Date:2026-04-01 22:18:55

Partially observable Markov decision processes (POMDPs) offer a principled formalism for planning under state and transition uncertainty. Despite advances made towards solving large POMDPs, obtaining performant policies under limited planning time remains a major challenge due to the curse of dimensionality and the curse of history. For many POMDP problems, the value of information (VOI) - the expected performance gain from reasoning about observations - varies over the belief space. We introduce a dynamic programming framework that exploits this structure by conditionally processing observations based on the value of information at each belief. Building on this framework, we propose Value of Information Monte Carlo planning (VOIMCP), a Monte Carlo Tree Search algorithm that allocates computational effort more efficiently by selectively disregarding observation information when the VOI is low, avoiding unnecessary branching of observations. We provide theoretical guarantees on the near-optimality of our VOI reasoning framework and derive non-asymptotic convergence bounds for VOIMCP. Simulation evaluations demonstrate that VOIMCP outperforms baselines on several POMDP benchmarks.

Rendezvous Planning from Sparse Observations of Optimally Controlled Targets

Authors:Thomas A. Scott, Lukas Taus, Yen-Hsi Richard Tsai, Tan Bui-Thanh, Justin G. R. Delva

Date:2026-04-01 22:09:19

We develop a probabilistic framework for \emph{rendezvous planning}: given sparse, noisy observations of a fast-moving target, plan rendezvous spatiotemporal coordinates for a set of significantly slower seeking agents. The unknown target trajectory is estimated under uncertain dynamics using a filtering approach that combines a kernel-based maximum a posteriori estimation with Gaussian process correction, producing a mixture over trajectory hypotheses. This estimate is used to select spatiotemporal rendezvous points that maximize the probability of successful rendezvous. Points are chosen sequentially by greedily minimizing failure probability in the current belief space, which is updated after each step by conditioning on unsuccessful rendezvous attempts. We show that the failure-conditioned update correctly captures the posterior belief for subsequent decisions, ensuring that each step in the greedy sequence is informed by a statistically consistent representation of the remaining search space, and derive the corresponding Bayesian updates incorporating temporal correlations intrinsic to the trajectory model. This result provides a systematic framework for planning under uncertainty in applications of autonomous rendezvous such as unmanned aerial vehicle refueling, spacecraft servicing, autonomous surface vessel operations, search and rescue missions, and missile defense. In each, the motion of the target entity can be modeled using a system of differential equations undergoing optimal control for a chosen objective, in our example case Hamilton--Jacobi--Bellman solutions for minimum arrival time of a Dubins car with uncertain turning radius and destination.
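
The greedy selection with failure-conditioned updates can be sketched for a discrete belief. The assumptions are substantial: a finite set of trajectory hypotheses with mixture weights, a table `success[i][j]` giving the success probability of candidate point j under hypothesis i, and attempts that are independent given the hypothesis. The temporal correlations treated in the paper are not modeled.

```python
def failure_probability(weights, success, j):
    """Probability that an attempt at point j fails under the current belief."""
    return sum(w * (1.0 - row[j]) for w, row in zip(weights, success))

def condition_on_failure(weights, success, j):
    """Bayes update of the hypothesis weights after a failed attempt at j."""
    posterior = [w * (1.0 - row[j]) for w, row in zip(weights, success)]
    total = sum(posterior)  # zero only if success at j was certain
    return [p / total for p in posterior]

def greedy_rendezvous(weights, success, n_points):
    """Pick points one at a time, each minimizing failure under the belief
    conditioned on all earlier attempts having failed.

    Returns (chosen point indices, probability that at least one succeeds).
    """
    chosen, joint_failure = [], 1.0
    for _ in range(n_points):
        remaining = [j for j in range(len(success[0])) if j not in chosen]
        j = min(remaining, key=lambda k: failure_probability(weights, success, k))
        joint_failure *= failure_probability(weights, success, j)
        weights = condition_on_failure(weights, success, j)
        chosen.append(j)
    return chosen, 1.0 - joint_failure
```

Multiplying the conditional failure probabilities gives the joint failure probability of the whole sequence by the chain rule, which is why the belief must be conditioned on failure before the next point is scored.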

Open-loop POMDP Simplification and Safe Skipping of Replanning with Formal Performance Guarantees

Authors:Da Kong, Vadim Indelman

Date:2026-04-01 20:06:29

Partially Observable Markov Decision Processes (POMDPs) provide a principled mathematical framework for decision-making under uncertainty. However, the exact solution to POMDPs is computationally intractable. In this paper, we address the computational intractability by introducing a novel framework for adaptive open-loop simplification with formal performance guarantees. Our method adaptively interleaves open-loop and closed-loop planning via a topology-based belief tree, enabling a significant reduction in planning complexity. The key contribution lies in the derivation of efficiently computable bounds which provide formal guarantees and can be used to ensure that our simplification can identify the immediate optimal action of the original POMDP problem. Our framework therefore provides computationally tractable performance guarantees for macro-actions within POMDPs. Furthermore, we propose a novel framework for safely skipping replanning during execution, supported by theoretical guarantees on multi-step open-loop action sequences. To the best of our knowledge, this framework is the first to address skipping replanning with formal performance guarantees. Practical online solvers for our proposed simplification are developed, including a sampling-based solver and an anytime solver. Empirical results demonstrate substantial computational speedups while maintaining provable performance guarantees, advancing the tractability and efficiency of POMDP planning.

A High Voltage Test System Meeting Requirements Under Normal and All Single Contingencies Conditions of Peak, Dominant, and Light Loadings for Transmission Expansion Planning Studies (TEP) and TEP Case Studies

Authors:Bhuban Dhamala, Mona Ghassemi

Date:2026-04-01 19:47:07

This paper presents a high-voltage test system designed specifically for transmission expansion planning (TEP) and explores multiple TEP studies using this test system. The network incorporates long transmission lines, which are modeled accurately: line parameters are calculated using the equivalent π circuit model for long transmission lines to account for the distributed nature of line parameters. The paper provides detailed load flow analyses for both normal and all contingency conditions for three different loading conditions (peak load, dominant load, and light load), demonstrating that the proposed test system offers technically feasible load flow solutions at these loading scenarios. Because a real power system is subject to various loading scenarios and must operate effectively under all of them, this test system replicates the properties of real power systems. Furthermore, this paper presents multiple TEP cases to supply the load at a new location. TEP cases are conducted with different numbers of transmission line connections, and each case is characterized by the maximum capacity that satisfies all technical requirements for normal and all single contingencies under the three loading scenarios. The cost of TEP for each case is calculated and compared in terms of the average cost per MW of power delivered to the new bus.
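
As an illustration of the equivalent π model mentioned above, the sketch below computes the series impedance and half shunt admittance of a long line from its per-kilometre parameters, using the standard hyperbolic corrections Z' = Zc·sinh(γl) and Y'/2 = tanh(γl/2)/Zc. The numerical line parameters and the function name `equivalent_pi` are assumptions for illustration and are not taken from the paper's test system.

```python
import cmath


def equivalent_pi(z_per_km, y_per_km, length_km):
    """Equivalent-pi parameters of a long transmission line.

    z_per_km: series impedance per km (ohm/km, complex)
    y_per_km: shunt admittance per km (S/km, complex)
    Returns (series impedance Z' in ohm, half shunt admittance Y'/2 in S).
    """
    gamma = cmath.sqrt(z_per_km * y_per_km)  # propagation constant (1/km)
    z_c = cmath.sqrt(z_per_km / y_per_km)    # characteristic impedance (ohm)
    gl = gamma * length_km
    z_series = z_c * cmath.sinh(gl)
    y_half = cmath.tanh(gl / 2) / z_c
    return z_series, y_half


if __name__ == "__main__":
    # Hypothetical line parameters, chosen only to exercise the formulas.
    z = 0.02 + 0.3j  # ohm/km
    y = 4e-6j        # S/km
    z_series, y_half = equivalent_pi(z, y, length_km=400.0)
    print("Z'   =", z_series)
    print("Y'/2 =", y_half)
    print("nominal-pi Z =", z * 400.0, " nominal-pi Y/2 =", y * 400.0 / 2)
```

For short lines the corrected values approach the nominal ones (z·l and y·l/2); the difference grows with length, which is why the distributed-parameter form matters for long lines.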

Bridging classical and martingale Schrödinger bridges

Authors:Julio Backhoff, Mathias Beiglböck, Giorgia Bifronte, Armand Ley

Date:2026-04-01 18:04:48

We investigate the martingale Schrödinger bridge, recently introduced by Nutz and Wiesel as a distinguished martingale transport plan between two probability measures in convex order. We show that this construction extends naturally to arbitrary dimension and admits several equivalent characterizations. In particular, we identify its continuous-time counterpart as the continuous martingale with prescribed marginals that minimizes a weighted quadratic energy measuring the deviation from Brownian motion. In the irreducible case, we prove that this continuous martingale Schrödinger bridge coincides with the Föllmer martingale, that is, with the Doob martingale associated to a suitable Föllmer process. More generally, we relate the martingale Schrödinger bridge to a variational problem over base measures and to the dual formulation of the corresponding weak optimal transport problem, thereby clarifying its connection with the classical Schrödinger bridge.

Collaborative Task and Path Planning for Heterogeneous Robotic Teams using Multi-Agent PPO

Authors:Matthias Rubio, Julia Richter, Hendrik Kolvenbach, Marco Hutter

Date:2026-04-01 17:53:51

Efficient robotic extraterrestrial exploration requires robots with diverse capabilities, ranging from scientific measurement tools to advanced locomotion. A robotic team enables the distribution of tasks over multiple specialized subsystems, each providing specific expertise to complete the mission. The central challenge lies in efficiently coordinating the team to maximize utilization and the extraction of scientific value. Classical planning algorithms scale poorly with problem size, leading to long planning cycles and high inference costs due to the combinatorial growth of possible robot-target allocations and possible trajectories. Learning-based methods are a viable alternative that move the scaling concern from runtime to training time, setting a critical step towards achieving real-time planning. In this work, we present a collaborative planning strategy based on Multi-Agent Proximal Policy Optimization (MAPPO) to coordinate a team of heterogeneous robots to solve a complex target allocation and scheduling problem. We benchmark our approach against single-objective optimal solutions obtained through exhaustive search and evaluate its ability to perform online replanning in the context of a planetary exploration scenario.

Property-Level Flood Risk Assessment Using AI-Enabled Street-View Lowest Floor Elevation Extraction and ML Imputation Across Texas

Authors:Xiangpeng Li, Yu-Hsuan Ho, Sam D Brody, Ali Mostafavi

Date:2026-04-01 17:08:43

This paper argues that AI-enabled analysis of street-view imagery, complemented by performance-gated machine-learning imputation, provides a viable pathway for generating building-specific elevation data at regional scale for flood risk assessment. We develop and apply a three-stage pipeline across 18 areas of interest (AOIs) in Texas that (1) extracts lowest floor elevation (LFE) and the height difference between street grade and the lowest floor (HDSL) from Google Street View imagery using the Elev-Vision framework, (2) imputes missing HDSL values with Random Forest and Gradient Boosting models trained on 16 terrain, hydrologic, geographic, and flood-exposure features, and (3) integrates the resulting elevation dataset with Fathom 1-in-100-year inundation surfaces and USACE depth-damage functions to estimate property-specific interior flood depth and expected loss. Across 12,241 residential structures, street-view imagery was available for 73.4% of parcels and direct LFE/HDSL extraction was successful for 49.0% (5,992 structures). Imputation was retained for 13 AOIs where cross-validated performance was defensible, with selected models achieving R-squared values from 0.159 to 0.974; five AOIs were explicitly excluded from prediction because performance was insufficient. The results show that street-view-based elevation mapping is not universally available for every property, but it is sufficiently scalable to materially improve regional flood-risk characterization by moving beyond hazard exposure to structure-level estimates of interior inundation and expected damage. Scientifically, the study advances LFE estimation from a pilot-scale proof of concept to a regional, end-to-end workflow. Practically, it offers a replicable framework for jurisdictions that lack comprehensive Elevation Certificates but need parcel-level information to support mitigation, planning, and flood-risk management.

Sub-metre Lunar DEM Generation and Validation from Chandrayaan-2 OHRC Multi-View Imagery Using Open-Source Photogrammetry

Authors:Aaranay Aadi, Jai Singla, Nitant Dube, Oleg Alexandrov

Date:2026-04-01 15:43:04

High-resolution digital elevation models (DEMs) of the lunar surface are essential for surface mobility planning, landing site characterization, and planetary science. The Orbiter High Resolution Camera (OHRC) on board Chandrayaan-2 offers the finest ground sampling of any lunar orbital imager currently in use, acquiring panchromatic imagery at a resolution of roughly 20-30 cm per pixel. This work presents, for the first time, the generation of sub-metre DEMs from OHRC multi-view imagery using an exclusively open-source pipeline. Candidate stereo pairs are identified from non-paired OHRC archives through geometric analysis of image metadata, employing baseline-to-height (B/H) ratio computation and convergence angle estimation. Dense stereo correspondence and ray triangulation are then applied to generate point clouds, which are gridded into DEMs at effective spatial resolutions between approximately 24 and 54 cm across five geographically distributed lunar sites. Absolute elevation consistency is established through Iterative Closest Point (ICP) alignment against Lunar Reconnaissance Orbiter Narrow Angle Camera (NAC) Digital Terrain Models, followed by constant-bias offset correction. Validation against NAC reference terrain yields a vertical RMSE of 5.85 m (at native OHRC resolution), and a horizontal accuracy of less than 30 cm assessed by planimetric feature matching.
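
The stereo-pair screening step can be illustrated with a small geometric sketch that computes a baseline-to-height ratio and a convergence angle from two camera positions. The local coordinate frame (origin at the ground point, z pointing up), the use of mean camera height as H, and the function name `stereo_geometry` are assumptions for illustration; the paper's actual metadata-based procedure may differ.

```python
import math


def stereo_geometry(cam1, cam2):
    """Baseline-to-height ratio and convergence angle for a candidate stereo pair.

    cam1, cam2: camera positions (x, y, z) in a local frame whose origin is the
    imaged ground point and whose z axis points up (an assumed convention).
    Returns (B/H ratio, convergence angle in degrees).
    """
    baseline = math.dist(cam1, cam2)
    height = 0.5 * (cam1[2] + cam2[2])  # mean height above the ground point
    # Convergence angle: angle between the two rays from the ground point
    # to the cameras.
    cos_angle = sum(a * b for a, b in zip(cam1, cam2)) / (
        math.hypot(*cam1) * math.hypot(*cam2)
    )
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return baseline / height, math.degrees(math.acos(cos_angle))


if __name__ == "__main__":
    # Hypothetical symmetric geometry: two views 20 km apart at 100 km altitude.
    ratio, angle = stereo_geometry((-10.0, 0.0, 100.0), (10.0, 0.0, 100.0))
    print(f"B/H = {ratio:.3f}, convergence angle = {angle:.2f} deg")
```

For a symmetric pair the two quantities are linked by B/H = 2·tan(θ/2), so either can serve as a quick check on the other.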

Car Dependency in Urban Accessibility

Authors:Bruno Campanelli, Francesco Marzolla, Matteo Bruno, Hygor Piaget Monteiro Melo, Vittorio Loreto

Date:2026-04-01 15:19:44

To achieve net-zero emissions, cities must transition away from reliance on private vehicles. However, car-centric urban growth has transformed the automobile from a convenience tool into a necessity for accessing essential services, creating significant "car dependency". This study introduces a novel Car Dependency Index (CDI) that quantifies the accessibility gap between private and public transport across 18 cities in Europe and North America. Utilising high-resolution geospatial data and numerical simulations, we reveal pronounced spatial inequalities, showing that car dependency remains a primary driver of car ownership even when accounting for income. A "what-if" simulation of the planned metro expansion in Rome predicts a reduction of approximately 60,000 commuting vehicles, yet highlights that isolated interventions have localised impacts. We conclude that systemic, network-level transit expansions are essential to dismantle car-based systems and foster equitable, sustainable urban mobility. Our framework provides policymakers with an objective, scalable tool to identify viable areas for car-free zones and target infrastructure investments effectively.

DLWM: Dual Latent World Models enable Holistic Gaussian-centric Pre-training in Autonomous Driving

Authors:Yiyao Zhu, Ying Xue, Haiming Zhang, Guangfeng Jiang, Wending Zhou, Xu Yan, Jiantao Gao, Yingjie Cai, Bingbing Liu, Zhen Li, Shaojie Shen

Date:2026-04-01 14:41:21

Vision-based autonomous driving has gained much attention due to its low cost and excellent performance. Compared with dense BEV (Bird's Eye View) or sparse query models, the Gaussian-centric method offers a comprehensive yet sparse representation by describing the scene with 3D semantic Gaussians. In this paper, we introduce DLWM, a novel paradigm with Dual Latent World Models specifically designed to enable holistic Gaussian-centric pre-training in autonomous driving in two stages. In the first stage, DLWM predicts 3D Gaussians from queries through self-supervised reconstruction of multi-view semantic and depth images. In the second stage, equipped with fine-grained contextual features, two latent world models are trained separately for temporal feature learning: Gaussian-flow-guided latent prediction for downstream occupancy perception and forecasting tasks, and ego-planning-guided latent prediction for motion planning. Extensive experiments on the SurroundOcc and nuScenes benchmarks demonstrate that DLWM achieves significant performance gains across Gaussian-centric 3D occupancy perception, 4D occupancy forecasting, and motion planning tasks.

Site-specific ILC Detector Installation Plan

Authors:Karsten Buesser, Thomas Schoerner

Date:2026-04-01 13:49:27

Both ILC detector concepts, ILD and SiD, are very complex machines, and their assembly and installation in the experimental cavern at the ILC interaction point will be complicated endeavours. These procedures require careful planning and logistics, taking into account numerous constraints and boundary conditions. Some of these are described in this document. ILD in particular has already invested significant effort into elaborate installation plans, which will be briefly described in this document. However, with the ILC not being a secured project and a final interaction point not chosen, all these plans have to be considered preliminary and need to be further detailed, taking into account the concrete site-specific and project-specific conditions.

OkanNet: A Lightweight Deep Learning Architecture for Classification of Brain Tumor from MRI Images

Authors:Okan Uçar, Murat Kurt

Date:2026-04-01 13:29:53

Medical imaging techniques, especially Magnetic Resonance Imaging (MRI), are accepted as the gold standard in the diagnosis and treatment planning of neurological diseases. However, the manual analysis of MRI images is a time-consuming process for radiologists and is prone to human error due to fatigue. In this study, two different Deep Learning approaches were developed and analyzed comparatively for the automatic detection and classification of brain tumors (Glioma, Meningioma, Pituitary, and No Tumor). In the first approach, a custom Convolutional Neural Network (CNN) architecture named "OkanNet", which has a low computational cost and fast training time, was designed from scratch. In the second approach, the Transfer Learning method was applied using the 50-layer ResNet-50 [1] architecture, pre-trained on the ImageNet dataset. In experiments conducted on an extended dataset compiled by Masoud Nickparvar containing a total of $7,023$ MRI images, the Transfer Learning-based ResNet-50 model exhibited superior classification performance, achieving $96.49\%$ Accuracy and $0.963$ Precision. In contrast, the custom OkanNet architecture reached an accuracy rate of $88.10\%$; however, it proved to be a strong alternative for mobile and embedded systems with limited computational power by yielding results approximately $3.2$ times faster ($311$ seconds) than ResNet-50 in terms of training time. This study demonstrates the trade-off between model depth and computational efficiency in medical image analysis through experimental data.