planning - 2026-04-27

Spend Less, Fit Better: Budget-Efficient Scaling Law Fitting via Active Experiment Selection

Authors:Sijie Li, Shanda Li, Haowei Lin, Weiwei Sun, Ameet Talwalkar, Yiming Yang
Date:2026-04-24 17:59:42

Scaling laws are used to plan multi-million-dollar training runs, but fitting those laws can itself cost millions. In modern large-scale workflows, assembling a sufficiently informative set of pilot experiments is already a major budget-allocation problem rather than a routine preprocessing step. We formulate scaling-law fitting as budget-aware sequential experimental design: given a finite pool of runnable experiments with heterogeneous costs, choose which runs to execute so as to maximize extrapolation accuracy in a high-cost target region. We then propose an uncertainty-aware method for sequentially allocating experimental budget toward the runs most useful for target-region extrapolation. Across a diverse benchmark of scaling-law tasks, our method consistently outperforms classical design-based baselines, and often approaches the performance of fitting on the full experimental set while using only about 10% of the total training budget. Our code is available at https://github.com/PlanarG/active-sl.

An Undecidability Proof for the Plan Existence Problem

Authors:Antonis Achilleos
Date:2026-04-24 17:36:17

The plan existence problem asks, given a goal in the form of a formula in modal logic, an initial epistemic state (a pointed Kripke model), and a set of epistemic actions, whether there exists a sequence of actions that can be applied to reach the goal. We prove that even in the case where the preconditions of the epistemic actions have modal depth at most 1, and there are no postconditions, the plan existence problem is undecidable. The (un)decidability of this problem was previously unknown.

ATRS: Adaptive Trajectory Re-splitting via a Shared Neural Policy for Parallel Optimization

Authors:Jiajun Yu, Guodong Liu, Li Wang, Pengxiang Zhou, Wentao Liu, Yin He, Chao Xu, Fei Gao, Yanjun Cao
Date:2026-04-24 16:58:14

Parallel trajectory optimization via the Alternating Direction Method of Multipliers (ADMM) has emerged as a scalable approach to long-horizon motion planning. However, existing frameworks typically decompose the problem into parallel subproblems based on a predefined fixed structure. Such structural rigidity often causes optimization stagnation in highly constrained regions, where a few lagging subproblems delay global convergence. A natural remedy is to adaptively re-split these stagnating segments online. Yet, deciding when, where, and how to split exceeds the capability of rule-based heuristics. To this end, we propose ATRS, a novel framework that embeds a shared Deep Reinforcement Learning policy into the parallel ADMM loop. We formulate this adaptive adjustment as a Multi-Agent Shared-Policy Markov Decision Process, where all trajectory segments act as homogeneous agents and share a unified neural policy network. This parameter-sharing architecture endows the system with size invariance, enabling it to handle dynamically changing segment counts during re-splitting and generalize to arbitrary trajectory lengths. Furthermore, our formulation inherently supports zero-shot generalization to unseen environments, as our network relies solely on the internal states of the numerical solver rather than on the geometric features of the environment. To ensure solver stability, a Confidence-Based Election mechanism selects only the most stagnating segment for re-splitting at each step. Extensive simulations demonstrate that ATRS accelerates convergence, reducing the number of iterations by up to 26.0% and the computation time by up to 19.1%. Real-world experiments further confirm its applicability to both large-scale offline global planning and real-time onboard replanning within 35 ms per cycle, with no sim-to-real degradation.

Generative Modeling of Neurodegenerative Brain Anatomy with 4D Longitudinal Diffusion Model

Authors:Nivetha Jayakumar, Swakshar Deb, Bahram Jafrasteh, Qingyu Zhao, Miaomiao Zhang
Date:2026-04-24 16:33:48

Understanding and predicting the progression of neurodegenerative diseases remains a major challenge in medical AI, with significant implications for early diagnosis, disease monitoring, and treatment planning. However, most available longitudinal neuroimaging datasets are temporally sparse with a few follow-up scans per subject. This scarcity of temporal data limits our ability to model and accurately capture the continuous anatomical changes related to disease progression in individual subjects. To address this problem, we propose a novel 4D (3DxT) diffusion-based generative framework that effectively models and synthesizes longitudinal brain anatomy over time, conditioned on available clinical variables such as health status, age, sex, and other relevant factors. Moreover, while most current approaches focus on manipulating image intensity or texture, our method explicitly learns the data distribution of topology-preserving spatiotemporal deformations to effectively capture the geometric changes of brain structures over time. This design enables the realistic generation of future anatomical states and the reconstruction of anatomically consistent disease trajectories, providing a more faithful representation of longitudinal brain changes. We validate our model through both synthetic sequence generation and downstream longitudinal disease classification, as well as brain segmentation. Experiments on two large-scale longitudinal neuroimage datasets demonstrate that our method outperforms state-of-the-art baselines in generating anatomically accurate, temporally consistent, and clinically meaningful brain trajectories. Our code is available on Github.

Curvature of optimal transport with respect to the cost and applications to inverse optimal transport

Authors:Gabriel Peyré, Clarice Poon, Oscar Tron
Date:2026-04-24 15:48:55

We study the inverse optimal transport problem of recovering the ground cost from an optimal transport plan. In discrete settings, this problem reduces to inverse linear programming and is intrinsically ill-posed, exhibiting non-identifiability and flat directions. We show that in the continuous setting, the regularity of the marginals fundamentally alters the structure of the inverse problem. Assuming smooth positive densities for the source and target measures, we characterize the second variation of the optimal transport functional with respect to the ground cost in Hölder spaces. In particular, we show that it is non-degenerate modulo the natural transport invariances, yielding a strict curvature property that is absent in discrete transport. As a consequence, we obtain local identifiability and stability results for inverse optimal transport. For the structured family of bilinear costs (i.e. Mahalanobis parametrizations), the ground cost can be uniquely recovered--up to the intrinsic invariances--from a single optimal coupling under a natural spanning condition. We further show that this identifiability property is generic under arbitrarily small perturbations of the marginals, while settings where the optimal transport map is affine (for instance Gaussian or elliptical marginals) remain degenerate. Finally, we establish precise bounds on the bias and statistical variance of inverse optimal transport under entropic regularization. These results reveal a structural parallel between forward and inverse optimal transport: regularity of the marginals ensures smooth optimal maps in the forward problem, while non-degeneracy of the induced transport plan yields curvature and local invertibility in the inverse problem.

Cross-Stage Coherence in Hierarchical Driving VQA: Explicit Baselines and Learned Gated Context Projectors

Authors:Gautam Kumar Jain, Carsten Markgraf, Julian Stähler
Date:2026-04-24 13:54:48

Graph Visual Question Answering (GVQA) for autonomous driving organizes reasoning into ordered stages, namely Perception, Prediction, and Planning, where planning decisions should remain consistent with the model's own perception. We present a comparative study of cross-stage context passing on DriveLM-nuScenes using two complementary mechanisms. The explicit variant evaluates three prompt-based conditioning strategies on a domain-adapted 4B VLM (Mini-InternVL2-4B-DA-DriveLM) without additional training, reducing NLI contradiction by up to 42.6% and establishing a strong zero-training baseline. The implicit variant introduces gated context projectors, which extract a hidden-state vector from one stage and inject a normalized, gated projection into the next stage's input embeddings. These projectors are jointly trained with stage-specific QLoRA adapters on a general-purpose 8B VLM (InternVL3-8B-Instruct) while updating only approximately 0.5% of parameters. The implicit variant achieves a statistically significant 34% reduction in planning-stage NLI contradiction (bootstrap 95% CIs, p < 0.05) and increases cross-stage entailment by 50%, evaluated with a multilingual NLI classifier to account for mixed-language outputs. Planning language quality also improves (CIDEr +30.3%), but lexical overlap and structural consistency degrade due to the absence of driving-domain pretraining. Since the two variants use different base models, we present them as complementary case studies: explicit context passing provides a strong training-free baseline for surface consistency, while implicit gated projection delivers significant planning-stage semantic gains, suggesting domain adaptation as a plausible next ingredient for full-spectrum improvement.

Machine Learning for Multi-messenger Probes of New Physics and Cosmology: A Review and Perspective

Authors:Andrea Addazi, Konstantin Belotsky, Vitaly Beylin, Timur Bikbaev, Deen Chen, Filippo Fabrocini, Stefano Giagu, Krid Jinklub, Artem Kharakhashyan, Maxim Khlopov, Vladimir Korchagin, Maxim Krasnov, Atharv Mahajan, Antonino Marciano, Andrey Mayorov, Antonio Morais, Roman Pasechnik, Jackson Levi Said, Danila Sopin, Viktor Stasenko, Oem Trivedi
Date:2026-04-24 11:29:09

The multi-messenger exploration of dark matter and physics beyond the Standard Model has emerged as a central direction in modern astro-particle physics, particularly following the discovery of gravitational waves. In this work, we present a comprehensive review and forward-looking perspective on machine-learning-enhanced multi-messenger approaches, combining information from gravitational waves, cosmic rays, gamma rays, neutrinos, and collider experiments. We summarize the current state of the field, discuss recent methodological developments, and outline a coherent research program aimed at integrating heterogeneous datasets within a unified inference framework. Our collaboration proposes here a plan for forthcoming analyses aiming at extracting information on the properties and interactions of dark matter, and finally on its genesis, combining multi-messenger astronomy techniques and inputs from laboratory physics. The main objectives planned in this line of research comprise: i) the multi-messenger analysis of new physics in cosmology, including mainly, but not only, several different models of dark matter; ii) the phenomenology of new physics signatures in ground-based cosmic rays experiments, with cross-correlation to the corresponding physical, astrophysical and cosmological observations; iii) the development of machine learning methods for data analysis in ground-based cosmic rays experiments, in light of the new physics signatures. We note that several groups have explored the use of multi-messenger observations, including gravitational waves, to probe alternative dark matter candidates. The present work builds on these developments by focusing on the role of machine learning in integrating heterogeneous datasets. We foresee that such a cross-fertilizing approach will represent the right path to extract information about the main questions left in fundamental physics.

Adapted Wasserstein Barycenters of Gaussian Processes

Authors:Francesco Mattesini, Johannes Wiesel
Date:2026-04-24 11:14:42

We investigate barycenters of Gaussian process laws in adapted Wasserstein space. The adapted Wasserstein distance refines classical optimal transport by enforcing compatibility of transport plans with the temporal flow of information, and is therefore well suited for stochastic systems with filtration constraints, as common in stochastic control, mathematical finance and sequential decision problems. Within this framework, we consider weighted Fréchet means of Gaussian process laws and prove that the associated barycenter problem admits Gaussian solutions. We derive a characterization of adapted Wasserstein barycenters in terms of the means and covariance operators of the underlying processes, and we analyze their existence, uniqueness, and regularity properties under natural assumptions. The Gaussian setting reveals a tractable and structurally rich subclass of adapted transport problems, bridging adapted optimal transport and Bures--Wasserstein geometry. Our results identify adapted Wasserstein barycenters as natural representatives of collections of Gaussian models and suggest new applications in stochastic optimization, robust finance, and sequential statistics.

From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company

Authors:Zhengxu Yu, Yu Fu, Zhiyuan He, Yuxuan Huang, Lee Ka Yiu, Meng Fang, Weilin Luo, Jun Wang
Date:2026-04-24 11:02:44

Individual agent capabilities have advanced rapidly through modular skills and tool integrations, yet multi-agent systems remain constrained by fixed team structures, tightly coupled coordination logic, and session-bound learning. We argue that this reflects a deeper absence: a principled organisational layer that governs how a workforce of agents is assembled, governed, and improved over time, decoupled from what individual agents know. To fill this gap, we introduce \emph{OneManCompany (OMC)}, a framework that elevates multi-agent systems to the organisational level. OMC encapsulates skills, tools, and runtime configurations into portable agent identities called \emph{Talents}, orchestrated through typed organisational interfaces that abstract over heterogeneous backends. A community-driven \emph{Talent Market} enables on-demand recruitment, allowing the organisation to close capability gaps and reconfigure itself dynamically during execution. Organisational decision-making is operationalised through an \emph{Explore-Execute-Review} ($\text{E}^2$R) tree search, which unifies planning, execution, and evaluation in a single hierarchical loop: tasks are decomposed top-down into accountable units and execution outcomes are aggregated bottom-up to drive systematic review and refinement. This loop provides formal guarantees on termination and deadlock freedom while mirroring the feedback mechanisms of human enterprises. Together, these contributions transform multi-agent systems from static, pre-configured pipelines into self-organising and self-improving AI organisations capable of adapting to open-ended tasks across diverse domains. Empirical evaluation on PRDBench shows that OMC achieves an $84.67\%$ success rate, surpassing the state of the art by $15.48$ percentage points, with cross-domain case studies further demonstrating its generality.

Beyond Land Surface Temperature: Explainable Spatial Machine Learning Reveals Urban Morphology Effects on Human-Centric Heat Stress

Authors:Yuan Wang, Shengao Yi, Xiaojiang Li, Pengyuan Liu, Zhiwei Yang, Ronita Bardhan, Rudi Stouffs
Date:2026-04-24 10:48:38

Heat exposure connects the built environment and public health, directly shaping the livability and sustainability of urban areas. Understanding the spatial heterogeneity of heat exposure and its drivers is vital for climate-adaptive urban planning. However, most planning-oriented studies rely on land surface temperature (LST), and whether LST adequately represents human heat exposure and how it differs from physiologically relevant heat stress remains insufficiently examined. Here, adopting Landsat-retrieved 30-m LST and GPU-accelerated 1-m universal thermal climate index (UTCI) in Singapore, this study establishes a comprehensive "Modeling-Comparing-Assessing" framework to systematically evaluate the spatial and mechanistic discrepancies between the two metrics. We further investigate pronounced non-stationary and threshold-based quantitative relationships of the two metrics with urban factors by employing a novel geographically weighted XGBoost (GW-XGBoost) and generalized additive model (GAM) workflow. Our results demonstrate notable discrepancies in spatial patterns of LST and UTCI, along with substantial spatial heterogeneity in how 2D and 3D urban factors impact these two thermal metrics, as revealed by explainable GW-XGBoost models (global out-of-bag R2 = 0.855 for LST and 0.905 for UTCI, respectively). Crucially, spatially explicit SHAP interprets that sky view factor plays a central role in explaining UTCI variability but exhibits a comparatively marginal independent contribution to LST, indicating that LST inadequately captures shading-driven and radiative processes governing actual human heat stress. Notably, SHAP-GAM analysis indicates that higher albedo is associated with increased UTCI. These novel findings provide evidence for integrating physiologically relevant thermal indices to inform targeted heat risk management and climate-adaptive urban planning.

CognitiveTwin: Robust Multi-Modal Digital Twins for Predicting Cognitive Decline in Alzheimer's Disease

Authors:Bulent Soykan, Gulsah Hancerliogullari Koksalmis, Hsin-Hsiung Huang, Laura J. Brattain
Date:2026-04-24 10:40:51

Predicting individual cognitive decline in Alzheimer's disease (AD) is difficult due to the heterogeneity of disease progression. Reliable clinical tools require not only high accuracy but also fairness across demographics and robustness to missing data. We present CognitiveTwin, a digital twin framework that predicts patient-specific cognitive trajectories. The model integrates multi-modal longitudinal data (cognitive scores, magnetic resonance imaging, positron emission tomography, cerebrospinal fluid biomarkers, and genetics). We use a Transformer-based architecture to fuse these modalities and a Deep Markov Model to capture temporal dynamics. We trained and evaluated the framework using data from 1,666 patients in the TADPOLE (Alzheimer's Disease Neuroimaging Initiative) dataset. We assessed the model for prediction error, demographic fairness, and robustness to missing-not-at-random (MNAR) data patterns. ognitiveTwin provides accurate and personalized predictions of cognitive decline. Its demonstrated fairness across patient demographics and resilience to clinical dropout make it a reliable tool for clinical trial enrichment and personalized care planning.

FETS Benchmark: Foundation Models Outperform Dataset-specific Machine Learning in Energy Time Series Forecasting

Authors:Marco Obermeier, Marco Pruckner, Florian Haselbeck, Andreas Zeiselmair
Date:2026-04-24 08:00:50

Driven by the transition towards a climate-neutral energy system, accurate energy time series forecasting is critical for planning and operation. Yet, it remains largely a dataset-specific task, requiring comprehensive training data, limiting scalability, and resulting in high model development and maintenance effort. Recently, foundation models that aim to learn generalizable patterns via extensive pretraining have shown superior performance in multiple prediction tasks. Despite their success and strong potential to address challenges in energy forecasting, their application in this domain remains largely unexplored. We address this gap by presenting the Foundation Models in Energy Time Series Forecasting (FETS) benchmark. We (1) provide a structured overview of energy forecasting use cases along three main dimensions: stakeholders, attributes, and data categories; (2) collect and analyze 54 datasets across 9 data categories, guided by typical stakeholder interests; (3) benchmark foundation models against classical machine learning approaches across different forecasting settings. Foundation models consistently outperform dataset-specific optimized machine learning approaches across all settings and data categories, despite the latter having seen the full historic target data during training. In particular, covariate-informed foundation models achieve the strongest performance. Further analysis reveals a strong correlation between predictive performance and spectral entropy, performance saturation beyond a certain context length, and improved performance at higher aggregation levels such as national load, district heating, and power grid data. Overall, our findings highlight the strong potential of foundation models as scalable and generalizable forecasting solutions for the energy domain, particularly in data-constrained and privacy-sensitive settings.

Evaluation of image simulation open source solutions for simulation of synthetic images in lunar environment

Authors:Jai G Singla, Hinal B Patel, Nitant Dube
Date:2026-04-24 07:26:48

Synthetic image generation is one of the crucial input for planetary missions. It enables researchers and engineers to visualize planned planetary missions, test imaging systems and plan exploration activities in a virtual environment before actual deployment. Image simulation is essential for assessing landing sites, detecting hazards, and validating navigation systems in a missions. This study offers a detailed evaluation of various image simulation approaches for the lunar environment, with particular emphasis on the effects of different camera models and light illumination conditions on the quality of synthetic lunar images. These images are produced using real Digital Elevation Models (DEM) and terrain data derived from instruments such as Chandrayaan-2 Orbiter High Resolution Camera (OHRC) and NASA's Wide Angle Camera (WAC), and Narrow Angle Camera (NAC) instruments. This research aims to improve the reliability of synthetic imagery in supporting autonomous navigation and decision-making systems in lunar exploration. This work contributes to the development of more effective tools for generating important information for future lunar missions and enhances the understanding of the moon's surface environment.

A Probabilistic Framework for Hierarchical Goal Recognition

Authors:Chenyuan Zhang, Katherine Ip, Hamid Rezatofighi, Buser Say, Mor Vered
Date:2026-04-24 05:58:30

Goal recognition aims to infer an agent's goal from observations of its behaviour. In realistic settings, recognition can benefit from exploiting hierarchical task structure and reasoning under uncertainty. Planning-based goal recognition has made substantial progress over the past decade, but to the best of our knowledge no existing approach jointly integrates hierarchical task structure with probabilistic inference. In this paper, we introduce the first planning-based probabilistic framework for hierarchical goal recognition over Hierarchical Task Networks (HTNs). We instantiate the framework by exploiting an HTN planner with a three-stage generative model for likelihood estimation, yielding posterior distributions over goal hypotheses. Empirical results show improved recognition performance over the existing HTN-based recognizer on HTN benchmarks. Overall, the framework lays a foundation for probabilistic goal recognition grounded in hierarchical planning structure, moving goal recognition toward more practical settings.

Navigating Large-Scale Document Collections: MuDABench for Multi-Document Analytical QA

Authors:Zhanli Li, Yixuan Cao, Lvzhou Luo, Ping Luo
Date:2026-04-24 05:28:51

This paper introduces the task of analytical question answering over large, semi-structured document collections. We present MuDABench, a benchmark for multi-document analytical QA, where questions require extracting and synthesizing information across numerous documents to perform quantitative analysis. Unlike existing multi-document QA benchmarks that typically require information from only a few documents with limited cross-document reasoning, MuDABench demands extensive inter-document analysis and aggregation. Constructed via distant supervision by leveraging document-level metadata and annotated financial databases, MuDABench comprises over 80,000 pages and 332 analytical QA instances. We also propose an evaluation protocol that measures final answer accuracy and uses intermediate-fact coverage as an auxiliary diagnostic signal for the reasoning process. Experiments reveal that standard RAG systems, which treat all documents as a flat retrieval pool, perform poorly. To address these limitations, we propose a multi-agent workflow that orchestrates planning, extraction, and code generation modules. While this approach substantially improves both process and outcome metrics, a significant gap remains compared to human expert performance. Our analysis identifies two primary bottlenecks: single-document information extraction accuracy and insufficient domain-specific knowledge in current systems. MuDABench is available at https://github.com/Zhanli-Li/MuDABench.

CodeGraphVLP: Code-as-Planner Meets Semantic-Graph State for Non-Markovian Vision-Language-Action Models

Authors:Khoa Vo, Sieu Tran, Taisei Hanyu, Yuki Ikebe, Duy Nguyen, Bui Duy Quoc Nghi, Minh Vu, Anthony Gunderman, Chase Rainwater, Anh Nguyen, Ngan Le
Date:2026-04-24 05:27:27

Vision-Language-Action (VLA) models promise generalist robot manipulation, but are typically trained and deployed as short-horizon policies that assume the latest observation is sufficient for action reasoning. This assumption breaks in non-Markovian long-horizon tasks, where task-relevant evidence can be occluded or appear only earlier in the trajectory, and where clutter and distractors make fine-grained visual grounding brittle. We present CodeGraphVLP, a hierarchical framework that enables reliable long-horizon manipulation by combining a persistent semantic-graph state with an executable code-based planner and progress-guided visual-language prompting. The semantic-graph maintains task-relevant entities and relations under partial observability. The synthesized planner executes over this semantic-graph to perform efficient progress checks and outputs a subtask instruction together with subtask-relevant objects. We use these outputs to construct clutter-suppressed observations that focus the VLA executor on critical evidence. On real-world non-Markovian tasks, CodeGraphVLP improves task completion over strong VLA baselines and history-enabled variants while substantially lowering planning latency compared to VLM-in-the-loop planning. We also conduct extensive ablation studies to confirm the contributions of each component.

Using a generative model for out-of-sample testing of two-stage stochastic programs

Authors:Ashutosh Shukla, John J. Hasenbein, Erhan Kutanoglu
Date:2026-04-24 04:58:42

Stochastic programming models for decision-making under uncertainty often suffer from scenario scarcity, where obtaining representative samples of uncertain parameters requires expensive simulations or measurements. This work presents a framework that leverages the Normal-to-Anything (NORTA) generative model to enhance the reliability of two-stage stochastic programming solutions through comprehensive out-of-sample testing when scenario data is limited. The NORTA model efficiently generates synthetic scenarios that preserve both marginal distributions and correlation structures from limited available data, offering a computationally tractable alternative to expensive physics-based simulations. We demonstrate the approach through a case study on power grid resilience planning against flood events in Texas, where we use 16 high-fidelity flood scenarios to generate 800 additional synthetic scenarios for validation. The results show that NORTA-generated scenarios accurately capture essential statistical properties, with the out-of-sample performance of first-stage decisions closely matching expectations from the original stochastic programming model. This framework enables decision-makers to assess the robustness of their solutions when obtaining additional real-world data is prohibitively expensive. The approach bridges machine learning and operations research by providing a practical solution to scenario generation challenges in stochastic programming.

V-STC: A Time-Efficient Multi-Vehicle Coordinated Trajectory Planning Approach

Authors:Pengfei Liu, Jialing Zhou, Yuezu Lv, Guanghui Wen, Tingwen Huang
Date:2026-04-24 03:55:32

Coordinating the motions of multiple autonomous vehicles (AVs) requires planning frameworks that ensure safety while making efficient use of space and time. This paper presents a new approach, termed variable-time-step spatio-temporal corridor (V-STC), that enhances the temporal efficiency of multi-vehicle coordination. An optimization model is formulated to construct a V-STC for each AV, in which both the spatial configuration of the corridor cubes and their time durations are treated as decision variables. By allowing the corridor's spatial position and time step to vary, the constructed V-STC reduces the overall temporal occupancy of each AV while maintaining collision-free separation in the spatio-temporal domain. Based on the generated V-STC, a dynamically feasible trajectory is then planned independently for each AV. Simulation studies demonstrate that the proposed method achieves safe multi-vehicle coordination and yields more time-efficient motion compared with existing STC approaches.

Energy-Efficient Multi-Robot Coverage Path Planning of Non-Convex Regions of Interests

Authors:Sourav Raxit, Jose Fuentes, Paulo Padrao, Abdullah Al Redwan Newaz, Md Tamjidul Hoque, Mark Kulp, Leonardo Bobadilla
Date:2026-04-24 03:33:23

This letter presents an energy-efficient multi-robot coverage path planning (MRCPP) framework for large, nonconvex Regions of Interest (ROI) containing obstacles and no-fly zones (NFZ). Existing minimum-energy coverage planning algorithms utilize meta-heuristic boustrophedon workspace decomposition. Therefore, even with minimum energy objectives and energy consumption constraints, they cannot achieve optimal energy efficiency. Moreover, most existing frameworks support only a single type of robotic platform. MRCPP overcomes these limitations by: generating globally-informed swath generation, creating parallel sweeping paths with minimal turns, calculating safety buffers to ensure safe turning clearance, using an efficient mTSP solver to balance workloads and minimize mission time, and connecting disjoint segments via a modified visibility graph that tracks heading angles while maintaining transitions within safe regions. The efficacy of the proposed MRCPP framework is demonstrated through real-world experiments involving autonomous aerial vehicles (AAVs) and autonomous surface vehicles (ASVs). Evaluations demonstrate that the proposed MRCPP consistently outperforms state-of-the-art planners, reducing average total energy consumption by 3\% to 40\% for a team of 3 robots and computation time by an order of magnitude, while maintaining balanced workload distribution and strong scalability across increasing fleet sizes. The MRCPP framework is released as an open-source package and videos of real-world and simulated experiments are available at https://mrc-pp.github.io.

Learning Coverage- and Power-Optimal Transmitter Placement from Building Maps: A Comparative Study of Direct and Indirect Neural Approaches

Authors:Çağkan Yapar
Date:2026-04-23 20:22:53

Optimal wireless transmitter placement is a central task in radio-network planning, yet exhaustive search becomes prohibitively expensive at scale. This paper studies the single-transmitter setting under a fixed learned propagation surrogate, where exhaustive per-pixel evaluation remains tractable and provides surrogate-exact ground truth. We introduce a dataset of 167,525 urban scenarios (RadioMapSeer-Deployment) with dual surrogate-exact labels for coverage-optimal and power-optimal transmitter locations. Ground-truth analysis reveals an asymmetric coverage-power trade-off: coverage-optimal placement sacrifices 13.86% of received power, whereas power-optimal placement sacrifices only 5.50% of coverage; the best achievable balanced placement lies at $\bar{d}=2.60$ from the ideal point (100%,100%). We evaluate two learning formulations: indirect heatmap-based models that predict received-power radio maps, and direct score-map models that predict the objective landscape over feasible transmitter locations. Within the heatmap family, discriminative models deliver one-shot predictions 1350-2400x faster than exhaustive search, while diffusion models additionally support multi-sample inference that improves single-objective performance and, by reusing the same sample pool under a balanced criterion, recovers strong balanced placements without explicit multi-objective training. Dual score-map strategies combining power and coverage score maps match the exhaustive balanced optimum ($\bar{d}=2.60$) and remain close across smaller candidate budgets, at 14-22x speedups after candidate re-evaluation. Both formulations admit very fast one-shot inference; on this benchmark, dual score-map methods are strongest for balanced placement, whereas heatmap formulations remain attractive for their physically meaningful intermediate maps and, in the diffusion setting, for inference-time search.

Documentless Assessments Using Nominal Group Interviews

Authors:Eduardo Miranda
Date:2026-04-23 18:43:44

This paper describes a group interview technique designed to support documentless process assessments while promoting at the same time collaboration among assessment participants. The method was successfully used in one consulting assignment where it got previously discording participants, talking to each other and agreeing on the issues. The technique borrows from agile software development the concept of user stories to cast CMMIs specific practices in concrete terms and the Planning Poker technique, instead of document reviews and audit like interviews, for fact finding and corroboration.

Long-Horizon Manipulation via Trace-Conditioned VLA Planning

Authors:Isabella Liu, An-Chieh Cheng, Rui Yan, Geng Chen, Ri-Zhao Qiu, Xueyan Zou, Sha Yi, Hongxu Yin, Xiaolong Wang, Sifei Liu
Date:2026-04-23 17:59:04

Long-horizon manipulation remains challenging for vision-language-action (VLA) policies: real tasks are multi-step, progress-dependent, and brittle to compounding execution errors. We present LoHo-Manip, a modular framework that scales short-horizon VLA execution to long-horizon instruction following via a dedicated task-management VLM. The manager is decoupled from the executor and is invoked in a receding-horizon manner: given the current observation, it predicts a progress-aware remaining plan that combines (i) a subtask sequence with an explicit done + remaining split as lightweight language memory, and (ii) a visual trace -- a compact 2D keypoint trajectory prompt specifying where to go and what to approach next. The executor VLA is adapted to condition on the rendered trace, thereby turning long-horizon decision-making into repeated local control by following the trace. Crucially, predicting the remaining plan at each step yields an implicit closed loop: failed steps persist in subsequent outputs, and traces update accordingly, enabling automatic continuation and replanning without hand-crafted recovery logic or brittle visual-history buffers. Extensive experiments spanning embodied planning, long-horizon reasoning, trajectory prediction, and end-to-end manipulation in simulation and on a real Franka robot demonstrate strong gains in long-horizon success, robustness, and out-of-distribution generalization. Project page: https://www.liuisabella.com/LoHoManip

Task-Driven Co-Design of Heterogeneous Multi-Robot Systems

Authors:Maximilian Stralz, Meshal Alharbi, Yujun Huang, Gioele Zardini
Date:2026-04-23 17:44:52

Designing multi-agent robotic systems requires reasoning across tightly coupled decisions spanning heterogeneous domains, including robot design, fleet composition, and planning. Much effort has been devoted to isolated improvements in these domains, whereas system-level co-design considering trade-offs and task requirements remains underexplored. In this work, we present a formal and compositional framework for the task-driven co-design of heterogeneous multi-robot systems. Building on a monotone co-design theory, we introduce general abstractions of robots, fleets, planners, executors, and evaluators as interconnected design problems with well-defined interfaces that are agnostic to both implementations and tasks. This structure enables efficient joint optimization of robot design, fleet composition, and planning under task-specific performance constraints. A series of case studies demonstrates the capabilities of the framework. Various component models can be seamlessly incorporated, including new robot types, task profiles, and probabilistic sensing objectives, while non-obvious design alternatives are systematically uncovered with optimality guarantees. The results highlight the flexibility, scalability, and interpretability of the proposed approach, and illustrate how formal co-design enables principled reasoning about complex heterogeneous multi-robot systems.

Agentic AI-assisted coding offers a unique opportunity to instill epistemic grounding during software development

Authors:Magnus Palmblad, Jared M. Ragland, Benjamin A. Neely
Date:2026-04-23 14:50:35

The capabilities of AI-assisted coding are progressing at breakneck speed. Chat-based vibe coding has evolved into fully fledged AI-assisted, agentic software development using agent scaffolds where the human developer creates a plan that agentic AIs implement. One current trend is utilizing documents beyond this plan document, such as project and method-scoped documents. Here we propose GROUNDING$.$md, a community-governed, field-scoped epistemic grounding document, using mass spectrometry-based proteomics as an example. This explicit field-scoped grounding document encodes Hard Constraints (non-negotiable validity invariants empirically required for scientific correctness) and Convention Parameters (community-agreed defaults) that override all other contexts to enforce validity, regardless of what the user prompts. In practice, this will empower a non-domain expert to generate code, tools, and software that have best practices baked in at the ground level, providing confidence to the software developer but also to those reviewing or using the final product. Undoubtedly it is easier to have agentic AIs adhere to guidelines than humans, and this opportunity allows for organizations to develop epistemic grounding documents in such a way as to keep domain experts in the loop in a future of democratized generation of bespoke software solutions.

Agentic Artificial Intelligence in Finance: A Comprehensive Survey

Authors:Irene Aldridge, Jolie An, Riley Burke, Michael Cao, Chia-Yi Chien, Kexin Deng, Ruipeng Deng, Yichen Gao, Olivia Guo, Shunran He, Zheng Li, George Lin, Weihang Lin, Percy Lyu, Alex Ng, Qi Wang, Hanxi Xiao, Dora Xu, Yuanyuan Xue, Sheng Zhang, Sirui Zhang, Yun Zhang, Sirui Zhao, Xiaolong Zhao, Yihan Zhao, Waner Zheng
Date:2026-04-23 13:37:06

The emergence of agentic artificial intelligence (AI) represents a fundamental transformation in financial markets, characterized by autonomous systems capable of reasoning, planning, and adaptive decision-making with minimal human intervention. This comprehensive survey synthesizes recent advances in agentic AI across multiple dimensions of financial operations, including system architecture, market applications, regulatory frameworks, and systemic implications. We examine how agentic AI differs from traditional algorithmic trading and generative AI through its capacity for goal-oriented autonomy, continuous learning, and multi-agent coordination. Our analysis shows that while agentic AI offers substantial potential for enhanced market efficiency, liquidity provision, and risk management, it also introduces novel challenges related to market stability, regulatory compliance, interpretability, and systemic risk. Through a systematic review of foundational research, technical architectures, market applications, and governance frameworks, this survey provides scholars and practitioners with a structured understanding of how agentic AI is reshaping financial markets and identifies critical research directions for ensuring that these systems enhance both operational efficiency and market resilience.

Estimator-Aligned Prospective Sample Size Determination for Designs Using Inverse Probability of Treatment Weighting

Authors:Taekwon Hong, Daeyoung Lim, Woojung Bae, Yong Ma
Date:2026-04-23 13:20:58

In observational studies, accurately characterizing variance is critical for sample size determination, yet unaccounted-for variability from propensity score estimation and the resulting weights limit the accuracy of standard variance approximations for design. Existing approaches often rely on heuristics or randomized controlled trial (RCT) formulas that treat weights as fixed, potentially misaligning prospective design with the causal estimator used at analysis. We propose an estimator-aligned framework for prospective sample size determination based on generalized estimating equations (GEE) and stacked M-estimation. By merging the propensity score model and marginal structural model (MSM) into a single system of estimating equations, the method propagates nuisance-model uncertainty and directly targets the large-sample variance of the IPTW estimator. For study planning, we estimate a pilot-based large-sample variance factor and introduce a bootstrap stabilization procedure that accounts for both within- and between-pilot variability. The framework applies uniformly across binary, count, and continuous outcomes through link-specific GEE representations under a common design principle. Simulation studies motivated by post-marketing safety and healthcare cost applications demonstrate that anchoring design to this variance improves power calibration relative to conventional RCT-style formulas, particularly in settings with weight instability, outcome sparsity, or heavy-tailed variability.

Entropic regularization of Monge's problem

Authors:Marcel Nutz, Chenyang Zhong
Date:2026-04-23 11:57:56

We study the vanishing-regularization limit of entropically regularized optimal transport (EOT) for the Euclidean distance cost $c(x,y)=\|x-y\|$ in dimension $d>1$. We develop a comprehensive variational convergence framework that entails two main results. First, we resolve the longstanding entropic selection problem: the EOT minimizer converges to a distinguished optimal transport plan that is characterized explicitly as the solution of a constrained EOT problem on each transport ray. Denoting by $\varepsilon>0$ the regularization parameter, this selection holds for all $o(\varepsilon)$-approximate minimizers, with sharp failure at the $O(\varepsilon)$ scale. Second, we establish an explicit second-order expansion of the entropic transport cost. The second-order term encodes the geometry of the regularization and reveals the optimal asymptotic tradeoff between entropy and transport cost.

MISTY: High-Throughput Motion Planning via Mixer-based Single-step Drifting

Authors:Yining Xing, Zehong Ke, Yiqian Tu, Zhiyuan Liu, Wenhao Yu, Jianqiang Wang
Date:2026-04-23 09:50:34

Multi-modal trajectory generation is essential for safe autonomous driving, yet existing diffusion-based planners suffer from high inference latency due to iterative neural function evaluations. This paper presents MISTY (Mixer-based Inference for Single-step Trajectory-drifting Yield), a high-throughput generative motion planner that achieves state-of-the-art closed-loop performance with pure single-step inference. MISTY integrates a vectorized Sub-Graph encoder to capture environment context, a Variational Autoencoder to structure expert trajectories into a compact 32-dimensional latent manifold, and an ultra-lightweight MLP-Mixer decoder to eliminate quadratic attention complexity. Importantly, we introduce a latent-space drifting loss that shifts the complex distribution evolution entirely to the training phase. By formulating explicit attractive and repulsive forces, this mechanism empowers the model to synthesize novel, proactive maneuvers, such as active overtaking, that are virtually absent from the raw expert demonstrations. Extensive evaluations on the nuPlan benchmark demonstrate that MISTY achieves state-of-the-art results on the challenging Test14-hard split, with comprehensive scores of 80.32 and 82.21 in non-reactive and reactive settings, respectively. Operating at over 99 FPS with an end-to-end latency of 10.1 ms, MISTY offers an order-of-magnitude speedup over iterative diffusion planners while while achieving significantly robust generation.

Instance-level Visual Active Tracking with Occlusion-Aware Planning

Authors:Haowei Sun, Kai Zhou, Hao Gao, Shiteng Zhang, Jinwu Hu, Xutao Wen, Qixiang Ye, Mingkui Tan
Date:2026-04-23 09:11:50

Visual Active Tracking (VAT) aims to control cameras to follow a target in 3D space, which is critical for applications like drone navigation and security surveillance. However, it faces two key bottlenecks in real-world deployment: confusion from visually similar distractors caused by insufficient instance-level discrimination and severe failure under occlusions due to the absence of active planning. To address these, we propose OA-VAT, a unified pipeline with three complementary modules. First, a training-free Instance-Aware Offline Prototype Initialization aggregates multi-view augmented features via DINOv3 to construct discriminative instance prototypes, mitigating distractor confusion. Second, an Online Prototype Enhancement Tracker enhances prototypes online and integrates a confidence-aware Kalman filter for stable tracking under appearance and motion changes. Third, an Occlusion-Aware Trajectory Planner, trained on our new Planning-20k dataset, uses conditional diffusion to generate obstacle-avoiding paths for occlusion recovery. Experiments demonstrate OA-VAT achieves 0.93 average SR on UnrealCV (+2.2% vs. SOTA TrackVLA), 90.8% average CAR on real-world datasets (+12.1% vs. SOTA GC-VAT), and 81.6% TSR on a DJI Tello drone. Running at 35 FPS on an RTX 3090, it delivers robust, real-time performance for practical deployment.

HiCrew: Hierarchical Reasoning for Long-Form Video Understanding via Question-Aware Multi-Agent Collaboration

Authors:Yuehan Zhu, Jingqi Zhao, Jiawen Zhao, Xudong Mao, Baoquan Zhao
Date:2026-04-23 09:04:32

Long-form video understanding remains fundamentally challenged by pervasive spatiotemporal redundancy and intricate narrative dependencies that span extended temporal horizons. While recent structured representations compress visual information effectively, they frequently sacrifice temporal coherence, which is critical for causal reasoning. Meanwhile, existing multi-agent frameworks operate through rigid, pre-defined workflows that fail to adapt their reasoning strategies to question-specific demands. In this paper, we introduce HiCrew, a hierarchical multi-agent framework that addresses these limitations through three core contributions. First, we propose a Hybrid Tree structure that leverages shot boundary detection to preserve temporal topology while performing relevance-guided hierarchical clustering within semantically coherent segments. Second, we develop a Question-Aware Captioning mechanism that synthesizes intent-driven visual prompts to generate precision-oriented semantic descriptions. Third, we integrate a Planning Layer that dynamically orchestrates agent collaboration by adaptively selecting roles and execution paths based on question complexity. Extensive experiments on EgoSchema and NExT-QA validate the effectiveness of our approach, demonstrating strong performance across diverse question types with particularly pronounced gains in temporal and causal reasoning tasks that benefit from our hierarchical structure-preserving design.