planning - 2026-03-23

AI Agents Can Already Autonomously Perform Experimental High Energy Physics

Authors:Eric A. Moreno, Samuel Bright-Thonney, Andrzej Novak, Dolores Garcia, Philip Harris
Date:2026-03-20 17:55:27

Large language model-based AI agents are now able to autonomously execute substantial portions of a high energy physics (HEP) analysis pipeline with minimal expert-curated input. Given access to a HEP dataset, an execution framework, and a corpus of prior experimental literature, we find that Claude Code succeeds in automating all stages of a typical analysis: event selection, background estimation, uncertainty quantification, statistical inference, and paper drafting. We argue that the experimental HEP community is underestimating the current capabilities of these systems, and that most proposed agentic workflows are too narrowly scoped or scaffolded to specific analysis structures. We present a proof-of-concept framework, Just Furnish Context (JFC), that integrates autonomous analysis agents with literature-based knowledge retrieval and multi-agent review, and show that this is sufficient to plan, execute, and document a credible high energy physics analysis. We demonstrate this by conducting analyses on open data from ALEPH, DELPHI, and CMS to perform electroweak, QCD, and Higgs boson measurements. Rather than replacing physicists, these tools promise to offload the repetitive technical burden of analysis code development, freeing researchers to focus on physics insight, truly novel method development, and rigorous validation. Given these developments, we advocate for new strategies for how the community trains students, organizes analysis efforts, and allocates human expertise.

The Robot's Inner Critic: Self-Refinement of Social Behaviors through VLM-based Replanning

Authors:Jiyu Lim, Youngwoo Yoon, Kwanghyun Park
Date:2026-03-20 17:40:21

Conventional robot social behavior generation has been limited in flexibility and autonomy, relying on predefined motions or human feedback. This study proposes CRISP (Critique-and-Replan for Interactive Social Presence), an autonomous framework where a robot critiques and replans its own actions by leveraging a Vision-Language Model (VLM) as a 'human-like social critic'. CRISP integrates (1) extraction of movable joints and constraints by analyzing the robot's description file (e.g., MJCF), (2) generation of step-by-step behavior plans based on situational context, (3) generation of low-level joint control code by referencing visual information (joint range-of-motion visualizations), (4) VLM-based evaluation of social appropriateness and naturalness, including pinpointing erroneous steps, and (5) iterative refinement of behaviors through reward-based search. This approach is not tied to a specific robot API; it can generate subtly different, human-like motions on various platforms using only the robot's structure file. In a user study involving five different robot types and 20 scenarios, including mobile manipulators and humanoids, our proposed method achieved significantly higher preference and situational appropriateness ratings compared to previous methods. This research presents a general framework that minimizes human intervention while expanding the robot's autonomous interaction capabilities and cross-platform applicability. Detailed result videos and supplementary information regarding this work are available at: https://limjiyu99.github.io/inner-critic/

Reducing the Incentive to Tank: The Ex Post Gold Plan

Authors:Bret Benesh
Date:2026-03-20 16:25:22

Many recent proposals for reducing tanking in draft lotteries share a common structure: losses improve draft position early in the season while wins improve draft position later. While such systems improve late-season incentives, they retain a predictable pivot point that tanking teams can exploit strategically. This paper proposes a simple modification that introduces uncertainty into the timing of the incentive switch. The proposed metric, the Realized Elimination Wins Determinant (REWIND), ranks teams according to the number of wins obtained after their ex post elimination date, which makes this a variation of the Gold Plan. Because the ex post elimination date cannot be known with certainty during the season, the mechanism weakens incentives for strategic losing while preserving incentives for competitive effort after elimination. Moreover, the ex post elimination date is typically earlier than other proposed pivot points, so there is a longer period during which a tanking team's best strategy is to win. The Ex Post Gold Plan uses the REWIND metric to create a simple system in which every team will be incentivized to win at least half of its games in most seasons.
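As a toy illustration of the REWIND ranking described above (the data structures and the way the elimination index is supplied here are hypothetical simplifications, not the paper's construction):

```python
def rewind_score(results, elimination_index):
    """Count wins obtained at or after the team's ex post elimination date.

    results: list of booleans (True = win), in chronological order.
    elimination_index: index of the game at which the team was, in
    hindsight, mathematically eliminated (knowable only ex post).
    """
    return sum(results[elimination_index:])

def rewind_draft_order(teams):
    """Rank eliminated teams for the draft: more post-elimination wins
    means an earlier pick, rewarding effort after elimination.

    teams: dict mapping team name -> (results, elimination_index).
    """
    return sorted(
        teams,
        key=lambda t: rewind_score(*teams[t]),
        reverse=True,
    )
```

Because the elimination index is only determined in hindsight, a team cannot target a pivot date during the season, which is the mechanism's point.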

A Unified Platform and Quality Assurance Framework for 3D Ultrasound Reconstruction with Robotic, Optical, and Electromagnetic Tracking

Authors:Lewis Howell, Manisha Waterston, Tze Min Wah, James H. Chandler, James R. McLaughlan
Date:2026-03-20 15:56:50

Three-dimensional (3D) Ultrasound (US) can facilitate diagnosis, treatment planning, and image-guided therapy. However, current studies rarely provide a comprehensive evaluation of volumetric accuracy and reproducibility, highlighting the need for robust Quality Assurance (QA) frameworks, particularly for tracked 3D US reconstruction using freehand or robotic acquisition. This study presents a QA framework for 3D US reconstruction and a flexible open source platform for tracked US research. A custom phantom containing geometric inclusions with varying symmetry properties enables straightforward evaluation of optical, electromagnetic, and robotic kinematic tracking for 3D US at different scanning speeds and insonation angles. A standardised pipeline performs real-time segmentation and 3D reconstruction of geometric targets (DSC = 0.97, FPS = 46) without GPU acceleration, followed by automated registration and comparison with ground-truth geometries. Applying this framework showed that our robotic 3D US achieves state-of-the-art reconstruction performance (DSC-3D = 0.94 ± 0.01, HD95 = 1.17 ± 0.12), approaching the spatial resolution limit imposed by the transducer. This work establishes a flexible experimental platform and a reproducible validation methodology for 3D US reconstruction. The proposed framework enables robust cross-platform comparisons and improved reporting practices, supporting the safe and effective clinical translation of 3D ultrasound in diagnostic and image-guided therapy applications.
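For reference, the Dice Similarity Coefficient (DSC) reported above is the standard volumetric overlap metric; a minimal sketch (binary masks assumed, independent of the paper's pipeline):

```python
import numpy as np

def dice_coefficient(pred, truth):
    """Dice Similarity Coefficient between two binary masks/volumes:
    DSC = 2|A ∩ B| / (|A| + |B|), ranging from 0 (no overlap) to 1."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # two empty masks agree perfectly by convention
    return 2.0 * np.logical_and(pred, truth).sum() / denom
```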

Uncertainty Matters: Structured Probabilistic Online Mapping for Motion Prediction in Autonomous Driving

Authors:Pritom Gogoi, Faris Janjoš, Bin Yang, Andreas Look
Date:2026-03-20 15:56:48

Online map generation and trajectory prediction are critical components of the autonomous driving perception-prediction-planning pipeline. While modern vectorized mapping models achieve high geometric accuracy, they typically treat map estimation as a deterministic task, discarding structural uncertainty. Existing probabilistic approaches often rely on diagonal covariance matrices, which assume independence between points and fail to capture the strong spatial correlations inherent in road geometry. To address this, we propose a structured probabilistic formulation for online map generation. Our method explicitly models intra-element dependencies by predicting a dense covariance matrix, parameterized via a Low-Rank plus Diagonal (LRPD) covariance decomposition. This formulation represents uncertainty as a combination of a low-rank component, which captures global spatial structure, and a diagonal component representing independent local noise, thereby capturing geometric correlations without the prohibitive computational cost of full covariance matrices. Evaluations on the nuScenes dataset demonstrate that our uncertainty-aware framework yields consistent improvements in online map generation quality compared to deterministic baselines. Furthermore, our approach establishes new state-of-the-art performance for map-based motion prediction, highlighting the critical role of uncertainty in planning tasks. Code is published under link-available-soon.
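A minimal numerical sketch of the Low-Rank plus Diagonal (LRPD) parameterization described above, with assumed shapes (n = 4 map points, rank r = 2) and a softplus positivity constraint that is our assumption, not stated in the abstract:

```python
import numpy as np

def lrpd_covariance(U, d_raw):
    """Low-Rank plus Diagonal covariance: Sigma = U U^T + diag(d).

    U:     (n, r) low-rank factor capturing global spatial structure
           across the n points of a map element (r << n).
    d_raw: (n,) unconstrained diagonal parameters; softplus keeps the
           independent local-noise variances strictly positive.
    """
    d = np.log1p(np.exp(d_raw))          # softplus for positivity
    return U @ U.T + np.diag(d)

# A rank-2 factor over 4 map points: the resulting dense Sigma is
# symmetric positive definite and encodes correlations between points,
# unlike a purely diagonal parameterization, while storing only
# n*r + n parameters instead of n*(n+1)/2.
rng = np.random.default_rng(0)
U = rng.normal(size=(4, 2))
Sigma = lrpd_covariance(U, np.zeros(4))
```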

Fine-tuning Timeseries Predictors Using Reinforcement Learning

Authors:Hugo Cazaux, Ralph Rudd, Hlynur Stefánsson, Sverrir Ólafsson, Eyjólfur Ingi Ásgeirsson
Date:2026-03-20 15:44:40

This chapter presents three major reinforcement learning algorithms used for fine-tuning financial forecasters. We propose a clear implementation plan for backpropagating the loss of a reinforcement learning task to a model trained using supervised learning, and compare performance before and after fine-tuning. We find an increase in performance after fine-tuning, as well as transfer-learning properties in the resulting models, indicating the benefits of fine-tuning. We also highlight the tuning process and empirical results for future implementation by practitioners.
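One hedged sketch of what policy-gradient fine-tuning of a supervised forecaster can look like: a plain REINFORCE update with a Gaussian policy whose mean is the pretrained model's forecast. The chapter's actual algorithms, rewards, and models are not specified in the abstract; everything below is illustrative.

```python
import numpy as np

def reinforce_finetune(w, x, reward_fn, n_steps=1500, lr=0.05, sigma=0.2, seed=0):
    """Fine-tune linear forecast weights w with REINFORCE.

    The supervised forecast w @ x becomes the mean of a Gaussian policy;
    sampled forecasts are scored by a task reward (e.g. trading P&L
    rather than MSE), and the score-function gradient shifts w toward
    higher-reward forecasts.
    """
    rng = np.random.default_rng(seed)
    for _ in range(n_steps):
        mu = float(w @ x)                 # pretrained point forecast
        a = mu + sigma * rng.normal()     # sampled action (forecast)
        r = reward_fn(a)
        # grad_w log N(a; mu, sigma^2) = (a - mu) / sigma^2 * x
        w = w + lr * r * (a - mu) / sigma**2 * x
    return w
```

In practice one would subtract a baseline from `r` to reduce gradient variance; this bare version is kept minimal for illustration.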

Orchestrating Human-AI Software Delivery: A Retrospective Longitudinal Field Study of Three Software Modernization Programs

Authors:Maximiliano Armesto, Christophe Kolb
Date:2026-03-20 15:14:36

Evidence on AI in software engineering still leans heavily toward individual task completion, while evidence on team-level delivery remains scarce. We report a retrospective longitudinal field study of Chiron, an industrial platform that coordinates humans and AI agents across four delivery stages: analysis, planning, implementation, and validation. The study covers three real software modernization programs -- a COBOL banking migration (~30k LOC), a large accounting modernization (~400k LOC), and a .NET/Angular mortgage modernization (~30k LOC) -- observed across five delivery configurations: a traditional baseline and four successive platform versions (V1--V4). The benchmark separates observed outcomes (stage durations, task volumes, validation-stage issues, first-release coverage) from modeled outcomes (person-days and senior-equivalent effort under explicit staffing scenarios). Under baseline staffing assumptions, portfolio totals move from 36.0 to 9.3 summed project-weeks; modeled raw effort falls from 1080.0 to 232.5 person-days; modeled senior-equivalent effort falls from 1080.0 to 139.5 SEE-days; validation-stage issue load falls from 8.03 to 2.09 issues per 100 tasks; and first-release coverage rises from 77.0% to 90.5%. V3 and V4 add acceptance-criteria validation, repository-native review, and hybrid human-agent execution, simultaneously improving speed, coverage, and issue load. The evidence supports a central thesis: the largest gains appear when AI is embedded in an orchestrated workflow rather than deployed as an isolated coding assistant.

GustPilot: A Hierarchical DRL-INDI Framework for Wind-Resilient Quadrotor Navigation

Authors:Amir Atef Habel, Roohan Ahmed Khan, Fawad Mehboob, Clement Fortin, Dzmitry Tsetserukou
Date:2026-03-20 14:06:11

Wind disturbances remain a key barrier to reliable autonomous navigation for lightweight quadrotors, where rapidly varying airflow can destabilize both planning and tracking. This paper introduces GustPilot, a hierarchical wind-resilient navigation stack in which a deep reinforcement learning (DRL) policy generates inertial-frame velocity references for gate traversal, while a geometric Incremental Nonlinear Dynamic Inversion (INDI) controller provides low-level tracking with fast residual disturbance rejection. The INDI layer achieves this by providing incremental feedback on both specific linear acceleration and angular acceleration rate, using onboard sensor measurements to reject wind disturbances rapidly. Robustness is obtained through a two-level strategy: wind-aware planning learned via fan-jet domain randomization during training, and rapid execution-time disturbance rejection by the INDI tracking controller. We evaluate GustPilot in real flights on a 50 g quadcopter platform against a DRL-PID baseline across four scenarios ranging from no-wind to fully dynamic conditions with a moving gate and a moving disturbance source. Despite being trained only in a minimal single-gate, single-fan setup, the policy generalizes to significantly more complex environments (up to six gates and four fans) without retraining. Across 80 experiments, DRL-INDI achieves an average Overall Success Rate (OSR) of 94.7%, versus 55.0% for DRL-PID, reduces tracking RMSE by up to 50%, and sustains speeds up to 1.34 m/s under wind disturbances of up to 3.5 m/s. These results demonstrate that combining DRL-based velocity planning with structured INDI disturbance rejection provides a practical and generalizable approach to wind-resilient autonomous flight navigation.
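For orientation, the generic incremental-inversion step that INDI controllers build on can be sketched as follows; this is a simplified illustration, not the paper's controller (its effectiveness model, filtering, and attitude loop are not given in the abstract):

```python
import numpy as np

def indi_increment(u_prev, a_cmd, a_meas, G):
    """One generic INDI step: instead of inverting the full dynamics,
    apply an input increment that cancels the *measured* acceleration
    error, so unmodeled disturbances (e.g. wind) are rejected implicitly.

    u_prev: last commanded input
    a_cmd:  commanded acceleration
    a_meas: acceleration measured by onboard sensors
    G:      control-effectiveness matrix (d a / d u)
    """
    return u_prev + np.linalg.pinv(G) @ (a_cmd - a_meas)
```

Because the measured acceleration already contains the disturbance, the increment cancels it without the disturbance ever being modeled, which is why the rejection is fast.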

On the Ability of Transformers to Verify Plans

Authors:Yash Sarrof, Yupei Du, Katharina Stein, Alexander Koller, Sylvie Thiébaux, Michael Hahn
Date:2026-03-20 13:55:29

Transformers have shown inconsistent success in AI planning tasks, and theoretical understanding of when generalization should be expected has been limited. We take important steps towards addressing this gap by analyzing the ability of decoder-only models to verify whether a given plan correctly solves a given planning instance. To analyze the general setting where the number of objects -- and thus the effective input alphabet -- grows at test time, we introduce C*-RASP, an extension of C-RASP designed to establish length generalization guarantees for transformers under simultaneous growth in sequence length and vocabulary size. Our results identify a large class of classical planning domains for which transformers can provably learn to verify long plans, as well as structural properties that significantly affect the learnability of length-generalizable solutions. Empirical experiments corroborate our theory.

How did the Urban Network Flow Adapt to the Collapse of the Carola Bridge?

Authors:Jyotirmaya Ijaradar, Ning Xie, Lei Wei, Sebastian Pape, Matthias Körner, Meng Wang
Date:2026-03-20 13:50:33

The unexpected collapse of the Carola Bridge in Dresden, Germany, provides a rare opportunity to characterise how urban network traffic adapts to an unexpected infrastructure disruption. This study develops a data-driven analytical framework using traffic data from the Dresden traffic management system to assess the short-term impacts of the disruption. By combining statistical comparisons of pre- and post-collapse motorised traffic distributions, peak-hour shifts, and Park-and-Ride data analyses, the framework reveals how traffic dynamics and traveller choices adjust under infrastructure disruption. Results reveal that the two closest bridges, the Albert and Marien Bridges, absorb the majority of the diverted motorised traffic. In particular, the daily traffic volume on the Albert Bridge increases by up to 81%, which is equivalent to 3.5 hours of traffic operating at maximum flow. Peak hours on critical links are significantly prolonged, reaching up to 250 minutes. Beyond redistribution, the overall daily motorised traffic crossing the Elbe river declines by approximately 8,000 vehicles, while Park-and-Ride usage increases by up to 188%, suggesting a potential travel mode shift after the disruption. The study reveals the patterns of traffic redistribution following an unexpected disruption and provides insights for resilience planning and emergency traffic management.

Low-Latency Stateful Stream Processing through Timely and Accurate Prefetching

Authors:Eleni Zapridou, Anastasia Ailamaki
Date:2026-03-20 12:07:41

Mission-critical applications often run "forever" and process large data volumes in real time while demanding low latency. To handle the large state of these applications, modern streaming engines rely on key-value stores and store state on local storage or remotely, but accessing such state inflates latency. As today's engines tightly couple the data path with state I/O, a tuple triggers state access only when it reaches a stateful operator, placing I/O on the critical path and stalling the CPU. However, the keys used to access the state are frequently known earlier in the query plan. Building on this insight, we propose Keyed Prefetching, which decouples the data path from state access by extracting future access keys at upstream operators and proactively staging the corresponding state in memory before tuples arrive. This overlaps I/O with ongoing computation and hides the latency of large-state accesses. We pair Keyed Prefetching with Timestamp-Aware Caching, a cache-eviction policy that jointly manages previously accessed and prefetched entries to use memory efficiently. Together, these techniques reduce latency for long-running, real-time queries without sacrificing throughput.
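A minimal, synchronous sketch of the Keyed Prefetching idea described above: keys are extracted at an upstream operator and their state staged in memory, so the downstream stateful operator ideally finds a memory hit rather than blocking on store I/O. The class, method names, and oldest-first eviction are illustrative; a real engine would issue the fetch asynchronously and use the paper's Timestamp-Aware Caching policy.

```python
from collections import OrderedDict

class KeyedPrefetcher:
    """Decouple the data path from state access: stage state for keys
    that are known early in the query plan, before tuples reach the
    stateful operator."""

    def __init__(self, store, capacity=128):
        self.store = store            # slow key-value backend (a dict here)
        self.cache = OrderedDict()    # staged entries, oldest first
        self.capacity = capacity

    def prefetch(self, key):
        """Called at the upstream operator, where the key is first known.
        (Stand-in for an asynchronous fetch overlapping with compute.)"""
        if key not in self.cache:
            self.cache[key] = self.store[key]
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)   # evict the oldest entry

    def lookup(self, key):
        """Called at the stateful operator; ideally a pure memory hit."""
        if key in self.cache:
            return self.cache[key]
        return self.store[key]        # miss: fall back to blocking I/O
```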

Problem difficulty and waiting time shape the level of detail and temporal organization of visual strategies in human planning

Authors:Mattia Eluchans, Giovanni Pezzulo
Date:2026-03-20 11:47:36

Planning entails identifying sequences of actions to reach a goal, yet we still have incomplete knowledge of how problem constraints, such as difficulty and available time, influence the visual strategies supporting plan construction, both in terms of their coverage of the to-be-executed plans and their temporal organization. To fill this gap, we recorded participants' cursor and eye movements in a multi-target problem solving task on a grid. We manipulated two orthogonal dimensions: problem difficulty, by introducing the novel construct of misleadingness, which measures how much nodes' distances on the grid diverge from their relative positions along the solution, and waiting time, by allowing participants either to act immediately or wait before moving. We found that difficulty significantly affected both performance and gaze: harder problems reduced success rates, required more corrections and pauses, elicited longer pre-movement inspection that provided higher coverage of the to-be-executed plan, and led to more re-fixations. When participants could start immediately, they did so without fully consolidating their plan. This led to more pauses and backtracks, but also to more precise gaze-cursor alignment during execution, suggesting improved online control compensating for incomplete planning. With increased planning time, greater difficulty led participants to achieve a better temporal alignment between pre-movement visual inspection and cursor movement during execution. Overall, our results suggest that problem difficulty increases the visual coverage of the upcoming plan, whereas time availability shapes the extent of replanning during execution and determines whether gaze-path coherence emerges before movement or only during execution in difficult problems.

Multi-Agent Motion Planning on Industrial Magnetic Levitation Platforms: A Hybrid ADMM-HOCBF approach

Authors:Bavo Tistaert, Stan Servaes, Alejandro Gonzalez-Garcia, Ibrahim Ibrahim, Louis Callens, Jan Swevers, Wilm Decré
Date:2026-03-20 10:28:24

This paper presents a novel hybrid motion planning method for holonomic multi-agent systems. The proposed decentralised model predictive control (MPC) framework tackles the intractability of classical centralised MPC for a growing number of agents while providing safety guarantees. This is achieved by combining a decentralised version of the alternating direction method of multipliers (ADMM) with a centralised high-order control barrier function (HOCBF) architecture. Simulation results show significant improvement in scalability over classical centralised MPC. We validate the efficacy and real-time capability of the proposed method by developing a highly efficient C++ implementation and deploying the resulting trajectories on a real industrial magnetic levitation platform.

Offshore oil and gas platform dynamics in the North Sea, Gulf of Mexico, and Persian Gulf: Exploiting the Sentinel-1 archive

Authors:Robin Spanier, Thorsten Hoeser, John Truckenbrodt, Felix Bachofer, Claudia Kuenzer
Date:2026-03-20 09:40:32

The increasing use of marine spaces by offshore infrastructure, including oil and gas platforms, underscores the need for consistent, scalable monitoring. Offshore development has economic, environmental, and regulatory implications, yet maritime areas remain difficult to monitor systematically due to their inaccessibility and spatial extent. This study presents an automated approach to the spatiotemporal detection of offshore oil and gas platforms based on freely available Earth observation data. Leveraging Sentinel-1 archive data and deep learning-based object detection, a consistent quarterly time series of platform locations for three major production regions (the North Sea, the Gulf of Mexico, and the Persian Gulf) was created for the period 2017-2025. In addition, platform size, water depth, distance to the coast, national affiliation, and installation and decommissioning dates were derived. 3,728 offshore platforms were identified in 2025: 356 in the North Sea, 1,641 in the Gulf of Mexico, and 1,731 in the Persian Gulf. While expansion was observed in the Persian Gulf until 2024, the Gulf of Mexico and the North Sea saw a decline in platform numbers from 2018-2020. At the same time, a pronounced dynamic was apparent: more than 2,700 platforms were installed or relocated to new sites, while a comparable number were decommissioned or relocated. Furthermore, the increasing number of platforms with short lifespans points to a structural change in the offshore sector associated with the growing importance of mobile offshore units such as jack-ups or drillships. The results highlight the potential of freely available Earth observation data and deep learning for consistent, long-term monitoring of marine infrastructure. The derived dataset is public and provides a basis for offshore monitoring, maritime planning, and analyses of the transformation of the offshore energy sector.

Envy-Free School Redistricting Between Two Groups

Authors:Daisuke Shibatani, Yutaro Yamaguchi
Date:2026-03-20 07:11:32

We study an application of fair division theory to school redistricting. Procaccia, Robinson, and Tucker-Foltz (SODA 2024) recently proposed a mathematical model to generate redistricting plans that provide theoretically guaranteed fairness among demographic groups of students. They showed that an almost proportional allocation can be found by adding $O(g \log g)$ extra seats in total, where $g$ is the number of groups. In contrast, for three or more groups, adding $o(n)$ extra seats is not sufficient to obtain an almost envy-free allocation in general, where $n$ is the total number of students. In this paper, we focus on the case of two groups. We introduce a relevant relaxation of envy-freeness, termed 1-relaxed envy-freeness, which limits the capacity violation not in total but at each school to at most one. We show that there always exists a 1-relaxed envy-free allocation, which can be found in polynomial time.

DynFlowDrive: Flow-Based Dynamic World Modeling for Autonomous Driving

Authors:Xiaolu Liu, Yicong Li, Song Wang, Junbo Chen, Angela Yao, Jianke Zhu
Date:2026-03-20 06:19:31

Recently, world models have been incorporated into autonomous driving systems to improve planning reliability. Existing approaches typically predict future states through appearance generation or deterministic regression, which limits their ability to capture trajectory-conditioned scene evolution and leads to unreliable action planning. To address this, we propose DynFlowDrive, a latent world model that leverages flow-based dynamics to model the transition of world states under different driving actions. By adopting the rectified-flow formulation, the model learns a velocity field that describes how the scene state changes under different driving actions, enabling progressive prediction of future latent states. Building upon this, we further introduce a stability-aware multi-mode trajectory selection strategy that evaluates candidate trajectories according to the stability of the induced scene transitions. Extensive experiments on the nuScenes and NavSim benchmarks demonstrate consistent improvements across diverse driving frameworks without introducing additional inference overhead. Source code will be available at https://github.com/xiaolul2/DynFlowDrive.
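For orientation, the rectified-flow formulation mentioned above pairs a linear interpolant between states with a constant target velocity, and predicts by integrating the learned velocity field. A minimal unconditional sketch follows; the paper's model additionally conditions on driving actions and operates on learned scene latents.

```python
import numpy as np

def rectified_flow_target(x0, x1, t):
    """Rectified-flow training pair: the interpolant x_t = (1-t)*x0 + t*x1
    and its constant target velocity v = x1 - x0, which a network would
    regress from (x_t, t)."""
    xt = (1.0 - t) * x0 + t * x1
    v = x1 - x0
    return xt, v

def euler_integrate(x0, velocity_fn, steps=10):
    """Progressively predict the future state by integrating the learned
    velocity field with forward Euler from t=0 to t=1."""
    x, dt = x0, 1.0 / steps
    for i in range(steps):
        x = x + dt * velocity_fn(x, i * dt)
    return x
```

With a perfectly learned (constant) velocity field, Euler integration recovers the target state exactly, which is the appeal of the rectified-flow parameterization for few-step prediction.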

Legged Autonomous Surface Science In Analogue Environments (LASSIE): Making Every Robotic Step Count in Planetary Exploration

Authors:Cristina G. Wilson, Marion Nachon, Shipeng Liu, John G. Ruck, J. Diego Caporale, Benjamin E. McKeeby, Yifeng Zhang, Jordan M. Bretzfelder, John Bush, Alivia M. Eng, Ethan Fulcher, Emmy B. Hughes, Ian C. Rankin, Jelis J. Sostre Cortés, Sophie Silver, Michael R. Zanetti, Ryan C. Ewing, Kenton R. Fisher, Douglas J. Jerolmack, Daniel E. Koditschek, Frances Rivera-Hernández, Thomas F. Shipley, Feifei Qian
Date:2026-03-20 05:55:53

The ability to efficiently and effectively explore planetary surfaces is currently limited by the capability of wheeled rovers to traverse challenging terrains, and by pre-programmed data acquisition plans with limited in-situ flexibility. In this paper, we present two novel approaches to address these limitations: (i) high-mobility legged robots that use direct surface interactions to collect rich information about the terrain's mechanics to guide exploration; (ii) human-inspired data acquisition algorithms that enable robots to reason about scientific hypotheses and adapt exploration priorities based on incoming ground-sensing measurements. We successfully verify our approach through lab work and field deployments in two planetary analog environments. The new capability for legged robots to measure soil mechanical properties is shown to enable effective traversal of challenging terrains. When coupled with other geologic properties (e.g., composition, thermal properties, and grain size), soil mechanical measurements reveal key factors governing the formation and development of geologic environments. We then demonstrate how human-inspired algorithms turn terrain-sensing robots into teammates by supporting more flexible and adaptive data collection decisions with human scientists. Our approach therefore enables exploration of a wider range of planetary environments and new substrate investigation opportunities through integrated human-robot systems that support maximum scientific return.

ContractionPPO: Certified Reinforcement Learning via Differentiable Contraction Layers

Authors:Vrushabh Zinage, Narek Harutyunyan, Eric Verheyden, Fred Y. Hadaegh, Soon-Jo Chung
Date:2026-03-20 04:32:18

Legged locomotion in unstructured environments demands not only high-performance control policies but also formal guarantees to ensure robustness under perturbations. Control methods often require carefully designed reference trajectories, which are challenging to construct in high-dimensional, contact-rich systems such as quadruped robots. In contrast, Reinforcement Learning (RL) directly learns policies that implicitly generate motion, and uniquely benefits from access to privileged information, such as full state and dynamics during training, that is not available at deployment. We present ContractionPPO, a framework for certified robust planning and control of legged robots by augmenting Proximal Policy Optimization (PPO) RL with a state-dependent contraction metric layer. This approach enables the policy to maximize performance while simultaneously producing a contraction metric that certifies incremental exponential stability of the simulated closed-loop system. The metric is parameterized as a Lipschitz neural network and trained jointly with the policy, either in parallel or as an auxiliary head of the PPO backbone. While the contraction metric is not deployed during real-world execution, we derive upper bounds on the worst-case contraction rate and show that these bounds ensure the learned contraction metric generalizes from simulation to real-world deployment. Our hardware experiments on quadruped locomotion demonstrate that ContractionPPO enables robust, certifiably stable control even under strong external perturbations.

Continual Learning for Food Category Classification Dataset: Enhancing Model Adaptability and Performance

Authors:Piyush Kaushik Bhattacharyya, Devansh Tomar, Shubham Mishra, Divyanshu Rai, Yug Pratap Singh, Harsh Yadav, Krutika Verma, Vishal Meena, N Sangita Achary
Date:2026-03-20 04:03:53

Conventional machine learning pipelines often struggle to recognize categories absent from the original training set. This gap typically reduces accuracy, as fixed datasets rarely capture the full diversity of a domain. To address this, we propose a continual learning framework for text-guided food classification. Unlike approaches that require retraining from scratch, our method enables incremental updates, allowing new categories to be integrated without degrading prior knowledge. For example, a model trained on Western cuisines could later learn to classify dishes such as dosa or kimchi. Although further refinements are needed, this design shows promise for adaptive food recognition, with applications in dietary monitoring and personalized nutrition planning.

CeRLP: A Cross-embodiment Robot Local Planning Framework for Visual Navigation

Authors:Haoyu Xi, Mingao Tan, Xinming Zhang, Siwei Cheng, Shanze Wang, Yin Gu, Xiaoyu Shen, Wei Zhang
Date:2026-03-20 03:17:18

Visual navigation for cross-embodiment robots is challenging due to variations in robot and camera configurations, which can lead to the failure of navigation tasks. Previous approaches typically rely on collecting massive datasets across different robots, which is highly data-intensive, or fine-tuning models, which is time-consuming. Furthermore, both methods often lack explicit consideration of robot geometry. In this paper, we propose a Cross-embodiment Robot Local Planning (CeRLP) framework for general visual navigation, which abstracts visual information into a unified geometric formulation and applies to heterogeneous robots with varying physical dimensions, camera parameters, and camera types. CeRLP introduces a depth estimation scale correction method that utilizes offline pre-calibration to resolve the scale ambiguity of monocular depth estimation, thereby recovering precise metric depth images. Furthermore, CeRLP designs a visual-to-scan abstraction module that projects varying visual inputs into height-adaptive laser scans, making the policy robust to heterogeneous robots. Experiments in simulation environments demonstrate that CeRLP outperforms comparative methods, validating its robust obstacle avoidance capabilities as a local planner. Additionally, extensive real-world experiments verify the effectiveness of CeRLP in tasks such as point-to-point navigation and vision-language navigation, demonstrating its generalization across varying robot and camera configurations.
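A hedged sketch of what a global scale correction for monocular depth can look like: a single calibration scale factor aligning up-to-scale predictions with metric reference depth. The median-ratio estimator and function names are assumptions for illustration, not necessarily CeRLP's exact offline pre-calibration procedure.

```python
import numpy as np

def calibrate_depth_scale(est_depth, ref_depth):
    """Estimate a global scale factor aligning up-to-scale monocular
    depth with metric reference depth (e.g. from a calibration target).
    The median of per-pixel ratios is robust to outlier pixels."""
    mask = (est_depth > 0) & (ref_depth > 0)   # ignore invalid pixels
    return float(np.median(ref_depth[mask] / est_depth[mask]))

def to_metric(est_depth, scale):
    """Recover metric depth from a relative (up-to-scale) prediction."""
    return scale * est_depth
```

Once computed offline, the scale can be applied per frame at runtime, so the downstream visual-to-scan abstraction operates on metric distances.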

Wearable Foundation Models Should Go Beyond Static Encoders

Authors:Yu Yvonne Wu, Yuwei Zhang, Hyungjun Yoon, Ting Dang, Dimitris Spathis, Tong Xia, Qiang Yang, Jing Han, Dong Ma, Sung-Ju Lee, Cecilia Mascolo
Date:2026-03-20 02:11:21

Wearable foundation models (WFMs), trained on large volumes of data collected by affordable, always-on devices, have demonstrated strong performance on short-term, well-defined health monitoring tasks, including activity recognition, fitness tracking, and cardiovascular signal assessment. However, most existing WFMs primarily map short temporal windows to predefined labels via static encoders, emphasizing retrospective prediction rather than reasoning over evolving personal history, context, and future risk trajectories. As a result, they are poorly suited for modeling chronic, progressive, or episodic health conditions that unfold over weeks, months or years. Hence, we argue that WFMs must move beyond static encoders and be explicitly designed for longitudinal, anticipatory health reasoning. We identify three foundational shifts required to enable this transition: (1) Structurally rich data, which goes beyond isolated datasets or outcome-conditioned collection to integrated multimodal, long-term personal trajectories, and contextual metadata, ideally supported by open and interoperable data ecosystems; (2) Longitudinal-aware multimodal modeling, which prioritizes long-context inference, temporal abstraction, and personalization over cross-sectional or population-level prediction; and (3) Agentic inference systems, which move beyond static prediction to support planning, decision-making, and clinically grounded intervention under uncertainty. Together, these shifts reframe wearable health monitoring from retrospective signal interpretation toward continuous, anticipatory, and human-aligned health support.

Planning Autonomous Vehicle Maneuvering in Work Zones Through Game-Theoretic Trajectory Generation

Authors:Mayar Nour, Atrisha Sarkar, Mohamed H. Zaki
Date:2026-03-20 01:48:14

Work zone navigation remains one of the most challenging manoeuvres for autonomous vehicles (AVs), where constrained geometries and unpredictable traffic patterns create a high-risk environment. Despite extensive research on AV trajectory planning, few studies address the decision-making required to navigate work zones safely. This paper proposes a novel game-theoretic framework for trajectory generation and control to enhance the safety of lane changes in a work zone environment. By modelling the lane change manoeuvre as a non-cooperative game between vehicles, we use a game-theoretic planner to generate trajectories that balance safety, progress, and traffic stability. The simulation results show that the proposed game-theoretic model reduces the frequency of conflicts by 35 percent and decreases the probability of high-risk safety events compared to traditional vehicle behaviour planning models in safety-critical highway work-zone scenarios.
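The non-cooperative game at the core of such a planner can be sketched in miniature: two players (the AV and a neighbouring vehicle), two actions each, and a search for pure-strategy Nash equilibria by best-response checking. The payoff numbers below are invented for illustration and are not the paper's cost model.

```python
import itertools

# Hypothetical payoffs (AV, other vehicle) for a work-zone lane
# change: rows index the AV's action, columns the other vehicle's.
ACTIONS_AV = ["merge", "wait"]
ACTIONS_OV = ["yield", "block"]
PAYOFF = {
    ("merge", "yield"): (3, 1),    # clean merge; other loses a little time
    ("merge", "block"): (-5, -5),  # near-conflict: both heavily penalized
    ("wait",  "yield"): (0, 2),
    ("wait",  "block"): (1, 3),
}

def pure_nash(payoff):
    """Return every action pair from which neither player can
    improve by unilaterally deviating (pure-strategy Nash)."""
    eq = []
    for a, b in itertools.product(ACTIONS_AV, ACTIONS_OV):
        u_av, u_ov = payoff[(a, b)]
        best_av = all(payoff[(a2, b)][0] <= u_av for a2 in ACTIONS_AV)
        best_ov = all(payoff[(a, b2)][1] <= u_ov for b2 in ACTIONS_OV)
        if best_av and best_ov:
            eq.append((a, b))
    return eq

eqs = pure_nash(PAYOFF)
print(eqs)
```

This toy game has two equilibria (merge while the other yields, or wait while the other pushes through); a planner would pick among equilibria using the safety/progress/stability balance the abstract describes.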

Unlabeled Multi-Robot Motion Planning with Improved Separation Trade-offs

Authors:Tsuri Farhana, Omrit Filtser, Shalev Goldshtein
Date:2026-03-19 22:10:47

We study unlabeled multi-robot motion planning for unit-disk robots in a polygonal environment. Although the problem is hard in general, polynomial-time solutions exist under appropriate separation assumptions on start and target positions. Banyassady et al. (SoCG'22) guarantee feasibility in simple polygons under start--start and target--target distances of at least $4$, and start--target distances of at least $3$, but without optimality guarantees. Solovey et al. (RSS'15) provide a near-optimal solution in general polygonal domains, under stricter conditions: start/target positions must have pairwise distance at least $4$, and at least $\sqrt{5}\approx2.236$ from obstacles. This raises the question of whether polynomial-time algorithms can be obtained in even more densely packed environments. In this paper we present a generalized algorithm that achieves different trade-offs on the robots-separation and obstacles-separation bounds, all significantly improving upon the state of the art. Specifically, we obtain polynomial-time constant-approximation algorithms to minimize the total path length when (i) the robots-separation is $2\tfrac{2}{3}$ and the obstacles-separation is $1\tfrac{2}{3}$, or (ii) the robots-separation is $\approx3.291$ and the obstacles-separation $\approx1.354$. Additionally, we introduce a different strategy yielding a polynomial-time solution when the robots-separation is only $2$, and the obstacles-separation is $3$. Finally, we show that without any robots-separation assumption, obstacles-separation of at least $1.5$ may be necessary for a solution to exist.
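The separation preconditions are straightforward to check computationally. Below is a minimal sketch that verifies trade-off (i) from the abstract on a toy instance, assuming obstacles are abstracted as sample points and positions as 2-D tuples; the `check_separation` helper is ours, not the paper's.

```python
import math

def min_pairwise_dist(points):
    """Smallest distance between any two distinct points."""
    return min(math.dist(p, q)
               for i, p in enumerate(points)
               for q in points[i + 1:])

def check_separation(starts, targets, obstacle_pts, robot_sep, obstacle_sep):
    """Check the separation preconditions for unit-disk robots:
    starts pairwise and targets pairwise at least `robot_sep` apart,
    and every start/target at least `obstacle_sep` from every
    obstacle sample point."""
    ok_robots = (min_pairwise_dist(starts) >= robot_sep and
                 min_pairwise_dist(targets) >= robot_sep)
    ok_obstacles = all(math.dist(p, o) >= obstacle_sep
                       for p in starts + targets
                       for o in obstacle_pts)
    return ok_robots and ok_obstacles

starts  = [(0.0, 0.0), (3.0, 0.0)]
targets = [(0.0, 6.0), (3.0, 6.0)]
wall    = [(1.5, 3.0)]
# trade-off (i): robots-separation 2 2/3, obstacles-separation 1 2/3
feasible = check_separation(starts, targets, wall, 8 / 3, 5 / 3)
print(feasible)
```

Such a pre-check would tell a planner which of the paper's trade-off regimes (if any) an instance satisfies before invoking the corresponding algorithm.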

Process Faster, Pay Less: Functional Isolation for Stream Processing

Authors:Eleni Zapridou, Michael Koepf, Panagiotis Sioulas, Ioannis Mytilinis, Anastasia Ailamaki
Date:2026-03-19 20:18:15

Concurrent workloads often extract insights from high-throughput, real-time data streams. Existing stream processing engines isolate each query's resources, ensuring robust performance but incurring high infrastructure costs. In contrast, sharing work reduces the amount of necessary resources but introduces inter-query interference, leading to performance degradation for some queries. We introduce FunShare, a stream-processing system that improves resource efficiency without compromising performance by dynamically grouping queries based on their performance characteristics. FunShare strategically relaxes query interdependencies and minimizes redundant computation while preserving individual query performance. It achieves this by using an adaptive optimization framework that monitors execution metrics, accurately estimates computation overlaps, and reconfigures execution plans on the fly in response to changes in the underlying data streams. Our evaluation demonstrates that FunShare minimizes resource consumption compared to isolated execution while maintaining or improving throughput for all queries.
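The grouping idea can be sketched with a toy heuristic: estimate pairwise computation overlap from shared operators and place queries into groups greedily. FunShare's actual optimizer monitors runtime metrics and estimates overlaps adaptively; the Jaccard score and fixed threshold below are illustrative stand-ins.

```python
def overlap(q1_ops, q2_ops):
    """Jaccard similarity of two queries' operator sets: a crude
    stand-in for estimated computation overlap."""
    a, b = set(q1_ops), set(q2_ops)
    return len(a & b) / len(a | b)

def group_queries(queries, threshold=0.5):
    """Greedily place each query into the first group whose
    representative overlaps it above `threshold`; otherwise start
    a new (isolated) group. `queries` maps name -> operator list."""
    groups = []
    for name, ops in queries.items():
        for g in groups:
            if overlap(ops, queries[g[0]]) >= threshold:
                g.append(name)
                break
        else:
            groups.append([name])
    return groups

queries = {
    "q1": ["filter_a", "window_5m", "agg_sum"],
    "q2": ["filter_a", "window_5m", "agg_avg"],
    "q3": ["filter_b", "window_1h", "join_x"],
}
groups = group_queries(queries)
print(groups)
```

Here q1 and q2 share a filter and window and get co-executed, while q3 stays isolated, capturing the resource-saving-versus-interference trade-off the abstract describes.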

Computer-Orchestrated Design of Algorithms: From Join Specification to Implementation

Authors:Zeyuan Hu
Date:2026-03-19 19:52:09

Equipping query processing systems with provable theoretical guarantees has been a central focus at the intersection of database theory and systems in recent years. However, the divergence between theoretical abstractions and system assumptions creates a gap between an algorithm's high-level logical specification and its low-level physical implementation. Ensuring the correctness of this logical-to-physical translation is crucial for realizing theoretical optimality as practical performance gains. Existing database testing frameworks struggle to address this need because necessary algorithm-specific inputs such as join trees are absent from standard test case generation, and integrating complex algorithms into these frameworks imposes prohibitive engineering overhead. Fallback solutions, such as using macro-benchmark queries, are inherently too noisy for isolating intricate defects during this translation. In this experience paper, we present a retrospective analysis of $\mathsf{CODA}$, a computer-orchestrated testing framework utilized during the physical co-design of TreeTracker Join ($\mathsf{TTJ}$), a theoretically optimal yet practical join algorithm recently published in ACM TODS. By synthesizing minimal reproducible examples, $\mathsf{CODA}$ successfully isolates subtle translation defects, such as state mismanagement and mapping conflicts between join trees and bushy plans. We demonstrate that this logical-to-physical translation process is a bidirectional feedback loop: early structural testing not only hardened $\mathsf{TTJ}$'s physical implementation but also exposed a boundary condition that directly refined the formal precondition of $\mathsf{TTJ}$ itself. Finally, we detail how confronting these translation challenges drove the architectural evolution of $\mathsf{CODA}$ into a robust, structure-aware test generation pipeline for join-tree-dependent algorithms.
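Synthesizing a minimal reproducible example is, in spirit, a test-case reduction problem. A greedy shrink loop in the style of delta debugging (a generic sketch of the technique, not CODA's actual algorithm) looks like this:

```python
def shrink(rows, fails):
    """Greedy test-case reduction: repeatedly drop any single row
    whose removal preserves the failure, yielding a (locally)
    minimal reproducing input for the predicate `fails`."""
    rows = list(rows)
    changed = True
    while changed:
        changed = False
        for i in range(len(rows)):
            candidate = rows[:i] + rows[i + 1:]
            if fails(candidate):   # failure survives without row i
                rows = candidate
                changed = True
                break
    return rows

# hypothetical defect: the join breaks whenever keys 3 and 7 co-occur
bug = lambda rows: 3 in rows and 7 in rows
minimal = shrink([1, 2, 3, 4, 5, 6, 7, 8], bug)
print(minimal)
```

Starting from eight rows, the loop discards everything irrelevant and leaves only the two keys that trigger the defect, which is exactly the kind of small repro that makes state-mismanagement bugs tractable to debug.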

When both Grounding and not Grounding are Bad -- A Partially Grounded Encoding of Planning into SAT (Extended Version)

Authors:João Filipe, Gregor Behnke
Date:2026-03-19 19:46:49

Classical planning problems are typically defined using lifted first-order representations, which offer compactness and generality. While most planners ground these representations to simplify reasoning, this can cause an exponential blowup in size. Recent approaches instead operate directly on the lifted level to avoid full grounding. We explore a middle ground between fully lifted and fully grounded planning by introducing three SAT encodings that keep actions lifted while partially grounding predicates. Unlike previous SAT encodings, which scale quadratically with plan length, our approach scales linearly, enabling better performance on longer plans. Empirically, our best encoding outperforms the state of the art in length-optimal planning on hard-to-ground domains.
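A back-of-envelope variable count shows why partial grounding pays off. Assume a toy domain with one lifted action `move(?r, ?from, ?to)`, log-encoded argument bits for the lifted action variables, and fully grounded `at(?r, ?loc)` atoms; all numbers and the encoding scheme are illustrative assumptions, not the paper's exact encodings.

```python
import math

# Toy STRIPS-style domain: one lifted action move(?r, ?from, ?to)
# over R robots and L locations, plus a predicate at(?r, ?loc).
R, L, T = 5, 20, 10          # robots, locations, plan length

# Fully grounded: one Boolean per ground action per time step.
ground_action_vars = R * L * L * T

# Partially grounded sketch: actions stay lifted (one variable per
# time step plus log-encoded argument bits), predicates are grounded
# (one Boolean per ground atom per time step, including step 0).
arg_bits = math.ceil(math.log2(R)) + 2 * math.ceil(math.log2(L))
lifted_action_vars = T * (1 + arg_bits)
ground_pred_vars = R * L * (T + 1)

print(ground_action_vars, lifted_action_vars + ground_pred_vars)
```

Even in this tiny domain the fully grounded action variables dominate by an order of magnitude, which is the "hard-to-ground" regime where keeping actions lifted helps most.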

A Closed-Form CLF-CBF Controller for Whole-Body Continuum Soft Robot Collision Avoidance

Authors:Kiwan Wong, Maximillian Stölzle, Wei Xiao, Daniela Rus
Date:2026-03-19 19:34:49

Safe operation is essential for deploying robots in human-centered 3D environments. Soft continuum manipulators provide passive safety through mechanical compliance, but still require active control to achieve reliable collision avoidance. Existing approaches, such as sampling-based planning, are often computationally expensive and lack formal safety guarantees, which limits their use for real-time whole-body avoidance. This paper presents a closed-form Control Lyapunov Function--Control Barrier Function (CLF--CBF) controller for real-time 3D obstacle avoidance in soft continuum manipulators without online optimization. By analytically embedding safety constraints into the control input, the proposed method ensures stability and safety under the stated modeling assumptions, while avoiding feasibility issues commonly encountered in online optimization-based methods. The resulting controller is up to $10\times$ faster than standard CLF--CBF quadratic-programming approaches and up to $100\times$ faster than traditional sampling-based planners. Simulation and hardware experiments on a tendon-driven soft manipulator demonstrate accurate 3D trajectory tracking and robust obstacle avoidance in cluttered environments. These results show that the proposed framework provides a scalable and provably safe control strategy for soft robots operating in dynamic, safety-critical settings.
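The closed-form idea can be seen in the simplest setting: a planar single integrator with one circular obstacle, where the single-constraint CBF quadratic program reduces to an analytic half-space projection. This is a generic single-constraint CBF safety filter for illustration, not the paper's whole-body soft-robot controller (which also incorporates a CLF for stability).

```python
def cbf_filter(x, u_nom, obstacle, radius, alpha=1.0):
    """Closed-form safety filter for a planar single integrator
    x' = u. Barrier h(x) = ||x - o||^2 - r^2; safety requires
    grad_h . u >= -alpha * h. With a single constraint, the CBF
    quadratic program has the analytic solution: project u_nom
    onto the safe half-space (no online optimization needed)."""
    gx = 2.0 * (x[0] - obstacle[0])          # gradient of h
    gy = 2.0 * (x[1] - obstacle[1])
    h = (x[0] - obstacle[0]) ** 2 + (x[1] - obstacle[1]) ** 2 - radius ** 2
    slack = gx * u_nom[0] + gy * u_nom[1] + alpha * h
    if slack >= 0.0:
        return u_nom                          # nominal input already safe
    lam = slack / (gx * gx + gy * gy)         # minimal-norm correction
    return (u_nom[0] - lam * gx, u_nom[1] - lam * gy)

# heading straight at a unit-radius obstacle centered at the origin
u = cbf_filter((2.0, 0.0), (-1.0, 0.0), (0.0, 0.0), 1.0)
print(u)
```

The filter returns the nominal input untouched whenever the barrier constraint already holds, and otherwise applies the smallest correction that restores it; the avoidance of any online QP solve is what yields the speedups the abstract reports.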

Speculative Policy Orchestration: A Latency-Resilient Framework for Cloud-Robotic Manipulation

Authors:Chanh Nguyen, Shutong Jin, Florian T. Pokorny, Erik Elmroth
Date:2026-03-19 19:24:14

Cloud robotics enables robots to offload high-dimensional motion planning and reasoning to remote servers. However, for continuous manipulation tasks requiring high-frequency control, network latency and jitter can severely destabilize the system, causing command starvation and unsafe physical execution. To address this, we propose Speculative Policy Orchestration (SPO), a latency-resilient cloud-edge framework. SPO utilizes a cloud-hosted world model to pre-compute and stream future kinematic waypoints to a local edge buffer, decoupling execution frequency from network round-trip time. To mitigate unsafe execution caused by predictive drift, the edge node employs an $ε$-tube verifier that strictly bounds kinematic execution errors. The framework is coupled with an Adaptive Horizon Scaling mechanism that dynamically expands or shrinks the speculative pre-fetch depth based on real-time tracking error. We evaluate SPO on continuous RLBench manipulation tasks under emulated network delays. Results show that even when deployed with learned models of modest accuracy, SPO reduces network-induced idle time by over 60% compared to blocking remote inference. Furthermore, SPO discards approximately 60% fewer cloud predictions than static caching baselines. Ultimately, SPO enables fluid, real-time cloud-robotic control while maintaining bounded physical safety.
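The edge-side logic, consuming buffered waypoints while the ε-tube holds and scaling the speculative horizon with observed error, can be sketched as follows. The 1-D trajectories, the doubling/halving rule, and the `run_edge_buffer` helper are illustrative assumptions, not SPO's exact mechanism.

```python
def run_edge_buffer(predicted, actual, eps=0.1,
                    horizon=2, h_min=1, h_max=8):
    """Toy ε-tube verifier with adaptive horizon scaling: consume
    up to `horizon` buffered waypoints per round, and stop (falling
    back to blocking remote inference) once the deviation between
    predicted and actual trajectories exceeds eps. The horizon
    doubles after each successful round and halves on a violation."""
    executed, i = [], 0
    while i < len(predicted):
        for j in range(i, min(i + horizon, len(predicted))):
            if abs(predicted[j] - actual[j]) > eps:
                return executed, max(h_min, horizon // 2)  # shrink on drift
            executed.append(predicted[j])
        i += horizon
        horizon = min(h_max, horizon * 2)                  # grow on success
    return executed, horizon

pred   = [0.0, 0.1, 0.2, 0.3, 0.9]
actual = [0.0, 0.1, 0.2, 0.3, 0.4]   # prediction drifts at the last step
done, h = run_edge_buffer(pred, actual)
print(done, h)
```

All in-tube waypoints execute without waiting on the network, and the first out-of-tube prediction is discarded rather than executed, which is the bounded-safety behaviour the abstract claims.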

A Discovery Plan for Pharmacy Benefit Managers Collusion

Authors:Lawrence W. Abrams
Date:2026-03-19 19:08:54

The Federal Trade Commission has recently filed an administrative complaint against the Big 3 pharmacy benefit managers claiming they engaged in unfair conduct in violation of Section 5 of the FTC Act. They never used the word collusion in the complaint and chose not to sue under the Sherman Act, Section 1. We view this as a novel case of market design collusion rather than a case of price collusion. The Big 3 PBMs are conceptualized as auctioneers soliciting rebate bids off unit list prices in exchange for favored positions on formularies. We will show how the fairness standard of the FTC Act can be made operational by judging fairness against economic theories of good auction design. Discovery is focused on finding explicit communication among the Big 3 PBMs in 2012 to change the so-called winner's determination equation of this auction, adding high gross rebates as a basis for formulary position assignments. On the other hand, we will argue that a case based on a bevy of anecdotes comparing only net unit prices will fail due to complexities in the winner's determination equation.
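The alleged change to the winner's determination rule can be made concrete with a toy scoring function: with zero weight on gross rebates the auction selects on net unit price alone, while a positive weight flips the outcome toward high-list, high-rebate bids. All prices, weights, and the `formulary_ranking` helper here are hypothetical illustrations, not figures from the complaint.

```python
def formulary_ranking(bids, rebate_weight=0.0):
    """Rank drugs for a preferred formulary slot by a score that
    mixes low net unit price with high gross rebate. A weight of 0
    is a pure net-price auction; a positive weight tilts the
    winner's determination toward high-list, high-rebate bids."""
    def score(bid):
        net = bid["list_price"] - bid["rebate"]
        return -net + rebate_weight * bid["rebate"]
    return sorted(bids, key=score, reverse=True)

bids = [
    {"drug": "A", "list_price": 100.0, "rebate": 60.0},  # net price 40
    {"drug": "B", "list_price": 35.0,  "rebate": 0.0},   # net price 35
]
net_winner    = formulary_ranking(bids)[0]["drug"]       # pure net-price rule
tilted_winner = formulary_ranking(bids, 0.2)[0]["drug"]  # rebate-weighted rule
print(net_winner, tilted_winner)
```

The same pair of bids produces different winners under the two rules, which is why comparisons based only on net unit prices can mislead: the ranking depends on the full determination equation, not the net price alone.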

Bridging Semantic and Kinematic Conditions with Diffusion-based Discrete Motion Tokenizer

Authors:Chenyang Gu, Mingyuan Zhang, Haozhe Xie, Zhongang Cai, Lei Yang, Ziwei Liu
Date:2026-03-19 17:59:51

Prior motion generation largely follows two paradigms: continuous diffusion models that excel at kinematic control, and discrete token-based generators that are effective for semantic conditioning. To combine their strengths, we propose a three-stage framework comprising condition feature extraction (Perception), discrete token generation (Planning), and diffusion-based motion synthesis (Control). Central to this framework is MoTok, a diffusion-based discrete motion tokenizer that decouples semantic abstraction from fine-grained reconstruction by delegating motion recovery to a diffusion decoder, enabling compact single-layer tokens while preserving motion fidelity. For kinematic conditions, coarse constraints guide token generation during planning, while fine-grained constraints are enforced during control through diffusion-based optimization. This design prevents kinematic details from disrupting semantic token planning. On HumanML3D, our method significantly improves controllability and fidelity over MaskControl while using only one-sixth of the tokens, reducing trajectory error from 0.72 cm to 0.08 cm and FID from 0.083 to 0.029. Unlike prior methods that degrade under stronger kinematic constraints, ours improves fidelity, reducing FID from 0.033 to 0.014.