planning - 2026-01-19

Generative Scenario Rollouts for End-to-End Autonomous Driving

Authors:Rajeev Yasarla, Deepti Hegde, Shizhong Han, Hsin-Pai Cheng, Yunxiao Shi, Meysam Sadeghigooghari, Shweta Mahajan, Apratim Bhattacharyya, Litian Liu, Risheek Garrepalli, Thomas Svantesson, Fatih Porikli, Hong Cai

Date:2026-01-16 17:59:28

Vision-Language-Action (VLA) models are emerging as highly effective planning models for end-to-end autonomous driving systems. However, current works mostly rely on imitation learning from sparse trajectory annotations and under-utilize their potential as generative models. We propose Generative Scenario Rollouts (GeRo), a plug-and-play framework for VLA models that jointly performs planning and generation of language-grounded future traffic scenes through an autoregressive rollout strategy. First, a VLA model is trained to encode ego vehicle and agent dynamics into latent tokens under supervision from planning, motion, and language tasks, facilitating text-aligned generation. Next, GeRo performs language-conditioned autoregressive generation. Given multi-view images, a scenario description, and ego-action questions, it generates future latent tokens and textual responses to guide long-horizon rollouts. A rollout-consistency loss stabilizes predictions using ground truth or pseudo-labels, mitigating drift and preserving text-action alignment. This design enables GeRo to perform temporally consistent, language-grounded rollouts that support long-horizon reasoning and multi-agent planning. On Bench2Drive, GeRo improves driving score and success rate by +15.7 and +26.2, respectively. By integrating reinforcement learning with generative rollouts, GeRo achieves state-of-the-art closed-loop and open-loop performance, demonstrating strong zero-shot robustness. These results highlight the promise of generative, language-conditioned reasoning as a foundation for safer and more interpretable end-to-end autonomous driving.

Vision-as-Inverse-Graphics Agent via Interleaved Multimodal Reasoning

Authors:Shaofeng Yin, Jiaxin Ge, Zora Zhiruo Wang, Xiuyu Li, Michael J. Black, Trevor Darrell, Angjoo Kanazawa, Haiwen Feng

Date:2026-01-16 09:11:55

Vision-as-inverse-graphics, the concept of reconstructing an image as an editable graphics program is a long-standing goal of computer vision. Yet even strong VLMs aren't able to achieve this in one-shot as they lack fine-grained spatial and physical grounding capability. Our key insight is that closing this gap requires interleaved multimodal reasoning through iterative execution and verification. Stemming from this, we present VIGA (Vision-as-Inverse-Graphic Agent) that starts from an empty world and reconstructs or edits scenes through a closed-loop write-run-render-compare-revise procedure. To support long-horizon reasoning, VIGA combines (i) a skill library that alternates generator and verifier roles and (ii) an evolving context memory that contains plans, code diffs, and render history. VIGA is task-agnostic as it doesn't require auxiliary modules, covering a wide range of tasks such as 3D reconstruction, multi-step scene editing, 4D physical interaction, and 2D document editing, etc. Empirically, we found VIGA substantially improves one-shot baselines on BlenderGym (35.32%) and SlideBench (117.17%). Moreover, VIGA is also model-agnostic as it doesn't require finetuning, enabling a unified protocol to evaluate heterogeneous foundation VLMs. To better support this protocol, we introduce BlenderBench, a challenging benchmark that stress-tests interleaved multimodal reasoning with graphics engine, where VIGA improves by 124.70%.

Modular and Mobile Capacity Planning for Hyperconnected Supply Chain Networks

Authors:Xiaoyue Liu, Walid Klibi, Benoit Montreuil

Date:2026-01-16 09:08:47

The increased volatility of markets and the pressing need for resource sustainability are driving supply chains towards more agile, distributed, and dynamic designs. Motivated by the Physical Internet initiative, we introduce the Dynamic Stochastic Modular and Mobile Capacity Planning (DSMMCP) problem, which fosters hyperconnectivity through a network-of-networks architecture with modular and mobile capacities. The problem addresses both demand and supply uncertainties by incorporating short-term leasing of modular facilities and dynamic relocation of resources. We formulate DSMMCP as a partially adaptive multi-stage stochastic program that minimizes the expected multi-period costs under uncertainty. To tackle the inherent NP-hardness, we develop an enhanced stochastic dual dynamic integer programming (SDDiP) algorithm, which integrates strengthened cut generation, a tailored alternating cut strategy, and an efficient parallelization framework, and we establish structural dominance and monotonicity properties that formalize the value of the strengthened cuts and partial adaptivity. Numerical experiments inspired by a real case study of a large U.S. construction company demonstrate that the DSMMCP framework achieves approximately 15% cost savings over static planning while improving resilience, reducing outsourcing costs, and supporting sustainability. Complementary experiments on synthetic instances confirm the effectiveness of the proposed SDDiP algorithm in terms of solution quality and runtime, as well as the scalability and robustness of the partially adaptive stochastic modeling framework across different network sizes and uncertainty levels.

MiCA: A Mobility-Informed Causal Adapter for Lightweight Epidemic Forecasting

Authors:Suhan Guo, Jiahong Deng, Furao Shen

Date:2026-01-16 08:41:06

Accurate forecasting of infectious disease dynamics is critical for public health planning and intervention. Human mobility plays a central role in shaping the spatial spread of epidemics, but mobility data are noisy, indirect, and difficult to integrate reliably with disease records. Meanwhile, epidemic case time series are typically short and reported at coarse temporal resolution. These conditions limit the effectiveness of parameter-heavy mobility-aware forecasters that rely on clean and abundant data. In this work, we propose the Mobility-Informed Causal Adapter (MiCA), a lightweight and architecture-agnostic module for epidemic forecasting. MiCA infers mobility relations through causal discovery and integrates them into temporal forecasting models via gated residual mixing. This design allows lightweight forecasters to selectively exploit mobility-derived spatial structure while remaining robust under noisy and data-limited conditions, without introducing heavy relational components such as graph neural networks or full attention. Extensive experiments on four real-world epidemic datasets, including COVID-19 incidence, COVID-19 mortality, influenza, and dengue, show that MiCA consistently improves lightweight temporal backbones, achieving an average relative error reduction of 7.5\% across forecasting horizons. Moreover, MiCA attains performance competitive with SOTA spatio-temporal models while remaining lightweight.

Contextual Distributionally Robust Optimization with Causal and Continuous Structure: An Interpretable and Tractable Approach

Authors:Fenglin Zhang, Jie Wang

Date:2026-01-16 06:18:22

In this paper, we introduce a framework for contextual distributionally robust optimization (DRO) that considers the causal and continuous structure of the underlying distribution by developing interpretable and tractable decision rules that prescribe decisions using covariates. We first introduce the causal Sinkhorn discrepancy (CSD), an entropy-regularized causal Wasserstein distance that encourages continuous transport plans while preserving the causal consistency. We then formulate a contextual DRO model with a CSD-based ambiguity set, termed Causal Sinkhorn DRO (Causal-SDRO), and derive its strong dual reformulation where the worst-case distribution is characterized as a mixture of Gibbs distributions. To solve the corresponding infinite-dimensional policy optimization, we propose the Soft Regression Forest (SRF) decision rule, which approximates optimal policies within arbitrary measurable function spaces. The SRF preserves the interpretability of classical decision trees while being fully parametric, differentiable, and Lipschitz smooth, enabling intrinsic interpretation from both global and local perspectives. To solve the Causal-SDRO with parametric decision rules, we develop an efficient stochastic compositional gradient algorithm that converges to an $\varepsilon$-stationary point at a rate of $O(\varepsilon^{-4})$, matching the convergence rate of standard stochastic gradient descent. Finally, we validate our method through numerical experiments on synthetic and real-world datasets, demonstrating its superior performance and interpretability.

The Dynamic Team Orienteering Problem in Spatial Crowdsourcing: A Scenario Sampling Approach

Authors:Zhibin Wu, Songhao Shen, Yufeng Zhou, Qin Lei

Date:2026-01-16 05:47:04

In services such as retail audits and urban infrastructure monitoring, a platform dispatches rewarded, location-based micro-tasks to mobile workers traveling along personal origin-destination (OD) trips under hard time budgets. As requests with time constraints arrive online over a finite horizon, the platform must decide which requests to accept and how to route workers to maximize collected profit. We model this setting as the Dynamic Team Orienteering Problem in Spatial Crowdsourcing (DTOP-SC). To solve this problem, we propose a scenario-sampling rolling-horizon framework that mitigates myopic bias by augmenting each planning epoch with sampled virtual tasks. At each epoch, the augmented task set defines a deterministic static subproblem solved via an adaptive large neighborhood search (ALNS). We also formulate a mixed-integer programming model to provide offline reference solutions. Computational experiments are conducted on synthetic DTOP-SC instances generated from real-world road-map coordinates and on a dynamic team orienteering (DTOP) benchmark. On the map-based instances, the proposed policy exhibits stable gaps with respect to time-limited MIP solutions across the tested scales, while maintaining smooth computational scalability as the problem size increases. On the DTOP benchmark, the policy achieves an average decision time of 0.14s per instance, with 192-198s reported for multiple plan approach as an indicative reference, while maintaining competitive profit.

Analyzing Residential Speeding Using Connected Vehicle Data: A Case Study in Charlottesville, VA Area

Authors:Shi Feng, B. Brian Park, Andrew Mondschein

Date:2026-01-16 03:37:23

This study uses connected vehicle data to analyze speeding behavior on residential roads. A scalable pipeline processes trajectory data and supplements missing speed limits to generate summaries at OpenStreetMap's way ID level. The findings reveal a highly skewed distribution of both aggressive and reckless speeding. Based on a case study of Charlottesville, VA's connected vehicle data on residential roads, we found that 38% of segments had at least one instance of aggressive speeding, and 20% had at least one instance of reckless speeding. In addition, night time speeding is 27 times more prevalent than day time, and extreme violations on specific road segments highlight how severe the issue can be. Several segments rank among the top 10 for both aggressive and reckless speedings, indicating that there exist high-risk residential roads. These findings support the need for both spatial and behavioral interventions. The analysis provides a rich foundation for policy and planning, offering a valuable complement to traditional enforcement and planning tools. In conclusion, this framework sets the foundation for future applications in traffic safety analytics, demonstrating the growing potential of telematics data to inform safer, more livable communities.

Balancing Economic Cost and Disease Impact: Optimization Models for Wolbachia-Based Dengue Control

Authors:Kyrho Corum, Renier Mendoza, Victoria May P. Mendoza, Arrianne Crystal Velasco

Date:2026-01-16 03:19:56

Dengue, which affects millions of people each year, is one of the most common diseases transmitted by infected \textit{Aedes aegypti} mosquitoes. In the Philippines, the annual economic cost of dengue infections is estimated at around PHP 17 billion. Previous studies have shown that controlling the population of mosquitoes capable of transmitting the dengue virus can effectively reduce dengue infection rates. This study explores the use of Wolbachia as a strategy for dengue control by targeting mosquitoes. Since the release of Wolbachia-infected mosquitoes involves substantial costs, careful planning is necessary to balance disease control with the associated economic burden. To address this, we propose a mathematical model that captures the dynamics of releasing Wolbachia-carrying mosquitoes and the transmission of dengue in a population. We formulate single- and multi-objective optimization frameworks to minimize the economic costs associated with releasing Wolbachia-infected mosquitoes and the hospitalization costs resulting from dengue infections. This study aims to provide insights into the practical application of Wolbachia-based interventions for controlling dengue transmission. While the analysis is grounded in the Philippine context, the approach is general enough to be applicable to other dengue-endemic countries.

Sparse Data Tree Canopy Segmentation: Fine-Tuning Leading Pretrained Models on Only 150 Images

Authors:David Szczecina, Hudson Sun, Anthony Bertnyk, Niloofar Azad, Kyle Gao, Lincoln Linlin Xu

Date:2026-01-16 01:20:32

Tree canopy detection from aerial imagery is an important task for environmental monitoring, urban planning, and ecosystem analysis. Simulating real-life data annotation scarcity, the Solafune Tree Canopy Detection competition provides a small and imbalanced dataset of only 150 annotated images, posing significant challenges for training deep models without severe overfitting. In this work, we evaluate five representative architectures, YOLOv11, Mask R-CNN, DeepLabv3, Swin-UNet, and DINOv2, to assess their suitability for canopy segmentation under extreme data scarcity. Our experiments show that pretrained convolution-based models, particularly YOLOv11 and Mask R-CNN, generalize significantly better than pretrained transformer-based models. DeeplabV3, Swin-UNet and DINOv2 underperform likely due to differences between semantic and instance segmentation tasks, the high data requirements of Vision Transformers, and the lack of strong inductive biases. These findings confirm that transformer-based architectures struggle in low-data regimes without substantial pretraining or augmentation and that differences between semantic and instance segmentation further affect model performance. We provide a detailed analysis of training strategies, augmentation policies, and model behavior under the small-data constraint and demonstrate that lightweight CNN-based methods remain the most reliable for canopy detection on limited imagery.

Where to Touch, How to Contact: Hierarchical RL-MPC Framework for Geometry-Aware Long-Horizon Dexterous Manipulation

Authors:Zhixian Xie, Yu Xiang, Michael Posa, Wanxin Jin

Date:2026-01-16 01:20:15

A key challenge in contact-rich dexterous manipulation is the need to jointly reason over geometry, kinematic constraints, and intricate, nonsmooth contact dynamics. End-to-end visuomotor policies bypass this structure, but often require large amounts of data, transfer poorly from simulation to reality, and generalize weakly across tasks/embodiments. We address those limitations by leveraging a simple insight: dexterous manipulation is inherently hierarchical - at a high level, a robot decides where to touch (geometry) and move the object (kinematics); at a low level it determines how to realize that plan through contact dynamics. Building on this insight, we propose a hierarchical RL--MPC framework in which a high-level reinforcement learning (RL) policy predicts a contact intention, a novel object-centric interface that specifies (i) an object-surface contact location and (ii) a post-contact object-level subgoal pose. Conditioned on this contact intention, a low-level contact-implicit model predictive control (MPC) optimizes local contact modes and replans with contact dynamics to generate robot actions that robustly drive the object toward each subgoal. We evaluate the framework on non-prehensile tasks, including geometry-generalized pushing and object 3D reorientation. It achieves near-100% success with substantially reduced data (10x less than end-to-end baselines), highly robust performance, and zero-shot sim-to-real transfer.

ARC Prize 2025: Technical Report

Authors:François Chollet, Mike Knoop, Gregory Kamradt, Bryan Landers

Date:2026-01-15 23:23:56

The ARC-AGI benchmark series serves as a critical measure of few-shot generalization on novel tasks, a core aspect of intelligence. The ARC Prize 2025 global competition targeted the newly released ARC-AGI-2 dataset, which features greater task complexity compared to its predecessor. The Kaggle competition attracted 1,455 teams and 15,154 entries, with the top score reaching 24% on the ARC-AGI-2 private evaluation set. Paper submissions nearly doubled year-over-year to 90 entries, reflecting the growing research interest in fluid intelligence and abstract reasoning. The defining theme of 2025 is the emergence of the refinement loop -- a per-task iterative program optimization loop guided by a feedback signal. Refinement loops come in a variety of forms, in particular evolutionary program synthesis approaches and application-layer refinements to commercial AI systems. Such refinement loops are also possible in weight space, as evidenced by zero-pretraining deep learning methods which are now achieving competitive performance with remarkably small networks (7M parameters). In parallel, four frontier AI labs (Anthropic, Google DeepMind, OpenAI, and xAI) reported ARC-AGI performance in public model cards in 2025, establishing ARC-AGI as an industry standard benchmark for AI reasoning. However, our analysis indicates that current frontier AI reasoning performance remains fundamentally constrained to knowledge coverage, giving rise to new forms of benchmark contamination. In this paper, we survey the top-performing methods, examine the role of refinement loops in AGI progress, discuss knowledge-dependent overfitting, and preview ARC-AGI-3, which introduces interactive reasoning challenges that require exploration, planning, memory, goal acquisition, and alignment capabilities.

Cooperative UAVs for Remote Data Collection under Limited Communications: An Asynchronous Multiagent Learning Framework

Authors:Cuong Le, Symeon Chatzinotas, Thang X. Vu

Date:2026-01-15 20:33:27

This paper addresses the joint optimization of trajectories and bandwidth allocation for multiple Unmanned Aerial Vehicles (UAVs) to enhance energy efficiency in the cooperative data collection problem. We focus on an important yet underestimated aspect of the system, where action synchronization across all UAVs is impossible. Since most existing learning-based solutions are not designed to learn in this asynchronous environment, we formulate the trajectory planning problem as a Decentralized Partially Observable Semi-Markov Decision Process and introduce an asynchronous multi-agent learning algorithm to learn UAVs' cooperative policies. Once the UAVs' trajectory policies are learned, the bandwidth allocation can be optimally solved based on local observations at each collection point. Comprehensive empirical results demonstrate the superiority of the proposed method over other learning-based and heuristic baselines in terms of both energy efficiency and mission completion time. Additionally, the learned policies exhibit robustness under varying environmental conditions.

Approximately Optimal Global Planning for Contact-Rich SE(2) Manipulation on a Graph of Reachable Sets

Authors:Simin Liu, Tong Zhao, Bernhard Paus Graesdal, Peter Werner, Jiuguang Wang, John Dolan, Changliu Liu, Tao Pang

Date:2026-01-15 20:00:30

If we consider human manipulation, it is clear that contact-rich manipulation (CRM)-the ability to use any surface of the manipulator to make contact with objects-can be far more efficient and natural than relying solely on end-effectors (i.e., fingertips). However, state-of-the-art model-based planners for CRM are still focused on feasibility rather than optimality, limiting their ability to fully exploit CRM's advantages. We introduce a new paradigm that computes approximately optimal manipulator plans. This approach has two phases. Offline, we construct a graph of mutual reachable sets, where each set contains all object orientations reachable from a starting object orientation and grasp. Online, we plan over this graph, effectively computing and sequencing local plans for globally optimized motion. On a challenging, representative contact-rich task, our approach outperforms a leading planner, reducing task cost by 61%. It also achieves a 91% success rate across 250 queries and maintains sub-minute query times, ultimately demonstrating that globally optimized contact-rich manipulation is now practical for real-world tasks.

DeepUrban: Interaction-Aware Trajectory Prediction and Planning for Automated Driving by Aerial Imagery

Authors:Constantin Selzer, Fabian B. Flohr

Date:2026-01-15 16:18:42

The efficacy of autonomous driving systems hinges critically on robust prediction and planning capabilities. However, current benchmarks are impeded by a notable scarcity of scenarios featuring dense traffic, which is essential for understanding and modeling complex interactions among road users. To address this gap, we collaborated with our industrial partner, DeepScenario, to develop DeepUrban-a new drone dataset designed to enhance trajectory prediction and planning benchmarks focusing on dense urban settings. DeepUrban provides a rich collection of 3D traffic objects, extracted from high-resolution images captured over urban intersections at approximately 100 meters altitude. The dataset is further enriched with comprehensive map and scene information to support advanced modeling and simulation tasks. We evaluate state-of-the-art (SOTA) prediction and planning methods, and conducted experiments on generalization capabilities. Our findings demonstrate that adding DeepUrban to nuScenes can boost the accuracy of vehicle predictions and planning, achieving improvements up to 44.1 % / 44.3% on the ADE / FDE metrics. Website: https://iv.ee.hm.edu/deepurban

SatMap: Revisiting Satellite Maps as Prior for Online HD Map Construction

Authors:Kanak Mazumder, Fabian B. Flohr

Date:2026-01-15 15:39:27

Online high-definition (HD) map construction is an essential part of a safe and robust end-to-end autonomous driving (AD) pipeline. Onboard camera-based approaches suffer from limited depth perception and degraded accuracy due to occlusion. In this work, we propose SatMap, an online vectorized HD map estimation method that integrates satellite maps with multi-view camera observations and directly predicts a vectorized HD map for downstream prediction and planning modules. Our method leverages lane-level semantics and texture from satellite imagery captured from a Bird's Eye View (BEV) perspective as a global prior, effectively mitigating depth ambiguity and occlusion. In our experiments on the nuScenes dataset, SatMap achieves 34.8% mAP performance improvement over the camera-only baseline and 8.5% mAP improvement over the camera-LiDAR fusion baseline. Moreover, we evaluate our model in long-range and adverse weather conditions to demonstrate the advantages of using a satellite prior map. Source code will be available at https://iv.ee.hm.edu/satmap/.

CHORAL: Traversal-Aware Planning for Safe and Efficient Heterogeneous Multi-Robot Routing

Authors:David Morilla-Cabello, Eduardo Montijano

Date:2026-01-15 12:34:22

Monitoring large, unknown, and complex environments with autonomous robots poses significant navigation challenges, where deploying teams of heterogeneous robots with complementary capabilities can substantially improve both mission performance and feasibility. However, effectively modeling how different robotic platforms interact with the environment requires rich, semantic scene understanding. Despite this, existing approaches often assume homogeneous robot teams or focus on discrete task compatibility rather than continuous routing. Consequently, scene understanding is not fully integrated into routing decisions, limiting their ability to adapt to the environment and to leverage each robot's strengths. In this paper, we propose an integrated semantic-aware framework for coordinating heterogeneous robots. Starting from a reconnaissance flight, we build a metric-semantic map using open-vocabulary vision models and use it to identify regions requiring closer inspection and capability-aware paths for each platform to reach them. These are then incorporated into a heterogeneous vehicle routing formulation that jointly assigns inspection tasks and computes robot trajectories. Experiments in simulation and in a real inspection mission with three robotic platforms demonstrate the effectiveness of our approach in planning safer and more efficient routes by explicitly accounting for each platform's navigation capabilities. We release our framework, CHORAL, as open source to support reproducibility and deployment of diverse robot teams.

Terrain-Adaptive Mobile 3D Printing with Hierarchical Control

Authors:Shuangshan Nors Li, J. Nathan Kutz

Date:2026-01-15 09:15:06

Mobile 3D printing on unstructured terrain remains challenging due to the conflict between platform mobility and deposition precision. Existing gantry-based systems achieve high accuracy but lack mobility, while mobile platforms struggle to maintain print quality on uneven ground. We present a framework that tightly integrates AI-driven disturbance prediction with multi-modal sensor fusion and hierarchical hardware control, forming a closed-loop perception-learning-actuation system. The AI module learns terrain-to-perturbation mappings from IMU, vision, and depth sensors, enabling proactive compensation rather than reactive correction. This intelligence is embedded into a three-layer control architecture: path planning, predictive chassis-manipulator coordination, and precision hardware execution. Through outdoor experiments on terrain with slopes and surface irregularities, we demonstrate sub-centimeter printing accuracy while maintaining full platform mobility. This AI-hardware integration establishes a practical foundation for autonomous construction in unstructured environments.

Service Provisioning and Path Planning with Obstacle Avoidance for Low-Altitude Wireless Networks

Authors:Senning Wan, Bin Li, Hongbin Chen, Lei Liu

Date:2026-01-15 08:39:37

This paper investigates the three-dimensional (3D) deployment of uncrewed aerial vehicles (UAVs) as aerial base stations in heterogeneous communication networks under constraints imposed by diverse ground obstacles. Given the diverse data demands of user equipments (UEs), a user satisfaction model is developed to provide personalized services. In particular, when a UE is located within a ground obstacle, the UAV must approach the obstacle boundary to ensure reliable service quality. Considering constraints such as UAV failures due to battery depletion, heterogeneous UEs, and obstacles, we aim to maximize overall user satisfaction by jointly optimizing the 3D trajectories of UAVs, transmit beamforming vectors, and binary association indicators between UAVs and UEs. To address the complexity and dynamics of the problem, a block coordinate descent method is adopted to decompose it into two subproblems. The beamforming subproblem is efficiently addressed via a bisection-based water-filling algorithm. For the trajectory and association subproblem, we design a deep reinforcement learning algorithm based on proximal policy optimization to learn an adaptive control policy. Simulation results demonstrate that the proposed scheme outperforms baseline schemes in terms of convergence speed and overall system performance. Moreover, it achieves efficient association and accurate obstacle avoidance.

HyMGP: A Customized MILP-Based Tool for Techno-Economic Planning of Islanded Microgrids

Authors:Andres Intriago, Rongxing Hu, Nabil Mohammed, S. Gokul Krishnan, Konstantinos Kotsovos, Issam Gereige, Nesren Attiah, Ali Basaheeh, Sarah Aqeel, Hamad A. Saiari, Shehab Ahmed, Charalambos Konstantinou

Date:2026-01-15 08:38:35

This paper presents a customized microgrid planning algorithm and tool, HyMGP, for remote sites in arid regions, which is formulated as a Mixed Integer Linear Programming (MILP) problem. HyMGP is compared with HOMER Pro to evaluate its performance in optimizing the sizing of microgrid components, including photovoltaic panels (PVs), vertical axis wind turbines (VAWTs), and battery energy storage systems (BESS), for remote and off-grid applications. The study focuses on a standalone microgrid in the Saudi Arabia, considering high solar irradiance, limited wind availability, and a constant load profile composed of continuous cathodic protection and daytime cooling. In the simulation environment, comparisons with HOMER solutions demonstrate the advantages of HyMGP, which provides optimal and more flexible solutions by allowing user-defined component specifications and strictly enforcing all constraints. Further analysis shows that incorporating wind turbines reduces the Net Present Cost (NPC) by decreasing the required PV and battery capacities. Increasing battery autonomy leads to a higher NPC in both PV-only and hybrid systems due to the need for larger storage. Finally, lithium iron phosphate (Li-ion LFP) batteries are found to be more cost effective than lead acid, offering lower NPCs due to their longer lifespan, deeper discharge capability, and fewer replacement cycles.

Warm Hybrid Axion Inflation in $α$-Attractor Models Constrained by ACT and Future Plan experiments

Authors:Waqas Ahmed, Waqar Ahmad, Ahsan Illahi, M. Junaid

Date:2026-01-15 07:39:43

We present a comprehensive study of warm hybrid inflation within the framework of $α$-attractor models, where an axionic inflaton is coupled to a waterfall field in the presence of thermal dissipation. The model is analyzed for both linear ($Υ\propto T$) and cubic ($Υ\propto T^{3}$) dissipation regimes. Confronting the theoretical predictions with the latest observational data from Planck+BICEP/Keck, P-ACT-LB-BK18 and SPT, and , we find that in the weak dissipative regime ($Q_{*} \lesssim 10^{-5}$), the scalar spectral index $n_{s} \simeq 0.965$ lies at the boundary of the combined P-ACT-LB-BK18 constraints, while the tensor-to-scalar ratio $r$ remains within observable ranges. For stronger dissipation ($Q_{*} \gtrsim 10^{-5}$), the model predicts values of $n_{s}$ well within the $1$--$2σ$ confidence region of all datasets, with tensor modes remaining fully observable in both dissipation scenarios. These results indicate that forthcoming CMB polarization experiments may be capable of detecting primordial gravitational waves, thereby providing a robust observational test of warm hybrid inflation across different dissipative regimes.

Fairness Driven Multi-Agent Path Finding Problem

Authors:Aditi Anand, Dildar Ali, Suman Banerjee

Date:2026-01-15 07:08:42

The Multi-Agent Path Finding (MAPF) problem aims at finding non-conflicting paths for multiple agents from their respective sources to destinations. This problem arises in multiple real-life situations, including robot motion planning and airspace assignment for unmanned aerial vehicle movement. The problem is computationally expensive, and adding to it, the agents are rational and can misreport their private information. In this paper, we study both variants of the problem under the realm of fairness. For the non-rational agents, we propose a heuristic solution for this problem. Considering the agents are rational, we develop a mechanism and demonstrate that it is a dominant strategy, incentive compatible, and individually rational. We employ various solution methodologies to highlight the effectiveness and efficiency of the proposed solution approaches.

CoCoPlan: Adaptive Coordination and Communication for Multi-robot Systems in Dynamic and Unknown Environments

Authors:Xintong Zhang, Junfeng Chen, Yuxiao Zhu, Bing Luo, Meng Guo

Date:2026-01-15 06:50:21

Multi-robot systems can greatly enhance efficiency through coordination and collaboration, yet in practice, full-time communication is rarely available and interactions are constrained to close-range exchanges. Existing methods either maintain all-time connectivity, rely on fixed schedules, or adopt pairwise protocols, but none adapt effectively to dynamic spatio-temporal task distributions under limited communication, resulting in suboptimal coordination. To address this gap, we propose CoCoPlan, a unified framework that co-optimizes collaborative task planning and team-wise intermittent communication. Our approach integrates a branch-and-bound architecture that jointly encodes task assignments and communication events, an adaptive objective function that balances task efficiency against communication latency, and a communication event optimization module that strategically determines when, where and how the global connectivity should be re-established. Extensive experiments demonstrate that it outperforms state-of-the-art methods by achieving a 22.4% higher task completion rate, reducing communication overhead by 58.6%, and improving the scalability by supporting up to 100 robots in dynamic environments. Hardware experiments include the complex 2D office environment and large-scale 3D disaster-response scenario.

OT-Drive: Out-of-Distribution Off-Road Traversable Area Segmentation via Optimal Transport

Authors:Zhihua Zhao, Guoqiang Li, Chen Min, Kangping Lu

Date:2026-01-15 00:23:45

Reliable traversable area segmentation in unstructured environments is critical for planning and decision-making in autonomous driving. However, existing data-driven approaches often suffer from degraded segmentation performance in out-of-distribution (OOD) scenarios, consequently impairing downstream driving tasks. To address this issue, we propose OT-Drive, an Optimal Transport--driven multi-modal fusion framework. The proposed method formulates RGB and surface normal fusion as a distribution transport problem. Specifically, we design a novel Scene Anchor Generator (SAG) to decompose scene information into the joint distribution of weather, time-of-day, and road type, thereby constructing semantic anchors that can generalize to unseen scenarios. Subsequently, we design an innovative Optimal Transport-based multi-modal fusion module (OT Fusion) to transport RGB and surface normal features onto the manifold defined by the semantic anchors, enabling robust traversable area segmentation under OOD scenarios. Experimental results demonstrate that our method achieves 95.16% mIoU on ORFD OOD scenarios, outperforming prior methods by 6.35%, and 89.79% mIoU on cross-dataset transfer tasks, surpassing baselines by 13.99%.These results indicate that the proposed model can attain strong OOD generalization with only limited training data, substantially enhancing its practicality and efficiency for real-world deployment.

CaMeLs Can Use Computers Too: System-level Security for Computer Use Agents

Authors:Hanna Foerster, Robert Mullins, Tom Blanchard, Nicolas Papernot, Kristina Nikolić, Florian Tramèr, Ilia Shumailov, Cheng Zhang, Yiren Zhao

Date:2026-01-14 23:06:35

AI agents are vulnerable to prompt injection attacks, where malicious content hijacks agent behavior to steal credentials or cause financial loss. The only known robust defense is architectural isolation that strictly separates trusted task planning from untrusted environment observations. However, applying this design to Computer Use Agents (CUAs) -- systems that automate tasks by viewing screens and executing actions -- presents a fundamental challenge: current agents require continuous observation of UI state to determine each action, conflicting with the isolation required for security. We resolve this tension by demonstrating that UI workflows, while dynamic, are structurally predictable. We introduce Single-Shot Planning for CUAs, where a trusted planner generates a complete execution graph with conditional branches before any observation of potentially malicious content, providing provable control flow integrity guarantees against arbitrary instruction injections. Although this architectural isolation successfully prevents instruction injections, we show that additional measures are needed to prevent Branch Steering attacks, which manipulate UI elements to trigger unintended valid paths within the plan. We evaluate our design on OSWorld, and retain up to 57% of the performance of frontier models while improving performance for smaller open-source models by up to 19%, demonstrating that rigorous security and utility can coexist in CUAs.

A feasibility study for a Doppler Reflectometer System in the JT-60SA tokamak

Authors:D. Carralero, T. Happel, T. Estrada, T. Tokuzawa, J. Martínez, E. de la Luna, A. Cappa, J. García

Date:2026-01-14 22:29:05

In this work we present a study on the viability and practicality of installing a Doppler reflectometer (DR) system in the JT-60SA advanced tokamak. First, we discuss its scientific scope in the context of the JT-60SA research plan. We identify a number of fields in which a DR would be very relevant for the accomplishment of said plan and outline a scientific program for the diagnostic. Then, starting from a number of design hypothesis, we use a ray tracing code to carry out a feasibility study for a number of relevant scenarios and identify a geometric solution for the installation of a DR such that both core and edge can be probed in the prescribed wave number range, thus achieving the proposed scientific objectives. Finally, we perform a preliminary discussion on the different possibilities for a conceptual design (including a minimum viable system and a baseline system) and their requirements in terms of components and space. We conclude that a viable conceptual design could be carried out using a small fraction of a horizontal port, leaving room for additional diagnostic systems.

Forecasting Seasonal Peaks of Pediatric Respiratory Infections Using an Alert-Based Model Combining SIR Dynamics and Historical Trends in Santiago, Chile

Authors:Gloria Henríquez, Jhoan Báez, Víctor Riquelme, Pedro Gajardo, Michel Royer, Héctor Ramírez

Date:2026-01-14 19:27:55

Acute respiratory infections (ARI) are a major cause of pediatric hospitalization in Chile, producing marked winter increases in demand that challenge hospital planning. This study presents an alert-based forecasting model to predict the timing and magnitude of ARI hospitalization peaks in Santiago. The approach integrates a seasonal SIR model with a historical mobile predictor, activated by a derivative-based alert system that detects early epidemic growth. Daily hospitalization data from DEIS were smoothed using a 15-day moving average and Savitzky-Golay filtering, and parameters were estimated using a penalized loss function to reduce sensitivity to noise. Retrospective evaluation and real-world implementation in major Santiago pediatric hospitals during 2023 and 2024 show that peak date can be anticipated about one month before the event and predicted with high accuracy two weeks in advance. Peak magnitude becomes informative roughly ten days before the peak and stabilizes one week prior. The model provides a practical and interpretable tool for hospital preparedness.

On some Exotic Cylindrical Algebraic Decompositions and Cells

Authors:Lucas Michel

Date:2026-01-14 19:02:02

Cylindrical Algebraic Decompositions (CADs) endowed with additional topological properties have found applications beyond their original logical setting, including algorithmic optimizations in CAD construction, robot motion planning, and the algorithmic study of the topology of semi-algebraic sets. In this paper, we construct explicit examples of CADs and CAD cells that refute several conjectures and open questions of J. H. Davenport, A. Locatelli, and G. K. Sankaran concerning these topological assumptions.

Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning

Authors:Chi-Pin Huang, Yunze Man, Zhiding Yu, Min-Hung Chen, Jan Kautz, Yu-Chiang Frank Wang, Fu-En Yang

Date:2026-01-14 18:59:59

Vision-Language-Action (VLA) tasks require reasoning over complex visual scenes and executing adaptive actions in dynamic environments. While recent studies on reasoning VLAs show that explicit chain-of-thought (CoT) can improve generalization, they suffer from high inference latency due to lengthy reasoning traces. We propose Fast-ThinkAct, an efficient reasoning framework that achieves compact yet performant planning through verbalizable latent reasoning. Fast-ThinkAct learns to reason efficiently with latent CoTs by distilling from a teacher, driven by a preference-guided objective to align manipulation trajectories that transfers both linguistic and visual planning capabilities for embodied control. This enables reasoning-enhanced policy learning that effectively connects compact reasoning to action execution. Extensive experiments across diverse embodied manipulation and reasoning benchmarks demonstrate that Fast-ThinkAct achieves strong performance with up to 89.3\% reduced inference latency over state-of-the-art reasoning VLAs, while maintaining effective long-horizon planning, few-shot adaptation, and failure recovery.

TEMPO: A Realistic Multi-Domain Benchmark for Temporal Reasoning-Intensive Retrieval

Authors:Abdelrahman Abdallah, Mohammed Ali, Muhammad Abdul-Mageed, Adam Jatowt

Date:2026-01-14 14:45:20

Existing temporal QA benchmarks focus on simple fact-seeking queries from news corpora, while reasoning-intensive retrieval benchmarks lack temporal grounding. However, real-world information needs often require reasoning about temporal evolution and synthesizing evidence across time periods. We introduce TEMPO, the first benchmark combining temporal reasoning with reasoning-intensive retrieval across 13 domains. TEMPO features: (1) 1,730 complex queries requiring deep temporal reasoning such as tracking changes, identifying trends, or comparing cross-period evidence; (2) step-wise retrieval planning with 3,976 decomposed steps and gold documents mapped to each step for multi-hop evaluation; and (3) novel temporal metrics including Temporal Coverage@k and Temporal Precision@k measuring whether results span required time periods. Evaluation of 12 retrieval systems reveals substantial challenges: the best model (DiVeR) achieves only 32.0 NDCG@10 and 71.4\% Temporal Coverage@10, demonstrating difficulty in retrieving temporally complete evidence. We believe TEMPO provides a challenging benchmark for improving temporal reasoning in retrieval and RAG systems. Our code and data are available at https://github.com/tempo-bench/Tempo. See also our official website: https://tempo-bench.github.io/.

Terminally constrained flow-based generative models from an optimal control perspective

Authors:Weiguo Gao, Ming Li, Qianxiao Li

Date:2026-01-14 13:32:15

We address the problem of sampling from terminally constrained distributions with pre-trained flow-based generative models through an optimal control formulation. Theoretically, we characterize the value function by a Hamilton-Jacobi-Bellman equation and derive the optimal feedback control as the minimizer of the associated Hamiltonian. We show that as the control penalty increases, the controlled process recovers the reference distribution, while as the penalty vanishes, the terminal law converges to a generalized Wasserstein projection onto the constraint manifold. Algorithmically, we introduce Terminal Optimal Control with Flow-based models (TOCFlow), a geometry-aware sampling-time guidance method for pre-trained flows. Solving the control problem in a terminal co-moving frame that tracks reference trajectories yields a closed-form scalar damping factor along the Riemannian gradient, capturing second-order curvature effects without matrix inversions. TOCFlow therefore matches the geometric consistency of Gauss-Newton updates at the computational cost of standard gradient guidance. We evaluate TOCFlow on three high-dimensional scientific tasks spanning equality, inequality, and global statistical constraints, namely Darcy flow, constrained trajectory planning, and turbulence snapshot generation with Kolmogorov spectral scaling. Across all settings, TOCFlow improves constraint satisfaction over Euclidean guidance and projection baselines while preserving the reference model's generative quality.