planning - 2026-02-14

Systematic Analysis of Penalty-Optimised Illumination Design for Tomographic Volumetric Additive Manufacturing via the Extendable Framework TVAM AID Using the Core Imaging Library

Authors:Nicole Pellizzon, Richard Huber, Jon Spangenberg, Jakob Sauer Jørgensen

Date:2026-02-12 17:09:52

Tomographic Volumetric Additive Manufacturing(TVAM) is a novel manufacturing method that allows for the fast creation of objects of complex geometry in layerless fashion. The process is based on the solidification of photopolymer that occurs when a sufficient threshold dose of light-energy is absorbed. In order to create complex shapes, an illumination plan must be designed to force solidification in some desired areas while leaving other regions liquid. Determining an illumination plan can be considered as an optimisation problem where a variety of objective functionals (penalties) can be used. This work considers a selection of penalty functions and their impact on selected printing metrics; linking the shape of penalty functions to ranges of light-energy dose levels in in-part regions that should be printed and out-of-part regions that should remain liquid. Further, the threshold parameters that are typically used to demarcate minimum light-energy for in-part regions and maximum light-energy for out-of-part regions are investigated systematically as design parameters on both existing and new methods. This enables the characterisation of their effects on some selected printing metrics as well as informed selection for default values. This work is underpinned by a reproducible and extensible framework, TVAM Adaptive Illumination Design(TVAM AID), which makes use of the open-source Core Imaging Library(CIL) that is designed for tomographic imaging with an emphasis on reconstruction. The foundation of TVAM AID which is presented here can hence be easily enhanced by existing functionality in CIL thus lowering the barrier to entry and encouraging use of strategies that already exist for reconstruction optimisation.

Multi Graph Search for High-Dimensional Robot Motion Planning

Authors:Itamar Mishani, Maxim Likhachev

Date:2026-02-12 15:50:15

Efficient motion planning for high-dimensional robotic systems, such as manipulators and mobile manipulators, is critical for real-time operation and reliable deployment. Although advances in planning algorithms have enhanced scalability to high-dimensional state spaces, these improvements often come at the cost of generating unpredictable, inconsistent motions or requiring excessive computational resources and memory. In this work, we introduce Multi-Graph Search (MGS), a search-based motion planning algorithm that generalizes classical unidirectional and bidirectional search to a multi-graph setting. MGS maintains and incrementally expands multiple implicit graphs over the state space, focusing exploration on high-potential regions while allowing initially disconnected subgraphs to be merged through feasible transitions as the search progresses. We prove that MGS is complete and bounded-suboptimal, and empirically demonstrate its effectiveness on a range of manipulation and mobile manipulation tasks. Demonstrations, benchmarks and code are available at https://multi-graph-search.github.io/.

RF-Modulated Adaptive Communication Improves Multi-Agent Robotic Exploration

Authors:Lorin Achey, Breanne Crockett, Christoffer Heckman, Bradley Hayes

Date:2026-02-12 15:33:17

Reliable coordination and efficient communication are critical challenges for multi-agent robotic exploration of environments where communication is limited. This work introduces Adaptive-RF Transmission (ART), a novel communication-aware planning algorithm that dynamically modulates transmission location based on signal strength and data payload size, enabling heterogeneous robot teams to share information efficiently without unnecessary backtracking. We further explore an extension to this approach called ART-SST, which enforces signal strength thresholds for high-fidelity data delivery. Through over 480 simulations across three cave-inspired environments, ART consistently outperforms existing strategies, including full rendezvous and minimum-signal heuristic approaches, achieving up to a 58% reduction in distance traveled and up to 52% faster exploration times compared to baseline methods. These results demonstrate that adaptive, payload-aware communication significantly improves coverage efficiency and mission speed in complex, communication-constrained environments, offering a promising foundation for future planetary exploration and search-and-rescue missions.

Multi UAVs Preflight Planning in a Shared and Dynamic Airspace

Authors:Amath Sow, Mauricio Rodriguez Cesen, Fabiola Martins Campos de Oliveira, Mariusz Wzorek, Daniel de Leng, Mattias Tiger, Fredrik Heintz, Christian Esteve Rothenberg

Date:2026-02-12 15:18:46

Preflight planning for large-scale Unmanned Aerial Vehicle (UAV) fleets in dynamic, shared airspace presents significant challenges, including temporal No-Fly Zones (NFZs), heterogeneous vehicle profiles, and strict delivery deadlines. While Multi-Agent Path Finding (MAPF) provides a formal framework, existing methods often lack the scalability and flexibility required for real-world Unmanned Traffic Management (UTM). We propose DTAPP-IICR: a Delivery-Time Aware Prioritized Planning method with Incremental and Iterative Conflict Resolution. Our framework first generates an initial solution by prioritizing missions based on urgency. Secondly, it computes roundtrip trajectories using SFIPP-ST, a novel 4D single-agent planner (Safe Flight Interval Path Planning with Soft and Temporal Constraints). SFIPP-ST handles heterogeneous UAVs, strictly enforces temporal NFZs, and models inter-agent conflicts as soft constraints. Subsequently, an iterative Large Neighborhood Search, guided by a geometric conflict graph, efficiently resolves any residual conflicts. A completeness-preserving directional pruning technique further accelerates the 3D search. On benchmarks with temporal NFZs, DTAPP-IICR achieves near-100% success with fleets of up to 1,000 UAVs and gains up to 50% runtime reduction from pruning, outperforming batch Enhanced Conflict-Based Search in the UTM context. Scaling successfully in realistic city-scale operations where other priority-based methods fail even at moderate deployments, DTAPP-IICR is positioned as a practical and scalable solution for preflight planning in dense, dynamic urban airspace.

Safety Beyond the Training Data: Robust Out-of-Distribution MPC via Conformalized System Level Synthesis

Authors:Anutam Srinivasan, Antoine Leeman, Glen Chou

Date:2026-02-12 15:11:44

We present a novel framework for robust out-of-distribution planning and control using conformal prediction (CP) and system level synthesis (SLS), addressing the challenge of ensuring safety and robustness when using learned dynamics models beyond the training data distribution. We first derive high-confidence model error bounds using weighted CP with a learned, state-control-dependent covariance model. These bounds are integrated into an SLS-based robust nonlinear model predictive control (MPC) formulation, which performs constraint tightening over the prediction horizon via volume-optimized forward reachable sets. We provide theoretical guarantees on coverage and robustness under distributional drift, and analyze the impact of data density and trajectory tube size on prediction coverage. Empirically, we demonstrate our method on nonlinear systems of increasing complexity, including a 4D car and a {12D} quadcopter, improving safety and robustness compared to fixed-bound and non-robust baselines, especially outside of the data distribution.

Time-Inhomogeneous Volatility Aversion for Financial Applications of Reinforcement Learning

Authors:Federico Cacciamani, Roberto Daluiso, Marco Pinciroli, Michele Trapletti, Edoardo Vittori

Date:2026-02-12 15:00:28

In finance, sequential decision problems are often faced, for which reinforcement learning (RL) emerges as a promising tool for optimisation without the need of analytical tractability. However, the objective of classical RL is the expected cumulated reward, while financial applications typically require a trade-off between return and risk. In this work, we focus on settings where one cares about the time split of the total return, ruling out most risk-aware generalisations of RL which optimise a risk measure defined on the latter. We notice that a preference for homogeneous splits, which we found satisfactory for hedging, can be unfit for other problems, and therefore propose a new risk metric which still penalises uncertainty of the single rewards, but allows for an arbitrary planning of their target levels. We study the properties of the resulting objective and the generalisation of learning algorithms to optimise it. Finally, we show numerical results on toy examples.

Adaptive-Horizon Conflict-Based Search for Closed-Loop Multi-Agent Path Finding

Authors:Jiarui Li, Federico Pecora, Runyu Zhang, Gioele Zardini

Date:2026-02-12 14:55:16

MAPF is a core coordination problem for large robot fleets in automated warehouses and logistics. Existing approaches are typically either open-loop planners, which generate fixed trajectories and struggle to handle disturbances, or closed-loop heuristics without reliable performance guarantees, limiting their use in safety-critical deployments. This paper presents ACCBS, a closed-loop algorithm built on a finite-horizon variant of CBS with a horizon-changing mechanism inspired by iterative deepening in MPC. ACCBS dynamically adjusts the planning horizon based on the available computational budget, and reuses a single constraint tree to enable seamless transitions between horizons. As a result, it produces high-quality feasible solutions quickly while being asymptotically optimal as the budget increases, exhibiting anytime behavior. Extensive case studies demonstrate that ACCBS combines flexibility to disturbances with strong performance guarantees, effectively bridging the gap between theoretical optimality and practical robustness for large-scale robot deployment.

Spatial Chain-of-Thought: Bridging Understanding and Generation Models for Spatial Reasoning Generation

Authors:Wei Chen, Yancheng Long, Mingqiao Liu, Haojie Ding, Yankai Yang, Hongyang Wei, Yi-Fan Zhang, Bin Wen, Fan Yang, Tingting Gao, Han Li, Long Chen

Date:2026-02-12 14:12:14

While diffusion models have shown exceptional capabilities in aesthetic image synthesis, they often struggle with complex spatial understanding and reasoning. Existing approaches resort to Multimodal Large Language Models (MLLMs) to enhance this capability. However, they either incur high computational costs through joint training or suffer from spatial information loss when relying solely on textual prompts. To alleviate these limitations, we propose a Spatial Chain-of-Thought (SCoT) framework, a plug-and-play approach that effectively bridges the reasoning capabilities of MLLMs with the generative power of diffusion models. Specifically, we first enhance the diffusion model's layout awareness by training it on an interleaved text-coordinate instruction format. We then leverage state-of-the-art MLLMs as planners to generate comprehensive layout plans, transferring their spatial planning capabilities directly to the generation process. Extensive experiments demonstrate that our method achieves state-of-the-art performance on image generation benchmarks and significantly outperforms baselines on complex reasoning tasks, while also showing strong efficacy in image editing scenarios.

Data-Driven Trajectory Imputation for Vessel Mobility Analysis

Authors:Giannis Spiliopoulos, Alexandros Troupiotis-Kapeliaris, Kostas Patroumpas, Nikolaos Liapis, Dimitrios Skoutas, Dimitris Zissis, Nikos Bikakis

Date:2026-02-12 12:40:27

Modeling vessel activity at sea is critical for a wide range of applications, including route planning, transportation logistics, maritime safety, and environmental monitoring. Over the past two decades, the Automatic Identification System (AIS) has enabled real-time monitoring of hundreds of thousands of vessels, generating huge amounts of data daily. One major challenge in using AIS data is the presence of large gaps in vessel trajectories, often caused by coverage limitations or intentional transmission interruptions. These gaps can significantly degrade data quality, resulting in inaccurate or incomplete analysis. State-of-the-art imputation approaches have mainly been devised to tackle gaps in vehicle trajectories, even when the underlying road network is not considered. But the motion patterns of sailing vessels differ substantially, e.g., smooth turns, maneuvering near ports, or navigating in adverse weather conditions. In this application paper, we propose HABIT, a lightweight, configurable H3 Aggregation-Based Imputation framework for vessel Trajectories. This data-driven framework provides a valuable means to impute missing trajectory segments by extracting, analyzing, and indexing motion patterns from historical AIS data. Our empirical study over AIS data across various timeframes, densities, and vessel types reveals that HABIT produces maritime trajectory imputations performing comparably to baseline methods in terms of accuracy, while performing better in terms of latency while accounting for vessel characteristics and their motion patterns.

Where Bits Matter in World Model Planning: A Paired Mixed-Bit Study for Efficient Spatial Reasoning

Authors:Suraj Ranganath, Anish Patnaik, Vaishak Menon

Date:2026-02-12 12:32:51

Efficient spatial reasoning requires world models that remain reliable under tight precision budgets. We study whether low-bit planning behavior is determined mostly by total bitwidth or by where bits are allocated across modules. Using DINO-WM on the Wall planning task, we run a paired-goal mixed-bit evaluation across uniform, mixed, asymmetric, and layerwise variants under two planner budgets. We observe a consistent three-regime pattern: 8-bit and 6-bit settings remain close to FP16, 3-bit settings collapse, and 4-bit settings are allocation-sensitive. In that transition region, preserving encoder precision improves planning relative to uniform quantization, and near-size asymmetric variants show the same encoder-side direction. In a later strict 22-cell replication with smaller per-cell episode count, the mixed-versus-uniform INT4 sign becomes budget-conditioned, which further highlights the sensitivity of this transition regime. These findings motivate module-aware, budget-aware quantization policies as a broader research direction for efficient spatial reasoning. Code and run artifacts are available at https://github.com/suraj-ranganath/DINO-MBQuant.

LAMP: Implicit Language Map for Robot Navigation

Authors:Sibaek Lee, Hyeonwoo Yu, Giseop Kim, Sunwook Choi

Date:2026-02-12 12:09:03

Recent advances in vision-language models have made zero-shot navigation feasible, enabling robots to follow natural language instructions without requiring labeling. However, existing methods that explicitly store language vectors in grid or node-based maps struggle to scale to large environments due to excessive memory requirements and limited resolution for fine-grained planning. We introduce LAMP (Language Map), a novel neural language field-based navigation framework that learns a continuous, language-driven map and directly leverages it for fine-grained path generation. Unlike prior approaches, our method encodes language features as an implicit neural field rather than storing them explicitly at every location. By combining this implicit representation with a sparse graph, LAMP supports efficient coarse path planning and then performs gradient-based optimization in the learned field to refine poses near the goal. This coarse-to-fine pipeline, language-driven, gradient-guided optimization is the first application of an implicit language map for precise path generation. This refinement is particularly effective at selecting goal regions not directly observed by leveraging semantic similarities in the learned feature space. To further enhance robustness, we adopt a Bayesian framework that models embedding uncertainty via the von Mises-Fisher distribution, thereby improving generalization to unobserved regions. To scale to large environments, LAMP employs a graph sampling strategy that prioritizes spatial coverage and embedding confidence, retaining only the most informative nodes and substantially reducing computational overhead. Our experimental results, both in NVIDIA Isaac Sim and on a real multi-floor building, demonstrate that LAMP outperforms existing explicit methods in both memory efficiency and fine-grained goal-reaching accuracy.

A day-ahead market model for power systems: benchmarking and security implications

Authors:Andrej Stankovski, Blazhe Gjorgiev, James Ciyu Qin, Giovanni Sansavini

Date:2026-02-12 11:35:37

Power system security assessments, e.g. via cascading outage models, often use operational set-points based on optimal power flow (OPF) dispatch. However, driven by cost minimization, OPF provides an ideal, albeit unrealistic, clearing of the generating units, disregarding the complex interactions among market participants. The security of the system, therefore, may be overestimated. To address this gap, we introduce a market model with a social-welfare-based day-ahead market clearing mechanism. The security implications are analyzed via Cascades, a cascading outage analysis framework. We apply this framework to the IEEE-118 bus system with three independent control zones. The results show that market dispatch leads to an increase in demand not served of up to 80% higher than OPF, highlighting a security overestimation. Operators can use this information to properly allocate reserves and perform efficient expansion planning strategies.

How to Optimize Multispecies Set Predictions in Presence-Absence Modeling ?

Authors:Sébastien Gigot--Léandri, Gaétan Morand, Alexis Joly, François Munoz, David Mouillot, Christophe Botella, Maximilien Servajean

Date:2026-02-12 09:52:26

Species distribution models (SDMs) commonly produce probabilistic occurrence predictions that must be converted into binary presence-absence maps for ecological inference and conservation planning. However, this binarization step is typically heuristic and can substantially distort estimates of species prevalence and community composition. We present MaxExp, a decision-driven binarization framework that selects the most probable species assemblage by directly maximizing a chosen evaluation metric. MaxExp requires no calibration data and is flexible across several scores. We also introduce the Set Size Expectation (SSE) method, a computationally efficient alternative that predicts assemblages based on expected species richness. Using three case studies spanning diverse taxa, species counts, and performance metrics, we show that MaxExp consistently matches or surpasses widely used thresholding and calibration methods, especially under strong class imbalance and high rarity. SSE offers a simpler yet competitive option. Together, these methods provide robust, reproducible tools for multispecies SDM binarization.

AC-MASAC: An Attentive Curriculum Learning Framework for Heterogeneous UAV Swarm Coordination

Authors:Wanhao Liu, Junhong Dai, Yixuan Zhang, Shengyun Yin, Panshuo Li

Date:2026-02-12 09:03:34

Cooperative path planning for heterogeneous UAV swarms poses significant challenges for Multi-Agent Reinforcement Learning (MARL), particularly in handling asymmetric inter-agent dependencies and addressing the risks of sparse rewards and catastrophic forgetting during training. To address these issues, this paper proposes an attentive curriculum learning framework (AC-MASAC). The framework introduces a role-aware heterogeneous attention mechanism to explicitly model asymmetric dependencies. Moreover, a structured curriculum strategy is designed, integrating hierarchical knowledge transfer and stage-proportional experience replay to address the issues of sparse rewards and catastrophic forgetting. The proposed framework is validated on a custom multi-agent simulation platform, and the results show that our method has significant advantages over other advanced methods in terms of Success Rate, Formation Keeping Rate, and Success-weighted Mission Time. The code is available at \textcolor{red}{https://github.com/Wanhao-Liu/AC-MASAC}.

LUVE : Latent-Cascaded Ultra-High-Resolution Video Generation with Dual Frequency Experts

Authors:Chen Zhao, Jiawei Chen, Hongyu Li, Zhuoliang Kang, Shilin Lu, Xiaoming Wei, Kai Zhang, Jian Yang, Ying Tai

Date:2026-02-12 04:35:16

Recent advances in video diffusion models have significantly improved visual quality, yet ultra-high-resolution (UHR) video generation remains a formidable challenge due to the compounded difficulties of motion modeling, semantic planning, and detail synthesis. To address these limitations, we propose \textbf{LUVE}, a \textbf{L}atent-cascaded \textbf{U}HR \textbf{V}ideo generation framework built upon dual frequency \textbf{E}xperts. LUVE employs a three-stage architecture comprising low-resolution motion generation for motion-consistent latent synthesis, video latent upsampling that performs resolution upsampling directly in the latent space to mitigate memory and computational overhead, and high-resolution content refinement that integrates low-frequency and high-frequency experts to jointly enhance semantic coherence and fine-grained detail generation. Extensive experiments demonstrate that our LUVE achieves superior photorealism and content fidelity in UHR video generation, and comprehensive ablation studies further validate the effectiveness of each component. The project is available at \href{https://unicornanrocinu.github.io/LUVE_web/}{https://github.io/LUVE/}.

Budget-Constrained Agentic Large Language Models: Intention-Based Planning for Costly Tool Use

Authors:Hanbing Liu, Chunhao Tian, Nan An, Ziyuan Wang, Pinyan Lu, Changyuan Yu, Qi Qi

Date:2026-02-12 04:01:30

We study budget-constrained tool-augmented agents, where a large language model must solve multi-step tasks by invoking external tools under a strict monetary budget. We formalize this setting as sequential decision making in context space with priced and stochastic tool executions, making direct planning intractable due to massive state-action spaces, high variance of outcomes and prohibitive exploration cost. To address these challenges, we propose INTENT, an inference-time planning framework that leverages an intention-aware hierarchical world model to anticipate future tool usage, risk-calibrated cost, and guide decisions online. Across cost-augmented StableToolBench, INTENT strictly enforces hard budget feasibility while substantially improving task success over baselines, and remains robust under dynamic market shifts such as tool price changes and varying budgets.

Implications of AI Involvement for Trust in Expert Advisory Workflows Under Epistemic Dependence

Authors:Dennis Kim, Roya Daneshi, Bruce Draper, Sarath Sreedharan

Date:2026-02-12 03:31:05

The increasing integration of AI-powered tools into expert workflows, such as medicine, law, and finance, raises a critical question: how does AI involvement influence a user's trust in the human expert, the AI system, and their combination? To investigate this, we conducted a user study (N=77) featuring a simulated course-planning task. We compared various conditions that differed in both the presence of AI and the specific mode of human-AI collaboration. Our results indicate that while the advisor's ability to create a correct schedule is important, the user's perception of expertise and trust is also influenced by how the expert utilized the AI assistant. These findings raise important considerations for the design of human-AI hybrid teams, particularly when the adoption of recommendations depends on the end-user's perception of the recommender's expertise.

An Efficient Hybrid Heuristic for the Transmission Expansion Planning under Uncertainties

Authors:Yure Rocha, Teobaldo Bulhões, Anand Subramanian, Joaquim Dias Garcia

Date:2026-02-12 02:21:54

We address the stochastic transmission expansion planning (STEP) problem considering uncertainties in renewable generation capacity and demand. STEP's objective is to minimize the total investment cost of new transmission lines and generation cost. To tackle the computational challenges of large-scale systems, we propose a heuristic approach that combines the progressive hedging (PH) algorithm for scenario-wise decomposition with an integrated framework for solving the resulting subproblems. The latter combines a destroy-and-repair operator, a beam search procedure, and a mixed-integer programming approach. The proposed framework is evaluated on large-scale systems from the literature, containing up to 10000 nodes, adapted to multiple scenarios based on parameters from the California test system (CATS). Compared with a non-trivial baseline algorithm that includes the integrated MIP and heuristics, the proposed PH-based framework consistently improved solution quality for the six systems considered (including CATS), achieving an average optimality gap reduction of 16.23% within a 2-hour time limit.

Effective Task Planning with Missing Objects using Learning-Informed Object Search

Authors:Raihan Islam Arnob, Max Merlin, Abhishek Paudel, Benned Hedegaard, George Konidaris, Gregory Stein

Date:2026-02-12 00:56:35

Task planning for mobile robots often assumes full environment knowledge and so popular approaches, like planning via the PDDL, cannot plan when the locations of task-critical objects are unknown. Recent learning-driven object search approaches are effective, but operate as standalone tools and so are not straightforwardly incorporated into full task planners, which must additionally determine both what objects are necessary and when in the plan they should be sought out. To address this limitation, we develop a planning framework centered around novel model-based LIOS actions: each a policy that aims to find and retrieve a single object. High-level planning treats LIOS actions as deterministic and so -- informed by model-based calculations of the expected cost of each -- generates plans that interleave search and execution for effective, sound, and complete learning-informed task planning despite uncertainty. Our work effectively reasons about uncertainty while maintaining compatibility with existing full-knowledge solvers. In simulated ProcTHOR homes and in the real world, our approach outperforms non-learned and learned baselines on tasks including retrieval and meal prep.

Filtered Approximate Nearest Neighbor Search in Vector Databases: System Design and Performance Analysis

Authors:Abylay Amanbayev, Brian Tsan, Tri Dang, Florin Rusu

Date:2026-02-11 23:40:26

Retrieval-Augmented Generation (RAG) applications increasingly rely on Filtered Approximate Nearest Neighbor Search (FANNS) to combine semantic retrieval with metadata constraints. While algorithmic innovations for FANNS have been proposed, there remains a lack of understanding regarding how generic filtering strategies perform within Vector Databases. In this work, we systematize the taxonomy of filtering strategies and evaluate their integration into FAISS, Milvus, and pgvector. To provide a robust benchmarking framework, we introduce a new relational dataset, \textit{MoReVec}, consisting of two tables, featuring 768-dimensional text embeddings and a rich schema of metadata attributes. We further propose the \textit{Global-Local Selectivity (GLS)} correlation metric to quantify the relationship between filters and query vectors. Our experiments reveal that algorithmic adaptations within the engine often override raw index performance. Specifically, we find that: (1) \textit{Milvus} achieves superior recall stability through hybrid approximate/exact execution; (2) \textit{pgvector}'s cost-based query optimizer frequently selects suboptimal execution plans, favoring approximate index scans even when exact sequential scans would yield perfect recall at comparable latency; and (3) partition-based indexes (IVFFlat) outperform graph-based indexes (HNSW) for low-selectivity queries. To facilitate this analysis, we extend the widely-used \textit{ANN-Benchmarks} to support filtered vector search and make it available online. Finally, we synthesize our findings into a set of practical guidelines for selecting index types and configuring query optimizers for hybrid search workloads.

Optimizing Agent Planning for Security and Autonomy

Authors:Aashish Kolluri, Rishi Sharma, Manuel Costa, Boris Köpf, Tobias Nießen, Mark Russinovich, Shruti Tople, Santiago Zanella-Béguelin

Date:2026-02-11 22:37:02

Indirect prompt injection attacks threaten AI agents that execute consequential actions, motivating deterministic system-level defenses. Such defenses can provably block unsafe actions by enforcing confidentiality and integrity policies, but currently appear costly: they reduce task completion rates and increase token usage compared to probabilistic defenses. We argue that existing evaluations miss a key benefit of system-level defenses: reduced reliance on human oversight. We introduce autonomy metrics to quantify this benefit: the fraction of consequential actions an agent can execute without human-in-the-loop (HITL) approval while preserving security. To increase autonomy, we design a security-aware agent that (i) introduces richer HITL interactions, and (ii) explicitly plans for both task progress and policy compliance. We implement this agent design atop an existing information-flow control defense against prompt injection and evaluate it on the AgentDojo and WASP benchmarks. Experiments show that this approach yields higher autonomy without sacrificing utility.

When Visibility Outpaces Verification: Delayed Verification and Narrative Lock-in in Agentic AI Discourse

Authors:Hanjing Shi, Dominic DiFranzo

Date:2026-02-11 22:30:12

Agentic AI systems-autonomous entities capable of independent planning and execution-reshape the landscape of human-AI trust. Long before direct system exposure, user expectations are mediated through high-stakes public discourse on social platforms. However, platform-mediated engagement signals (e.g., upvotes) may inadvertently function as a ``credibility proxy,'' potentially stifling critical evaluation. This paper investigates the interplay between social proof and verification timing in online discussions of agentic AI. Analyzing a longitudinal dataset from two distinct Reddit communities with contrasting interaction cultures-r/OpenClaw and r/Moltbook-we operationalize verification cues via reproducible lexical rules and model the ``time-to-first-verification'' using a right-censored survival analysis framework. Our findings reveal a systemic ``Popularity Paradox'': high-visibility discussions in both subreddits experience significantly delayed or entirely absent verification cues compared to low-visibility threads. This temporal lag creates a critical window for ``Narrative Lock-in,'' where early, unverified claims crystallize into collective cognitive biases before evidence-seeking behaviors emerge. We discuss the implications of this ``credibility-by-visibility'' effect for AI safety and propose ``epistemic friction'' as a design intervention to rebalance engagement-driven platforms.

Causal-JEPA: Learning World Models through Object-Level Latent Interventions

Authors:Heejeong Nam, Quentin Le Lidec, Lucas Maes, Yann LeCun, Randall Balestriero

Date:2026-02-11 21:47:26

World models require robust relational understanding to support prediction, reasoning, and control. While object-centric representations provide a useful abstraction, they are not sufficient to capture interaction-dependent dynamics. We therefore propose C-JEPA, a simple and flexible object-centric world model that extends masked joint embedding prediction from image patches to object-centric representations. By applying object-level masking that requires an object's state to be inferred from other objects, C-JEPA induces latent interventions with counterfactual-like effects and prevents shortcut solutions, making interaction reasoning essential. Empirically, C-JEPA leads to consistent gains in visual question answering, with an absolute improvement of about 20\% in counterfactual reasoning compared to the same architecture without object-level masking. On agent control tasks, C-JEPA enables substantially more efficient planning by using only 1\% of the total latent input features required by patch-based world models, while achieving comparable performance. Finally, we provide a formal analysis demonstrating that object-level masking induces a causal inductive bias via latent interventions. Our code is available at https://github.com/galilai-group/cjepa.

NOvA's Current and Future Sterile Neutrino Searches

Authors:Adam Lister

Date:2026-02-11 20:28:37

The NOvA experiment's most recent search for eV-scale sterile neutrinos under a 3+1 model simultaneously analyses muon neutrino and neutral current datasets from the NuMI beam at its Near ($\sim$\qty{1}{km} baseline) and Far (\qty{810}{km} baseline) detectors to look for oscillations consistent with a sterile neutrino. The analysis is systematically limited in the region of parameter space where $Δm^2_{41} \gtrsim 1~\mathrm{eV}^2$. This region of parameter space is preferred by sterile neutrino interpretations of current experimental anomalies and so improving sensitivity here is high-priority. These proceedings present our current search strategy, and discusses future plans to include data from a second beamline, the Booster Neutrino Beam, to improve our sensitivity in systematics-dominated regions of parameter space.

MolmoSpaces: A Large-Scale Open Ecosystem for Robot Navigation and Manipulation

Authors:Yejin Kim, Wilbert Pumacay, Omar Rayyan, Max Argus, Winson Han, Eli VanderBilt, Jordi Salvador, Abhay Deshpande, Rose Hendrix, Snehal Jauhri, Shuo Liu, Nur Muhammad Mahi Shafiullah, Maya Guru, Ainaz Eftekhar, Karen Farley, Donovan Clay, Jiafei Duan, Arjun Guru, Piper Wolters, Alvaro Herrasti, Ying-Chun Lee, Georgia Chalvatzaki, Yuchen Cui, Ali Farhadi, Dieter Fox, Ranjay Krishna

Date:2026-02-11 20:16:31

Deploying robots at scale demands robustness to the long tail of everyday situations. The countless variations in scene layout, object geometry, and task specifications that characterize real environments are vast and underrepresented in existing robot benchmarks. Measuring this level of generalization requires infrastructure at a scale and diversity that physical evaluation alone cannot provide. We introduce MolmoSpaces, a fully open ecosystem to support large-scale benchmarking of robot policies. MolmoSpaces consists of over 230k diverse indoor environments, ranging from handcrafted household scenes to procedurally generated multiroom houses, populated with 130k richly annotated object assets, including 48k manipulable objects with 42M stable grasps. Crucially, these environments are simulator-agnostic, supporting popular options such as MuJoCo, Isaac, and ManiSkill. The ecosystem supports the full spectrum of embodied tasks: static and mobile manipulation, navigation, and multiroom long-horizon tasks requiring coordinated perception, planning, and interaction across entire indoor environments. We also design MolmoSpaces-Bench, a benchmark suite of 8 tasks in which robots interact with our diverse scenes and richly annotated objects. Our experiments show MolmoSpaces-Bench exhibits strong sim-to-real correlation (R = 0.96, \r{ho} = 0.98), confirm newer and stronger zero-shot policies outperform earlier versions in our benchmarks, and identify key sensitivities to prompt phrasing, initial joint positions, and camera occlusion. Through MolmoSpaces and its open-source assets and tooling, we provide a foundation for scalable data generation, policy training, and benchmark creation for robot learning research.

H-WM: Robotic Task and Motion Planning Guided by Hierarchical World Model

Authors:Wenyuan Chen, Jinbang Huang, Oscar Pang, Zhiyuan Li, Xiao Hu, Lingfeng Zhang, Zhanguang Zhang, Mark Coates, Tongtong Cao, Xingyue Quan, Yingxue Zhang

Date:2026-02-11 19:08:36

World models are becoming central to robotic planning and control, as they enable prediction of future state transitions. Existing approaches often emphasize video generation or natural language prediction, which are difficult to directly ground in robot actions and suffer from compounding errors over long horizons. Traditional task and motion planning relies on symbolic logic world models, such as planning domains, that are robot-executable and robust for long-horizon reasoning. However, these methods typically operate independently of visual perception, preventing synchronized symbolic and perceptual state prediction. We propose a Hierarchical World Model (H-WM) that jointly predicts logical and visual state transitions within a unified bilevel framework. H-WM combines a high-level logical world model with a low-level visual world model, integrating the robot-executable, long-horizon robustness of symbolic reasoning with perceptual grounding from visual observations. The hierarchical outputs provide stable and consistent intermediate guidance for long-horizon tasks, mitigating error accumulation and enabling robust execution across extended task sequences. To train H-WM, we introduce a robotic dataset that aligns robot motion with symbolic states, actions, and visual observations. Experiments across vision-language-action (VLA) control policies demonstrate the effectiveness and generality of the approach.

Unmasking LHAASO J2108+5157: Near Infrared Insights into a Mysterious TeV Source

Authors:Josep Martí, Pedro L. Luque-Escamilla, Josep M. Paredes, José Martínez Aroza

Date:2026-02-11 18:58:44

LHAASO J2108+5157 is one of the few ultra-high energy gamma-ray sources in the LHAASO catalogue without secure counterpart at longer wavelengths. Several Galactic scenarios have been proposed, including an evolved supernova remnant and a pulsar wind nebula. Yet, no shocked gas, shell-like structure, or compact pulsar candidate has been identified. Follow-up observations with VERITAS and the LST-1 prototype have not firmly clarified its nature. A recent microquasar candidate from GMRT radio data remains uncertain. Here we present the first dedicated near-infrared study of the field, combining deep JHKs imaging with narrow band observations targeting the H2 v=1-0 S(1) line. Our observations were initially planned to encompass the full source region, but now only partially cover the latest updated position and size of LHAASO J2108+5157. We find no evidence of shocked emission, extended nebular structures, or an accreting compact object signature in the covered field. The GMRT radio source, despite its jet-like morphology, exhibits near-infrared properties incompatible with both a Galactic microquasar and a nearby radio galaxy, discouraging an association with the gamma-ray emission. Our analysis reveals no convincing counterpart consistent within the positional uncertainty, leaving LHAASO J2108+5157 as an enigmatic ultra-high energy emitter that requires deeper observations.

PhyCritic: Multimodal Critic Models for Physical AI

Authors:Tianyi Xiong, Shihao Wang, Guilin Liu, Yi Dong, Ming Li, Heng Huang, Jan Kautz, Zhiding Yu

Date:2026-02-11 18:35:39

With the rapid development of large multimodal models, reliable judge and critic models have become essential for open-ended evaluation and preference alignment, providing pairwise preferences, numerical scores, and explanatory justifications for assessing model-generated responses. However, existing critics are primarily trained in general visual domains such as captioning or image question answering, leaving physical AI tasks involving perception, causal reasoning, and planning largely underexplored. We introduce PhyCritic, a multimodal critic model optimized for physical AI through a two-stage RLVR pipeline: a physical skill warmup stage that enhances physically oriented perception and reasoning, followed by self-referential critic finetuning, where the critic generates its own prediction as an internal reference before judging candidate responses, improving judgment stability and physical correctness. Across both physical and general-purpose multimodal judge benchmarks, PhyCritic achieves strong performance gains over open-source baselines and, when applied as a policy model, further improves perception and reasoning in physically grounded tasks.

A receding-horizon multi-contact motion planner for legged robots in challenging environments

Authors:Daniel S. J. Derwent, Simon Watson, Bruno V. Adorno

Date:2026-02-11 18:25:29

We present a novel receding-horizon multi-contact motion planner for legged robots in challenging scenarios, able to plan motions such as chimney climbing, navigating very narrow passages or crossing large gaps. Our approach adds new capabilities to the state of the art, including the ability to reactively re-plan in response to new information, and planning contact locations and whole-body trajectories simultaneously, simplifying the implementation and removing the need for post-processing or complex multi-stage approaches. Our method is more resistant to local minima problems than other potential field based approaches, and our quadratic-program-based posture generator returns nodes more quickly than those of existing algorithms. Rigorous statistical analysis shows that, with short planning horizons (e.g., one step ahead), our planner is faster than the state-of-the-art across all scenarios tested (between 45% and 98% faster on average, depending on the scenario), while planning less efficient motions (requiring 5% fewer to 700% more stance changes on average). In all but one scenario (Chimney Walking), longer planning horizons (e.g., four steps ahead) extended the average planning times (between 73% faster and 400% slower than the state-of-the-art) but resulted in higher quality motion plans (between 8% more and 47% fewer stance changes than the state-of-the-art).

ContactGaussian-WM: Learning Physics-Grounded World Model from Videos

Authors:Meizhong Wang, Wanxin Jin, Kun Cao, Lihua Xie, Yiguang Hong

Date:2026-02-11 16:48:13

Developing world models that understand complex physical interactions is essential for advancing robotic planning and simulation.However, existing methods often struggle to accurately model the environment under conditions of data scarcity and complex contact-rich dynamic motion.To address these challenges, we propose ContactGaussian-WM, a differentiable physics-grounded rigid-body world model capable of learning intricate physical laws directly from sparse and contact-rich video sequences.Our framework consists of two core components: (1) a unified Gaussian representation for both visual appearance and collision geometry, and (2) an end-to-end differentiable learning framework that differentiates through a closed-form physics engine to infer physical properties from sparse visual observations.Extensive simulations and real-world evaluations demonstrate that ContactGaussian-WM outperforms state-of-the-art methods in learning complex scenarios, exhibiting robust generalization capabilities.Furthermore, we showcase the practical utility of our framework in downstream applications, including data synthesis and real-time MPC.