planning - 2025-08-10

Development of PANOSETI Telescopes for Ultra-High-Energy Gamma-Ray Astronomy

Authors:Nikolas Korzoun

Date:2025-08-07 16:27:57

Ultra-High-Energy (UHE, E $>100$ TeV) gamma rays are one of the few channels to search for and study Galactic PeVatrons. Among the most promising PeVatron candidates are the many UHE gamma-ray sources that have recently been identified on the Galactic Plane. Ground-based particle detectors see these sources as extended rather than point-like, and current generation Imaging Atmospheric Cherenkov Telescopes (IACTs) struggle to study them with effective areas and background rejection that are suboptimal at UHE. A cost-efficient way of constructing an array of IACTs explicitly designed for UHE sensitivity is to sparsely separate many small telescopes. We have simulated, prototyped, and twice deployed a pathfinder array that is instrumented with telescopes designed by the Panoramic Search for Extraterrestrial Intelligence (PANOSETI) team. These 0.5-meter Fresnel lens telescopes are purpose-built for imaging optical transients on nanosecond timescales and are equipped with a $10^\circ\times10^\circ$ silicon photomultiplier camera. Three PANOSETI telescopes were deployed twice in the same temporary configuration at Lick Observatory in March and October 2024. Here we give a brief description of the instrument and present a comparison of simulations with the data collected, including an analysis of the Crab Nebula. We also report on the ongoing deployment of PANOSETI telescopes for the Dark100 array that is planned to operate for five years at Palomar Observatory.

CleanUpBench: Embodied Sweeping and Grasping Benchmark

Authors:Wenbo Li, Guanting Chen, Tao Zhao, Jiyao Wang, Tianxin Hu, Yuwen Liao, Weixiang Guo, Shenghai Yuan

Date:2025-08-07 16:20:31

Embodied AI benchmarks have advanced navigation, manipulation, and reasoning, but most target complex humanoid agents or large-scale simulations that are far from real-world deployment. In contrast, mobile cleaning robots with dual mode capabilities, such as sweeping and grasping, are rapidly emerging as realistic and commercially viable platforms. However, no benchmark currently exists that systematically evaluates these agents in structured, multi-target cleaning tasks, revealing a critical gap between academic research and real-world applications. We introduce CleanUpBench, a reproducible and extensible benchmark for evaluating embodied agents in realistic indoor cleaning scenarios. Built on NVIDIA Isaac Sim, CleanUpBench simulates a mobile service robot equipped with a sweeping mechanism and a six-degree-of-freedom robotic arm, enabling interaction with heterogeneous objects. The benchmark includes manually designed environments and one procedurally generated layout to assess generalization, along with a comprehensive evaluation suite covering task completion, spatial efficiency, motion quality, and control performance. To support comparative studies, we provide baseline agents based on heuristic strategies and map-based planning. CleanUpBench bridges the gap between low-level skill evaluation and full-scene testing, offering a scalable testbed for grounded, embodied intelligence in everyday settings.

F2PASeg: Feature Fusion for Pituitary Anatomy Segmentation in Endoscopic Surgery

Authors:Lumin Chen, Zhiying Wu, Tianye Lei, Xuexue Bai, Ming Feng, Yuxi Wang, Gaofeng Meng, Zhen Lei, Hongbin Liu

Date:2025-08-07 15:04:07

Pituitary tumors often cause deformation or encapsulation of adjacent vital structures. Anatomical structure segmentation can provide surgeons with early warnings of regions that pose surgical risks, thereby enhancing the safety of pituitary surgery. However, pixel-level annotated video stream datasets for pituitary surgeries are extremely rare. To address this challenge, we introduce a new dataset for Pituitary Anatomy Segmentation (PAS). PAS comprises 7,845 time-coherent images extracted from 120 videos. To mitigate class imbalance, we apply data augmentation techniques that simulate the presence of surgical instruments in the training data. One major challenge in pituitary anatomy segmentation is the inconsistency in feature representation due to occlusions, camera motion, and surgical bleeding. By incorporating a Feature Fusion module, F2PASeg is proposed to refine anatomical structure segmentation by leveraging both high-resolution image features and deep semantic embeddings, enhancing robustness against intraoperative variations. Experimental results demonstrate that F2PASeg consistently segments critical anatomical structures in real time, providing a reliable solution for intraoperative pituitary surgery planning. Code: https://github.com/paulili08/F2PASeg.

EnergyPatchTST: Multi-scale Time Series Transformers with Uncertainty Estimation for Energy Forecasting

Authors:Wei Li, Zixin Wang, Qizheng Sun, Qixiang Gao, Fenglei Yang

Date:2025-08-07 14:48:39

Accurate and reliable energy time series prediction is of great significance for power generation planning and allocation. At present, deep learning time series prediction has become the mainstream method. However, the multi-scale time dynamics and the irregularity of real data lead to the limitations of the existing methods. Therefore, we propose EnergyPatchTST, which is an extension of the Patch Time Series Transformer specially designed for energy forecasting. The main innovations of our method are as follows: (1) multi-scale feature extraction mechanism to capture patterns with different time resolutions; (2) probability prediction framework to estimate uncertainty through Monte Carlo elimination; (3) integration path of future known variables (such as temperature and wind conditions); And (4) Pre-training and Fine-tuning examples to enhance the performance of limited energy data sets. A series of experiments on common energy data sets show that EnergyPatchTST is superior to other commonly used methods, the prediction error is reduced by 7-12%, and reliable uncertainty estimation is provided, which provides an important reference for time series prediction in the energy field.

DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning

Authors:Xinrun Xu, Pi Bu, Ye Wang, Börje F. Karlsson, Ziming Wang, Tengtao Song, Qi Zhu, Jun Song, Zhiming Ding, Bo Zheng

Date:2025-08-07 13:58:19

Although Vision Language Models (VLMs) exhibit strong perceptual abilities and impressive visual reasoning, they struggle with attention to detail and precise action planning in complex, dynamic environments, leading to subpar performance. Real-world tasks typically require complex interactions, advanced spatial reasoning, long-term planning, and continuous strategy refinement, usually necessitating understanding the physics rules of the target scenario. However, evaluating these capabilities in real-world scenarios is often prohibitively expensive. To bridge this gap, we introduce DeepPHY, a novel benchmark framework designed to systematically evaluate VLMs' understanding and reasoning about fundamental physical principles through a series of challenging simulated environments. DeepPHY integrates multiple physical reasoning environments of varying difficulty levels and incorporates fine-grained evaluation metrics. Our evaluation finds that even state-of-the-art VLMs struggle to translate descriptive physical knowledge into precise, predictive control.

DistillDrive: End-to-End Multi-Mode Autonomous Driving Distillation by Isomorphic Hetero-Source Planning Model

Authors:Rui Yu, Xianghang Zhang, Runkai Zhao, Huaicheng Yan, Meng Wang

Date:2025-08-07 13:54:35

End-to-end autonomous driving has been recently seen rapid development, exerting a profound influence on both industry and academia. However, the existing work places excessive focus on ego-vehicle status as their sole learning objectives and lacks of planning-oriented understanding, which limits the robustness of the overall decision-making prcocess. In this work, we introduce DistillDrive, an end-to-end knowledge distillation-based autonomous driving model that leverages diversified instance imitation to enhance multi-mode motion feature learning. Specifically, we employ a planning model based on structured scene representations as the teacher model, leveraging its diversified planning instances as multi-objective learning targets for the end-to-end model. Moreover, we incorporate reinforcement learning to enhance the optimization of state-to-decision mappings, while utilizing generative modeling to construct planning-oriented instances, fostering intricate interactions within the latent space. We validate our model on the nuScenes and NAVSIM datasets, achieving a 50\% reduction in collision rate and a 3-point improvement in closed-loop performance compared to the baseline model. Code and model are publicly available at https://github.com/YuruiAI/DistillDrive

Building Effective Safety Guardrails in AI Education Tools

Authors:Hannah-Beth Clark, Laura Benton, Emma Searle, Margaux Dowland, Matthew Gregory, Will Gayne, John Roberts

Date:2025-08-07 13:09:47

There has been rapid development in generative AI tools across the education sector, which in turn is leading to increased adoption by teachers. However, this raises concerns regarding the safety and age-appropriateness of the AI-generated content that is being created for use in classrooms. This paper explores Oak National Academy's approach to addressing these concerns within the development of the UK Government's first publicly available generative AI tool - our AI-powered lesson planning assistant (Aila). Aila is intended to support teachers planning national curriculum-aligned lessons that are appropriate for pupils aged 5-16 years. To mitigate safety risks associated with AI-generated content we have implemented four key safety guardrails - (1) prompt engineering to ensure AI outputs are generated within pedagogically sound and curriculum-aligned parameters, (2) input threat detection to mitigate attacks, (3) an Independent Asynchronous Content Moderation Agent (IACMA) to assess outputs against predefined safety categories, and (4) taking a human-in-the-loop approach, to encourage teachers to review generated content before it is used in the classroom. Through our on-going evaluation of these safety guardrails we have identified several challenges and opportunities to take into account when implementing and testing safety guardrails. This paper highlights ways to build more effective safety guardrails in generative AI education tools including the on-going iteration and refinement of guardrails, as well as enabling cross-sector collaboration through sharing both open-source code, datasets and learnings.

ASkDAgger: Active Skill-level Data Aggregation for Interactive Imitation Learning

Authors:Jelle Luijkx, Zlatan Ajanović, Laura Ferranti, Jens Kober

Date:2025-08-07 12:10:46

Human teaching effort is a significant bottleneck for the broader applicability of interactive imitation learning. To reduce the number of required queries, existing methods employ active learning to query the human teacher only in uncertain, risky, or novel situations. However, during these queries, the novice's planned actions are not utilized despite containing valuable information, such as the novice's capabilities, as well as corresponding uncertainty levels. To this end, we allow the novice to say: "I plan to do this, but I am uncertain." We introduce the Active Skill-level Data Aggregation (ASkDAgger) framework, which leverages teacher feedback on the novice plan in three key ways: (1) S-Aware Gating (SAG): Adjusts the gating threshold to track sensitivity, specificity, or a minimum success rate; (2) Foresight Interactive Experience Replay (FIER), which recasts valid and relabeled novice action plans into demonstrations; and (3) Prioritized Interactive Experience Replay (PIER), which prioritizes replay based on uncertainty, novice success, and demonstration age. Together, these components balance query frequency with failure incidence, reduce the number of required demonstration annotations, improve generalization, and speed up adaptation to changing domains. We validate the effectiveness of ASkDAgger through language-conditioned manipulation tasks in both simulation and real-world environments. Code, data, and videos are available at https://askdagger.github.io.

Congestion Mitigation Path Planning for Large-Scale Multi-Agent Navigation in Dense Environments

Authors:Takuro Kato, Keisuke Okumura, Yoko Sasaki, Naoya Yokomachi

Date:2025-08-07 10:43:19

In high-density environments where numerous autonomous agents move simultaneously in a distributed manner, streamlining global flows to mitigate local congestion is crucial to maintain overall navigation efficiency. This paper introduces a novel path-planning problem, congestion mitigation path planning (CMPP), which embeds congestion directly into the cost function, defined by the usage of incoming edges along agents' paths. CMPP assigns a flow-based multiplicative penalty to each vertex of a sparse graph, which grows steeply where frequently-traversed paths intersect, capturing the intuition that congestion intensifies where many agents enter the same area from different directions. Minimizing the total cost yields a set of coarse-level, time-independent routes that autonomous agents can follow while applying their own local collision avoidance. We formulate the problem and develop two solvers: (i) an exact mixed-integer nonlinear programming solver for small instances, and (ii) a scalable two-layer search algorithm, A-CMTS, which quickly finds suboptimal solutions for large-scale instances and iteratively refines them toward the optimum. Empirical studies show that augmenting state-of-the-art collision-avoidance planners with CMPP significantly reduces local congestion and enhances system throughput in both discrete- and continuous-space scenarios. These results indicate that CMPP improves the performance of multi-agent systems in real-world applications such as logistics and autonomous-vehicle operations.

Learning to See and Act: Task-Aware View Planning for Robotic Manipulation

Authors:Yongjie Bai, Zhouxia Wang, Yang Liu, Weixing Chen, Ziliang Chen, Mingtong Dai, Yongsen Zheng, Lingbo Liu, Guanbin Li, Liang Lin

Date:2025-08-07 09:21:20

Recent vision-language-action (VLA) models for multi-task robotic manipulation commonly rely on static viewpoints and shared visual encoders, which limit 3D perception and cause task interference, hindering robustness and generalization. In this work, we propose Task-Aware View Planning (TAVP), a framework designed to overcome these challenges by integrating active view planning with task-specific representation learning. TAVP employs an efficient exploration policy, accelerated by a novel pseudo-environment, to actively acquire informative views. Furthermore, we introduce a Mixture-of-Experts (MoE) visual encoder to disentangle features across different tasks, boosting both representation fidelity and task generalization. By learning to see the world in a task-aware way, TAVP generates more complete and discriminative visual representations, demonstrating significantly enhanced action prediction across a wide array of manipulation challenges. Extensive experiments on RLBench tasks show that our proposed TAVP model achieves superior performance over state-of-the-art fixed-view approaches. Visual results and code are provided at: https://hcplab-sysu.github.io/TAVP.

PhysPatch: A Physically Realizable and Transferable Adversarial Patch Attack for Multimodal Large Language Models-based Autonomous Driving Systems

Authors:Qi Guo, Xiaojun Jia, Shanmin Pang, Simeng Qin, Lin Wang, Ju Jia, Yang Liu, Qing Guo

Date:2025-08-07 08:54:54

Multimodal Large Language Models (MLLMs) are becoming integral to autonomous driving (AD) systems due to their strong vision-language reasoning capabilities. However, MLLMs are vulnerable to adversarial attacks, particularly adversarial patch attacks, which can pose serious threats in real-world scenarios. Existing patch-based attack methods are primarily designed for object detection models and perform poorly when transferred to MLLM-based systems due to the latter's complex architectures and reasoning abilities. To address these limitations, we propose PhysPatch, a physically realizable and transferable adversarial patch framework tailored for MLLM-based AD systems. PhysPatch jointly optimizes patch location, shape, and content to enhance attack effectiveness and real-world applicability. It introduces a semantic-based mask initialization strategy for realistic placement, an SVD-based local alignment loss with patch-guided crop-resize to improve transferability, and a potential field-based mask refinement method. Extensive experiments across open-source, commercial, and reasoning-capable MLLMs demonstrate that PhysPatch significantly outperforms prior methods in steering MLLM-based AD systems toward target-aligned perception and planning outputs. Moreover, PhysPatch consistently places adversarial patches in physically feasible regions of AD scenes, ensuring strong real-world applicability and deployability.

Preparing for the worst: Long-term and short-term weather extremes in resource adequacy assessment

Authors:Aleksander Grochowicz, Hannah C. Bloomfield, Marta Victoria

Date:2025-08-07 08:53:02

Security of supply is a common and important concern when integrating renewables in net-zero power systems. Extreme weather affects both demand and supply leading to power system stress; in Europe this stress spreads continentally beyond the meteorological root cause. We use an approach based on shadow prices to identify periods of elevated stress called system-defining events and analyse their impact on the power system. By classifying different types of system-defining events, we identify challenges to power system operation and planning. Crucially, we find the need for sufficient resilience back-up (power) capacities whose financial viability is precarious due to weather variability. Furthermore, we disentangle short- and long-term resilience challenges with distinct metrics and stress tests to incorporate both into future energy modelling assessments. Our methodology and implementation in the open model PyPSA-Eur can be re-applied to other systems and help researchers and policymakers in building more resilient and adequate energy systems.

FedGIN: Federated Learning with Dynamic Global Intensity Non-linear Augmentation for Organ Segmentation using Multi-modal Images

Authors:Sachin Dudda Nagaraju, Ashkan Moradi, Bendik Skarre Abrahamsen, Mattijs Elschot

Date:2025-08-07 08:16:35

Medical image segmentation plays a crucial role in AI-assisted diagnostics, surgical planning, and treatment monitoring. Accurate and robust segmentation models are essential for enabling reliable, data-driven clinical decision making across diverse imaging modalities. Given the inherent variability in image characteristics across modalities, developing a unified model capable of generalizing effectively to multiple modalities would be highly beneficial. This model could streamline clinical workflows and reduce the need for modality-specific training. However, real-world deployment faces major challenges, including data scarcity, domain shift between modalities (e.g., CT vs. MRI), and privacy restrictions that prevent data sharing. To address these issues, we propose FedGIN, a Federated Learning (FL) framework that enables multimodal organ segmentation without sharing raw patient data. Our method integrates a lightweight Global Intensity Non-linear (GIN) augmentation module that harmonizes modality-specific intensity distributions during local training. We evaluated FedGIN using two types of datasets: an imputed dataset and a complete dataset. In the limited dataset scenario, the model was initially trained using only MRI data, and CT data was added to assess its performance improvements. In the complete dataset scenario, both MRI and CT data were fully utilized for training on all clients. In the limited-data scenario, FedGIN achieved a 12 to 18% improvement in 3D Dice scores on MRI test cases compared to FL without GIN and consistently outperformed local baselines. In the complete dataset scenario, FedGIN demonstrated near-centralized performance, with a 30% Dice score improvement over the MRI-only baseline and a 10% improvement over the CT-only baseline, highlighting its strong cross-modality generalization under privacy constraints.

Plans for acceptance sampling by attributes when observations are destructive

Authors:Hugalf Bernburg, Katy Klauenberg

Date:2025-08-07 08:09:47

The international standard ISO 2859-2 provides plans for acceptance sampling by attributes, that ensure a defined quality level in isolated lots using the hypergeometric distribution. In destructive testing, the sample itself is damaged or changed such that the quality of an entire lot is less relevant than the quality of the lot that remains after removing the sample. Examples include assessing the germination of seeds and the conformity of in-service utility meters. This research highlights that the hypergeometric distribution cannot describe the frequentist consumer's risk of accepting a remaining lot with unsatisfactory quality. Consequently, sampling plans as those provided in ISO 2859-2 are ill-suited to assess the remaining lot when sampling destructively. In contrast, Bayesian statistics inherently infers the lot's quality after sampling. Using a reference prior, we show that sampling plans provided by ISO 2859-2 result in high specific consumer's risk for small remaining lots. The ISO 2859-2 being ill-suited, we design plans for destructive sampling that limit the (Bayesian) specific consumer's risk. To tabulate these plans in a similar way to ISO 2859-2, we propose a new representation that fixes the remaining lot size $N-n$ rather than the sample size $n$. This generalizable, concise and efficient representation is suitable for future standardization of destructive sampling.

Cognitive Duality for Adaptive Web Agents

Authors:Jiarun Liu, Chunhong Zhang, Zheng Hu

Date:2025-08-07 07:05:22

Web navigation represents a critical and challenging domain for evaluating artificial general intelligence (AGI), demanding complex decision-making within high-entropy, dynamic environments with combinatorially explosive action spaces. Current approaches to building autonomous web agents either focus on offline imitation learning or online exploration, but rarely integrate both paradigms effectively. Inspired by the dual-process theory of human cognition, we derive a principled decomposition into fast System 1 and slow System 2 cognitive processes. This decomposition provides a unifying perspective on existing web agent methodologies, bridging the gap between offline learning of intuitive reactive behaviors and online acquisition of deliberative planning capabilities. We implement this framework in CogniWeb, a modular agent architecture that adaptively toggles between fast intuitive processing and deliberate reasoning based on task complexity. Our evaluation on WebArena demonstrates that CogniWeb achieves competitive performance (43.96% success rate) while maintaining significantly higher efficiency (75% reduction in token usage).

Benchmarking Shortcutting Techniques for Multi-Robot-Arm Motion Planning

Authors:Philip Huang, Yorai Shaoul, Jiaoyang Li

Date:2025-08-07 04:37:40

Generating high-quality motion plans for multiple robot arms is challenging due to the high dimensionality of the system and the potential for inter-arm collisions. Traditional motion planning methods often produce motions that are suboptimal in terms of smoothness and execution time for multi-arm systems. Post-processing via shortcutting is a common approach to improve motion quality for efficient and smooth execution. However, in multi-arm scenarios, optimizing one arm's motion must not introduce collisions with other arms. Although existing multi-arm planning works often use some form of shortcutting techniques, their exact methodology and impact on performance are often vaguely described. In this work, we present a comprehensive study quantitatively comparing existing shortcutting methods for multi-arm trajectories across diverse simulated scenarios. We carefully analyze the pros and cons of each shortcutting method and propose two simple strategies for combining these methods to achieve the best performance-runtime tradeoff. Video, code, and dataset are available at https://philip-huang.github.io/mr-shortcut/.

Skin-SOAP: A Weakly Supervised Framework for Generating Structured SOAP Notes

Authors:Sadia Kamal, Tim Oates, Joy Wan

Date:2025-08-07 04:12:43

Skin carcinoma is the most prevalent form of cancer globally, accounting for over $8 billion in annual healthcare expenditures. Early diagnosis, accurate and timely treatment are critical to improving patient survival rates. In clinical settings, physicians document patient visits using detailed SOAP (Subjective, Objective, Assessment, and Plan) notes. However, manually generating these notes is labor-intensive and contributes to clinician burnout. In this work, we propose skin-SOAP, a weakly supervised multimodal framework to generate clinically structured SOAP notes from limited inputs, including lesion images and sparse clinical text. Our approach reduces reliance on manual annotations, enabling scalable, clinically grounded documentation while alleviating clinician burden and reducing the need for large annotated data. Our method achieves performance comparable to GPT-4o, Claude, and DeepSeek Janus Pro across key clinical relevance metrics. To evaluate this clinical relevance, we introduce two novel metrics MedConceptEval and Clinical Coherence Score (CCS) which assess semantic alignment with expert medical concepts and input features, respectively.

AgenticData: An Agentic Data Analytics System for Heterogeneous Data

Authors:Ji Sun, Guoliang Li, Peiyao Zhou, Yihui Ma, Jingzhe Xu, Yuan Li

Date:2025-08-07 03:33:59

Existing unstructured data analytics systems rely on experts to write code and manage complex analysis workflows, making them both expensive and time-consuming. To address these challenges, we introduce AgenticData, an innovative agentic data analytics system that allows users to simply pose natural language (NL) questions while autonomously analyzing data sources across multiple domains, including both unstructured and structured data. First, AgenticData employs a feedback-driven planning technique that automatically converts an NL query into a semantic plan composed of relational and semantic operators. We propose a multi-agent collaboration strategy by utilizing a data profiling agent for discovering relevant data, a semantic cross-validation agent for iterative optimization based on feedback, and a smart memory agent for maintaining short-term context and long-term knowledge. Second, we propose a semantic optimization model to refine and execute semantic plans effectively. Our system, AgenticData, has been tested using three benchmarks. Experimental results showed that AgenticData achieved superior accuracy on both easy and difficult tasks, significantly outperforming state-of-the-art methods.

Hierarchical Deep Deterministic Policy Gradient for Autonomous Maze Navigation of Mobile Robots

Authors:Wenjie Hu, Ye Zhou, Hann Woei Ho

Date:2025-08-07 03:06:22

Maze navigation is a fundamental challenge in robotics, requiring agents to traverse complex environments efficiently. While the Deep Deterministic Policy Gradient (DDPG) algorithm excels in control tasks, its performance in maze navigation suffers from sparse rewards, inefficient exploration, and long-horizon planning difficulties, often leading to low success rates and average rewards, sometimes even failing to achieve effective navigation. To address these limitations, this paper proposes an efficient Hierarchical DDPG (HDDPG) algorithm, which includes high-level and low-level policies. The high-level policy employs an advanced DDPG framework to generate intermediate subgoals from a long-term perspective and on a higher temporal scale. The low-level policy, also powered by the improved DDPG algorithm, generates primitive actions by observing current states and following the subgoal assigned by the high-level policy. The proposed method enhances stability with off-policy correction, refining subgoal assignments by relabeling historical experiences. Additionally, adaptive parameter space noise is utilized to improve exploration, and a reshaped intrinsic-extrinsic reward function is employed to boost learning efficiency. Further optimizations, including gradient clipping and Xavier initialization, are employed to improve robustness. The proposed algorithm is rigorously evaluated through numerical simulation experiments executed using the Robot Operating System (ROS) and Gazebo. Regarding the three distinct final targets in autonomous maze navigation tasks, HDDPG significantly overcomes the limitations of standard DDPG and its variants, improving the success rate by at least 56.59% and boosting the average reward by a minimum of 519.03 compared to baseline algorithms.

Optimal Planning for Multi-Robot Simultaneous Area and Line Coverage Using Hierarchical Cyclic Merging Regulation

Authors:Tianyuan Zheng, Jingang Yi, Kaiyan Yu

Date:2025-08-07 02:27:32

The double coverage problem focuses on determining efficient, collision-free routes for multiple robots to simultaneously cover linear features (e.g., surface cracks or road routes) and survey areas (e.g., parking lots or local regions) in known environments. In these problems, each robot carries two functional roles: service (linear feature footprint coverage) and exploration (complete area coverage). Service has a smaller operational footprint but incurs higher costs (e.g., time) compared to exploration. We present optimal planning algorithms for the double coverage problems using hierarchical cyclic merging regulation (HCMR). To reduce the complexity for optimal planning solutions, we analyze the manifold attachment process during graph traversal from a Morse theory perspective. We show that solutions satisfying minimum path length and collision-free constraints must belong to a Morse-bounded collection. To identify this collection, we introduce the HCMR algorithm. In HCMR, cyclic merging search regulates traversal behavior, while edge sequence back propagation converts these regulations into graph edge traversal sequences. Incorporating balanced partitioning, the optimal sequence is selected to generate routes for each robot. We prove the optimality of the HCMR algorithm under a fixed sweep direction. The multi-robot simulation results demonstrate that the HCMR algorithm significantly improves planned path length by at least 10.0%, reduces task time by at least 16.9% in average, and ensures conflict-free operation compared to other state-of-the-art planning methods.

INTENTION: Inferring Tendencies of Humanoid Robot Motion Through Interactive Intuition and Grounded VLM

Authors:Jin Wang, Weijie Wang, Boyuan Deng, Heng Zhang, Rui Dai, Nikos Tsagarakis

Date:2025-08-06 23:27:22

Traditional control and planning for robotic manipulation heavily rely on precise physical models and predefined action sequences. While effective in structured environments, such approaches often fail in real-world scenarios due to modeling inaccuracies and struggle to generalize to novel tasks. In contrast, humans intuitively interact with their surroundings, demonstrating remarkable adaptability, making efficient decisions through implicit physical understanding. In this work, we propose INTENTION, a novel framework enabling robots with learned interactive intuition and autonomous manipulation in diverse scenarios, by integrating Vision-Language Models (VLMs) based scene reasoning with interaction-driven memory. We introduce Memory Graph to record scenes from previous task interactions which embodies human-like understanding and decision-making about different tasks in real world. Meanwhile, we design an Intuitive Perceptor that extracts physical relations and affordances from visual scenes. Together, these components empower robots to infer appropriate interaction behaviors in new scenes without relying on repetitive instructions. Videos: https://robo-intention.github.io

Minicharged Particle Sensitivity of the MAPP Outrigger Detector

Authors:Matti Kalliokoski, Vasiliki A. Mitsou, Marc de Montigny, Abhinab Mukhopadhyay, Pierre-Philippe A. Ouimet, James Pinfold, Ameir Shaa, Michael Staelens

Date:2025-08-06 21:54:28

We present a detailed study of the projected background-free sensitivity of the MAPP Outrigger Detector (OD) to minicharged particles (mCPs) at the High-Luminosity Large Hadron Collider (HL-LHC). As the first upgrade to the MAPP Experiment, the MAPP OD is a standalone detector designed to offer enhanced sensitivity to high-mass mCPs with intermediate effective charges. The MAPP OD is planned for installation in a duct adjacent to the MAPP-1 detector, located between the LHC's UA83 gallery and the beamline. Considering mCP production via the Drell-Yan mechanism and various meson decays, the results show that, at the 95% confidence level, the MAPP OD can extend the experiment's upper mass reach to mCP masses of approximately 200 GeV at the HL-LHC.

Retrieval-Augmented Water Level Forecasting for Everglades

Authors:Rahuul Rangaraj, Jimeng Shi, Rajendra Paudel, Giri Narasimhan, Yanzhao Wu

Date:2025-08-06 21:27:12

Accurate water level forecasting is crucial for managing ecosystems such as the Everglades, a subtropical wetland vital for flood mitigation, drought management, water resource planning, and biodiversity conservation. While recent advances in deep learning, particularly time series foundation models, have demonstrated success in general-domain forecasting, their application in hydrology remains underexplored. Furthermore, they often struggle to generalize across diverse unseen datasets and domains, due to the lack of effective mechanisms for adaptation. To address this gap, we introduce Retrieval-Augmented Forecasting (RAF) into the hydrology domain, proposing a framework that retrieves historically analogous multivariate hydrological episodes to enrich the model input before forecasting. By maintaining an external archive of past observations, RAF identifies and incorporates relevant patterns from historical data, thereby enhancing contextual awareness and predictive accuracy without requiring the model for task-specific retraining or fine-tuning. Furthermore, we explore and compare both similarity-based and mutual information-based RAF methods. We conduct a comprehensive evaluation on real-world data from the Everglades, demonstrating that the RAF framework yields substantial improvements in water level forecasting accuracy. This study highlights the potential of RAF approaches in environmental hydrology and paves the way for broader adoption of adaptive AI methods by domain experts in ecosystem management. The code and data are available at https://github.com/rahuul2992000/WaterRAF.

Ab Initio Study of $^7$Li with Coupled $^6$Li + $n$ and $^6$He + $p$ Mass Partitions

Authors:Jakub Herko, Konstantinos Kravvaris, Petr Navrátil, Sofia Quaglioni, Guillaume Hupin, Mark A. Caprio

Date:2025-08-06 21:08:01

Background: Lithium plays an important role in nuclear astrophysics, fusion energy generation, and nuclear technology. From a theoretical point of view, the nucleus $^7$Li presents a remarkable challenge, as its bound states and resonances can be understood as being formed by a $^4$He and $^3$H pair, or simultaneously, a single neutron/proton coupled to a $^6$Li/$^6$He core. In light of this complexity, a consistent description of $^7$Li bound-state and continuum properties in a unified model presents a significant advancement towards a predictive theory of nuclear structure and reactions. Purpose: Towards achieving such a predictive description, we carry out calculations for $^7$Li within an ab initio framework, taking into account the mass/charge partitions $^6$Li + $n$ and $^6$He + $p$ in a single coupled-channel calculation. This approach allows us to both investigate the effects of the coupling between partitions on the spectrum of $^7$Li, and calculate the cross sections for the $^6$Li($n,p)^6$He and $^6$He($p,n)^6$Li reactions. Method: We use the no-core shell model with continuum, which is capable of describing both bound and scattering states in a unified framework. Results: Our calculation reproduces all the experimentally observed states of $^7$Li in the correct order and predicts new resonances. We also calculated the cross section of the reaction $^6$He($p,n)^6$Li and differential cross sections of the elastic scattering of protons on $^6$He to provide theoretical results for comparison with planned experiments. Conclusions: The overall shape of the cross section of the reaction $^6$Li$(n,p)^6$He as a function of energy is reproduced, although the absolute magnitude is overestimated due to omission of the ($n,\alpha$) reaction channel. A more accurate description of the cross section is achieved by phenomenological adjustmenting the energies of resonances.

Sequence Aware SAC Control for Engine Fuel Consumption Optimization in Electrified Powertrain

Authors:Wafeeq Jaleel, Md Ragib Rownak, Athar Hanif, Sidra Ghayour Bhatti, Qadeer Ahmed

Date:2025-08-06 20:53:11

As hybrid electric vehicles (HEVs) gain traction in heavy-duty trucks, adaptive and efficient energy management is critical for reducing fuel consumption while maintaining battery charge for long operation times. We present a new reinforcement learning (RL) framework based on the Soft Actor-Critic (SAC) algorithm to optimize engine control in series HEVs. We reformulate the control task as a sequential decision-making problem and enhance SAC by incorporating Gated Recurrent Units (GRUs) and Decision Transformers (DTs) into both actor and critic networks to capture temporal dependencies and improve planning over time. To evaluate robustness and generalization, we train the models under diverse initial battery states, drive cycle durations, power demands, and input sequence lengths. Experiments show that the SAC agent with a DT-based actor and GRU-based critic was within 1.8% of Dynamic Programming (DP) in fuel savings on the Highway Fuel Economy Test (HFET) cycle, while the SAC agent with GRUs in both actor and critic networks, and FFN actor-critic agent were within 3.16% and 3.43%, respectively. On unseen drive cycles (US06 and Heavy Heavy-Duty Diesel Truck (HHDDT) cruise segment), generalized sequence-aware agents consistently outperformed feedforward network (FFN)-based agents, highlighting their adaptability and robustness in real-world settings.

BTPG-max: Achieving Local Maximal Bidirectional Pairs for Bidirectional Temporal Plan Graphs

Authors:Yifan Su, Rishi Veerapaneni, Jiaoyang Li

Date:2025-08-06 19:51:37

Multi-Agent Path Finding (MAPF) requires computing collision-free paths for multiple agents in shared environment. Most MAPF planners assume that each agent reaches a specific location at a specific timestep, but this is infeasible to directly follow on real systems where delays often occur. To address collisions caused by agents deviating due to delays, the Temporal Plan Graph (TPG) was proposed, which converts a MAPF time dependent solution into a time independent set of inter-agent dependencies. Recently, a Bidirectional TPG (BTPG) was proposed which relaxed some dependencies into ``bidirectional pairs" and improved efficiency of agents executing their MAPF solution with delays. Our work improves upon this prior work by designing an algorithm, BPTG-max, that finds more bidirectional pairs. Our main theoretical contribution is in designing the BTPG-max algorithm is locally optimal, i.e. which constructs a BTPG where no additional bidirectional pairs can be added. We also show how in practice BTPG-max leads to BTPGs with significantly more bidirectional edges, superior anytime behavior, and improves robustness to delays.

RoboTron-Sim: Improving Real-World Driving via Simulated Hard-Case

Authors:Baihui Xiao, Chengjian Feng, Zhijian Huang, Feng yan, Yujie Zhong, Lin Ma

Date:2025-08-06 17:07:25

Collecting real-world data for rare high-risk scenarios, long-tailed driving events, and complex interactions remains challenging, leading to poor performance of existing autonomous driving systems in these critical situations. In this paper, we propose RoboTron-Sim that improves real-world driving in critical situations by utilizing simulated hard cases. First, we develop a simulated dataset called Hard-case Augmented Synthetic Scenarios (HASS), which covers 13 high-risk edge-case categories, as well as balanced environmental conditions such as day/night and sunny/rainy. Second, we introduce Scenario-aware Prompt Engineering (SPE) and an Image-to-Ego Encoder (I2E Encoder) to enable multimodal large language models to effectively learn real-world challenging driving skills from HASS, via adapting to environmental deviations and hardware differences between real-world and simulated scenarios. Extensive experiments on nuScenes show that RoboTron-Sim improves driving performance in challenging scenarios by around 50%, achieving state-of-the-art results in real-world open-loop planning. Qualitative results further demonstrate the effectiveness of RoboTron-Sim in better managing rare high-risk driving scenarios. Project page: https://stars79689.github.io/RoboTron-Sim/

Face-voice Association in Multilingual Environments (FAME) 2026 Challenge Evaluation Plan

Authors:Marta Moscati, Ahmed Abdullah, Muhammad Saad Saeed, Shah Nawaz, Rohan Kumar Das, Muhammad Zaigham Zaheer, Junaid Mir, Muhammad Haroon Yousaf, Khalid Malik, Markus Schedl

Date:2025-08-06 16:09:47

The advancements of technology have led to the use of multimodal systems in various real-world applications. Among them, audio-visual systems are among the most widely used multimodal systems. In the recent years, associating face and voice of a person has gained attention due to the presence of unique correlation between them. The Face-voice Association in Multilingual Environments (FAME) 2026 Challenge focuses on exploring face-voice association under the unique condition of a multilingual scenario. This condition is inspired from the fact that half of the world's population is bilingual and most often people communicate under multilingual scenarios. The challenge uses a dataset named Multilingual Audio-Visual (MAV-Celeb) for exploring face-voice association in multilingual environments. This report provides the details of the challenge, dataset, baseline models, and task details for the FAME Challenge.

Augmentation-based Domain Generalization and Joint Training from Multiple Source Domains for Whole Heart Segmentation

Authors:Franz Thaler, Darko Stern, Gernot Plank, Martin Urschler

Date:2025-08-06 15:37:22

As the leading cause of death worldwide, cardiovascular diseases motivate the development of more sophisticated methods to analyze the heart and its substructures from medical images like Computed Tomography (CT) and Magnetic Resonance (MR). Semantic segmentations of important cardiac structures that represent the whole heart are useful to assess patient-specific cardiac morphology and pathology. Furthermore, accurate semantic segmentations can be used to generate cardiac digital twin models which allows e.g. electrophysiological simulation and personalized therapy planning. Even though deep learning-based methods for medical image segmentation achieved great advancements over the last decade, retaining good performance under domain shift -- i.e. when training and test data are sampled from different data distributions -- remains challenging. In order to perform well on domains known at training-time, we employ a (1) balanced joint training approach that utilizes CT and MR data in equal amounts from different source domains. Further, aiming to alleviate domain shift towards domains only encountered at test-time, we rely on (2) strong intensity and spatial augmentation techniques to greatly diversify the available training data. Our proposed whole heart segmentation method, a 5-fold ensemble with our contributions, achieves the best performance for MR data overall and a performance similar to the best performance for CT data when compared to a model trained solely on CT. With 93.33% DSC and 0.8388 mm ASSD for CT and 89.30% DSC and 1.2411 mm ASSD for MR data, our method demonstrates great potential to efficiently obtain accurate semantic segmentations from which patient-specific cardiac twin models can be generated.

Behaviorally Adaptive Multi-Robot Hazard Localization in Failure-Prone, Communication-Denied Environments

Authors:Alkesh K. Srivastava, Aamodh Suresh, Carlos Nieto-Granda

Date:2025-08-06 15:23:22

We address the challenge of multi-robot autonomous hazard mapping in high-risk, failure-prone, communication-denied environments such as post-disaster zones, underground mines, caves, and planetary surfaces. In these missions, robots must explore and map hazards while minimizing the risk of failure due to environmental threats or hardware limitations. We introduce a behavior-adaptive, information-theoretic planning framework for multi-robot teams grounded in the concept of Behavioral Entropy (BE), that generalizes Shannon entropy (SE) to capture diverse human-like uncertainty evaluations. Building on this formulation, we propose the Behavior-Adaptive Path Planning (BAPP) framework, which modulates information gathering strategies via a tunable risk-sensitivity parameter, and present two planning algorithms: BAPP-TID for intelligent triggering of high-fidelity robots, and BAPP-SIG for safe deployment under high risk. We provide theoretical insights on the informativeness of the proposed BAPP framework and validate its effectiveness through both single-robot and multi-robot simulations. Our results show that the BAPP stack consistently outperforms Shannon-based and random strategies: BAPP-TID accelerates entropy reduction, while BAPP-SIG improves robot survivability with minimal loss in information gain. In multi-agent deployments, BAPP scales effectively through spatial partitioning, mobile base relocation, and role-aware heterogeneity. These findings underscore the value of behavior-adaptive planning for robust, risk-sensitive exploration in complex, failure-prone environments.