planning - 2025-07-03

Modeling the Deterioration of Pavement Skid Resistance and Surface Texture After Preventive Maintenance

Authors:Lu Gao, Zia Din, Kinam Kim, Ahmed Senouci

Date:2025-07-02 15:57:52

This study investigates the deterioration of skid resistance and surface macrotexture following preventive maintenance using micro-milling techniques. Field data were collected from 31 asphalt pavement sections located across four climatic zones in Texas, encompassing a variety of surface types, milling depths, operational speeds, and drum configurations. A standardized data collection protocol was followed, with measurements taken before milling, immediately after treatment, and at 3, 6, 12, and 18 months post-treatment. Skid number and Mean Profile Depth (MPD) were used to evaluate surface friction and texture characteristics. The dataset was reformatted into a time-series structure with 930 observations, incorporating contextual variables such as climatic zone, treatment parameters, and baseline surface condition. A comparative modeling framework was applied to predict the deterioration trends of both skid resistance and macrotexture over time. Eight regression models, including linear, tree-based, and ensemble methods, were evaluated alongside a sequence-to-one transformer model. Results show that the transformer model achieved the highest prediction accuracy for skid resistance (R2 = 0.981), while Random Forest performed best for macrotexture prediction (R2 = 0.838). The findings indicate that the degradation of surface characteristics after preventive maintenance is nonlinear and influenced by a combination of environmental and operational factors. This study demonstrates the effectiveness of data-driven modeling in supporting transportation agencies with pavement performance forecasting and maintenance planning.

Probabilistic Proton Treatment Planning: a novel approach for optimizing underdosage and overdosage probabilities of target and organ structures

Authors:Jelte R. de Jong, Sebastiaan Breedveld, Steven J. M. Habraken, Mischa S. Hoogeman, Danny Lathouwers, Zoltán Perkó

Date:2025-07-02 14:45:43

Treatment planning uncertainties are typically managed using margin-based or robust optimization. Margin-based methods expand the clinical target volume (CTV) to a planning target volume, generally unsuited for proton therapy. Robust optimization considers worst-case scenarios, but its quality depends on the uncertainty scenario set: excluding extremes reduces robustness, while too many make plans overly conservative. Probabilistic optimization overcomes these limits by modeling a continuous scenario distribution. We propose a novel probabilistic optimization approach that steers plans toward individualized probability levels to control CTV and organs-at-risk (OARs) under- and overdosage. Voxel-wise dose percentiles ($d$) are estimated by expected value ($E$) and standard deviation (SD) as $E[d] \pm \delta \cdot SD[d]$, where $\delta$ is iteratively tuned to match the target percentile given Gaussian-distributed setup (3 mm) and range (3%) uncertainties. The method involves an inner optimization of $E[d] \pm \delta \cdot SD[d]$ for fixed $\delta$, and an outer loop updating $\delta$. Polynomial Chaos Expansion (PCE) provides accurate and efficient dose estimates during optimization. We validated the method on a spherical CTV abutted by an OAR in different directions and a horseshoe-shaped CTV surrounding a cylindrical spine. For spherical cases with similar CTV coverage, $P(D_{2\%} > 30 Gy)$ dropped by 10-15%; for matched OAR dose, $P(D_{98\%} > 57 Gy)$ increased by 67.5-71%. In spinal plans, $P(D_{98\%} > 57 Gy)$ increased by 10-15% while $P(D_{2\%} > 30 Gy)$ dropped 24-28%. Probabilistic and robust optimization times were comparable for spherical (hours) but longer for spinal cases (7.5 - 11.5 h vs. 9 - 20 min). Compared to discrete scenario-based optimization, the probabilistic method offered better OAR sparing or target coverage depending on the set priorities.
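
The percentile approximation $E[d] \pm \delta \cdot SD[d]$ described above can be illustrated for a single voxel. The following is a minimal sketch, not the authors' implementation: it assumes a hypothetical dose response `toy_voxel_dose`, samples a Gaussian setup shift with a 3 mm standard deviation, and computes the signed $\delta$ for which mean plus $\delta$ times SD reproduces the empirical percentile. It shows one $\delta$ update for a fixed dose distribution only; in the abstract, an outer loop repeats this update as the plan changes, and PCE replaces brute-force sampling.

```python
import math
import random
import statistics


def empirical_percentile(samples, p):
    """Nearest-rank p-th percentile (0 < p <= 100) of a list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def tune_delta(samples, p):
    """Signed delta such that mean + delta * SD equals the p-th percentile."""
    mean = statistics.fmean(samples)
    sd = statistics.pstdev(samples)
    return (empirical_percentile(samples, p) - mean) / sd


def toy_voxel_dose(shift_mm):
    """Hypothetical dose (Gy) in one voxel as a function of setup shift (assumption)."""
    return 60.0 * math.exp(-((shift_mm / 10.0) ** 2))


rng = random.Random(0)
# Gaussian setup error with a 3 mm standard deviation, as in the abstract.
doses = [toy_voxel_dose(rng.gauss(0.0, 3.0)) for _ in range(20000)]
delta = tune_delta(doses, 2)  # delta matching the 2nd dose percentile
print(f"delta for the 2nd percentile: {delta:.2f}")
```

For exactly Gaussian doses, $\delta$ would equal the standard normal quantile; the skewed toy response gives a different value, which is why $\delta$ has to be tuned rather than fixed in advance.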

PathDB: A system for evaluating regular path queries

Authors:Roberto García, Renzo Angles, Vicente Rojas, Sebastián Ferrada

Date:2025-07-02 14:33:05

PathDB is a Java-based graph database designed for in-memory data loading and querying. By utilizing Regular Path Queries (RPQ) and a closed path algebra, PathDB processes paths through its three main components: the parser, the logical plan, and the physical plan. This modular design allows for targeted optimizations and modifications without impacting overall functionality. Benchmark experiments illustrate PathDB's execution times and flexibility in handling dynamic and complex path queries, compared to baseline methods like Depth-First Search (DFS) and Breadth-First Search (BFS) guided by an automaton, highlighting its optimizations that contribute to its performance.
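
The automaton-guided BFS baseline mentioned above can be sketched as a search over (graph node, automaton state) pairs. This is an illustrative sketch of the baseline idea only, not PathDB's path algebra or its Java code; the function name `rpq_bfs`, the edge-list format, and the NFA encoding are assumptions made for the example.

```python
from collections import deque


def rpq_bfs(edges, start_node, nfa, nfa_start, nfa_accept):
    """Return all nodes reachable from start_node along a path whose label
    sequence is accepted by the NFA.

    edges: iterable of (source, label, target) triples.
    nfa:   dict mapping (state, label) -> set of next states.
    """
    adjacency = {}
    for src, label, dst in edges:
        adjacency.setdefault(src, []).append((label, dst))

    seen = {(start_node, nfa_start)}
    queue = deque(seen)
    answers = set()
    while queue:
        node, state = queue.popleft()
        if state in nfa_accept:
            answers.add(node)
        for label, nxt in adjacency.get(node, []):
            for nxt_state in nfa.get((state, label), ()):
                pair = (nxt, nxt_state)
                if pair not in seen:  # visit each product state once
                    seen.add(pair)
                    queue.append(pair)
    return answers


# Query "knows* worksAt": state 0 loops on knows, worksAt moves to accepting state 1.
nfa = {(0, "knows"): {0}, (0, "worksAt"): {1}}
edges = [
    ("a", "knows", "b"),
    ("b", "knows", "c"),
    ("b", "knows", "a"),  # cycle; handled by the seen set
    ("a", "worksAt", "Y"),
    ("c", "worksAt", "X"),
]
print(sorted(rpq_bfs(edges, "a", nfa, 0, {1})))
```

Tracking visited (node, state) pairs rather than visited nodes is what lets the search terminate on cyclic graphs while still following the regular expression.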

Entropic optimal transport beyond product reference couplings: the Gaussian case on Euclidean space

Authors:Paul Freulon, Nikitas Georgakis, Victor Panaretos

Date:2025-07-02 13:40:21

The optimal transport problem with squared Euclidean cost consists in finding a coupling between two input measures that maximizes correlation. Consequently, the optimal coupling is often singular with respect to Lebesgue measure. Regularizing the optimal transport problem with an entropy term yields an approximation called entropic optimal transport. Entropic penalties steer the induced coupling toward a reference measure with desired properties. For instance, when seeking a diffuse coupling, the most popular reference measures are the Lebesgue measure and the product of the two input measures. In this work, we study the case where the reference coupling is not necessarily assumed to be a product. We focus on the Gaussian case as a motivating paradigm, and provide a reduction of this more general optimal transport criterion to a matrix optimization problem. This reduction enables us to provide a complete description of the solution, both in terms of the primal variable and the dual variables. We argue that flexibility in terms of the reference measure can be important in statistical contexts, for instance when one has prior information, when there is uncertainty regarding the measures to be coupled, or to reduce bias when the entropic problem is used to estimate the un-regularized transport problem. In particular, we show in numerical examples that choosing a suitable reference plan makes it possible to reduce the bias caused by the entropic penalty.
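
The role of the reference coupling can be made concrete in a discrete setting. The sketch below is a finite analogue, not the Gaussian closed-form solution studied in the paper: it minimizes $\langle C, P \rangle + \varepsilon \, \mathrm{KL}(P \,\|\, R)$ over couplings $P$ with marginals $a$ and $b$ using Sinkhorn scaling of the kernel $R \odot e^{-C/\varepsilon}$. The function name and the toy inputs are assumptions, and the reference $R$ is assumed to have strictly positive entries.

```python
import math


def sinkhorn_with_reference(cost, ref, a, b, eps, iters=500):
    """Entropic OT with a general reference coupling `ref` (positive entries).

    Returns the coupling P = diag(u) * (ref * exp(-cost / eps)) * diag(v)
    whose marginals approximate a (rows) and b (columns).
    """
    n, m = len(a), len(b)
    kernel = [[ref[i][j] * math.exp(-cost[i][j] / eps) for j in range(m)]
              for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


points = [0.0, 1.0, 2.0]
cost = [[(x - y) ** 2 for y in points] for x in points]
# A non-product (positively correlated) reference coupling; toy values.
ref = [[0.20, 0.10, 0.05],
       [0.10, 0.20, 0.10],
       [0.05, 0.10, 0.10]]
plan = sinkhorn_with_reference(cost, ref, [0.5, 0.3, 0.2], [0.2, 0.3, 0.5], eps=2.0)
```

Setting `ref` to the outer product of `a` and `b` recovers the usual product-reference formulation, while a very large `eps` pulls the plan toward the reference itself.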

Efficient Collision Detection for Long and Slender Robotic Links in Euclidean Distance Fields: Application to a Forestry Crane

Authors:Marc-Philip Ecker, Bernhard Bischof, Minh Nhat Vu, Christoph Fröhlich, Tobias Glück, Wolfgang Kemmetmüller

Date:2025-07-02 13:37:08

Collision-free motion planning in complex outdoor environments relies heavily on perceiving the surroundings through exteroceptive sensors. A widely used approach represents the environment as a voxelized Euclidean distance field, where robots are typically approximated by spheres. However, for large-scale manipulators such as forestry cranes, which feature long and slender links, this conventional spherical approximation becomes inefficient and inaccurate. This work presents a novel collision detection algorithm specifically designed to exploit the elongated structure of such manipulators, significantly enhancing the computational efficiency of motion planning algorithms. Unlike traditional sphere decomposition methods, our approach not only improves computational efficiency but also naturally eliminates the need to fine-tune the approximation accuracy as an additional parameter. We validate the algorithm's effectiveness using real-world LiDAR data from a forestry crane application, as well as simulated environment data.
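
For context, the conventional sphere-based check that the abstract contrasts against can be sketched as follows. This is not the paper's algorithm: it is the baseline idea of sampling sphere centers along a link and comparing the distance-field value at each center with the sphere radius. The analytic `point_obstacle_distance` stands in for a voxelized distance-field lookup, and all names are assumptions for illustration.

```python
import math


def point_obstacle_distance(obstacle):
    """Distance function for one point obstacle (stand-in for a voxel grid lookup)."""
    def dist(p):
        return math.dist(p, obstacle)
    return dist


def link_collides(dist_fn, start, end, radius, n_spheres):
    """Check a straight link approximated by n_spheres (>= 2) equally spaced spheres.

    A sphere collides when the distance field at its center is <= its radius.
    """
    for k in range(n_spheres):
        t = k / (n_spheres - 1)
        center = tuple(s + t * (e - s) for s, e in zip(start, end))
        if dist_fn(center) <= radius:
            return True
    return False


field = point_obstacle_distance((5.0, 0.5, 0.0))
link = ((0.0, 0.0, 0.0), (10.0, 0.0, 0.0))  # a 10 m long, slender link
fine = link_collides(field, *link, radius=0.6, n_spheres=11)
coarse = link_collides(field, *link, radius=0.6, n_spheres=2)
print(fine, coarse)
```

With too few spheres the link is not fully covered and the obstacle near its midpoint is missed, while many spheres cost many field lookups; this count is the approximation-accuracy parameter that the abstract says the proposed method avoids tuning.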

An RRT* algorithm based on Riemannian metric model for optimal path planning

Authors:Yu Zhang, Qi Zhou, Xiao-Song Yang

Date:2025-07-02 13:26:33

This paper presents a Riemannian metric-based model to solve the optimal path planning problem on two-dimensional smooth submanifolds in high-dimensional space. Our model is based on constructing a new Riemannian metric on a two-dimensional projection plane, which is induced by the high-dimensional Euclidean metric on the two-dimensional smooth submanifold and reflects the environmental information of the robot. The optimal path planning problem in high-dimensional space is therefore transformed into a geometric problem on the two-dimensional plane with the new Riemannian metric. Based on the new Riemannian metric, we propose an incremental algorithm, RRT*-R, on the projection plane. The experimental results show that the proposed algorithm is suitable for scenarios with uneven fields in multiple dimensions. The proposed algorithm can help the robot to effectively avoid areas with drastic changes in height, ground resistance and other environmental factors. More importantly, the RRT*-R algorithm shows better smoothness and optimization properties compared with the original RRT* algorithm using Euclidean distance in high-dimensional workspace. The length of the entire path by RRT*-R is a good approximation of the theoretical minimum geodesic distance on the projection plane.
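
The idea of measuring path cost with an induced metric on the projection plane can be illustrated for the simplest case, a height surface $z = f(x, y)$ in three dimensions. There the Euclidean metric induces $g = I + \nabla f \, \nabla f^{T}$ on the $(x, y)$ plane. The sketch below computes the length of a planar polyline under this metric; it is an assumption-laden illustration that covers height only, not the paper's full model (which also encodes other environmental factors) and not RRT*-R itself.

```python
import math


def induced_metric(f, x, y, h=1e-5):
    """Metric g = I + grad(f) grad(f)^T at (x, y), using central differences."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return ((1 + fx * fx, fx * fy), (fx * fy, 1 + fy * fy))


def riemannian_length(f, waypoints, steps=100):
    """Length of a planar polyline measured with the induced metric."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(waypoints, waypoints[1:]):
        dx = (x1 - x0) / steps
        dy = (y1 - y0) / steps
        for k in range(steps):
            # Evaluate the metric at the midpoint of each small step.
            xm = x0 + (k + 0.5) * dx
            ym = y0 + (k + 0.5) * dy
            (g11, g12), (_, g22) = induced_metric(f, xm, ym)
            total += math.sqrt(g11 * dx * dx + 2 * g12 * dx * dy + g22 * dy * dy)
    return total


def ramp(x, y):
    """Toy terrain: a constant 45-degree slope along x."""
    return x


print(riemannian_length(ramp, [(0.0, 0.0), (1.0, 0.0)]))
```

A planner that uses this length as its edge cost in place of the planar Euclidean distance penalizes steep regions automatically, which matches the avoidance behavior described in the abstract.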

Ground calibration plan for the Athena/X-IFU microcalorimeter spectrometer

Authors:Alexeï Molin, François Pajot, Marc Audard, Marco Barbera, Sophie Beaumont, Edoardo Cucchetti, Matteo D'Andrea, Christophe Daniel, Roland den Hartog, Megan E. Eckart, Philippe Ferrando, Luciano Gottardi, Maurice Leutenegger, Simone Lotti, Lorenzo Natalucci, Philippe Peille, Jelle de Plaa, Etienne Pointecouteau, Scott Porter, Kosuke Sato, Joern Wilms, Vincent Albouys, Didier Barret, Massimo Cappi, Jan-Willem den Herder, Luigi Piro, Aurora Simionescu

Date:2025-07-02 09:29:50

The X-ray Integral Field Unit is the X-ray imaging spectrometer on-board one of ESA's next large missions, Athena. Athena is set to investigate the theme of the Hot and Energetic Universe, with a launch planned in the late-2030s. Based on a high sensitivity Transition Edge Sensor (TES) detector array operated at very low temperature (50 mK), X-IFU will provide spatially resolved high resolution spectroscopy of the X-ray sky in the 0.2-12 keV energy band, with an energy resolution goal of 4 eV up to 7 keV [3 eV design goal]. This paper presents the current calibration plan of the X-IFU. It provides the requirements applicable to the X-IFU calibration, describes the overall calibration strategy, and details the procedure and sources needed for the ground calibration of each parameter or characteristic of the X-IFU.

Quantum-Assisted Automatic Path-Planning for Robotic Quality Inspection in Industry 4.0

Authors:Eneko Osaba, Estibaliz Garrote, Pablo Miranda-Rodriguez, Alessia Ciacco, Itziar Cabanes, Aitziber Mancisidor

Date:2025-07-02 08:21:52

This work explores the application of hybrid quantum-classical algorithms to optimize robotic inspection trajectories derived from Computer-Aided Design (CAD) models in industrial settings. By modeling the task as a 3D variant of the Traveling Salesman Problem, incorporating incomplete graphs and open-route constraints, this study evaluates the performance of two D-Wave-based solvers against classical methods such as GUROBI and Google OR-Tools. Results across five real-world cases demonstrate competitive solution quality with significantly reduced computation times, highlighting the potential of quantum approaches in automation under Industry 4.0.
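
The problem structure described above (an open route on an incomplete graph) can be shown with a tiny exhaustive solver. This is only a classical brute-force sketch for very small instances, not the D-Wave formulation, GUROBI, or OR-Tools models evaluated in the paper; the function name and the toy weights are assumptions.

```python
import math
from itertools import permutations


def best_open_route(n_nodes, weights):
    """Cheapest route visiting every node once, with free start and end points.

    weights maps frozenset({i, j}) -> cost; a missing pair means no edge.
    """
    best_cost, best_route = math.inf, None
    for route in permutations(range(n_nodes)):
        if route[0] > route[-1]:
            continue  # a route and its reverse cost the same; keep one
        cost = 0.0
        for i, j in zip(route, route[1:]):
            w = weights.get(frozenset((i, j)))
            if w is None:  # incomplete graph: this hop is not allowed
                cost = math.inf
                break
            cost += w
        if cost < best_cost:
            best_cost, best_route = cost, route
    return best_cost, best_route


# Toy inspection points; the pair (1, 3) has no connecting edge.
weights = {
    frozenset((0, 1)): 1.0,
    frozenset((1, 2)): 1.0,
    frozenset((2, 3)): 1.0,
    frozenset((0, 2)): 5.0,
    frozenset((0, 3)): 10.0,
}
print(best_open_route(4, weights))
```

Exhaustive search grows factorially with the number of points, which is the practical motivation for the heuristic, exact, and hybrid quantum-classical solvers compared in the study.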

A Path Planning Model for Intercepting a Moving Target with Finite Obstacle Avoidance

Authors:Masuda Akter, Mohammed Mustafa Rizvi

Date:2025-07-02 08:06:07

This paper investigates the problem of computing a two-dimensional optimal curvature straight line (CS) shortest path for an unmanned aerial vehicle (UAV) to intercept a moving target, with both the UAV (pursuer) and target travelling at constant speeds. We formulate an optimal control problem that integrates two critical objectives: avoiding static obstacles and successfully intercepting the target. The approach introduces constraints derived from obstacle avoidance and target interception requirements. A geometric framework is developed, along with sufficient conditions for path optimality under the imposed constraints. The problem is initially examined in the presence of a single obstacle and later extended to scenarios involving a finite number of obstacles. Numerical experiments are carried out to evaluate the performance and efficiency of the proposed model using illustrative examples. Finally, we present a realistic case study using actual geographic data, including obstacle placement, target trajectory, and heading angles, to demonstrate the practical applicability and effectiveness of the proposed method in real-world scenarios.
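
The interception requirement with constant speeds has a simple closed form when obstacles and curvature limits are ignored. The sketch below solves $\lVert T_0 + v t - P_0 \rVert = s t$ for the earliest positive time $t$; it is a simplified illustration of the interception constraint only, not the paper's CS path model or its optimal control formulation, and the function name is an assumption.

```python
import math


def intercept_time(pursuer, speed, target, target_vel):
    """Earliest t > 0 at which a straight-line pursuer at constant speed
    reaches a target moving with constant velocity, or None if impossible."""
    rx, ry = target[0] - pursuer[0], target[1] - pursuer[1]
    vx, vy = target_vel
    # Squaring |r + v t| = speed * t gives a t^2 + b t + c = 0.
    a = vx * vx + vy * vy - speed * speed
    b = 2.0 * (rx * vx + ry * vy)
    c = rx * rx + ry * ry
    if c == 0.0:
        return 0.0  # already at the target
    if abs(a) < 1e-12:  # equal speeds: the equation is linear
        return -c / b if b < 0 else None
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return None
    root = math.sqrt(disc)
    times = [t for t in ((-b - root) / (2 * a), (-b + root) / (2 * a)) if t > 0]
    return min(times) if times else None


t = intercept_time((0.0, 0.0), 2.0, (10.0, 0.0), (0.0, 1.0))
if t is not None:
    meet = (10.0, 0.0 + 1.0 * t)  # target position at interception
    print(t, meet)
```

Obstacle avoidance and minimum turning radius add constraints on top of this timing condition, which is why the full problem needs the geometric and optimal control treatment described in the abstract.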

VLAD: A VLM-Augmented Autonomous Driving Framework with Hierarchical Planning and Interpretable Decision Process

Authors:Cristian Gariboldi, Hayato Tokida, Ken Kinjo, Yuki Asada, Alexander Carballo

Date:2025-07-02 01:52:40

Recent advancements in open-source Visual Language Models (VLMs) such as LLaVA, Qwen-VL, and Llama have catalyzed extensive research on their integration with diverse systems. The internet-scale general knowledge encapsulated within these models presents significant opportunities for enhancing autonomous driving perception, prediction, and planning capabilities. In this paper we propose VLAD, a vision-language autonomous driving model, which integrates a fine-tuned VLM with VAD, a state-of-the-art end-to-end system. We implement a specialized fine-tuning approach using custom question-answer datasets designed specifically to improve the spatial reasoning capabilities of the model. The enhanced VLM generates high-level navigational commands that VAD subsequently processes to guide vehicle operation. Additionally, our system produces interpretable natural language explanations of driving decisions, thereby increasing transparency and trustworthiness of the traditionally black-box end-to-end architecture. Comprehensive evaluation on the real-world nuScenes dataset demonstrates that our integrated system reduces average collision rates by 31.82% compared to baseline methodologies, establishing a new benchmark for VLM-augmented autonomous driving systems.

A Multimessenger Strategy for Downselecting the Orientations of Galactic Close White Dwarf Binaries

Authors:Naoki Seto

Date:2025-07-01 23:56:50

The planned space-based gravitational wave detector, LISA, will provide a fundamentally new means of studying the orbital alignment of close white dwarf binaries. However, due to the inherent symmetry of their gravitational wave signals, a fourfold degeneracy arises in the transverse projections of their angular momentum vectors. In this paper, we demonstrate that by incorporating timing information from electromagnetic observations, such as radial velocity modulations and light curves, this degeneracy can be reduced to twofold.

Capacity Planning and Scheduling for Jobs with Uncertainty in Resource Usage and Duration

Authors:Sunandita Patra, Mehtab Pathan, Mahmoud Mahfouz, Parisa Zehtabi, Wided Ouaja, Daniele Magazzeni, Manuela Veloso

Date:2025-07-01 22:56:08

Organizations around the world schedule jobs (programs) regularly to perform various tasks dictated by their end users. With the major movement towards using a cloud computing infrastructure, our organization follows a hybrid approach with both cloud and on-prem servers. The objective of this work is to perform capacity planning, i.e., estimate resource requirements, and job scheduling for on-prem grid computing environments. A key contribution of our approach is handling uncertainty in both resource usage and duration of the jobs, a critical aspect in the finance industry where stochastic market conditions significantly influence job characteristics. For capacity planning and scheduling, we simultaneously balance two conflicting objectives: (a) minimize resource usage, and (b) provide high quality-of-service to the end users by completing jobs by their requested deadlines. We propose approximate approaches using deterministic estimators and pair sampling-based constraint programming. Our best approach (pair sampling-based) achieves much lower peak resource usage compared to manual scheduling without compromising on the quality-of-service.

Judgment as Coordination: A Joint Systems View of Visualization Design Practice

Authors:Paul C. Parsons, Arran Ridley

Date:2025-07-01 22:08:01

Professional visualization design has become an increasingly important area of inquiry, yet much of the field's discourse remains anchored in researcher-centered contexts. Studies of design practice often focus on individual designers' decisions and reflections, offering limited insight into the collaborative and systemic dimensions of professional work. In this paper, we propose a systems-level reframing of design judgment grounded in the coordination and adaptation that sustain progress amid uncertainty, constraint, and misalignment. Drawing on sustained engagement across multiple empirical studies--including ethnographic observation of design teams and qualitative studies of individual practitioners--we identify recurring episodes in which coherence was preserved not by selecting an optimal option, but by repairing alignment, adjusting plans, and reframing goals. We interpret these dynamics through the lens of Joint Cognitive Systems, which provide tools for analyzing how judgment emerges as a distributed capacity within sociotechnical activity. This perspective surfaces often-invisible work in visualization design and offers researchers a new conceptual vocabulary for studying how design activity is sustained in practice.

Search-Based Robot Motion Planning With Distance-Based Adaptive Motion Primitives

Authors:Benjamin Kraljusic, Zlatan Ajanovic, Nermin Covic, Bakir Lacevic

Date:2025-07-01 21:33:33

This work proposes a motion planning algorithm for robotic manipulators that combines sampling-based and search-based planning methods. The core contribution of the proposed approach is the usage of burs of free configuration space (C-space) as adaptive motion primitives within the graph search algorithm. Due to their feature to adaptively expand in free C-space, burs enable more efficient exploration of the configuration space compared to fixed-sized motion primitives, significantly reducing the time to find a valid path and the number of required expansions. The algorithm is implemented within the existing SMPL (Search-Based Motion Planning Library) library and evaluated through a series of different scenarios involving manipulators with varying number of degrees-of-freedom (DoF) and environment complexity. Results demonstrate that the bur-based approach outperforms fixed-primitive planning in complex scenarios, particularly for high DoF manipulators, while achieving comparable performance in simpler scenarios.

VISTA: Open-Vocabulary, Task-Relevant Robot Exploration with Online Semantic Gaussian Splatting

Authors:Keiko Nagami, Timothy Chen, Javier Yu, Ola Shorinwa, Maximilian Adang, Carlyn Dougherty, Eric Cristofalo, Mac Schwager

Date:2025-07-01 18:35:05

We present VISTA (Viewpoint-based Image selection with Semantic Task Awareness), an active exploration method for robots to plan informative trajectories that improve 3D map quality in areas most relevant for task completion. Given an open-vocabulary search instruction (e.g., "find a person"), VISTA enables a robot to explore its environment to search for the object of interest, while simultaneously building a real-time semantic 3D Gaussian Splatting reconstruction of the scene. The robot navigates its environment by planning receding-horizon trajectories that prioritize semantic similarity to the query and exploration of unseen regions of the environment. To evaluate trajectories, VISTA introduces a novel, efficient viewpoint-semantic coverage metric that quantifies both the geometric view diversity and task relevance in the 3D scene. On static datasets, our coverage metric outperforms state-of-the-art baselines, FisherRF and Bayes' Rays, in computation speed and reconstruction quality. In quadrotor hardware experiments, VISTA achieves 6x higher success rates in challenging maps, compared to baseline methods, while matching baseline performance in less challenging maps. Lastly, we show that VISTA is platform-agnostic by deploying it on a quadrotor drone and a Spot quadruped robot. Open-source code will be released upon acceptance of the paper.

Landslide Detection and Mapping Using Deep Learning Across Multi-Source Satellite Data and Geographic Regions

Authors:Rahul A. Burange, Harsh K. Shinde, Omkar Mutyalwar

Date:2025-07-01 18:34:42

Landslides pose severe threats to infrastructure, economies, and human lives, necessitating accurate detection and predictive mapping across diverse geographic regions. With advancements in deep learning and remote sensing, automated landslide detection has become increasingly effective. This study presents a comprehensive approach integrating multi-source satellite imagery and deep learning models to enhance landslide identification and prediction. We leverage Sentinel-2 multispectral data and ALOS PALSAR-derived slope and Digital Elevation Model (DEM) layers to capture critical environmental features influencing landslide occurrences. Various geospatial analysis techniques are employed to assess the impact of terrain characteristics, vegetation cover, and rainfall on detection accuracy. Additionally, we evaluate the performance of multiple state-of-the-art deep learning segmentation models, including U-Net, DeepLabV3+, and Res-Net, to determine their effectiveness in landslide detection. The proposed framework contributes to the development of reliable early warning systems, improved disaster risk management, and sustainable land-use planning. Our findings provide valuable insights into the potential of deep learning and multi-source remote sensing in creating robust, scalable, and transferable landslide prediction models.

Geometry-aware 4D Video Generation for Robot Manipulation

Authors:Zeyi Liu, Shuang Li, Eric Cousineau, Siyuan Feng, Benjamin Burchfiel, Shuran Song

Date:2025-07-01 18:01:41

Understanding and predicting the dynamics of the physical world can enhance a robot's ability to plan and interact effectively in complex environments. While recent video generation models have shown strong potential in modeling dynamic scenes, generating videos that are both temporally coherent and geometrically consistent across camera views remains a significant challenge. To address this, we propose a 4D video generation model that enforces multi-view 3D consistency of videos by supervising the model with cross-view pointmap alignment during training. This geometric supervision enables the model to learn a shared 3D representation of the scene, allowing it to predict future video sequences from novel viewpoints based solely on the given RGB-D observations, without requiring camera poses as inputs. Compared to existing baselines, our method produces more visually stable and spatially aligned predictions across multiple simulated and real-world robotic datasets. We further show that the predicted 4D videos can be used to recover robot end-effector trajectories using an off-the-shelf 6DoF pose tracker, supporting robust robot manipulation and generalization to novel camera viewpoints.

VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers

Authors:Yating Wang, Haoyi Zhu, Mingyu Liu, Jiange Yang, Hao-Shu Fang, Tong He

Date:2025-07-01 17:59:44

In this paper, we introduce an innovative vector quantization based action tokenizer built upon the largest-scale action trajectory dataset to date, leveraging over 100 times more data than previous approaches. This extensive dataset enables our tokenizer to capture rich spatiotemporal dynamics, resulting in a model that not only accelerates inference but also generates smoother and more coherent action outputs. Once trained, the tokenizer can be seamlessly adapted to a wide range of downstream tasks in a zero-shot manner, from short-horizon reactive behaviors to long-horizon planning. A key finding of our work is that the domain gap between synthetic and real action trajectories is marginal, allowing us to effectively utilize a vast amount of synthetic data during training without compromising real-world performance. To validate our approach, we conducted extensive experiments in both simulated environments and on real robotic platforms. The results demonstrate that as the volume of synthetic trajectory data increases, the performance of our tokenizer on downstream tasks improves significantly, most notably achieving up to a 30% higher success rate on two real-world tasks in long-horizon scenarios. These findings highlight the potential of our action tokenizer as a robust and scalable solution for real-time embodied intelligence systems, paving the way for more efficient and reliable robotic control in diverse application domains. Project website: https://xiaoxiao0406.github.io/vqvla.github.io

DMCIE: Diffusion Model with Concatenation of Inputs and Errors to Improve the Accuracy of the Segmentation of Brain Tumors in MRI Images

Authors:Sara Yavari, Rahul Nitin Pandya, Jacob Furst

Date:2025-07-01 17:34:50

Accurate segmentation of brain tumors in MRI scans is essential for reliable clinical diagnosis and effective treatment planning. Recently, diffusion models have demonstrated remarkable effectiveness in image generation and segmentation tasks. This paper introduces a novel approach to corrective segmentation based on diffusion models. We propose DMCIE (Diffusion Model with Concatenation of Inputs and Errors), a novel framework for accurate brain tumor segmentation in multi-modal MRI scans. We employ a 3D U-Net to generate an initial segmentation mask, from which an error map is generated by identifying the differences between the prediction and the ground truth. The error map, concatenated with the original MRI images, is used to guide a diffusion model. Using multimodal MRI inputs (T1, T1ce, T2, FLAIR), DMCIE effectively enhances segmentation accuracy by focusing on misclassified regions, guided by the original inputs. Evaluated on the BraTS2020 dataset, DMCIE outperforms several state-of-the-art diffusion-based segmentation methods, achieving a Dice Score of 93.46 and an HD95 of 5.94 mm. These results highlight the effectiveness of error-guided diffusion in producing precise and reliable brain tumor segmentations.
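
The error-map construction and the Dice metric mentioned above can be sketched on a tiny 2D example. This is an illustration under stated assumptions, not the DMCIE code: masks are small nested lists rather than 3D volumes, the helper names are hypothetical, and the diffusion model itself is not shown.

```python
def error_map(pred, truth):
    """Binary map that is 1 wherever the predicted mask disagrees with the truth."""
    return [[int(p != t) for p, t in zip(prow, trow)]
            for prow, trow in zip(pred, truth)]


def dice_score(pred, truth):
    """Dice coefficient 2|A and B| / (|A| + |B|) for two binary masks."""
    inter = sum(p & t for prow, trow in zip(pred, truth)
                for p, t in zip(prow, trow))
    total = sum(map(sum, pred)) + sum(map(sum, truth))
    return 1.0 if total == 0 else 2.0 * inter / total


def concat_channels(modalities, err):
    """Stack the MRI modality images and the error map as conditioning channels."""
    return list(modalities) + [err]


pred = [[0, 1, 1],
        [0, 1, 0]]
truth = [[0, 1, 0],
         [0, 1, 1]]
err = error_map(pred, truth)
# Four toy "modalities" standing in for T1, T1ce, T2, and FLAIR slices.
modalities = [[[0.1] * 3 for _ in range(2)] for _ in range(4)]
conditioning = concat_channels(modalities, err)
print(err, dice_score(pred, truth), len(conditioning))
```

The error map is nonzero only at misclassified pixels, so the concatenated input highlights the regions where the initial segmentation most needs correction.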

RTMap: Real-Time Recursive Mapping with Change Detection and Localization

Authors:Yuheng Du, Sheng Yang, Lingxuan Wang, Zhenghua Hou, Chengying Cai, Zhitao Tan, Mingxia Chen, Shi-Sheng Huang, Qiang Li

Date:2025-07-01 17:32:30

While recent online HD mapping methods relieve burdened offline pipelines and solve map freshness, they remain limited by perceptual inaccuracies, occlusion in dense traffic, and an inability to fuse multi-agent observations. We propose RTMap to enhance these single-traversal methods by persistently crowdsourcing a multi-traversal HD map as a self-evolutional memory. On onboard agents, RTMap simultaneously addresses three core challenges in an end-to-end fashion: (1) uncertainty-aware positional modeling for HD map elements, (2) probabilistic-aware localization w.r.t. the crowdsourced prior-map, and (3) real-time detection for possible road structural changes. Experiments on several public autonomous driving datasets demonstrate solid performance in both prior-aided map quality and localization accuracy, showing that RTMap robustly serves downstream prediction and planning modules while asynchronously and gradually improving the accuracy and freshness of the crowdsourced prior-map. Our source code will be made publicly available at https://github.com/CN-ADLab/RTMap (Camera ready version incorporating reviewer suggestions will be updated soon).

Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact

Authors:Rizwan Qureshi, Ranjan Sapkota, Abbas Shah, Amgad Muneer, Anas Zafar, Ashmal Vayani, Maged Shoman, Abdelrahman B. M. Eldaly, Kai Zhang, Ferhat Sadak, Shaina Raza, Xinqi Fan, Ravid Shwartz-Ziv, Hong Yan, Vinjia Jain, Aman Chadha, Manoj Karkee, Jia Wu, Philip Torr, Seyedali Mirjalili

Date:2025-07-01 16:52:25

Can machines truly think, reason and act in domains like humans? This enduring question continues to shape the pursuit of Artificial General Intelligence (AGI). Despite the growing capabilities of models such as GPT-4.5, DeepSeek, Claude 3.5 Sonnet, Phi-4, and Grok 3, which exhibit multimodal fluency and partial reasoning, these systems remain fundamentally limited by their reliance on token-level prediction and lack of grounded agency. This paper offers a cross-disciplinary synthesis of AGI development, spanning artificial intelligence, cognitive neuroscience, psychology, generative models, and agent-based systems. We analyze the architectural and cognitive foundations of general intelligence, highlighting the role of modular reasoning, persistent memory, and multi-agent coordination. In particular, we emphasize the rise of Agentic RAG frameworks that combine retrieval, planning, and dynamic tool use to enable more adaptive behavior. We discuss generalization strategies, including information compression, test-time adaptation, and training-free methods, as critical pathways toward flexible, domain-agnostic intelligence. Vision-Language Models (VLMs) are reexamined not just as perception modules but as evolving interfaces for embodied understanding and collaborative task completion. We also argue that true intelligence arises not from scale alone but from the integration of memory and reasoning: an orchestration of modular, interactive, and self-improving components where compression enables adaptive behavior. Drawing on advances in neurosymbolic systems, reinforcement learning, and cognitive scaffolding, we explore how recent architectures begin to bridge the gap between statistical learning and goal-directed cognition. Finally, we identify key scientific, technical, and ethical challenges on the path to AGI.

Light Neutral Higgs-Boson Production at $e^+e^-$ Colliders in the Complex MSSM and NMSSM: A Full One-Loop Analysis

Authors:S. Heinemeyer, S. Paßehr, C. Schappacher

Date:2025-07-01 16:37:07

For future precision analyses of the Higgs boson at $\approx 125$ GeV, $h_{125}$, a precise knowledge of its production and decay properties is mandatory. While in the Standard Model (SM) these calculations are quite advanced, in many models beyond the SM (BSM) a precise calculation is missing so far. We present the calculation of the Higgs-strahlung cross-sections at $e^+e^-$ colliders for the light neutral Higgs boson production in the Next-to-Minimal Supersymmetric SM (NMSSM) with complex parameters (cNMSSM). The evaluation is based on a full one-loop calculation of the production mechanism $e^+e^- \to h_1 Z$, including soft, hard, and collinear photon radiation. The dependence of the Higgs boson production cross-sections on the relevant cNMSSM parameters is analyzed numerically. In certain scenarios we find sizable corrections to the Higgs-strahlung cross-section. Normally, they reach about $10\%$ of the tree-level results, but can also exceed $20\%$. Finally, the calculation is compared to the corresponding one in the Minimal Supersymmetric SM (MSSM). The knowledge of the full one-loop contributions to the Higgs-boson production is particularly important for a sound theoretical interpretation of measurements at future $e^+e^-$ colliders such as the ILC, CLIC, LCF, FCC-ee, or CEPC. It is planned to implement the evaluation of the Higgs boson production cross-sections into an add-on package to the code FeynHiggs.

A Survey: Learning Embodied Intelligence from Physical Simulators and World Models

Authors:Xiaoxiao Long, Qingrui Zhao, Kaiwen Zhang, Zihao Zhang, Dingrui Wang, Yumeng Liu, Zhengjie Shu, Yi Lu, Shouzheng Wang, Xinzhe Wei, Wei Li, Wei Yin, Yao Yao, Jia Pan, Qiu Shen, Ruigang Yang, Xun Cao, Qionghai Dai

Date:2025-07-01 16:23:00

The pursuit of artificial general intelligence (AGI) has placed embodied intelligence at the forefront of robotics research. Embodied intelligence focuses on agents capable of perceiving, reasoning, and acting within the physical world. Achieving robust embodied intelligence requires not only advanced perception and control, but also the ability to ground abstract cognition in real-world interactions. Two foundational technologies, physical simulators and world models, have emerged as critical enablers in this quest. Physical simulators provide controlled, high-fidelity environments for training and evaluating robotic agents, allowing safe and efficient development of complex behaviors. In contrast, world models empower robots with internal representations of their surroundings, enabling predictive planning and adaptive decision-making beyond direct sensory input. This survey systematically reviews recent advances in learning embodied AI through the integration of physical simulators and world models. We analyze their complementary roles in enhancing autonomy, adaptability, and generalization in intelligent robots, and discuss the interplay between external simulation and internal modeling in bridging the gap between simulated training and real-world deployment. By synthesizing current progress and identifying open challenges, this survey aims to provide a comprehensive perspective on the path toward more capable and generalizable embodied AI systems. We also maintain an active repository that contains up-to-date literature and open-source projects at https://github.com/NJU3DV-LoongGroup/Embodied-World-Models-Survey.

Stealtooth: Breaking Bluetooth Security Abusing Silent Automatic Pairing

Authors:Keiichiro Kimura, Hiroki Kuzuno, Yoshiaki Shiraishi, Masakatu Morii

Date:2025-07-01 15:18:37

Bluetooth is a pervasive wireless communication technology used by billions of devices for short-range connectivity. The security of Bluetooth relies on the pairing process, where devices establish shared long-term keys for secure communications. However, many commercial Bluetooth devices implement automatic pairing functions to improve user convenience, creating a previously unexplored attack surface. We present Stealtooth, a novel attack that abuses unknown vulnerabilities in the automatic pairing functions of commercial Bluetooth devices to achieve completely silent device link key overwriting. The Stealtooth attack leverages the fact that Bluetooth audio devices automatically transition to pairing mode under specific conditions, enabling attackers to hijack pairing processes without user awareness or specialized tools. We also extend the attack into the MitM Stealtooth attack, combining automatic pairing abuse with power-saving mode techniques to enable man-in-the-middle attacks. We evaluate the attacks against 10 commercial Bluetooth devices from major manufacturers, demonstrating widespread vulnerabilities across diverse device types and manufacturers. Our practical implementation requires only commodity hardware and open-source software, highlighting the low barrier to entry for attackers. We propose defenses at both the device and protocol levels, including enhanced user notifications and standardized automatic pairing guidelines. Our findings reveal a critical tension between security and usability, showing that current automatic pairing implementations create systematic vulnerabilities. We responsibly disclosed our findings to affected vendors, several of which have already released patches.

A Probabilistic Approach to Wildfire Spread Prediction Using a Denoising Diffusion Surrogate Model

Authors:Wenbo Yu, Anirbit Ghosh, Tobias Sebastian Finn, Rossella Arcucci, Marc Bocquet, Sibo Cheng

Date:2025-07-01 14:04:06

Thanks to recent advances in generative AI, computers can now simulate realistic and complex natural processes. We apply this capability to predict how wildfires spread, a task made difficult by the unpredictable nature of fire and the variety of environmental conditions it depends on. In this study, we present the first denoising diffusion model for predicting wildfire spread, a new kind of AI framework that learns to simulate fires not just as one fixed outcome, but as a range of possible scenarios. By doing so, it accounts for the inherent uncertainty of wildfire dynamics, a feature that traditional models typically fail to represent. Unlike deterministic approaches that generate a single prediction, our model produces ensembles of forecasts that reflect physically meaningful distributions of where fire might go next. This technology could help us develop smarter, faster, and more reliable tools for anticipating wildfire behavior, aiding decision-makers in fire risk assessment and response planning.

A General Simulation-Based Optimisation Framework for Multipoint Constant-Stress Accelerated Life Tests

Authors:Owen McGrath, Kevin Burke

Date:2025-07-01 13:03:45

Accelerated life testing (ALT) is a method of reducing the lifetime of components through exposure to extreme stress. This method of obtaining lifetime information involves the design of a testing experiment, i.e., an accelerated test plan. In this work, we adopt a simulation-based approach to obtaining optimal test plans for constant-stress accelerated life tests with multiple design points. Within this simulation framework we can easily assess a variety of test plans by modifying the number of test stresses (and their levels) and evaluating the allocation of test units. We obtain optimal test plans by utilising the differential evolution (DE) optimisation algorithm, where the inputs to the objective function are the test plan parameters, and the output is the RMSE (root mean squared error) of out-of-sample (extrapolated) model predictions. When the life-stress distribution is correctly specified, we show that the optimal number of stress levels is related to the number of model parameters. In terms of test unit allocation, we show that the proportion of test units is inversely related to the stress level. Our general simulation framework provides an alternative approach to theoretical optimisation, and is particularly favourable for large/complex multipoint test plans where analytical optimisation could prove intractable. Our procedure can be applied to a broad range of experimental scenarios, and serves as a useful tool to aid practitioners seeking to maximise component lifetime information through accelerated life testing.
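
The simulation-based objective described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: lifetimes are taken to be lognormal with a log-linear life-stress relationship, stress is standardised so that 0 is the use condition, the model is fitted by ordinary least squares on log-lifetimes, and the function name `simulate_plan_rmse` and all parameter values are hypothetical. The differential evolution search over test plans is omitted; only the RMSE objective it would minimise is shown.

```python
import math
import random


def simulate_plan_rmse(stresses, allocation, n_units=60, use_stress=0.0,
                       beta0=5.0, beta1=-3.0, sigma=0.5, n_sims=500, seed=0):
    """Estimate the RMSE of the extrapolated mean log-lifetime at use_stress.

    stresses   -- test stress levels (at least two distinct values)
    allocation -- proportion of the n_units test units placed at each level
    The true model (beta0, beta1, sigma) is an assumed log-linear lognormal one.
    """
    rng = random.Random(seed)
    true_value = beta0 + beta1 * use_stress
    squared_error = 0.0
    for _ in range(n_sims):
        xs, ys = [], []
        for stress, share in zip(stresses, allocation):
            for _ in range(max(1, round(share * n_units))):
                xs.append(stress)
                # Simulated log-lifetime of one unit at this stress level.
                ys.append(rng.gauss(beta0 + beta1 * stress, sigma))
        # Ordinary least squares fit of log-lifetime against stress.
        n = len(xs)
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        # Out-of-sample (extrapolated) prediction at the use condition.
        prediction = intercept + slope * use_stress
        squared_error += (prediction - true_value) ** 2
    return math.sqrt(squared_error / n_sims)


spread = simulate_plan_rmse([0.3, 1.0], [0.5, 0.5])
narrow = simulate_plan_rmse([0.8, 1.0], [0.5, 0.5])
print(f"spread plan RMSE = {spread:.3f}, narrow plan RMSE = {narrow:.3f}")
```

An optimiser such as differential evolution would treat the stress levels and allocation proportions as decision variables and call a function like this as its objective.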

LoD-Loc v2: Aerial Visual Localization over Low Level-of-Detail City Models using Explicit Silhouette Alignment

Authors:Juelin Zhu, Shuaibang Peng, Long Wang, Hanlin Tan, Yu Liu, Maojun Zhang, Shen Yan

Date:2025-07-01 10:56:51

We propose a novel method for aerial visual localization over low Level-of-Detail (LoD) city models. The previous wireframe-alignment-based method, LoD-Loc, has shown promising localization results leveraging LoD models. However, LoD-Loc mainly relies on high-LoD (LoD3 or LoD2) city models, whereas the majority of available models, and those that many countries plan to construct nationwide, are low-LoD (LoD1). Consequently, enabling localization on low-LoD city models could unlock drones' potential for global urban localization. To address these issues, we introduce LoD-Loc v2, which employs a coarse-to-fine strategy using explicit silhouette alignment to achieve accurate aerial localization over low-LoD city models. Specifically, given a query image, LoD-Loc v2 first applies a building segmentation network to extract building silhouettes. Then, in the coarse pose selection stage, we construct a pose cost volume by uniformly sampling pose hypotheses around a prior pose to represent the pose probability distribution. Each cost in the volume measures the degree of alignment between the projected and predicted silhouettes. We select the pose with the maximum value as the coarse pose. In the fine pose estimation stage, a particle filtering method incorporating a multi-beam tracking approach is used to efficiently explore the hypothesis space and obtain the final pose estimate. To further facilitate research in this field, we release two datasets with LoD1 city models covering 10.7 km$^2$, along with real RGB queries and ground-truth pose annotations. Experimental results show that LoD-Loc v2 improves estimation accuracy with high-LoD models and enables localization with low-LoD models for the first time. Moreover, it outperforms state-of-the-art baselines by large margins, even surpassing texture-model-based methods, and broadens the convergence basin to accommodate larger prior errors.
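
The coarse pose selection stage can be illustrated with a toy analogue. The sketch below makes strong simplifying assumptions that are not taken from the paper: a "pose" is just a 2D pixel translation, the "projection" of the city model is a fixed rectangular footprint, and the alignment score is intersection-over-union (IoU). The names `render_silhouette`, `iou`, and `coarse_pose` are hypothetical. It only shows the general pattern of uniformly sampling hypotheses around a prior, scoring each against the predicted silhouette, and keeping the maximum.

```python
from itertools import product


def render_silhouette(pose, size=32):
    """Toy 'projection': a 10x8 building footprint shifted by pose = (dx, dy)."""
    dx, dy = pose
    return {(x + dx, y + dy)
            for x in range(10) for y in range(8)
            if 0 <= x + dx < size and 0 <= y + dy < size}


def iou(a, b):
    """Intersection-over-union of two pixel sets (assumed alignment score)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def coarse_pose(predicted, prior, radius=4, step=1):
    """Build a cost volume over poses sampled uniformly around the prior."""
    offsets = range(-radius, radius + 1, step)
    volume = {}
    for ox, oy in product(offsets, offsets):
        pose = (prior[0] + ox, prior[1] + oy)
        volume[pose] = iou(render_silhouette(pose), predicted)
    # The hypothesis with the highest alignment score is the coarse pose.
    best = max(volume, key=volume.get)
    return best, volume


true_pose = (12, 9)
predicted = render_silhouette(true_pose)  # stand-in for a segmentation output
best, volume = coarse_pose(predicted, prior=(10, 11))
print(best, volume[best])
```

In a real system the hypotheses would be full camera poses and the rendering would project the LoD model into the image; the fine stage would then refine the selected coarse pose.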

World4Drive: End-to-End Autonomous Driving via Intention-aware Physical Latent World Model

Authors:Yupeng Zheng, Pengxuan Yang, Zebin Xing, Qichao Zhang, Yuhang Zheng, Yinfeng Gao, Pengfei Li, Teng Zhang, Zhongpu Xia, Peng Jia, Dongbin Zhao

Date:2025-07-01 09:36:38

End-to-end autonomous driving directly generates planning trajectories from raw sensor data, yet it typically relies on costly perception supervision to extract scene information. A critical research challenge arises: constructing an informative driving world model to enable perception annotation-free, end-to-end planning via self-supervised learning. In this paper, we present World4Drive, an end-to-end autonomous driving framework that employs vision foundation models to build latent world models for generating and evaluating multi-modal planning trajectories. Specifically, World4Drive first extracts scene features, including driving intention and world latent representations enriched with spatial-semantic priors provided by vision foundation models. It then generates multi-modal planning trajectories based on current scene features and driving intentions and predicts multiple intention-driven future states within the latent space. Finally, it introduces a world model selector module to evaluate and select the best trajectory. We achieve perception annotation-free, end-to-end planning through self-supervised alignment between actual future observations and predicted observations reconstructed from the latent space. World4Drive achieves state-of-the-art performance without manual perception annotations on both the open-loop nuScenes and closed-loop NavSim benchmarks, demonstrating an 18.1% relative reduction in L2 error, a 46.7% lower collision rate, and 3.75x faster training convergence. Code will be available at https://github.com/ucaszyp/World4Drive.

A Practical Guide to Interpretable Role-Based Clustering in Multi-Layer Financial Networks

Authors:Christian Franssen, Iman van Lelyveld, Bernd Heidergott

Date:2025-07-01 09:30:31

Understanding the functional roles of financial institutions within interconnected markets is critical for effective supervision, systemic risk assessment, and resolution planning. We propose an interpretable role-based clustering approach for multi-layer financial networks, designed to identify the functional positions of institutions across different market segments. Our method follows a general clustering framework defined by proximity measures, cluster evaluation criteria, and algorithm selection. We construct explainable node embeddings based on egonet features that capture both direct and indirect trading relationships within and across market layers. Using transaction-level data from the ECB's Money Market Statistical Reporting (MMSR), we demonstrate how the approach uncovers heterogeneous institutional roles such as market intermediaries, cross-segment connectors, and peripheral lenders or borrowers. The results highlight the flexibility and practical value of role-based clustering in analyzing financial networks and understanding institutional behavior in complex market structures.
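
As a rough illustration of egonet-based node embeddings in a multi-layer network, the sketch below computes three simple features per layer and concatenates them across layers. The feature choice (out-degree, in-degree, number of edges inside the egonet), the layer names, and the function names `egonet_features` and `multilayer_embedding` are assumptions made for illustration; the abstract does not specify the features actually used.

```python
def egonet_features(edges, node):
    """Features of one node's egonet in a single layer.

    edges -- set of directed (lender, borrower) pairs
    Returns [out-degree, in-degree, number of edges inside the egonet].
    """
    out_neighbours = {v for u, v in edges if u == node}
    in_neighbours = {u for u, v in edges if v == node}
    ego = out_neighbours | in_neighbours | {node}
    # Edges among the node and its neighbours also capture indirect links.
    ego_edges = sum(1 for u, v in edges if u in ego and v in ego)
    return [len(out_neighbours), len(in_neighbours), ego_edges]


def multilayer_embedding(layers, node):
    """Concatenate per-layer egonet features in a fixed (sorted) layer order."""
    return [feature
            for name in sorted(layers)
            for feature in egonet_features(layers[name], node)]


# Hypothetical two-layer market with four institutions.
layers = {
    "secured": {("A", "B"), ("A", "C"), ("B", "C")},
    "unsecured": {("D", "A")},
}
for institution in "ABCD":
    print(institution, multilayer_embedding(layers, institution))
```

Because each coordinate of the embedding has a direct meaning, any standard clustering algorithm applied to these vectors yields clusters that can be read in terms of lending and borrowing behaviour per market segment.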

A waveform and time digitization mainboard prototype for TRIDENT neutrino experiment

Authors:Guangping Zhang, Yong Yang, Donglian Xu

Date:2025-07-01 08:13:00

The TRIDENT experiment is a planned neutrino observatory in the West Pacific Ocean that will search for astrophysical neutrinos. The final full-scale detector array is expected to have about 1000 strings, each of which consists of 20 hybrid digital optical modules. Each module will contain a number of photomultiplier tubes (PMTs) and silicon photomultipliers to detect Cherenkov light. In this paper, we present a custom-designed digitization mainboard for the TRIDENT experiment. It includes PMT waveform digitization at a sampling rate of 125 MS/s using commercial analog-to-digital converters, and time digitization using time-to-digital converters implemented in a Field Programmable Gate Array (FPGA). We present its design and first performance results.