planning - 2025-09-08

ProToM: Promoting Prosocial Behaviour via Theory of Mind-Informed Feedback

Authors:Matteo Bortoletto, Yichao Zhou, Lance Ying, Tianmin Shu, Andreas Bulling
Date:2025-09-05 13:30:17

While humans are inherently social creatures, the challenge of identifying when and how to assist and collaborate with others - particularly when pursuing independent goals - can hinder cooperation. To address this challenge, we aim to develop an AI system that provides useful feedback to promote prosocial behaviour - actions that benefit others, even when not directly aligned with one's own goals. We introduce ProToM, a Theory of Mind-informed facilitator that promotes prosocial actions in multi-agent systems by providing targeted, context-sensitive feedback to individual agents. ProToM first infers agents' goals using Bayesian inverse planning, then selects feedback to communicate by maximising expected utility, conditioned on the inferred goal distribution. We evaluate our approach against baselines in two multi-agent environments: Doors, Keys, and Gems, as well as Overcooked. Our results suggest that state-of-the-art large language and reasoning models fall short of communicating feedback that is both contextually grounded and well-timed - leading to higher communication overhead and limited task speedup. In contrast, ProToM provides targeted and helpful feedback, achieving a higher success rate and shorter task completion times, and is consistently preferred by human users.
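
A minimal sketch of the decision rule described above, assuming a discrete set of candidate messages and a fixed per-message communication cost (both assumptions, not details from the paper): given a posterior over the agent's goals from Bayesian inverse planning, pick the feedback with the highest expected utility, or stay silent if no message beats the cost of interrupting.

```python
# Hypothetical sketch of ProToM-style feedback selection: given a posterior
# over an agent's goals (e.g. from Bayesian inverse planning), pick the
# feedback message with the highest expected utility, or stay silent if no
# message outweighs the cost of speaking.
from typing import Callable, Dict, Hashable

def select_feedback(
    goal_posterior: Dict[Hashable, float],          # P(goal | observed actions)
    candidate_messages: list,                       # possible feedback utterances
    utility: Callable[[object, Hashable], float],   # U(message, goal)
    communication_cost: float = 0.1,                # assumed penalty for speaking
):
    best_msg, best_value = None, 0.0                # None = say nothing
    for msg in candidate_messages:
        # Expected utility of the message under the inferred goal distribution.
        eu = sum(p * utility(msg, g) for g, p in goal_posterior.items())
        eu -= communication_cost
        if eu > best_value:
            best_msg, best_value = msg, eu
    return best_msg, best_value
```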

MultiSurv: A Multimodal Deep Survival Framework for Prostate and Bladder Cancer

Authors:Noorul Wahab, Ethar Alzaid, Jiaqi Lv, Adam Shephard, Shan E Ahmed Raza
Date:2025-09-05 11:52:53

Accurate prediction of time-to-event outcomes is a central challenge in oncology, with significant implications for treatment planning and patient management. In this work, we present MultiSurv, a multimodal deep survival model utilising DeepHit with a projection layer and inter-modality cross-attention, which integrates heterogeneous patient data, including clinical, MRI, RNA-seq and whole-slide pathology features. The model is designed to capture complementary prognostic signals across modalities and estimate individualised time-to-biochemical recurrence in prostate cancer and time-to-cancer recurrence in bladder cancer. Our approach was evaluated in the context of the CHIMERA Grand Challenge, across two of the three provided tasks. For Task 1 (prostate cancer biochemical recurrence prediction), the proposed framework achieved a concordance index (C-index) of 0.843 on 5-fold cross-validation and 0.818 on the CHIMERA development set, demonstrating robust discriminatory ability. For Task 3 (bladder cancer recurrence prediction), the model obtained a C-index of 0.662 on 5-fold cross-validation and 0.457 on the development set, highlighting its adaptability and potential for clinical translation. These results suggest that leveraging multimodal integration with deep survival learning provides a promising pathway toward personalised risk stratification in prostate and bladder cancer. Beyond the challenge setting, our framework is broadly applicable to survival prediction tasks involving heterogeneous biomedical data.
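
For reference, the concordance index reported above can be computed with Harrell's pairwise definition; the sketch below is a generic implementation (ignoring tied event times), not the authors' evaluation code.

```python
# Minimal Harrell's C-index sketch: the fraction of comparable patient pairs
# whose predicted risks are ordered consistently with their observed times.
# Higher predicted risk should correspond to an earlier event.
def concordance_index(times, events, risks):
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Pair (i, j) is comparable if patient i had an event before time j.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")

# e.g. concordance_index([2, 5, 7], [1, 1, 0], [0.9, 0.4, 0.2]) -> 1.0
```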

Results and plans to apply interferometry to air shower observations at the Pierre Auger Observatory

Authors:Pim van Dillen
Date:2025-09-05 09:57:47

By analysing the radio emissions from air showers using interferometry, we can estimate their properties. In this contribution, we apply interferometry to reconstruct air-shower parameters based on measurements taken with the Auger Engineering Radio Array (AERA) at the Pierre Auger Observatory. This reconstruction method is achievable at AERA through precise clock synchronisation with a beacon and an accurate survey of the station locations. Interferometry has been applied to several thousand inclined air-shower observations for the first time, which allows for tests on the performance of air-shower geometry reconstruction, recovery of the radio signal from low-energy air showers, and methods to study the polarisation of the radio-emission mechanisms. Additionally, we provide an overview of efforts to enable interferometry for the recently installed radio detectors that are part of the AugerPrime upgrade.

Ground-Aware Octree-A* Hybrid Path Planning for Memory-Efficient 3D Navigation of Ground Vehicles

Authors:Byeong-Il Ham, Hyun-Bin Kim, Kyung-Soo Kim
Date:2025-09-05 09:15:20

In this paper, we propose a 3D path planning method that integrates the A* algorithm with the octree structure. Unmanned Ground Vehicles (UGVs) and legged robots have been extensively studied, enabling locomotion across a variety of terrains. Advances in mobility have enabled obstacles to be regarded not only as hindrances to be avoided, but also as navigational aids when beneficial. A modified 3D A* algorithm generates an optimal path by leveraging obstacles during the planning process. By incorporating a height-based penalty into the cost function, the algorithm enables the use of traversable obstacles to aid locomotion while avoiding those that are impassable, resulting in more efficient and realistic path generation. The octree-based 3D grid map achieves compression by merging high-resolution nodes into larger blocks, especially in obstacle-free or sparsely populated areas. This reduces the number of nodes explored by the A* algorithm, thereby improving computational efficiency and memory usage, and supporting real-time path planning in practical environments. Benchmark results demonstrate that the use of the octree structure ensures an optimal path while significantly reducing memory usage and computation time.
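
A toy sketch of how a height-based penalty can enter the edge cost of a 3D A* search so that low, traversable obstacles are used while tall ones are rejected; the threshold and weight values are illustrative assumptions, not the paper's parameters.

```python
# Illustrative sketch (thresholds are assumptions, not the paper's values):
# an A*-style edge cost that adds a height-based penalty so low obstacles
# remain traversable while tall ones are effectively forbidden.
import math

MAX_STEP_HEIGHT = 0.25   # hypothetical maximum height the robot can climb (m)
HEIGHT_WEIGHT = 4.0      # hypothetical weight on the climbing penalty

def edge_cost(p_from, p_to):
    """p_from, p_to: (x, y, z) node centres in the octree grid."""
    dx, dy, dz = (p_to[i] - p_from[i] for i in range(3))
    base = math.sqrt(dx * dx + dy * dy + dz * dz)   # Euclidean travel cost
    climb = max(0.0, dz)                            # only penalise going up
    if climb > MAX_STEP_HEIGHT:
        return math.inf                             # impassable obstacle
    return base + HEIGHT_WEIGHT * climb             # traversable but costly
```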

On the quadratic barycentric transport problem

Authors:Nathael Gozlan, Thibaut Le Gouic, Paul-Marie Samson
Date:2025-09-05 08:58:34

We investigate the structure of optimal transport plans, dual optimizers, and geodesic paths for the quadratic barycentric transport problem.

An information metric for comparing and assessing informative interim decisions in sequential clinical trials

Authors:G. Caruso, W. F. Rosenberger, P. Mozgunov, N. Flournoy
Date:2025-09-05 08:23:14

Group sequential designs enable interim analyses and potential early stopping for efficacy or futility. While these adaptations improve trial efficiency and ethical considerations, they also introduce bias into the adapted analyses. We demonstrate how failing to account for informative interim decisions in the analysis can substantially affect posterior estimates of the treatment effect, often resulting in overly optimistic credible intervals aligned with the stopping decision. Drawing on information theory, we use the Kullback-Leibler divergence to quantify this distortion and highlight its use for post-hoc evaluation of informative interim decisions, with a focus on end-of-study inference. Unlike pointwise comparisons, this measure provides an integrated summary of this distortion on the whole parameter space. By comparing alternative decision boundaries and prior specifications, we illustrate how this measure can improve the understanding of trial results and inform the planning of future adaptive studies. We also introduce an expected version of this metric to support clinicians in choosing decision boundaries. This guidance complements traditional strategies based on type-I error rate control by offering insights into the distortion introduced to the treatment effect at each interim phase. The use of this pre-experimental measure is finally illustrated in a group sequential trial for evaluating a treatment for central nervous system disorders.
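
As a concrete instance of the kind of quantity involved, the sketch below evaluates the closed-form Kullback-Leibler divergence between two normal posteriors (say, one that ignores the informative stopping rule and one that adjusts for it); the normal approximation and the example numbers are assumptions for illustration only.

```python
# Sketch only: closed-form KL divergence between two normal posteriors.
# The paper's measure integrates distortion over the whole parameter space;
# with normal posteriors that integral is available analytically.
import math

def kl_normal(mu1, sd1, mu2, sd2):
    """KL( N(mu1, sd1^2) || N(mu2, sd2^2) )."""
    return (math.log(sd2 / sd1)
            + (sd1 ** 2 + (mu1 - mu2) ** 2) / (2 * sd2 ** 2)
            - 0.5)

# e.g. a naive posterior N(0.40, 0.10^2) vs. an adjusted one N(0.25, 0.12^2)
print(kl_normal(0.40, 0.10, 0.25, 0.12))   # ~0.81 nats of distortion
```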

Cryo-RL: automating prostate cancer cryoablation planning with reinforcement learning

Authors:Trixia Simangan, Ahmed Nadeem Abbasi, Yipeng Hu, Shaheer U. Saeed
Date:2025-09-05 08:06:08

Cryoablation is a minimally invasive localised treatment for prostate cancer that destroys malignant tissue during thawing, while sparing surrounding healthy structures. Its success depends on accurate preoperative planning of cryoprobe placements to fully cover the tumour and avoid critical anatomy. This planning is currently manual, expertise-dependent, and time-consuming, leading to variability in treatment quality and limited scalability. In this work, we introduce Cryo-RL, a reinforcement learning framework that models cryoablation planning as a Markov decision process and learns an optimal policy for cryoprobe placement. Within a simulated environment that models clinical constraints and stochastic intraoperative variability, an agent sequentially selects cryoprobe positions and ice sphere diameters. Guided by a reward function based on tumour coverage, this agent learns a cryoablation strategy that leads to optimal cryoprobe placements without the need for any manually-designed plans. Evaluated on 583 retrospective prostate cancer cases, Cryo-RL achieved over 8 percentage-point Dice improvements compared with the best automated baselines, based on geometric optimisation, and matched human expert performance while requiring substantially less planning time. These results highlight the potential of reinforcement learning to deliver clinically viable, reproducible, and efficient cryoablation plans.
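
A hedged sketch of a tumour-coverage reward of the kind described above: ice spheres are rasterised onto a voxel grid and scored by their Dice overlap with the tumour mask. The spherical-ablation-zone assumption and grid conventions are illustrative, not the paper's simulator.

```python
# Hypothetical reward sketch in the spirit of Cryo-RL: score a set of planned
# ice spheres by the Dice overlap between the frozen region and the tumour.
import numpy as np

def dice(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    inter = np.logical_and(mask_a, mask_b).sum()
    return 2.0 * inter / (mask_a.sum() + mask_b.sum() + 1e-8)

def coverage_reward(tumour: np.ndarray, probes, voxel_size=1.0) -> float:
    """tumour: boolean 3D array; probes: list of ((x, y, z), diameter)."""
    frozen = np.zeros_like(tumour, dtype=bool)
    grid = np.stack(np.meshgrid(*[np.arange(s) for s in tumour.shape],
                                indexing="ij"), axis=-1) * voxel_size
    for centre, diameter in probes:
        dist = np.linalg.norm(grid - np.asarray(centre), axis=-1)
        frozen |= dist <= diameter / 2.0        # ice sphere around the probe
    return dice(frozen, tumour)                 # reward = tumour coverage
```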

Multi-modal Uncertainty Robust Tree Cover Segmentation For High-Resolution Remote Sensing Images

Authors:Yuanyuan Gui, Wei Li, Yinjian Wang, Xiang-Gen Xia, Mauro Marty, Christian Ginzler, Zuyuan Wang
Date:2025-09-05 07:32:42

Recent advances in semantic segmentation of multi-modal remote sensing images have significantly improved the accuracy of tree cover mapping, supporting applications in urban planning, forest monitoring, and ecological assessment. Integrating data from multiple modalities - such as optical imagery, light detection and ranging (LiDAR), and synthetic aperture radar (SAR) - has shown superior performance over single-modality methods. However, these data are often acquired days or even months apart, during which various changes may occur, such as vegetation disturbances (e.g., logging and wildfires) and variations in imaging quality. Such temporal misalignments introduce cross-modal uncertainty, especially in high-resolution imagery, which can severely degrade segmentation accuracy. To address this challenge, we propose MURTreeFormer, a novel multi-modal segmentation framework that mitigates and leverages aleatoric uncertainty for robust tree cover mapping. MURTreeFormer treats one modality as primary and others as auxiliary, explicitly modeling patch-level uncertainty in the auxiliary modalities via a probabilistic latent representation. Uncertain patches are identified and reconstructed from the primary modality's distribution through a VAE-based resampling mechanism, producing enhanced auxiliary features for fusion. In the decoder, a gradient magnitude attention (GMA) module and a lightweight refinement head (RH) are further integrated to guide attention toward tree-like structures and to preserve fine-grained spatial details. Extensive experiments on multi-modal datasets from Shanghai and Zurich demonstrate that MURTreeFormer significantly improves segmentation performance and effectively reduces the impact of temporally induced aleatoric uncertainty.

Comparative Evaluation of Traditional and Deep Learning Feature Matching Algorithms using Chandrayaan-2 Lunar Data

Authors:R. Makharia, J. G. Singla, Amitabh, N. Dube, H. Sharma
Date:2025-09-05 03:10:00

Accurate image registration is critical for lunar exploration, enabling surface mapping, resource localization, and mission planning. Aligning data from diverse lunar sensors -- optical (e.g., Orbital High Resolution Camera, Narrow and Wide Angle Cameras), hyperspectral (Imaging Infrared Spectrometer), and radar (e.g., Dual-Frequency Synthetic Aperture Radar, Selene/Kaguya mission) -- is challenging due to differences in resolution, illumination, and sensor distortion. We evaluate five feature matching algorithms: SIFT, ASIFT, AKAZE, RIFT2, and SuperGlue (a deep learning-based matcher), using cross-modality image pairs from equatorial and polar regions. A preprocessing pipeline is proposed, including georeferencing, resolution alignment, intensity normalization, and enhancements like adaptive histogram equalization, principal component analysis, and shadow correction. SuperGlue consistently yields the lowest root mean square error and fastest runtimes. Classical methods such as SIFT and AKAZE perform well near the equator but degrade under polar lighting. The results highlight the importance of preprocessing and learning-based approaches for robust lunar image registration across diverse conditions.
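
To make the classical baseline concrete, the sketch below runs SIFT matching with OpenCV on a pre-processed image pair, fits a RANSAC homography, and reports the inlier RMSE; it is a generic pipeline sketch, not the authors' evaluation code, and omits the modality-specific preprocessing described above.

```python
# Rough sketch (not the paper's pipeline) of one classical baseline: SIFT
# matching between a pair of co-registered lunar images followed by a
# RANSAC homography and the RMSE of the inlier correspondences.
import cv2
import numpy as np

def sift_rmse(img1_gray: np.ndarray, img2_gray: np.ndarray) -> float:
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1_gray, None)
    kp2, des2 = sift.detectAndCompute(img2_gray, None)
    matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(des1, des2)

    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)

    keep = inliers.ravel() == 1
    proj = cv2.perspectiveTransform(src[keep], H)
    err = proj - dst[keep]
    return float(np.sqrt((err ** 2).sum(axis=-1).mean()))   # RMSE in pixels
```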

Hierarchical Reduced-Order Model Predictive Control for Robust Locomotion on Humanoid Robots

Authors:Adrian B. Ghansah, Sergio A. Esteban, Aaron D. Ames
Date:2025-09-05 00:31:32

As humanoid robots enter real-world environments, ensuring robust locomotion across diverse environments is crucial. This paper presents a computationally efficient hierarchical control framework for humanoid robot locomotion based on reduced-order models -- enabling versatile step planning and incorporating arm and torso dynamics to better stabilize walking. At the high level, we use the step-to-step dynamics of the ALIP model to simultaneously optimize over step periods, step lengths, and ankle torques via nonlinear MPC. The ALIP trajectories are used as references to a linear MPC framework that extends the standard SRB-MPC to also include simplified arm and torso dynamics. We validate the performance of our approach through simulation and hardware experiments on the Unitree G1 humanoid robot. In the proposed framework, the high-level step planner runs at 40 Hz and the mid-level MPC at 500 Hz using the onboard mini-PC. Adaptive step timing increased the push recovery success rate by 36%, and the upper body control improved the yaw disturbance rejection. We also demonstrate robust locomotion across diverse indoor and outdoor terrains, including grass, stone pavement, and uneven gym mats.

Kete: Predicting Known Minor Bodies in Images

Authors:D. Dahlen, Y. G. Kwon, J. R. Masiero, T. Spahr, A. K. Mainzer
Date:2025-09-04 21:29:51

Kete is an open-source software package for quickly and accurately predicting the positions and magnitudes of asteroids and comets in large-scale, all-sky surveys. It can predict observable objects for any ground or space-based telescope. Kete contains a collection of tools, including simple optical and thermal modeling, $n$-body orbit calculations, and custom multi-threaded SPICE kernel support. It can be used for observation planning, pre-discovery of detections at a large scale, and labeling known solar system objects in images. Here we demonstrate some of the capabilities by predicting all observations of every numbered asteroid seen by the Wide-field Infrared Survey Explorer (WISE) and Zwicky Transient Facility (ZTF) surveys during single years of their operations, predicting locations and magnitudes of 756,999 asteroids in over 11 million images.

Planning from Point Clouds over Continuous Actions for Multi-object Rearrangement

Authors:Kallol Saha, Amber Li, Angela Rodriguez-Izquierdo, Lifan Yu, Ben Eisner, Maxim Likhachev, David Held
Date:2025-09-04 20:07:15

Long-horizon planning for robot manipulation is a challenging problem that requires reasoning about the effects of a sequence of actions on a physical 3D scene. While traditional task planning methods have been shown to be effective for long-horizon manipulation, they require discretizing the continuous state and action space into symbolic descriptions of objects, object relationships, and actions. Instead, we propose a hybrid learning-and-planning approach that leverages learned models as domain-specific priors to guide search in high-dimensional continuous action spaces. We introduce SPOT: Search over Point cloud Object Transformations, which plans by searching for a sequence of transformations from an initial scene point cloud to a goal-satisfying point cloud. SPOT samples candidate actions from learned suggesters that operate on partially observed point clouds, eliminating the need to discretize actions or object relationships. We evaluate SPOT on multi-object rearrangement tasks, reporting task planning success and task execution success in both simulation and real-world environments. Our experiments show that SPOT generates successful plans and outperforms a policy-learning approach. We also perform ablations that highlight the importance of search-based planning.

Wasserstein Distributionally Robust Adaptive Covariance Steering

Authors:Aditya Gahlawat, Vivek Khatana, Duo Wang, Sambhu H. Karumanchi, Naira Hovakimyan, Petros Voulgaris
Date:2025-09-04 18:18:33

We present a methodology for predictable and safe covariance steering control of uncertain nonlinear stochastic processes. The systems under consideration are subject to general uncertainties, which include unbounded random disturbances (aleatoric uncertainties) and incomplete model knowledge (state-dependent epistemic uncertainties). These general uncertainties lead to temporally evolving state distributions that are entirely unknown, can have arbitrary shapes, and may diverge unquantifiably from expected behaviors, leading to unpredictable and unsafe behaviors. Our method relies on an $\mathcal{L}_1$-adaptive control architecture that ensures robust control of uncertain stochastic processes while providing Wasserstein metric certificates in the space of probability measures. We show how these distributional certificates can be incorporated into the high-level covariance steering to guarantee safe control. Unlike existing distributionally robust planning and control methodologies, our approach avoids difficult-to-verify requirements like the availability of finite samples from the true underlying distribution or an a priori knowledge of time-varying ambiguity sets to which the state distributions are assumed to belong.
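
For readers unfamiliar with the metric, the certificates above are stated in the 2-Wasserstein distance; the textbook definition and its Gaussian closed form (standard results, not formulas taken from this paper) are:

```latex
% Standard 2-Wasserstein distance between probability measures \mu, \nu on R^n:
\[
  W_2(\mu,\nu) = \Bigl(\inf_{\pi \in \Pi(\mu,\nu)}
      \int \lVert x - y \rVert^2 \,\mathrm{d}\pi(x,y)\Bigr)^{1/2}.
\]
% Closed form between Gaussians, the case most relevant to covariance steering:
\[
  W_2^2\bigl(\mathcal{N}(m_1,\Sigma_1),\mathcal{N}(m_2,\Sigma_2)\bigr)
  = \lVert m_1 - m_2\rVert^2
  + \operatorname{tr}\!\Bigl(\Sigma_1 + \Sigma_2
      - 2\bigl(\Sigma_2^{1/2}\Sigma_1\Sigma_2^{1/2}\bigr)^{1/2}\Bigr).
\]
```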

$^{171}$Yb Reference Data

Authors:Ronen M. Kroeze, Sofus Laguna Kristensen, Sebastian Pucher
Date:2025-09-04 17:38:45

Ytterbium-171 is a versatile atomic species often used in quantum optics, precision metrology, and quantum computing. Consolidated atomic data is essential for the planning, execution, and evaluation of experiments. In this reference, we present physical and optical properties of neutral $^{171}$Yb relevant to these applications. We emphasize experimental results and supplement these with theoretical estimates. We present equations to convert values and derive important parameters. Tabulated results include key parameters for commonly used transitions in $^{171}$Yb (${}^1\mathrm{S}_0\rightarrow{}^1\mathrm{P}_1$, ${}^1\mathrm{S}_0\rightarrow{}^3\mathrm{P}_{0,1,2}\,$, ${}^3\mathrm{P}_{0,2}\rightarrow{}^3\mathrm{S}_1$, and ${}^3\mathrm{P}_0\rightarrow{}^3\mathrm{D}_1$). This dataset serves as an up-to-date reference for studies involving fermionic $^{171}$Yb.

SAFE-MA-RRT: Multi-Agent Motion Planning with Data-Driven Safety Certificates

Authors:Babak Esmaeili, Hamidreza Modares
Date:2025-09-04 17:34:59

This paper proposes a fully data-driven motion-planning framework for homogeneous linear multi-agent systems that operate in shared, obstacle-filled workspaces without access to explicit system models. Each agent independently learns its closed-loop behavior from experimental data by solving convex semidefinite programs that generate locally invariant ellipsoids and corresponding state-feedback gains. These ellipsoids, centered along grid-based waypoints, certify the dynamic feasibility of short-range transitions and define safe regions of operation. A sampling-based planner constructs a tree of such waypoints, where transitions are allowed only when adjacent ellipsoids overlap, ensuring invariant-to-invariant transitions and continuous safety. All agents expand their trees simultaneously and are coordinated through a space-time reservation table that guarantees inter-agent safety by preventing simultaneous occupancy and head-on collisions. Each successful edge in the tree is equipped with its own local controller, enabling execution without re-solving optimization problems at runtime. The resulting trajectories are not only dynamically feasible but also provably safe with respect to both environmental constraints and inter-agent collisions. Simulation results demonstrate the effectiveness of the approach in synthesizing synchronized, safe trajectories for multiple agents under shared dynamics and constraints, using only data and convex optimization tools.
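
A toy sketch of the space-time reservation table mentioned above: agents reserve grid cells per timestep together with the directed edges they traverse, so both simultaneous occupancy and head-on swaps are rejected. The data layout and time discretisation are assumptions, not the paper's implementation.

```python
# Toy space-time reservation table: agents reserve (cell, timestep) pairs and
# the directed edges they traverse, so simultaneous occupancy and head-on
# collisions are both prevented.
class ReservationTable:
    def __init__(self):
        self.cells = set()    # {(cell, t)}
        self.edges = set()    # {(cell_from, cell_to, t)} for swap detection

    def is_free(self, cell_from, cell_to, t):
        if (cell_to, t + 1) in self.cells:
            return False                              # vertex conflict
        if (cell_to, cell_from, t) in self.edges:
            return False                              # head-on (swap) conflict
        return True

    def reserve(self, cell_from, cell_to, t):
        self.cells.add((cell_to, t + 1))
        self.edges.add((cell_from, cell_to, t))

# Each agent's tree expansion calls is_free(...) before committing an edge
# and reserve(...) once the edge is accepted.
```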

Parking Availability Prediction via Fusing Multi-Source Data with A Self-Supervised Learning Enhanced Spatio-Temporal Inverted Transformer

Authors:Yin Huang, Yongqi Dong, Youhua Tang, Li Li
Date:2025-09-04 16:22:29

The rapid growth of private car ownership has worsened the urban parking predicament, underscoring the need for accurate and effective parking availability prediction to support urban planning and management. To address key limitations in modeling spatio-temporal dependencies and exploiting multi-source data for parking availability prediction, this study proposes a novel approach with SST-iTransformer. The methodology leverages K-means clustering to establish parking cluster zones (PCZs), extracting and integrating traffic demand characteristics from various transportation modes (i.e., metro, bus, online ride-hailing, and taxi) associated with the targeted parking lots. Built upon the vanilla iTransformer, SST-iTransformer integrates masking-reconstruction-based pretext tasks for self-supervised spatio-temporal representation learning, and features an innovative dual-branch attention mechanism: Series Attention captures long-term temporal dependencies via patching operations, while Channel Attention models cross-variate interactions through inverted dimensions. Extensive experiments using real-world data from Chengdu, China, demonstrate that SST-iTransformer outperforms baseline deep learning models (including Informer, Autoformer, Crossformer, and iTransformer), achieving state-of-the-art performance with the lowest mean squared error (MSE) and competitive mean absolute error (MAE). Comprehensive ablation studies quantitatively reveal the relative importance of different data sources: incorporating ride-hailing data provides the largest performance gains, followed by taxi, whereas fixed-route transit features (bus/metro) contribute marginally. Spatial correlation analysis further confirms that excluding historical data from correlated parking lots within PCZs leads to substantial performance degradation, underscoring the importance of modeling spatial dependencies.
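
A brief sketch of the parking-cluster-zone (PCZ) step, assuming zones are formed by K-means over parking-lot coordinates; the feature choice and number of clusters are illustrative assumptions rather than the study's configuration.

```python
# Hedged sketch of the PCZ step: K-means over parking lot coordinates, after
# which demand features from nearby metro, bus, ride-hailing and taxi records
# can be aggregated per zone.
import numpy as np
from sklearn.cluster import KMeans

def build_pcz(lot_coords: np.ndarray, n_zones: int = 10) -> np.ndarray:
    """lot_coords: (n_lots, 2) array of lon/lat; returns a zone id per lot."""
    km = KMeans(n_clusters=n_zones, n_init=10, random_state=0)
    return km.fit_predict(lot_coords)

# zone_ids = build_pcz(coords); demand features are then grouped by zone_ids
# before being fed to the predictor alongside each lot's occupancy series.
```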

FaaSGuard: Secure CI/CD for Serverless Applications -- An OpenFaaS Case Study

Authors:Amine Barrak, Emna Ksontini, Ridouane Atike, Fehmi Jaafar
Date:2025-09-04 15:48:13

Serverless computing significantly alters software development by abstracting infrastructure management and enabling rapid, modular, event-driven deployments. Despite its benefits, the distinct characteristics of serverless functions, such as ephemeral execution and fine-grained scalability, pose unique security challenges, particularly in open-source platforms like OpenFaaS. Existing approaches typically address isolated phases of the DevSecOps lifecycle, lacking an integrated and comprehensive security strategy. To bridge this gap, we propose FaaSGuard, a unified DevSecOps pipeline explicitly designed for open-source serverless environments. FaaSGuard systematically embeds lightweight, fail-closed security checks into every stage of the development lifecycle - planning, coding, building, deployment, and monitoring - effectively addressing threats such as injection attacks, hard-coded secrets, and resource exhaustion. We validate our approach empirically through a case study involving 20 real-world serverless functions from public GitHub repositories. Results indicate that FaaSGuard effectively detects and prevents critical vulnerabilities, demonstrating high precision (95%) and recall (91%) without significant disruption to established CI/CD practices.
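
As an example of a fail-closed check of the kind embedded in the pipeline, the sketch below scans function source code for likely hard-coded secrets and fails the build when any are found; the patterns and exit behaviour are assumptions, not FaaSGuard's actual rule set.

```python
# Illustrative fail-closed secret scan: fail the build on any likely
# hard-coded credential rather than merely warning.
import re
import sys
from pathlib import Path

SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                       # AWS access key id
    re.compile(r"(?i)(api[_-]?key|secret|token)\s*=\s*['\"][^'\"]{8,}['\"]"),
]

def scan(root: str) -> int:
    hits = 0
    for path in Path(root).rglob("*.py"):
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if any(p.search(line) for p in SECRET_PATTERNS):
                print(f"{path}:{lineno}: possible hard-coded secret")
                hits += 1
    return hits

if __name__ == "__main__":
    sys.exit(1 if scan(sys.argv[1] if len(sys.argv) > 1 else ".") else 0)
```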

Improving Robustness of AlphaZero Algorithms to Test-Time Environment Changes

Authors:Isidoro Tamassia, Wendelin Böhmer
Date:2025-09-04 15:38:37

The AlphaZero framework provides a standard way of combining Monte Carlo planning with prior knowledge provided by a previously trained policy-value neural network. AlphaZero usually assumes that the environment on which the neural network was trained will not change at test time, which constrains its applicability. In this paper, we analyze the problem of deploying AlphaZero agents in potentially changed test environments and demonstrate how the combination of simple modifications to the standard framework can significantly boost performance, even in settings with a low planning budget available. The code is publicly available on GitHub.

Differential Morphological Profile Neural Networks for Semantic Segmentation

Authors:David Huangal, J. Alex Hurt
Date:2025-09-04 14:44:18

Semantic segmentation of overhead remote sensing imagery enables applications in mapping, urban planning, and disaster response. State-of-the-art segmentation networks are typically developed and tuned on ground-perspective photographs and do not directly address remote sensing challenges such as extreme scale variation, foreground-background imbalance, and large image sizes. We explore the incorporation of the differential morphological profile (DMP), a multi-scale shape extraction method based on grayscale morphology, into modern segmentation networks. Prior studies have shown that the DMP can provide critical shape information to Deep Neural Networks to enable superior detection and classification performance in overhead imagery. In this work, we extend prior DMPNet work beyond classification and object detection by integrating DMP features into three state-of-the-art convolutional and transformer semantic segmentation architectures. We utilize both direct input, which adapts the input stem of feature extraction architectures to accept DMP channels, and hybrid architectures, a dual-stream design that fuses RGB and DMP encoders. Using the iSAID benchmark dataset, we evaluate a variety of DMP differentials and structuring element shapes to more effectively provide shape information to the model. Our results show that while non-DMP models generally outperform the direct-input variants, hybrid DMP consistently outperforms direct-input and is capable of surpassing a non-DMP model on mIoU, F1, and Recall.
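
A simplified sketch of DMP feature extraction as extra input channels: differences of grey-scale openings and closings across increasing structuring-element sizes. Classical DMPs use opening/closing by reconstruction; plain morphological operators are used here only to keep the example short, and the scale set is an assumption.

```python
# Simplified differential-morphological-profile sketch: stack differences of
# grey-scale openings/closings at increasing structuring-element sizes and
# feed them to the network as extra channels alongside RGB.
import numpy as np
from scipy import ndimage

def dmp_channels(gray: np.ndarray, sizes=(3, 7, 11, 15)) -> np.ndarray:
    opens = [gray.astype(np.float32)]
    closes = [gray.astype(np.float32)]
    for s in sizes:
        opens.append(ndimage.grey_opening(gray, size=(s, s)).astype(np.float32))
        closes.append(ndimage.grey_closing(gray, size=(s, s)).astype(np.float32))
    # Differential profile: responses between consecutive scales.
    diffs = [opens[i] - opens[i + 1] for i in range(len(sizes))] + \
            [closes[i + 1] - closes[i] for i in range(len(sizes))]
    return np.stack(diffs, axis=-1)   # (H, W, 2 * len(sizes)) DMP channels
```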

Lightweight Kinematic and Static Modeling of Cable-Driven Continuum Robots via Actuation-Space Energy Formulation

Authors:Ke Wu, Yuhao Wang, Kevin Henry, Cesare Stefanini, Gang Zheng
Date:2025-09-04 11:33:53

Continuum robots, inspired by octopus arms and elephant trunks, combine dexterity with intrinsic compliance, making them well suited for unstructured and confined environments. Yet their continuously deformable morphology poses challenges for motion planning and control, calling for accurate but lightweight models. We propose the Lightweight Actuation Space Energy Modeling (LASEM) framework for cable-driven continuum robots, which formulates actuation potential energy directly in actuation space. LASEM yields an analytical forward model derived from geometrically nonlinear beam and rod theories via Hamilton's principle, while avoiding explicit modeling of cable-backbone contact. It accepts both force and displacement inputs, thereby unifying kinematic and static formulations. Neglecting friction, the framework generalizes to nonuniform geometries, arbitrary cable routings, distributed loading and axial extensibility, while remaining computationally efficient for real-time use. Numerical simulations validate its accuracy, and a semi-analytical iterative scheme is developed for inverse kinematics. To address discretization in practical robots, LASEM further reformulates the functional minimization as a numerical optimization, which also naturally incorporates cable potential energy without explicit contact modeling.

Hybrid Reinforcement Learning and Search for Flight Trajectory Planning

Authors:Alberto Luise, Michele Lombardi, Florent Teichteil Koenigsbuch
Date:2025-09-04 11:01:43

This paper explores the combination of Reinforcement Learning (RL) and search-based path planners to speed up the optimization of flight paths for airliners, where in case of emergency a fast route re-calculation can be crucial. The fundamental idea is to train an RL Agent to pre-compute near-optimal paths based on location and atmospheric data and use those at runtime to constrain the underlying path planning solver and find a solution within a certain distance from the initial guess. The approach effectively reduces the size of the solver's search space, significantly speeding up route optimization. Although global optimality is not guaranteed, empirical results obtained with Airbus aircraft performance models show that fuel consumption remains nearly identical to that of an unconstrained solver, with deviations typically within 1%. At the same time, computation speed can be improved by up to 50% as compared to using a conventional solver alone.

Object-Reconstruction-Aware Whole-body Control of Mobile Manipulators

Authors:Fatih Dursun, Bruno Vilhena Adorno, Simon Watson, Wei Pan
Date:2025-09-04 10:52:27

Object reconstruction and inspection tasks play a crucial role in various robotics applications. Identifying paths that reveal the most unknown areas of the object becomes paramount in this context, as it directly affects efficiency, and this problem is known as the view path planning problem. Current methods often use sampling-based path planning techniques, evaluating potential views along the path to enhance reconstruction performance. However, these methods are computationally expensive as they require evaluating several candidate views on the path. To this end, we propose a computationally efficient solution that relies on calculating a focus point in the most informative (unknown) region and having the robot maintain this point in the camera field of view along the path. We incorporated this strategy into the whole-body control of a mobile manipulator employing a visibility constraint without the need for an additional path planner. We conducted comprehensive and realistic simulations using a large dataset of 114 diverse objects of varying sizes from 57 categories to compare our method with a sampling-based planning strategy using Bayesian data analysis. Furthermore, we performed real-world experiments with an 8-DoF mobile manipulator to demonstrate the proposed method's performance in practice. Our results suggest that there is no significant difference in object coverage or entropy. In contrast, our method is approximately nine times faster than the baseline sampling-based method in terms of the average time the robot spends between views.
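
A minimal sketch of the focus-point idea, assuming a voxel occupancy grid with explicit unknown labels: the centroid of the unknown region serves as the point the visibility constraint keeps inside the camera's field of view. The grid encoding and the centroid choice are simplifying assumptions, not the paper's exact criterion.

```python
# Minimal focus-point sketch: pick a representative point of the unknown
# region of the reconstruction and keep it in the camera field of view.
import numpy as np

def focus_point(occupancy: np.ndarray, voxel_size: float, origin: np.ndarray):
    """occupancy: 3D int array with -1 = unknown, 0 = free, 1 = occupied."""
    unknown = np.argwhere(occupancy == -1)
    if unknown.size == 0:
        return None                       # reconstruction complete
    centroid_idx = unknown.mean(axis=0)   # centre of the unknown region
    return origin + (centroid_idx + 0.5) * voxel_size

# The returned 3D point is then passed to the whole-body controller as the
# target of the camera field-of-view (visibility) constraint.
```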

Keypoint-based Diffusion for Robotic Motion Planning on the NICOL Robot

Authors:Lennart Clasmeier, Jan-Gerrit Habekost, Connor Gäde, Philipp Allgeuer, Stefan Wermter
Date:2025-09-04 10:11:51

We propose a novel diffusion-based action model for robotic motion planning. Commonly, established numerical planning approaches are used to solve general motion planning problems, but have significant runtime requirements. By leveraging the power of deep learning, we are able to achieve good results in a much smaller runtime by learning from a dataset generated by these planners. While our initial model uses point cloud embeddings in the input to predict keypoint-based joint sequences in its output, we observed in our ablation study that it remained challenging to condition the network on the point cloud embeddings. We identified some biases in our dataset and refined it, which improved the model's performance. Our model, even without the use of the point cloud encodings, outperforms numerical models by an order of magnitude in terms of runtime, while reaching a collision-free success rate of up to 90% on the test set.

FPC-VLA: A Vision-Language-Action Framework with a Supervisor for Failure Prediction and Correction

Authors:Yifan Yang, Zhixiang Duan, Tianshi Xie, Fuyu Cao, Pinxi Shen, Peili Song, Piaopiao Jin, Guokang Sun, Shaoqing Xu, Yangwei You, Jingtai Liu
Date:2025-09-04 08:47:26

Robotic manipulation is a fundamental component of automation. However, traditional perception-planning pipelines often fall short in open-ended tasks due to limited flexibility, while the architecture of a single end-to-end Vision-Language-Action (VLA) offers promising capabilities but lacks crucial mechanisms for anticipating and recovering from failure. To address these challenges, we propose FPC-VLA, a dual-model framework that integrates VLA with a supervisor for failure prediction and correction. The supervisor evaluates action viability through vision-language queries and generates corrective strategies when risks arise, trained efficiently without manual labeling. A similarity-guided fusion module further refines actions by leveraging past predictions. Evaluation results on multiple simulation platforms (SIMPLER and LIBERO) and robot embodiments (WidowX, Google Robot, Franka) show that FPC-VLA outperforms state-of-the-art models in both zero-shot and fine-tuned settings. By activating the supervisor only at keyframes, our approach significantly increases task success rates with minimal impact on execution time. Successful real-world deployments on diverse, long-horizon tasks confirm FPC-VLA's strong generalization and practical utility for building more reliable autonomous systems.

Systematic Timing Leakage Analysis of NIST PQDSS Candidates: Tooling and Lessons Learned

Authors:Olivier Adjonyo, Sebastien Bardin, Emanuele Bellini, Gilbert Ndollane Dione, Mahmudul Faisal Al Ameen, Robert Merget, Frederic Recoules, Yanis Sellami
Date:2025-09-04 08:41:06

The PQDSS standardization process requires cryptographic primitives to be free from vulnerabilities, including timing and cache side-channels. Resistance to timing leakage is therefore an essential property, and achieving this typically relies on software implementations that follow constant-time principles. Moreover, ensuring that all implementations are constant-time is crucial for fair performance comparisons, as secure implementations often incur additional overhead. Such analysis also helps identify scheme proposals that are inherently difficult to implement in constant time. Because constant-time properties can be broken during compilation, it is often necessary to analyze the compiled binary directly. Since manual binary analysis is extremely challenging, automated analysis becomes highly important. Although several tools exist to assist with such analysis, they often have usability limitations and are difficult to set up correctly. To support developers, as well as the NIST committee, in verifying candidates, we developed a toolchain that automates configuration, execution, and result analysis for several widely used constant-time analysis tools. We selected TIMECOP and Binsec/Rel2 to verify constant-time policy compliance at the binary level, and dudect and RTLF to detect side-channel vulnerabilities through statistical analysis of execution time behavior. We demonstrate its effectiveness and practicability by evaluating the NIST PQDSS round 1 and round 2 implementations. We reported 26 issues in total to the respective developers, and 5 of them have already been fixed. We also discuss our different findings, as well as the benefits and shortcomings of the different tools.
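
To illustrate the constant-time principle the analysed implementations must satisfy (this is a generic example, not one of the reported findings): a naive byte-wise comparison leaks, through its early exit, how many leading bytes of a secret match, whereas the standard library comparison runs in time independent of the data.

```python
# Toy illustration of the constant-time principle checked by tools such as
# TIMECOP and dudect (generic example, not from the evaluated candidates).
import hmac

def leaky_equal(a: bytes, b: bytes) -> bool:
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False          # early exit: timing depends on secret data
    return True

def constant_time_equal(a: bytes, b: bytes) -> bool:
    return hmac.compare_digest(a, b)   # stdlib constant-time comparison
```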

Strengthening national capability in urban climate science: an Australian perspective

Authors:Negin Nazarian, Andy J Pitman, Mathew J Lipson, Melissa A Hart, Helen Cleugh, Ian Harman, Marcus J Thatcher, Annette L Hirsch, Giovanni Di Virgilio, Matthew L Riley, Nigel Tapper, Jason P Evans, Christian Jakob, Pascal Perez
Date:2025-09-04 07:47:49

Cities are experiencing significant warming and more frequent climate extremes, raising risks for over 90% of Australians living in cities. Yet many of our tools for climate prediction and projection lack accurate representations of these environments. We also lack the observations and datasets needed to evaluate model performance. This paper identifies critical gaps in Australia's current capability, showing how they undermine climate impact and risk assessments in cities and may lead to poorly designed adaptation and mitigation strategies. These gaps, and the recommendations to address them, were identified through consultation with experts across research institutes, universities, two ARC Centres of Excellence, federal and state governments, and private agencies. Our recommendations span four key areas: city descriptive datasets, integrated observations, fit-for-purpose models, and a coordinated community of research and practice. Urgent action is needed to tailor models to Australia's unique urban landscapes and climates. This requires comprehensive, nationally consistent, high resolution datasets that capture the form, fabric, and function of contemporary and future cities. It also requires filling systematic gaps in integrated networks of urban climate observations for evaluation and benchmarking. At the same time, scientific understanding of key urban processes that influence weather and climate must advance, alongside improvements in their representation in physical models. This can be achieved through a national community of research and practice that codesigns and oversees an implementation plan, integrated with infrastructure such as ACCESS NRI and AURIN. Building this capability will enable us to answer critical questions about the interaction between cities and climate, protecting Australia's urban populations and ensuring a resilient future.

Handling Infinite Domain Parameters in Planning Through Best-First Search with Delayed Partial Expansions

Authors:Ángel Aso-Mollar, Diego Aineto, Enrico Scala, Eva Onaindia
Date:2025-09-04 07:27:27

In automated planning, control parameters extend standard action representations through the introduction of continuous numeric decision variables. Existing state-of-the-art approaches have primarily handled control parameters as embedded constraints alongside other temporal and numeric restrictions, and thus have implicitly treated them as additional constraints rather than as decision points in the search space. In this paper, we propose an efficient alternative that explicitly handles control parameters as true decision points within a systematic search scheme. We develop a best-first, heuristic search algorithm that operates over infinite decision spaces defined by control parameters and prove a notion of completeness in the limit under certain conditions. Our algorithm leverages the concept of delayed partial expansion, where a state is not fully expanded but instead incrementally expands a subset of its successors. Our results demonstrate that this novel search algorithm is a competitive alternative to existing approaches for solving planning problems involving control parameters.
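
A hedged sketch of best-first search with delayed partial expansion over an infinite successor stream, e.g. successive samples of a continuous control parameter: expanding a node draws only its next successor and re-inserts the node, so no state is ever expanded exhaustively. The heuristic, successor generator and goal test below are placeholders, not the paper's algorithm.

```python
# Best-first search with delayed partial expansion: each node owns a
# (possibly infinite) iterator of successors; on expansion we draw only the
# next successor and push the parent back onto the frontier.
import heapq
import itertools

def best_first_delayed(start, successors, h, is_goal, max_pops=100000):
    """successors(state) -> iterator of (cost, child); h(state) -> heuristic."""
    counter = itertools.count()                   # tie-breaker for the heap
    frontier = [(h(start), next(counter), 0.0, start, successors(start))]
    while frontier and max_pops > 0:
        max_pops -= 1
        f, _, g, state, succ_iter = heapq.heappop(frontier)
        if is_goal(state):
            return state, g
        child = next(succ_iter, None)             # partial expansion: one child
        if child is not None:
            step_cost, child_state = child
            g_child = g + step_cost
            heapq.heappush(frontier, (g_child + h(child_state), next(counter),
                                      g_child, child_state, successors(child_state)))
            # Re-insert the parent so its remaining successors stay reachable.
            heapq.heappush(frontier, (f, next(counter), g, state, succ_iter))
    return None, float("inf")
```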

National social cost of carbon: An application of FUND

Authors:In Chang Hwang, Richard S. J. Tol
Date:2025-09-04 06:27:42

This paper presents a refined country-level integrated assessment model, FUND 3.9n, that extends the regional FUND 3.9 framework by incorporating sector-specific climate impact functions and parametric uncertainty analysis for 198 individual countries. The model enables estimation of the national social cost of carbon (NSCC), capturing heterogeneity across nations arising from economic structure, climate sensitivity, and population exposure. Our results demonstrate that both the NSCC and the global sum estimates are highly sensitive to damage specifications and preference parameters, including the pure rate of time preference and relative risk aversion. Compared to aggregated single-sector approaches, the disaggregated model with uncertainty yields higher values of the NSCC for low- and middle-income countries. The paper contributes to the literature by quantifying how sector-specific vulnerabilities and stochastic variability amplify climate damages and reshape global equity in the distribution of the NSCC. The NSCCs derived from our model offer policy-relevant metrics for adaptation planning, mitigation target setting, and equitable burden-sharing in international climate negotiations. This approach bridges the gap between globally harmonized carbon pricing and nationally differentiated climate impacts, providing a theoretically grounded and empirically rich framework for future climate policy design.

Reactive In-Air Clothing Manipulation with Confidence-Aware Dense Correspondence and Visuotactile Affordance

Authors:Neha Sunil, Megha Tippur, Arnau Saumell, Edward Adelson, Alberto Rodriguez
Date:2025-09-04 05:16:56

Manipulating clothing is challenging due to complex configurations, variable material dynamics, and frequent self-occlusion. Prior systems often flatten garments or assume visibility of key features. We present a dual-arm visuotactile framework that combines confidence-aware dense visual correspondence and tactile-supervised grasp affordance to operate directly on crumpled and suspended garments. The correspondence model is trained on a custom, high-fidelity simulated dataset using a distributional loss that captures cloth symmetries and generates correspondence confidence estimates. These estimates guide a reactive state machine that adapts folding strategies based on perceptual uncertainty. In parallel, a visuotactile grasp affordance network, self-supervised using high-resolution tactile feedback, determines which regions are physically graspable. The same tactile classifier is used during execution for real-time grasp validation. By deferring action in low-confidence states, the system handles highly occluded table-top and in-air configurations. We demonstrate our task-agnostic grasp selection module in folding and hanging tasks. Moreover, our dense descriptors provide a reusable intermediate representation for other planning modalities, such as extracting grasp targets from human video demonstrations, paving the way for more generalizable and scalable garment manipulation.

Human Motion Video Generation: A Survey

Authors:Haiwei Xue, Xiangyang Luo, Zhanghao Hu, Xin Zhang, Xunzhi Xiang, Yuqin Dai, Jianzhuang Liu, Zhensong Zhang, Minglei Li, Jian Yang, Fei Ma, Zhiyong Wu, Changpeng Yang, Zonghong Dai, Fei Richard Yu
Date:2025-09-04 04:39:21

Human motion video generation has garnered significant research interest due to its broad applications, enabling innovations such as photorealistic singing heads or dynamic avatars that seamlessly dance to music. However, existing surveys in this field focus on individual methods, lacking a comprehensive overview of the entire generative process. This paper addresses this gap by providing an in-depth survey of human motion video generation, encompassing over ten sub-tasks, and detailing the five key phases of the generation process: input, motion planning, motion video generation, refinement, and output. Notably, this is the first survey that discusses the potential of large language models in enhancing human motion video generation. Our survey reviews the latest developments and technological trends in human motion video generation across three primary modalities: vision, text, and audio. By covering over two hundred papers, we offer a thorough overview of the field and highlight milestone works that have driven significant technological breakthroughs. Our goal for this survey is to unveil the prospects of human motion video generation and serve as a valuable resource for advancing the comprehensive applications of digital humans. A complete list of the models examined in this survey is available in our repository: https://github.com/Winn1y/Awesome-Human-Motion-Video-Generation.