planning - 2025-09-07

$^{171}$Yb Reference Data

Authors: Ronen M. Kroeze, Sofus Laguna Kristensen, Sebastian Pucher
Date: 2025-09-04 17:38:45

Ytterbium-171 is a versatile atomic species often used in quantum optics, precision metrology, and quantum computing. Consolidated atomic data is essential for the planning, execution, and evaluation of experiments. In this reference, we present physical and optical properties of neutral $^{171}$Yb relevant to these applications. We emphasize experimental results and supplement these with theoretical estimates. We present equations to convert values and derive important parameters. Tabulated results include key parameters for commonly used transitions in $^{171}$Yb (${}^1\mathrm{S}_0\rightarrow{}^1\mathrm{P}_1$, ${}^1\mathrm{S}_0\rightarrow{}^3\mathrm{P}_{0,1,2}\,$, ${}^3\mathrm{P}_{0,2}\rightarrow{}^3\mathrm{S}_1$, and ${}^3\mathrm{P}_0\rightarrow{}^3\mathrm{D}_1$). This dataset serves as an up-to-date reference for studies involving fermionic $^{171}$Yb.
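Tabulated linewidths and wavelengths of the kind collected in such a reference can be converted into derived quantities, e.g. the two-level saturation intensity $I_\mathrm{sat} = \pi h c \Gamma / (3\lambda^3)$. A minimal sketch, using typical literature values for the ${}^1\mathrm{S}_0\rightarrow{}^1\mathrm{P}_1$ transition (roughly 398.9 nm and $\Gamma/2\pi \approx 29.1$ MHz) rather than figures taken from this reference:

```python
import math

H = 6.62607015e-34   # Planck constant (J s)
C = 299792458.0      # speed of light (m/s)

def saturation_intensity(wavelength_m, linewidth_hz):
    """I_sat = pi*h*c*Gamma / (3*lambda^3), with Gamma in rad/s."""
    gamma = 2 * math.pi * linewidth_hz
    return math.pi * H * C * gamma / (3 * wavelength_m ** 3)  # W/m^2

# 1S0 -> 1P1 "blue" transition, typical literature values (assumptions,
# not values quoted from this reference): ~398.9 nm, Gamma/2pi ~ 29.1 MHz
i_sat = saturation_intensity(398.9e-9, 29.1e6)
print(f"I_sat ~ {i_sat / 10:.1f} mW/cm^2")  # 1 W/m^2 = 0.1 mW/cm^2
```

This reproduces the commonly quoted value of about 60 mW/cm² for the blue transition; the narrow intercombination lines give far smaller saturation intensities under the same formula.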

SAFE-MA-RRT: Multi-Agent Motion Planning with Data-Driven Safety Certificates

Authors: Babak Esmaeili, Hamidreza Modares
Date: 2025-09-04 17:34:59

This paper proposes a fully data-driven motion-planning framework for homogeneous linear multi-agent systems that operate in shared, obstacle-filled workspaces without access to explicit system models. Each agent independently learns its closed-loop behavior from experimental data by solving convex semidefinite programs that generate locally invariant ellipsoids and corresponding state-feedback gains. These ellipsoids, centered along grid-based waypoints, certify the dynamic feasibility of short-range transitions and define safe regions of operation. A sampling-based planner constructs a tree of such waypoints, where transitions are allowed only when adjacent ellipsoids overlap, ensuring invariant-to-invariant transitions and continuous safety. All agents expand their trees simultaneously and are coordinated through a space-time reservation table that guarantees inter-agent safety by preventing simultaneous occupancy and head-on collisions. Each successful edge in the tree is equipped with its own local controller, enabling execution without re-solving optimization problems at runtime. The resulting trajectories are not only dynamically feasible but also provably safe with respect to both environmental constraints and inter-agent collisions. Simulation results demonstrate the effectiveness of the approach in synthesizing synchronized, safe trajectories for multiple agents under shared dynamics and constraints, using only data and convex optimization tools.
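The invariant-to-invariant idea above can be sketched as a simple geometric test: a tree edge between waypoints is only admitted when the certified ellipsoids overlap. The containment check below (each center inside the other ellipsoid) is a conservative stand-in for the paper's actual data-driven overlap certificate:

```python
import numpy as np

def inside(x, center, P):
    """True if x lies in the ellipsoid {x : (x-c)^T P (x-c) <= 1}."""
    d = np.asarray(x, dtype=float) - np.asarray(center, dtype=float)
    return float(d @ P @ d) <= 1.0

def transition_allowed(c1, P1, c2, P2):
    # Conservative stand-in (assumption): require each waypoint's center
    # to lie inside the other waypoint's invariant ellipsoid.
    return inside(c2, c1, P1) and inside(c1, c2, P2)

P = np.eye(2)                                   # unit-disk invariant sets
ok = transition_allowed([0.0, 0.0], P, [0.5, 0.0], P)   # overlapping
far = transition_allowed([0.0, 0.0], P, [2.0, 0.0], P)  # disjoint
```

A planner would run this test for every candidate edge before attaching the edge's local state-feedback controller.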

Parking Availability Prediction via Fusing Multi-Source Data with A Self-Supervised Learning Enhanced Spatio-Temporal Inverted Transformer

Authors: Yin Huang, Yongqi Dong, Youhua Tang, Li Li
Date: 2025-09-04 16:22:29

The rapid growth of private car ownership has worsened the urban parking predicament, underscoring the need for accurate and effective parking availability prediction to support urban planning and management. To address key limitations in modeling spatio-temporal dependencies and exploiting multi-source data for parking availability prediction, this study proposes a novel approach, SST-iTransformer. The methodology leverages K-means clustering to establish parking cluster zones (PCZs), extracting and integrating traffic demand characteristics from various transportation modes (i.e., metro, bus, online ride-hailing, and taxi) associated with the targeted parking lots. Building on the vanilla iTransformer, SST-iTransformer integrates masking-reconstruction-based pretext tasks for self-supervised spatio-temporal representation learning, and features an innovative dual-branch attention mechanism: Series Attention captures long-term temporal dependencies via patching operations, while Channel Attention models cross-variate interactions through inverted dimensions. Extensive experiments using real-world data from Chengdu, China, demonstrate that SST-iTransformer outperforms baseline deep learning models (including Informer, Autoformer, Crossformer, and iTransformer), achieving state-of-the-art performance with the lowest mean squared error (MSE) and competitive mean absolute error (MAE). Comprehensive ablation studies quantitatively reveal the relative importance of different data sources: incorporating ride-hailing data provides the largest performance gains, followed by taxi, whereas fixed-route transit features (bus/metro) contribute marginally. Spatial correlation analysis further confirms that excluding historical data from correlated parking lots within PCZs leads to substantial performance degradation, underscoring the importance of modeling spatial dependencies.
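The PCZ construction step can be sketched with a minimal k-means (Lloyd's algorithm) over parking-lot coordinates; the coordinates and cluster count below are placeholders, not the study's data:

```python
import numpy as np

def kmeans(points, k, iters=50, seed=0):
    """Minimal Lloyd's algorithm: group parking lots into k zones (PCZs)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # assign each lot to its nearest center, then recompute centers
        d = np.linalg.norm(points[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = points[labels == j].mean(axis=0)
    return labels, centers

rng = np.random.default_rng(1)
# hypothetical (lon, lat) positions of 40 parking lots, illustration only
lots = rng.uniform([104.0, 30.6], [104.2, 30.8], size=(40, 2))
labels, centers = kmeans(lots, k=5)
```

Per-zone demand features from metro, bus, ride-hailing, and taxi records would then be aggregated by these `labels` before being fed to the predictor.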

FaaSGuard: Secure CI/CD for Serverless Applications -- An OpenFaaS Case Study

Authors: Amine Barrak, Emna Ksontini, Ridouane Atike, Fehmi Jaafar
Date: 2025-09-04 15:48:13

Serverless computing significantly alters software development by abstracting infrastructure management and enabling rapid, modular, event-driven deployments. Despite its benefits, the distinct characteristics of serverless functions, such as ephemeral execution and fine-grained scalability, pose unique security challenges, particularly in open-source platforms like OpenFaaS. Existing approaches typically address isolated phases of the DevSecOps lifecycle, lacking an integrated and comprehensive security strategy. To bridge this gap, we propose FaaSGuard, a unified DevSecOps pipeline explicitly designed for open-source serverless environments. FaaSGuard systematically embeds lightweight, fail-closed security checks into every stage of the development lifecycle (planning, coding, building, deployment, and monitoring), effectively addressing threats such as injection attacks, hard-coded secrets, and resource exhaustion. We validate our approach empirically through a case study involving 20 real-world serverless functions from public GitHub repositories. Results indicate that FaaSGuard effectively detects and prevents critical vulnerabilities, demonstrating high precision (95%) and recall (91%) without significant disruption to established CI/CD practices.
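A fail-closed secret check of the kind embedded in the coding stage might look like the sketch below; the regular expressions and handler snippet are illustrative assumptions, not FaaSGuard's actual rules (real pipelines typically delegate to dedicated scanners such as gitleaks or trufflehog):

```python
import re

# Hypothetical patterns for hard-coded credentials (illustration only)
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # AWS access key id shape
    re.compile(r"(?i)(api[_-]?key|secret)\s*=\s*['\"][^'\"]{8,}['\"]"),
]

def scan(source: str):
    """Return the patterns that match; any hit should fail the stage."""
    return [p.pattern for p in SECRET_PATTERNS if p.search(source)]

handler = 'API_KEY = "sk-hypothetical-1234567890"'  # fabricated function code
findings = scan(handler)
if findings:
    print("fail-closed: secrets detected")
    # a real CI gate would abort here with a non-zero exit status
```

The fail-closed property is the key design choice: on any finding the build stops, rather than logging a warning and deploying anyway.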

Improving Robustness of AlphaZero Algorithms to Test-Time Environment Changes

Authors: Isidoro Tamassia, Wendelin Böhmer
Date: 2025-09-04 15:38:37

The AlphaZero framework provides a standard way of combining Monte Carlo planning with prior knowledge provided by a previously trained policy-value neural network. AlphaZero usually assumes that the environment on which the neural network was trained will not change at test time, which constrains its applicability. In this paper, we analyze the problem of deploying AlphaZero agents in potentially changed test environments and demonstrate how the combination of simple modifications to the standard framework can significantly boost performance, even in settings where only a low planning budget is available. The code is publicly available on GitHub.

Differential Morphological Profile Neural Networks for Semantic Segmentation

Authors: David Huangal, J. Alex Hurt
Date: 2025-09-04 14:44:18

Semantic segmentation of overhead remote sensing imagery enables applications in mapping, urban planning, and disaster response. State-of-the-art segmentation networks are typically developed and tuned on ground-perspective photographs and do not directly address remote sensing challenges such as extreme scale variation, foreground-background imbalance, and large image sizes. We explore the incorporation of the differential morphological profile (DMP), a multi-scale shape extraction method based on grayscale morphology, into modern segmentation networks. Prior studies have shown that the DMP can provide critical shape information to deep neural networks to enable superior detection and classification performance in overhead imagery. In this work, we extend prior DMPNet work beyond classification and object detection by integrating DMP features into three state-of-the-art convolutional and transformer semantic segmentation architectures. We utilize both direct input, which adapts the input stem of feature extraction architectures to accept DMP channels, and hybrid architectures, dual-stream designs that fuse RGB and DMP encoders. Using the iSAID benchmark dataset, we evaluate a variety of DMP differentials and structuring element shapes to more effectively provide shape information to the model. Our results show that while non-DMP models generally outperform the direct-input variants, the hybrid DMP models consistently outperform direct input and can surpass a non-DMP model on mIoU, F1, and Recall.
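The DMP itself can be sketched as differences between openings at successive structuring-element sizes. The snippet below uses plain square-window openings as a simplification (morphological profiles are often built with openings by reconstruction, and the paper evaluates several structuring-element shapes, so treat this as a schematic, not the paper's configuration):

```python
import numpy as np

def _filter(img, size, fn):
    """Sliding-window min/max filter with edge padding (square SE)."""
    pad = size // 2
    p = np.pad(img, pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(p, (size, size))
    return fn(windows, axis=(2, 3))

def opening(img, size):
    """Grayscale opening: erosion (min) followed by dilation (max)."""
    return _filter(_filter(img, size, np.min), size, np.max)

def dmp(img, sizes=(3, 5, 7)):
    """Differential morphological profile: differences between openings
    at successive structuring-element sizes; one channel per scale."""
    profile = [img] + [opening(img, s) for s in sizes]
    return np.stack([profile[i] - profile[i + 1] for i in range(len(sizes))])
```

Each DMP channel responds to bright structures of a particular scale; in the direct-input variant these channels are appended to (or replace) the RGB input stem, while the hybrid variant gives them their own encoder.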

Lightweight Kinematic and Static Modeling of Cable-Driven Continuum Robots via Actuation-Space Energy Formulation

Authors: Ke Wu, Yuhao Wang, Kevin Henry, Cesare Stefanini, Gang Zheng
Date: 2025-09-04 11:33:53

Continuum robots, inspired by octopus arms and elephant trunks, combine dexterity with intrinsic compliance, making them well suited for unstructured and confined environments. Yet their continuously deformable morphology poses challenges for motion planning and control, calling for accurate but lightweight models. We propose the Lightweight Actuation Space Energy Modeling (LASEM) framework for cable-driven continuum robots, which formulates actuation potential energy directly in actuation space. LASEM yields an analytical forward model derived from geometrically nonlinear beam and rod theories via Hamilton's principle, while avoiding explicit modeling of cable-backbone contact. It accepts both force and displacement inputs, thereby unifying kinematic and static formulations. Assuming friction is negligible, the framework generalizes to nonuniform geometries, arbitrary cable routings, distributed loading and axial extensibility, while remaining computationally efficient for real-time use. Numerical simulations validate its accuracy, and a semi-analytical iterative scheme is developed for inverse kinematics. To address discretization in practical robots, LASEM further reformulates the functional minimization as a numerical optimization, which also naturally incorporates cable potential energy without explicit contact modeling.

Hybrid Reinforcement Learning and Search for Flight Trajectory Planning

Authors: Alberto Luise, Michele Lombardi, Florent Teichteil Koenigsbuch
Date: 2025-09-04 11:01:43

This paper explores the combination of Reinforcement Learning (RL) and search-based path planners to speed up the optimization of flight paths for airliners, where in case of emergency a fast route re-calculation can be crucial. The fundamental idea is to train an RL Agent to pre-compute near-optimal paths based on location and atmospheric data and use those at runtime to constrain the underlying path planning solver and find a solution within a certain distance from the initial guess. The approach effectively reduces the size of the solver's search space, significantly speeding up route optimization. Although global optimality is not guaranteed, empirical results obtained with Airbus aircraft performance models show that fuel consumption remains nearly identical to that of an unconstrained solver, with deviations typically within 1%. At the same time, computation speed can be improved by up to 50% compared to using a conventional solver alone.

Object-Reconstruction-Aware Whole-body Control of Mobile Manipulators

Authors: Fatih Dursun, Bruno Vilhena Adorno, Simon Watson, Wei Pan
Date: 2025-09-04 10:52:27

Object reconstruction and inspection tasks play a crucial role in various robotics applications. Identifying paths that reveal the most unknown areas of the object becomes paramount in this context, as it directly affects efficiency, and this problem is known as the view path planning problem. Current methods often use sampling-based path planning techniques, evaluating potential views along the path to enhance reconstruction performance. However, these methods are computationally expensive as they require evaluating several candidate views on the path. To this end, we propose a computationally efficient solution that relies on calculating a focus point in the most informative (unknown) region and having the robot maintain this point in the camera field of view along the path. We incorporated this strategy into the whole-body control of a mobile manipulator employing a visibility constraint without the need for an additional path planner. We conducted comprehensive and realistic simulations using a large dataset of 114 diverse objects of varying sizes from 57 categories to compare our method with a sampling-based planning strategy using Bayesian data analysis. Furthermore, we performed real-world experiments with an 8-DoF mobile manipulator to demonstrate the proposed method's performance in practice. Our results suggest that there is no significant difference in object coverage and entropy. In contrast, our method is approximately nine times faster than the baseline sampling-based method in terms of the average time the robot spends between views.

Keypoint-based Diffusion for Robotic Motion Planning on the NICOL Robot

Authors: Lennart Clasmeier, Jan-Gerrit Habekost, Connor Gäde, Philipp Allgeuer, Stefan Wermter
Date: 2025-09-04 10:11:51

We propose a novel diffusion-based action model for robotic motion planning. Commonly, established numerical planning approaches are used to solve general motion planning problems, but have significant runtime requirements. By leveraging the power of deep learning, we are able to achieve good results in a much smaller runtime by learning from a dataset generated by these planners. While our initial model uses point cloud embeddings in the input to predict keypoint-based joint sequences in its output, we observed in our ablation study that it remained challenging to condition the network on the point cloud embeddings. We identified some biases in our dataset and refined it, which improved the model's performance. Our model, even without the use of the point cloud encodings, outperforms numerical planners by an order of magnitude in runtime, while reaching a success rate of up to 90% collision-free solutions on the test set.

FPC-VLA: A Vision-Language-Action Framework with a Supervisor for Failure Prediction and Correction

Authors: Yifan Yang, Zhixiang Duan, Tianshi Xie, Fuyu Cao, Pinxi Shen, Peili Song, Piaopiao Jin, Guokang Sun, Shaoqing Xu, Yangwei You, Jingtai Liu
Date: 2025-09-04 08:47:26

Robotic manipulation is a fundamental component of automation. However, traditional perception-planning pipelines often fall short in open-ended tasks due to limited flexibility, while the architecture of a single end-to-end Vision-Language-Action (VLA) offers promising capabilities but lacks crucial mechanisms for anticipating and recovering from failure. To address these challenges, we propose FPC-VLA, a dual-model framework that integrates VLA with a supervisor for failure prediction and correction. The supervisor evaluates action viability through vision-language queries and generates corrective strategies when risks arise, trained efficiently without manual labeling. A similarity-guided fusion module further refines actions by leveraging past predictions. Evaluation results on multiple simulation platforms (SIMPLER and LIBERO) and robot embodiments (WidowX, Google Robot, Franka) show that FPC-VLA outperforms state-of-the-art models in both zero-shot and fine-tuned settings. By activating the supervisor only at keyframes, our approach significantly increases task success rates with minimal impact on execution time. Successful real-world deployments on diverse, long-horizon tasks confirm FPC-VLA's strong generalization and practical utility for building more reliable autonomous systems.

Systematic Timing Leakage Analysis of NIST PQDSS Candidates: Tooling and Lessons Learned

Authors: Olivier Adjonyo, Sebastien Bardin, Emanuele Bellini, Gilbert Ndollane Dione, Mahmudul Faisal Al Ameen, Robert Merget, Frederic Recoules, Yanis Sellami
Date: 2025-09-04 08:41:06

The PQDSS standardization process requires cryptographic primitives to be free from vulnerabilities, including timing and cache side-channels. Resistance to timing leakage is therefore an essential property, and achieving this typically relies on software implementations that follow constant-time principles. Moreover, ensuring that all implementations are constant-time is crucial for fair performance comparisons, as secure implementations often incur additional overhead. Such analysis also helps identify scheme proposals that are inherently difficult to implement in constant time. Because constant-time properties can be broken during compilation, it is often necessary to analyze the compiled binary directly. Since manual binary analysis is extremely challenging, automated analysis becomes highly important. Although several tools exist to assist with such analysis, they often have usability limitations and are difficult to set up correctly. To support both developers and the NIST committee in verifying candidates, we developed a toolchain that automates configuration, execution, and result analysis for several widely used constant-time analysis tools. We selected TIMECOP and Binsec/Rel2 to verify constant-time policy compliance at the binary level, and dudect and RTLF to detect side-channel vulnerabilities through statistical analysis of execution time behavior. We demonstrate its effectiveness and practicability by evaluating the NIST PQDSS round 1 and round 2 implementations. We reported 26 issues in total to the respective developers, and 5 of them have already been fixed. We also discuss our different findings, as well as the benefits and shortcomings of the different tools.

Strengthening national capability in urban climate science: an Australian perspective

Authors: Negin Nazarian, Andy J Pitman, Mathew J Lipson, Melissa A Hart, Helen Cleugh, Ian Harman, Marcus J Thatcher, Annette L Hirsch, Giovanni Di Virgilio, Matthew L Riley, Nigel Tapper, Jason P Evans, Christian Jakob, Pascal Perez
Date: 2025-09-04 07:47:49

Cities are experiencing significant warming and more frequent climate extremes, raising risks for over 90% of Australians living in cities. Yet many of our tools for climate prediction and projection lack accurate representations of these environments. We also lack the observations and datasets needed to evaluate model performance. This paper identifies critical gaps in Australia's current capability, showing how they undermine climate impact and risk assessments in cities and may lead to poorly designed adaptation and mitigation strategies. These gaps, and the recommendations to address them, were identified through consultation with experts across research institutes, universities, two ARC Centres of Excellence, federal and state governments, and private agencies. Our recommendations span four key areas: city descriptive datasets, integrated observations, fit-for-purpose models, and a coordinated community of research and practice. Urgent action is needed to tailor models to Australia's unique urban landscapes and climates. This requires comprehensive, nationally consistent, high resolution datasets that capture the form, fabric, and function of contemporary and future cities. It also requires filling systematic gaps in integrated networks of urban climate observations for evaluation and benchmarking. At the same time, scientific understanding of key urban processes that influence weather and climate must advance, alongside improvements in their representation in physical models. This can be achieved through a national community of research and practice that co-designs and oversees an implementation plan, integrated with infrastructure such as ACCESS NRI and AURIN. Building this capability will enable us to answer critical questions about the interaction between cities and climate, protecting Australia's urban populations and ensuring a resilient future.

Handling Infinite Domain Parameters in Planning Through Best-First Search with Delayed Partial Expansions

Authors: Ángel Aso-Mollar, Diego Aineto, Enrico Scala, Eva Onaindia
Date: 2025-09-04 07:27:27

In automated planning, control parameters extend standard action representations through the introduction of continuous numeric decision variables. Existing state-of-the-art approaches have primarily handled control parameters as embedded constraints alongside other temporal and numeric restrictions, and thus have implicitly treated them as additional constraints rather than as decision points in the search space. In this paper, we propose an efficient alternative that explicitly handles control parameters as true decision points within a systematic search scheme. We develop a best-first, heuristic search algorithm that operates over infinite decision spaces defined by control parameters and prove a notion of completeness in the limit under certain conditions. Our algorithm leverages the concept of delayed partial expansion, where a state is not fully expanded but instead incrementally expands a subset of its successors. Our results demonstrate that this novel search algorithm is a competitive alternative to existing approaches for solving planning problems involving control parameters.
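The delayed-partial-expansion idea can be sketched as a best-first loop in which a popped state yields only its *next* successor and is then re-queued, so an infinite successor stream (here, discretized control-parameter values) never has to be enumerated up front. The toy domain and heuristic below are illustrative, not the paper's planner:

```python
import heapq
import itertools

def best_first_delayed(start, successors, is_goal, h):
    """Best-first search with delayed partial expansion: each pop
    consumes one successor from a (possibly infinite) generator,
    then the parent is pushed back so its remaining successors
    stay reachable."""
    counter = itertools.count()  # tie-breaker so states never compare
    frontier = [(h(start), next(counter), start, successors(start))]
    while frontier:
        f, _, state, gen = heapq.heappop(frontier)
        if is_goal(state):
            return state
        child = next(gen, None)
        if child is not None:
            heapq.heappush(frontier, (f, next(counter), state, gen))
            heapq.heappush(frontier,
                           (h(child), next(counter), child, successors(child)))
    return None

# Toy domain: reach a value >= 10 by adding a control parameter
# drawn from the infinite sequence 1, 1/2, 1/3, ...
def successors(x):
    return (x + 1.0 / n for n in itertools.count(1))

goal = best_first_delayed(0.0, successors,
                          lambda x: x >= 10,
                          lambda x: max(0.0, 10 - x))
```

Because un-expanded remainders stay on the queue with their parent's priority, every successor is eventually generated, which is the intuition behind the completeness-in-the-limit argument.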

National social cost of carbon: An application of FUND

Authors: In Chang Hwang, Richard S. J. Tol
Date: 2025-09-04 06:27:42

This paper presents a refined country-level integrated assessment model, FUND 3.9n, that extends the regional FUND 3.9 framework by incorporating sector-specific climate impact functions and parametric uncertainty analysis for 198 individual countries. The model enables estimation of the national social cost of carbon (NSCC), capturing heterogeneity across nations arising from economic structure, climate sensitivity, and population exposure. Our results demonstrate that both the NSCC and the global sum estimates are highly sensitive to damage specifications and preference parameters, including the pure rate of time preference and relative risk aversion. Compared to aggregated single-sector approaches, the disaggregated model with uncertainty yields higher values of the NSCC for low- and middle-income countries. The paper contributes to the literature by quantifying how sector-specific vulnerabilities and stochastic variability amplify climate damages and reshape global equity in the distribution of the NSCC. The NSCCs derived from our model offer policy-relevant metrics for adaptation planning, mitigation target setting, and equitable burden-sharing in international climate negotiations. This approach bridges the gap between globally harmonized carbon pricing and nationally differentiated climate impacts, providing a theoretically grounded and empirically rich framework for future climate policy design.
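At its core, a social cost of carbon is the present value of the stream of marginal damages caused by one extra tonne of CO2. A toy sketch of that accounting identity (FUND 3.9n couples this to sector-specific damage functions, country-level exposure, and Ramsey-style discounting, all of which this deliberately omits; the damage stream and rate below are fabricated):

```python
def social_cost_of_carbon(marginal_damages, discount_rate=0.03):
    """Toy SCC: present value of annual marginal damages ($/tCO2/yr)
    from an emissions pulse at t=0, under a constant discount rate."""
    return sum(d / (1 + discount_rate) ** t
               for t, d in enumerate(marginal_damages))

# hypothetical constant marginal damage of $1/tCO2/yr over 100 years
scc = social_cost_of_carbon([1.0] * 100, discount_rate=0.03)
```

The abstract's sensitivity findings correspond to varying `discount_rate` (via the pure rate of time preference and risk aversion) and the shape of `marginal_damages` (via the damage specification), both of which move the result substantially.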

Reactive In-Air Clothing Manipulation with Confidence-Aware Dense Correspondence and Visuotactile Affordance

Authors: Neha Sunil, Megha Tippur, Arnau Saumell, Edward Adelson, Alberto Rodriguez
Date: 2025-09-04 05:16:56

Manipulating clothing is challenging due to complex configurations, variable material dynamics, and frequent self-occlusion. Prior systems often flatten garments or assume visibility of key features. We present a dual-arm visuotactile framework that combines confidence-aware dense visual correspondence and tactile-supervised grasp affordance to operate directly on crumpled and suspended garments. The correspondence model is trained on a custom, high-fidelity simulated dataset using a distributional loss that captures cloth symmetries and generates correspondence confidence estimates. These estimates guide a reactive state machine that adapts folding strategies based on perceptual uncertainty. In parallel, a visuotactile grasp affordance network, self-supervised using high-resolution tactile feedback, determines which regions are physically graspable. The same tactile classifier is used during execution for real-time grasp validation. By deferring action in low-confidence states, the system handles highly occluded table-top and in-air configurations. We demonstrate our task-agnostic grasp selection module in folding and hanging tasks. Moreover, our dense descriptors provide a reusable intermediate representation for other planning modalities, such as extracting grasp targets from human video demonstrations, paving the way for more generalizable and scalable garment manipulation.

Human Motion Video Generation: A Survey

Authors: Haiwei Xue, Xiangyang Luo, Zhanghao Hu, Xin Zhang, Xunzhi Xiang, Yuqin Dai, Jianzhuang Liu, Zhensong Zhang, Minglei Li, Jian Yang, Fei Ma, Zhiyong Wu, Changpeng Yang, Zonghong Dai, Fei Richard Yu
Date: 2025-09-04 04:39:21

Human motion video generation has garnered significant research interest due to its broad applications, enabling innovations such as photorealistic singing heads or dynamic avatars that seamlessly dance to music. However, existing surveys in this field focus on individual methods, lacking a comprehensive overview of the entire generative process. This paper addresses this gap by providing an in-depth survey of human motion video generation, encompassing over ten sub-tasks, and detailing the five key phases of the generation process: input, motion planning, motion video generation, refinement, and output. Notably, this is the first survey that discusses the potential of large language models in enhancing human motion video generation. Our survey reviews the latest developments and technological trends in human motion video generation across three primary modalities: vision, text, and audio. By covering over two hundred papers, we offer a thorough overview of the field and highlight milestone works that have driven significant technological breakthroughs. Our goal for this survey is to unveil the prospects of human motion video generation and serve as a valuable resource for advancing the comprehensive applications of digital humans. A complete list of the models examined in this survey is available in our repository: https://github.com/Winn1y/Awesome-Human-Motion-Video-Generation.

Hardware-Aware Data and Instruction Mapping for AI Tasks: Balancing Parallelism, I/O and Memory Tradeoffs

Authors: Md Rownak Hossain Chowdhury, Mostafizur Rahman
Date: 2025-09-04 03:14:16

We introduce a mapping framework for deep learning inference that takes advantage of predictable neural network behavior to plan both computation and communication ahead of time. The framework generates a unified stream of instructions and data, enabling the hardware to execute operations and route information on its own, without frequent involvement from the host and with minimal off-chip memory use. This naturally reduces reliance on I/O, off-chip memory, and host control. By leveraging fine-grained message passing on a programmable, message-based compute architecture, the framework keeps data movement local and coordinates computation across the array using techniques such as stationary-weight reuse, in-array multicasting, and staged reductions. Applied to VGG-19, the framework sustains high utilization (88 to 92 percent), with over 97 percent of messages generated internally and nearly 89 percent of transfer time spent on on-chip transfers. Computation throughput scales beyond 1 TFLOP/s on larger arrays, while traffic reductions from reuse and local aggregation reach up to 100 MB per layer. Overall, the results highlight the effectiveness of streaming-based computation and show how our mapper enables this execution style by tightly coordinating data and instruction flow across the hardware.

A Multidimensional AI-powered Framework for Analyzing Tourist Perception in Historic Urban Quarters: A Case Study in Shanghai

Authors: Kaizhen Tan, Yufan Wu, Yuxuan Liu, Haoran Zeng
Date: 2025-09-04 02:35:14

Historic urban quarters play a vital role in preserving cultural heritage while serving as vibrant spaces for tourism and everyday life. Understanding how tourists perceive these environments is essential for sustainable, human-centered urban planning. This study proposes a multidimensional AI-powered framework for analyzing tourist perception in historic urban quarters using multimodal data from social media. Applied to twelve historic quarters in central Shanghai, the framework integrates focal point extraction, color theme analysis, and sentiment mining. Visual focus areas are identified from tourist-shared photos using a fine-tuned semantic segmentation model. To assess aesthetic preferences, dominant colors are extracted using a clustering method, and their spatial distribution across quarters is analyzed. Color themes are further compared between social media photos and real-world street views, revealing notable shifts. This divergence highlights potential gaps between visual expectations and the built environment, reflecting both stylistic preferences and perceptual bias. Tourist reviews are evaluated through a hybrid sentiment analysis approach combining a rule-based method and a multi-task BERT model. Satisfaction is assessed across four dimensions: tourist activities, built environment, service facilities, and business formats. The results reveal spatial variations in aesthetic appeal and emotional response. Rather than focusing on a single technical innovation, this framework offers an integrated, data-driven approach to decoding tourist perception and contributes to informed decision-making in tourism, heritage conservation, and the design of aesthetically engaging public spaces.
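The dominant-color step can be approximated cheaply: the sketch below quantizes RGB values onto a coarse grid and reports the most populated cells, a histogram stand-in for the clustering method the study uses. The pixel data are fabricated for illustration:

```python
import numpy as np

def dominant_colors(pixels, n=3, bins=4):
    """Approximate dominant colors of an (N, 3) RGB pixel array by
    counting occupancy of a coarse color grid (histogram stand-in
    for k-means-style clustering)."""
    q = np.clip(pixels // (256 // bins), 0, bins - 1)
    cell = q[:, 0] * bins * bins + q[:, 1] * bins + q[:, 2]
    counts = np.bincount(cell, minlength=bins ** 3)
    step = 256 // bins
    out = []
    for c in counts.argsort()[::-1][:n]:       # most populated cells first
        r, g, b = c // (bins * bins), (c // bins) % bins, c % bins
        out.append((int(r) * step + step // 2,  # cell center back to RGB
                    int(g) * step + step // 2,
                    int(b) * step + step // 2))
    return out

# hypothetical "tourist photo": mostly brick red, some grey pavement
img = np.vstack([np.tile([200, 60, 50], (700, 1)),
                 np.tile([120, 120, 120], (300, 1))])
colors = dominant_colors(img, n=2)
```

Comparing such palettes between tourist-shared photos and street-view imagery of the same quarter is what surfaces the expectation-versus-built-environment gap the abstract describes.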

Advancing Positron Emission Tomography Image Quantification: Artificial Intelligence-Driven Methods, Clinical Challenges, and Emerging Opportunities in Long-Axial Field-of-View Positron Emission Tomography/Computed Tomography Imaging

Authors: Fereshteh Yousefirizi, Movindu Dassanayake, Alejandro Lopez, Andrew Reader, Gary J. R. Cook, Clemens Mingels, Arman Rahmim, Robert Seifert, Ian Alberts
Date: 2025-09-03 22:35:42

Metabolic tumor volume (MTV) is increasingly recognized as an accurate, prognostically valuable estimate of disease burden, but its implementation has been hindered by the time-consuming need for manual segmentation of images. Automated quantification using AI-driven approaches is promising: it significantly reduces labor-intensive manual segmentation, improving consistency, reproducibility, and feasibility for routine clinical practice. AI-enhanced radiomics provides comprehensive characterization of tumor biology, capturing intratumoral and intertumoral heterogeneity beyond what conventional volumetric metrics alone offer, supporting improved patient stratification and therapy planning. AI-driven segmentation of normal organs improves radioligand therapy planning by enabling accurate dose predictions and comprehensive organ-based radiomics analysis, further refining personalized patient management.

LayoutGKN: Graph Similarity Learning of Floor Plans

Authors: Casper van Engelenburg, Jan van Gemert, Seyran Khademi
Date: 2025-09-03 21:56:16

Floor plans depict building layouts and are often represented as graphs to capture the underlying spatial relationships. Comparison of these graphs is critical for applications like search, clustering, and data visualization. The most successful methods to compare graphs, i.e., graph matching networks, rely on costly intermediate cross-graph node-level interactions and are therefore slow at inference time. We introduce LayoutGKN, a more efficient approach that postpones the cross-graph node-level interactions to the end of the joint embedding architecture. We do so by using a differentiable graph kernel as a distance function on the final learned node-level embeddings. We show that LayoutGKN computes similarity comparably or better than graph matching networks while significantly increasing the speed. Code and data are open at https://github.com/caspervanengelenburg/LayoutGKN.
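The core idea, a differentiable kernel applied only to the final node embeddings, can be sketched as a mean pairwise RBF kernel between the two embedding sets; LayoutGKN's exact kernel and pooling may differ:

```python
import numpy as np

def rbf_graph_kernel(X, Y, gamma=1.0):
    """Similarity between two graphs from their final node embeddings
    X (n x d) and Y (m x d): mean pairwise RBF kernel value. Fully
    differentiable, and computable after each graph is embedded
    independently -- no cross-graph interaction inside the encoder."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return float(np.exp(-gamma * d2).mean())

A = np.array([[0.0, 0.0], [1.0, 0.0]])   # toy node embeddings
B = A.copy()                              # an identical floor-plan graph
C = A + 5.0                               # a very different one
```

Because the kernel touches only final embeddings, each graph can be encoded once and cached, which is where the inference-time speedup over matching networks comes from.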

Parameter Tuning Under Uncertain Road Perception in Driver Assistance Systems

Authors:Leon Greiser, Christian Rathgeber, Vladislav Nenchev, Sören Hohmann
Date:2025-09-03 20:21:30

Advanced driver assistance systems have improved comfort, safety, and efficiency of modern vehicles. However, sensor limitations lead to noisy lane estimates that pose a significant challenge in developing performant control architectures. Lateral trajectory planning often employs an optimal control formulation to maintain lane position and minimize steering effort. The parameters are often tuned manually, which is a time-intensive procedure. This paper presents an automatic parameter tuning method for lateral planning in lane-keeping scenarios based on recorded data, while taking into account noisy road estimates. By simulating the lateral vehicle behavior along a reference curve, our approach efficiently optimizes planner parameters for automated driving and demonstrates improved performance on previously unseen test data.
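The tuning loop described above, simulating lateral behavior along a reference and scoring candidate planner parameters against recorded data, can be sketched minimally. The `simulate` callback, the scalar weight, and the squared-error cost are all assumptions for illustration, not the paper's formulation.

```python
import numpy as np

def tune_planner_weights(recorded, simulate, grid):
    """Hedged sketch of data-driven parameter tuning: for each candidate
    weight w (e.g., balancing lane-centering against steering effort),
    simulate the lateral positions along the recorded reference and keep
    the weight with the lowest tracking error."""
    best_w, best_err = None, np.inf
    for w in grid:
        sim = simulate(w)                            # simulated lateral positions
        err = float(np.mean((sim - recorded) ** 2))  # mismatch to the recorded drive
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err
```

In practice a gradient-based or Bayesian optimizer would replace the grid search, but the structure, simulate then score against recordings, stays the same.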

Efficient Virtuoso: A Latent Diffusion Transformer Model for Goal-Conditioned Trajectory Planning

Authors:Antonio Guillen-Perez
Date:2025-09-03 19:18:02

The ability to generate a diverse and plausible distribution of future trajectories is a critical capability for autonomous vehicle planning systems. While recent generative models have shown promise, achieving high fidelity, computational efficiency, and precise control remains a significant challenge. In this paper, we present the \textbf{Efficient Virtuoso}, a conditional latent diffusion model for goal-conditioned trajectory planning. Our approach introduces a novel two-stage normalization pipeline that first scales trajectories to preserve their geometric aspect ratio and then normalizes the resulting PCA latent space to ensure a stable training target. The denoising process is performed efficiently in this low-dimensional latent space by a simple MLP denoiser, which is conditioned on a rich scene context fused by a powerful Transformer-based StateEncoder. We demonstrate that our method achieves state-of-the-art performance on the Waymo Open Motion Dataset, reaching a \textbf{minADE of 0.25}. Furthermore, through a rigorous ablation study on goal representation, we provide a key insight: while a single endpoint goal can resolve strategic ambiguity, a richer, multi-step sparse route is essential for enabling the precise, high-fidelity tactical execution that mirrors nuanced human driving behavior.
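The two-stage normalization pipeline described in the abstract can be sketched with numpy. The anchoring at the trajectory's start, the latent dimensionality, and the epsilon values are assumptions; only the two stages themselves (aspect-ratio-preserving scaling, then standardizing a PCA latent space) come from the text.

```python
import numpy as np

def normalize_trajectories(trajs):
    """Sketch of a two-stage normalization: (1) scale each trajectory by a
    single global factor so its geometric aspect ratio is preserved;
    (2) project to a PCA latent space and standardize each latent
    dimension to give a stable training target for the denoiser."""
    trajs = np.asarray(trajs, dtype=float)           # (N, T, 2) xy waypoints
    N, T, D = trajs.shape
    centered = trajs - trajs[:, :1, :]               # anchor each trajectory at its start
    scale = np.abs(centered).max(axis=(1, 2), keepdims=True) + 1e-8
    scaled = centered / scale                        # one scalar per trajectory: aspect ratio kept
    flat = scaled.reshape(N, T * D)
    # PCA via SVD of the mean-centered data matrix
    mean = flat.mean(0, keepdims=True)
    _, _, Vt = np.linalg.svd(flat - mean, full_matrices=False)
    k = 4                                            # latent dimensionality (assumed)
    latent = (flat - mean) @ Vt[:k].T
    # Standardize the latent space so the denoiser sees unit-scale targets
    return (latent - latent.mean(0)) / (latent.std(0) + 1e-8)
```

Scaling by a single factor per trajectory (rather than per axis) is what keeps the aspect ratio intact; per-axis min-max scaling would distort curve shapes before PCA.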

Multilayer networks characterize human-mobility patterns by industry sector for the 2021 Texas winter storm

Authors:Melissa Butler, Alisha Khan, Francis Afrifa, Yingjie Hu, Dane Taylor
Date:2025-09-03 18:49:21

Understanding human mobility during disastrous events is crucial for emergency planning and disaster management. Here, we develop a methodology involving the construction of time-varying, multilayer networks in which edges encode observed movements between spatial regions (census tracts) and network layers encode different movement categories according to industry sectors (e.g., visitations to schools, hospitals, and grocery stores). This approach provides a rich characterization of human mobility, thereby complementing studies examining the risk-aversion activities of evacuation and sheltering in place. Focusing on the 2021 Texas winter storm, which led to many casualties, as a case study, we find that people largely reduced their movements to ambulatory healthcare services, restaurants, and schools, but prioritized movements to grocery stores and gas stations. Additionally, we study the predictability of nodes' in- and out-degrees in the multilayer networks, which encode movements into and out of census tracts. We find that inward movements are harder to predict than outward movements, and even more so during this winter storm. Our findings about the reduction, prioritization, and predictability of sector-specific human movements could inform mobility-related decisions arising from future extreme weather events.
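The multilayer construction described above, one layer per industry sector with directed edges between census tracts, and the in-/out-degree quantities studied for predictability, can be sketched with stdlib containers. The trip-record format is an assumption for illustration.

```python
from collections import defaultdict

def build_multilayer_network(trips):
    """Sketch: each trip is (origin_tract, dest_tract, sector), and each
    industry sector becomes its own layer of weighted directed edges."""
    layers = defaultdict(lambda: defaultdict(int))
    for origin, dest, sector in trips:
        layers[sector][(origin, dest)] += 1
    return layers

def degrees(layer):
    """Weighted in- and out-degree of each census tract in one layer:
    movements into and out of the tract, respectively."""
    out_deg, in_deg = defaultdict(int), defaultdict(int)
    for (origin, dest), w in layer.items():
        out_deg[origin] += w
        in_deg[dest] += w
    return in_deg, out_deg
```

A time-varying version would simply key the outer dictionary by (time window, sector) instead of sector alone.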

A Brenier Theorem on $(\mathcal{P}_2 (\mathcal{P}_2(\mathbb{R}^d )), W_2 )$ and Applications to Adapted Transport

Authors:Mathias Beiglböck, Gudmund Pammer, Stefan Schrott
Date:2025-09-03 17:41:32

Brenier's fundamental theorem characterizes optimal transport plans for measures $\mu, \nu$ on $\mathbb{R}^d$ and quadratic distance costs in terms of gradients of convex functions. In particular, it guarantees the existence of optimal transport maps for measures that are absolutely continuous with respect to Lebesgue measure. Our goal is to provide a version of this result for measures $P,Q$ on $\mathcal{P}_2(\mathbb{R}^d)$ and costs given by the squared Wasserstein distance $W_2^2(\mu, \nu)$. We characterize optimizers in terms of convexity of the Lions lift. This is based on an observation which seems to be of independent interest: the $c$-transform of a functional $\phi$, where $c(\mu, \nu)$ denotes the maximal covariance of $\mu, \nu$, corresponds precisely to the Legendre transform of the Lions lift of $\phi$. Moreover, we show that for typical $P \in \mathcal{P}_2(\mathcal{P}_2(\mathbb{R}^d))$ the optimizer is unique and given by a transport map. In the absence of a canonical reference measure on $\mathcal{P}_2(\mathbb{R}^d)$ we use a topological notion to make `typical' precise. Specifically, we show that the transport-regular measures are of second Baire category. A particular motivation for our article stems from the theory of adapted transport, where the adapted Wasserstein distance provides an adequate distance between stochastic processes. In contrast to other metrics, the adapted Wasserstein distance yields continuity of the Doob decomposition, optimal stopping, and stochastic control problems. Based on our results for measures on $\mathcal{P}_2(\mathbb{R}^d)$ we obtain a first Brenier-type theorem for the adapted Wasserstein distance.
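For context, the classical statement being lifted here is standard (this formulation is textbook material, not taken from the paper): for $\mu, \nu \in \mathcal{P}_2(\mathbb{R}^d)$ with $\mu$ absolutely continuous with respect to Lebesgue measure, the quadratic-cost optimal coupling is unique and induced by the gradient of a convex function,

```latex
T = \nabla \varphi, \qquad \varphi \ \text{convex}, \qquad T_{\#}\mu = \nu,
\qquad W_2^2(\mu,\nu) = \int_{\mathbb{R}^d} |x - T(x)|^2 \, d\mu(x).
```

The paper's contribution is the analogue one level up: measures on $\mathcal{P}_2(\mathbb{R}^d)$, with convexity of $\varphi$ replaced by convexity of the Lions lift.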

Real-Time Instrument Planning and Perception for Novel Measurements of Dynamic Phenomena

Authors:Itai Zilberstein, Alberto Candela, Steve Chien
Date:2025-09-03 17:32:15

Advancements in onboard computing mean that remote sensing agents can employ state-of-the-art computer vision and machine learning at the edge. These capabilities can be leveraged to unlock new rare, transient, and pinpoint measurements of dynamic science phenomena. In this paper, we present an automated workflow that synthesizes the detection of these dynamic events in look-ahead satellite imagery with autonomous trajectory planning for a follow-up high-resolution sensor to obtain pinpoint measurements. We apply this workflow to the use case of observing volcanic plumes. We analyze classification approaches including traditional machine learning algorithms and convolutional neural networks. We present several trajectory planning algorithms that track the morphological features of a plume and integrate these algorithms with the classifiers. We show through simulation an order-of-magnitude increase in the utility return of the high-resolution instrument over baselines while maintaining efficient runtimes.

ResearchPulse: Building Method-Experiment Chains through Multi-Document Scientific Inference

Authors:Qi Chen, Jingxuan Wei, Zhuoya Yao, Haiguang Wang, Gaowei Wu, Bihui Yu, Siyuan Li, Cheng Tan
Date:2025-09-03 15:45:03

Understanding how scientific ideas evolve requires more than summarizing individual papers; it demands structured, cross-document reasoning over thematically related research. In this work, we formalize multi-document scientific inference, a new task that extracts and aligns motivation, methodology, and experimental results across related papers to reconstruct research development chains. This task introduces key challenges, including temporally aligning loosely structured methods and standardizing heterogeneous experimental tables. We present ResearchPulse, an agent-based framework that integrates instruction planning, scientific content extraction, and structured visualization. It consists of three coordinated agents: a Plan Agent for task decomposition, a Mmap-Agent that constructs motivation-method mind maps, and a Lchart-Agent that synthesizes experimental line charts. To support this task, we introduce ResearchPulse-Bench, a citation-aware benchmark of annotated paper clusters. Experiments show that our system, despite using 7B-scale agents, consistently outperforms strong baselines like GPT-4o in semantic alignment, structural consistency, and visual fidelity. The dataset is available at https://huggingface.co/datasets/ResearchPulse/ResearchPulse-Bench.

Generative Auto-Bidding in Large-Scale Competitive Auctions via Diffusion Completer-Aligner

Authors:Yewen Li, Jingtong Gao, Nan Jiang, Shuai Mao, Ruyi An, Fei Pan, Xiangyu Zhao, Bo An, Qingpeng Cai, Peng Jiang
Date:2025-09-03 14:25:36

Auto-bidding is central to computational advertising, achieving notable commercial success by optimizing advertisers' bids within economic constraints. Recently, large generative models have shown the potential to revolutionize auto-bidding by generating bids that flexibly adapt to complex, competitive environments. Among them, diffusers stand out for their ability to address sparse-reward challenges by focusing on trajectory-level accumulated rewards, as well as for their explainability, i.e., planning a future trajectory of states and executing bids accordingly. However, diffusers struggle with generation uncertainty, particularly regarding dynamic legitimacy between adjacent states, which can lead to poor bids and, in a highly competitive auction environment, a significant loss of ad impression opportunities to other advertisers. To address this, we propose a Causal auto-Bidding method based on a Diffusion completer-aligner framework, termed CBD. First, we augment the diffusion training process with an extra random variable t, where the model observes t-length historical sequences with the goal of completing the remaining sequence, thereby enhancing the generated sequences' dynamic legitimacy. Then, we employ a trajectory-level return model to refine the generated trajectories, aligning them more closely with advertisers' objectives. Experimental results across diverse settings demonstrate that our approach not only achieves superior performance on large-scale auto-bidding benchmarks, such as a 29.9% improvement in conversion value in the challenging sparse-reward auction setting, but also delivers significant improvements on the Kuaishou online advertising platform, including a 2.0% increase in target cost.
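The completion-style augmentation described above, drawing a random history length t and training the model to complete the rest of the sequence, can be sketched at the data-preparation level. This is an assumption about the setup for illustration only; the paper's actual conditioning and loss are not reproduced here.

```python
import numpy as np

def completion_training_batch(trajectories, rng):
    """Sketch: for each state trajectory, draw a random prefix length t,
    expose the first t states as observed history (conditioning context),
    and mark the remaining states as the completion target, so the model
    learns transitions that are consistent with an observed past."""
    contexts, targets = [], []
    for traj in trajectories:          # traj: (T, d) state sequence
        T = traj.shape[0]
        t = int(rng.integers(1, T))    # random observed-history length, 1 <= t < T
        contexts.append(traj[:t])      # model observes this prefix
        targets.append(traj[t:])       # and learns to complete this suffix
    return contexts, targets
```

Varying t across the batch is what forces the generated suffix to stay dynamically legitimate with respect to an arbitrary observed prefix, rather than only with a fixed-length one.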

The Role of Embodiment in Intuitive Whole-Body Teleoperation for Mobile Manipulation

Authors:Sophia Bianchi Moyen, Rickmer Krohn, Sophie Lueth, Kay Pompetzki, Jan Peters, Vignesh Prasad, Georgia Chalvatzaki
Date:2025-09-03 11:25:36

Intuitive teleoperation interfaces are essential for mobile manipulation robots to ensure high-quality data collection while reducing operator workload. A strong sense of embodiment combined with minimal physical and cognitive demands not only enhances the user experience during large-scale data collection, but also helps maintain data quality over extended periods. This becomes especially crucial for challenging long-horizon mobile manipulation tasks that require whole-body coordination. We compare two distinct robot control paradigms: a coupled embodiment integrating arm manipulation and base navigation functions, and a decoupled embodiment treating these systems as separate control entities. Additionally, we evaluate two visual feedback mechanisms: immersive virtual reality and conventional screen-based visualization of the robot's field of view. These configurations were systematically assessed across a complex, multi-stage task sequence requiring integrated planning and execution. Our results show that the use of VR as a feedback modality increases task completion time, cognitive workload, and perceived effort of the teleoperator. Coupling manipulation and navigation places a workload on the user comparable to decoupling the embodiments, while preliminary experiments suggest that data acquired through coupled teleoperation leads to better imitation learning performance. Our holistic view of intuitive teleoperation interfaces provides valuable insight into collecting high-quality, high-dimensional mobile manipulation data at scale with the human operator in mind. Project website: https://sophiamoyen.github.io/role-embodiment-wbc-moma-teleop/

Plan More, Debug Less: Applying Metacognitive Theory to AI-Assisted Programming Education

Authors:Tung Phung, Heeryung Choi, Mengyan Wu, Adish Singla, Christopher Brooks
Date:2025-09-03 09:38:43

The growing adoption of generative AI in education highlights the need to integrate established pedagogical principles into AI-assisted learning environments. This study investigates the potential of metacognitive theory to inform AI-assisted programming education through a hint system designed around the metacognitive phases of planning, monitoring, and evaluation. Upon request, the system can provide three types of AI-generated hints--planning, debugging, and optimization--to guide students at different stages of problem-solving. Through a study with 102 students in an introductory data science programming course, we find that students rate planning hints most highly and engage with them most, whereas optimization hints are rarely requested. We observe a consistent association between requesting planning hints and achieving higher grades across question difficulty and student competency. However, when facing harder tasks, students seek additional debugging support but not more planning support. These insights contribute to the growing field of AI-assisted programming education by providing empirical evidence on the importance of pedagogical principles in AI-assisted learning.