planning - 2025-07-15

EmbRACE-3K: Embodied Reasoning and Action in Complex Environments

Authors:Mingxian Lin, Wei Huang, Yitang Li, Chengjie Jiang, Kui Wu, Fangwei Zhong, Shengju Qian, Xin Wang, Xiaojuan Qi

Date:2025-07-14 17:59:46

Recent advanced vision-language models (VLMs) have demonstrated strong performance on passive, offline image and video understanding tasks. However, their effectiveness in embodied settings, which require online interaction and active scene understanding, remains limited. In such scenarios, an agent perceives the environment from a first-person perspective, with each action dynamically shaping subsequent observations. Even state-of-the-art models such as GPT-4o, Claude 3.5 Sonnet, and Gemini 2.5 Pro struggle in open-environment interactions, exhibiting clear limitations in spatial reasoning and long-horizon planning. To address this gap, we introduce EmbRACE-3K, a dataset of over 3,000 language-guided tasks situated in diverse, photorealistic environments constructed using Unreal Engine and the UnrealCV-Zoo framework. The tasks encompass a wide range of embodied challenges, including navigation, object manipulation, and multi-stage goal execution. Each task unfolds as a multi-step trajectory, pairing first-person visual observations with high-level instructions, grounded actions, and natural language rationales that express the agent's intent at every step. Using EmbRACE-3K, we establish a benchmark to evaluate the embodied reasoning capabilities of VLMs across three key dimensions: Exploration, Dynamic Spatial-Semantic Reasoning, and Multi-stage Goal Execution. In zero-shot settings, all models achieve success rates below 20%, underscoring the challenge posed by our benchmark and the current limitations of VLMs in interactive environments. To demonstrate the utility of EmbRACE-3K, we further fine-tune Qwen2.5-VL-7B using supervised learning followed by reinforcement learning. This approach yields substantial improvements across all three challenge categories, highlighting the dataset's effectiveness in enabling the development of embodied reasoning capabilities.

Graph World Model

Authors:Tao Feng, Yexin Wu, Guanyu Lin, Jiaxuan You

Date:2025-07-14 17:57:45

World models (WMs) demonstrate strong capabilities in prediction, generation, and planning tasks. Existing WMs primarily focus on unstructured data and cannot leverage the ubiquitous structured data, often represented as graphs, in the digital world. While multiple graph foundation models have been proposed, they focus on graph learning tasks and cannot extend to diverse multi-modal data and interdisciplinary tasks. To address these challenges, we propose the Graph World Model (GWM), a world model that supports both unstructured and graph-structured states with multi-modal information and represents diverse tasks as actions. The core of a GWM is a generic message-passing algorithm to aggregate structured information, either over a unified multi-modal token space by converting multi-modal data into text (GWM-T) or a unified multi-modal embedding space by modality-specific encoders (GWM-E). Notably, GWM introduces action nodes to support diverse tasks, where action nodes are linked to other nodes via direct reference or similarity computation. Extensive experiments on six tasks from diverse domains, including multi-modal generation and matching, recommendation, graph prediction, multi-agent, retrieval-augmented generation, and planning and optimization, show that the same GWM outperforms or matches domain-specific baselines' performance, benefits from multi-hop structures, and demonstrates strong zero-shot/few-shot capabilities on unseen new tasks. Our code for GWM is released at https://github.com/ulab-uiuc/GWM.
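
As a rough illustration of the message-passing idea described above, the following sketch averages each node's embedding with those of its neighbours, with an action node linked to the nodes it refers to. The mean aggregator, the node names, and the function `message_passing` are assumptions made for illustration; the abstract does not specify GWM's actual update rule.

```python
def message_passing(embeddings, edges, hops=1):
    """Mean-aggregation message passing (an assumed aggregator, not GWM's).

    embeddings: dict mapping node name -> list of floats
    edges: list of undirected (node_a, node_b) pairs
    hops: number of propagation rounds (multi-hop when > 1)
    """
    neighbors = {node: [] for node in embeddings}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    state = {node: list(vec) for node, vec in embeddings.items()}
    for _ in range(hops):
        new_state = {}
        for node, vec in state.items():
            # Each node averages its own vector with its neighbours' vectors.
            group = [vec] + [state[m] for m in neighbors[node]]
            new_state[node] = [sum(col) / len(group) for col in zip(*group)]
        state = new_state
    return state


# Hypothetical toy graph: an action node linked to two state nodes.
toy_embeddings = {"item": [1.0, 0.0], "user": [0.0, 1.0], "action": [0.0, 0.0]}
toy_edges = [("action", "item"), ("action", "user")]
updated = message_passing(toy_embeddings, toy_edges, hops=1)
print(updated["action"])
```

After one hop the action node holds a mix of the item and user embeddings, which is the sense in which an action node can gather structured context before a task-specific decoder is applied.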

$^{88}$Sr Reference Data

Authors:Sebastian Pucher, Sofus Laguna Kristensen, Ronen M. Kroeze

Date:2025-07-14 17:09:39

Strontium-88 is a versatile atomic species often used in quantum optics, precision metrology, and quantum computing. Consolidated atomic data is essential for the planning, execution, and evaluation of experiments. In this reference, we present physical and optical properties of neutral $^{88}$Sr relevant to these applications. Here we focus on experimental results and supplement these with theoretical values. We present equations to convert values and derive important parameters. Tabulated results include key parameters for commonly used transitions in $^{88}$Sr ($^1\mathrm{S}_0 \rightarrow \, ^1\mathrm{P}_1$, $^1\mathrm{S}_0 \rightarrow \, ^3\mathrm{P}_{0,1,2}$, and $^3\mathrm{P}_{0,1,2} \rightarrow \, ^3\mathrm{S}_1$). This dataset serves as an up-to-date reference for studies involving bosonic $^{88}$Sr.

Intimate partner violence and women's economic preferences

Authors:Dan Anderberg, Rachel Cassidy, Anaya Dam, Melissa Hidrobo, Jessica Leight, Karlijn Morsink

Date:2025-07-14 15:57:32

One in three women globally experiences intimate partner violence (IPV), yet little is known about how such trauma affects economic decision-making. We provide causal evidence that IPV influences women's time preferences - a key parameter in models of savings, investment, and labor supply. We combine two empirical strategies using four distinct datasets. First, in two randomized recall experiments in Ethiopia, we randomly assigned women to recall specific acts of abuse before eliciting their intertemporal choices. Women with IPV experiences prompted to recall IPV display significantly greater impatience than otherwise similar women who are not prompted. Second, we exploit exogenous reductions in IPV generated by two randomized interventions - one involving cash transfers, the other psychotherapy - and use treatment assignment as an instrument for IPV exposure. Women who experience reduced IPV as a result of treatment exhibit more patient time preferences. Together, these results provide consistent, novel causal evidence that exposure to IPV induces individuals to discount the future more heavily. This evidence suggests a psychological channel through which violence can perpetuate economic disadvantage and constrain women's ability to take actions - such as saving, investing, or exiting abusive relationships - that require planning over time.

TOP: Trajectory Optimization via Parallel Optimization towards Constant Time Complexity

Authors:Jiajun Yu, Nanhe Chen, Guodong Liu, Chao Xu, Fei Gao, Yanjun Cao

Date:2025-07-14 13:56:59

Optimization has been widely used to generate smooth trajectories for motion planning. However, existing trajectory optimization methods show weakness when dealing with large-scale long trajectories. Recent advances in parallel computing have accelerated optimization in some fields, but how to efficiently solve trajectory optimization via parallelism remains an open question. In this paper, we propose a novel trajectory optimization framework based on the Consensus Alternating Direction Method of Multipliers (CADMM) algorithm, which decomposes the trajectory into multiple segments and solves the subproblems in parallel. The proposed framework reduces the time complexity per iteration to O(1) with respect to the number of segments, compared to O(N) for the state-of-the-art (SOTA) approaches. Furthermore, we introduce a closed-form solution that integrates convex linear and quadratic constraints to speed up the optimization, and we also present numerical solutions for general inequality constraints. A series of simulations and experiments demonstrate that our approach outperforms the SOTA approach in terms of efficiency and smoothness. In particular, for a large-scale trajectory with one hundred segments, it achieves over a tenfold speedup. To fully explore the potential of our algorithm on modern parallel computing architectures, we deploy our framework on a GPU and show high performance with thousands of segments.
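
To make the segment-wise decomposition concrete, here is a minimal consensus-ADMM sketch on a toy 1D problem. Each segment keeps private copies of its two endpoints and penalises its own squared length, while neighbouring segments must agree on their shared junction. The cost function, the function name `consensus_admm_path`, and all parameter values are assumptions chosen for illustration; this is not the paper's formulation or its closed-form constraint handling.

```python
def consensus_admm_path(start, goal, n_segments, rho=1.0, iters=2000):
    """Toy consensus ADMM: minimise sum_k (q_k - p_k)^2 over segment endpoints,
    subject to p_k = z_k and q_k = z_{k+1}, with z_0 = start and z_n = goal fixed."""
    n = n_segments
    z = [float(start)] * n + [float(goal)]  # junction values z_0 .. z_n
    p = [float(start)] * n                  # local copy of each segment's left endpoint
    q = [float(start)] * n                  # local copy of each segment's right endpoint
    u = [0.0] * n                           # scaled dual for p_k = z_k
    v = [0.0] * n                           # scaled dual for q_k = z_{k+1}

    for _ in range(iters):
        # Local step: each segment solves its own small problem in closed form.
        # The iterations of this loop are independent, so they could run in parallel.
        for k in range(n):
            a = z[k] - u[k]
            b = z[k + 1] - v[k]
            d = rho * (b - a) / (4.0 + rho)  # optimal q_k - p_k
            p[k] = 0.5 * (a + b - d)
            q[k] = 0.5 * (a + b + d)

        # Consensus step: each interior junction averages the two copies that share it.
        for j in range(1, n):
            z[j] = 0.5 * ((q[j - 1] + v[j - 1]) + (p[j] + u[j]))

        # Dual step: accumulate the remaining disagreement.
        for k in range(n):
            u[k] += p[k] - z[k]
            v[k] += q[k] - z[k + 1]
    return z


junctions = consensus_admm_path(start=0.0, goal=4.0, n_segments=4)
print([round(value, 3) for value in junctions])
```

For this toy cost the optimum is a set of evenly spaced junctions, so the iterates should approach that spacing. The point of the sketch is the structure: the local step touches only one segment's data, which is what allows the per-iteration cost to stay flat as segments are added when enough parallel workers are available.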

A Cost Effective Optimization of the hybrid-DOM Design for TRIDENT

Authors:Hengbin Shao, Fuyudi Zhang, Qichao Chang, Shuhua Hao, Ruike Cao, Jingtao Huang, Weilun Huang, Hai Liu, Hualin Mei, Iwan Morton-Blake, Wei Tian, Yingwei Wang, Xin Xiang, Donglian Xu

Date:2025-07-14 13:28:22

TRIDENT is a planned multi-cubic-kilometer deep-sea neutrino telescope to be built in the South China Sea, designed to rapidly discover high-energy astrophysical neutrino sources with sensitivity to all neutrino flavors. Achieving this at scale requires a detector design that balances performance with power, cost, and mechanical simplicity. This study presents a cost-effective optimization of TRIDENT's hybrid Digital Optical Module (hDOM) design, comparing configurations using high-quantum-efficiency (QE) 3-inch PMTs and larger 4-inch PMTs, the latter evaluated with both baseline and enhanced QE assumptions. Using full-chain detector simulations incorporating site-specific seawater optical properties and realistic backgrounds, we assess performance in all-flavor neutrino detection efficiency, directional reconstruction, and tau neutrino flavor identification from 1 TeV to 10 PeV. We find that if 4-inch PMTs can achieve QE comparable to 3-inch PMTs, their performance matches or improves upon that of the 3-inch design, while significantly reducing channel count, power consumption, and cost. These findings support the 4-inch PMT hDOM as a promising and scalable choice for TRIDENT's future instrumentation.

REACT: Real-time Entanglement-Aware Coverage Path Planning for Tethered Underwater Vehicles

Authors:Abdelhakim Amer, Mohit Mehindratta, Yury Brodskiy, Bilal Wehbe, Erdal Kayacan

Date:2025-07-14 12:18:01

Inspection of complex underwater structures with tethered underwater vehicles is often hindered by the risk of tether entanglement. We propose REACT (real-time entanglement-aware coverage path planning for tethered underwater vehicles), a framework designed to overcome this limitation. REACT comprises a fast geometry-based tether model using the signed distance field (SDF) map for accurate, real-time simulation of taut tether configurations around arbitrary structures in 3D. This model enables an efficient online replanning strategy by enforcing a maximum tether length constraint, thereby actively preventing entanglement. By integrating REACT into a coverage path planning framework, we achieve safe and optimal inspection paths, previously challenging due to tether constraints. The complete REACT framework's efficacy is validated in a pipe inspection scenario, demonstrating safe, entanglement-free navigation and full-coverage inspection. Simulation results show that REACT achieves complete coverage while maintaining tether constraints and completing the total mission 20% faster than conventional planners, despite a longer inspection time due to proactive avoidance of entanglement that eliminates extensive post-mission disentanglement. Real-world experiments confirm these benefits, where REACT completes the full mission, while the baseline planner fails due to physical tether entanglement.

Recursive Feasibility without Terminal Constraints via Parent-Child MPC Architecture

Authors:Filip Surmaa, Anahita Jamshidnejad

Date:2025-07-14 11:26:59

This paper proposes a novel hierarchical model predictive control (MPC) framework, called the Parent-Child MPC architecture, to steer nonlinear systems under uncertainty towards a target set, balancing computational complexity and guaranteeing recursive feasibility and stability without relying on conservative terminal constraints in online decision-making. By coupling a small-horizon Child MPC layer with one or more large-horizon Parent MPC layers, the architecture ensures recursive feasibility and stability through adjustable stage-wise constraints derived from tube-based control. As is demonstrated in our case studies, compared to traditional MPC methods, the proposed Parent-Child MPC architecture enhances performance and computational efficiency, reduces conservativeness, and enables scalable planning for certain nonlinear systems.

Analysis of AI Techniques for Orchestrating Edge-Cloud Application Migration

Authors:Sadig Gojayev, Ahmad Anaqreh, Carolina Fortuna

Date:2025-07-14 10:03:23

Application migration in edge-cloud systems enables high QoS and cost-effective service delivery. However, automatically orchestrating such migration is typically solved with heuristic approaches. Starting from the Markov Decision Process (MDP), in this paper, we identify, analyze and compare selected state-of-the-art Artificial Intelligence (AI) planning and Reinforcement Learning (RL) approaches for solving the class of edge-cloud application migration problems that can be modeled as Towers of Hanoi (ToH) problems. We introduce a new classification based on state space definition and analyze the compared models also through this lens. The aim is to understand available techniques capable of orchestrating such application migration in emerging computing continuum environments.

MP-RBFN: Learning-based Vehicle Motion Primitives using Radial Basis Function Networks

Authors:Marc Kaufeld, Mattia Piccinini, Johannes Betz

Date:2025-07-14 08:26:41

This research introduces MP-RBFN, a novel formulation leveraging Radial Basis Function Networks for efficiently learning Motion Primitives derived from optimal control problems for autonomous driving. While traditional motion planning approaches based on optimization are highly accurate, they are often computationally prohibitive. In contrast, sampling-based methods demonstrate high performance but impose constraints on the geometric shape of trajectories. MP-RBFN combines the strengths of both by coupling the high-fidelity trajectory generation of sampling-based methods with an accurate description of vehicle dynamics. Empirical results show compelling performance compared to previous methods, achieving a precise description of motion primitives at low inference times. MP-RBFN yields a seven times higher accuracy in generating optimized motion primitives compared to existing semi-analytic approaches. We demonstrate the practical applicability of MP-RBFN for motion planning by integrating the method into a sampling-based trajectory planner. MP-RBFN is available as open-source software at https://github.com/TUM-AVS/RBFN-Motion-Primitives.

Ariel Explores: Vision-based underwater exploration and inspection via generalist drone-level autonomy

Authors:Mohit Singh, Mihir Dharmadhikari, Kostas Alexis

Date:2025-07-14 07:36:25

This work presents a vision-based underwater exploration and inspection autonomy solution integrated into Ariel, a custom vision-driven underwater robot. Ariel carries a five-camera, IMU-based sensing suite, enabling a refraction-aware multi-camera visual-inertial state estimation method aided by a learning-based proprioceptive robot velocity prediction method that enhances robustness against visual degradation. Furthermore, our previously developed and extensively field-verified autonomous exploration and general visual inspection solution is integrated on Ariel, providing aerial drone-level autonomy underwater. The proposed system is field-tested in a submarine dry dock in Trondheim under challenging visual conditions. The field demonstration shows the robustness of the state estimation solution and the generalizability of the path planning techniques across robot embodiments.

Graph-based Multi-Modal Interaction Lightweight Network for Brain Tumor Segmentation (GMLN-BTS) in Edge Iterative MRI Lesion Localization System (EdgeIMLocSys)

Authors:Guohao Huo, Ruiting Dai, Hao Tang

Date:2025-07-14 07:29:49

Brain tumor segmentation plays a critical role in clinical diagnosis and treatment planning, yet the variability in imaging quality across different MRI scanners presents significant challenges to model generalization. To address this, we propose the Edge Iterative MRI Lesion Localization System (EdgeIMLocSys), which integrates Continuous Learning from Human Feedback to adaptively fine-tune segmentation models based on clinician feedback, thereby enhancing robustness to scanner-specific imaging characteristics. Central to this system is the Graph-based Multi-Modal Interaction Lightweight Network for Brain Tumor Segmentation (GMLN-BTS), which employs a Modality-Aware Adaptive Encoder (M2AE) to extract multi-scale semantic features efficiently, and a Graph-based Multi-Modal Collaborative Interaction Module (G2MCIM) to model complementary cross-modal relationships via graph structures. Additionally, we introduce a novel Voxel Refinement UpSampling Module (VRUM) that synergistically combines linear interpolation and multi-scale transposed convolutions to suppress artifacts while preserving high-frequency details, improving segmentation boundary accuracy. Our proposed GMLN-BTS model achieves a Dice score of 85.1% on the BraTS2017 dataset with only 4.58 million parameters, representing a 98% reduction compared to mainstream 3D Transformer models, and significantly outperforms existing lightweight approaches. This work demonstrates a synergistic breakthrough in achieving high-accuracy, resource-efficient brain tumor segmentation suitable for deployment in resource-constrained clinical environments.

Demonstrating the Octopi-1.5 Visual-Tactile-Language Model

Authors:Samson Yu, Kelvin Lin, Harold Soh

Date:2025-07-14 07:05:36

Touch is recognized as a vital sense for humans and an equally important modality for robots, especially for dexterous manipulation, material identification, and scenarios involving visual occlusion. Building upon very recent work in touch foundation models, this demonstration will feature Octopi-1.5, our latest visual-tactile-language model. Compared to its predecessor, Octopi-1.5 introduces the ability to process tactile signals from multiple object parts and employs a simple retrieval-augmented generation (RAG) module to improve performance on tasks and potentially learn new objects on-the-fly. The system can be experienced live through a new handheld tactile-enabled interface, the TMI, equipped with GelSight and TAC-02 tactile sensors. This convenient and accessible setup allows users to interact with Octopi-1.5 without requiring a robot. During the demonstration, we will showcase Octopi-1.5 solving tactile inference tasks by leveraging tactile inputs and commonsense knowledge. For example, in a Guessing Game, Octopi-1.5 will identify objects being grasped and respond to follow-up queries about how to handle them (e.g., recommending careful handling for soft fruits). We also plan to demonstrate Octopi-1.5's RAG capabilities by teaching it new items. With live interactions, this demonstration aims to highlight both the progress and limitations of VTLMs such as Octopi-1.5 and to foster further interest in this exciting field. Code for Octopi-1.5 and design files for the TMI gripper are available at https://github.com/clear-nus/octopi-1.5.

A Brain Tumor Segmentation Method Based on CLIP and 3D U-Net with Cross-Modal Semantic Guidance and Multi-Level Feature Fusion

Authors:Mingda Zhang

Date:2025-07-14 06:32:59

Precise segmentation of brain tumors from magnetic resonance imaging (MRI) is essential for neuro-oncology diagnosis and treatment planning. Despite advances in deep learning methods, automatic segmentation remains challenging due to tumor morphological heterogeneity and complex three-dimensional spatial relationships. Current techniques primarily rely on visual features extracted from MRI sequences while underutilizing semantic knowledge embedded in medical reports. This research presents a multi-level fusion architecture that integrates pixel-level, feature-level, and semantic-level information, facilitating comprehensive processing from low-level data to high-level concepts. The semantic-level fusion pathway combines the semantic understanding capabilities of Contrastive Language-Image Pre-training (CLIP) models with the spatial feature extraction advantages of 3D U-Net through three mechanisms: 3D-2D semantic bridging, cross-modal semantic guidance, and semantic-based attention mechanisms. Experimental validation on the BraTS 2020 dataset demonstrates that the proposed model achieves an overall Dice coefficient of 0.8567, representing a 4.8% improvement compared to traditional 3D U-Net, with a 7.3% Dice coefficient increase in the clinically important enhancing tumor (ET) region.
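
For readers unfamiliar with the metric quoted above, the Dice coefficient between a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it for flattened binary masks; the function name `dice_coefficient`, the toy masks, and the convention for two empty masks are assumptions for illustration and are not taken from the paper's evaluation code.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy masks: 3 predicted voxels, 3 reference voxels, 2 of them shared.
toy_pred = [1, 1, 0, 0, 1, 0]
toy_target = [1, 0, 0, 0, 1, 1]
print(dice_coefficient(toy_pred, toy_target))  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the masks coincide exactly and 0.0 means they share no voxels, so values such as 0.8567 sit toward the high-overlap end of the scale.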

Modeling Cholera Dynamics with Vaccination as the Control Strategy and Seasonal-forcing Transmission

Authors:Eric Herrison Gyamfi

Date:2025-07-14 05:47:21

This study presents a seasonally forced cholera model that incorporates imperfect vaccination as a control strategy. The model captures the temporal dynamics of susceptible, vaccinated, infected, and recovered individuals, as well as the environmental pathogen concentration. A key focus is the instantaneous reproduction number, which serves as a threshold indicator for outbreak persistence or elimination. When the reproduction number is below one, the disease-free equilibrium is attainable; otherwise, endemic conditions persist. We conduct a sensitivity analysis to evaluate the influence of two critical parameters: the vaccination rate and the waning rate of immunity. Results show that increasing the vaccination rate and reducing the waning rate significantly decrease the reproduction number, reinforcing the importance of sustained vaccine efficacy. Seasonal forcing amplifies the complexity of cholera dynamics, revealing the need for timely public health interventions, especially before high-transmission periods. This model demonstrates practical applicability in informing vaccination strategies, especially in resource-limited settings prone to seasonal outbreaks. It offers a flexible framework for public health planning, adaptable to other waterborne diseases. The findings suggest that integrated approaches combining vaccination, improved sanitation, and targeted education are essential to reducing cholera transmission and achieving long-term control.
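
The compartments named above can be sketched as a small simulation. The abstract does not give the equations, so everything below is an assumption made for illustration: the cosine form of the seasonal transmission rate, the saturating exposure term B / (K + B), the forward-Euler integration, and every parameter value and initial condition. It shows the general shape of such a model, not the author's model or results.

```python
import math


def seasonal_beta(t, beta0=0.5, amplitude=0.4, period=365.0):
    """Assumed seasonal forcing: transmission rate oscillating around beta0."""
    return beta0 * (1.0 + amplitude * math.cos(2.0 * math.pi * t / period))


def simulate(days=365, dt=0.1, phi=0.01, omega=0.005, efficacy=0.8,
             gamma=0.2, xi=10.0, delta=0.33, K=1e5):
    """Forward-Euler run of a hypothetical S-V-I-R model with pathogen pool B.

    phi: vaccination rate, omega: waning rate of vaccine immunity,
    efficacy: fraction of exposure blocked in vaccinated individuals,
    gamma: recovery rate, xi: shedding rate, delta: pathogen decay rate.
    """
    S, V, I, R, B = 9990.0, 0.0, 10.0, 0.0, 0.0
    history = []
    for step in range(int(days / dt)):
        t = step * dt
        force = seasonal_beta(t) * B / (K + B)     # force of infection
        new_s = force * S                          # infections among susceptibles
        new_v = (1.0 - efficacy) * force * V       # breakthrough infections
        dS = -new_s - phi * S + omega * V
        dV = phi * S - omega * V - new_v
        dI = new_s + new_v - gamma * I
        dR = gamma * I
        dB = xi * I - delta * B
        S, V, I, R, B = (S + dt * dS, V + dt * dV, I + dt * dI,
                         R + dt * dR, B + dt * dB)
        history.append((t + dt, S, V, I, R, B))
    return history


trajectory = simulate()
final = trajectory[-1]
print("final S, V, I, R:", [round(x, 1) for x in final[1:5]])
```

Re-running `simulate` with different `phi` and `omega` values is one simple way to explore how the vaccination and waning rates shape an outbreak in a toy setting of this kind.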

Active Probing with Multimodal Predictions for Motion Planning

Authors:Darshan Gadginmath, Farhad Nawaz, Minjun Sung, Faizan M Tariq, Sangjae Bae, David Isele, Fabio Pasqualetti, Jovin Dsa

Date:2025-07-13 23:06:46

Navigation in dynamic environments requires autonomous systems to reason about uncertainties in the behavior of other agents. In this paper, we introduce a unified framework that combines trajectory planning with multimodal predictions and active probing to enhance decision-making under uncertainty. We develop a novel risk metric that seamlessly integrates multimodal prediction uncertainties through mixture models. When these uncertainties follow a Gaussian mixture distribution, we prove that our risk metric admits a closed-form solution, and is always finite, thus ensuring analytical tractability. To reduce prediction ambiguity, we incorporate an active probing mechanism that strategically selects actions to improve its estimates of behavioral parameters of other agents, while simultaneously handling multimodal uncertainties. We extensively evaluate our framework in autonomous navigation scenarios using the MetaDrive simulation environment. Results demonstrate that our active probing approach successfully navigates complex traffic scenarios with uncertain predictions. Additionally, our framework shows robust performance across diverse traffic agent behavior models, indicating its broad applicability to real-world autonomous navigation challenges. Code and videos are available at https://darshangm.github.io/papers/active-probing-multimodal-predictions/.

Technical Requirements for Halting Dangerous AI Activities

Authors:Peter Barnett, Aaron Scher, David Abecassis

Date:2025-07-13 21:32:15

The rapid development of AI systems poses unprecedented risks, including loss of control, misuse, geopolitical instability, and concentration of power. To navigate these risks and avoid worst-case outcomes, governments may proactively establish the capability for a coordinated halt on dangerous AI development and deployment. In this paper, we outline key technical interventions that could allow for a coordinated halt on dangerous AI activities. We discuss how these interventions may contribute to restricting various dangerous AI activities, and show how these interventions can form the technical foundation for potential AI governance plans.

Electric Vehicle Public Charging Equity Considerations: A Systematic Review

Authors:Boyou Chen, Kaihan Zhang, Austin Moore, Bochen Jia, Mengqiu Cao

Date:2025-07-13 17:57:06

Public electric vehicle (EV) charging infrastructure is crucial for accelerating EV adoption and reducing transportation emissions; however, disparities in infrastructure access have raised significant equity concerns. This systematic review synthesizes existing knowledge and identifies gaps regarding equity in EV public charging research. Following structured review protocols, 91 peer-reviewed studies from Scopus and Google Scholar were analyzed, focusing explicitly on equity considerations. The findings indicate that current research on EV public charging equity mainly adopted geographic information systems (GIS), network optimization, behavioral modeling, and hybrid analytical frameworks, yet lacks consistent normative frameworks for assessing equity outcomes. Equity assessments highlight four key dimensions: spatial accessibility, cost burdens, reliability and usability, and user awareness and trust. Socio-economic disparities, particularly income, housing tenure, and ethnicity, frequently exacerbate inequitable access, disproportionately disadvantaging low-income, renter, and minority populations. Additionally, infrastructure-specific choices, including charger reliability, strategic location, and pricing strategies, significantly influence adoption patterns and equity outcomes. However, existing literature primarily reflects North American, European, and Chinese contexts, revealing substantial geographical and methodological limitations. This review suggests the need for more robust normative evaluations of equity, comprehensive demographic data integration, and advanced methodological frameworks, thereby guiding targeted, inclusive, and context-sensitive infrastructure planning and policy interventions.

IteraOptiRacing: A Unified Planning-Control Framework for Real-time Autonomous Racing for Iterative Optimal Performance

Authors:Yifan Zeng, Yihan Li, Suiyi He, Koushil Sreenath, Jun Zeng

Date:2025-07-13 17:18:51

This paper presents IteraOptiRacing, a unified planning-control strategy for competing with other racing cars in autonomous racing environments. This unified strategy is proposed based on Iterative Linear Quadratic Regulator for Iterative Tasks (i2LQR), which can improve lap time performance in the presence of surrounding racing obstacles. By iteratively using the ego car's historical data, both obstacle avoidance for multiple moving cars and time cost optimization are considered in this unified strategy, resulting in collision-free and time-optimal generated trajectories. The algorithm's constant low computation burden and suitability for parallel computing enable real-time operation in competitive racing scenarios. To validate its performance, simulations in a high-fidelity simulator are conducted with multiple randomly generated dynamic agents on the track. Results show that the proposed strategy outperforms existing methods across all randomly generated autonomous racing scenarios, enabling enhanced maneuvering for the ego racing car.

Minimum-Peak-Cost Flows Over Time

Authors:Mariia Anapolska, Emma Ahrens, Christina Büsing, Felix Engelhardt, Timo Gersing, Corinna Mathwieser, Sabrian Schmitz, Sophia Wrede

Date:2025-07-13 15:50:52

When planning transportation whose operation requires non-consumable resources, the peak demand for allocated resources is often of higher interest than the duration of resource usage. For instance, it is more cost-effective to deliver parcels with a single truck over eight hours than to use two trucks for four hours, as long as the time suffices. To model such scenarios, we introduce the novel minimum peak cost flow over time problem, whose objective is to minimise the maximum cost at all points in time rather than minimising the integral of costs. We focus on minimising peak costs of temporally repeated flows. These are desirable for practical applications due to their simple structure. This yields the minimum-peak-cost Temporally Repeated flow problem (MPC-TRF). We show that the simple structure of temporally repeated flows comes with the drawback of arbitrarily bad approximation ratios compared to general flows over time. Furthermore, our complexity analysis shows the integral version of MPC-TRF is strongly NP-hard, even under strong restrictions. On the positive side, we identify two benign special cases: unit-cost series-parallel networks and networks with time horizon at least twice as long as the longest path in the network (with respect to the transit time). In both cases, we show that integral optimal flows, if the desired flow value equals the maximum flow value, and fractional optimal flows, for arbitrary flow values, can be found in polynomial time. For each of these cases, we provide an explicit algorithm that constructs an optimal solution.
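
The truck example above can be checked with a few lines of arithmetic. The sketch below compares the two schedules under both objectives; the hourly discretisation and the helper names `peak_cost` and `total_cost` are assumptions for illustration and do not reflect the paper's flow-over-time formalism.

```python
def peak_cost(usage_per_hour):
    """Peak objective: the largest number of trucks in use at any one time."""
    return max(usage_per_hour)


def total_cost(usage_per_hour):
    """Integral objective: truck-hours summed over the whole horizon."""
    return sum(usage_per_hour)


# Eight-hour horizon, one entry per hour.
one_truck_eight_hours = [1] * 8
two_trucks_four_hours = [2] * 4 + [0] * 4

for name, plan in [("one truck", one_truck_eight_hours),
                   ("two trucks", two_trucks_four_hours)]:
    print(name, "peak:", peak_cost(plan), "truck-hours:", total_cost(plan))
```

Both plans use eight truck-hours, so the integral objective cannot tell them apart, while the peak objective prefers the single truck because only one vehicle ever has to be provisioned.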

Code Review as Decision-Making -- Building a Cognitive Model from the Questions Asked During Code Review

Authors:Lo Gullstrand Heander, Emma Söderberg, Christofer Rydenfält

Date:2025-07-13 14:04:16

Code review is a well-established and valued practice in the software engineering community contributing to both code quality and interpersonal benefits. However, there are challenges in both tools and processes that give rise to misalignments and frustrations. Recent research seeks to address this by automating code review entirely, but we believe that this risks losing the majority of the interpersonal benefits such as knowledge transfer and shared ownership. We believe that by better understanding the cognitive processes involved in code review, it would be possible to improve tool support, with or without AI, and make code review both more efficient and more enjoyable, while increasing or maintaining all of its benefits. In this paper, we conduct an ethnographic think-aloud study involving 10 participants and 34 code reviews. We build a cognitive model of code review bottom up through thematic, statistical, temporal, and sequential analysis of the transcribed material. Through the data, the similarities between the cognitive process in code review and decision-making processes, especially recognition-primed decision-making, become apparent. The result is the Code Review as Decision-Making (CRDM) model that shows how the developers move through two phases during the code review: first an orientation phase to establish context and rationale, and then an analytical phase to understand, assess, and plan the rest of the review. Throughout the process several decisions must be taken, on writing comments, finding more information, voting, running the code locally, verifying continuous integration results, etc. Analysis software and process-coded data are publicly available at: https://doi.org/10.5281/zenodo.15758266

Self-supervised Pretraining for Integrated Prediction and Planning of Automated Vehicles

Authors:Yangang Ren, Guojian Zhan, Chen Lv, Jun Li, Fenghua Liang, Keqiang Li

Date:2025-07-13 08:39:02

Predicting the future of surrounding agents and accordingly planning a safe, goal-directed trajectory are crucial for automated vehicles. Current methods typically rely on imitation learning to optimize metrics against the ground truth, often overlooking how scene understanding could enable more holistic trajectories. In this paper, we propose Plan-MAE, a unified pretraining framework for prediction and planning that capitalizes on masked autoencoders. Plan-MAE fuses critical contextual understanding via three dedicated tasks: reconstructing masked road networks to learn spatial correlations, agent trajectories to model social interactions, and navigation routes to capture destination intents. To further align vehicle dynamics and safety constraints, we incorporate a local sub-planning task predicting the ego-vehicle's near-term trajectory segment conditioned on the earlier segment. This pretrained model is subsequently fine-tuned on downstream tasks to jointly generate the prediction and planning trajectories. Experiments on large-scale datasets demonstrate that Plan-MAE outperforms current methods on the planning metrics by a large margin and can serve as an important pre-training step for learning-based motion planners.

Consistency Trajectory Planning: High-Quality and Efficient Trajectory Optimization for Offline Model-Based Reinforcement Learning

Authors:Guanquan Wang, Takuya Hiraoka, Yoshimasa Tsuruoka

Date:2025-07-13 08:31:11

This paper introduces Consistency Trajectory Planning (CTP), a novel offline model-based reinforcement learning method that leverages the recently proposed Consistency Trajectory Model (CTM) for efficient trajectory optimization. While prior work applying diffusion models to planning has demonstrated strong performance, it often suffers from high computational costs due to iterative sampling procedures. CTP supports fast, single-step trajectory generation without significant degradation in policy quality. We evaluate CTP on the D4RL benchmark and show that it consistently outperforms existing diffusion-based planning methods in long-horizon, goal-conditioned tasks. Notably, CTP achieves higher normalized returns while using significantly fewer denoising steps. In particular, CTP achieves comparable performance with over $120\times$ speedup in inference time, demonstrating its practicality and effectiveness for high-performance, low-latency offline planning.

TraSculptor: Visual Analytics for Enhanced Decision-Making in Road Traffic Planning

Authors:Zikun Deng, Yuanbang Liu, Mingrui Zhu, Da Xiang, Haiyue Yu, Zicheng Su, Qinglong Lu, Tobias Schreck, Yi Cai

Date:2025-07-13 04:23:41

The design of urban road networks significantly influences traffic conditions, underscoring the importance of informed traffic planning. Traffic planning experts rely on specialized platforms to simulate traffic systems, assessing the efficacy of the road network across various states of modification. Nevertheless, a prevailing issue persists: many existing traffic planning platforms are inefficient at flexibly interacting with the road network's structure and attributes and at intuitively comparing multiple states during the iterative planning process. This paper introduces TraSculptor, an interactive planning decision-making system. To develop TraSculptor, we identify and address two challenges: interactive modification of road networks and intuitive comparison of multiple network states. For the first challenge, we establish flexible interactions to enable experts to easily and directly modify the road network on the map. For the second challenge, we design a comparison view with a history tree of multiple states and a road-state matrix to facilitate intuitive comparison of road network states. To evaluate TraSculptor, we provided a usage scenario that showcased Braess's paradox, invited experts to perform a case study on the Sioux Falls network, and collected expert feedback through interviews.
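
Braess's paradox, mentioned in the evaluation above, is easy to reproduce numerically. The sketch below uses the standard textbook four-node network, not the scenario from the paper: 4000 drivers travel from S to E, the links S→A and B→E take (load / 100) minutes, the links A→E and S→B take a fixed 45 minutes, and a free shortcut A→B may be added. All numbers and function names are illustrative assumptions.

```python
def route_times(f_top, f_bottom, f_cut):
    """Travel time per route given the number of drivers on each route.

    top:    S -> A -> E
    bottom: S -> B -> E
    cut:    S -> A -> B -> E (uses the zero-cost shortcut A -> B)
    """
    load_sa = f_top + f_cut      # drivers on the congestible link S -> A
    load_be = f_bottom + f_cut   # drivers on the congestible link B -> E
    return {
        "top": load_sa / 100 + 45,
        "bottom": 45 + load_be / 100,
        "cut": load_sa / 100 + 0 + load_be / 100,
    }


def is_equilibrium(flows, available):
    """True if no driver on a used route could switch to a faster one."""
    times = route_times(
        flows.get("top", 0), flows.get("bottom", 0), flows.get("cut", 0)
    )
    best = min(times[r] for r in available)
    return all(times[r] <= best + 1e-9 for r in available if flows.get(r, 0) > 0)


before = {"top": 2000, "bottom": 2000}   # shortcut not built
after = {"cut": 4000}                    # shortcut built, everyone uses it

time_before = route_times(2000, 2000, 0)["top"]
time_after = route_times(0, 0, 4000)["cut"]
```

Without the shortcut, the even split is an equilibrium and every driver needs 65 minutes. Once the shortcut exists, the even split is no longer stable (the shortcut route would take only 40 minutes), and the new equilibrium has every driver on the shortcut route at 80 minutes: adding a road makes everyone slower.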

MobiWorld: World Models for Mobile Wireless Network

Authors:Haoye Chai, Yuan Yuan, Yong Li

Date:2025-07-13 02:59:13

Accurate modeling and simulation of mobile networks are essential for enabling intelligent and cost-effective network optimization. In this paper, we propose MobiWorld, a generative world model designed to support high-fidelity and flexible environment simulation for mobile network planning and optimization. Unlike traditional predictive models constrained by limited generalization capabilities, MobiWorld exhibits strong universality by integrating heterogeneous data sources, including sensors, mobile devices, and base stations, as well as multimodal data types such as sequences and images. It is capable of generating both network element-level observations (e.g., traffic load, user distribution) and system-level performance indicators (e.g., throughput, energy consumption) to support a wide range of planning and optimization tasks. Built upon advanced diffusion models, MobiWorld offers powerful controllable generation capabilities by modeling the joint distribution between mobile network data and diverse conditional factors, including spatiotemporal contexts, user behaviors, and optimization policies. This enables accurate simulation of dynamic network states under varying policy configurations, providing optimization agents with precise environmental feedback and facilitating effective decision-making without relying on costly real-network interactions. We demonstrate the effectiveness of MobiWorld in a collaborative energy-saving scenario, where an agent uses observations and rewards generated by MobiWorld to optimize base station sleep and user offloading policies. Experimental results show that MobiWorld exhibits strong controllable generation performance and outperforms traditional methods in energy optimization.

Real-Time Adaptive Motion Planning via Point Cloud-Guided, Energy-Based Diffusion and Potential Fields

Authors:Wondmgezahu Teshome, Kian Behzad, Octavia Camps, Michael Everett, Milad Siami, Mario Sznaier

Date:2025-07-12 19:42:07

Motivated by the problem of pursuit-evasion, we present a motion planning framework that combines energy-based diffusion models with artificial potential fields for robust real-time trajectory generation in complex environments. Our approach processes obstacle information directly from point clouds, enabling efficient planning without requiring complete geometric representations. The framework employs classifier-free guidance training and integrates local potential fields during sampling to enhance obstacle avoidance. In dynamic scenarios, the system generates initial trajectories using the diffusion model and continuously refines them through potential field-based adaptation, demonstrating effective performance in pursuit-evasion scenarios with partial pursuer observability.
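
The potential-field refinement step can be illustrated with the classical repulsive potential evaluated directly on point-cloud samples. This is a minimal 2-D sketch under stated assumptions, not the authors' method: the initial trajectory is simply given (in the paper it would come from the diffusion model), and the `influence`, `gain`, `step`, and `max_step` parameters are hypothetical tuning values.

```python
import math


def repulsive_force(p, cloud, influence=1.0, gain=1.0):
    """Sum of classical repulsive forces from obstacle points near p.

    Each obstacle point within `influence` of p contributes the negative
    gradient of 0.5 * gain * (1/d - 1/influence)**2, which pushes p away.
    """
    fx = fy = 0.0
    for ox, oy in cloud:
        dx, dy = p[0] - ox, p[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < influence:
            mag = gain * (1.0 / d - 1.0 / influence) / (d * d)
            fx += mag * dx / d
            fy += mag * dy / d
    return fx, fy


def refine(trajectory, cloud, step=0.05, max_step=0.05, iters=20):
    """Move each waypoint along the repulsive force, with a clipped step."""
    refined = []
    for q in trajectory:
        for _ in range(iters):
            fx, fy = repulsive_force(q, cloud)
            mag = math.hypot(fx, fy)
            if mag == 0.0:
                break  # outside every obstacle's influence region
            move = min(step * mag, max_step)  # clip to avoid huge jumps
            q = (q[0] + move * fx / mag, q[1] + move * fy / mag)
        refined.append(q)
    return refined


# Hypothetical example: one obstacle point at the origin.
cloud = [(0.0, 0.0)]
initial = [(0.0, 0.2), (5.0, 5.0)]
adapted = refine(initial, cloud)
```

Because the force is computed from raw points, no mesh or occupancy grid is needed; waypoints far from every obstacle point are left untouched, while nearby waypoints are pushed outward.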

A Systematic Review of Passive Cooling Strategies Integrating Traditional Wisdom and Modern Innovations for Sustainable Development in Arid Urban Environments

Authors:Shiva Manshour, Steffen Lehmann

Date:2025-07-12 17:59:15

Urban environments in hot-arid regions are increasingly challenged by rising temperatures, rapid urban expansion, and reliance on energy-intensive mechanical cooling systems. This study presents a systematic review of peer-reviewed literature from 1980 to 2025 to assess both traditional and contemporary passive cooling strategies tailored for arid urban settings. Following PRISMA 2020 guidelines, 30 high-quality studies were selected from databases including Scopus, Web of Science, ScienceDirect, and Google Scholar. These works span diverse geographical contexts, from the Middle East and North Africa to parts of South Asia, and apply methods including field experiments, computer-based simulations, and qualitative analyses. Findings highlight strong consensus around passive principles such as solar control, natural ventilation, and the use of thermal mass. Vernacular solutions like courtyards, wind towers, and thick masonry walls remain effective, while innovations such as cool roofs, phase change materials, and parametric optimization techniques expand the design toolkit. Implementation is often limited by climate variability, cultural shifts, regulations, and economic feasibility. The review concludes that context-sensitive, hybrid solutions combining traditional knowledge with modern technology hold the greatest potential for achieving sustainable thermal comfort. These approaches must be supported by adaptive urban planning, user-centered design, and updated building codes. The study offers practical insights for architects, planners, and policymakers aiming to create resilient, low-carbon cities that harmonize cultural identity with environmental responsibility.

Unified Linear Parametric Map Modeling and Perception-aware Trajectory Planning for Mobile Robotics

Authors:Hongyu Nie, Xingyu Li, Xu Liu, Zhaotong Tan, Sen Mei, Wenbo Su

Date:2025-07-12 16:39:19

Autonomous navigation in mobile robots, reliant on perception and planning, faces major hurdles in large-scale, complex environments. These include heavy computational burdens for mapping, sensor occlusion failures for UAVs, and traversal challenges on irregular terrain for UGVs, all compounded by a lack of perception-aware strategies. To address these challenges, we introduce Random Mapping and Random Projection (RMRP). This method constructs a lightweight linear parametric map by first mapping data to a high-dimensional space, followed by a sparse random projection for dimensionality reduction. Our novel Residual Energy Preservation Theorem provides theoretical guarantees for this process, ensuring critical geometric properties are preserved. Based on this map, we propose the RPATR (Robust Perception-Aware Trajectory Planner) framework. For UAVs, our method unifies grid and Euclidean Signed Distance Field (ESDF) maps. The front-end uses an analytical occupancy gradient to refine initial paths for safety and smoothness, while the back-end uses a closed-form ESDF for trajectory optimization. Leveraging the trained RMRP model's generalization, the planner predicts unobserved areas for proactive navigation. For UGVs, the model characterizes terrain and provides closed-form gradients, enabling online planning to circumvent large holes. Validated in diverse scenarios, our framework demonstrates superior mapping performance in time, memory, and accuracy, and enables computationally efficient, safe navigation for high-speed UAVs and UGVs. The code will be released to foster community collaboration.
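
The two-stage construction described above (lift to a high-dimensional space, then reduce with a sparse random projection, then read out linearly) can be sketched as follows. The abstract does not specify the lifting function or the projection distribution, so this sketch substitutes well-known stand-ins: random Fourier features for the lifting and an Achlioptas-style sparse matrix for the projection. All function names, dimensions, and the `length_scale` parameter are assumptions, and the weights `theta` would have to be fitted separately (for example by least squares).

```python
import math
import random


def make_random_mapping(in_dim, high_dim, rng, length_scale=1.0):
    """Draw fixed random weights and phases for random Fourier features."""
    W = [[rng.gauss(0.0, 1.0 / length_scale) for _ in range(in_dim)]
         for _ in range(high_dim)]
    b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(high_dim)]
    return W, b


def random_mapping(x, W, b):
    """Lift a low-dimensional point x to len(W) random cosine features."""
    scale = math.sqrt(2.0 / len(W))
    return [scale * math.cos(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]


def make_sparse_projection(high_dim, low_dim, rng):
    """Sparse random matrix: +sqrt(3) w.p. 1/6, 0 w.p. 2/3, -sqrt(3) w.p. 1/6."""
    s = math.sqrt(3.0)
    choices = [s, 0.0, 0.0, 0.0, 0.0, -s]
    return [[rng.choice(choices) for _ in range(high_dim)]
            for _ in range(low_dim)]


def project(z, P):
    """Reduce the lifted features z to len(P) dimensions, skipping zeros."""
    scale = 1.0 / math.sqrt(len(P))
    return [scale * sum(p * zi for p, zi in zip(row, z) if p != 0.0)
            for row in P]


def map_value(x, W, b, P, theta):
    """Linear parametric map: the value at x is a dot product with theta."""
    phi = project(random_mapping(x, W, b), P)
    return sum(t * f for t, f in zip(theta, phi))


# Hypothetical sizes: 3-D query points, 256 lifted features, 64 kept.
rng = random.Random(0)
W, b = make_random_mapping(3, 256, rng)
P = make_sparse_projection(256, 64, rng)
phi = project(random_mapping([0.1, 0.2, 0.3], W, b), P)
```

Since the random matrices are fixed after construction, the only learned parameters are the entries of `theta`, which is what keeps such a map lightweight and makes gradients with respect to the query point available in closed form.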

Informed Hybrid Zonotope-based Motion Planning Algorithm

Authors:Peng Xie, Johannes Betz, Amr Alanwar

Date:2025-07-12 14:54:46

Optimal path planning in nonconvex free spaces is notoriously challenging, as formulating such problems as mixed-integer linear programs (MILPs) is NP-hard. We propose HZ-MP, an informed Hybrid Zonotope-based Motion Planner, as an alternative approach that decomposes the obstacle-free space and performs low-dimensional face sampling guided by an ellipsotope heuristic, enabling focused exploration along promising transit regions. This structured exploration eliminates the excessive, unreachable sampling that degrades existing informed planners such as AIT* and EIT* in narrow gaps or boxed-goal scenarios. We prove that HZ-MP is probabilistically complete and asymptotically optimal. It converges to near-optimal trajectories in finite time and scales to high-dimensional cluttered scenes.

DAA*: Deep Angular A Star for Image-based Path Planning

Authors:Zhiwei Xu

Date:2025-07-12 14:46:42

Path smoothness is often overlooked in path imitation learning from expert demonstrations. In this paper, we introduce a novel learning method, termed deep angular A* (DAA*), by incorporating the proposed path angular freedom (PAF) into A* to improve path similarity through adaptive path smoothness. The PAF aims to explore the effect of move angles on path node expansion by finding the trade-off between their minimum and maximum values, allowing for high adaptiveness for imitation learning. DAA* improves path optimality by closely aligning with the reference path through joint optimization of path shortening and smoothing, which correspond to heuristic distance and PAF, respectively. Through comprehensive evaluations on 7 datasets, including 4 maze datasets, 2 video-game datasets, and a real-world drone-view dataset containing 2 scenarios, we demonstrate remarkable improvements of our DAA* over neural A* in path similarity between the predicted and reference paths with a shorter path length when the shortest path is plausible, improving by 9.0% SPR, 6.9% ASIM, and 3.9% PSIM. Furthermore, when jointly learning pathfinding with both path loss and path probability map loss, DAA* significantly outperforms the state-of-the-art TransPath by 6.7% SPR, 6.5% PSIM, and 3.7% ASIM. We also discuss the minor trade-off between path optimality and search efficiency where applicable.