planning - 2025-07-25

Captain Cinema: Towards Short Movie Generation

Authors:Junfei Xiao, Ceyuan Yang, Lvmin Zhang, Shengqu Cai, Yang Zhao, Yuwei Guo, Gordon Wetzstein, Maneesh Agrawala, Alan Yuille, Lu Jiang

Date:2025-07-24 17:59:56

We present Captain Cinema, a generation framework for short movie generation. Given a detailed textual description of a movie storyline, our approach first generates a sequence of keyframes that outline the entire narrative, which ensures long-range coherence in both the storyline and visual appearance (e.g., scenes and characters). We refer to this step as top-down keyframe planning. These keyframes then serve as conditioning signals for a video synthesis model, which supports long context learning, to produce the spatio-temporal dynamics between them. This step is referred to as bottom-up video synthesis. To support stable and efficient generation of multi-scene long narrative cinematic works, we introduce an interleaved training strategy for Multimodal Diffusion Transformers (MM-DiT), specifically adapted for long-context video data. Our model is trained on a specially curated cinematic dataset consisting of interleaved data pairs. Our experiments demonstrate that Captain Cinema performs favorably in the automated creation of visually coherent and narratively consistent short movies with high quality and efficiency. Project page: https://thecinema.ai

Hybrid quantum-classical algorithm for near-optimal planning in POMDPs

Authors:Gilberto Cunha, Alexandra Ramôa, André Sequeira, Michael de Oliveira, Luís Barbosa

Date:2025-07-24 17:42:30

Reinforcement learning (RL) provides a principled framework for decision-making in partially observable environments, which can be modeled as Markov decision processes and compactly represented through dynamic decision Bayesian networks. Recent advances demonstrate that inference on sparse Bayesian networks can be accelerated using quantum rejection sampling combined with amplitude amplification, leading to a computational speedup in estimating acceptance probabilities. Building on this result, we introduce Quantum Bayesian Reinforcement Learning (QBRL), a hybrid quantum-classical look-ahead algorithm for model-based RL in partially observable environments. We present a rigorous, oracle-free time complexity analysis under fault-tolerant assumptions for the quantum device. Unlike standard treatments that assume a black-box oracle, we explicitly specify the inference process, allowing our bounds to more accurately reflect the true computational cost. We show that, for environments whose dynamics form a sparse Bayesian network, horizon-based near-optimal planning can be achieved sub-quadratically faster through quantum-enhanced belief updates. Furthermore, we present numerical experiments benchmarking QBRL against its classical counterpart on simple yet illustrative decision-making tasks. Our results offer a detailed analysis of how the quantum computational advantage translates into decision-making performance, highlighting that the magnitude of the advantage can vary significantly across different deployment settings.

Linear Memory SE(2) Invariant Attention

Authors:Ethan Pronovost, Neha Boloor, Peter Schleede, Noureldin Hendy, Andres Morales, Nicholas Roy

Date:2025-07-24 17:28:57

Processing spatial data is a key component in many learning tasks for autonomous driving such as motion forecasting, multi-agent simulation, and planning. Prior works have demonstrated the value in using SE(2) invariant network architectures that consider only the relative poses between objects (e.g. other agents, scene features such as traffic lanes). However, these methods compute the relative poses for all pairs of objects explicitly, requiring quadratic memory. In this work, we propose a mechanism for SE(2) invariant scaled dot-product attention that requires linear memory relative to the number of objects in the scene. Our SE(2) invariant transformer architecture enjoys the same scaling properties that have benefited large language models in recent years. We demonstrate experimentally that our approach is practical to implement and improves performance compared to comparable non-invariant architectures.
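As a rough illustration of the quadratic-memory baseline the abstract refers to, the sketch below explicitly computes the relative pose of every object in every other object's frame and shows that the result is unchanged by a global SE(2) transform. The function names and the (x, y, heading) tuple layout are assumptions made for this example; it does not reproduce the authors' linear-memory attention mechanism.

```python
import math

def relative_pose(a, b):
    """Pose of b expressed in the frame of a; poses are (x, y, theta)."""
    ax, ay, at = a
    bx, by, bt = b
    dx, dy = bx - ax, by - ay
    c, s = math.cos(at), math.sin(at)
    # Rotate the world-frame offset by -theta_a to express it in a's frame.
    rx = c * dx + s * dy
    ry = -s * dx + c * dy
    # Wrap the heading difference into [-pi, pi].
    dt = math.atan2(math.sin(bt - at), math.cos(bt - at))
    return (rx, ry, dt)

def pairwise_relative_poses(poses):
    """Explicit N x N table of relative poses: memory grows as O(N^2)."""
    return [[relative_pose(a, b) for b in poses] for a in poses]

def transform(pose, g):
    """Apply a global SE(2) transform g = (tx, ty, phi) to a pose."""
    x, y, t = pose
    tx, ty, phi = g
    c, s = math.cos(phi), math.sin(phi)
    return (c * x - s * y + tx, s * x + c * y + ty, t + phi)

poses = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.5), (-3.0, 1.0, -1.0)]
table = pairwise_relative_poses(poses)
moved = pairwise_relative_poses([transform(p, (5.0, -2.0, 0.7)) for p in poses])
```

Here `table` and `moved` agree up to floating-point error, which is the invariance property; the N x N table itself is the quadratic cost that the proposed method is designed to avoid.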

A Novel Monte-Carlo Compressed Sensing and Dictionary Learning Method for the Efficient Path Planning of Remote Sensing Robots

Authors:Alghalya Al-Hajri, Ejmen Al-Ubejdij, Aiman Erbad, Ali Safa

Date:2025-07-24 14:39:01

In recent years, Compressed Sensing (CS) has gained significant interest as a technique for acquiring high-resolution sensory data using fewer measurements than traditional Nyquist sampling requires. At the same time, autonomous robotic platforms such as drones and rovers have become increasingly popular tools for remote sensing and environmental monitoring tasks, including measurements of temperature, humidity, and air quality. Within this context, this paper presents, to the best of our knowledge, the first investigation into how the structure of CS measurement matrices can be exploited to design optimized sampling trajectories for robotic environmental data collection. We propose a novel Monte Carlo optimization framework that generates measurement matrices designed to minimize both the robot's traversal path length and the signal reconstruction error within the CS framework. Central to our approach is the application of Dictionary Learning (DL) to obtain a data-driven sparsifying transform, which enhances reconstruction accuracy while further reducing the number of samples that the robot needs to collect. We demonstrate the effectiveness of our method through experiments reconstructing $NO_2$ pollution maps over the Gulf region. The results indicate that our approach can reduce robot travel distance to less than $10\%$ of a full-coverage path, while improving reconstruction accuracy by over a factor of five compared to traditional CS methods based on DCT and polynomial dictionaries, as well as by a factor of two compared to previously-proposed Informative Path Planning (IPP) methods.

PALM: PAnoramic Learning Map Integrating Learning Analytics and Curriculum Map for Scalable Insights Across Courses

Authors:Mahiro Ozaki, Li Chen, Shotaro Naganuma, Valdemar Švábenský, Fumiya Okubo, Atsushi Shimada

Date:2025-07-24 13:17:47

This study proposes and evaluates the PAnoramic Learning Map (PALM), a learning analytics (LA) dashboard designed to address the scalability challenges of LA by integrating curriculum-level information. Traditional LA research has predominantly focused on individual courses or learners and often lacks a framework that considers the relationships between courses and the long-term trajectory of learning. To bridge this gap, PALM was developed to integrate multilayered educational data into a curriculum map, enabling learners to intuitively understand their learning records and academic progression. We conducted a system evaluation to assess PALM's effectiveness in two key areas: (1) its impact on students' awareness of their learning behaviors, and (2) its comparative performance against existing systems. The results indicate that PALM enhances learners' awareness of study planning and reflection, particularly by improving perceived behavioral control through the visual presentation of individual learning histories and statistical trends, which clarify the links between learning actions and outcomes. Although PALM requires ongoing refinement as a system, it received significantly higher evaluations than existing systems in terms of visual appeal and usability. By serving as an information resource with previously inaccessible insights, PALM enhances self-regulated learning and engagement, representing a significant step beyond conventional LA toward a comprehensive and scalable approach.

Modelling Tritium Production and Release at High-Energy Proton Accelerators

Authors:Dali Georgobiani, Thomas Ginter, Alajos Makovec, Igor Rakhno, Matthew Strait, Igor Tropin

Date:2025-07-24 11:27:13

Tritium is a well-known byproduct of particle accelerator operations. To keep levels of tritium below regulatory limits, tritium production is actively monitored and managed at Fermilab. We plan to study tritium production in the targets, beamline components, and shielding elements of the Fermilab facilities such as NuMI, BNB, and MI-65. To facilitate the analysis, we construct a simple model and use three Monte Carlo radiation codes, FLUKA, MARS, and PHITS, to estimate the amount of tritium produced in these facilities. The analysis could also serve as an intercomparison between these code results related to tritium production. To assess the actual amounts of tritium that would be released from various materials, we employ a semi-empirical diffusion model. The results of this analysis are compared to experimental data whenever possible. This approach also helps to optimize proposed target materials with respect to the tritium production and release.

ReSem3D: Refinable 3D Spatial Constraints via Fine-Grained Semantic Grounding for Generalizable Robotic Manipulation

Authors:Chenyu Su, Weiwei Shang, Chen Qian, Fei Zhang, Shuang Cong

Date:2025-07-24 10:07:31

Semantics-driven 3D spatial constraints align high-level semantic representations with low-level action spaces, facilitating the unification of task understanding and execution in robotic manipulation. The synergistic reasoning of Multimodal Large Language Models (MLLMs) and Vision Foundation Models (VFMs) enables cross-modal 3D spatial constraint construction. Nevertheless, existing methods have three key limitations: (1) coarse semantic granularity in constraint modeling, (2) lack of real-time closed-loop planning, (3) compromised robustness in semantically diverse environments. To address these challenges, we propose ReSem3D, a unified manipulation framework for semantically diverse environments, leveraging the synergy between VFMs and MLLMs to achieve fine-grained visual grounding and to dynamically construct hierarchical 3D spatial constraints for real-time manipulation. Specifically, the framework is driven by hierarchical recursive reasoning in MLLMs, which interact with VFMs to automatically construct 3D spatial constraints from natural language instructions and RGB-D observations in two stages: part-level extraction and region-level refinement. Subsequently, these constraints are encoded as real-time optimization objectives in joint space, enabling reactive behavior to dynamic disturbances. Extensive simulation and real-world experiments are conducted in semantically rich household and sparse chemical lab environments. The results demonstrate that ReSem3D performs diverse manipulation tasks under zero-shot conditions, exhibiting strong adaptability and generalization. Code and videos at https://resem3d.github.io.

Autonomous UAV Navigation for Search and Rescue Missions Using Computer Vision and Convolutional Neural Networks

Authors:Luka Šiktar, Branimir Ćaran, Bojan Šekoranja, Marko Švaco

Date:2025-07-24 07:54:45

In this paper, we present a subsystem, using Unmanned Aerial Vehicles (UAVs), for search and rescue missions, focusing on people detection, face recognition and tracking of identified individuals. The proposed solution integrates a UAV with the ROS2 framework, which utilizes multiple convolutional neural networks (CNNs) for search missions. System identification and PD controller deployment are performed for autonomous UAV navigation. The ROS2 environment utilizes the YOLOv11 and YOLOv11-pose CNNs for tracking purposes, and the dlib library CNN for face recognition. The system detects a specific individual, performs face recognition and starts tracking. If the individual is not yet known, the UAV operator can manually locate the person, save their facial image and immediately initiate the tracking process. The tracking process relies on specific keypoints identified on the human body using the YOLOv11-pose CNN model. These keypoints are used to track a specific individual and maintain a safe distance. To improve tracking accuracy, system identification is performed, based on measurement data from the UAV's IMU. The identified system parameters are used to design PD controllers that utilize YOLOv11-pose to estimate the distance between the UAV's camera and the identified individual. The initial experiments, conducted on 14 known individuals, demonstrated that the proposed subsystem can be successfully used in real time. The next step involves implementing the system on a large experimental UAV for field use and integrating autonomous navigation with GPS-guided control for rescue operations planning.
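To make the distance-keeping idea concrete, here is a minimal sketch of a discrete PD controller that drives the estimated distance to a tracked person toward a target value. The gains, the time step, and the first-order kinematic model (commanded velocity directly changes distance) are all assumptions for illustration; the paper derives its controllers from identified UAV dynamics, which are not given in the abstract.

```python
class PDController:
    """Discrete proportional-derivative controller on a scalar error."""

    def __init__(self, kp, kd):
        self.kp = kp
        self.kd = kd
        self.prev_error = None

    def update(self, error, dt):
        # No derivative term on the first call, since there is no previous error.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.kd * derivative

def simulate(initial_distance, target, kp, kd, dt=0.05, steps=200):
    """Hypothetical kinematic model: forward velocity reduces the distance."""
    ctrl = PDController(kp, kd)
    distance = initial_distance
    for _ in range(steps):
        # Positive error (too far) yields a positive forward velocity command.
        velocity = ctrl.update(distance - target, dt)
        distance -= velocity * dt
    return distance

final_distance = simulate(initial_distance=6.0, target=3.0, kp=1.0, kd=0.1)
```

With these assumed gains the simulated distance settles close to the 3 m target; in a real system the error would come from the pose-based distance estimate rather than from a simulated state.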

Regional Frequency-Constrained Planning for the Optimal Sizing of Power Systems via Enhanced Input Convex Neural Networks

Authors:Yi Wang, Goran Strbac

Date:2025-07-24 05:28:52

Large renewable penetration has been witnessed in power systems, resulting in reduced levels of system inertia and increasing requirements for frequency response services. There have been plenty of studies developing frequency-constrained models for power system security. However, most existing literature only considers uniform frequency security, while neglecting frequency spatial differences in different regions. To fill this gap, this paper proposes a novel planning model for the optimal sizing problem of power systems, capturing regional frequency security and inter-area frequency oscillations. Specifically, regional frequency constraints are first extracted via an enhanced input convex neural network (ICNN) and then embedded into the original optimisation for frequency security, where a principled weight initialisation strategy is adopted to deal with the gradient vanishing issues of non-negative weights in traditional ICNNs and enhance its fitting ability. An adaptive genetic algorithm with sparsity calculation and local search is developed to separate the planning model into two stages and effectively solve it iteratively. Case studies have been conducted on three different power systems to verify the effectiveness of the proposed frequency-constrained planning model in ensuring regional system security and obtaining realistic investment decisions.

Comparison of Segmentation Methods in Remote Sensing for Land Use Land Cover

Authors:Naman Srivastava, Joel D Joy, Yash Dixit, Swarup E, Rakshit Ramesh

Date:2025-07-24 05:23:02

Land Use Land Cover (LULC) mapping is essential for urban and resource planning, and is one of the key elements in developing smart and sustainable cities. This study evaluates advanced LULC mapping techniques, focusing on Look-Up Table (LUT)-based Atmospheric Correction applied to Cartosat Multispectral (MX) sensor images, followed by supervised and semi-supervised learning models for LULC prediction. We explore DeeplabV3+ and Cross-Pseudo Supervision (CPS). The CPS model is further refined with dynamic weighting, enhancing pseudo-label reliability during training. This comprehensive approach analyses the accuracy and utility of LULC mapping techniques for various urban planning applications. A case study of Hyderabad, India, illustrates significant land use changes due to rapid urbanization. By analyzing Cartosat MX images over time, we highlight shifts such as urban sprawl, shrinking green spaces, and expanding industrial areas. This demonstrates the practical utility of these techniques for urban planners and policymakers.

Carbon Emission Flow Tracing: Fast Algorithm and California Grid Study

Authors:Yuqing Shen, Yuanyuan Shi, Daniel Kirschen, Yize Chen

Date:2025-07-24 04:01:54

Power system decarbonization is at the focal point of the clean energy transition. While system operators and utility companies increasingly publicize system-level carbon emission information, it remains unclear how emissions from individual generators are transported through the grid and how they impact electricity users at specific locations. This paper presents a novel and computationally efficient approach for exact quantification of nodal average and marginal carbon emission rates, applicable to both AC and DC optimal power flow problems. The approach leverages graph-based topological sorting and directed cycle removal techniques, applied to directed graphs formed by generation dispatch and optimal power flow solutions. Our proposed algorithm efficiently identifies each generator's contribution to each node, capturing how emissions are spatially distributed under varying system conditions. To validate its effectiveness and reveal locational and temporal emission patterns in the real world, we simulate the 8,870-bus realistic California grid using actual CAISO data and the CATS model. Based on year-long hourly data on nodal loads and renewable generation, obtained or estimated from CAISO public data, our method accurately estimates power flow conditions, generation mixes, and systemwide emissions, and delivers fine-grained spatiotemporal emission analysis for every California county. Both our algorithm and the California study are open-sourced, providing a foundation for future research on grid emissions, planning, operations, and energy policy.
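The role of topological sorting can be sketched as follows: once line flows form a directed acyclic graph, nodes can be visited upstream-to-downstream and each node's average carbon intensity computed from its local generation and its inflows. This sketch assumes a lossless network and the proportional-sharing rule commonly used in flow tracing; the abstract does not state the authors' exact formulation, and the function and data layout here are hypothetical.

```python
from graphlib import TopologicalSorter

def nodal_carbon_intensity(flows, generation):
    """
    flows: {(u, v): MW} directed line flows; the graph must be acyclic.
    generation: {node: (MW, emission_rate)} with rate in tCO2/MWh.
    Returns {node: average intensity of the power passing through it}.
    """
    nodes = set(generation)
    for u, v in flows:
        nodes.update((u, v))
    preds = {n: set() for n in nodes}
    for u, v in flows:
        preds[v].add(u)

    intensity = {}
    # static_order() yields every node after all of its predecessors.
    for n in TopologicalSorter(preds).static_order():
        gen_mw, gen_rate = generation.get(n, (0.0, 0.0))
        power = gen_mw
        emissions = gen_mw * gen_rate
        for u in preds[n]:
            inflow = flows[(u, n)]
            power += inflow
            emissions += inflow * intensity[u]  # upstream mix carried by the flow
        intensity[n] = emissions / power if power > 0 else 0.0
    return intensity

# Toy 4-bus example (all numbers are made up for illustration).
generation = {"A": (100.0, 1.0), "B": (100.0, 0.0)}
flows = {("A", "C"): 100.0, ("B", "C"): 50.0, ("B", "D"): 50.0, ("C", "D"): 50.0}
result = nodal_carbon_intensity(flows, generation)
```

In this toy case bus C receives 100 MW at intensity 1.0 and 50 MW at 0.0, so its mix is 2/3; bus D then blends C's mix with clean power from B. `TopologicalSorter` raises `CycleError` on cyclic flows, which is where a cycle-removal step such as the one the abstract mentions would be needed.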

Factors Impacting Faculty Adoption of Project-Based Learning in Computing Education: a Survey

Authors:Ahmad D. Suleiman, Yiming Tang, Daqing Hou

Date:2025-07-24 02:16:29

This research full paper investigates the factors influencing computing educators' adoption of project-based learning (PjBL) in software engineering and computing curricula. Recognized as a student-centered pedagogical approach, PjBL has the potential to enhance student motivation, engagement, critical thinking, collaboration, and problem-solving skills. Despite these benefits, faculty adoption remains inconsistent due to challenges such as insufficient institutional support, time constraints, limited training opportunities, designing or sourcing projects, and aligning them with course objectives. This research explores these barriers and investigates the strategies and resources that facilitate a successful adoption. Using a mixed-methods approach, data from 80 computing faculty were collected through an online survey comprising closed-ended questions to quantify barriers, enablers, and resource needs, along with an open-ended question to gather qualitative insights. Quantitative data were analyzed using statistical methods, while qualitative responses underwent thematic analysis. Results reveal that while PjBL is widely valued, its adoption is often selective and impacted by challenges in planning and managing the learning process, designing suitable projects, and a lack of institutional support, such as time, funding, and teaching assistants. Faculty are more likely to adopt or sustain PjBL when they have access to peer collaboration, professional development, and institutional incentives. In addition, sourcing projects from research, industry partnerships, and borrowing from peers emerged as key facilitators for new projects. These findings underscore the need for systemic support structures to empower faculty to experiment with and scale PjBL practices.

Synthesis of timeline-based planning strategies avoiding determinization

Authors:Dario Della Monica, Angelo Montanari, Pietro Sala

Date:2025-07-23 23:39:04

Qualitative timeline-based planning models domains as sets of independent, but interacting, components whose behaviors over time, the timelines, are governed by sets of qualitative temporal constraints (ordering relations), called synchronization rules. Its plan-existence problem has been shown to be PSPACE-complete; in particular, PSPACE-membership has been proved via reduction to the nonemptiness problem for nondeterministic finite automata. However, nondeterministic automata cannot be directly used to synthesize planning strategies as a costly determinization step is needed. In this paper, we identify a fragment of qualitative timeline-based planning whose plan-existence problem can be directly mapped into the nonemptiness problem of deterministic finite automata, which can then synthesize strategies. In addition, we identify a maximal subset of Allen's relations that fits into such a deterministic fragment.
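For readers unfamiliar with the target problem of the reduction, nonemptiness of a deterministic finite automaton is a plain reachability question: is some accepting state reachable from the initial state? The sketch below is the textbook breadth-first check, returning a shortest accepted word as a witness. It is not the paper's construction or its strategy synthesis; the dictionary encoding of the transition function is an assumption for this example.

```python
from collections import deque

def dfa_witness(initial, accepting, delta):
    """
    Return a shortest accepted word (list of symbols), or None if the
    language is empty. delta maps (state, symbol) -> next state.
    """
    successors = {}
    for (state, symbol), nxt in delta.items():
        successors.setdefault(state, []).append((symbol, nxt))

    parent = {initial: None}  # state -> (previous state, symbol read)
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if state in accepting:
            # Walk parent links back to the initial state to rebuild the word.
            word = []
            while parent[state] is not None:
                state, symbol = parent[state]
                word.append(symbol)
            return word[::-1]
        for symbol, nxt in successors.get(state, []):
            if nxt not in parent:
                parent[nxt] = (state, symbol)
                queue.append(nxt)
    return None

delta = {("s", "a"): "t", ("t", "b"): "u", ("s", "b"): "s"}
word = dfa_witness("s", {"u"}, delta)
empty = dfa_witness("s", {"z"}, delta)
```

The search visits each state at most once, so it runs in time linear in the size of the automaton.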

Tuning chiral anomaly signature in a Dirac semimetal via fast-ion implantation

Authors:Manasi Mandal, Eunbi Rha, Abhijatmedhi Chotrattanapituk, Denisse Córdova Carrizales, Alexander Lygo, Kevin B. Woller, Mouyang Cheng, Ryotaro Okabe, Guomin Zhu, Kiran Mak, Chu-Liang Fu, Chuhang Liu, Lijun Wu, Yimei Zhu, Susanne Stemmer, Mingda Li

Date:2025-07-23 22:41:22

Cd$_3$As$_2$ is a prototypical Dirac semimetal that hosts a chiral anomaly and thereby functions as a platform to test high-energy physics hypotheses and to realize energy efficient applications. Here we use a combination of accelerator-based fast ion implantation and theory-driven planning to enhance the negative longitudinal magnetoresistance (NLMR)--a signature of a chiral anomaly--in Nb-doped Cd$_3$As$_2$ thin films. High-energy ion implantation is commonly used to investigate semiconductors and nuclear materials but is rarely employed to study quantum materials. We use electrical transport and transmission electron microscopy to characterize the NLMR and the crystallinity of Nb-doped Cd$_3$As$_2$ thin films. We find surface-doped Nb-Cd$_3$As$_2$ thin films display a maximum NLMR around $B = 7$ T and bulk-doped Nb-Cd$_3$As$_2$ thin films display a maximum NLMR over $B = 9$ T--all while maintaining crystallinity. This is more than a 100% relative enhancement of the maximum NLMR compared to pristine Cd$_3$As$_2$ thin films ($B = 4$ T). Our work demonstrates the potential of high-energy ion implantation as a practical route to realize chiralitronic functionalities in topological semimetals.

Online Submission and Evaluation System Design for Competition Operations

Authors:Zhe Chen, Daniel Harabor, Ryan Hechnenberger, Nathan R. Sturtevant

Date:2025-07-23 17:44:10

Research communities have developed benchmark datasets across domains to compare the performance of algorithms and techniques. However, tracking the progress in these research areas is not easy, as publications appear in different venues at the same time, and many of them claim to represent the state-of-the-art. To address this, research communities often organise periodic competitions to evaluate the performance of various algorithms and techniques, thereby tracking advancements in the field. However, these competitions pose a significant operational burden. The organisers must manage and evaluate a large volume of submissions. Furthermore, participants typically develop their solutions in diverse environments, leading to compatibility issues during the evaluation of their submissions. This paper presents an online competition system that automates the submission and evaluation process for a competition. The competition system allows organisers to manage large numbers of submissions efficiently, utilising isolated environments to evaluate submissions. This system has already been used successfully for several competitions, including the Grid-Based Pathfinding Competition and the League of Robot Runners competition.

Layout optimization for the LUXE-NPOD experiment

Authors:Melissa Almanza Soto, Oleksandr Borysov, Torben Ferber, Shan Huang, Adrián Irles, Markus Klute, Jesús P. Márquez Hernández, Josep Pérez Segura, Raquel Quishpe, Yotam Soreq, Noam Tal Hod, Nicolò Trevisani

Date:2025-07-23 17:28:15

Beam dump experiments represent an effective way to probe new physics in a parameter space, where new particles have feeble couplings to the Standard Model sector and masses below the GeV scale. The LUXE experiment, designed primarily to study strong-field quantum electrodynamics, can be used also as a photon beam dump experiment with a unique reach for new spin-0 particles in the $10-350~\mathrm{MeV}$ mass and $10^{-6}-10^{-3}~\mathrm{GeV}^{-1}$ couplings to photons ranges. This is achieved via the "New Physics search with Optical Dump" (NPOD) concept. While prior estimations were obtained with a simplified model of the experimental setup, in this work we present a systematic study of the new physics reach in the full, realistic experimental apparatus, including an existing detector to be used in the LUXE NPOD context. We furthermore investigate updated scenarios of LUXE's experimental plan and confirm that our results are in agreement with the original estimations of a background-free operation.

Safety Assurance for Quadrotor Kinodynamic Motion Planning

Authors:Theodoros Tavoulareas, Marzia Cescon

Date:2025-07-23 16:42:12

Autonomous drones have gained considerable attention for applications in real-world scenarios, such as search and rescue, inspection, and delivery. As their use becomes ever more pervasive in civilian applications, failure to ensure safe operation can lead to physical damage to the system, environmental pollution, and even loss of human life. Recent work has demonstrated that motion planning techniques effectively generate a collision-free trajectory during navigation. However, these methods, while creating the motion plans, do not inherently consider the safe operational region of the system, leading to potential safety constraint violations during deployment. In this paper, we propose a method that leverages run-time safety assurance in a kinodynamic motion planning scheme to satisfy the system's operational constraints. First, we use a sampling-based geometric planner to determine a high-level collision-free path within a user-defined space. Second, we design a low-level safety assurance filter to provide safety guarantees to the control input of a Linear Quadratic Regulator (LQR) designed for trajectory tracking. We demonstrate our proposed approach in a restricted 3D simulation environment using a model of the Crazyflie 2.0 drone.

Impact of Medium and Heavy-Duty Electric Vehicle Electrification on Distribution System Stability

Authors:Ali Hassan, Wanshi Hong, Bin Wang, Wencong Su

Date:2025-07-23 16:29:47

Medium and heavy-duty (MHD) commercial vehicles contribute significantly to carbon emissions, accounting for 21% of the total emissions in the transportation sector. To curb this, the U.S. government is increasingly focusing on achieving 100% fleet electrification over the next decade. However, the integration of megawatt-scale charging stations designed for MHD vehicles poses challenges to the stability of secondary distribution systems. This study investigates the impact of megawatt-scale charging station loads on a benchmark IEEE 33-bus distribution system using real data from the HEVI-LOAD software for MHD electrification planning developed by Lawrence Berkeley National Laboratory (LBNL). The results reveal significant violations of per-unit (p.u.) voltage values at various nodes of the distribution system, indicating that substantial upgrades to the distribution infrastructure will be necessary to accommodate the projected MHDEV charging loads and meet electrification targets.
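A per-unit voltage check of the kind reported above can be sketched in a few lines: divide each bus voltage by the base voltage and flag buses outside an allowed band. The 0.95 to 1.05 p.u. band, the 12.66 kV base, and the sample bus voltages are assumptions chosen for illustration; the abstract does not state which limits or values the study used.

```python
def voltage_violations(voltages_kv, base_kv, lower=0.95, upper=1.05):
    """
    Convert bus voltages (kV) to per-unit and return {bus: p.u. value}
    for every bus outside the [lower, upper] band.
    """
    violations = {}
    for bus, kv in voltages_kv.items():
        pu = kv / base_kv
        if pu < lower or pu > upper:
            violations[bus] = round(pu, 4)
    return violations

# Hypothetical post-load-flow voltages at a few buses (made-up numbers).
voltages_kv = {1: 12.66, 10: 12.10, 18: 11.50, 33: 11.90}
flagged = voltage_violations(voltages_kv, base_kv=12.66)
```

Here buses 18 and 33 fall below the assumed 0.95 p.u. lower limit, while buses 1 and 10 stay within the band.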

PRIX: Learning to Plan from Raw Pixels for End-to-End Autonomous Driving

Authors:Maciej K. Wozniak, Lianhang Liu, Yixi Cai, Patric Jensfelt

Date:2025-07-23 15:28:23

While end-to-end autonomous driving models show promising results, their practical deployment is often hindered by large model sizes, a reliance on expensive LiDAR sensors and computationally intensive BEV feature representations. This limits their scalability, especially for mass-market vehicles equipped only with cameras. To address these challenges, we propose PRIX (Plan from Raw Pixels). Our novel and efficient end-to-end driving architecture operates using only camera data, without explicit BEV representation and forgoing the need for LiDAR. PRIX leverages a visual feature extractor coupled with a generative planning head to predict safe trajectories from raw pixel inputs directly. A core component of our architecture is the Context-aware Recalibration Transformer (CaRT), a novel module designed to effectively enhance multi-level visual features for more robust planning. We demonstrate through comprehensive experiments that PRIX achieves state-of-the-art performance on the NavSim and nuScenes benchmarks, matching the capabilities of larger, multimodal diffusion planners while being significantly more efficient in terms of inference speed and model size, making it a practical solution for real-world deployment. Our work is open-source and the code will be at https://maxiuw.github.io/prix.

A Joint Planning Model for Fixed and Mobile Electric Vehicle Charging Stations Considering Flexible Capacity Strategy

Authors:Zhe Yu, Xue Hu, Qin Wang

Date:2025-07-23 15:22:04

The widespread adoption of electric vehicles (EVs) has significantly increased demand on both transportation and power systems, posing challenges to their stable operation. To support the growing need for EV charging, both fixed charging stations (FCSs) and mobile charging stations (MCSs) have been introduced, serving as key interfaces between the power grid and traffic network. Recognizing the importance of collaborative planning across these sectors, this paper presents a two-stage joint planning model for FCSs and MCSs, utilizing an improved alternating direction method of multipliers (ADMM) algorithm. The primary goal of the proposed model is to transform the potential negative impacts of large-scale EV integration into positive outcomes, thereby enhancing social welfare through collaboration among multiple stakeholders. In the first stage, we develop a framework for evaluating FCS locations, incorporating assessments of EV hosting capacity and voltage stability. The second stage introduces a joint planning model for FCSs and MCSs, aiming to minimize the overall social costs of the EV charging system while maintaining a reliable power supply. To solve the planning problem, we employ a combination of mixed-integer linear programming, queueing theory, and sequential quadratic programming. The improved ADMM algorithm couples the siting and sizing decisions consistently by introducing coupling constraints, and supports a distributed optimization framework that coordinates the interests of EV users, MCS operators, and distribution system operators. Additionally, a flexible capacity planning strategy that accounts for the multi-period development potential of EVCS is proposed to reduce both the complexity and the investment required for FCS construction. Finally, a case study with comparative experiments demonstrates the effectiveness of the proposed models and solution methods.

Terrain-Aware Adaptation for Two-Dimensional UAV Path Planners

Authors:Kostas Karakontis, Thanos Petsanis, Athanasios Ch. Kapoutsis, Pavlos Ch. Kapoutsis, Elias B. Kosmatopoulos

Date:2025-07-23 13:55:37

Multi-UAV Coverage Path Planning (mCPP) algorithms in popular commercial software typically treat a Region of Interest (RoI) only as a 2D plane, ignoring important 3D structure characteristics. This leads to incomplete 3D reconstructions, especially around occluded or vertical surfaces. In this paper, we propose a modular algorithm that can extend commercial two-dimensional path planners to facilitate terrain-aware planning by adjusting altitude and camera orientations. To demonstrate it, we extend the well-known DARP (Divide Areas for Optimal Multi-Robot Coverage Path Planning) algorithm and produce DARP-3D. We present simulation results in multiple 3D environments and a real-world flight test using DJI hardware. Compared to baseline, our approach consistently captures improved 3D reconstructions, particularly in areas with significant vertical features. An open-source implementation of the algorithm is available here: https://github.com/konskara/TerraPlan

Joint Multi-Target Detection-Tracking in Cognitive Massive MIMO Radar via POMCP

Authors:Imad Bouhou, Stefano Fortunati, Leila Gharsalli, Alexandre Renaux

Date:2025-07-23 13:43:29

This correspondence presents a power-aware cognitive radar framework for joint detection and tracking of multiple targets in a massive multiple-input multiple-output (MIMO) radar environment. Building on a previous single-target algorithm based on Partially Observable Monte Carlo Planning (POMCP), we extend it to the multi-target case by assigning each target an independent POMCP tree, enabling scalable and efficient planning. Departing from uniform power allocation, which is often suboptimal with varying signal-to-noise ratios (SNRs), our approach predicts each target's future angular position and expected received power, based on its estimated range and radar cross-section (RCS). These predictions guide adaptive waveform design via a constrained optimization problem that allocates transmit energy to enhance the detectability of weaker or distant targets, while ensuring sufficient power for high-SNR targets. The reward function in the underlying partially observable Markov decision process (POMDP) is also modified to prioritize accurate spatial and power estimation. Simulations involving multiple targets with different SNRs confirm the effectiveness of our method. The proposed cognitive radar framework improves detection probability for low-SNR targets and achieves more accurate tracking compared to approaches using uniform or orthogonal waveforms. These results demonstrate the potential of the POMCP-based framework for adaptive, efficient multi-target radar systems.

IndoorBEV: Joint Detection and Footprint Completion of Objects via Mask-based Prediction in Indoor Scenarios for Bird's-Eye View Perception

Authors:Haichuan Li, Changda Tian, Panos Trahanias, Tomi Westerlund

Date:2025-07-23 12:07:21

Detecting diverse objects within complex indoor 3D point clouds presents significant challenges for robotic perception, particularly with varied object shapes, clutter, and the co-existence of static and dynamic elements where traditional bounding box methods falter. To address these limitations, we propose IndoorBEV, a novel mask-based Bird's-Eye View (BEV) method for indoor mobile robots. In a BEV method, a 3D scene is projected into a 2D BEV grid, which naturally handles occlusions and provides a consistent top-down view that helps distinguish static obstacles from dynamic agents. The resulting 2D BEV output is directly usable for downstream robotic tasks like navigation, motion prediction, and planning. Our architecture utilizes an axis compact encoder and a window-based backbone to extract rich spatial features from this BEV map. A query-based decoder head then employs learned object queries to concurrently predict object classes and instance masks in the BEV space. This mask-centric formulation effectively captures the footprint of both static and dynamic objects regardless of their shape, offering a robust alternative to bounding box regression. We demonstrate the effectiveness of IndoorBEV on a custom indoor dataset featuring diverse object classes including static objects and dynamic elements like robots and miscellaneous items, showcasing its potential for robust indoor scene understanding.
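
The projection of a 3D scene into a 2D BEV grid can be sketched as follows. This is only the geometric step under simple assumptions, not IndoorBEV's learned encoder: each cell keeps the maximum point height, and the function name `project_to_bev`, the cell size, and the grid origin are hypothetical choices.

```python
def project_to_bev(points, cell_size=0.5, x_min=0.0, y_min=0.0):
    """Map 3D points (x, y, z) to a sparse 2D bird's-eye-view grid.

    Returns a dict {(col, row): max_height}: each occupied cell stores the
    tallest point that falls into it. Points are assumed to satisfy
    x >= x_min and y >= y_min.
    """
    grid = {}
    for x, y, z in points:
        cell = (int((x - x_min) // cell_size), int((y - y_min) // cell_size))
        grid[cell] = max(z, grid.get(cell, float("-inf")))
    return grid


cloud = [(0.1, 0.1, 0.2), (0.3, 0.4, 1.5), (1.2, 0.1, 0.7)]
print(project_to_bev(cloud))  # {(0, 0): 1.5, (2, 0): 0.7}
```

Dropping the vertical axis this way is what lets a footprint mask be predicted per cell, independent of how tall or irregular the object is.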

Optimizing Car Resequencing on Mixed-Model Assembly Lines: Algorithm Development and Deployment

Authors:Andreas Karrenbauer, Bernd Kuhn, Kurt Mehlhorn, Paolo Luigi Rinaldi

Date:2025-07-23 11:29:44

The mixed-model assembly line (MMAL) is a production system used in the automobile industry to manufacture different car models on the same conveyor, offering a high degree of product customization and flexibility. However, the MMAL also poses challenges, such as finding optimal sequences of models satisfying multiple constraints and objectives related to production performance, quality, and delivery, including minimizing the number of color changeovers in the Paint Shop, balancing the workload and setup times on the assembly line, and meeting customer demand and delivery deadlines. We propose a multi-objective algorithm to solve the MMAL resequencing problem under consideration of all these aspects simultaneously. We also present empirical results obtained from recorded event data of the production process over 4 weeks following the deployment of our algorithm in the Saarlouis plant of Ford-Werke GmbH. We achieved an improvement of the average batch size of about 30% over the old control software, translating to a 23% reduction of color changeovers. Moreover, we reduced the spread of cars planned for a specific date by 10%, reducing the risk of delays in delivery. We discuss the effectiveness and robustness of our algorithm in improving production performance and quality as well as trade-offs and limitations.
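
The two reported figures are consistent with each other under a simple assumption that is not stated in the abstract: if the number of cars painted is fixed and each batch ends with one color changeover, the changeover count is inversely proportional to the average batch size. A short check, with the hypothetical helper `changeover_reduction`:

```python
def changeover_reduction(batch_size_gain: float) -> float:
    """Fractional drop in changeovers when average batch size grows by batch_size_gain.

    Assumes a fixed number of cars and one changeover per batch, so the
    changeover count scales as 1 / (average batch size).
    """
    return 1.0 - 1.0 / (1.0 + batch_size_gain)


print(f"{changeover_reduction(0.30):.1%}")  # 23.1%
```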

From Extraterrestrial Microbes to Alien Intelligence: Rebalancing Astronomical Research Priorities

Authors:Omer Eldadi, Gershon Tenenbaum, Abraham Loeb

Date:2025-07-23 10:04:56

We examine the funding disparity in astronomical research priorities: the Habitable Worlds Observatory is planned to receive over $10 billion over the next two decades whereas extraterrestrial intelligence research receives nearly zero federal funding. This imbalance is in contrast to both scientific value and public interest, as 65% of Americans and 58.2% of surveyed astrobiologists believe extraterrestrial intelligence exists. Empirical psychological research demonstrates that humanity possesses greater resilience toward extraterrestrial contact than historically recognized. Contemporary studies reveal adaptive responses rather than mass panic, conflicting with the rationale for excluding extraterrestrial intelligence research from federal funding since 1993. The response to the recent interstellar object 3I/ATLAS exemplifies consequences of this underinvestment: despite discovery forecasts of a new interstellar object every few months for the coming decade, no funded missions exist to intercept or closely study these visitors from outside the Solar System. We propose establishing a comprehensive research program to explore both biosignatures and technosignatures on interstellar objects. This program would address profound public interest while advancing detection capabilities and enabling potentially transformative discoveries in the search for extraterrestrial life. The systematic exclusion of extraterrestrial intelligence research represents institutional bias rather than scientific limitation, requiring immediate reconsideration of funding priorities.

DeMo++: Motion Decoupling for Autonomous Driving

Authors:Bozhou Zhang, Nan Song, Xiatian Zhu, Li Zhang

Date:2025-07-23 09:11:25

Motion forecasting and planning are tasked with estimating the trajectories of traffic agents and the ego vehicle, respectively, to ensure the safety and efficiency of autonomous driving systems in dynamically changing environments. State-of-the-art methods typically adopt a one-query-one-trajectory paradigm, where each query corresponds to a unique trajectory for predicting multi-mode trajectories. While this paradigm can produce diverse motion intentions, it often falls short in modeling the intricate spatiotemporal evolution of trajectories, which can lead to collisions or suboptimal outcomes. To overcome this limitation, we propose DeMo++, a framework that decouples motion estimation into two distinct components: holistic motion intentions to capture the diverse potential directions of movement, and fine spatiotemporal states to track the agent's dynamic progress within the scene and enable a self-refinement capability. Further, we introduce a cross-scene trajectory interaction mechanism to explore the relationships between motions in adjacent scenes. This allows DeMo++ to comprehensively model both the diversity of motion intentions and the spatiotemporal evolution of each trajectory. To effectively implement this framework, we developed a hybrid model combining Attention and Mamba. This architecture leverages the strengths of both mechanisms for efficient scene information aggregation and precise trajectory state sequence modeling. Extensive experiments demonstrate that DeMo++ achieves state-of-the-art performance across various benchmarks, including motion forecasting (Argoverse 2 and nuScenes), motion planning (nuPlan), and end-to-end planning (NAVSIM).

EarthLink: A Self-Evolving AI Agent for Climate Science

Authors:Zijie Guo, Jiong Wang, Xiaoyu Yue, Wangxu Wei, Zhe Jiang, Wanghan Xu, Ben Fei, Wenlong Zhang, Xinyu Gu, Lijing Cheng, Jing-Jia Luo, Chao Li, Yaqiang Wang, Tao Chen, Wanli Ouyang, Fenghua Ling, Lei Bai

Date:2025-07-23 08:29:25

Modern Earth science is at an inflection point. The vast, fragmented, and complex nature of Earth system data, coupled with increasingly sophisticated analytical demands, creates a significant bottleneck for rapid scientific discovery. Here we introduce EarthLink, the first AI agent designed as an interactive copilot for Earth scientists. It automates the end-to-end research workflow, from planning and code generation to multi-scenario analysis. Unlike static diagnostic tools, EarthLink can learn from user interaction, continuously refining its capabilities through a dynamic feedback loop. We validated its performance on a number of core scientific tasks of climate change, ranging from model-observation comparisons to the diagnosis of complex phenomena. In a multi-expert evaluation, EarthLink produced scientifically sound analyses and demonstrated an analytical competency that was rated as comparable to specific aspects of a human junior researcher's workflow. Additionally, its transparent, auditable workflows and natural language interface empower scientists to shift from laborious manual execution to strategic oversight and hypothesis generation. EarthLink marks a pivotal step towards an efficient, trustworthy, and collaborative paradigm for Earth system research in an era of accelerating global change. The system is accessible at our website https://earthlink.intern-ai.org.cn.

VLA-Touch: Enhancing Vision-Language-Action Models with Dual-Level Tactile Feedback

Authors:Jianxin Bi, Kevin Yuchen Ma, Ce Hao, Mike Zheng Shou, Harold Soh

Date:2025-07-23 07:54:10

Tactile feedback is generally recognized to be crucial for effective interaction with the physical world. However, state-of-the-art Vision-Language-Action (VLA) models lack the ability to interpret and use tactile signals, limiting their effectiveness in contact-rich tasks. Incorporating tactile feedback into these systems is challenging due to the absence of large multi-modal datasets. We present VLA-Touch, an approach that enhances generalist robot policies with tactile sensing without fine-tuning the base VLA. Our method introduces two key innovations: (1) a pipeline that leverages a pretrained tactile-language model to provide semantic tactile feedback for high-level task planning, and (2) a diffusion-based controller that refines VLA-generated actions with tactile signals for contact-rich manipulation. Through real-world experiments, we demonstrate that our dual-level integration of tactile feedback improves task planning efficiency while enhancing execution precision. Code is open-sourced at https://github.com/jxbi1010/VLA-Touch.

Investigating State-of-the-Art Planning Strategies for Electric Vehicle Charging Infrastructures in Coupled Transport and Power Networks: A Comprehensive Review

Authors:Jinhao Li, Arlena Chew, Hao Wang

Date:2025-07-23 07:29:06

Electric vehicles (EVs) have emerged as a pivotal solution to reduce greenhouse gas emissions, paving a pathway to net zero. As the adoption of EVs continues to grow, countries are proactively formulating systematic plans for nationwide electric vehicle charging infrastructure (EVCI) to keep pace with the accelerating shift towards EVs. This comprehensive review aims to thoroughly examine current global practices in EVCI planning and explore state-of-the-art methodologies for designing EVCI planning strategies. Despite remarkable efforts by influential players in the global EV market, such as China, the United States, and the European Union, the progress in EVCI rollout has been notably slower than anticipated in the rest of the world. This delay can be attributed to three major impediments: inadequate EVCI charging services, low utilization rates of public EVCI facilities, and the non-trivial integration of EVCI into the electric grid. This review dissects the interests of these stakeholders, clarifying their respective roles and expectations in the context of EVCI planning. This review also provides insights into level 1, 2, and 3 chargers with explorations of their applications in different geographical locations for diverse EV charging patterns. Finally, a thorough review of node-based and flow-based approaches to EV planning is presented. Models for placing charging stations are broadly categorized into set coverage, maximum coverage, flow-capturing, and flow-refueling location models. In conclusion, this review identifies several research gaps, including the dynamic modeling of EV charging demand and the coordination of vehicle electrification with grid decarbonization. This paper calls for further contributions to bridge these gaps and drive the advancement of EVCI planning.
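
To make the maximum coverage category concrete, here is a minimal sketch of a greedy heuristic for it. The greedy rule is a standard approximation swapped in for illustration; the review does not prescribe it, and such models are often solved exactly as integer programs instead. The function name and the toy sites and demand nodes are hypothetical.

```python
def greedy_max_coverage(candidates, k):
    """Pick up to k sites greedily, each time maximizing newly covered demand.

    candidates: dict mapping site name -> set of demand nodes it covers.
    Returns (chosen sites in pick order, set of covered demand nodes).
    """
    chosen, covered = [], set()
    remaining = dict(candidates)
    for _ in range(k):
        if not remaining:
            break
        # Site whose coverage set adds the most not-yet-covered demand nodes.
        best = max(remaining, key=lambda s: len(remaining[s] - covered))
        if not remaining[best] - covered:
            break  # no remaining site adds coverage
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered


sites = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6, 7}}
print(greedy_max_coverage(sites, k=2))  # (['C', 'A'], {1, 2, 3, 4, 5, 6, 7})
```

A set coverage model would instead drop the budget `k` and keep adding sites until every demand node is covered.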

Lessons from a Big-Bang Integration: Challenges in Edge Computing and Machine Learning

Authors:Alessandro Aneggi, Andrea Janes

Date:2025-07-23 07:16:45

This experience report analyses a one-year project focused on building a distributed real-time analytics system using edge computing and machine learning. The project faced critical setbacks due to a big-bang integration approach, where all components developed by multiple geographically dispersed partners were merged at the final stage. The integration effort resulted in only six minutes of system functionality, far below the expected 40 minutes. Through root cause analysis, the study identifies technical and organisational barriers, including poor communication, lack of early integration testing, and resistance to top-down planning. It also considers psychological factors such as a bias toward fully developed components over mockups. The paper advocates for early mock-based deployment, robust communication infrastructures, and the adoption of top-down thinking to manage complexity and reduce risk in reactive, distributed projects. These findings underscore the limitations of traditional Agile methods in such contexts and propose simulation-driven engineering and structured integration cycles as key enablers for future success.