planning - 2025-07-21

UGPL: Uncertainty-Guided Progressive Learning for Evidence-Based Classification in Computed Tomography

Authors:Shravan Venkatraman, Pavan Kumar S, Rakesh Raj Madavan, Chandrakala S

Date:2025-07-18 17:30:56

Accurate classification of computed tomography (CT) images is essential for diagnosis and treatment planning, but existing methods often struggle with the subtle and spatially diverse nature of pathological features. Current approaches typically process images uniformly, limiting their ability to detect localized abnormalities that require focused analysis. We introduce UGPL, an uncertainty-guided progressive learning framework that performs a global-to-local analysis by first identifying regions of diagnostic ambiguity and then conducting detailed examination of these critical areas. Our approach employs evidential deep learning to quantify predictive uncertainty, guiding the extraction of informative patches through a non-maximum suppression mechanism that maintains spatial diversity. This progressive refinement strategy, combined with an adaptive fusion mechanism, enables UGPL to integrate both contextual information and fine-grained details. Experiments across three CT datasets demonstrate that UGPL consistently outperforms state-of-the-art methods, achieving improvements of 3.29%, 2.46%, and 8.08% in accuracy for kidney abnormality, lung cancer, and COVID-19 detection, respectively. Our analysis shows that the uncertainty-guided component provides substantial benefits, with performance dramatically increasing when the full progressive learning pipeline is implemented. Our code is available at: https://github.com/shravan-18/UGPL
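The global-to-local step described above can be sketched as follows. This is only an illustration, not the authors' implementation: it assumes the common evidential formulation in which uncertainty is K divided by the total Dirichlet strength, and a greedy non-maximum suppression over a grid of per-cell uncertainties. The function names, the grid representation, and the Chebyshev-distance rule are all hypothetical.

```python
def evidential_uncertainty(evidence):
    """Uncertainty u = K / sum(alpha), with alpha_k = evidence_k + 1 (assumed formulation)."""
    num_classes = len(evidence)
    strength = sum(e + 1.0 for e in evidence)
    return num_classes / strength


def select_patches(unc_map, num_patches, min_dist):
    """Greedy NMS: pick the most uncertain cells that are at least min_dist apart.

    unc_map is a list of rows of per-cell uncertainty values (hypothetical layout).
    Distance between cells is measured with the Chebyshev metric.
    """
    cells = sorted(
        ((unc_map[r][c], r, c)
         for r in range(len(unc_map))
         for c in range(len(unc_map[0]))),
        key=lambda item: -item[0],
    )
    chosen = []
    for _, r, c in cells:
        far_enough = all(max(abs(r - pr), abs(c - pc)) >= min_dist for pr, pc in chosen)
        if far_enough:
            chosen.append((r, c))
            if len(chosen) == num_patches:
                break
    return chosen
```

With no evidence at all the uncertainty is 1, and it shrinks as evidence for any class accumulates; the suppression radius is what keeps the selected patches spatially diverse.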

Context-Aware Behavior Learning with Heuristic Motion Memory for Underwater Manipulation

Authors:Markus Buchholz, Ignacio Carlucho, Michele Grimaldi, Maria Koskinopoulou, Yvan R. Petillot

Date:2025-07-18 17:25:54

Autonomous motion planning is critical for efficient and safe underwater manipulation in dynamic marine environments. Current motion planning methods often fail to effectively utilize prior motion experiences and adapt to real-time uncertainties inherent in underwater settings. In this paper, we introduce an Adaptive Heuristic Motion Planner framework that integrates a Heuristic Motion Space (HMS) with Bayesian Networks to enhance motion planning for autonomous underwater manipulation. Our approach employs the Probabilistic Roadmap (PRM) algorithm within HMS to optimize paths by minimizing a composite cost function that accounts for distance, uncertainty, energy consumption, and execution time. By leveraging HMS, our framework significantly reduces the search space, thereby boosting computational performance and enabling real-time planning capabilities. Bayesian Networks are utilized to dynamically update uncertainty estimates based on real-time sensor data and environmental conditions, thereby refining the joint probability of path success. Through extensive simulations and real-world test scenarios, we showcase the advantages of our method in terms of enhanced performance and robustness. This probabilistic approach significantly advances the capability of autonomous underwater robots, ensuring optimized motion planning in the face of dynamic marine challenges.

A multi-strategy improved snake optimizer for three-dimensional UAV path planning and engineering problems

Authors:Genliang Li, Yaxin Cui, Jinyu Su

Date:2025-07-18 16:11:35

Metaheuristic algorithms have gained widespread application across various fields owing to their ability to generate diverse solutions. One such algorithm is the Snake Optimizer (SO), a progressive optimization approach. However, SO suffers from slow convergence and susceptibility to local optima. In light of these shortcomings, we propose a novel Multi-strategy Improved Snake Optimizer (MISO). Firstly, we propose a new adaptive random disturbance strategy based on a sine function to alleviate the risk of getting trapped in a local optimum. Secondly, we introduce an adaptive Levy flight strategy based on a scale factor and leader, and endow the male snake leader with flight capability, which makes it easier for the algorithm to leap out of local optima and find the global optimum. More importantly, we put forward a position update strategy combining elite leadership and Brownian motion, effectively accelerating convergence while ensuring precision. Finally, to demonstrate the performance of MISO, we utilize 30 CEC2017 test functions and the CEC2022 test suite, comparing it with 11 popular algorithms across different dimensions to validate its effectiveness. Moreover, Unmanned Aerial Vehicles (UAVs) have been widely used in various fields due to their advantages of low cost, high mobility and easy operation. However, the UAV path planning problem is crucial for flight safety and efficiency, and there are still challenges in establishing and optimizing the path model. Therefore, we apply MISO to the UAV 3D path planning problem as well as 6 engineering design problems to assess its feasibility in practical applications. The experimental results demonstrate that MISO exceeds other competitive algorithms in terms of solution quality and stability, establishing its strong potential for application.

DreamScene: 3D Gaussian-based End-to-end Text-to-3D Scene Generation

Authors:Haoran Li, Yuli Tian, Kun Lan, Yong Liao, Lin Wang, Pan Hui, Peng Yuan Zhou

Date:2025-07-18 14:45:54

Generating 3D scenes from natural language holds great promise for applications in gaming, film, and design. However, existing methods struggle with automation, 3D consistency, and fine-grained control. We present DreamScene, an end-to-end framework for high-quality and editable 3D scene generation from text or dialogue. DreamScene begins with a scene planning module, where a GPT-4 agent infers object semantics and spatial constraints to construct a hybrid graph. A graph-based placement algorithm then produces a structured, collision-free layout. Based on this layout, Formation Pattern Sampling (FPS) generates object geometry using multi-timestep sampling and reconstructive optimization, enabling fast and realistic synthesis. To ensure global consistency, DreamScene employs a progressive camera sampling strategy tailored to both indoor and outdoor settings. Finally, the system supports fine-grained scene editing, including object movement, appearance changes, and 4D dynamic motion. Experiments demonstrate that DreamScene surpasses prior methods in quality, consistency, and flexibility, offering a practical solution for open-domain 3D content creation. Code and demos are available at https://dreamscene-project.github.io.

Leveraging Pathology Foundation Models for Panoptic Segmentation of Melanoma in H&E Images

Authors:Jiaqi Lv, Yijie Zhu, Carmen Guadalupe Colin Tenorio, Brinder Singh Chohan, Mark Eastwood, Shan E Ahmed Raza

Date:2025-07-18 14:38:25

Melanoma is an aggressive form of skin cancer with rapid progression and high metastatic potential. Accurate characterisation of tissue morphology in melanoma is crucial for prognosis and treatment planning. However, manual segmentation of tissue regions from haematoxylin and eosin (H&E) stained whole-slide images (WSIs) is labour-intensive and prone to inter-observer variability, which motivates the need for reliable automated tissue segmentation methods. In this study, we propose a novel deep learning network for the segmentation of five tissue classes in melanoma H&E images. Our approach leverages Virchow2, a pathology foundation model trained on 3.1 million histopathology images, as a feature extractor. These features are fused with the original RGB images and subsequently processed by an encoder-decoder segmentation network (Efficient-UNet) to produce accurate segmentation maps. The proposed model achieved first place in the tissue segmentation task of the PUMA Grand Challenge, demonstrating robust performance and generalizability. Our results show the potential and efficacy of incorporating pathology foundation models into segmentation networks to accelerate computational pathology workflows.

Generalist Forecasting with Frozen Video Models via Latent Diffusion

Authors:Jacob C Walker, Pedro Vélez, Luisa Polania Cabrera, Guangyao Zhou, Rishabh Kabra, Carl Doersch, Maks Ovsjanikov, João Carreira, Shiry Ginosar

Date:2025-07-18 14:14:19

Forecasting what will happen next is a critical skill for general-purpose systems that plan or act in the world at different levels of abstraction. In this paper, we identify a strong correlation between a vision model's perceptual ability and its generalist forecasting performance over short time horizons. This trend holds across a diverse set of pretrained models (including those trained generatively) and across multiple levels of abstraction, from raw pixels to depth, point tracks, and object motion. The result is made possible by a novel generalist forecasting framework that operates on any frozen vision backbone: we train latent diffusion models to forecast future features in the frozen representation space, which are then decoded via lightweight, task-specific readouts. To enable consistent evaluation across tasks, we introduce distributional metrics that compare distributional properties directly in the space of downstream tasks and apply this framework to nine models and four tasks. Our results highlight the value of bridging representation learning and generative modeling for temporally grounded video understanding.

NeHMO: Neural Hamilton-Jacobi Reachability Learning for Decentralized Safe Multi-Agent Motion Planning

Authors:Qingyi Chen, Ahmed H. Qureshi

Date:2025-07-18 14:12:56

Safe Multi-Agent Motion Planning (MAMP) is a significant challenge in robotics. Despite substantial advancements, existing methods often face a dilemma. Decentralized algorithms typically rely on predicting the behavior of other agents, sharing contracts, or maintaining communication for safety, while centralized approaches struggle with scalability and real-time decision-making. To address these challenges, we introduce Neural Hamilton-Jacobi Reachability Learning (HJR) for Decentralized Multi-Agent Motion Planning. Our method provides scalable neural HJR modeling to tackle high-dimensional configuration spaces and capture worst-case collision and safety constraints between agents. We further propose a decentralized trajectory optimization framework that incorporates the learned HJR solutions to solve MAMP tasks in real-time. We demonstrate that our method is both scalable and data-efficient, enabling the solution of MAMP problems in higher-dimensional scenarios with complex collision constraints. Our approach generalizes across various dynamical systems, including a 12-dimensional dual-arm setup, and outperforms a range of state-of-the-art techniques in successfully addressing challenging MAMP tasks. Video demonstrations are available at https://youtu.be/IZiePX0p1Mc.

Extracting Insights from Large-Scale Telematics Data for ITS Applications: Lessons and Recommendations

Authors:Gibran Ali, Neal Feierabend, Prarthana Doshi, Calvin Winkowski, Michael Fontaine

Date:2025-07-18 14:09:40

Over 90% of new vehicles in the United States now collect and transmit telematics data. Similar trends are seen in other developed countries. Transportation planners have previously utilized telematics data in various forms, but its current scale offers significant new opportunities in traffic measurement, classification, planning, and control. Despite these opportunities, the enormous volume of data and lack of standardization across manufacturers necessitate a clearer understanding of the data and improved data processing methods for extracting actionable insights. This paper takes a step towards addressing these needs through four primary objectives. First, a data processing pipeline was built to efficiently analyze 1.4 billion miles (120 million trips) of telematics data collected in Virginia between August 2021 and August 2022. Second, an open data repository of trip and roadway segment level summaries was created. Third, interactive visualization tools were designed to extract insights from these data about trip-taking behavior and the speed profiles of roadways. Finally, major challenges that were faced while processing these data are summarized and recommendations to overcome them are provided. This work will help manufacturers collecting the data and transportation professionals using the data to develop a better understanding of the possibilities and major pitfalls to avoid.

A Quantum-assisted Attention U-Net for Building Segmentation over Tunis using Sentinel-1 Data

Authors:Luigi Russo, Francesco Mauro, Babak Memar, Alessandro Sebastianelli, Silvia Liberata Ullo, Paolo Gamba

Date:2025-07-18 12:16:04

Building segmentation in urban areas is essential in fields such as urban planning, disaster response, and population mapping. Yet accurately segmenting buildings in dense urban regions presents challenges due to the large size and high resolution of satellite images. This study investigates the use of Quanvolutional pre-processing to enhance the capability of the Attention U-Net model in building segmentation. Specifically, this paper focuses on the urban landscape of Tunis, utilizing Sentinel-1 Synthetic Aperture Radar (SAR) imagery. In this work, Quanvolution was used to extract more informative feature maps that capture essential structural details in radar imagery, proving beneficial for accurate building segmentation. Preliminary results indicate that the proposed methodology achieves comparable test accuracy to the standard Attention U-Net model while significantly reducing network parameters. This result aligns with findings from previous works, confirming that Quanvolution not only maintains model accuracy but also increases computational efficiency. These promising outcomes highlight the potential of quantum-assisted Deep Learning frameworks for large-scale building segmentation in urban environments.

Principles and Reasons Behind Automated Vehicle Decisions in Ethically Ambiguous Everyday Scenarios

Authors:Lucas Elbert Suryana, Simeon Calvert, Arkady Zgonnikov, Bart van Arem

Date:2025-07-18 11:52:33

Automated vehicles (AVs) increasingly encounter ethically ambiguous situations in everyday driving--scenarios involving conflicting human interests and lacking clearly optimal courses of action. While existing ethical models often focus on rare, high-stakes dilemmas (e.g., crash avoidance or trolley problems), routine decisions such as overtaking cyclists or navigating social interactions remain underexplored. This study addresses that gap by applying the tracking condition of Meaningful Human Control (MHC), which holds that AV behaviour should align with human reasons--defined as the values, intentions, and expectations that justify actions. We conducted qualitative interviews with 18 AV experts to identify the types of reasons that should inform AV manoeuvre planning. Thirteen categories of reasons emerged, organised across normative, strategic, tactical, and operational levels, and linked to the roles of relevant human agents. A case study on cyclist overtaking illustrates how these reasons interact in context, revealing a consistent prioritisation of safety, contextual flexibility regarding regulatory compliance, and nuanced trade-offs involving efficiency, comfort, and public acceptance. Based on these insights, we propose a principled conceptual framework for AV decision-making in routine, ethically ambiguous scenarios. The framework supports dynamic, human-aligned behaviour by prioritising safety, allowing pragmatic actions when strict legal adherence would undermine key values, and enabling constrained deviations when appropriately justified. This empirically grounded approach advances current guidance by offering actionable, context-sensitive design principles for ethically aligned AV systems.

Scalable Submodular Policy Optimization via Pruned Submodularity Graph

Authors:Aditi Anand, Suman Banerjee, Dildar Ali

Date:2025-07-18 11:42:07

In Reinforcement Learning (abbreviated as RL), an agent interacts with the environment via a set of possible actions, and a reward is generated from some unknown distribution. The task here is to find an optimal set of actions such that the reward after a certain time step is maximized. In a traditional setup, the reward function in an RL problem is considered additive. However, in reality, there exist many problems, including path planning and coverage control, in which the reward function exhibits diminishing returns and can be modeled as a submodular function. In this paper, we study a variant of the RL problem where the reward function is submodular, and our objective is to find an optimal policy such that this reward function is maximized. We propose a pruned submodularity graph-based approach that provides a provably approximate solution in a feasible computation time. The proposed approach has been analyzed to understand its time and space requirements as well as its performance guarantee. We have experimented with a benchmark agent-environment setup, which has been used for similar previous studies, and the results are reported. From the results, we observe that the policy obtained by our proposed approach leads to more reward than the baseline methods.
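To make the diminishing-returns property concrete, here is a minimal sketch using a coverage reward, a standard example of a submodular function. It does not reproduce the pruned submodularity graph method from the abstract; the state names and footprints are made up for illustration.

```python
def coverage_reward(visited_states, footprint):
    """Reward = number of distinct cells covered by the visited states."""
    covered = set()
    for state in visited_states:
        covered |= footprint[state]
    return len(covered)


def marginal_gain(base_states, new_state, footprint):
    """Extra reward obtained by adding new_state to an existing trajectory."""
    with_new = coverage_reward(base_states + [new_state], footprint)
    return with_new - coverage_reward(base_states, footprint)


# Hypothetical footprints: which cells each state covers.
FOOTPRINT = {"a": {1, 2, 3}, "b": {3, 4}, "c": {2, 3, 4, 5}}
```

Visiting state "c" is worth 4 on an empty trajectory, 2 after "a", and only 1 after "a" and "b", so the same action yields less reward the more has already been covered. An additive reward cannot express this.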

SkySense V2: A Unified Foundation Model for Multi-modal Remote Sensing

Authors:Yingying Zhang, Lixiang Ru, Kang Wu, Lei Yu, Lei Liang, Yansheng Li, Jingdong Chen

Date:2025-07-18 10:44:22

The multi-modal remote sensing foundation model (MM-RSFM) has significantly advanced various Earth observation tasks, such as urban planning, environmental monitoring, and natural disaster management. However, most existing approaches require the training of separate backbone networks for each data modality, leading to redundancy and inefficient parameter utilization. Moreover, prevalent pre-training methods typically apply self-supervised learning (SSL) techniques from natural images without adequately accommodating the characteristics of remote sensing (RS) images, such as the complicated semantic distribution within a single RS image. In this work, we present SkySense V2, a unified MM-RSFM that employs a single transformer backbone to handle multiple modalities. This backbone is pre-trained with a novel SSL strategy tailored to the distinct traits of RS data. In particular, SkySense V2 incorporates an innovative adaptive patch merging module and learnable modality prompt tokens to address challenges related to varying resolutions and limited feature diversity across modalities. In addition, we incorporate the mixture of experts (MoE) module to further enhance the performance of the foundation model. SkySense V2 demonstrates impressive generalization abilities through an extensive evaluation involving 16 datasets over 7 tasks, outperforming SkySense by an average of 1.8 points.

Regression-Based Approach to Anxiety Estimation of Spider Phobics During Behavioural Avoidance Tasks

Authors:Florian Grensing, Vanessa Schmücker, Anne Sophie Hildebrand, Tim Klucken, Maria Maleshkova

Date:2025-07-18 10:09:38

Phobias significantly impact the quality of life of affected persons. Two methods of assessing anxiety responses are questionnaires and behavioural avoidance tests (BAT). While these can be used in a clinical environment, they only record momentary insights into anxiety measures. In this study, we estimate the intensity of anxiety during these BATs, using physiological data collected from unobtrusive, wrist-worn sensors. Twenty-five participants performed four different BATs in a single session, while periodically being asked how anxious they currently were. Using heart rate, heart rate variability, electrodermal activity, and skin temperature, we trained regression models to predict anxiety ratings from three types of input data: (1) using only physiological signals, (2) adding computed features (e.g., min, max, range, variability), and (3) computed features combined with contextual task information. Adding contextual information increased the effectiveness of the model, leading to a root mean squared error (RMSE) of 0.197 and a mean absolute error (MAE) of 0.041. Overall, this study shows that data obtained from wearables can continuously provide meaningful estimations of anxiety, which can assist in therapy planning and enable more personalised treatment.

On consistency of the MLE under finite mixtures of location-scale distributions with a structural parameter

Authors:Guanfu Liu, Pengfei Li, Yukun Liu, Xiaolong Pu

Date:2025-07-18 09:16:55

We provide a general and rigorous proof for the strong consistency of maximum likelihood estimators of the cumulative distribution function of the mixing distribution and structural parameter under finite mixtures of location-scale distributions with a structural parameter. The consistency results do not require the parameter space of location and scale to be compact. We illustrate the results by applying them to finite mixtures of location-scale distributions with the component density function being one of the commonly used density functions: normal, logistic, extreme-value, or $t$. An extension of the strong consistency results to finite mixtures of multivariate elliptical distributions is also discussed.

Improved particle swarm optimization algorithm: multi-target trajectory optimization for swarm drones

Authors:Minze Li, Wei Zhao, Ran Chen, Mingqiang Wei

Date:2025-07-18 04:31:49

Real-time trajectory planning for unmanned aerial vehicles (UAVs) in dynamic environments remains a key challenge due to high computational demands and the need for fast, adaptive responses. Traditional Particle Swarm Optimization (PSO) methods, while effective for offline planning, often struggle with premature convergence and latency in real-time scenarios. To overcome these limitations, we propose PE-PSO, an enhanced PSO-based online trajectory planner. The method introduces a persistent exploration mechanism to preserve swarm diversity and an entropy-based parameter adjustment strategy to dynamically adapt optimization behavior. UAV trajectories are modeled using B-spline curves, which ensure path smoothness while reducing optimization complexity. To extend this capability to UAV swarms, we develop a multi-agent framework that combines genetic algorithm (GA)-based task allocation with distributed PE-PSO, supporting scalable and coordinated trajectory generation. The distributed architecture allows for parallel computation and decentralized control, enabling effective cooperation among agents while maintaining real-time performance. Comprehensive simulations demonstrate that the proposed framework outperforms conventional PSO and other swarm-based planners across several metrics, including trajectory quality, energy efficiency, obstacle avoidance, and computation time. These results confirm the effectiveness and applicability of PE-PSO in real-time multi-UAV operations under complex environmental conditions.
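As an illustration of why B-spline parameterisation reduces optimization complexity, the sketch below evaluates a uniform cubic B-spline from a handful of control points, so an optimizer would only need to search over those control points rather than every waypoint. The uniform cubic basis is an assumption; the abstract does not state the spline order or knot vector used by PE-PSO.

```python
def cubic_bspline_point(p0, p1, p2, p3, t):
    """Evaluate one uniform cubic B-spline segment at t in [0, 1]."""
    b0 = (1 - t) ** 3 / 6
    b1 = (3 * t**3 - 6 * t**2 + 4) / 6
    b2 = (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6
    b3 = t**3 / 6
    # The four basis weights sum to 1, so the point is a convex combination.
    return tuple(
        b0 * a + b1 * b + b2 * c + b3 * d
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def sample_trajectory(control_points, samples_per_segment=10):
    """Sample a smooth path from control points (needs at least four points)."""
    path = []
    for i in range(len(control_points) - 3):
        segment = control_points[i:i + 4]
        for s in range(samples_per_segment):
            path.append(cubic_bspline_point(*segment, s / samples_per_segment))
    return path
```

Because each segment depends on only four consecutive control points, moving one control point changes the path locally, which suits incremental replanning.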

Quantification of head and neck cancer patients' anatomical changes during radiotherapy: prediction of replanning need

Authors:Odette Rios-Ibacache, James Manalad, Kayla O'Sullivan-Steben, Emily Poon, Luc Galarneau, Julia Khriguian, George Shenouda, John Kildea

Date:2025-07-18 03:41:57

Head and neck cancer (HNC) patients who undergo radiotherapy (RT) may experience anatomical changes during treatment, compromising the validity of the initial treatment plan and necessitating replanning. However, replanning disrupts clinical workflows, creating a stressful environment. Currently, no standardized method exists to determine the total amount of anatomical change that necessitates replanning. This project aimed to create metrics to describe anatomical changes HNC patients may experience during RT and develop machine learning (ML) models to predict RT replanning. We included a cohort of 150 HNC patients treated at the McGill University Health Centre. Based on the shape of the RT structures, we created metrics and developed an extraction pipeline, called HNGeoNatomyX, to automatically calculate them. A univariate metric analysis using linear regression was conducted to obtain the rate of change of each metric. We also obtained the relative variation of each metric between the pre-treatment scan and the fraction at which replanning was requested. Fraction-specific ML models (models that incorporated information available up to and including the specific fraction) for fractions 5, 10, and 15 were built using the metrics, clinical data, and feature selection techniques. To estimate the models' performance, we used a repeated stratified 5-fold cross-validation resampling technique and the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve. The best specific multivariate models for fractions 5, 10, and 15 yielded testing scores of 0.82, 0.70, and 0.79, respectively. Our models predicted replanning early for 76% of the true positives. The created metrics have the potential to characterize and distinguish which patients will necessitate RT replanning. They show promise in guiding clinicians to evaluate RT replanning for HNC patients and streamline workflows.

Conformal Contraction for Robust Nonlinear Control with Distribution-Free Uncertainty Quantification

Authors:Sihang Wei, Melkior Ornik, Hiroyasu Tsukamoto

Date:2025-07-18 02:44:32

We present a novel robust control framework for continuous-time, perturbed nonlinear dynamical systems with uncertainty that depends nonlinearly on both the state and control inputs. Unlike conventional approaches that impose structural assumptions on the uncertainty, our framework enhances contraction-based robust control with data-driven uncertainty prediction, remaining agnostic to the models of the uncertainty and predictor. We statistically quantify how reliably the contraction conditions are satisfied under dynamics with uncertainty via conformal prediction, thereby obtaining a distribution-free and finite-time probabilistic guarantee for exponential boundedness of the trajectory tracking error. We further propose the probabilistically robust control invariant (PRCI) tube for distributionally robust motion planning, within which the perturbed system trajectories are guaranteed to stay with a finite probability, without explicit knowledge of the uncertainty model. Numerical simulations validate the effectiveness of the proposed robust control framework and the performance of the PRCI tube.
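The distribution-free guarantee mentioned above rests on conformal prediction. A minimal sketch of the standard split-conformal quantile is given below, assuming the calibration scores are absolute errors of some uncertainty predictor; how the paper defines its scores and ties them to the contraction conditions is not stated in the abstract, and the function name is hypothetical.

```python
import math


def conformal_quantile(calibration_scores, alpha):
    """Split-conformal bound: the ceil((n + 1) * (1 - alpha))-th smallest score.

    Under exchangeability, a fresh score falls at or below this bound with
    probability at least 1 - alpha. Returns infinity when there are too few
    calibration points for the requested level.
    """
    n = len(calibration_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(calibration_scores)[rank - 1]
```

The bound needs no model of the uncertainty itself, only held-out calibration data, which is what makes the resulting guarantee agnostic to the predictor.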

BreastSegNet: Multi-label Segmentation of Breast MRI

Authors:Qihang Li, Jichen Yang, Yaqian Chen, Yuwen Chen, Hanxue Gu, Lars J. Grimm, Maciej A. Mazurowski

Date:2025-07-18 02:16:00

Breast MRI provides high-resolution imaging critical for breast cancer screening and preoperative staging. However, existing segmentation methods for breast MRI remain limited in scope, often focusing on only a few anatomical structures, such as fibroglandular tissue or tumors, and do not cover the full range of tissues seen in scans. This narrows their utility for quantitative analysis. In this study, we present BreastSegNet, a multi-label segmentation algorithm for breast MRI that covers nine anatomical labels: fibroglandular tissue (FGT), vessel, muscle, bone, lesion, lymph node, heart, liver, and implant. We manually annotated a large set of 1123 MRI slices capturing these structures with detailed review and correction from an expert radiologist. Additionally, we benchmark nine segmentation models, including U-Net, SwinUNet, UNet++, SAM, MedSAM, and nnU-Net with multiple ResNet-based encoders. Among them, nnU-Net ResEncM achieves the highest average Dice score of 0.694 across all labels. It performs especially well on heart, liver, muscle, FGT, and bone, with Dice scores exceeding 0.73, and approaching 0.90 for heart and liver. All model code and weights are publicly available, and we plan to release the data at a later date.

Time Series Forecastability Measures

Authors:Rui Wang, Steven Klee, Alexis Roos

Date:2025-07-17 22:23:51

This paper proposes using two metrics to quantify the forecastability of time series prior to model development: the spectral predictability score and the largest Lyapunov exponent. Unlike traditional model evaluation metrics, these measures assess the inherent forecastability characteristics of the data before any forecast attempts. The spectral predictability score evaluates the strength and regularity of frequency components in the time series, whereas the Lyapunov exponents quantify the chaos and stability of the system generating the data. We evaluated the effectiveness of these metrics on both synthetic and real-world time series from the M5 forecast competition dataset. Our results demonstrate that these two metrics can correctly reflect the inherent forecastability of a time series and have a strong correlation with the actual forecast performance of various models. By understanding the inherent forecastability of time series before model training, practitioners can focus their planning efforts on products and supply chain levels that are more forecastable, while setting appropriate expectations or seeking alternative strategies for products with limited forecastability.
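One way to picture a spectral predictability score is as one minus the normalized entropy of the power spectrum: a series whose power sits in a few frequencies scores near 1, while a flat, noise-like spectrum scores near 0. The sketch below uses that formulation as an assumption, since the abstract does not give the paper's exact definition, and it uses a plain DFT from the standard library for self-containment.

```python
import cmath
import math


def power_spectrum(series):
    """Periodogram of the mean-removed series at positive frequencies."""
    n = len(series)
    mean = sum(series) / n
    centered = [value - mean for value in series]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            centered[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_predictability(series):
    """1 - normalized spectral entropy (assumed definition); needs >= 4 samples."""
    if len(series) < 4:
        raise ValueError("need at least 4 samples")
    spectrum = power_spectrum(series)
    total = sum(spectrum)
    if total == 0:
        return 1.0  # a constant series is trivially forecastable
    probs = [p / total for p in spectrum if p > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(len(spectrum))
```

A pure sinusoid scores close to 1 under this definition and white noise scores much lower, which matches the intuition that regular frequency content makes a series easier to forecast.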

Heatwave-driven air conditioning adoption could increase German electricity demand by 14 GW in the near future

Authors:Leo Semmelmann, Frederik vom Scheidt

Date:2025-07-17 20:55:23

Intensifying heatwaves driven by climate change are accelerating the adoption of mobile air conditioning (AC) systems. A rapid mass adoption of such AC systems could create additional stress on electricity grids and the power system. This study presents a novel method to estimate the electricity demand from AC systems both at system level and at high temporal and spatial granularity. We apply the method to a near-future heatwave scenario in Germany in which household AC adoption increases from the current 19% to 35% during a heatwave similar to the one of July 2025. We analyze the effects for 196,428 grid cells of one square kilometer across Germany by combining weather data, census data, socio-demographic assumptions, mobility patterns, and temperature-dependent AC activation functions. We find that electricity demand of newly purchased mobile AC systems could increase the peak load by over 14 GW (23%), with urban hot-spots reaching 5.8 MW per square kilometer. The temporal pattern creates a pronounced afternoon peak that coincides with lower photovoltaic generation, potentially exacerbating power system stability challenges. Our findings underscore the urgency for proactive energy system planning to manage emerging demand peaks.

On branching points in the Gilbert-Steiner problem

Authors:Danila Cherkashin

Date:2025-07-17 20:48:29

The Gilbert--Steiner problem is a generalization of the Steiner tree problem and of a specific optimal mass transportation problem, which allows the use of additional (branching) points in a transport plan. A specific feature of the problem is that the cost of transporting a mass $m$ along a segment of length $l$ is equal to $l \times m^p$ for a fixed $0 < p < 1$, and segments may end at points not belonging to the supports of the given measures (branching points). The main result of this paper determines all pairs $(p,d)$ for which the Gilbert--Steiner problem in $\mathbb{R}^d$ admits only branching points of degree 3. Namely, this happens if and only if $d = 2$ or $p < 1/2$.
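A small numeric sketch shows why branching points can lower the cost $l \times m^p$: since $m^p$ is concave for $0 < p < 1$, merging two flows into one trunk is cheaper per unit length than carrying them separately. The coordinates, masses, and the choice $p = 1/2$ below are illustrative only, and the branching point is picked by hand rather than optimized.

```python
import math


def plan_cost(edges, p):
    """Total cost of a transport plan given as (start, end, mass) edges."""
    return sum(math.dist(start, end) * mass**p for start, end, mass in edges)


# Two sources of mass 1/2 each and one sink of mass 1 (made-up example).
SOURCE_A, SOURCE_B, SINK = (-1.0, 1.0), (1.0, 1.0), (0.0, -1.0)
BRANCH = (0.0, 0.5)  # hand-picked branching point, not the optimum

direct_plan = [(SOURCE_A, SINK, 0.5), (SOURCE_B, SINK, 0.5)]
branching_plan = [
    (SOURCE_A, BRANCH, 0.5),
    (SOURCE_B, BRANCH, 0.5),
    (BRANCH, SINK, 1.0),
]
```

With $p = 1/2$ the direct plan costs about 3.162 and the Y-shaped plan about 3.081, so even a non-optimal branching point of degree 3 already improves on sending each mass separately.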

Impact of Forecast Stability on Navigational Contrail Avoidance

Authors:Thomas R Dean, Tristan H Abbott, Zeb Engberg, Nicholas Masson, Roger Teoh, Jonathan P Itcovitz, Marc E J Stettler, Marc L Shapiro

Date:2025-07-17 20:43:20

Mitigating contrail-induced warming by re-routing flights around contrail-forming regions requires accurate and stable forecasts of the state of the upper troposphere and lower stratosphere. Forecast stability (i.e., consistency between forecast cycles with different lead times) is particularly important for "pre-tactical" contrail avoidance strategies that adjust routes based on forecasts with lead times as long as 24-48 hours. However, no study to date has systematically quantified the degree to which forecast stability limits the effectiveness of pre-tactical avoidance. This study addresses this gap by comparing contrail forecasts generated using ECMWF HRES weather forecasts with lead times up to 48 hours to contrail hindcasts generated based on ECMWF ERA5 reanalysis. An analysis of forecast errors shows low pointwise consistency between persistent-contrail-forming regions in forecasts and reanalysis, with pointwise error rates similar to those found in previous comparisons of contrail-forming regions in reanalysis and reality. However, we also show that spatial errors in the locations of contrail-forming regions are relatively small, both when forecasts are compared to reanalysis and when reanalysis is compared to in-situ measurements. Finally, we show that designing a trajectory optimizer to take advantage of relatively small spatial errors allows flight trajectory optimizations based on contrail forecasts to reduce contrail climate forcing evaluated based on reanalysis by 80-90% at the 8-24 hour lead times most relevant to flight planning, with fuel penalties under 0.4%. Our results show that forecasts with lead times relevant to flight planning are stable enough to be used for pre-tactical contrail avoidance.

Multi-Agent Synergy-Driven Iterative Visual Narrative Synthesis

Authors:Wang Xi, Quan Shi, Tian Yu, Yujie Peng, Jiayi Sun, Mengxing Ren, Zenghui Ding, Ningguang Yao

Date:2025-07-17 16:50:07

Automated generation of high-quality media presentations is challenging, requiring robust content extraction, narrative planning, visual design, and overall quality optimization. Existing methods often produce presentations with logical inconsistencies and suboptimal layouts, thereby struggling to meet professional standards. To address these challenges, we introduce RCPS (Reflective Coherent Presentation Synthesis), a novel framework integrating three key components: (1) Deep Structured Narrative Planning; (2) Adaptive Layout Generation; (3) an Iterative Optimization Loop. Additionally, we propose PREVAL, a preference-based evaluation framework employing rationale-enhanced multi-dimensional models to assess presentation quality across Content, Coherence, and Design. Experimental results demonstrate that RCPS significantly outperforms baseline methods across all quality dimensions, producing presentations that closely approximate human expert standards. PREVAL shows strong correlation with human judgments, validating it as a reliable automated tool for assessing presentation quality.

Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management

Authors:Luis Gasco, Hermenegildo Fabregat, Laura García-Sardiña, Paula Estrella, Daniel Deniz, Alvaro Rodrigo, Rabih Zbib

Date:2025-07-17 16:33:57

Advances in natural language processing and large language models are driving a major transformation in Human Capital Management, with a growing interest in building smart systems based on language technologies for talent acquisition, upskilling strategies, and workforce planning. However, the adoption and progress of these technologies critically depend on the development of reliable and fair models, properly evaluated on public data and open benchmarks, which have so far been unavailable in this domain. To address this gap, we present TalentCLEF 2025, the first evaluation campaign focused on skill and job title intelligence. The lab consists of two tasks: Task A - Multilingual Job Title Matching, covering English, Spanish, German, and Chinese; and Task B - Job Title-Based Skill Prediction, in English. Both corpora were built from real job applications, carefully anonymized, and manually annotated to reflect the complexity and diversity of real-world labor market data, including linguistic variability and gender-marked expressions. The evaluations included monolingual and cross-lingual scenarios and covered the evaluation of gender bias. TalentCLEF attracted 76 registered teams with more than 280 submissions. Most systems relied on information retrieval techniques built with multilingual encoder-based models fine-tuned with contrastive learning, and several of them incorporated large language models for data augmentation or re-ranking. The results show that the training strategies have a larger effect than the size of the model alone. TalentCLEF provides the first public benchmark in this field and encourages the development of robust, fair, and transferable language technologies for the labor market.

Signal Temporal Logic Compliant Co-design of Planning and Control

Authors:Manas Sashank Juvvi, Tushar Dilip Kurne, Vaishnavi J, Shishir Kolathaya, Pushpak Jagtap

Date:2025-07-17 15:37:24

This work presents a novel co-design strategy that integrates trajectory planning and control to handle STL-based tasks in autonomous robots. The method consists of two phases: $(i)$ learning spatio-temporal motion primitives to encapsulate the inherent robot-specific constraints and $(ii)$ constructing an STL-compliant motion plan from these primitives. Initially, we employ reinforcement learning to construct a library of control policies that perform trajectories described by the motion primitives. Then, we map motion primitives to spatio-temporal characteristics. Subsequently, we present a sampling-based STL-compliant motion planning strategy tailored to meet the STL specification. The proposed model-free approach, which generates feasible STL-compliant motion plans across various environments, is validated on differential-drive and quadruped robots across various STL specifications. Demonstration videos are available at https://tinyurl.com/m6zp7rsm.
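
As a rough illustration of what checking a motion plan against an STL specification can look like, the sketch below evaluates a simple "always avoid, eventually reach" formula on a sampled trajectory using the standard quantitative (robustness) semantics of STL. The abstract does not say how the authors evaluate specifications, so the trajectory, the regions, the tolerance, and all function names here are hypothetical.

```python
import math

# Hypothetical sampled trajectory: (time, x, y), unit speed along the x-axis.
TRAJECTORY = [(float(t), float(t), 0.0) for t in range(5)]
GOAL = (4.0, 0.0)
OBSTACLE = (2.0, 2.0)
RADIUS = 0.5  # assumed tolerance used by both predicates

def always(traj, t_lo, t_hi, margin):
    """Robustness of G_[t_lo, t_hi]: worst-case margin over the window."""
    return min(margin(x, y) for t, x, y in traj if t_lo <= t <= t_hi)

def eventually(traj, t_lo, t_hi, margin):
    """Robustness of F_[t_lo, t_hi]: best-case margin over the window."""
    return max(margin(x, y) for t, x, y in traj if t_lo <= t <= t_hi)

def avoid_margin(x, y):
    """Positive when the point is farther than RADIUS from the obstacle."""
    return math.dist((x, y), OBSTACLE) - RADIUS

def reach_margin(x, y):
    """Positive when the point is within RADIUS of the goal."""
    return RADIUS - math.dist((x, y), GOAL)

# Spec: G_[0,4](stay clear of the obstacle) AND F_[2,4](reach the goal).
robustness = min(
    always(TRAJECTORY, 0.0, 4.0, avoid_margin),
    eventually(TRAJECTORY, 2.0, 4.0, reach_margin),
)
satisfied = robustness > 0
print(f"robustness: {robustness:.3f}, satisfied: {satisfied}")
```

A positive robustness value means the sampled trajectory satisfies the specification with some margin; a sampling-based planner could use such a score to accept or reject candidate sequences of motion primitives.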

Leveraging Pre-Trained Visual Models for AI-Generated Video Detection

Authors:Keerthi Veeramachaneni, Praveen Tirupattur, Amrit Singh Bedi, Mubarak Shah

Date:2025-07-17 15:36:39

Recent advances in Generative AI (GenAI) have led to significant improvements in the quality of generated visual content. As AI-generated visual content becomes increasingly indistinguishable from real content, the challenge of detecting the generated content becomes critical in combating misinformation, ensuring privacy, and preventing security threats. Although there has been substantial progress in detecting AI-generated images, current methods for video detection are largely focused on deepfakes, which primarily involve human faces. However, the field of video generation has advanced beyond deepfakes, creating an urgent need for methods capable of detecting AI-generated videos with generic content. To address this gap, we propose a novel approach that leverages pre-trained visual models to distinguish between real and generated videos. The features extracted from these pre-trained models, which have been trained on extensive real visual content, contain inherent signals that can help distinguish real from generated videos. Using these extracted features, we achieve high detection performance without requiring additional model training, and we further improve performance by training a simple linear classification layer on top of the extracted features. We validated our method on a dataset we compiled (VID-AID), which includes around 10,000 AI-generated videos produced by 9 different text-to-video models, along with 4,000 real videos, totaling over 7 hours of video content. Our evaluation shows that our approach achieves high detection accuracy, above 90% on average, underscoring its effectiveness. Upon acceptance, we plan to publicly release the code, the pre-trained models, and our dataset to support ongoing research in this critical area.

A Computational Framework to Identify Self-Aspects in Text

Authors:Jaya Caporusso, Matthew Purver, Senja Pollak

Date:2025-07-17 13:31:04

This Ph.D. proposal introduces a plan to develop a computational framework to identify Self-aspects in text. The Self is a multifaceted construct and it is reflected in language. While it is described across disciplines like cognitive science and phenomenology, it remains underexplored in natural language processing (NLP). Many of the aspects of the Self align with psychological and other well-researched phenomena (e.g., those related to mental health), highlighting the need for systematic NLP-based analysis. In line with this, we plan to introduce an ontology of Self-aspects and a gold-standard annotated dataset. Using this foundation, we will develop and evaluate conventional discriminative models, generative large language models, and embedding-based retrieval approaches against four main criteria: interpretability, ground-truth adherence, accuracy, and computational efficiency. Top-performing models will be applied in case studies in mental health and empirical phenomenology.

Undulating patterns of Hysteresis loops in diurnal seasonality of air temperature in Urban Heat Island effect: Insights from Paris and Madrid

Authors:Suman Dharmasthala, Vittal Hari, Rohini Kumar

Date:2025-07-17 12:43:39

This study examines the dynamics of the urban heat island (UHI) effect by conducting a comparative analysis of air temperature hysteresis patterns in Paris and Madrid, two major European cities with distinct climatic and urban characteristics. Utilizing high-resolution modelled air temperature data aggregated at a fine temporal resolution of three-hour intervals from 2008 to 2017, we investigate how diurnal and seasonal hysteresis loops reveal both unique and universal aspects of UHI variability. Paris, located in a temperate oceanic climate, and Madrid, situated in a cold semi-arid zone, display pronounced differences in UHI intensity, seasonal distribution, and diurnal patterns. Despite these contrasts, both cities exhibit remarkably similar hysteresis loop directions and slopes, suggesting that time-dependent mechanisms such as solar radiation and heat storage fundamentally govern air temperature UHI across diverse urban contexts. Our findings underscore the importance of considering both local climate and universal physical processes in developing targeted, climate-resilient urban strategies. The results pave the way for group-based interventions and classification of cities by hysteresis patterns to inform urban planning and heat mitigation efforts.

Efficient Online Learning and Adaptive Planning for Robotic Information Gathering Based on Streaming Data

Authors:Sanjeev Ramkumar Sudha, Joel Jose, Erlend M. Coates

Date:2025-07-17 12:26:03

Robotic information gathering (RIG) techniques refer to methods where mobile robots are used to acquire data about the physical environment with a suite of sensors. Informative planning is an important part of RIG where the goal is to find sequences of actions or paths that maximize efficiency or the quality of information collected. Many existing solutions solve this problem by assuming that the environment is known in advance. However, real environments could be unknown or time-varying, and adaptive informative planning remains an active area of research. Adaptive planning and incremental online mapping are required for mapping initially unknown or varying spatial fields. Gaussian process (GP) regression is a widely used technique in RIG for mapping continuous spatial fields. However, it falls short in many applications as its real-time performance does not scale well to large datasets. To address these challenges, this paper proposes an efficient adaptive informative planning approach for mapping continuous scalar fields using streaming sparse GPs. Simulation experiments are performed with a synthetic dataset and compared against existing benchmarks. Finally, the method is also verified on a real-world dataset to further validate its efficacy. Results show that our method achieves similar mapping accuracy to the baselines while reducing computational complexity for longer missions.

A Translation of Probabilistic Event Calculus into Markov Decision Processes

Authors:Lyris Xu, Fabio Aurelio D'Asaro, Luke Dickens

Date:2025-07-17 10:56:22

Probabilistic Event Calculus (PEC) is a logical framework for reasoning about actions and their effects in uncertain environments, which enables the representation of probabilistic narratives and computation of temporal projections. The PEC formalism offers significant advantages in interpretability and expressiveness for narrative reasoning. However, it lacks mechanisms for goal-directed reasoning. This paper bridges this gap by developing a formal translation of PEC domains into Markov Decision Processes (MDPs), introducing the concept of "action-taking situations" to preserve PEC's flexible action semantics. The resulting PEC-MDP formalism enables the extensive collection of algorithms and theoretical tools developed for MDPs to be applied to PEC's interpretable narrative domains. We demonstrate how the translation supports both temporal reasoning tasks and objective-driven planning, with methods for mapping learned policies back into human-readable PEC representations, maintaining interpretability while extending PEC's capabilities.
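
As one example of the kind of standard MDP algorithm that becomes applicable once a domain is expressed as an MDP, here is a minimal value iteration sketch. The two-state MDP, its rewards, and the discount factor are hypothetical and are not produced by the PEC-MDP translation described in the abstract.

```python
# Toy MDP (hypothetical): P[state][action] = list of (prob, next_state, reward).
TOY_MDP = {
    0: {"stay": [(1.0, 0, 0.0)],
        "go": [(0.8, 1, 0.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 1.0)],
        "go": [(1.0, 0, 0.0)]},
}

def q_value(outcomes, V, gamma):
    """Expected discounted return of one action given current values V."""
    return sum(pr * (r + gamma * V[s2]) for pr, s2, r in outcomes)

def value_iteration(P, gamma=0.9, tol=1e-8):
    """Iterate Bellman optimality backups until the largest update is below tol."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(q_value(outcomes, V, gamma) for outcomes in P[s].values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best  # in-place update, reused by later states in this sweep
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {s: max(P[s], key=lambda a: q_value(P[s][a], V, gamma)) for s in P}
    return V, policy

V, policy = value_iteration(TOY_MDP)
print(V, policy)
```

The returned policy is a plain state-to-action mapping; in the setting of the abstract, a policy of this form is what would be mapped back into a human-readable PEC representation.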