planning - 2025-11-29

Revolutionizing Glioma Segmentation & Grading Using 3D MRI - Guided Hybrid Deep Learning Models

Authors:Pandiyaraju V, Sreya Mynampati, Abishek Karthik, Poovarasan L, D. Saraswathi

Date:2025-11-26 18:51:46

Gliomas are brain tumors with a high mortality rate, so early and accurate diagnosis is important for therapeutic intervention. To address this difficulty, the proposed research develops a hybrid deep learning model that integrates U-Net-based segmentation with a hybrid DenseNet-VGG classification network equipped with multi-head attention and spatial-channel attention. The segmentation model precisely demarcates tumors in a 3D volume of MRI data, guided by spatial and contextual information. The classification network, which combines a DenseNet branch and a VGG branch, takes the demarcated tumor as input and uses attention mechanisms to focus on clinically relevant features. Preprocessing steps (normalization, resampling, and data augmentation) allow the model to make effective use of high-dimensional 3D MRI data. The framework is evaluated with several measures: Dice coefficient and mean Intersection over Union (IoU) for segmentation, and accuracy, precision, recall, and F1-score for classification. In testing, the proposed hybrid framework achieved a Dice coefficient of 98% in tumor segmentation and a classification accuracy of 99%, outperforming traditional CNN models and attention-free methods. Multi-head attention helps prioritize the clinically significant aspects of the tumor, improving both interpretability and accuracy. The results suggest that the framework has great potential to help clinicians diagnose and grade glioma in a timely and reliable manner, allowing for better planning of patient treatment.
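
The two segmentation metrics named above can be illustrated with a minimal sketch. It assumes binary masks flattened to sequences of 0/1 values; the function name `dice_and_iou` and the toy masks are illustrative and do not come from the paper.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as tumor in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_size = sum(1 for p in pred if p)
    target_size = sum(1 for t in target if t)
    union = pred_size + target_size - intersection
    if union == 0:
        # Both masks are empty: treat as perfect agreement (a convention, not a rule).
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_size + target_size)
    iou = intersection / union
    return dice, iou


# Toy example: one overlapping voxel out of two predicted and two true voxels.
dice, iou = dice_and_iou([1, 1, 0, 0], [1, 0, 1, 0])
print(dice, iou)
```

For a 3D volume the same counts apply after flattening the voxel grid; Dice is always at least as large as IoU for the same pair of masks.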

Uncertainty Quantification for Visual Object Pose Estimation

Authors:Lorenzo Shaikewitz, Charis Georgiou, Luca Carlone

Date:2025-11-26 18:39:44

Quantifying the uncertainty of an object's pose estimate is essential for robust control and planning. Although pose estimation is a well-studied robotics problem, attaching statistically rigorous uncertainty is not well understood without strict distributional assumptions. We develop distribution-free pose uncertainty bounds about a given pose estimate in the monocular setting. Our pose uncertainty only requires high probability noise bounds on pixel detections of 2D semantic keypoints on a known object. This noise model induces an implicit, non-convex set of pose uncertainty constraints. Our key contribution is SLUE (S-Lemma Uncertainty Estimation), a convex program to reduce this set to a single ellipsoidal uncertainty bound that is guaranteed to contain the true object pose with high probability. SLUE solves a relaxation of the minimum volume bounding ellipsoid problem inspired by the celebrated S-lemma. It requires no initial guess of the bound's shape or size and is guaranteed to contain the true object pose with high probability. For tighter uncertainty bounds at the same confidence, we extend SLUE to a sum-of-squares relaxation hierarchy which is guaranteed to converge to the minimum volume ellipsoidal uncertainty bound for a given set of keypoint constraints. We show this pose uncertainty bound can easily be projected to independent translation and axis-angle orientation bounds. We evaluate SLUE on two pose estimation datasets and a real-world drone tracking scenario. Compared to prior work, SLUE generates substantially smaller translation bounds and competitive orientation bounds. We release code at https://github.com/MIT-SPARK/PoseUncertaintySets.

Learning When to Stop: Adaptive Latent Reasoning via Reinforcement Learning

Authors:Alex Ning, Yen-Ling Kuo, Gabe Gomes

Date:2025-11-26 16:54:06

Latent reasoning represents a new development in Transformer language models that has shown potential in compressing reasoning lengths compared to chain-of-thought reasoning. By directly passing the information-rich previous final latent state into the next sequence, latent reasoning removes the restriction to human language tokens as the medium for reasoning. We develop adaptive-length latent reasoning models and introduce a post-SFT reinforcement-learning methodology to optimize latent reasoning length by minimizing reasoning length while maintaining accuracy. This, in turn, further reduces compute usage and raises the bar on the compressive capabilities of latent reasoning models. Experiments on the Llama 3.2 1B model and the GSM8K-Aug dataset show a $52\%$ drop in total reasoning length with no penalty to accuracy. In future work, we plan to extend to additional models and datasets, analyze relationships between training coefficients, experiment with architecture variations, and continue our knowledge distillation for latent reasoning SFT efforts. We make our code and pretrained weights available at https://github.com/apning/adaptive-latent-reasoning.

SV-LIB 1.0: A Standard Exchange Format for Software-Verification Tasks

Authors:Dirk Beyer, Gidon Ernst, Martin Jonáš, Marian Lingsch-Rosenfeld

Date:2025-11-26 15:44:54

In the past two decades, significant research and development effort went into the development of verification tools for individual languages, such as C, C++, and Java. Many of the used verification approaches are in fact language-agnostic and it would be beneficial for the technology transfer to allow for using the implementations also for other programming and modeling languages. To address the problem, we propose SV-LIB, an exchange format and intermediate language for software-verification tasks, including programs, specifications, and verification witnesses. SV-LIB is based on well-known concepts from imperative programming languages and uses SMT-LIB to represent expressions and sorts used in the program. This makes it easy to parse and to build into existing infrastructure, since many verification tools are based on SMT solvers already. Furthermore, SV-LIB defines a witness format for both correct and incorrect SV-LIB programs, together with means for specifying witness-validation tasks. This makes it possible both to implement independent witness validators and to reuse some verifiers also as validators for witnesses. This paper presents version 1.0 of the SV-LIB format, including its design goals, the syntax, and informal semantics. Formal semantics and further extensions to concurrency are planned for future versions.

A Dynamic Anti-Equinus Orthosis with Electromyography Sensor for Neuromuscular Rehabilitation

Authors:Manuel Terradillos Perea, Olga Alonso Gonzalez, Cristina Soguero Ruiz, David Gutierrez

Date:2025-11-26 15:18:15

The equinus foot is a neuromuscular condition that affects ankle dorsiflexion, impairing gait and reducing quality of life. This study presents EquiSay, a dynamic anti-equinus orthosis equipped with an anterior elastic tension system and an electromyography (EMG) sensor to quantify muscle activation, particularly of the tibialis anterior. EquiSay provides dynamic support that improves foot posture and natural movement while enabling real-time neuromuscular monitoring. To address the limited availability of EMG data, the system incorporates a U-Net based model for generating synthetic EMG signals and a predictive framework for automatic calibration of minimum activation thresholds. Experimental results show improved dorsiflexion, increased patient satisfaction, and valuable clinical insights for rehabilitation planning. These findings highlight the potential of EquiSay as an assistive tool and as a platform for future AI-enhanced developments.

SpatialBench: Benchmarking Multimodal Large Language Models for Spatial Cognition

Authors:Peiran Xu, Sudong Wang, Yao Zhu, Jianing Li, Yunjian Zhang

Date:2025-11-26 15:04:18

Spatial cognition is fundamental to real-world multimodal intelligence, allowing models to effectively interact with the physical environment. While multimodal large language models (MLLMs) have made significant strides, existing benchmarks often oversimplify spatial cognition, reducing it to a single-dimensional metric, which fails to capture the hierarchical structure and interdependence of spatial abilities. To address this gap, we propose a hierarchical spatial cognition framework that decomposes spatial intelligence into five progressively complex levels from basic observation to high-level planning. Building upon this taxonomy, we construct SpatialBench, a large-scale, fine-grained benchmark covering 15 tasks aligned with these cognitive levels. To provide a unified evaluation across heterogeneous tasks, we further introduce a high-level capability-oriented metric that reliably assesses a model's overall spatial reasoning ability. Extensive experiments over massive MLLMs reveal distinct performance stratification across cognitive levels: models exhibit strong perceptual grounding yet remain limited in symbolic reasoning, causal inference, and planning. Additional human tests demonstrate that humans perform selective, goal-directed abstraction, while MLLMs tend to over-attend to surface details without coherent spatial intent. Our work establishes the first systematic framework for measuring hierarchical spatial cognition in MLLMs, laying the foundation for future spatially intelligent systems.

EWE: An Agentic Framework for Extreme Weather Analysis

Authors:Zhe Jiang, Jiong Wang, Xiaoyu Yue, Zijie Guo, Wenlong Zhang, Fenghua Ling, Wanli Ouyang, Lei Bai

Date:2025-11-26 14:37:25

Extreme weather events pose escalating risks to global society, underscoring the urgent need to unravel their underlying physical mechanisms. Yet the prevailing expert-driven, labor-intensive diagnostic paradigm has created a critical analytical bottleneck, stalling scientific progress. While AI for Earth Science has achieved notable advances in prediction, the equally essential challenge of automated diagnostic reasoning remains largely unexplored. We present the Extreme Weather Expert (EWE), the first intelligent agent framework dedicated to this task. EWE emulates expert workflows through knowledge-guided planning, closed-loop reasoning, and a domain-tailored meteorological toolkit. It autonomously produces and interprets multimodal visualizations from raw meteorological data, enabling comprehensive diagnostic analyses. To catalyze progress, we introduce the first benchmark for this emerging field, comprising a curated dataset of 103 high-impact events and a novel step-wise evaluation metric. EWE marks a step toward automated scientific discovery and offers the potential to democratize expertise and intellectual resources, particularly for developing nations vulnerable to extreme weather.

Understanding Regional Inertia Dynamics in CAISO from Real Grid Disturbances

Authors:Saurav Dulal, Mohammed M. Olama, Ali R. Ekti, Nils M. Stenvig, Yilu Liu

Date:2025-11-26 13:35:56

The shift from synchronous generators to inverter-based resources has caused power system inertia to be unevenly distributed across power grids. As a result, certain grid regions are more vulnerable to high rate-of-change of frequency (RoCoF) during disturbances. This paper presents a measurement-based framework for estimating grid inertia in the CAISO (California Independent System Operator) region using real disturbance-driven frequency data from the Frequency Monitoring Network (FNET/GridEye). By analyzing confirmed disturbances from 2013 to 2024, we identify trends in regional inertia and frequency dynamics, highlighting their relationship with renewable generation and the evolving duck curve. Regional RoCoF values were up to six times higher than interconnection-wide values, coinciding with declining inertia. Recent recovery in inertia is attributed to the increased deployment of battery energy storage systems with synthetic inertia capabilities. These findings underscore the importance of regional inertia monitoring, strategic resource planning, and adaptive operational practices to ensure grid reliability amid growing renewable integration.
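
The abstract does not state which estimator the authors use. A common starting point for measurement-based inertia estimation is the swing equation, which relates the initial RoCoF after a disturbance to the power imbalance: H = ΔP · f0 / (2 · S_base · |RoCoF|). The sketch below assumes that relation; the function name and the example numbers are hypothetical.

```python
def inertia_constant_s(delta_p_mw, rocof_hz_per_s, s_base_mva, f0_hz=60.0):
    """Estimate the inertia constant H (seconds) from the swing equation.

    delta_p_mw      size of the generation/load imbalance in MW
    rocof_hz_per_s  initial rate of change of frequency in Hz/s
    s_base_mva      rated apparent power of the area in MVA
    f0_hz           nominal frequency (60 Hz in North America)
    """
    if rocof_hz_per_s == 0:
        raise ValueError("RoCoF must be non-zero to estimate inertia")
    return abs(delta_p_mw) * f0_hz / (2.0 * s_base_mva * abs(rocof_hz_per_s))


# Hypothetical event: 1000 MW generation trip, -0.1 Hz/s initial RoCoF,
# 50 000 MVA system base.
print(inertia_constant_s(1000.0, -0.1, 50_000.0))
```

For a fixed imbalance, a RoCoF six times larger corresponds to an estimated inertia six times smaller, which is why elevated regional RoCoF is read as a sign of low regional inertia.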

Stopping power monitoring during proton therapy by means of prompt gamma timing: first experimental results with a homogeneous phantom

Authors:Julius Werner, Francesco Pennazio, Piergiorgio Cerello, Elisa Fiorina, Simona Giordanengo, Felix Mas Milian, Alessio Mereghetti, Franco Mostardi, Marco Pullia, Sahar Ranjbar, Roberto Sacchi, Anna Vignati, Magdalena Rafecas, Veronica Ferrero

Date:2025-11-26 12:50:02

Proton therapy's full potential is limited by uncertainties that prevent optimal dose distribution. Monitoring techniques can reduce these uncertainties and enable adaptive treatment planning. Spatiotemporal Emission Reconstruction from Prompt-Gamma Timing (SER-PGT) is a promising method that provides insights into both particle range and stopping power, whose calculation would normally require knowledge about patient tissue properties that cannot be directly measured. We present the first experimental results using a 226.9 MeV synchrotron-proton beam impinging on a homogeneous phantom at a sub-clinical intensity (2-4 × 10^7 pps). SER-PGT uses data from a multi-detector setup: a thin and segmented Low Gain Avalanche Diode for proton detection and Lanthanum Bromide-based crystals for photon detection. The estimated stopping power profile showed an 8% ± 3% average error compared to NIST PSTAR values, and 2% ± 2% deviation relative to water at 100 MeV. Range assessment in a phantom with a 4 cm air-gap successfully identified the range shift with a 3 mm standard deviation. These results demonstrate the feasibility of using SER-PGT to recover both range and stopping power information through particle kinematics and PGT measurements.

PathMamba: A Hybrid Mamba-Transformer for Topologically Coherent Road Segmentation in Satellite Imagery

Authors:Jules Decaestecker, Nicolas Vigne

Date:2025-11-26 11:42:27

Achieving both high accuracy and topological continuity in road segmentation from satellite imagery is a critical goal for applications ranging from urban planning to disaster response. State-of-the-art methods often rely on Vision Transformers, which excel at capturing global context, yet their quadratic complexity is a significant barrier to efficient deployment, particularly for on-board processing in resource-constrained platforms. In contrast, emerging State Space Models like Mamba offer linear-time efficiency and are inherently suited to modeling long, continuous structures. We posit that these architectures have complementary strengths. To this end, we introduce PathMamba, a novel hybrid architecture that integrates Mamba's sequential modeling with the Transformer's global reasoning. Our design strategically uses Mamba blocks to trace the continuous nature of road networks, preserving topological structure, while integrating Transformer blocks to refine features with global context. This approach yields topologically superior segmentation maps without the prohibitive scaling costs of pure attention-based models. Our experiments on the DeepGlobe Road Extraction and Massachusetts Roads datasets demonstrate that PathMamba sets a new state-of-the-art. Notably, it significantly improves topological continuity, as measured by the APLS metric, setting a new benchmark while remaining computationally competitive.

Enterprise Profit Prediction Using Multiple Data Sources with Missing Values through Vertical Federated Learning

Authors:Huiyun Tang, Feifei Wang, Long Feng, Yang Li

Date:2025-11-26 11:09:59

Small and medium-sized enterprises (SMEs) play a crucial role in driving economic growth. Monitoring their financial performance and discovering relevant covariates are essential for risk assessment, business planning, and policy formulation. This paper focuses on predicting profits for SMEs. Two major challenges are faced in this study: 1) SME data are stored across different institutions, and centralized analysis is restricted due to data security concerns; 2) data from various institutions contain different levels of missing values, resulting in a complex missingness issue. To tackle these issues, we introduce an innovative approach named Vertical Federated Expectation Maximization (VFEM), designed for federated learning under a missing data scenario. We embed a new EM algorithm into VFEM to address complex missing patterns when full dataset access is infeasible. Furthermore, we establish the linear convergence rate for VFEM and a statistical inference framework, enabling assessment of covariate influence and enhancing model interpretability. Extensive simulation studies are conducted to validate its finite sample performance. Finally, we thoroughly investigate a real-life profit prediction problem for SMEs using VFEM. Our findings demonstrate that VFEM provides a promising solution for addressing data isolation and missing values, ultimately improving the understanding of SMEs' financial performance.

AVFakeBench: A Comprehensive Audio-Video Forgery Detection Benchmark for AV-LMMs

Authors:Shuhan Xia, Peipei Li, Xuannan Liu, Dongsen Zhang, Xinyu Guo, Zekun Li

Date:2025-11-26 10:33:12

The threat of Audio-Video (AV) forgery is rapidly evolving beyond human-centric deepfakes to include more diverse manipulations across complex natural scenes. However, existing benchmarks are still confined to DeepFake-based forgeries and single-granularity annotations, thus failing to capture the diversity and complexity of real-world forgery scenarios. To address this, we introduce AVFakeBench, the first comprehensive audio-video forgery detection benchmark that spans rich forgery semantics across both human and general subjects. AVFakeBench comprises 12K carefully curated audio-video questions, covering seven forgery types and four levels of annotations. To ensure high-quality and diverse forgeries, we propose a multi-stage hybrid forgery framework that integrates proprietary models for task planning with expert generative models for precise manipulation. The benchmark establishes a multi-task evaluation framework covering binary judgment, forgery type classification, forgery detail selection, and explanatory reasoning. We evaluate 11 Audio-Video Large Language Models (AV-LMMs) and 2 prevalent detection methods on AVFakeBench, demonstrating the potential of AV-LMMs as emerging forgery detectors while revealing their notable weaknesses in fine-grained perception and reasoning.

Strategic Development of a Hydrogen Supply Chain in Corsica: a Multi-criteria Analysis

Authors:Tchougoune Moustapha Mai, Mohamed Hajajji, Catherine Azzaro-Pantel, Maude Chin Choi, Christian Cristofari

Date:2025-11-26 08:28:11

A multi-objective framework for hydrogen supply chain (HSC) planning is developed for island contexts, incorporating Mixed-Integer Linear Programming (MILP) over multiple time periods. The model minimizes total system cost, greenhouse gas (GHG) emissions, and a risk index. The case study of Corsica is considered, using Geographic Information Systems (GIS) for spatial analysis and infrastructure siting. The 2050 design of the HSC is determined, including site selection, capacity sizing, and technology choices. The proposed m-TOPSIS-based multi-objective solution shows a decentralized infrastructure with a levelized cost of hydrogen of €6.55/kg and greenhouse gas emissions under 2 kgCO₂e/kg H₂. The study also integrates water availability and tourism-induced demand variation as key drivers of energy planning in insular regions.
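
To show how a TOPSIS-style method picks one compromise design from several candidates, here is a minimal sketch of classical TOPSIS. The paper uses a modified variant (m-TOPSIS) whose details are not given in the abstract, so this is the textbook procedure, not the authors' method; the candidate designs, their values, and the equal weights are invented for illustration.

```python
import math


def topsis_scores(matrix, weights, benefit):
    """Classical TOPSIS closeness scores (higher is better).

    matrix   rows = alternatives, columns = criteria values
    weights  one weight per criterion
    benefit  True if larger is better for that criterion, False for costs
    """
    n_crit = len(weights)
    # Vector-normalize each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_plus = math.dist(row, ideal)   # distance to the ideal point
        d_minus = math.dist(row, anti)   # distance to the anti-ideal point
        scores.append(d_minus / (d_plus + d_minus))
    return scores


# Hypothetical candidate designs: (cost per kg, kgCO2e per kg H2, risk index).
designs = [
    [6.5, 1.9, 0.40],
    [5.8, 3.2, 0.55],
    [7.4, 1.2, 0.30],
]
# All three criteria are to be minimized, so none is a "benefit" criterion.
scores = topsis_scores(designs, weights=[1 / 3, 1 / 3, 1 / 3], benefit=[False, False, False])
best = max(range(len(designs)), key=lambda i: scores[i])
print(scores, best)
```

The sketch assumes at least two distinct alternatives and no all-zero criterion column; otherwise the normalization or the final ratio would divide by zero.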

MIRA: Multimodal Iterative Reasoning Agent for Image Editing

Authors:Ziyun Zeng, Hang Hua, Jiebo Luo

Date:2025-11-26 06:13:32

Instruction-guided image editing offers an intuitive way for users to edit images with natural language. However, diffusion-based editing models often struggle to accurately interpret complex user instructions, especially those involving compositional relationships, contextual cues, or referring expressions, leading to edits that drift semantically or fail to reflect the intended changes. We tackle this problem by proposing MIRA (Multimodal Iterative Reasoning Agent), a lightweight, plug-and-play multimodal reasoning agent that performs editing through an iterative perception-reasoning-action loop, effectively simulating multi-turn human-model interaction processes. Instead of issuing a single prompt or static plan, MIRA predicts atomic edit instructions step by step, using visual feedback to make its decisions. Our 150K multimodal tool-use dataset, MIRA-Editing, combined with a two-stage SFT + GRPO training pipeline, enables MIRA to perform reasoning and editing over complex editing instructions. When paired with open-source image editing models such as Flux.1-Kontext, Step1X-Edit, and Qwen-Image-Edit, MIRA significantly improves both semantic consistency and perceptual quality, achieving performance comparable to or exceeding proprietary systems such as GPT-Image and Nano-Banana.

Efficient Diffusion Planning with Temporal Diffusion

Authors:Jiaming Guo, Rui Zhang, Zerun Li, Yunkai Gao, Shaohui Peng, Siming Lan, Xing Hu, Zidong Du, Xishan Zhang, Ling Li

Date:2025-11-26 04:45:00

Diffusion planning is a promising method for learning high-performance policies from offline data. To avoid the impact of discrepancies between planning and reality on performance, previous works generate new plans at each time step. However, this incurs significant computational overhead and leads to lower decision frequencies, and frequent plan switching may also affect performance. In contrast, humans might create detailed short-term plans and more general, sometimes vague, long-term plans, and adjust them over time. Inspired by this, we propose the Temporal Diffusion Planner (TDP) which improves decision efficiency by distributing the denoising steps across the time dimension. TDP begins by generating an initial plan that becomes progressively more vague over time. At each subsequent time step, rather than generating an entirely new plan, TDP updates the previous one with a small number of denoising steps. This reduces the average number of denoising steps, improving decision efficiency. Additionally, we introduce an automated replanning mechanism to prevent significant deviations between the plan and reality. Experiments on D4RL show that, compared to previous works that generate new plans every time step, TDP improves the decision-making frequency by 11-24.8 times while achieving higher or comparable performance.
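
The idea of spreading denoising steps across the time dimension can be pictured with a small bookkeeping sketch. It tracks only how many denoising steps each plan position still needs and does not run a diffusion model. The linear schedule, the function names, and the numbers are assumptions made for illustration; the abstract does not specify TDP's actual schedule.

```python
def remaining_noise_levels(horizon, total_steps):
    """Hypothetical initial plan: position 0 (the next action) is fully denoised,
    and the remaining denoising steps grow linearly toward the end of the horizon."""
    if horizon < 2:
        raise ValueError("horizon must be at least 2")
    return [round(i * total_steps / (horizon - 1)) for i in range(horizon)]


def advance(levels, steps_per_tick, total_steps):
    """Move one time step forward: drop the executed action, denoise the rest of
    the plan by a few steps, and append a fresh, fully noisy final position."""
    kept = [max(0, level - steps_per_tick) for level in levels[1:]]
    return kept + [total_steps]


TOTAL_STEPS = 20      # denoising steps needed to produce a plan from pure noise
STEPS_PER_TICK = 5    # denoising steps spent at each control time step

levels = remaining_noise_levels(horizon=5, total_steps=TOTAL_STEPS)
print(levels)                                        # near-term positions are cleaner
print(advance(levels, STEPS_PER_TICK, TOTAL_STEPS))  # the profile is preserved

# Denoising steps per decision: regenerating the plan vs. updating it.
print(TOTAL_STEPS / STEPS_PER_TICK)
```

In this toy setting, updating the plan costs 5 denoising steps per decision instead of 20. The ratio is a property of the made-up numbers and is unrelated to the 11-24.8 times improvement reported on D4RL.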

AerialMind: Towards Referring Multi-Object Tracking in UAV Scenarios

Authors:Chenglizhao Chen, Shaofeng Liang, Runwei Guan, Xiaolou Sun, Haocheng Zhao, Haiyun Jiang, Tao Huang, Henghui Ding, Qing-Long Han

Date:2025-11-26 04:44:27

Referring Multi-Object Tracking (RMOT) aims to achieve precise object detection and tracking through natural language instructions, representing a fundamental capability for intelligent robotic systems. However, current RMOT research remains mostly confined to ground-level scenarios, which constrains their ability to capture broad-scale scene contexts and perform comprehensive tracking and path planning. In contrast, Unmanned Aerial Vehicles (UAVs) leverage their expansive aerial perspectives and superior maneuverability to enable wide-area surveillance. Moreover, UAVs have emerged as critical platforms for Embodied Intelligence, which has given rise to an unprecedented demand for intelligent aerial systems capable of natural language interaction. To this end, we introduce AerialMind, the first large-scale RMOT benchmark in UAV scenarios, which aims to bridge this research gap. To facilitate its construction, we develop an innovative semi-automated collaborative agent-based labeling assistant (COALA) framework that significantly reduces labor costs while maintaining annotation quality. Furthermore, we propose HawkEyeTrack (HETrack), a novel method that collaboratively enhances vision-language representation learning and improves the perception of UAV scenarios. Comprehensive experiments validated the challenging nature of our dataset and the effectiveness of our method.

Probabilistic Wildfire Spread Prediction Using an Autoregressive Conditional Generative Adversarial Network

Authors:Taehoon Kang, Taeyong Kim

Date:2025-11-26 03:32:54

Climate change has intensified the frequency and severity of wildfires, making rapid and accurate prediction of fire spread essential for effective mitigation and response. Physics-based simulators such as FARSITE offer high-fidelity predictions but are computationally intensive, limiting their applicability in real-time decision-making, while existing deep learning models often yield overly smooth predictions that fail to capture the complex, nonlinear dynamics of wildfire propagation. This study proposes an autoregressive conditional generative adversarial network (CGAN) for probabilistic wildfire spread prediction. By formulating the prediction task as an autoregressive problem, the model learns sequential state transitions, ensuring long-term prediction stability. Experimental results demonstrate that the proposed CGAN-based model outperforms conventional deep learning models in both overall predictive accuracy and boundary delineation of fire perimeters. These results demonstrate that adversarial learning allows the model to capture the strong nonlinearity and uncertainty of wildfire spread, instead of simply fitting the pixel average. Furthermore, the autoregressive framework facilitates systematic temporal forecasting of wildfire evolution. The proposed CGAN-based autoregressive framework enhances both the accuracy and physical interpretability of wildfire spread prediction, offering a promising foundation for time-sensitive response and evacuation planning.

Virtual high voltage lab: gamified learning in a safe 3d environment

Authors:Vladyslav Pliuhin, Yevgen Tsegelnyk, Maria Sukhonos, Ihor Biletskyi, Sergiy Plankovskyy, Taras Sakhoshko

Date:2025-11-25 23:24:13

The integration of immersive technologies has transformed engineering education, particularly in high-risk disciplines like high-voltage engineering, which is essential for urban energy infrastructure. This study presents a 3D virtual laboratory developed using Unreal Engine 5 to support the High Voltage Engineering course for undergraduate students in Power Engineering, Electrical Engineering, and Electromechanics. The research investigates how an immersive, gamified virtual laboratory enhances learning outcomes, safety training, and preparedness for urban infrastructure challenges. We hypothesize that VLab-HV significantly improves student engagement, knowledge retention, practical skills, and safety awareness compared to traditional laboratories, contributing to urban energy system resilience. Through ten curriculum-aligned experiments, gamified interactions, and AI-driven pedagogical tools, VLab-HV offers a risk-free environment for mastering HV concepts. Evaluation via usability testing, engagement metrics, and surveys confirms superior learning outcomes. The study highlights the role of VLab-HV in training engineers and professionals for urban energy challenges, with planned expansions for multiplayer and virtual reality integration.

Selecting Belief-State Approximations in Simulators with Latent States

Authors:Nan Jiang

Date:2025-11-25 21:34:01

State resetting is a fundamental but often overlooked capability of simulators. It supports sample-based planning by allowing resets to previously encountered simulation states, and enables calibration of simulators using real data by resetting to states observed in real-system traces. While often taken for granted, state resetting in complex simulators can be nontrivial: when the simulator comes with latent variables (states), state resetting requires sampling from the posterior over the latent state given the observable history, a.k.a. the belief state (Silver and Veness, 2010). While exact sampling is often infeasible, many approximate belief-state samplers can be constructed, raising the question of how to select among them using only sampling access to the simulator. In this paper, we show that this problem reduces to a general conditional distribution-selection task and develop a new algorithm and analysis under sampling-only access. Building on this reduction, the belief-state selection problem admits two different formulations: latent state-based selection, which directly targets the conditional distribution of the latent state, and observation-based selection, which targets the induced distribution over the observation. Interestingly, these formulations differ in how their guarantees interact with the downstream roll-out methods: perhaps surprisingly, observation-based selection may fail under the most natural roll-out method (which we call Single-Reset) but enjoys guarantees under the less conventional alternative (which we call Repeated-Reset). Together with discussion on issues such as distribution shift and the choice of sampling policies, our paper reveals a rich landscape of algorithmic choices, theoretical nuances, and open questions, in this seemingly simple problem.

RefTr: Recurrent Refinement of Confluent Trajectories for 3D Vascular Tree Centerline Graphs

Authors:Roman Naeem, David Hagerman, Jennifer Alvén, Fredrik Kahl

Date:2025-11-25 20:22:57

Tubular trees, such as blood vessels and lung airways, are essential for material transport within the human body. Accurately detecting their centerlines with correct tree topology is critical for clinical tasks such as diagnosis, treatment planning, and surgical navigation. In these applications, maintaining high recall is crucial, as missing small branches can result in fatal mistakes caused by incomplete assessments or undetected abnormalities. We present RefTr, a 3D image-to-graph model for centerline generation of vascular trees via recurrent refinement of confluent trajectories. RefTr uses a Producer-Refiner architecture based on a Transformer decoder, where the Producer proposes a set of initial confluent trajectories that are recurrently refined by the Refiner to produce final trajectories, which form the centerline graph. The confluent trajectory representation enables refinement of complete trajectories while explicitly enforcing a valid tree topology. The recurrent refinement scheme improves precision and reuses the same Refiner block across multiple steps, yielding a 2.4x reduction in decoder parameters compared to previous SOTA. We also introduce an efficient non-maximum suppression algorithm for spatial tree graphs to merge duplicate branches and boost precision. Across multiple public centerline datasets, RefTr achieves superior recall and comparable precision to previous SOTA, while offering faster inference and substantially fewer parameters, demonstrating its potential as a new state-of-the-art framework for vascular tree analysis in 3D medical imaging.

Image2Gcode: Image-to-G-code Generation for Additive Manufacturing Using Diffusion-Transformer Model

Authors:Ziyue Wang, Yayati Jadhav, Peter Pak, Amir Barati Farimani

Date:2025-11-25 18:55:12

Mechanical design and manufacturing workflows conventionally begin with conceptual design, followed by the creation of a computer-aided design (CAD) model and fabrication through material-extrusion (MEX) printing. This process requires converting CAD geometry into machine-readable G-code through slicing and path planning. While each step is well established, dependence on CAD modeling remains a major bottleneck: constructing object-specific 3D geometry is slow and poorly suited to rapid prototyping. Even minor design variations typically necessitate manual updates in CAD software, making iteration time-consuming and difficult to scale. To address this limitation, we introduce Image2Gcode, an end-to-end data-driven framework that bypasses the CAD stage and generates printer-ready G-code directly from images and part drawings. Instead of relying on an explicit 3D model, a hand-drawn or captured 2D image serves as the sole input. The framework first extracts slice-wise structural cues from the image and then employs a denoising diffusion probabilistic model (DDPM) over G-code sequences. Through iterative denoising, the model transforms Gaussian noise into executable print-move trajectories with corresponding extrusion parameters, establishing a direct mapping from visual input to native toolpaths. By producing structured G-code directly from 2D imagery, Image2Gcode eliminates the need for CAD or STL intermediates, lowering the entry barrier for additive manufacturing and accelerating the design-to-fabrication cycle. This approach supports on-demand prototyping from simple sketches or visual references and integrates with upstream 2D-to-3D reconstruction modules to enable an automated pipeline from concept to physical artifact. The result is a flexible, computationally efficient framework that advances accessibility in design iteration, repair workflows, and distributed manufacturing.

Safe and Stable Neural Network Dynamical Systems for Robot Motion Planning

Authors:Allen Emmanuel Binny, Mahathi Anand, Hugo T. M. Kussaba, Lingyun Chen, Shreenabh Agrawal, Fares J. Abu-Dakka, Abdalla Swikir

Date:2025-11-25 18:24:11

Learning safe and stable robot motions from demonstrations remains a challenge, especially in complex, nonlinear tasks involving dynamic, obstacle-rich environments. In this paper, we propose Safe and Stable Neural Network Dynamical Systems (S$^2$-NNDS), a learning-from-demonstration framework that simultaneously learns expressive neural dynamical systems alongside neural Lyapunov stability and barrier safety certificates. Unlike traditional approaches with restrictive polynomial parameterizations, S$^2$-NNDS leverages neural networks to capture complex robot motions while providing probabilistic guarantees on the learned certificates through split conformal prediction. Experimental results on various 2D and 3D datasets, including LASA handwriting and demonstrations recorded kinesthetically from the Franka Emika Panda robot, validate the effectiveness of S$^2$-NNDS in learning robust, safe, and stable motions from potentially unsafe demonstrations.
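
As a rough illustration of the split conformal prediction step mentioned in the abstract above, the sketch below computes the standard finite-sample conformal threshold from a held-out calibration set. It is a generic textbook construction, not the S$^2$-NNDS implementation; the function name, the idea of a per-sample "violation score", and the toy numbers are assumptions made purely for illustration.

```python
import math


def split_conformal_threshold(calibration_scores, alpha):
    """Return the split conformal threshold for miscoverage level alpha.

    calibration_scores: nonconformity scores on held-out calibration data
        (hypothetical here, e.g. how strongly a learned certificate
        condition is violated at each calibration sample).
    alpha: target miscoverage level, 0 < alpha < 1.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    n = len(calibration_scores)
    if n == 0:
        raise ValueError("need at least one calibration score")
    # Rank of the order statistic used by split conformal prediction.
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        # Too few calibration samples for this alpha: no finite threshold.
        return math.inf
    return sorted(calibration_scores)[k - 1]


# Toy calibration scores (made up for illustration).
scores = [0.3, 0.1, 0.2]
threshold = split_conformal_threshold(scores, alpha=0.5)
```

Under exchangeability of calibration and test samples, a fresh score falls at or below the returned threshold with probability at least 1 - alpha, which is the kind of probabilistic statement split conformal prediction provides.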

EnergyTwin: A Multi-Agent System for Simulating and Coordinating Energy Microgrids

Authors:Jakub Muszyński, Ignacy Walużenicz, Patryk Zan, Zofia Wrona, Maria Ganzha, Marcin Paprzycki, Costin Bădică

Date:2025-11-25 18:19:40

Microgrids are deployed to reduce purchased grid energy, limit exposure to volatile tariffs, and ensure service continuity during disturbances. This requires coordinating heterogeneous distributed energy resources across multiple time scales and under variable conditions. Among existing tools, power-system simulators typically capture physical behaviour but assume centralized control, while multi-agent frameworks model decentralized decision-making but represent energy with no physical grounding. In this context, EnergyTwin is introduced: an agent-based microgrid simulation environment that couples physically grounded models with forecast-informed, rolling-horizon planning and negotiations. Each asset is modeled as an agent that interacts with a central agent, which obtains forecasts, formulates predictions, and allocates energy through contract-based interactions. EnergyTwin targets tertiary-layer decision making and is extensible for digital-twin use. Its feasibility was evaluated in a university campus microgrid scenario in which multiple planning strategies were compared. The results show that forecast-driven rolling-horizon planning increases local energy self-sufficiency, maintains higher battery reserves, and reduces exposure to low-resilience operating states. They also demonstrate the potential of EnergyTwin as a platform supporting research on resilient, negotiation-driven microgrids.

Gated Uncertainty-Aware Runtime Dual Invariants for Neural Signal-Controlled Robotics

Authors:Tasha Kim, Oiwi Parker Jones

Date:2025-11-25 18:05:05

Safety-critical assistive systems that directly decode user intent from neural signals require rigorous guarantees of reliability and trust. We present GUARDIAN (Gated Uncertainty-Aware Runtime Dual Invariants), a framework for real-time neuro-symbolic verification for neural signal-controlled robotics. GUARDIAN enforces both logical safety and physiological trust by coupling confidence-calibrated brain signal decoding with symbolic goal grounding and dual-layer runtime monitoring. On the BNCI2014 motor imagery electroencephalogram (EEG) dataset with 9 subjects and 5,184 trials, the system achieves a high safety rate of 94-97% even with lightweight decoder architectures that have low test accuracies (27-46%) and high confidence miscalibration (expected calibration error, ECE, of 0.22-0.41). We demonstrate 1.7x as many correct interventions in simulated noise testing as at baseline. The monitor operates at 100 Hz with sub-millisecond decision latency, making it practically viable for closed-loop neural signal-based systems. Across 21 ablation results, GUARDIAN exhibits a graduated response to signal degradation and produces auditable traces from intent to plan to action, helping to link neural evidence to verifiable robot action.
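
For readers unfamiliar with the ECE figure quoted in the abstract above, the following is a minimal sketch of the usual binned estimate of expected calibration error. It assumes equal-width confidence bins and top-label confidences; the function name, the bin count, and the toy data are illustrative assumptions, and nothing here is taken from the GUARDIAN code.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - mean confidence| per bin.

    confidences: predicted top-label probabilities in [0, 1].
    correct: booleans, True where the prediction matched the label.
    n_bins: number of equal-width bins over [0, 1] (an assumed default).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have equal length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # Half-open bins [a, b); a confidence of exactly 1.0 joins the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


# Toy example: four predictions at 0.9 confidence, three of them correct.
toy_ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9],
                                     [True, True, True, False])
```

In the toy example the single occupied bin has accuracy 0.75 against a mean confidence of 0.9, so the estimate is about 0.15; a value of 0 would indicate that stated confidence matches observed accuracy in every bin.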

Causal Feature Selection for Weather-Driven Residential Load Forecasting

Authors:Elise Zhang, François Mirallès, Stéphane Dellacherie, Di Wu, Benoit Boulet

Date:2025-11-25 17:17:31

Weather is a dominant external driver of residential electricity demand, but adding many meteorological covariates can inflate model complexity and may even impair accuracy. Selecting appropriate exogenous features is non-trivial and calls for a principled selection framework, given the direct operational implications for day-to-day planning and reliability. This work investigates whether causal feature selection can retain the most informative weather drivers while improving parsimony and robustness for short-term load forecasting. We present a case study on Southern Ontario with two open-source datasets: (i) IESO hourly electricity consumption by Forward Sortation Areas; (ii) ERA5 weather reanalysis data. We compare different feature selection regimes (no feature selection, non-causal selection, PCMCI-causal selection) on city-level forecasting with three different time series forecasting models: GRU, TCN, PatchTST. In the feature analysis, non-causal selection prioritizes radiation and moisture variables that show correlational dependence, whereas PCMCI-causal selection emphasizes more direct thermal drivers and prunes the indirect covariates. We detail the evaluation pipeline and report diagnostics on prediction accuracy and extreme-weather robustness, positioning causal feature selection as a practical complement to modern forecasters when integrating weather into residential load forecasting.

InferF: Declarative Factorization of AI/ML Inferences over Joins

Authors:Kanchan Chowdhury, Lixi Zhou, Lulu Xie, Xinwei Fu, Jia Zou

Date:2025-11-25 16:55:43

Real-world AI/ML workflows often apply inference computations to feature vectors joined from multiple datasets. To avoid the redundant AI/ML computations caused by repeated data records in the join's output, factorized ML has been proposed to decompose ML computations into sub-computations to be executed on each normalized dataset. However, there is insufficient discussion on how factorized ML could impact AI/ML inference over multi-way joins. To address this gap, we propose InferF, a novel declarative system that focuses on the factorization of arbitrary inference workflows represented as analyzable expressions over multi-way joins. We formalize the problem of flexibly pushing down partial factorized computations to qualified nodes in the join tree so as to minimize the overall inference computation and join costs, and propose two algorithms to solve it: (1) a greedy algorithm based on a per-node cost function that estimates the influence on overall latency if a subset of factorized computations is pushed to a node, and (2) a genetic algorithm for iteratively enumerating and evaluating promising factorization plans. We implement InferF on Velox, an open-source database engine from Meta, evaluate it on real-world datasets, observe speedups of up to 11.3x, and systematically summarize the factors that determine when factorized ML can benefit AI/ML inference workflows.
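
The core idea of factorized inference described in the abstract above can be sketched for the simplest case, a linear model over a two-table join. This is only an illustration of the general principle (a dot product over concatenated features splits into per-table partial dot products), not InferF's optimizer or its Velox integration; the table layout, function names, and numbers are hypothetical.

```python
def dot(weights, features):
    """Plain dot product of two equal-length sequences."""
    return sum(w * x for w, x in zip(weights, features))


def naive_inference(orders, customers, w_order, w_customer, bias):
    """Materialize each joined feature vector, then score it."""
    scores = []
    for order_features, customer_id in orders:
        joined = order_features + customers[customer_id]
        scores.append(dot(w_order + w_customer, joined) + bias)
    return scores


def factorized_inference(orders, customers, w_order, w_customer, bias):
    """Score each customer row once, then reuse it for every matching order."""
    partial = {cid: dot(w_customer, feats) for cid, feats in customers.items()}
    return [dot(w_order, feats) + partial[cid] + bias for feats, cid in orders]


# Hypothetical normalized tables: customers keyed by id, orders referencing them.
customers = {1: [1.0, 2.0], 2: [3.0, 4.0]}
orders = [([1.0], 1), ([2.0], 1), ([3.0], 2)]
w_order, w_customer, bias = [2.0], [1.0, -1.0], 0.5

naive_scores = naive_inference(orders, customers, w_order, w_customer, bias)
factorized_scores = factorized_inference(orders, customers, w_order, w_customer, bias)
```

Both functions return the same scores, but the factorized version computes the customer-side partial result once per customer rather than once per joined row, which is where the savings come from when many rows share a join key.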

BRIC: Bridging Kinematic Plans and Physical Control at Test Time

Authors:Dohun Lim, Minji Kim, Jaewoon Lim, Sungchan Kim

Date:2025-11-25 16:03:38

We propose BRIC, a novel test-time adaptation (TTA) framework that enables long-term human motion generation by resolving execution discrepancies between diffusion-based kinematic motion planners and reinforcement learning-based physics controllers. While diffusion models can generate diverse and expressive motions conditioned on text and scene context, they often produce physically implausible outputs, leading to execution drift during simulation. To address this, BRIC dynamically adapts the physics controller to noisy motion plans at test time, while preserving pre-trained skills via a loss function that mitigates catastrophic forgetting. In addition, BRIC introduces a lightweight test-time guidance mechanism that steers the diffusion model in the signal space without updating its parameters. By combining both adaptation strategies, BRIC ensures consistent and physically plausible long-term executions across diverse environments in an effective and efficient manner. We validate the effectiveness of BRIC on a variety of long-term tasks, including motion composition, obstacle avoidance, and human-scene interaction, achieving state-of-the-art performance across all tasks.

Optimization of the X-Arapuca Photon Collection Efficiency for the DUNE Horizontal Drift Far Detector

Authors:E. Bertolini, C. Brizzolari, F. Bruni, P. Carniti, C. M. Cattadori, S. Copello, E. Cristaldo, M. Delgado, F. Galizzi, C. Gotti, D. Guffanti, A. A. Machado, L. Malinverni, L. Meazza, F. Meinardi, G. Pessina, G. Raselli, M. Rossella, E. Segreto, H. Souza, F. Terranova, D. Warner

Date:2025-11-25 15:27:39

The Deep Underground Neutrino Experiment (DUNE) Far Detector (FD) Photon Detection System (PDS) employs the X-Arapuca concept, a photon trapping system relying on reflective surfaces and dichroic filters. This paper reports measurements, performed at the University of Milano-Bicocca, aimed at increasing the efficiency of the FD Horizontal Drift (HD) PDS module. The baseline implementation of the X-Arapuca concept for the FD-HD PDS module is close to the DUNE requirements, as demonstrated in the collaboration's laboratory testing. However, increased performance would provide a safety margin for a detector planned to operate for 30 years without the possibility of maintenance. Higher detector performance would also benefit the DUNE low-energy physics program. The already proven Milano-Bicocca setup was used to test different PDS module configurations and compare them to the original baseline. By exploiting prior knowledge of the X-Arapuca components and Geant4-based optical simulations, a performance increase of up to ~84% over the baseline design was achieved. The paper presents the testing procedure, the measurements performed, and a brief discussion of the results.

Improved adaptive wind driven optimization algorithm for real-time path planning

Authors:Shiqian Liu, Azlan Mohd Zain, Le-le Mao

Date:2025-11-25 15:19:45

Recently, path planning has achieved remarkable progress in enhancing global search capability and convergence accuracy through heuristic and learning-inspired optimization frameworks. However, real-time adaptability in dynamic environments remains a critical challenge for autonomous navigation, particularly when robots must generate collision-free, smooth, and efficient trajectories under complex constraints. An analysis of the difficulties in dynamic path planning points to the Wind Driven Optimization (WDO) algorithm as a promising framework owing to its physically interpretable search dynamics. Motivated by these observations, this work revisits the WDO principle and proposes a variant formulation, Multi-hierarchical adaptive wind driven optimization (MAWDO), that improves adaptability and robustness in time-varying environments. To mitigate instability and premature convergence, a hierarchical-guidance mechanism divides the population into multiple groups guided by individual, regional, and global leaders to balance exploration and exploitation. Extensive evaluations on sixteen benchmark functions show that MAWDO achieves superior optimization accuracy, convergence stability, and adaptability over state-of-the-art metaheuristics. In dynamic path planning, MAWDO shortens the path length to 469.28 pixels, improving over Multi-strategy ensemble wind driven optimization (MEWDO), Adaptive wind driven optimization (AWDO), and WDO by 3.51%, 11.63%, and 14.93%, respectively. It also achieves the smallest optimality gap (1.01), with a smoothness of 0.71 versus 13.50 and 15.67 for AWDO and WDO. The resulting trajectories are smoother, shorter, and collision-free, confirming its effectiveness for real-time path planning in complex environments.

Accelerating Time-Optimal Trajectory Planning for Connected and Automated Vehicles with Graph Neural Networks

Authors:Viet-Anh Le, Andreas A. Malikopoulos

Date:2025-11-25 15:05:49

In this paper, we present a learning-based framework that accelerates time- and energy-optimal trajectory planning for connected and automated vehicles (CAVs) using graph neural networks (GNNs). We formulate the multi-agent coordination problem encountered in traffic scenarios as a cooperative trajectory planning problem that minimizes travel time, subject to motion primitives derived from energy-optimal solutions. The effectiveness of this framework can be further improved through replanning at each time step, enabling the system to incorporate newly observed information. To achieve real-time execution of such a multi-agent replanning scheme, we employ a GNN architecture to learn the solutions of the time-optimal trajectory planning problem from offline-generated data. The trained model produces online predictions that serve as warm-start solutions for numerical optimization, thereby enabling rapid computation of minimal exit times and the associated feasible trajectories. This learning-augmented approach substantially reduces computation time while ensuring that all state, input, and safety constraints are satisfied.