planning - 2025-12-09

Efficient and Compliant Control Framework for Versatile Human-Humanoid Collaborative Transportation

Authors:Shubham S. Kumbhar, Abhijeet M. Kulkarni, Panagiotis Artemiadis

Date:2025-12-08 18:52:10

We present a control framework that enables humanoid robots to perform collaborative transportation tasks with a human partner. The framework supports both translational and rotational motions, which are fundamental to co-transport scenarios. It comprises three components: a high-level planner, a low-level controller, and a stiffness modulation mechanism. At the planning level, we introduce the Interaction Linear Inverted Pendulum (I-LIP), which, combined with an admittance model and an MPC formulation, generates dynamically feasible footstep plans. These are executed by a QP-based whole-body controller that accounts for the coupled humanoid-object dynamics. Stiffness modulation regulates robot-object interaction, ensuring convergence to the desired relative configuration defined by the distance between the object and the robot's center of mass. We validate the effectiveness of the framework through real-world experiments conducted on the Digit humanoid platform. To quantify collaboration quality, we propose an efficiency metric that captures both task performance and inter-agent coordination. We show that this metric highlights the role of compliance in collaborative tasks and offers insights into desirable trajectory characteristics across both high- and low-level control layers. Finally, we showcase experimental results on collaborative behaviors, including translation, turning, and combined motions such as semicircular trajectories, representative of naturally occurring co-transportation tasks.

Optimal Auction Design under Costly Learning

Authors:Kemal Ozbek

Date:2025-12-08 18:29:32

We study optimal auction design in an independent private values environment where bidders can endogenously -- but at a cost -- improve information about their own valuations. The optimal mechanism is two-stage: at stage-1 bidders register an information acquisition plan and pay a transfer; at stage-2 they bid, and allocation and payments are determined. We show that the revenue-optimal stage-2 rule is the Vickrey--Clarke--Groves (VCG) mechanism, while stage-1 transfers implement the optimal screening of types and absorb information rents consistent with incentive compatibility and participation. By committing to VCG ex post, the pre-auction information game becomes a potential game, so equilibrium information choices maximize expected welfare; the stage-1 fee schedule then transfers an optimal amount of payoff without conditioning on unverifiable cost scales. The design is robust to asymmetric primitives and accommodates a wide range of information technologies, providing a simple implementation that unifies efficiency and optimal revenue in environments with endogenous information acquisition.
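
As a point of reference for the stage-2 rule, the sketch below shows VCG in its simplest special case, assuming a single indivisible item, where it reduces to a second-price auction. The function name `vcg_single_item` and the bid values are illustrative assumptions; the stage-1 fee schedule described in the abstract is not modelled.

```python
def vcg_single_item(bids):
    """Single-item VCG (second-price) outcome for a list of bids.

    Illustrative only: assumes one indivisible item and at least two bidders.
    """
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    # The winner is the highest bidder; ties go to the lowest index.
    winner = max(range(len(bids)), key=lambda i: bids[i])
    # The payment equals the externality imposed on the others,
    # which for a single item is the best competing bid.
    payment = max(b for i, b in enumerate(bids) if i != winner)
    return winner, payment


winner, payment = vcg_single_item([3.0, 7.5, 5.0])
print(winner, payment)
```

Because the payment does not depend on the winner's own bid, truthful bidding is a dominant strategy in this special case.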

OptMap: Geometric Map Distillation via Submodular Maximization

Authors:David Thorne, Nathan Chan, Christa S. Robison, Philip R. Osteen, Brett T. Lopez

Date:2025-12-08 17:56:57

Autonomous robots rely on geometric maps to inform a diverse set of perception and decision-making algorithms. As autonomy requires reasoning and planning on multiple scales of the environment, each algorithm may require a different map for optimal performance. Light Detection And Ranging (LiDAR) sensors generate an abundance of geometric data to satisfy these diverse requirements, but selecting informative, size-constrained maps is computationally challenging as it requires solving an NP-hard combinatorial optimization. In this work we present OptMap: a geometric map distillation algorithm which achieves real-time, application-specific map generation via multiple theoretical and algorithmic innovations. A central feature is the maximization of set functions that exhibit diminishing returns, i.e., submodularity, using polynomial-time algorithms with provably near-optimal solutions. We formulate a novel submodular reward function which quantifies informativeness, reduces input set sizes, and minimizes bias in sequentially collected datasets. Further, we propose a dynamically reordered streaming submodular algorithm which improves empirical solution quality and addresses input order bias via an online approximation of the value of all scans. Testing was conducted on open-source and custom datasets with an emphasis on long-duration mapping sessions, highlighting OptMap's minimal computation requirements. Open-source ROS1 and ROS2 packages are available and can be used alongside any LiDAR SLAM algorithm.
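
To make the diminishing-returns idea concrete, here is a minimal sketch of the classical greedy algorithm for maximizing a monotone submodular function under a cardinality constraint. It is not OptMap's streaming algorithm or its reward function: the coverage objective, the scan identifiers, and the name `greedy_max_coverage` are assumptions chosen for illustration.

```python
def greedy_max_coverage(scans, budget):
    """Greedily pick up to `budget` scans that maximize the number of covered cells.

    `scans` maps a hypothetical scan id to the set of map cells it observes.
    Set coverage is monotone and submodular, so greedy selection is within
    a factor (1 - 1/e) of the optimal value.
    """
    selected, covered = [], set()
    remaining = dict(scans)
    for _ in range(budget):
        # Marginal gain of each candidate given what is already covered.
        best = max(remaining, key=lambda k: len(remaining[k] - covered), default=None)
        if best is None or not (remaining[best] - covered):
            break  # no candidate adds new coverage
        covered |= remaining.pop(best)
        selected.append(best)
    return selected, covered


scans = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}
selected, covered = greedy_max_coverage(scans, budget=2)
print(selected, sorted(covered))
```

Each round re-evaluates marginal gains, which is what a streaming variant avoids by seeing each scan only once.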

DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving

Authors:Jialv Zou, Shaoyu Chen, Bencheng Liao, Zhiyu Zheng, Yuehao Song, Lefei Zhang, Qian Zhang, Wenyu Liu, Xinggang Wang

Date:2025-12-08 17:29:52

Generative diffusion models for end-to-end autonomous driving often suffer from mode collapse, tending to generate conservative and homogeneous behaviors. While DiffusionDrive employs predefined anchors representing different driving intentions to partition the action space and generate diverse trajectories, its reliance on imitation learning lacks sufficient constraints, resulting in a dilemma between diversity and consistent high quality. In this work, we propose DiffusionDriveV2, which leverages reinforcement learning to both constrain low-quality modes and explore for superior trajectories. This significantly enhances the overall output quality while preserving the inherent multimodality of its core Gaussian Mixture Model. First, we use scale-adaptive multiplicative noise, ideal for trajectory planning, to promote broad exploration. Second, we employ intra-anchor GRPO to manage advantage estimation among samples generated from a single anchor, and inter-anchor truncated GRPO to incorporate a global perspective across different anchors, preventing improper advantage comparisons between distinct intentions (e.g., turning vs. going straight), which can lead to further mode collapse. DiffusionDriveV2 achieves 91.2 PDMS on the NAVSIM v1 dataset and 85.5 EPDMS on the NAVSIM v2 dataset in closed-loop evaluation with an aligned ResNet-34 backbone, setting a new record. Further experiments validate that our approach resolves the dilemma between diversity and consistent high quality for truncated diffusion models, achieving the best trade-off. Code and model will be available at https://github.com/hustvl/DiffusionDriveV2
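
The intra-anchor idea can be sketched with the standard group-relative advantage used in GRPO, where each sample's reward is standardized against the other samples in its own group. The sketch below assumes one group per anchor; the anchor names and reward values are hypothetical, and the inter-anchor truncation step is not shown because the abstract does not specify it.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards_by_anchor, eps=1e-8):
    """Standardize rewards within each anchor's group of sampled trajectories."""
    advantages = {}
    for anchor, rewards in rewards_by_anchor.items():
        mu = mean(rewards)
        sigma = pstdev(rewards)
        # eps avoids division by zero when all samples in a group score the same.
        advantages[anchor] = [(r - mu) / (sigma + eps) for r in rewards]
    return advantages


adv = group_relative_advantages({"turn_left": [1.0, 3.0], "straight": [10.0, 10.0]})
print(adv)
```

Keeping the statistics per anchor means a mediocre turn is never penalized merely for scoring lower than an easy straight-line trajectory.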

Privacy Practices of Browser Agents

Authors:Alisha Ukani, Hamed Haddadi, Ali Shahin Shamsabadi, Peter Snyder

Date:2025-12-08 17:16:12

This paper presents a systematic evaluation of the privacy behaviors and attributes of eight recent, popular browser agents. Browser agents are software that automates Web browsing using large language models and ancillary tooling. However, the automated capabilities that make browser agents powerful also make them high-risk points of failure. Both the kinds of tasks browser agents are designed to execute and the kinds of information they are entrusted with to fulfill those tasks mean that vulnerabilities in these tools can result in enormous privacy harm. This work presents a framework of five broad factors (totaling 15 distinct measurements) to measure the privacy risks in browser agents. Our framework assesses i. vulnerabilities in the browser agent's components, ii. how the browser agent protects against website behaviors, iii. whether the browser agent prevents cross-site tracking, iv. how the agent responds to privacy-affecting prompts, and v. whether the tool leaks personal information to sites. We apply our framework to eight browser agents and identify 30 vulnerabilities, ranging from disabled browser privacy features to "autocompleting" sensitive personal information in form fields. We have responsibly disclosed our findings, and plan to release our dataset and other artifacts.

Energy-Aware Aggregation of Input Data for the Optimisation of Heat Supply of Municipal Districts

Authors:Patrik Schönfeldt, Elif Turhan

Date:2025-12-08 15:34:15

Municipal heat planning must account for a large number of buildings, typically hundreds or thousands. This poses particular challenges for model-based energy system optimization, as the number of variables increases with the number of buildings under consideration. In the worst case, the computational complexity of the models grows exponentially with the number of variables. Furthermore, in the context of the heat transition, it is often necessary to map extended periods of time (i.e., the service life of systems) with high resolution (particularly in the case of load peaks that occur at the onset of the day). In response to these challenges, input data are commonly aggregated. In general, building blocks or other geographical and urban formations, such as neighbourhoods, are combined. This article explores the potential of incorporating energy performance indicators into the grouping of buildings. The case study uses real data from the Neu-Schwachhausen district, grouped based on geographical location, building geometry, and energy performance indicators. The selected energy indicators include the annual heat consumption as well as the potential for solar energy generation. To this end, we present a methodology that considers not only the anticipated annual energy quantity, but also its progression over time. We present a full workflow from geodata to a set of techno-socio-economically Pareto-optimal heat supply options. Our findings suggest that it is beneficial to balance geographical position and energy properties when grouping buildings for use in energy system models.

Precise Liver Tumor Segmentation in CT Using a Hybrid Deep Learning-Radiomics Framework

Authors:Xuecheng Li, Weikuan Jia, Komildzhon Sharipov, Alimov Ruslan, Lutfuloev Mazbutdzhon, Ismoilov Shuhratjon, Yuanjie Zheng

Date:2025-12-08 14:09:21

Accurate three-dimensional delineation of liver tumors on contrast-enhanced CT is a prerequisite for treatment planning, navigation and response assessment, yet manual contouring is slow, observer-dependent and difficult to standardise across centres. Automatic segmentation is complicated by low lesion-parenchyma contrast, blurred or incomplete boundaries, heterogeneous enhancement patterns, and confounding structures such as vessels and adjacent organs. We propose a hybrid framework that couples an attention-enhanced cascaded U-Net with handcrafted radiomics and voxel-wise 3D CNN refinement for joint liver and liver-tumor segmentation. First, a 2.5D two-stage network with a densely connected encoder, sub-pixel convolution decoders and multi-scale attention gates produces initial liver and tumor probability maps from short stacks of axial slices. Inter-slice temporal consistency is then enforced by a simple three-slice refinement rule along the cranio-caudal direction, which restores thin and tiny lesions while suppressing isolated noise. Next, 728 radiomic descriptors spanning intensity, texture, shape, boundary and wavelet feature groups are extracted from candidate lesions and reduced to 20 stable, highly informative features via multi-strategy feature selection; a random forest classifier uses these features to reject false-positive regions. Finally, a compact 3D patch-based CNN derived from AlexNet operates in a narrow band around the tumor boundary to perform voxel-level relabelling and contour smoothing.

Toward More Reliable Artificial Intelligence: Reducing Hallucinations in Vision-Language Models

Authors:Kassoum Sanogo, Renzo Ardiccioni

Date:2025-12-08 13:58:46

Vision-language models (VLMs) frequently generate hallucinated content: plausible but incorrect claims about image content. We propose a training-free self-correction framework enabling VLMs to iteratively refine responses through uncertainty-guided visual re-attention. Our method combines multidimensional uncertainty quantification (token entropy, attention dispersion, semantic consistency, claim confidence) with attention-guided cropping of under-explored regions. Operating entirely with frozen, pretrained VLMs, our framework requires no gradient updates. We validate our approach on the POPE and MMHal-Bench benchmarks using the Qwen2.5-VL-7B [23] architecture. Experimental results demonstrate that our method reduces hallucination rates by 9.8 percentage points compared to the baseline, while improving object existence accuracy by 4.7 points on adversarial splits. Furthermore, qualitative analysis confirms that uncertainty-guided re-attention successfully grounds corrections in visual evidence where standard decoding fails. We plan to extend validation across diverse architectures in future versions. We release our code and methodology to facilitate future research in trustworthy multimodal systems.
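
Of the uncertainty signals listed, token entropy is the simplest to illustrate. The sketch below assumes it means the Shannon entropy of each next-token distribution, averaged over decoding steps; the function names and the toy distributions are illustrative, and how the paper combines this with the other three signals is not shown.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_token_entropy(step_distributions):
    """Average entropy over the decoding steps of one generated response."""
    return sum(token_entropy(p) for p in step_distributions) / len(step_distributions)


confident = [[0.97, 0.01, 0.01, 0.01]] * 3
uncertain = [[0.25, 0.25, 0.25, 0.25]] * 3
print(mean_token_entropy(confident), mean_token_entropy(uncertain))
```

A response whose steps are close to uniform scores near `log(vocabulary size)`, which is the kind of signal that could trigger a second look at the image.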

Model-Based Reinforcement Learning Under Confounding

Authors:Nishanth Venkatesh, Andreas A. Malikopoulos

Date:2025-12-08 13:02:00

We investigate model-based reinforcement learning in contextual Markov decision processes (C-MDPs) in which the context is unobserved and induces confounding in the offline dataset. In such settings, conventional model-learning methods are fundamentally inconsistent, as the transition and reward mechanisms generated under a behavioral policy do not correspond to the interventional quantities required for evaluating a state-based policy. To address this issue, we adapt a proximal off-policy evaluation approach that identifies the confounded reward expectation using only observable state-action-reward trajectories under mild invertibility conditions on proxy variables. When combined with a behavior-averaged transition model, this construction yields a surrogate MDP whose Bellman operator is well defined and consistent for state-based policies, and which integrates seamlessly with the maximum causal entropy (MaxCausalEnt) model-learning framework. The proposed formulation enables principled model learning and planning in confounded environments where contextual information is unobserved, unavailable, or impractical to collect.

From Orbit to Ground: Generative City Photogrammetry from Extreme Off-Nadir Satellite Images

Authors:Fei Yu, Yu Liu, Luyang Tang, Mingchao Sun, Zengye Ge, Rui Bu, Yuchao Jin, Haisen Zhao, He Sun, Yangyan Li, Mu Xu, Wenzheng Chen, Baoquan Chen

Date:2025-12-08 13:01:12

City-scale 3D reconstruction from satellite imagery presents the challenge of extreme viewpoint extrapolation, where our goal is to synthesize ground-level novel views from sparse orbital images with minimal parallax. This requires inferring nearly $90^\circ$ viewpoint gaps from image sources with severely foreshortened facades and flawed textures, causing state-of-the-art reconstruction engines such as NeRF and 3DGS to fail. To address this problem, we propose two design choices tailored for city structures and satellite inputs. First, we model city geometry as a 2.5D height map, implemented as a Z-monotonic signed distance field (SDF) that matches urban building layouts from top-down viewpoints. This stabilizes geometry optimization under sparse, off-nadir satellite views and yields a watertight mesh with crisp roofs and clean, vertically extruded facades. Second, we paint the mesh appearance from satellite images via differentiable rendering techniques. While the satellite inputs may contain long-range, blurry captures, we further train a generative texture restoration network to enhance the appearance, recovering high-frequency, plausible texture details from degraded inputs. Our method's scalability and robustness are demonstrated through extensive experiments on large-scale urban reconstruction. For example, in our teaser figure, we reconstruct a $4\,\mathrm{km}^2$ real-world region from only a few satellite images, achieving state-of-the-art performance in synthesizing photorealistic ground views. The resulting models are not only visually compelling but also serve as high-fidelity, application-ready assets for downstream tasks like urban planning and simulation.

Bridging CORDEX and CMIP6: Machine Learning Downscaling for Wind and Solar Energy Droughts in Central Europe

Authors:Nina Effenberger, Maxim Samarin, Maybritt Schillinger, Reto Knutti

Date:2025-12-08 11:03:15

Reliable regional climate information is essential for assessing the impacts of climate change and for planning in sectors such as renewable energy; yet, producing high-resolution projections through coordinated initiatives like CORDEX that run multiple physical regional climate models is both computationally demanding and difficult to organize. Machine learning emulators that learn the mapping between global and regional climate fields offer a promising way to address these limitations. Here we introduce the application of such an emulator: trained on CMIP5 and CORDEX simulations, it reproduces regional climate model data with sufficient accuracy. When applied to CMIP6 simulations not seen during training, it also produces realistic results, indicating stable performance. Using CORDEX data, CMIP5 and CMIP6 simulations, as well as regional data generated by two machine learning models, we analyze the co-occurrence of low wind speed and low solar radiation and find indications that the number of such energy drought days is likely to decrease in the future. Our results highlight that downscaling with machine learning emulators provides an efficient complement to efforts such as CORDEX, supplying the higher-resolution information required for impact assessments.

Off-grid solar energy storage system with hybrid lithium iron phosphate (LFP) and lead-acid batteries in high mountains: a case report of Jiujiu Cabins in Taiwan

Authors:Hsien-Ching Chung

Date:2025-12-08 09:48:38

Mountain huts are buildings located at high altitude that offer hikers a place to stay and provide shelter. Energy supply for mountain huts is still an open issue, and renewable energy could be an appropriate solution. Jiujiu Cabins, a famous mountain hut in Shei-Pa National Park, Taiwan, has operated an off-grid solar energy storage system (ESS) with lead-acid batteries. In 2021, a serious system failure took place, leaving the hut without electricity. After a detailed on-site survey and the implementation of a reorganization and repair project, the energy system returned to normal operation. Meanwhile, an eco-friendly lithium iron phosphate (LFP) battery ESS replaced part of the lead-acid battery ESS, forming a hybrid ESS and yielding a better, greener off-grid solar ESS. This case report provides the energy architecture, detailed descriptions, and historical status of the system. It also presents the on-site survey of the failed energy system, the system improvement project, and future plans.

M-STAR: Multi-Scale Spatiotemporal Autoregression for Human Mobility Modeling

Authors:Yuxiao Luo, Songming Zhang, Sijie Ruan, Siran Chen, Kang Liu, Yang Xu, Yu Zheng, Ling Yin

Date:2025-12-08 08:57:55

Modeling human mobility is vital for extensive applications such as transportation planning and epidemic modeling. With the rise of the Artificial Intelligence Generated Content (AIGC) paradigm, recent works explore synthetic trajectory generation using autoregressive and diffusion models. While these methods show promise for generating single-day trajectories, they remain limited by inefficiencies in long-term generation (e.g., weekly trajectories) and a lack of explicit spatiotemporal multi-scale modeling. This study proposes Multi-Scale Spatio-Temporal AutoRegression (M-STAR), a new framework that generates long-term trajectories through a coarse-to-fine spatiotemporal prediction process. M-STAR combines a Multi-scale Spatiotemporal Tokenizer that encodes hierarchical mobility patterns with a Transformer-based decoder for next-scale autoregressive prediction. Experiments on two real-world datasets show that M-STAR outperforms existing methods in fidelity and significantly improves generation speed. The data and code are available at https://github.com/YuxiaoLuo0013/M-STAR.

Efficient Computation of a Continuous Topological Model of the Configuration Space of Tethered Mobile Robots

Authors:Gianpietro Battocletti, Dimitris Boskos, Bart De Schutter

Date:2025-12-08 08:45:17

Despite the attention that the problem of path planning for tethered robots has garnered in the past few decades, the approaches proposed to solve it typically rely on a discrete representation of the configuration space and do not exploit a model that can simultaneously capture the topological information of the tether and the continuous location of the robot. In this work, we explicitly build a topological model of the configuration space of a tethered robot starting from a polygonal representation of the workspace where the robot moves. To do so, we first establish a link between the configuration space of the tethered robot and the universal covering space of the workspace, and then we exploit this link to develop an algorithm to compute a simplicial complex model of the configuration space. We show how this approach improves on the performance of existing algorithms that build other types of representations of the configuration space. The proposed model can be computed in a fraction of the time required to build traditional homotopy-augmented graphs, and is continuous, allowing the path planning task for tethered robots to be solved using a broad set of path planning algorithms.

Geo3DVQA: Evaluating Vision-Language Models for 3D Geospatial Reasoning from Aerial Imagery

Authors:Mai Tsujimoto, Junjue Wang, Weihao Xuan, Naoto Yokoya

Date:2025-12-08 08:16:14

Three-dimensional geospatial analysis is critical to applications in urban planning, climate adaptation, and environmental assessment. Current methodologies depend on costly, specialized sensors (e.g., LiDAR and multispectral), which restrict global accessibility. Existing sensor-based and rule-driven methods further struggle with tasks requiring the integration of multiple 3D cues, handling diverse queries, and providing interpretable reasoning. We present Geo3DVQA, a comprehensive benchmark for evaluating vision-language models (VLMs) in height-aware, 3D geospatial reasoning using RGB-only remote sensing imagery. Unlike conventional sensor-based frameworks, Geo3DVQA emphasizes realistic scenarios that integrate elevation, sky view factors, and land cover patterns. The benchmark encompasses 110k curated question-answer pairs spanning 16 task categories across three complexity levels: single-feature inference, multi-feature reasoning, and application-level spatial analysis. The evaluation of ten state-of-the-art VLMs highlights the difficulty of RGB-to-3D reasoning. GPT-4o and Gemini-2.5-Flash achieved only 28.6% and 33.0% accuracy, respectively, while domain-specific fine-tuning of Qwen2.5-VL-7B achieved 49.6% (+24.8 points). These results reveal both the limitations of current VLMs and the effectiveness of domain adaptation. Geo3DVQA introduces new challenge frontiers for scalable, accessible, and holistic 3D geospatial analysis. The dataset and code will be released upon publication at https://github.com/mm1129/Geo3DVQA.

Analysing the factors affecting electric vehicle adoption using the extended theory of planned behaviour framework

Authors:Pranshu Raghuvanshi, Anjula Gurtoo

Date:2025-12-08 05:49:23

This study uses the Theory of Planned Behaviour (TPB) framework and expands it by including Optimism, Innovativeness and Range Anxiety constructs. In this study, conducted in Lucknow, the capital of India's most populous state (Uttar Pradesh), a multistage random sampling design was employed to select 432 respondents from different city areas. The survey instruments were adapted from similar studies and modified to suit the context. Using exploratory factor analysis, 18 measurement items converged into six factors, namely attitude, subjective norms, perceived behavioural control, optimism, innovativeness and range anxiety. We confirmed the reliability and validity of the constructs using Cronbach's alpha, composite reliability, average variance extracted and discriminant validity analysis. We explored the relationships among the constructs using structural equation modelling. All factors but Optimism were found to be significantly associated with adoption intention. We further employed mediation analysis to examine the mediation pathways. The TPB components mediated the effect of innovativeness but not that of range anxiety. The study's insights can help policymakers and marketers design targeted interventions that address consumer concerns, reshape consumer perceptions, and foster greater EV adoption. The interventions can target increasing the mediating variables or decreasing range anxiety to facilitate a smoother transition to sustainable transportation.
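
For readers unfamiliar with the reliability check mentioned above, the following sketch computes Cronbach's alpha from its textbook definition, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The Likert scores are made-up illustrative data, not the study's survey responses, and `cronbach_alpha` is a hypothetical helper name.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one construct.

    `items` is a list of k lists, each holding one score per respondent
    (same respondent order in every list).
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score of each respondent across the k items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Three hypothetical 5-point Likert items answered by five respondents.
attitude_items = [[4, 5, 3, 4, 2], [4, 4, 3, 5, 2], [5, 5, 2, 4, 1]]
print(round(cronbach_alpha(attitude_items), 3))
```

Values near 1 indicate that the items of a construct move together; perfectly identical items give exactly 1.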

TrajMoE: Scene-Adaptive Trajectory Planning with Mixture of Experts and Reinforcement Learning

Authors:Zebin Xing, Pengxuan Yang, Linbo Wang, Yichen Zhang, Yiming Hu, Yupeng Zheng, Junli Wang, Yinfeng Gao, Guang Li, Kun Ma, Long Chen, Zhongpu Xia, Qichao Zhang, Hangjun Ye, Dongbin Zhao

Date:2025-12-08 03:40:10

Current autonomous driving systems often favor end-to-end frameworks, which take sensor inputs like images and learn to map them into trajectory space via neural networks. Previous work has demonstrated that models can achieve better planning performance when provided with a prior distribution of possible trajectories. However, these approaches often overlook two critical aspects: 1) The appropriate trajectory prior can vary significantly across different driving scenarios. 2) Their trajectory evaluation mechanism lacks policy-driven refinement, remaining constrained by the limitations of one-stage supervised training. To address these issues, we explore improvements in two key areas. For problem 1, we employ MoE to apply different trajectory priors tailored to different scenarios. For problem 2, we utilize Reinforcement Learning to fine-tune the trajectory scoring mechanism. Additionally, we integrate models with different perception backbones to enhance perceptual features. Our integrated model achieved a score of 51.08 on the navsim ICCV benchmark, securing third place.

Procrustean Bed for AI-Driven Retrosynthesis: A Unified Framework for Reproducible Evaluation

Authors:Anton Morgunov, Victor S. Batista

Date:2025-12-08 01:26:39

Progress in computer-aided synthesis planning (CASP) is obscured by the lack of standardized evaluation infrastructure and the reliance on metrics that prioritize topological completion over chemical validity. We introduce RetroCast, a unified evaluation suite that standardizes heterogeneous model outputs into a common schema to enable statistically rigorous, apples-to-apples comparison. The framework includes a reproducible benchmarking pipeline with stratified sampling and bootstrapped confidence intervals, accompanied by SynthArena, an interactive platform for qualitative route inspection. We utilize this infrastructure to evaluate leading search-based and sequence-based algorithms on a new suite of standardized benchmarks. Our analysis reveals a divergence between "solvability" (stock-termination rate) and route quality; high solvability scores often mask chemical invalidity or fail to correlate with the reproduction of experimental ground truths. Furthermore, we identify a "complexity cliff" in which search-based methods, despite high solvability rates, exhibit a sharp performance decay in reconstructing long-range synthetic plans compared to sequence-based approaches. We release the full framework, benchmark definitions, and a standardized database of model predictions to support transparent and reproducible development in the field.
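
The bootstrapped confidence intervals mentioned above can be sketched with a plain percentile bootstrap over per-target outcomes. This is a generic illustration, not RetroCast's pipeline: the outcome vector is made up, stratified sampling is omitted, and `bootstrap_ci` is a hypothetical helper name.

```python
import random


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical outcomes: 1 if a route was found for a target, 0 otherwise.
solved = [1] * 70 + [0] * 30
lower, upper = bootstrap_ci(solved)
print(lower, upper)
```

Reporting the interval rather than the point estimate alone makes it visible when two models' scores are statistically indistinguishable.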

DAUNet: A Lightweight UNet Variant with Deformable Convolutions and Parameter-Free Attention for Medical Image Segmentation

Authors:Adnan Munir, Shujaat Khan

Date:2025-12-07 23:57:00

Medical image segmentation plays a pivotal role in automated diagnostic and treatment planning systems. In this work, we present DAUNet, a novel lightweight UNet variant that integrates Deformable V2 Convolutions and Parameter-Free Attention (SimAM) to improve spatial adaptability and context-aware feature fusion without increasing model complexity. DAUNet's bottleneck employs dynamic deformable kernels to handle geometric variations, while the decoder and skip pathways are enhanced using SimAM attention modules for saliency-aware refinement. Extensive evaluations on two challenging datasets, FH-PS-AoP (fetal head and pubic symphysis ultrasound) and FUMPE (CT-based pulmonary embolism detection), demonstrate that DAUNet outperforms state-of-the-art models in Dice score, HD95, and ASD, while maintaining superior parameter efficiency. Ablation studies highlight the individual contributions of deformable convolutions and SimAM attention. DAUNet's robustness to missing context and low-contrast regions establishes its suitability for deployment in real-time and resource-constrained clinical environments.

A Hetero-Associative Sequential Memory Model Utilizing Neuromorphic Signals: Validated on a Mobile Manipulator

Authors:Runcong Wang, Fengyi Wang, Gordon Cheng

Date:2025-12-07 22:50:01

This paper presents a hetero-associative sequential memory system for mobile manipulators that learns compact, neuromorphic bindings between robot joint states and tactile observations to produce step-wise action decisions with low compute and memory cost. The method encodes joint angles via population place coding and converts skin-measured forces into spike-rate features using an Izhikevich neuron model; both signals are transformed into bipolar binary vectors and bound element-wise to create associations stored in a large-capacity sequential memory. To improve separability in binary space and inject geometry from touch, we introduce 3D rotary positional embeddings that rotate subspaces as a function of sensed force direction, enabling fuzzy retrieval through a softmax weighted recall over temporally shifted action patterns. On a Toyota Human Support Robot covered by robot skin, the hetero-associative sequential memory system realizes a pseudocompliance controller that moves the link under touch in the direction and with speed correlating to the amplitude of applied force, and it retrieves multi-joint grasp sequences by continuing tactile input. The system sets up quickly, trains from synchronized streams of states and observations, and exhibits a degree of generalization while remaining economical. Results demonstrate single-joint and full-arm behaviors executed via associative recall, and suggest extensions to imitation learning, motion planning, and multi-modal integration.
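
The element-wise binding of bipolar vectors can be illustrated in a few lines. The sketch assumes binding means component-wise multiplication of vectors with entries in {-1, +1}, which makes the operation its own inverse; the dimension, the random vectors, and the function names are illustrative, and the place coding, spike-rate features, and rotary embeddings are not modelled.

```python
import random


def random_bipolar(dim, rng):
    """Random vector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(dim)]


def bind(a, b):
    """Element-wise binding of two bipolar vectors."""
    return [x * y for x, y in zip(a, b)]


rng = random.Random(0)
joint_state = random_bipolar(16, rng)   # stand-in for an encoded joint state
touch = random_bipolar(16, rng)         # stand-in for an encoded tactile observation

association = bind(joint_state, touch)
# Each entry squares to 1, so binding again with one operand recovers the other.
recovered = bind(association, joint_state)
print(recovered == touch)
```

This self-inverse property is what lets a stored association be queried with one signal to retrieve the other.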

Dynamic Boolean Synthesis with Zero-suppressed Decision Diagrams

Authors:Yi Lin, Moshe Y. Vardi

Date:2025-12-07 21:57:53

Motivated by functional synthesis in sequential circuit construction and quantified Boolean formulas (QBF), Boolean synthesis is one of the core problems in Formal Methods. Recent advances show that decision diagrams (DDs) are particularly competitive in symbolic approaches to Boolean synthesis. Among them, the zero-suppressed decision diagram (ZDD) is a relatively new algorithmic approach that complements the industrial portfolio, where binary decision diagrams (BDDs) are more often applied. We propose a new dynamic-programming ZDD-based framework for Boolean synthesis, present solutions to theoretical challenges, develop a tool, and investigate its experimental performance. We also propose the idea of a magic number that serves as an upper bound on planning-phase time and treewidth, showing how to interpret the exploration-exploitation dilemma in a planning-execution synthesis framework. The proposed algorithm shows its strengths in general, motivates future work on determining industrial magic numbers, and demonstrates that the proposed framework is an appropriate addition to the portfolio of industrial synthesis solvers.

Synergies between AI Computing and Power Systems: Metrics, Scheduling, and Resilience

Authors:Farzaneh Pourahmadi, Olivier Corradi, Pierre Pinson

Date:2025-12-07 21:06:55

In this paper, we first clarify the concepts of green AI versus frugal AI, positioning frugality as efficiency by design and green AI as transparency and accountability. We then argue that these approaches, while complementary, are insufficient without a shared quantitative foundation that links AI computing to power system contexts. This motivates the development of standardized carbon metrics as a bridge between algorithmic decisions and their physical consequences. We next embed these signals into scheduling and planning frameworks, presenting two architectures: (i) an iterative signal-response loop for real-time operations, and (ii) an integrated optimization that learns and encodes flexible-load behavior for long-term planning. Finally, we show how the same coordination stack supports resilience, enabling signals to shift from emissions-first to stability-first during stress events, providing targeted relief and faster restoration.

Utilizing Multi-Agent Reinforcement Learning with Encoder-Decoder Architecture Agents to Identify Optimal Resection Location in Glioblastoma Multiforme Patients

Authors:Krishna Arun, Moinak Bhattachrya, Paras Goel

Date:2025-12-07 20:51:59

Currently, there is a noticeable lack of AI in the medical field to support doctors in treating heterogeneous brain tumors such as Glioblastoma Multiforme (GBM), the deadliest human cancer in the world with a five-year survival rate of just 5.1%. This project develops an AI system offering the only end-to-end solution by aiding doctors with both diagnosis and treatment planning. In the diagnosis phase, a sequential decision-making framework consisting of 4 classification models (Convolutional Neural Networks and a Support Vector Machine) is used. Each model progressively classifies the patient's brain into increasingly specific categories, with the final step being the named diagnosis. For treatment planning, an RL system consisting of 3 generative models is used. First, the resection model (diffusion model) analyzes the diagnosed GBM MRI and predicts a possible resection outcome. Second, the radiotherapy model (Spatio-Temporal Vision Transformer) generates an MRI of the brain's progression after a user-defined number of weeks. Third, the chemotherapy model (diffusion model) produces the post-treatment MRI. A survival rate calculator (Convolutional Neural Network) then checks whether the generated post-treatment MRI has a survival rate within 15% of the user-defined target. If not, a feedback loop using proximal policy optimization iterates over this system until an optimal resection location is identified. Compared with existing solutions, this project yielded 3 key findings: (1) using a sequential decision-making framework consisting of 4 small diagnostic models reduced computing costs by 22.28x, (2) the transformer's regression capabilities decreased tumor progression inference time by 113 hours, and (3) applying augmentations resembling real-life situations improved overall DICE scores by 2.9%. These results are projected to increase survival rates by 0.9%, potentially saving approximately 2,250 lives.
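The abstract reports DICE scores without stating how they are computed. As a reference point, the sketch below uses the standard Dice coefficient for binary segmentation masks, 2|A ∩ B| / (|A| + |B|). The function name `dice_score`, the flat 0/1 mask representation, and the toy masks are assumptions, not details taken from the project.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total

# Hypothetical 5-voxel masks: predicted tumor region vs. ground truth.
pred_mask = [1, 1, 0, 0, 1]
true_mask = [1, 0, 0, 1, 1]
score = dice_score(pred_mask, true_mask)
print(round(score, 3))
```

Here the masks share two voxels and contain three voxels each, so the score is 2 * 2 / 6, or about 0.667.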

On Memory: A comparison of memory mechanisms in world models

Authors:Eli J. Laird, Corey Clark

Date:2025-12-07 20:29:20

World models enable agents to plan within imagined environments by predicting future states conditioned on past observations and actions. However, their ability to plan over long horizons is limited by the effective memory span of the backbone architecture. This limitation leads to perceptual drift in long rollouts, hindering the model's capacity to perform loop closures within imagined trajectories. In this work, we investigate the effective memory span of transformer-based world models through an analysis of several memory augmentation mechanisms. We introduce a taxonomy that distinguishes between memory encoding and memory injection mechanisms, motivating their roles in extending the world model's memory through the lens of residual stream dynamics. Using a state recall evaluation task, we measure the memory recall of each mechanism and analyze its respective trade-offs. Our findings show that memory mechanisms improve the effective memory span in vision transformers and provide a path to completing loop closures within a world model's imagination.

Could electron-top interactions spoil the measurement of the Higgs trilinear? -A quantitative estimate at future lepton colliders-

Authors:Lukas Allwicher, Christophe Grojean, Lucine Tabatt

Date:2025-12-07 16:51:58

The measurement of the Higgs self-coupling is considered the next milestone in the study of the Higgs boson properties. At future $e^+e^-$ facilities below the double Higgs production threshold, this is extracted from the $Zh$ production cross-section, which is sensitive to the trilinear coupling at the one-loop level. At the same perturbative order, potential effects beyond the Standard Model (SM) may affect the Higgsstrahlung rate and distort the self-coupling determination. We study this question, focusing especially on contact interactions containing two electron and two top-quark fields. We conclude that, in the context of FCC-ee and its planned runs at different energies, $eett$ interactions change the Higgs self-coupling sensitivity below the percent level. Even in the most pessimistic scenarios, we confirm a robust sensitivity of the order of 17% at the $1\sigma$ confidence level under the assumption of otherwise SM-like Higgs couplings. A crucial role in these results is played by the measurement of fermion pair production above the $Z$ resonance.

Energy-Efficient Navigation for Surface Vehicles in Vortical Flow Fields

Authors:Rushiraj Gadhvi, Sandeep Manjanna

Date:2025-12-07 16:36:31

For centuries, khalasi have skillfully harnessed ocean currents to navigate vast waters with minimal effort. Emulating this intuition in autonomous systems remains a significant challenge, particularly for Autonomous Surface Vehicles tasked with long-duration missions under strict energy budgets. In this work, we present a learning-based approach for energy-efficient surface vehicle navigation in vortical flow fields, where partial observability often undermines traditional path-planning methods. We present an end-to-end reinforcement learning framework based on Soft Actor-Critic that learns flow-aware navigation policies using only local velocity measurements. Through extensive evaluation across diverse and dynamically rich scenarios, our method demonstrates substantial energy savings and robust generalization to previously unseen flow conditions, offering a promising path toward long-term autonomy in ocean environments. The navigation paths generated by our proposed approach show a 30 to 50 percent improvement in energy conservation compared to existing state-of-the-art techniques.

From Zero to High-Speed Racing: An Autonomous Racing Stack

Authors:Hassan Jardali, Durgakant Pushp, Youwei Yu, Mahmoud Ali, Ihab S. Mohamed, Alejandro Murillo-Gonzalez, Paul D. Coen, Md. Al-Masrur Khan, Reddy Charan Pulivendula, Saeoul Park, Lingchuan Zhou, Lantao Liu

Date:2025-12-07 15:35:16

High-speed, head-to-head autonomous racing presents substantial technical and logistical challenges, including precise localization, rapid perception, dynamic planning, and real-time control, compounded by limited track access and costly hardware. This paper introduces the Autonomous Race Stack (ARS), developed by the IU Luddy Autonomous Racing team for the Indy Autonomous Challenge (IAC). We present three iterations of our ARS, each validated on different tracks and achieving speeds up to 260 km/h. Our contributions include: (i) the modular architecture and evolution of the ARS across ARS1, ARS2, and ARS3; (ii) a detailed performance evaluation that contrasts control, perception, and estimation across oval and road-course environments; and (iii) the release of a high-speed, multi-sensor dataset collected from oval and road-course tracks. Our findings highlight the unique challenges and insights from real-world high-speed full-scale autonomous racing.

Spatial Retrieval Augmented Autonomous Driving

Authors:Xiaosong Jia, Chenhe Zhang, Yule Jiang, Songbur Wong, Zhiyuan Zhang, Chen Chen, Shaofeng Zhang, Xuanhe Zhou, Xue Yang, Junchi Yan, Yu-Gang Jiang

Date:2025-12-07 14:40:49

Existing autonomous driving systems rely on onboard sensors (cameras, LiDAR, IMU, etc.) for environmental perception. However, this paradigm is limited by the drive-time perception horizon and often fails under limited view scope, occlusion, or extreme conditions such as darkness and rain. In contrast, human drivers are able to recall road structure even under poor visibility. To endow models with this "recall" ability, we propose the spatial retrieval paradigm, introducing offline retrieved geographic images as an additional input. These images are easy to obtain from offline caches (e.g., Google Maps or stored autonomous driving datasets) without requiring additional sensors, making it a plug-and-play extension for existing AD tasks. For experiments, we first extend the nuScenes dataset with geographic images retrieved via Google Maps APIs and align the new data with ego-vehicle trajectories. We establish baselines across five core autonomous driving tasks: object detection, online mapping, occupancy prediction, end-to-end planning, and generative world modeling. Extensive experiments show that the extended modality could enhance the performance of certain tasks. We will open-source dataset curation code, data, and benchmarks for further study of this new autonomous driving paradigm.

db-LaCAM: Fast and Scalable Multi-Robot Kinodynamic Motion Planning with Discontinuity-Bounded Search and Lightweight MAPF

Authors:Akmaral Moldagalieva, Keisuke Okumura, Amanda Prorok, Wolfgang Hönig

Date:2025-12-07 11:17:10

State-of-the-art multi-robot kinodynamic motion planners struggle to handle more than a few robots due to high computational burden, which limits their scalability and results in slow planning time. In this work, we combine the scalability and speed of modern multi-agent path finding (MAPF) algorithms with the dynamic-awareness of kinodynamic planners to address these limitations. To this end, we propose discontinuity-Bounded LaCAM (db-LaCAM), a planner that utilizes a precomputed set of motion primitives that respect robot dynamics to generate horizon-length motion sequences, while allowing a user-defined discontinuity between successive motions. The planner db-LaCAM is resolution-complete with respect to motion primitives and supports arbitrary robot dynamics. Extensive experiments demonstrate that db-LaCAM scales efficiently to scenarios with up to 50 robots, achieving up to ten times faster runtime compared to state-of-the-art planners, while maintaining comparable solution quality. The approach is validated in both 2D and 3D environments with dynamics such as the unicycle and 3D double integrator. We demonstrate the safe execution of trajectories planned with db-LaCAM in two distinct physical experiments involving teams of flying robots and car-with-trailer robots.
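The idea of chaining precomputed motion primitives while tolerating a user-defined discontinuity can be sketched for a single unicycle robot. This is a simplified illustration under stated assumptions, not the db-LaCAM algorithm: the discontinuity is measured on heading only, primitives are translated to the robot's position, and all names (`rollout_unicycle`, `make_primitive`, `applicable`, `apply_primitive`, `delta`) are hypothetical.

```python
import math

def rollout_unicycle(state, v, omega, dt, steps=10):
    """Integrate unicycle dynamics (x, y, theta) with forward Euler."""
    x, y, th = state
    h = dt / steps
    for _ in range(steps):
        x += v * math.cos(th) * h
        y += v * math.sin(th) * h
        th += omega * h
    return (x, y, th)

def angle_diff(a, b):
    """Absolute difference between two angles, wrapped to [0, pi]."""
    return abs(math.atan2(math.sin(a - b), math.cos(a - b)))

def make_primitive(theta0, v, omega, dt=1.0):
    """Precompute a primitive that starts at the origin with heading theta0."""
    start = (0.0, 0.0, theta0)
    return {"start": start, "end": rollout_unicycle(start, v, omega, dt)}

def applicable(state, primitive, delta):
    """A primitive may be used if its start heading is within delta of the state."""
    return angle_diff(state[2], primitive["start"][2]) <= delta

def apply_primitive(state, primitive):
    """Translate the primitive to the robot position; the heading jump is the discontinuity."""
    ex, ey, eth = primitive["end"]
    return (state[0] + ex, state[1] + ey, eth)

# Two hypothetical primitives: drive straight while facing east or north.
primitives = [make_primitive(0.0, 1.0, 0.0), make_primitive(math.pi / 2, 1.0, 0.0)]
state = (0.0, 0.0, 0.05)  # heading slightly off from east
delta = 0.1               # user-defined discontinuity bound (radians)

usable = [applicable(state, p, delta) for p in primitives]
next_state = apply_primitive(state, primitives[0])
print(usable, next_state)
```

A larger `delta` makes more primitives applicable from any given state, which helps a search find sequences quickly, at the price of a larger gap between the planned motion and what the robot dynamics actually allow.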

A Novel Deep Neural Network Architecture for Real-Time Water Demand Forecasting

Authors:Tony Salloom, Okyay Kaynak, Wei He

Date:2025-12-07 08:08:49

Short-term water demand forecasting (StWDF) is the cornerstone of deriving an optimal plan for controlling water supply systems. Deep learning (DL) approaches provide the most accurate solutions for this purpose. However, they suffer from high complexity due to their massive number of parameters, as well as high forecasting error at extreme points. In this work, an effective method to alleviate the error at these points is proposed. It is based on extending the data by inserting virtual data within the actual data to relieve the nonlinearity around them. To our knowledge, this is the first work that considers the problem related to the extreme points. Moreover, the water demand forecasting model proposed in this work is a novel DL model with relatively low complexity. The basic model uses the gated recurrent unit (GRU) to handle the sequential relationship in the historical demand data, while an unsupervised classification method, K-means, is introduced to create new features that enhance prediction accuracy with fewer parameters. Real data obtained from two different water plants in China are used to train and verify the proposed model. The prediction results and the comparison with the state of the art show that the proposed method reduces model complexity sixfold relative to what has been achieved in the literature while preserving the same accuracy. Furthermore, extending the data set is found to reduce the error by about 30%, although it increases the training time.