planning - 2025-12-25

Optimizing Decoding Paths in Masked Diffusion Models by Quantifying Uncertainty

Authors:Ziyu Chen, Xinbei Jiang, Peng Sun, Tao Lin
Date:2025-12-24 18:59:51

Masked Diffusion Models (MDMs) offer flexible, non-autoregressive generation, but this freedom introduces a challenge: final output quality is highly sensitive to the decoding order. We are the first to formalize this issue, attributing the variability in output quality to the cumulative predictive uncertainty along a generative path. To quantify this uncertainty, we introduce Denoising Entropy, a computable metric that serves as an internal signal for evaluating the generative process. Leveraging this metric, we propose two algorithms designed to optimize the decoding path: a post-hoc selection method and a real-time guidance strategy. Experiments demonstrate that our entropy-guided methods significantly improve generation quality, consistently boosting accuracy on challenging reasoning, planning, and code benchmarks. Our work establishes Denoising Entropy as a principled tool for understanding and controlling generation, effectively turning the uncertainty in MDMs from a liability into a key advantage for discovering high-quality solutions.
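
The abstract does not define Denoising Entropy precisely, so the following is only a minimal sketch of the general idea under stated assumptions: uncertainty at each decoding step is taken to be the mean Shannon entropy of the predictive distributions at still-masked positions, and a path's score is the sum over steps. The function names and the toy distributions are hypothetical, not the authors' implementation.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in nats) of one categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def step_entropy(masked_position_probs):
    """Mean entropy over the positions still masked at one decoding step."""
    if not masked_position_probs:
        return 0.0
    total = sum(shannon_entropy(p) for p in masked_position_probs)
    return total / len(masked_position_probs)

def path_entropy(path):
    """Cumulative uncertainty along a decoding path (sum of per-step entropies)."""
    return sum(step_entropy(step) for step in path)

def select_path(paths):
    """Post-hoc selection: index of the candidate with the lowest cumulative entropy."""
    return min(range(len(paths)), key=lambda i: path_entropy(paths[i]))

# Two hypothetical candidate paths. Each path is a list of steps, and each
# step lists the predictive distributions at the positions still masked.
confident = [[[0.9, 0.1], [0.8, 0.2]], [[0.95, 0.05]]]
uncertain = [[[0.5, 0.5], [0.6, 0.4]], [[0.5, 0.5]]]
best = select_path([uncertain, confident])
```

A real-time guidance variant would presumably use the per-step score to choose which positions to unmask next rather than ranking completed paths; that part is not sketched here.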

Equivariant Multiscale Learned Invertible Reconstruction for Cone Beam CT: From Simulated to Real Data

Authors:Nikita Moriakov, Efstratios Gavves, Jonathan H. Mason, Carmen Seller-Oria, Jonas Teuwen, Jan-Jakob Sonke
Date:2025-12-24 13:59:43

Cone Beam CT (CBCT) is an important imaging modality nowadays; however, the lower image quality of CBCT compared to more conventional Computed Tomography (CT) remains a limiting factor in CBCT applications. Deep learning reconstruction methods are a promising alternative to classical analytical and iterative reconstruction methods, but applying such methods to CBCT is often difficult due to the lack of ground truth data, memory limitations and the need for fast inference at clinically-relevant resolutions. In this work we propose LIRE++, an end-to-end rotationally-equivariant multiscale learned invertible primal-dual scheme for fast and memory-efficient CBCT reconstruction. Memory optimizations and multiscale reconstruction allow for fast training and inference, while rotational equivariance improves parameter efficiency. LIRE++ was trained on simulated projection data from a fast quasi-Monte Carlo CBCT projection simulator that we also developed. Evaluated on synthetic data, LIRE++ gave an average improvement of 1 dB in Peak Signal-to-Noise Ratio over alternative deep learning baselines. On real clinical data, LIRE++ improved the average Mean Absolute Error between the reconstruction and the corresponding planning CT by 10 Hounsfield Units with respect to the current proprietary state-of-the-art hybrid deep-learning/iterative method.

Portfolio Optimization for Index Tracking with Constraints on Downside Risk and Carbon Footprint

Authors:Suparna Biswas, Rituparna Sen
Date:2025-12-24 10:16:05

Historically, financial risk management has mostly addressed risk factors that arise from the financial environment. Climate risks present a novel and significant challenge for companies and financial markets. Investors aiming to avoid firms with high carbon footprints require suitable risk measures and portfolio management strategies. This paper presents the construction of decarbonized indices for tracking the S&P 500 index of the U.S. stock market, as well as the Indian index NIFTY-50, employing two distinct methodologies, and studies their performance. These decarbonized indices optimize the portfolio weights by minimizing the mean-VaR and mean-ES and seek to reduce the risk of significant financial losses while still pursuing decarbonization goals. Investors can thereby find a balance between financial performance and environmental responsibilities. Ensuring transparency in the development of these indices will encourage the excluded and under-weighted asset companies to lower their carbon footprints through appropriate action plans. For long-term passive investors, these indices may present a more favourable option than green stocks.
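
For readers unfamiliar with the two downside-risk measures named above, here is a minimal sketch of historical Value-at-Risk (VaR) and Expected Shortfall (ES), plus a weighted carbon footprint of the kind a decarbonization constraint could bound. The estimator choice (empirical quantile of losses), the helper names, and the sample returns are assumptions for illustration; the paper's actual optimization model is not reproduced.

```python
import math

def portfolio_returns(weights, asset_returns):
    """Per-period portfolio return for fixed weights (one row per period)."""
    return [sum(w * r for w, r in zip(weights, period)) for period in asset_returns]

def historical_var_es(returns, alpha=0.95):
    """Historical VaR and ES of the loss distribution at confidence level alpha."""
    losses = sorted(-r for r in returns)            # losses in ascending order
    k = max(math.ceil(alpha * len(losses)) - 1, 0)  # index of the alpha-quantile loss
    var = losses[k]
    tail = losses[k:]                               # losses at or beyond VaR
    return var, sum(tail) / len(tail)

def carbon_footprint(weights, intensities):
    """Weighted-average carbon intensity of the portfolio (hypothetical units)."""
    return sum(w * c for w, c in zip(weights, intensities))

# Hypothetical per-period portfolio returns.
returns = [-0.10, -0.05, 0.0, 0.02, 0.03, 0.01, 0.04, -0.02, 0.05, 0.02]
var, es = historical_var_es(returns, alpha=0.8)
```

ES is never smaller than VaR at the same level, because it averages the losses at or beyond the VaR threshold.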

Co-Existence of Private 5G Network and Wireless Hospital Systems

Authors:Mohsin Khan, Matti Hämäläinen, Timo J. Mäkelä, Erkki Harjula, Jani Katisko
Date:2025-12-24 09:55:51

This paper investigates the feasibility of deploying private 5G networks in hospital environments, with a focus on the operating room at the brand new Oulu University Hospital, Finland. The study aims to evaluate the interference risk with other wireless systems and the electromagnetic safety of a private 5G network in the 3.9-4.1 GHz band, while ensuring compatibility with legacy wireless systems, such as LTE and Wi-Fi. We conducted a measurement campaign, employing state-of-the-art instrumentation and a methodology that combined high-resolution and long-duration spectrum scans. The results demonstrate no measurable interference between the hospital's private 5G network and adjacent LTE (4G) or Wi-Fi bands, confirming the spectral isolation of the 5G transmissions, and vice versa. Additionally, RF exposure levels in the operating room were found to be well below ICNIRP, WHO, and IEEE safety thresholds, ensuring that the network poses negligible biological risk to patients and hospital staff. The study also proposes spectrum management strategies for private 5G networks in hospitals, focusing on adaptive sensing and guardband planning. These findings provide a solid foundation for the integration of private 5G infrastructure in hospital environments, supporting digital transformation in patient care without compromising electromagnetic compatibility or patient safety. The results also contribute to ongoing discussions around private 5G network deployments in sensitive sectors and provide actionable guidelines for future hospitals' wireless systems planning.

zkFL-Health: Blockchain-Enabled Zero-Knowledge Federated Learning for Medical AI Privacy

Authors:Savvy Sharma, George Petrovic, Sarthak Kaushik
Date:2025-12-24 08:29:28

Healthcare AI needs large, diverse datasets, yet strict privacy and governance constraints prevent raw data sharing across institutions. Federated learning (FL) mitigates this by training where data reside and exchanging only model updates, but practical deployments still face two core risks: (1) privacy leakage via gradients or updates (membership inference, gradient inversion) and (2) trust in the aggregator, a single point of failure that can drop, alter, or inject contributions undetected. We present zkFL-Health, an architecture that combines FL with zero-knowledge proofs (ZKPs) and Trusted Execution Environments (TEEs) to deliver privacy-preserving, verifiably correct collaborative training for medical AI. Clients locally train and commit their updates; the aggregator operates within a TEE to compute the global update and produces a succinct ZK proof (via Halo2/Nova) that it used exactly the committed inputs and the correct aggregation rule, without revealing any client update to the host. Verifier nodes validate the proof and record cryptographic commitments on-chain, providing an immutable audit trail and removing the need to trust any single party. We outline system and threat models tailored to healthcare, the zkFL-Health protocol, security/privacy guarantees, and a performance evaluation plan spanning accuracy, privacy risk, latency, and cost. This framework enables multi-institutional medical AI with strong confidentiality, integrity, and auditability, key properties for clinical adoption and regulatory compliance.

Tracing Energy Flow: Learning Tactile-based Grasping Force Control to Prevent Slippage in Dynamic Object Interaction

Authors:Cheng-Yu Kuo, Hirofumi Shin, Takamitsu Matsubara
Date:2025-12-24 08:19:25

Regulating grasping force to reduce slippage during dynamic object interaction remains a fundamental challenge in robotic manipulation, especially when objects are manipulated by multiple rolling contacts, have unknown properties (such as mass or surface conditions), and when external sensing is unreliable. In contrast, humans can quickly regulate grasping force by touch, even without visual cues. Inspired by this ability, we aim to enable robotic hands to rapidly explore objects and learn tactile-driven grasping force control under motion and limited sensing. We propose a physics-informed energy abstraction that models the object as a virtual energy container. The inconsistency between the fingers' applied power and the object's retained energy provides a physically grounded signal for inferring slip-aware stability. Building on this abstraction, we employ model-based learning and planning to efficiently model energy dynamics from tactile sensing and perform real-time grasping force optimization. Experiments in both simulation and hardware demonstrate that our method can learn grasping force control from scratch within minutes, effectively reduce slippage, and extend grasp duration across diverse motion-object pairs, all without relying on external sensing or prior object knowledge.

FinAgent: An Agentic AI Framework Integrating Personal Finance and Nutrition Planning

Authors:Toqeer Ali Syed, Abdulaziz Alshahrani, Ali Ullah, Ali Akarma, Sohail Khan, Muhammad Nauman, Salman Jan
Date:2025-12-24 06:33:17

Balancing limited household budgets against nutritional demands remains a challenge, especially in middle-income settings where food prices fluctuate. This paper introduces a price-aware agentic AI system that combines personal finance management with diet optimization. Given household income and fixed expenditures, medical and well-being status, and real-time food costs, the system creates nutritionally sufficient meal plans at comparatively reasonable prices that automatically adjust to market changes. The framework is implemented in a modular multi-agent architecture with specialized agents (budgeting, nutrition, price monitoring, and health personalization). These agents share a knowledge base and use a substitution graph to ensure that nutritional quality is maintained at minimum cost. Simulations with a representative Saudi household case study show a steady 12-18% reduction in costs relative to a static weekly menu, nutrient adequacy of over 95%, and robust performance under price changes of 20-30%. The findings indicate that the framework can locally combine affordability with nutritional adequacy and provide a viable avenue of capacity-building towards sustainable and fair diet planning in line with the Sustainable Development Goals on Zero Hunger and Good Health.

Reflection Pretraining Enables Token-Level Self-Correction in Biological Sequence Models

Authors:Xiang Zhang, Jiaqi Wei, Yuejin Yang, Zijie Qiu, Yuhan Chen, Zhiqiang Gao, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Wanli Ouyang, Chenyu You, Siqi Sun
Date:2025-12-24 05:25:17

Chain-of-Thought (CoT) prompting has significantly advanced task-solving capabilities in natural language processing with large language models. Unlike standard prompting, CoT encourages the model to generate intermediate reasoning steps, non-answer tokens, that help guide the model toward more accurate final outputs. These intermediate steps enable more complex reasoning processes such as error correction, memory management, future planning, and self-reflection. However, applying CoT to non-natural language domains, such as protein and RNA language models, is not yet possible, primarily due to the limited expressiveness of their token spaces (e.g., amino acid tokens). In this work, we propose and define the concept of language expressiveness: the ability of a given language, using its tokens and grammar, to encode information. We show that the limited expressiveness of protein language severely restricts the applicability of CoT-style reasoning. To overcome this, we introduce reflection pretraining, for the first time in a biological sequence model, which enables the model to engage in intermediate reasoning through the generation of auxiliary "thinking tokens" beyond simple answer tokens. Theoretically, we demonstrate that our augmented token set significantly enhances biological language expressiveness, thereby improving the overall reasoning capacity of the model. Experimentally, our pretraining approach teaches protein models to self-correct and leads to substantial performance gains compared to standard pretraining.

ETP-R1: Evolving Topological Planning with Reinforcement Fine-tuning for Vision-Language Navigation in Continuous Environments

Authors:Shuhao Ye, Sitong Mao, Yuxiang Cui, Xuan Yu, Shichao Zhai, Wen Chen, Shunbo Zhou, Rong Xiong, Yue Wang
Date:2025-12-24 04:53:03

Vision-Language Navigation in Continuous Environments (VLN-CE) requires an embodied agent to navigate towards a target in continuous environments, following natural language instructions. While current graph-based methods offer an efficient, structured approach by abstracting the environment into a topological map and simplifying the action space to waypoint selection, they lag behind methods based on Large Vision-Language Models (LVLMs) in leveraging large-scale data and advanced training paradigms. In this paper, we try to bridge this gap by introducing ETP-R1, a framework that applies the paradigm of scaling up data and Reinforcement Fine-Tuning (RFT) to a graph-based VLN-CE model. To build a strong foundation, we first construct a high-quality, large-scale pretraining dataset using the Gemini API. This dataset consists of diverse, low-hallucination instructions for topological trajectories, providing rich supervision for our graph-based policy to map language to topological paths. This foundation is further strengthened by unifying data from both R2R and RxR tasks for joint pretraining. Building on this, we introduce a three-stage training paradigm, which culminates in the first application of closed-loop, online RFT to a graph-based VLN-CE model, powered by the Group Relative Policy Optimization (GRPO) algorithm. Extensive experiments demonstrate that our approach is highly effective, establishing new state-of-the-art performance across all major metrics on both the R2R-CE and RxR-CE benchmarks. Our code is available at https://github.com/Cepillar/ETP-R1.
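
As background on GRPO, the sketch below shows the group-relative advantage commonly associated with the algorithm: several rollouts are sampled for the same instruction, and each rollout's reward is normalized by the group's mean and standard deviation. This is a generic illustration, not code from the ETP-R1 repository; the function name, the use of the population standard deviation, and the reward values are assumptions.

```python
import statistics

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each rollout's reward against its own sampling group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; implementations may differ
    # eps keeps the division finite when every rollout gets the same reward.
    return [(r - mean) / (std + eps) for r in rewards]

# Hypothetical success rewards for four navigation rollouts of one instruction.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Rollouts that beat the group average receive positive advantages and the rest negative ones, so no separate value network is needed to form a baseline.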

Reasoning-Driven Amodal Completion: Collaborative Agents and Perceptual Evaluation

Authors:Hongxing Fan, Shuyu Zhao, Jiayang Ao, Lu Sheng
Date:2025-12-24 04:39:45

Amodal completion, the task of inferring invisible object parts, faces significant challenges in maintaining semantic consistency and structural integrity. Prior progressive approaches are inherently limited by inference instability and error accumulation. To tackle these limitations, we present a Collaborative Multi-Agent Reasoning Framework that explicitly decouples Semantic Planning from Visual Synthesis. By employing specialized agents for upfront reasoning, our method generates a structured, explicit plan before pixel generation, enabling visually and semantically coherent single-pass synthesis. We integrate this framework with two critical mechanisms: (1) a self-correcting Verification Agent that employs Chain-of-Thought reasoning to rectify visible region segmentation and identify residual occluders strictly within the Semantic Planning phase, and (2) a Diverse Hypothesis Generator that addresses the ambiguity of invisible regions by offering diverse, plausible semantic interpretations, surpassing the limited pixel-level variations of standard random seed sampling. Furthermore, addressing the limitations of traditional metrics in assessing inferred invisible content, we introduce the MAC-Score (MLLM Amodal Completion Score), a novel human-aligned evaluation metric. Validated against human judgment and ground truth, these metrics establish a robust standard for assessing structural completeness and semantic consistency with visible context. Extensive experiments demonstrate that our framework significantly outperforms state-of-the-art methods across multiple datasets. Our project is available at: https://fanhongxing.github.io/remac-page.

YCB-Handovers Dataset: Analyzing Object Weight Impact on Human Handovers to Adapt Robotic Handover Motion

Authors:Parag Khanna, Karen Jane Dsouza, Chunyu Wang, Mårten Björkman, Christian Smith
Date:2025-12-23 23:50:55

This paper introduces the YCB-Handovers dataset, capturing motion data of 2771 human-human handovers with varying object weights. The dataset aims to bridge a gap in human-robot collaboration research, providing insights into the impact of object weight in human handovers and readiness cues for intuitive robotic motion planning. The underlying dataset for object recognition and tracking is the YCB (Yale-CMU-Berkeley) dataset, which is an established standard dataset used in algorithms for robotic manipulation, including grasping and carrying objects. The YCB-Handovers dataset incorporates human motion patterns in handovers, making it applicable for data-driven, human-inspired models aimed at weight-sensitive motion planning and adaptive robotic behaviors. This dataset covers an extensive range of weights, allowing for a more robust study of handover behavior and weight variation. Some objects also require careful handovers, highlighting contrasts with standard handovers. We also provide a detailed analysis of the object's weight impact on the human reaching motion in these handovers.

Context-Sensitive Abstractions for Reinforcement Learning with Parameterized Actions

Authors:Rashmeet Kaur Nayyar, Naman Shah, Siddharth Srivastava
Date:2025-12-23 23:12:53

Real-world sequential decision-making often involves parameterized action spaces that require both decisions regarding discrete actions and decisions about continuous action parameters governing how an action is executed. Existing approaches exhibit severe limitations in this setting -- planning methods demand hand-crafted action models, and standard reinforcement learning (RL) algorithms are designed for either discrete or continuous actions but not both, and the few RL methods that handle parameterized actions typically rely on domain-specific engineering and fail to exploit the latent structure of these spaces. This paper extends the scope of RL algorithms to long-horizon, sparse-reward settings with parameterized actions by enabling agents to autonomously learn both state and action abstractions online. We introduce algorithms that progressively refine these abstractions during learning, increasing fine-grained detail in the critical regions of the state-action space where greater resolution improves performance. Across several continuous-state, parameterized-action domains, our abstraction-driven approach enables TD($λ$) to achieve markedly higher sample efficiency than state-of-the-art baselines.
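
For context, TD($λ$) itself can be sketched in tabular form with accumulating eligibility traces, as below. This shows only the standard base learner over abstract state labels; the abstraction-refinement procedure that the paper contributes is not shown, and the state names, step sizes, and episode are hypothetical.

```python
def td_lambda_episode(values, episode, alpha=0.1, gamma=0.99, lam=0.9):
    """Update a state-value table in place from one episode of transitions.

    episode: list of (state, reward, next_state, done) tuples.
    """
    traces = {}
    for state, reward, next_state, done in episode:
        bootstrap = 0.0 if done else gamma * values.get(next_state, 0.0)
        delta = reward + bootstrap - values.get(state, 0.0)  # TD error
        traces[state] = traces.get(state, 0.0) + 1.0         # accumulating trace
        for s in list(traces):
            values[s] = values.get(s, 0.0) + alpha * delta * traces[s]
            traces[s] *= gamma * lam                         # decay every trace
    return values

# Hypothetical two-step chain with a sparse reward at the end.
values = td_lambda_episode({}, [("A", 0.0, "B", False), ("B", 1.0, "T", True)])
```

The trace lets the final reward update state "A" within the same episode, which is why eligibility traces help in sparse-reward, long-horizon settings.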

NULLBUS: Multimodal Mixed-Supervision for Breast Ultrasound Segmentation via Nullable Global-Local Prompts

Authors:Raja Mallina, Bryar Shareef
Date:2025-12-23 21:30:05

Breast ultrasound (BUS) segmentation provides lesion boundaries essential for computer-aided diagnosis and treatment planning. While promptable methods can improve segmentation performance and tumor delineation when text or spatial prompts are available, many public BUS datasets lack reliable metadata or reports, constraining training to small multimodal subsets and reducing robustness. We propose NullBUS, a multimodal mixed-supervision framework that learns from images with and without prompts in a single model. To handle missing text, we introduce nullable prompts, implemented as learnable null embeddings with presence masks, enabling fallback to image-only evidence when metadata are absent and the use of text when present. Evaluated on a unified pool of three public BUS datasets, NullBUS achieves a mean IoU of 0.8568 and a mean Dice of 0.9103, demonstrating state-of-the-art performance under mixed prompt availability.
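
The two reported metrics can be computed as follows for binary masks. This is a generic sketch of IoU and Dice, not the NullBUS evaluation code; the flat 0/1 mask representation and the convention of returning 1.0 when both masks are empty are assumptions.

```python
def iou(pred, target):
    """Intersection over union of two flat binary masks (lists of 0/1)."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    return inter / union if union else 1.0  # assumed convention for empty masks

def dice(pred, target):
    """Dice coefficient of two flat binary masks (lists of 0/1)."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(pred) + sum(target)
    return 2.0 * inter / total if total else 1.0

# Hypothetical four-pixel prediction and ground truth.
pred, target = [1, 1, 0, 0], [1, 0, 0, 0]
score_iou, score_dice = iou(pred, target), dice(pred, target)
```

For a single pair of masks the two are linked by Dice = 2 IoU / (1 + IoU), so Dice is always at least as large as IoU; means taken over a dataset need not satisfy that identity exactly.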

Towards Optimal Performance and Action Consistency Guarantees in Dec-POMDPs with Inconsistent Beliefs and Limited Communication

Authors:Moshe Rafaeli Shimron, Vadim Indelman
Date:2025-12-23 21:25:53

Multi-agent decision-making under uncertainty is fundamental for effective and safe autonomous operation. In many real-world scenarios, each agent maintains its own belief over the environment and must plan actions accordingly. However, most existing approaches assume that all agents have identical beliefs at planning time, implying these beliefs are conditioned on the same data. Such an assumption is often impractical due to limited communication. In reality, agents frequently operate with inconsistent beliefs, which can lead to poor coordination and suboptimal, potentially unsafe, performance. In this paper, we address this critical challenge by introducing a novel decentralized framework for optimal joint action selection that explicitly accounts for belief inconsistencies. Our approach provides probabilistic guarantees for both action consistency and performance with respect to open-loop multi-agent POMDP (which assumes all data is always communicated), and selectively triggers communication only when needed. Furthermore, we address another key aspect of whether, given a chosen joint action, the agents should share data to improve expected performance in inference. Simulation results show our approach outperforms state-of-the-art algorithms.

Optimized Rolling Allocation of Outages for Damage Assessment

Authors:Hritik Gopal Shah, Catherine Tajmajer, Elli Ntakou
Date:2025-12-23 19:31:15

Natural disasters often inflict severe damage on distribution grids. Rapid, reliable damage assessment (DA) is essential for storm restoration, yet most optimization work targets repair dispatch after faults are identified. This paper presents a production rolling-horizon DA crew allocation system deployed across multiple U.S. states in Eversource Energy's service territory and used during live storms. The method implements a sequential k-job assignment policy per available crew, executed on a fixed cadence and under operator control. The objective jointly prioritizes critical facilities and customer impact while controlling travel time on the actual road network via the Google Maps API. A key constraint is the absence of live crew GPS; we infer crew locations from the last confirmed DA site and robustify travel estimates for staleness, yielding stable recommendations without continuous tracking. The operator remains in the loop with controls to limit churn and to publish a feasible plan. Using data from the March 7 New Hampshire storm with 90 moderate outages and seven DA crews, we observe shorter time to first assessment, fewer revisits, and reduced distance traveled. To our knowledge, this is among the first multi-state, enterprise-integrated deployments to treat DA crews as a first-class optimized resource in storm restoration.

SemanticGen: Video Generation in Semantic Space

Authors:Jianhong Bai, Xiaoshi Wu, Xintao Wang, Xiao Fu, Yuanxing Zhang, Qinghe Wang, Xiaoyu Shi, Menghan Xia, Zuozhu Liu, Haoji Hu, Pengfei Wan, Kun Gai
Date:2025-12-23 18:59:56

State-of-the-art video generative models typically learn the distribution of video latents in the VAE space and map them to pixels using a VAE decoder. While this approach can generate high-quality videos, it suffers from slow convergence and is computationally expensive when generating long videos. In this paper, we introduce SemanticGen, a novel solution to address these limitations by generating videos in the semantic space. Our main insight is that, due to the inherent redundancy in videos, the generation process should begin in a compact, high-level semantic space for global planning, followed by the addition of high-frequency details, rather than directly modeling a vast set of low-level video tokens using bi-directional attention. SemanticGen adopts a two-stage generation process. In the first stage, a diffusion model generates compact semantic video features, which define the global layout of the video. In the second stage, another diffusion model generates VAE latents conditioned on these semantic features to produce the final output. We observe that generation in the semantic space leads to faster convergence compared to the VAE latent space. Our method is also effective and computationally efficient when extended to long video generation. Extensive experiments demonstrate that SemanticGen produces high-quality videos and outperforms state-of-the-art approaches and strong baselines.

Active Intelligence in Video Avatars via Closed-loop World Modeling

Authors:Xuanhua He, Tianyu Yang, Ke Cao, Ruiqi Wu, Cheng Meng, Yong Zhang, Zhuoliang Kang, Xiaoming Wei, Qifeng Chen
Date:2025-12-23 18:59:16

Current video avatar generation methods excel at identity preservation and motion alignment but lack genuine agency: they cannot autonomously pursue long-term goals through adaptive environmental interaction. We address this by introducing L-IVA (Long-horizon Interactive Visual Avatar), a task and benchmark for evaluating goal-directed planning in stochastic generative environments, and ORCA (Online Reasoning and Cognitive Architecture), the first framework enabling active intelligence in video avatars. ORCA embodies Internal World Model (IWM) capabilities through two key innovations: (1) a closed-loop OTAR cycle (Observe-Think-Act-Reflect) that maintains robust state tracking under generative uncertainty by continuously verifying predicted outcomes against actual generations, and (2) a hierarchical dual-system architecture where System 2 performs strategic reasoning with state prediction while System 1 translates abstract plans into precise, model-specific action captions. By formulating avatar control as a POMDP and implementing continuous belief updating with outcome verification, ORCA enables autonomous multi-step task completion in open-domain scenarios. Extensive experiments demonstrate that ORCA significantly outperforms open-loop and non-reflective baselines in task success rate and behavioral coherence, validating our IWM-inspired design for advancing video avatar intelligence from passive animation to active, goal-oriented behavior.

Cube Bench: A Benchmark for Spatial Visual Reasoning in MLLMs

Authors:Dhruv Anand, Ehsan Shareghi
Date:2025-12-23 18:43:05

We introduce Cube Bench, a Rubik's-cube benchmark for evaluating spatial and sequential reasoning in multimodal large language models (MLLMs). The benchmark decomposes performance into five skills: (i) reconstructing cube faces from images and text, (ii) choosing the optimal next move, (iii) predicting the outcome of a candidate move without applying it, (iv) executing multi-step plans while recovering from mistakes, and (v) detecting and revising one's own errors. Using a shared set of scrambled cube states, identical prompts and parsers, and a single distance-to-solved metric, we compare recent MLLMs side by side as a function of scramble depth. Across seven MLLMs, accuracy drops sharply with depth; once a trajectory stalls or diverges, models rarely recover, and high face-reconstruction accuracy does not guarantee competent action selection or multi-step execution. A pronounced closed- vs open-source gap emerges: the strongest closed model leads on both single-step perception tasks and multi-step control tasks, while open-weight models cluster near chance on the hardest settings; yet even the best MLLM degrades at higher cube complexity. A simple self-correction via reflective thinking yields modest gains but can also introduce overthinking. Cube Bench offers a compact, reproducible probe of sequential spatial reasoning in MLLMs.

The Sensitivity of PUEO to Cosmogenic Neutrinos and Exotic Physics Scenarios

Authors:Angelina Sherman, Ke Fang, Dan Hooper
Date:2025-12-23 18:42:19

Several observatories designed to detect ultrahigh-energy neutrinos are planned for the next decade. The most imminent of these is the Payload for Ultrahigh Energy Observations (PUEO), a long-duration balloon-based experiment that will provide unprecedented sensitivity to neutrinos with energies in the range of ~1-1000 EeV. In this work, we assess the scientific reach of PUEO. In particular, we evaluate the sensitivity of this observatory to cosmogenic neutrinos and, in turn, to the proton fraction of the ultrahigh-energy cosmic-ray spectrum. We also consider the potential of PUEO to probe scenarios in which neutrinos are produced through the decays of ultraheavy dark matter particles or are radiated from cosmic strings. We find that PUEO will be able to constrain the proton composition of ultrahigh-energy cosmic rays in scenarios that feature very strong source evolution and in which protons are accelerated to extremely high energies. Although gamma-ray observations are generally more sensitive to decaying particles than neutrino observations, PUEO is expected to set the strongest neutrino-detector constraints above 10^19 eV. PUEO will also provide the strongest constraints on some models of cosmic strings.

Explainable time-series forecasting with sampling-free SHAP for Transformers

Authors:Matthias Hertel, Sebastian Pütz, Ralf Mikut, Veit Hagenmeyer, Benjamin Schäfer
Date:2025-12-23 17:02:35

Time-series forecasts are essential for planning and decision-making in many domains. Explainability is key to building user trust and meeting transparency requirements. Shapley Additive Explanations (SHAP) is a popular explainable AI framework, but it lacks efficient implementations for time series and often assumes feature independence when sampling counterfactuals. We introduce SHAPformer, an accurate, fast and sampling-free explainable time-series forecasting model based on the Transformer architecture. It leverages attention manipulation to make predictions based on feature subsets. SHAPformer generates explanations in under one second, several orders of magnitude faster than the SHAP Permutation Explainer. On synthetic data with ground truth explanations, SHAPformer provides explanations that are true to the data. Applied to real-world electrical load data, it achieves competitive predictive performance and delivers meaningful local and global insights, such as identifying the past load as the key predictor and revealing a distinct model behavior during the Christmas period.
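
To make the link between predictions on feature subsets and Shapley values concrete, the sketch below computes exact Shapley values by enumerating all subsets, given any function that returns a prediction for a subset of features. In SHAPformer that role is played by the Transformer with attention manipulation; here `value_fn` is a hypothetical stand-in (a toy additive model), and the enumeration is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values from a function mapping a feature subset to a prediction."""
    n = n_features
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of coalitions of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi

# Hypothetical additive model: each present feature adds a fixed contribution.
contributions = [2.0, -1.0, 0.5]

def value_fn(subset):
    return sum(contributions[j] for j in subset)

phi = shapley_values(value_fn, len(contributions))
```

For an additive model the Shapley values equal the individual contributions, and in general they sum to the difference between the full-feature and empty-set predictions.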

Drift-Corrected Monocular VIO and Perception-Aware Planning for Autonomous Drone Racing

Authors:Maulana Bisyir Azhari, Donghun Han, Je In You, Sungjun Park, David Hyunchul Shim
Date:2025-12-23 16:12:10

The Abu Dhabi Autonomous Racing League (A2RL) x Drone Champions League (DCL) competition requires teams to perform high-speed autonomous drone racing using only a single camera and a low-quality inertial measurement unit -- a minimal sensor set that mirrors expert human drone racing pilots. This sensor limitation makes the system susceptible to drift from Visual-Inertial Odometry (VIO), particularly during long and fast flights with aggressive maneuvers. This paper presents the system developed for the championship, which achieved a competitive performance. Our approach corrected VIO drift by fusing its output with global position measurements derived from a YOLO-based gate detector using a Kalman filter. A perception-aware planner generated trajectories that balance speed with the need to keep gates visible for the perception system. The system demonstrated high performance, securing podium finishes across multiple categories: third place in the AI Grand Challenge with a top speed of 43.2 km/h, second place in the AI Drag Race with over 59 km/h, and second place in the AI Multi-Drone Race. We detail the complete architecture and present a performance analysis based on experimental data from the competition, contributing our insights on building a successful system for monocular vision-based autonomous drone flight.
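
The drift-correction idea can be illustrated with a one-dimensional Kalman filter: VIO displacement drives the prediction step (with growing uncertainty), and a gate-derived global position drives the correction step. This is a deliberately simplified scalar sketch, not the team's filter; the state layout, noise values, and function names are assumptions.

```python
def predict(x, p, vio_delta, q):
    """Propagate the position estimate with a VIO displacement; variance grows by q."""
    return x + vio_delta, p + q

def update(x, p, z, r):
    """Correct the estimate with a global position measurement z of variance r."""
    k = p / (p + r)  # Kalman gain: how much to trust the measurement
    return x + k * (z - x), (1.0 - k) * p

# Hypothetical numbers: start at 0 m with variance 1, move 1 m according to VIO,
# then observe a gate-derived global position of 2 m.
x, p = predict(0.0, 1.0, vio_delta=1.0, q=0.5)
x, p = update(x, p, z=2.0, r=1.5)
```

With equal predicted and measurement variances the gain is 0.5, so the corrected position lands halfway between the VIO prediction and the gate-based measurement.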

Even Small Companies Can Save Lives by Reducing Emissions

Authors:Daniel Baldassare, Abby Lute, Hikari Murayama, Cora Kingdon, Christopher Schwalm
Date:2025-12-23 16:08:08

Global warming is often framed in broad planetary numbers such as the 1.5 °C and 2 °C warming thresholds, creating the false impression that individual corporations' efforts to reduce emissions are meaningless in the absence of collective action. This perspective causes companies to reduce ambition towards voluntarily cutting emissions, as they believe their pollution has negligible impacts on its own. Reframing the issue to focus on the life-saving potential of individual corporate actions empowers companies to act and holds them accountable for inaction. Here, we show the results from an innovative modeling technique which calculates the avoided deaths from sustainability efforts for 3,084 companies spanning a range of sizes and sectors. From the reported emissions and planned emissions reductions, we create scenarios for 2020-2049 with and without the pledged emissions cuts and calculate the resulting warming from 2020-2100 using a climate emulator. We then use temperatures from these scenarios to calculate the deaths resulting from warming by using mortality damage functions. We find that more than 97% of these companies stand to save at least one life by following through with emissions reduction plans. Additionally, if all 3,084 companies follow through with their emissions reduction plans, over 4.4 million temperature-related deaths can be avoided.

Characterization of the BIFROST spectrometer through virtual experiments

Authors:Kristine M. L. Krighaar, Silas B. Schack, Nicolai L. Amin, Gregory S. Tucker, Rasmus Toft-Petersen, Kim Lefmann
Date:2025-12-23 15:56:43

Using the Monte Carlo ray tracing package McStas, we illustrate the possibilities of creating virtual experiments of the neutron spectrometer BIFROST at the European Spallation Source, ESS. With this model, we are able to benchmark BIFROST with respect to expected intensity, $Q$- and energy-resolution. The simulations reproduce the expected resolution behavior and quantify effects that are difficult to capture analytically, including a wavelength-dependent edge enhancement arising from a combination of the long-pulsed source and the pulse-shaping chopper. Furthermore, we present an antiferromagnetic (AF) spin wave simulation, which we use to create realistic datasets at different instrument operation settings. Our virtual experiments focus on realistic dispersive dynamics and illustrate how the virtual experiment approach reveals resolution effects that are not easily calculable via analytical models. This demonstrates the crucial role of numerical simulations in the planning of challenging experiments.

Contingency Model-based Control (CMC) for Communicationless Cooperative Collision Avoidance in Robot Swarms

Authors:Georg Schildbach
Date:2025-12-23 14:28:42

Cooperative collision avoidance between robots in swarm operations remains an open challenge. Assuming a decentralized architecture, each robot is responsible for making its own control decisions, including motion planning. To this end, most existing approaches rely on some form of (wireless) communication between the agents of the swarm. In reality, however, communication is brittle. It may be affected by latency, further delays, packet losses, and transmission faults, and it is subject to adversarial attacks, such as jamming or spoofing. This paper proposes Contingency Model-based Control (CMC) as a communicationless alternative. It follows the implicit cooperation paradigm, under which the design of the robots is based on consensual (offline) rules, similar to traffic rules. They include the definition of a contingency trajectory for each robot, and a method for construction of mutual collision avoidance constraints. The setup is shown to guarantee recursive feasibility and collision avoidance between all swarm members in closed-loop operation. Moreover, CMC naturally satisfies the Plug & Play paradigm, i.e., it accommodates new robots entering the swarm. Two numerical examples demonstrate that the collision avoidance guarantee is intact and that the robot swarm operates smoothly under the CMC regime.

Status of the Muon g-2/EDM Experiment at J-PARC

Authors:Graziano Venanzoni
Date:2025-12-23 13:08:35

The Muon g-2/EDM Experiment at J-PARC will employ a novel way to measure the muon magnetic anomaly, $a_\mu = (g-2)_\mu/2$, by using a low-emittance beam of positive muons stored in a compact muon storage magnet. The experimental method includes new technologies such as a three-dimensional spiral injection, an MRI-type storage magnet with superb field uniformity, and a positron tracking detector. The expected systematic uncertainty will be at the same level as that of the Fermilab Muon g-2 experiment, providing an important cross-check of the "storage-ring method" employed at BNL and Fermilab. I will present the current status of the experiment, ongoing tests and design optimizations, and the plans for improvements of the experimental precision.

SlideTailor: Personalized Presentation Slide Generation for Scientific Papers

Authors:Wenzheng Zeng, Mingyu Ouyang, Langyuan Cui, Hwee Tou Ng
Date:2025-12-23 12:01:18

Automatic presentation slide generation can greatly streamline content creation. However, since preferences of each user may vary, existing under-specified formulations often lead to suboptimal results that fail to align with individual user needs. We introduce a novel task that conditions paper-to-slides generation on user-specified preferences. We propose a human behavior-inspired agentic framework, SlideTailor, that progressively generates editable slides in a user-aligned manner. Instead of requiring users to write their preferences in detailed textual form, our system only asks for a paper-slides example pair and a visual template - natural and easy-to-provide artifacts that implicitly encode rich user preferences across content and visual style. Despite the implicit and unlabeled nature of these inputs, our framework effectively distills and generalizes the preferences to guide customized slide generation. We also introduce a novel chain-of-speech mechanism to align slide content with planned oral narration. Such a design significantly enhances the quality of generated slides and enables downstream applications like video presentations. To support this new task, we construct a benchmark dataset that captures diverse user preferences, with carefully designed interpretable metrics for robust evaluation. Extensive experiments demonstrate the effectiveness of our framework.

A non-compact QCD axion

Authors:Georgios K. Karananas, Mikhail Shaposhnikov, Sebastian Zell
Date:2025-12-23 12:00:03

We investigate the cosmology of an axion that is fundamentally non-compact. During inflation, fluctuations of the effectively massless field populate many QCD vacua, thereby evading conventional isocurvature constraints while generating domain walls -- without accompanying cosmic strings. A small non-QCD contribution to the axion potential is required to trigger the timely collapse of domain walls; as a consequence, a residual amount of CP violation in the strong sector must exist, potentially within reach of planned experiments. Non-compact axions can account for the entirety of the dark matter abundance, and the collapse of domain walls sources a stochastic gravitational-wave background at nanohertz frequencies. Such axion dynamics can be embedded in top-down constructions -- such as Weyl-invariant Einstein-Cartan gravity -- where the tilting of the axion potential arises automatically.

Replacing Gas with Low-cost, Abundant Long-duration Pumped Hydro in Electricity Systems

Authors:Timothy Weber, Cheng Cheng, Harry Thawley, Kylie Catchpole, Andrew Blakers, Bin Lu, Jennifer Zhao, Anna Nadolny
Date:2025-12-23 11:50:08

Fossil gas is sometimes presented as an enabler of variable solar and wind generation beyond 2050, despite being a primary source of greenhouse gas emissions from methane leakage and combustion. We find that balancing solar and wind generation with pumped hydro energy storage eliminates the need for fossil gas without incurring a cost penalty. However, many existing long-term electricity system plans are biased to rely on fossil gas due to using temporal aggregation methods that either heavily constrain storage cycling behaviour or lose track of the state-of-charge, failing to consider the potential of low-cost long-duration off-river pumped hydro, and ignoring the broad suite of near-optimal energy transition pathways. We show that a temporal aggregation method based on 'segmentation' (fitted chronology) closely resembles the full-series optimisation, captures long-duration storage behaviour (48- and 160-hour durations), and finds a near-optimal 100% renewable electricity solution. We develop a new electricity system model to rapidly evaluate millions of other near-optimal solutions, stressing the importance of modelling pumped hydro sites with a low energy volume cost (

Automated Training of Learned Database Components with Generative AI

Authors:Angjela Davitkova, Sebastian Michel
Date:2025-12-23 11:24:52

The use of deep learning for database optimization has gained significant traction, offering improvements in indexing, cardinality estimation, and query optimization. However, acquiring high-quality training data remains a significant challenge. This paper explores the possibility of using generative models, such as GPT, to synthesize training data for learned database components. We present an initial feasibility study investigating their ability to produce realistic query distributions and execution plans for database workloads. Additionally, we discuss key challenges, such as data scalability and labeling, along with potential solutions. The initial results suggest that generative models can effectively augment training datasets, improving the adaptability of learned database techniques.

Detecting Non-Optimal Decisions of Embodied Agents via Diversity-Guided Metamorphic Testing

Authors:Wenzhao Wu, Yahui Tang, Mingfei Cheng, Wenbing Tang, Yuan Zhou, Yang Liu
Date:2025-12-23 06:27:18

As embodied agents advance toward real-world deployment, ensuring optimal decisions becomes critical for resource-constrained applications. Current evaluation methods focus primarily on functional correctness, overlooking the non-functional optimality of generated plans. This gap can lead to significant performance degradation and resource waste. We identify and formalize the problem of Non-optimal Decisions (NoDs), where agents complete tasks successfully but inefficiently. We present NoD-DGMT, a systematic framework for detecting NoDs in embodied agent task planning via diversity-guided metamorphic testing. Our key insight is that optimal planners should exhibit invariant behavioral properties under specific transformations. We design four novel metamorphic relations capturing fundamental optimality properties: position detour suboptimality, action optimality completeness, condition refinement monotonicity, and scene perturbation invariance. To maximize detection efficiency, we introduce a diversity-guided selection strategy that actively selects test cases exploring different violation categories, avoiding redundant evaluations while ensuring comprehensive diversity coverage. Extensive experiments on the AI2-THOR simulator with four state-of-the-art planning models demonstrate that NoD-DGMT achieves violation detection rates of 31.9% on average, with our diversity-guided filter improving rates by 4.3% and diversity scores by 3.3 on average. NoD-DGMT significantly outperforms six baseline methods, with 16.8% relative improvement over the best baseline, and demonstrates consistent superiority across different model architectures and task complexities.
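
To make the metamorphic-testing idea concrete, the sketch below checks one plausible reading of a detour relation: a direct plan should never cost more than the same planner's route through an intermediate waypoint. The abstract does not define the relation precisely, so this interpretation, the grid setting, and both toy planners are assumptions, not NoD-DGMT itself.

```python
def manhattan_planner(start, goal):
    """Optimal plan cost on an obstacle-free grid (Manhattan distance)."""
    return abs(start[0] - goal[0]) + abs(start[1] - goal[1])

def wasteful_planner(start, goal):
    """Hypothetical flawed planner: long trips include a needless 4-step loop."""
    base = manhattan_planner(start, goal)
    return base + 4 if base > 5 else base

def detour_violations(planner, cases):
    """Return the cases where the direct plan costs more than a forced detour."""
    violations = []
    for start, waypoint, goal in cases:
        direct = planner(start, goal)
        via = planner(start, waypoint) + planner(waypoint, goal)
        # An optimal planner satisfies direct <= via for every waypoint.
        if direct > via:
            violations.append((start, waypoint, goal))
    return violations

cases = [
    ((0, 0), (3, 0), (6, 0)),  # long trip, waypoint on the straight path
    ((0, 0), (1, 1), (2, 2)),  # short trip
]
optimal_result = detour_violations(manhattan_planner, cases)
wasteful_result = detour_violations(wasteful_planner, cases)
```

The check needs no ground-truth optimal cost: both plans complete the task, and the inefficiency only shows up when the two related runs are compared.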