planning - 2025-04-11

Geometric and Dosimetric Validation of Deformable Image Registration for Prostate MR-guided Adaptive Radiotherapy

Authors:Victor N. Malkov, Iymad R. Mansour, Vickie Kong, Winnie Li, Jennifer Dang, Parisa Sadeghi, Inmaculada Navarro, Jerusha Padayachee, Peter Chung, Jeff D. Winter, Tony Tadic
Date:2025-04-10 17:47:47

Objective: Quantify geometric and dosimetric accuracy of a novel prostate MR-to-MR deformable image registration (DIR) approach to support MR-guided adaptive radiation therapy dose accumulation. Approach: We evaluated DIR accuracy in 25 patients treated with 30 Gy in 5 fractions on a 1.5 T MR-linac using an adaptive workflow. A reference MR was used for planning, with three images collected at each fraction: an adapt MR for adaptive planning, a verify MR for pretreatment position verification, and a beam-on MR capturing anatomy during radiation delivery. We assessed three DIR approaches: intensity-based; intensity-based with controlling structures (CS); and a novel intensity-based approach with controlling structures and points of interest (CS+P). DIRs were performed between the reference and fraction images and within fractions. We propagated CTV, bladder, and rectum contours using the DIRs and compared them to manual contours using the Dice similarity coefficient, mean distance to agreement (DTAmean), and dose-volume metrics. Results: CS and CS+P improved geometric agreement between contours over intensity-only DIR. DTAmean for reference-to-beam-on intensity-only DIR was 0.131 +/- 0.009 cm (CTV), 0.46 +/- 0.08 cm (bladder), and 0.154 +/- 0.013 cm (rectum). For CS, the values were 0.018 +/- 0.002 cm, 0.388 +/- 0.14 cm, and 0.036 +/- 0.013 cm; for CS+P, they were 0.015 +/- 0.001 cm, 0.025 +/- 0.004 cm, and 0.021 +/- 0.002 cm. Dosimetrically, moving from CS to CS+P for reference-to-beam-on DIRs changed CTV D98% from [-29 cGy, 19 cGy] to [-18 cGy, 26 cGy], rectum D1cc from [-106 cGy, 72 cGy] to [-52 cGy, 74 cGy], and bladder D5cc from [-51 cGy, 544 cGy] to [-79 cGy, 36 cGy]. Significance: CS improved geometric and dosimetric accuracy over intensity-only DIR, with CS+P providing the most consistent performance. However, session image segmentation remains a challenge, which may be addressed with automated contouring.
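
The two geometric metrics used above are standard and easy to state in code. A minimal sketch (not the authors' implementation), assuming boolean masks on a voxel grid with known spacing:

```python
import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def dice(a, b):
    """Dice similarity coefficient of two boolean masks."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def mean_dta(a, b, spacing=(0.1, 0.1, 0.1)):
    """Mean distance (in the units of `spacing`) from the surface of mask a
    to the surface of mask b, in one direction."""
    surf_a = a & ~binary_erosion(a)          # boundary voxels of a
    surf_b = b & ~binary_erosion(b)          # boundary voxels of b
    dist_to_b = distance_transform_edt(~surf_b, sampling=spacing)
    return float(dist_to_b[surf_a].mean())
```

A symmetric DTAmean would average the two directed values; which convention the paper uses is not stated in the abstract.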

Optimal Control For Anti-Abeta Treatment in Alzheimer's Disease using a Reaction-Diffusion Model

Authors:Wenrui Hao, Chiu-Yen Kao, Sun Lee, Zhiyuan Li
Date:2025-04-10 17:22:09

Alzheimer's disease (AD) is a progressive neurodegenerative disorder that significantly impairs patient survival and quality of life. While current pharmacological treatments aim to slow disease progression, they remain insufficient in halting cognitive decline. Mathematical modeling has emerged as a powerful tool for understanding the dynamics of AD and optimizing treatment strategies. However, most existing models focus on temporal dynamics using ordinary differential equation-based approaches, often neglecting the critical role of spatial heterogeneity in disease progression. In this study, we employ a spatially explicit reaction-diffusion model to describe amyloid-beta (Aβ) dynamics in the brain, incorporating treatment optimization while accounting for potential side effects. Our objective is to minimize Aβ plaque concentration while balancing therapeutic efficacy against adverse effects, such as amyloid-related imaging abnormalities (ARIA). Under specific assumptions, we establish the well-posedness and uniqueness of the optimal solution. We employ numerical methods based on the finite element method to compute personalized treatment strategies, leveraging real patient Aβ positron emission tomography (PET) scan data. Our results demonstrate that optimal treatment strategies outperform constant dosing regimens, achieving significant reductions in amyloid burden while minimizing side effects. By integrating spatial dynamics and personalized treatment planning, our framework offers a novel approach to refining therapeutic interventions for Alzheimer's disease.
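
For intuition only, a 1D toy of such a reaction-diffusion system with a treatment control: diffusion plus logistic Aβ production, and a dosing schedule that boosts clearance. The coefficients, the forward-Euler scheme, and the dosing function are all illustrative assumptions; the paper uses a FEM discretization and patient PET data.

```python
import numpy as np

D, k_prod, k_clear = 0.1, 0.05, 0.02    # hypothetical diffusion/production/clearance rates
L_dom, N, dt, T = 10.0, 200, 0.01, 5.0
dx = L_dom / N
u = np.zeros(N)                          # Abeta concentration along a 1D domain
u[N // 2] = 1.0                          # hypothetical initial plaque seed

def dose(t):
    """Hypothetical anti-Abeta dosing schedule: treatment switched on at t = 1."""
    return 0.5 if t > 1.0 else 0.0

for step in range(int(T / dt)):
    t = step * dt
    lap = (np.roll(u, 1) - 2 * u + np.roll(u, -1)) / dx**2   # periodic Laplacian
    # diffusion + logistic production - (baseline + treatment) clearance
    u += dt * (D * lap + k_prod * u * (1 - u) - (k_clear + dose(t)) * u)
```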

Fast Adaptation with Behavioral Foundation Models

Authors:Harshit Sikchi, Andrea Tirinzoni, Ahmed Touati, Yingchen Xu, Anssi Kanervisto, Scott Niekum, Amy Zhang, Alessandro Lazaric, Matteo Pirotta
Date:2025-04-10 16:14:17

Unsupervised zero-shot reinforcement learning (RL) has emerged as a powerful paradigm for pretraining behavioral foundation models (BFMs), enabling agents to solve a wide range of downstream tasks specified via reward functions in a zero-shot fashion, i.e., without additional test-time learning or planning. This is achieved by learning self-supervised task embeddings alongside corresponding near-optimal behaviors and incorporating an inference procedure to directly retrieve the latent task embedding and associated policy for any given reward function. Despite promising results, zero-shot policies are often suboptimal due to errors induced by the unsupervised training process, the embedding, and the inference procedure. In this paper, we focus on devising fast adaptation strategies to improve the zero-shot performance of BFMs in a few steps of online interaction with the environment while avoiding any performance drop during the adaptation process. Notably, we demonstrate that existing BFMs learn a set of skills containing more performant policies than those identified by their inference procedure, making them well-suited for fast adaptation. Motivated by this observation, we propose both actor-critic and actor-only fast adaptation strategies that search in the low-dimensional task-embedding space of the pre-trained BFM to rapidly improve the performance of its zero-shot policies on any downstream task. Importantly, our approach mitigates the initial "unlearning" phase commonly observed when fine-tuning pre-trained RL models. We evaluate our fast adaptation strategies on top of four state-of-the-art zero-shot RL methods in multiple navigation and locomotion domains. Our results show that they achieve 10-40% improvement over their zero-shot performance in a few tens of episodes, outperforming existing baselines.
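
The actor-only idea lends itself to a short sketch: a cross-entropy-style search over the latent task embedding z that never returns anything worse than the zero-shot policy. `bfm.policy(z)` and `evaluate(...)` are hypothetical stand-ins for the pretrained model and a few episodes of online return estimation.

```python
import numpy as np

def adapt(z0, bfm, evaluate, iters=10, pop=16, elite=4, sigma=0.2):
    """Search the BFM's latent task space around the zero-shot embedding z0."""
    mu = np.asarray(z0, dtype=float)
    std = sigma * np.ones_like(mu)
    best_z = mu.copy()
    best_ret = evaluate(bfm.policy(best_z))       # zero-shot baseline; never regress
    for _ in range(iters):
        cands = mu + std * np.random.randn(pop, mu.size)
        rets = np.array([evaluate(bfm.policy(z)) for z in cands])
        top = rets.argsort()[-elite:]             # keep the elite candidates
        mu, std = cands[top].mean(0), cands[top].std(0) + 1e-3
        if rets[top[-1]] > best_ret:
            best_ret, best_z = rets[top[-1]], cands[top[-1]].copy()
    return best_z
```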

A quantum computing approach to beam angle optimization

Authors:Nimita Shinde, Ya-Nan Zhu, Haozheng Shen, Hao Gao
Date:2025-04-10 15:24:37

Background: Beam angle optimization (BAO) is a critical component of radiation therapy (RT) treatment planning, where small changes in beam configuration can significantly impact treatment quality, especially for proton RT. Mathematically, BAO is a mixed integer programming (MIP) problem, which is NP-hard due to its exponentially growing search space. Traditional optimization techniques often struggle with computational efficiency, necessitating the development of novel approaches. Purpose: This study introduces QC-BAO, a hybrid quantum-classical approach that leverages quantum computing to solve the MIP formulation of BAO. Methods: The proposed approach, QC-BAO, models BAO as an MIP problem, incorporating binary variables for beam angle selection and continuous variables for optimizing spot intensities for proton therapy. It employs a hybrid quantum-classical framework, utilizing quantum computing to solve the binary decision component while integrating classical optimization techniques, including iterative convex relaxation and the alternating direction method of multipliers. Results: Computational experiments were conducted on clinical test cases to evaluate QC-BAO's performance against clinically verified angles and a heuristic approach, GS-BAO. QC-BAO demonstrated improved treatment plan quality over both the clinical angles and GS-BAO. The method consistently increased the conformity index (CI) for target coverage while reducing mean and maximum doses to organs-at-risk (OAR). Additionally, QC-BAO produced the lowest objective function value, confirming its superior optimization capability. Conclusions: The findings highlight the potential of quantum computing to enhance the solution of the BAO problem, as demonstrated by the improved plan quality achieved with the proposed method, QC-BAO. This study paves the way for future clinical implementation of quantum-accelerated optimization in RT.
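
The binary/continuous decomposition can be mimicked in a toy form: brute-force enumeration of the binary beam selection (standing in for the quantum/QUBO solver) wrapped around a nonnegative least-squares fit of the continuous spot intensities. Matrices and the beam budget are hypothetical.

```python
import itertools
import numpy as np
from scipy.optimize import nnls

def toy_bao(dose_mats, target, n_beams_allowed):
    """dose_mats: list of per-beam dose-influence matrices (voxels x spots).
    Enumerates beam subsets (a quantum solver's job in QC-BAO) and solves
    the inner continuous problem with nonnegative least squares."""
    best = (np.inf, None, None)
    for sel in itertools.combinations(range(len(dose_mats)), n_beams_allowed):
        A = np.hstack([dose_mats[i] for i in sel])   # columns of selected beams
        x, resid = nnls(A, target)                   # spot intensities >= 0
        if resid < best[0]:
            best = (resid, sel, x)
    return best   # (residual, chosen beams, intensities)
```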

Anytime Single-Step MAPF Planning with Anytime PIBT

Authors:Nayesha Gandotra, Rishi Veerapaneni, Muhammad Suhail Saleem, Daniel Harabor, Jiaoyang Li, Maxim Likhachev
Date:2025-04-10 15:21:23

PIBT is a popular Multi-Agent Path Finding (MAPF) method at the core of many state-of-the-art MAPF methods, including LaCAM, CS-PIBT, and WPPL. The main utility of PIBT is that it is a very fast and effective single-step MAPF solver that can return a collision-free single-step solution for hundreds of agents in less than a millisecond. However, the main drawback of PIBT is that it is extremely greedy with respect to its priorities and thus leads to poor solution quality. Additionally, PIBT cannot use all the planning time that might be available to it and returns the first solution it finds. We thus develop Anytime PIBT, which quickly finds an initial one-step solution, exactly as PIBT does, but then continuously improves the solution in an anytime manner. We prove that Anytime PIBT converges to the optimal solution given sufficient time. We experimentally validate that Anytime PIBT can rapidly improve single-step solution quality within milliseconds and even find the optimal single-step action. However, we interestingly find that improving the single-step solution quality does not have a significant effect on full-horizon solution costs.
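
The anytime contract itself is simple; a schematic loop (not the authors' implementation), with `initial_solution` and `improve` as hypothetical placeholders for the fast PIBT pass and the local improvement step:

```python
import time

def anytime_plan(agents, deadline_s, initial_solution, improve):
    """Return a first feasible single-step assignment fast, then keep
    improving it until the planning deadline expires."""
    solution = initial_solution(agents)      # fast greedy pass, as in PIBT
    t_end = time.monotonic() + deadline_s
    while time.monotonic() < t_end:
        better = improve(solution)           # e.g., local swaps of agents' actions
        if better is None:                   # converged / provably optimal
            break
        solution = better
    return solution                          # always valid, quality grows with time
```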

HarmonySeg: Tubular Structure Segmentation with Deep-Shallow Feature Fusion and Growth-Suppression Balanced Loss

Authors:Yi Huang, Ke Zhang, Wei Liu, Yuanyuan Wang, Vishal M. Patel, Le Lu, Xu Han, Dakai Jin, Ke Yan
Date:2025-04-10 15:04:42

Accurate segmentation of tubular structures in medical images, such as vessels and airway trees, is crucial for computer-aided diagnosis, radiotherapy, and surgical planning. However, significant challenges exist in algorithm design when faced with diverse sizes, complex topologies, and (often) incomplete data annotation of these structures. We address these difficulties by proposing a new tubular structure segmentation framework named HarmonySeg. First, we design a deep-to-shallow decoder network featuring flexible convolution blocks with varying receptive fields, which enables the model to effectively adapt to tubular structures of different scales. Second, to highlight potential anatomical regions and improve the recall of small tubular structures, we incorporate vesselness maps as auxiliary information. These maps are aligned with image features through a shallow-and-deep fusion module, which simultaneously eliminates unreasonable candidates to maintain high precision. Finally, we introduce a topology-preserving loss function that leverages contextual and shape priors to balance the growth and suppression of tubular structures, which also allows the model to handle low-quality and incomplete annotations. Extensive quantitative experiments are conducted on four public datasets. The results show that our model can accurately segment 2D and 3D tubular structures and outperform existing state-of-the-art methods. External validation on a private dataset also demonstrates good generalizability.
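
The vesselness input mentioned above can be produced with standard tools; a small sketch, assuming a 2D image and using scikit-image's Frangi filter (the paper's own fusion of this map with image features is learned, not a simple stack):

```python
import numpy as np
from skimage.filters import frangi

def with_vesselness(img):
    """Stack a Frangi vesselness map with the image as a second input channel."""
    v = frangi(img)                           # tubularity response
    v = (v - v.min()) / (np.ptp(v) + 1e-8)    # normalize to [0, 1]
    return np.stack([img, v], axis=0)         # (2, H, W) two-channel network input
```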

Siren Federate: Bridging document, relational, and graph models for exploratory graph analysis

Authors:Georgeta Bordea, Stephane Campinas, Matteo Catena, Renaud Delbru
Date:2025-04-10 14:52:03

Investigative workflows require interactive exploratory analysis on large heterogeneous knowledge graphs. Current databases show limitations in enabling such tasks. This paper discusses the architecture of Siren Federate, a system that efficiently supports exploratory graph analysis by bridging document-oriented, relational, and graph models. Technical contributions include distributed join algorithms, adaptive query planning, query plan folding, semantic caching, and semi-join decomposition for path queries. Semi-join decomposition addresses the exponential growth of intermediate results in path-based queries. Experiments show that Siren Federate exhibits low latency and scales well with the amount of data, the number of users, and the number of computing nodes.
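
Semi-join decomposition is easy to illustrate on a two-hop path query A -> B -> C: filter each relation by its neighbor's join keys before materializing full paths, keeping intermediate results small. A toy, in-memory sketch with made-up data (Siren Federate does this distributed, across shards):

```python
def semi_join(left, right, key_l, key_r):
    """Keep only the left rows whose join key appears on the right."""
    keys = {r[key_r] for r in right}
    return [l for l in left if l[key_l] in keys]

edges_ab = [{"a": 1, "b": 2}, {"a": 1, "b": 3}, {"a": 4, "b": 9}]
edges_bc = [{"b": 2, "c": 7}, {"b": 3, "c": 8}]

# reduce A->B by B->C first (semi-join), then join only the survivors
ab = semi_join(edges_ab, edges_bc, "b", "b")          # drops the dead-end (4, 9)
paths = [(x["a"], x["b"], y["c"])
         for x in ab for y in edges_bc if x["b"] == y["b"]]
```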

Counting Hours, Counting Losses: The Toll of Unpredictable Work Schedules on Financial Security

Authors:Pegah Nokhiz, Aravinda Kanchana Ruwanpathirana, Aditya Bhaskara, Suresh Venkatasubramanian
Date:2025-04-10 13:09:56

Financial instability has become a significant issue in today's society. While research typically focuses on financial aspects, there is a tendency to overlook time-related aspects of unstable work schedules. The inability to rely on consistent work schedules leads to burnout, work-family conflicts, and financial shocks that directly impact workers' income and assets. Unforeseen fluctuations in earnings pose challenges in financial planning, affecting decisions on savings and spending and ultimately undermining individuals' long-term financial stability and well-being. This issue is particularly evident among workers who experience frequently changing schedules without sufficient notice, including food service and retail workers, part-time and hourly workers, and individuals with lower incomes. These groups are already more financially vulnerable, and the unpredictable nature of their schedules exacerbates their financial fragility. Our objective is to understand how unforeseen fluctuations in earnings exacerbate financial fragility by investigating the extent to which individuals' financial management depends on their ability to anticipate and plan for the future. To address this question, we develop a simulation framework that models how individuals optimize utility amidst financial uncertainty and the imperative to avoid financial ruin. We employ online learning techniques, specifically adapting workers' consumption policies based on evolving information about their work schedules. With this framework, we show both theoretically and empirically how a worker's capacity to anticipate schedule changes enhances their long-term utility. Conversely, the inability to predict future events can worsen workers' instability. Moreover, our framework enables us to explore interventions to mitigate the problem of schedule uncertainty and evaluate their effectiveness.
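
As a hedged illustration of the online-learning ingredient, a multiplicative-weights (Hedge) learner choosing among a few consumption fractions under volatile income; all parameters and the utility/ruin penalty are illustrative, not the paper's calibration:

```python
import numpy as np

rng = np.random.default_rng(0)
actions = np.array([0.5, 0.7, 0.9])        # candidate consumption levels
w = np.ones(len(actions))                  # Hedge weights over the actions
wealth, eta = 10.0, 0.3

for t in range(200):
    p = w / w.sum()
    i = rng.choice(len(actions), p=p)
    income = rng.choice([0.4, 1.0, 1.6])   # unpredictable realized hours/income
    wealth += income - actions[i]          # consume against an expected income of 1.0
    # counterfactual utility of each action this period: log-utility of
    # consumption, with a heavy penalty for any action that would cause ruin
    would_be = wealth - (actions - actions[i])
    u = np.log(actions) - np.where(would_be < 0, 10.0, 0.0)
    w *= np.exp(eta * (u - u.max()))       # multiplicative-weights update
```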

Predicting the Lifespan of Industrial Printheads with Survival Analysis

Authors:Dan Parii, Evelyne Janssen, Guangzhi Tang, Charalampos Kouzinopoulos, Marcin Pietrasik
Date:2025-04-10 10:38:13

Accurately predicting the lifespan of critical device components is essential for maintenance planning and production optimization, making it a topic of significant interest in both academia and industry. In this work, we investigate the use of survival analysis for predicting the lifespan of production printheads developed by Canon Production Printing. Specifically, we focus on the application of five techniques to estimate survival probabilities and failure rates: the Kaplan-Meier estimator, Cox proportional hazards model, Weibull accelerated failure time model, random survival forest, and gradient boosting. The resulting estimates are further refined using isotonic regression and subsequently aggregated to determine the expected number of failures. The predictions are then validated against real-world ground truth data across multiple time windows to assess model reliability. Our quantitative evaluation using three performance metrics demonstrates that survival analysis outperforms industry-standard baseline methods for printhead lifespan prediction.
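
Two steps of such a pipeline can be sketched with real APIs: a Kaplan-Meier fit via lifelines, and isotonic smoothing of a (synthetic, possibly non-monotone) failure-probability curve via scikit-learn. The data here is simulated, not Canon's.

```python
import numpy as np
from lifelines import KaplanMeierFitter
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
T = rng.weibull(1.5, 500) * 1000           # synthetic printhead lifetimes (hours)
E = rng.random(500) < 0.7                  # True = failure observed, False = censored

kmf = KaplanMeierFitter().fit(T, event_observed=E)
t_grid = kmf.survival_function_.index.values

# a model's predicted failure curve F(t) can come out non-monotone; isotonic
# regression enforces that it never decreases, as in the paper's refinement step
F_pred = 1 - kmf.survival_function_.iloc[:, 0].values + rng.normal(0, 0.02, len(t_grid))
F_mono = IsotonicRegression(increasing=True, y_min=0.0, y_max=1.0).fit_transform(
    t_grid, F_pred)
```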

Joint Travel Route Optimization Framework for Platooning

Authors:Akif Adas, Stefano Arrigoni, Mattia Brambilla, Monica Barbara Nicoli, Edoardo Sabbioni
Date:2025-04-10 10:13:20

Platooning represents an advanced driving technology designed to assist drivers in traffic convoys of varying lengths, enhancing road safety, reducing driver fatigue, and improving fuel efficiency. Sophisticated automated driving assistance systems have facilitated this innovation. Recent advancements in platooning emphasize cooperative mechanisms within both centralized and decentralized architectures enabled by vehicular communication technologies. This study introduces a cooperative route planning optimization framework aimed at promoting the adoption of platooning through a centralized platoon formation strategy at the system level. This approach is envisioned as a transitional phase from individual (ego) driving to fully collaborative driving. Additionally, this research formulates and incorporates travel cost metrics related to fuel consumption, driver fatigue, and travel time, considering regulatory constraints on consecutive driving durations. The performance of these cost metrics has been evaluated using Dijkstra's and A* shortest path algorithms within a network graph framework. The results indicate that the proposed architecture achieves an average cost improvement of 14% compared to individual route planning for long road trips.
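
The composite cost idea maps directly onto standard graph search; a minimal sketch with invented edge attributes and weightings, using networkx's support for callable edge weights:

```python
import networkx as nx

G = nx.DiGraph()
G.add_edge("A", "B", fuel=4.0, time=1.0, fatigue=0.5)
G.add_edge("B", "C", fuel=3.0, time=2.0, fatigue=0.8)
G.add_edge("A", "C", fuel=9.0, time=2.5, fatigue=1.5)

def cost(u, v, d):
    # hypothetical weighting of fuel consumption, travel time, and driver fatigue
    return 1.0 * d["fuel"] + 2.0 * d["time"] + 3.0 * d["fatigue"]

route = nx.dijkstra_path(G, "A", "C", weight=cost)   # callable weights are supported
```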

Efficient Swept Volume-Based Trajectory Generation for Arbitrary-Shaped Ground Robot Navigation

Authors:Yisheng Li, Longji Yin, Yixi Cai, Jianheng Liu, Haotian Li, Fu Zhang
Date:2025-04-10 08:34:34

Navigating an arbitrary-shaped ground robot safely in cluttered environments remains a challenging problem. Existing trajectory planners that account for the robot's physical geometry suffer from intractable runtimes. To achieve both computational efficiency and Continuous Collision Avoidance (CCA) for arbitrary-shaped ground robot planning, we propose a novel coarse-to-fine navigation framework that significantly accelerates planning. In the first stage, a sampling-based method selectively generates distinct topological paths that guarantee a minimum inflated margin. In the second stage, a geometry-aware front-end strategy is designed to discretize these topologies into full-state robot motion sequences while concurrently partitioning the paths into SE(2) sub-problems and simpler R2 sub-problems for back-end optimization. In the final stage, an SVSDF-based optimizer generates trajectories tailored to these sub-problems and seamlessly splices them into a continuous final motion plan. Extensive benchmark comparisons show that the proposed method is one to several orders of magnitude faster than cutting-edge methods in runtime while maintaining a high planning success rate and ensuring CCA.

Drive in Corridors: Enhancing the Safety of End-to-end Autonomous Driving via Corridor Learning and Planning

Authors:Zhiwei Zhang, Ruichen Yang, Ke Wu, Zijun Xu, Jingchu Liu, Lisen Mu, Zhongxue Gan, Wenchao Ding
Date:2025-04-10 07:10:40

Safety remains one of the most critical challenges in autonomous driving systems. In recent years, end-to-end driving has shown great promise in advancing vehicle autonomy in a scalable manner. However, existing approaches often face safety risks due to the lack of explicit behavior constraints. To address this issue, we uncover a new paradigm by introducing the corridor as the intermediate representation. Widely adopted in robotics planning, a corridor represents a spatio-temporal obstacle-free zone for the vehicle to traverse. To ensure accurate corridor prediction in diverse traffic scenarios, we develop a comprehensive learning pipeline including data annotation, architecture refinement, and loss formulation. The predicted corridor is then integrated as a constraint in a trajectory optimization process. By extending the differentiability of the optimization, we enable the optimized trajectory to be seamlessly trained within the end-to-end learning framework, improving both safety and interpretability. Experimental results on the nuScenes dataset demonstrate state-of-the-art performance of our approach, showing a 66.7% reduction in collisions with agents and a 46.5% reduction with curbs, significantly enhancing the safety of end-to-end driving. Additionally, incorporating the corridor contributes to higher success rates in closed-loop evaluations.
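
The corridor constraint has a simple geometric core; a toy rendering with per-timestep axis-aligned boxes (the paper learns the corridors and differentiates through the optimizer, neither of which is shown here):

```python
import numpy as np

def in_corridor(traj, boxes):
    """traj: (T, 2) xy points; boxes: per-step (lo, hi) corner pairs."""
    return all(np.all(lo <= p) and np.all(p <= hi)
               for p, (lo, hi) in zip(traj, boxes))

def project_to_corridor(traj, boxes):
    """Clip each trajectory point into its spatio-temporal box."""
    return np.array([np.clip(p, lo, hi) for p, (lo, hi) in zip(traj, boxes)])
```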

Bottleneck Identification in Resource-Constrained Project Scheduling via Constraint Relaxation

Authors:Lukáš Nedbálek, Antonín Novák
Date:2025-04-10 06:53:10

In realistic production scenarios, Advanced Planning and Scheduling (APS) tools often require manual intervention by production planners, as the system works with incomplete information, resulting in suboptimal schedules. Often, a preferable solution is not found simply because the constraints specifying the optimization problem are too restrictive, representing bottlenecks in the schedule. To provide computer-assisted support for decision-making, we aim to automatically identify bottlenecks in a given schedule while linking them to the particular constraints to be relaxed. In this work, we address the problem of reducing the tardiness of a particular project in an obtained schedule for the resource-constrained project scheduling problem by relaxing constraints related to the identified bottlenecks. We develop two methods for this purpose. The first method adapts existing approaches from the job shop literature and utilizes them for so-called untargeted relaxations. The second method identifies potential improvements in relaxed versions of the problem and proposes targeted relaxations. Surprisingly, the untargeted relaxations yield improvements comparable to the targeted relaxations.
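
The relax-and-rank loop can be summarized in a few lines; `solve` and `problem.relax` are hypothetical stand-ins for an RCPSP solver and a constraint-relaxation operation:

```python
def rank_bottlenecks(problem, candidate_constraints, solve):
    """Relax each candidate constraint in isolation, re-solve, and rank
    constraints by how much the project's tardiness improves."""
    base = solve(problem).tardiness
    gains = []
    for c in candidate_constraints:
        relaxed = problem.relax(c)            # e.g., raise one resource capacity
        gains.append((base - solve(relaxed).tardiness, c))
    return sorted(gains, reverse=True)        # biggest tardiness reduction first
```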

Traversal Learning Coordination For Lossless And Efficient Distributed Learning

Authors:Erdenebileg Batbaatar, Jeonggeol Kim, Yongcheol Kim, Young Yoon
Date:2025-04-10 05:48:57

In this paper, we introduce Traversal Learning (TL), a novel approach designed to address the problem of decreased quality encountered in popular distributed learning (DL) paradigms such as Federated Learning (FL), Split Learning (SL), and SplitFed Learning (SFL). Traditional FL suffers an accuracy drop during aggregation due to its averaging function, while SL and SFL face increased loss due to the independent gradient updates on each split network. TL adopts a unique strategy where the model traverses the nodes during forward propagation (FP) and performs backward propagation (BP) on the orchestrator, effectively implementing centralized learning (CL) principles within a distributed environment. The orchestrator is tasked with generating virtual batches and planning the sequential node visits of the model during FP, aligning them with the ordered index of the data within these batches. We conducted experiments on six datasets representing diverse characteristics across various domains. Our evaluation demonstrates that TL is on par with classic CL approaches in terms of accurate inference, thereby offering a viable and robust solution for DL tasks. TL outperformed other DL methods and improved accuracy by 7.85% for independent and identically distributed (IID) datasets, macro F1-score by 1.06% for non-IID datasets, accuracy by 2.60% for text classification, and AUC by 3.88% and 4.54% for medical and financial datasets, respectively. By effectively preserving data privacy while maintaining performance, TL represents a significant advancement in DL methodologies.
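
A conceptual sketch of TL's control flow, with `nodes`, `orchestrator`, and the virtual-batch format as hypothetical stand-ins: forward propagation visits the data-holding nodes in the planned order, while backpropagation runs centrally on the orchestrator:

```python
def traversal_step(model, orchestrator, nodes, virtual_batch):
    """virtual_batch: ordered (node_id, data_indices) pairs planned by the
    orchestrator. FP happens on each node; BP happens centrally."""
    activations, labels = [], []
    for node_id, indices in virtual_batch:            # planned visit order
        out, y = nodes[node_id].forward(model, indices)  # FP stays on the node
        activations.append(out)
        labels.append(y)
    loss = orchestrator.loss(activations, labels)     # BP on the orchestrator,
    orchestrator.backward(model, loss)                # mirroring centralized learning
    return loss
```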

Robust Social Planning

Authors:Florian Mudekereza
Date:2025-04-10 02:52:35

This paper analyzes a society composed of individuals who have diverse sets of beliefs (or models) and diverse tastes (or utility functions). It characterizes the model selection process of a social planner who wishes to aggregate individuals' beliefs and tastes but is concerned that their beliefs are misspecified (or distorted). A novel impossibility result emerges: a utilitarian social planner who seeks robustness to misspecification never aggregates individuals' beliefs but instead behaves systematically as a dictator by selecting a single individual's belief. This tension between robustness and aggregation exists because aggregation yields policy-contingent beliefs, which are very sensitive to policy outcomes. Restoring the possibility of belief aggregation requires individuals to have heterogeneous tastes and some common beliefs. This analysis reveals that misspecification has significant economic implications for welfare aggregation. These implications are illustrated in treatment choice, asset pricing, and dynamic macroeconomics.

PROPEL: Supervised and Reinforcement Learning for Large-Scale Supply Chain Planning

Authors:Vahid Eghbal Akhlaghi, Reza Zandehshahvar, Pascal Van Hentenryck
Date:2025-04-10 02:04:29

This paper considers how to fuse Machine Learning (ML) and optimization to solve large-scale Supply Chain Planning (SCP) optimization problems. These problems can be formulated as MIP models which feature both integer (non-binary) and continuous variables, as well as flow balance and capacity constraints. This raises fundamental challenges for existing integrations of ML and optimization that have focused on binary MIPs and graph problems. To address these, the paper proposes PROPEL, a new framework that combines optimization with both supervised and Deep Reinforcement Learning (DRL) to significantly reduce the size of the search space. PROPEL uses supervised learning, not to predict the values of all integer variables, but to identify the variables that are fixed to zero in the optimal solution, leveraging the structure of SCP applications. PROPEL includes a DRL component that selects which fixed-at-zero variables must be relaxed to improve solution quality when the supervised learning step does not produce a solution with the desired optimality tolerance. PROPEL has been applied to industrial supply chain planning optimizations with millions of variables. The computational results show dramatic improvements in solution times and quality, including a 60% reduction in primal integral and an 88% primal gap reduction, and improvement factors of up to 13.57 and 15.92, respectively.
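
The supervised stage reduces to a fix-and-solve pattern; a sketch under stated assumptions, where `clf`, `features`, and `solve_reduced` are hypothetical stand-ins for the trained classifier, the feature extractor, and a MIP solver on the reduced problem:

```python
def propel_supervised(mip, clf, features, solve_reduced, threshold=0.9):
    """Fix the integer variables predicted to be zero at the optimum,
    then solve the (much smaller) remaining MIP."""
    p_zero = clf.predict_proba(features(mip))[:, 1]   # P(variable is 0 at optimum)
    fixed = [v for v, p in zip(mip.int_vars, p_zero) if p > threshold]
    sol = solve_reduced(mip, fixed_to_zero=fixed)
    # in PROPEL, a DRL stage later decides which fixed variables to unfix
    # when the optimality tolerance is not met
    return sol, fixed
```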

Bridging Deep Reinforcement Learning and Motion Planning for Model-Free Navigation in Cluttered Environments

Authors:Licheng Luo, Mingyu Cai
Date:2025-04-09 21:19:51

Deep Reinforcement Learning (DRL) has emerged as a powerful model-free paradigm for learning optimal policies. However, in real-world navigation tasks, DRL methods often suffer from insufficient exploration, particularly in cluttered environments with sparse rewards or complex dynamics under system disturbances. To address this challenge, we bridge general graph-based motion planning with DRL, enabling agents to explore cluttered spaces more effectively and achieve desired navigation performance. Specifically, we design a dense reward function grounded in a graph structure that spans the entire state space. This graph provides rich guidance, steering the agent toward optimal strategies. We validate our approach in challenging environments, demonstrating substantial improvements in exploration efficiency and task success rates. The project website is available at: https://plen1lune.github.io/overcome_exploration/
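
A concrete toy version of such a graph-grounded dense reward: BFS distances to the goal on a grid graph define a potential, and the per-step reward is the decrease in that potential (the paper's graph construction and reward-shaping details differ):

```python
from collections import deque

def bfs_dist(free, goal):
    """free: set of traversable (x, y) cells; returns distance-to-goal map."""
    dist, q = {goal: 0}, deque([goal])
    while q:
        x, y = q.popleft()
        for n in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if n in free and n not in dist:
                dist[n] = dist[(x, y)] + 1
                q.append(n)
    return dist

def dense_reward(dist, s, s_next):
    return dist[s] - dist[s_next]      # positive when moving toward the goal
```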

Analysis of the Unscented Transform for Cooperative Localization with Ranging-Only Information

Authors:Uthman Olawoye, Cagri Kilic, Jason N Gross
Date:2025-04-09 19:29:16

Cooperative localization in multi-agent robotic systems is challenging, especially when agents rely on limited information, such as only peer-to-peer range measurements. Three key challenges arise: utilizing this limited information to improve position estimation; handling uncertainties from sensor noise, nonlinearity, and unknown correlations between agents' measurements; and avoiding information reuse. This paper examines the use of the Unscented Transform (UT) for state estimation in a case in which only range measurements between agents are available and covariance intersection (CI) is used to handle unknown correlations. Unlike Kalman filter approaches, CI methods fuse complete state and covariance estimates, which makes formulating a CI approach with ranging-only measurements a challenge. To overcome this, the UT is used to handle uncertainties and to formulate a cooperative state update using range measurements and current cooperative state estimates. This introduces information reuse in the measurement update. This work therefore aims to evaluate the limitations and utility of this formulation when faced with various levels of state measurement uncertainty and errors.
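
For reference, the unscented transform itself, in the standard sigma-point form (the paper's contribution lies in how this is combined with CI for the ranging update, which is not shown):

```python
import numpy as np

def unscented_transform(mu, P, f, alpha=1e-3, beta=2.0, kappa=0.0):
    """Propagate a Gaussian (mu, P) through a nonlinearity f, e.g. a range
    measurement model, and recover the transformed mean and covariance."""
    n = len(mu)
    lam = alpha**2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * P)
    sigma = np.vstack([mu, mu + S.T, mu - S.T])           # 2n + 1 sigma points
    Wm = np.full(2 * n + 1, 0.5 / (n + lam)); Wm[0] = lam / (n + lam)
    Wc = Wm.copy(); Wc[0] += 1 - alpha**2 + beta
    Y = np.array([f(s) for s in sigma])                   # propagated points
    y_mu = Wm @ Y
    y_cov = sum(w * np.outer(y - y_mu, y - y_mu) for w, y in zip(Wc, Y))
    return y_mu, y_cov
```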

Neural Motion Simulator: Pushing the Limit of World Models in Reinforcement Learning

Authors:Chenjie Hao, Weyl Lu, Yifan Xu, Yubei Chen
Date:2025-04-09 17:59:32

An embodied system must not only model the patterns of the external world but also understand its own motion dynamics. A motion dynamics model is essential for efficient skill acquisition and effective planning. In this work, we introduce the neural motion simulator (MoSim), a world model that predicts the future physical state of an embodied system based on current observations and actions. MoSim achieves state-of-the-art performance in physical state prediction and provides competitive performance across a range of downstream tasks. This work shows that when a world model is accurate enough and performs precise long-horizon predictions, it can facilitate efficient skill acquisition in imagined worlds and even enable zero-shot reinforcement learning. Furthermore, MoSim can transform any model-free reinforcement learning (RL) algorithm into a model-based approach, effectively decoupling physical environment modeling from RL algorithm development. This separation allows for independent advancements in RL algorithms and world modeling, significantly improving sample efficiency and enhancing generalization capabilities. Our findings highlight that world models for motion dynamics are a promising direction for developing more versatile and capable embodied systems.
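
The core interface of such a world model is compact; a bare-bones, purely illustrative sketch of a residual next-state predictor and an imagined rollout (MoSim's actual architecture and training are more involved):

```python
import torch
import torch.nn as nn

class DynamicsModel(nn.Module):
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim))

    def forward(self, s, a):
        return s + self.net(torch.cat([s, a], dim=-1))   # residual next-state

def rollout(model, s0, policy, horizon):
    """Roll the learned dynamics forward to produce an imagined trajectory,
    which a model-free RL algorithm can then train on."""
    s, traj = s0, []
    for _ in range(horizon):
        a = policy(s)
        s = model(s, a)                                  # imagined transition
        traj.append((s, a))
    return traj
```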

AssistanceZero: Scalably Solving Assistance Games

Authors:Cassidy Laidlaw, Eli Bronstein, Timothy Guo, Dylan Feng, Lukas Berglund, Justin Svegliato, Stuart Russell, Anca Dragan
Date:2025-04-09 17:59:03

Assistance games are a promising alternative to reinforcement learning from human feedback (RLHF) for training AI assistants. Assistance games resolve key drawbacks of RLHF, such as incentives for deceptive behavior, by explicitly modeling the interaction between assistant and user as a two-player game where the assistant cannot observe their shared goal. Despite their potential, assistance games have only been explored in simple settings. Scaling them to more complex environments is difficult because it requires both solving intractable decision-making problems under uncertainty and accurately modeling human users' behavior. We present the first scalable approach to solving assistance games and apply it to a new, challenging Minecraft-based assistance game with over $10^{400}$ possible goals. Our approach, AssistanceZero, extends AlphaZero with a neural network that predicts human actions and rewards, enabling it to plan under uncertainty. We show that AssistanceZero outperforms model-free RL algorithms and imitation learning in the Minecraft-based assistance game. In a human study, our AssistanceZero-trained assistant significantly reduces the number of actions participants take to complete building tasks in Minecraft. Our results suggest that assistance games are a tractable framework for training effective AI assistants in complex environments. Our code and models are available at https://github.com/cassidylaidlaw/minecraft-building-assistance-game.
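
The architectural change described (extra prediction heads for the human's actions and for rewards, alongside the usual AlphaZero policy/value heads) can be sketched schematically; the trunk and dimensions below are illustrative assumptions:

```python
import torch.nn as nn

class AssistantNet(nn.Module):
    def __init__(self, obs_dim, n_actions, n_human_actions, hidden=256):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.policy = nn.Linear(hidden, n_actions)               # AlphaZero head
        self.value = nn.Linear(hidden, 1)                        # AlphaZero head
        self.human_action = nn.Linear(hidden, n_human_actions)   # added head
        self.reward = nn.Linear(hidden, 1)                       # added head

    def forward(self, obs):
        h = self.trunk(obs)
        # the search uses the human-action and reward predictions to plan
        # under uncertainty about the shared goal
        return self.policy(h), self.value(h), self.human_action(h), self.reward(h)
```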

Self-Steering Language Models

Authors:Gabriel Grand, Joshua B. Tenenbaum, Vikash K. Mansinghka, Alexander K. Lew, Jacob Andreas
Date:2025-04-09 17:54:22

While test-time reasoning enables language models to tackle complex tasks, searching or planning in natural language can be slow, costly, and error-prone. But even when LMs struggle to emulate the precise reasoning steps needed to solve a problem, they often excel at describing its abstract structure--both how to verify solutions and how to search for them. This paper introduces DisCIPL, a method for "self-steering" LMs where a Planner model generates a task-specific inference program that is executed by a population of Follower models. Our approach equips LMs with the ability to write recursive search procedures that guide LM inference, enabling new forms of verifiable and efficient reasoning. When instantiated with a small Follower (e.g., Llama-3.2-1B), DisCIPL matches (and sometimes outperforms) much larger models, including GPT-4o and o1, on challenging constrained generation tasks. In decoupling planning from execution, our work opens up a design space of highly parallelized Monte Carlo inference strategies that outperform standard best-of-N sampling, require no finetuning, and can be implemented automatically by existing LMs.
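
At a very high level, the Planner/Follower split looks like the following; `planner`, `follower`, and the program object are hypothetical stand-ins, and real DisCIPL programs encode recursive search procedures rather than a single propose/verify pair:

```python
def self_steer(task, planner, follower, n_followers=8):
    """The Planner writes a task-specific inference program; Followers
    execute it in parallel; only candidates passing its verifier survive."""
    program = planner.generate_program(task)          # e.g., propose/verify steps
    candidates = [follower.run(program.propose, task) for _ in range(n_followers)]
    return [c for c in candidates if program.verify(c)]
```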

Microlensing at Cosmological Distances: Event Rate Predictions in the Warhol Arc of MACS 0416

Authors:J. M. Palencia, J. M. Diego, L. Dai, M. Pascale, R. Windhorst, A. M. Koekemoer, Sung Kei Li, B. J. Kavanagh, Fengwu Sun, Amruth Alfred, Ashish K. Meena, Thomas J. Broadhurst, Patrick L. Kelly, Derek Perera, Hayley Williams, Adi Zitrin
Date:2025-04-09 16:54:22

Highly magnified stars ($\mu > 100$) are now routinely identified as transient events at cosmological distances thanks to microlensing by intra-cluster stars near the critical curves of galaxy clusters. Using the {\it James Webb} Space Telescope (JWST) in combination with the {\it Hubble} Space Telescope (HST), we outline here an analytical framework that is applied to the Warhol arc (at $z=0.94$) in the MACS 0416 galaxy cluster (at $z=0.396$), where over a dozen microlensed stars have been detected to date. This method is general and can be applied to other lensed arcs. Within this lensed galaxy we fit the spatially resolved SED spanned by eight JWST-NIRCam filters combined with three ACS filters, for accurate lensed-star predictions in 2D. With this tool we can generate 2D maps of microlensed stars for well-resolved arcs in general, including the dependence on wavelength and limiting apparent magnitude, for comparison with planned cadenced campaigns for JWST and Hubble and for directly constraining the IMF and the level of dark matter substructure.

UAV Position Estimation using a LiDAR-based 3D Object Detection Method

Authors:Uthman Olawoye, Jason N. Gross
Date:2025-04-09 16:43:59

This paper explores the application of a deep learning approach for 3D object detection to compute the relative position of an Unmanned Aerial Vehicle (UAV) from an Unmanned Ground Vehicle (UGV) equipped with a LiDAR sensor in a GPS-denied environment. This was achieved by evaluating the LiDAR sensor's data with a 3D detection algorithm (PointPillars). The PointPillars algorithm incorporates a column voxel point-cloud representation and a 2D Convolutional Neural Network (CNN) to generate distinctive point-cloud features representing the object to be identified, in this case, the UAV. The current localization method utilizes point-cloud segmentation, Euclidean clustering, and predefined heuristics to obtain the relative position of the UAV. Results from the two methods were then compared to a reference truth solution.
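
The heuristic baseline described (Euclidean clustering plus a centroid) is only a few lines with standard tools; a compact sketch with illustrative DBSCAN parameters:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def uav_position(points: np.ndarray):
    """points: (N, 3) LiDAR returns after ground/background removal.
    Cluster the returns and take the largest cluster's centroid as the
    UAV's relative position."""
    labels = DBSCAN(eps=0.5, min_samples=10).fit_predict(points)
    best = max(set(labels) - {-1}, key=lambda l: (labels == l).sum())
    return points[labels == best].mean(axis=0)     # cluster centroid
```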

Leveraging GCN-based Action Recognition for Teleoperation in Daily Activity Assistance

Authors:Thomas M. Kwok, Jiaan Li, Yue Hu
Date:2025-04-09 16:14:55

Caregiving of older adults is an urgent global challenge, with many older adults preferring to age in place rather than enter residential care. However, providing adequate home-based assistance remains difficult, particularly in geographically vast regions. Teleoperated robots offer a promising solution, but conventional motion-mapping teleoperation imposes unnatural movement constraints on operators, leading to muscle fatigue and reduced usability. This paper presents a novel teleoperation framework that leverages action recognition to enable intuitive remote robot control. Using our simplified Spatio-Temporal Graph Convolutional Network (S-ST-GCN), the system recognizes human actions and executes corresponding preset robot trajectories, eliminating the need for direct motion synchronization. A finite-state machine (FSM) is integrated to enhance reliability by filtering out misclassified actions. Our experiments demonstrate that the proposed framework enables effortless operator movement while ensuring accurate robot execution. This proof-of-concept study highlights the potential of teleoperation with action recognition for enabling caregivers to remotely assist older adults during activities of daily living (ADLs). Future work will focus on improving the S-ST-GCN's recognition accuracy and generalization, integrating advanced motion planning techniques to further enhance robotic autonomy in older adult care, and conducting a user study to evaluate the system's telepresence and ease of control.
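
The FSM's filtering role can be illustrated with a debounce-style rule: an action label must persist for k consecutive frames before the corresponding robot trajectory is triggered, suppressing one-off misclassifications. The threshold is illustrative:

```python
def fsm_filter(labels, k=5):
    """labels: per-frame action predictions (e.g., from an S-ST-GCN).
    Trigger an action only once it has been stable for k frames."""
    state, count, triggered = None, 0, []
    for lab in labels:
        count = count + 1 if lab == state else 1
        state = lab
        if count == k:                       # stable for k frames -> execute once
            triggered.append(lab)
    return triggered
```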

Optimal promotions of new products on networks

Authors:Gadi Fibich, Amit Golan
Date:2025-04-09 15:23:10

We present a novel methodology for analyzing the optimal promotion in the Bass model for the spreading of new products on networks. For general networks with $M$ nodes, the optimal promotion is the solution of $2^M-1$ nonlinearly-coupled boundary-value problems. On structured networks, however, the number of equations can be reduced to a manageable size which is amenable to simulation and analysis. This enables us to gain insight into the effect of the network structure on optimal promotions. We find that the optimal advertising strategy decreases with time, whereas the optimal boosting of peer effects increases from zero and then decreases. In low-degree networks, it is optimal to prioritize advertising over boosting peer effects, but this relation is flipped in high-degree networks. When the planning horizon is finite, the optimal promotion continues until the last minute, as opposed to an infinite planning horizon where the optimal promotion decays to zero. Finally, promotions with short planning horizons can yield an order-of-magnitude higher increase in profits compared to those with long planning horizons.
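
For intuition, a single-node Bass model with time-varying promotion, where p(t) is advertising and q(t) the boosted peer effect; the schedules below merely mimic the qualitative shapes found optimal (advertising decreasing, peer boosting rising from zero then decaying), and the network version couples $2^M-1$ such equations:

```python
import numpy as np

def bass(p_sched, q_sched, dt=0.01, T=10.0):
    """Forward-Euler integration of dF/dt = (p(t) + q(t) F)(1 - F),
    with F the adopter fraction."""
    steps = int(T / dt)
    F = np.zeros(steps)
    for i in range(steps - 1):
        t = i * dt
        F[i + 1] = F[i] + dt * (p_sched(t) + q_sched(t) * F[i]) * (1 - F[i])
    return F

adoption = bass(lambda t: 0.03 * np.exp(-0.3 * t),      # decaying advertising
                lambda t: 0.4 * t * np.exp(-0.5 * t))   # boost rises, then decays
```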

Understanding The Effects of Geotechnical Properties on Viscous Erosion Rate from Plume Surface Interactions

Authors:B. Dotson, A. St. John, R. Hall, D. Sapkota, D. Britt, P. Metzger
Date:2025-04-09 14:51:21

With humans returning to the Moon under the Artemis program, understanding and mitigating effects from Plume Surface Interactions (PSI) will be essential for the protection of personnel and equipment on the Moon. To help characterize the underlying mechanics associated with viscous erosion and crater formation, experimental measurements using regolith simulants and subsonic, non-reacting flows were completed using compressed air in a splitter-plate, plume-cratering setup. More specifically, these investigations examined the underlying effects of bulk density, cohesion, and exhaust flow characteristics on viscous erosion rates and crater formation using Lunar highlands simulant (LHS-1), Lunar mare simulant (LMS-1), and LHS-1D (Dust) simulants, as well as 40-80 μm glass beads, in atmosphere. Results show that particle size distribution can ultimately influence crater shapes and erosion rates, likely owing to the internal angle of friction. Measurements show that increasing bulk density, especially from an uncompacted to a slightly compacted state, decreases the erosion rate by as much as 50%. While the cohesion of granular material can mitigate erosion rates to some extent, higher levels of cohesion above 1,000 Pa may actually increase viscous erosion rates due to particle clumping. A modified version of Metzger's (2024a) equation for volumetric erosion rate is presented, with its limitations noted. These modified equations show that geotechnical properties play an important role in viscous erosion and should be considered in PSI computer models for future mission planning.

Longitudinal Assessment of Lung Lesion Burden in CT

Authors:Tejas Sudharshan Mathai, Benjamin Hou, Ronald M. Summers
Date:2025-04-09 14:30:43

In the U.S., lung cancer is the second leading cause of death. Early detection of suspicious lung nodules is crucial for patient treatment planning, management, and improving outcomes. Many approaches for lung nodule segmentation and volumetric analysis have been proposed, but few have looked at longitudinal changes in total lung tumor burden. In this work, we trained two 3D models (nnUNet) with and without anatomical priors to automatically segment lung lesions and quantified total lesion burden for each patient. The 3D model without priors significantly outperformed ($p < .001$) the model trained with anatomy priors. For detecting clinically significant lesions $>$ 1cm, a precision of 71.3\%, sensitivity of 68.4\%, and F1-score of 69.8\% were achieved. For segmentation, a Dice score of 77.1 $\pm$ 20.3 and a Hausdorff distance error of 11.7 $\pm$ 24.1 mm were obtained. The median lesion burden was 6.4 cc (IQR: 2.1, 18.1) and the median volume difference between manual and automated measurements was 0.02 cc (IQR: -2.8, 1.2). Agreements were also evaluated with linear regression and Bland-Altman plots. The proposed approach can produce a personalized evaluation of the total tumor burden for a patient and facilitate interval change tracking over time.

Combining high-contrast imaging with high-resolution spectroscopy: Actual on-sky MIRI/MRS results compared to expectations

Authors:S. Martos, A. Bidot, A. Carlotti, D. Mouillet
Date:2025-04-09 13:51:20

CONTEXT: Combining high-contrast imaging with high-resolution spectroscopy offers a powerful way to detect and characterize exoplanets around nearby stars, despite challenges linked to their faintness. Instruments like VLT/SPHERE are state of the art in high-contrast imaging, but their spectral resolution (R=50) limits them to basic characterization of close companions. These systems can detect planets down to 5-10 Mjup at 10 AU from their stars. Detection limits are mainly constrained by speckle noise, which dominates over photon and detector noise at short separations, even with advanced differential imaging. Space-based high-contrast imaging is also limited by image stability. Speckle noise can, however, be mitigated through molecular mapping, a technique that leverages high-resolution spectroscopic data. AIMS: We aim to predict detection limits in spectro-imaging after molecular mapping, analyzing how photon and detector noise propagate and comparing predictions with real data to assess performance losses from instrumental effects. We also propose mitigation strategies and validate our model using observations. METHODS: We analyzed JWST/MIRI/MRS data with FastCurves, a numerical tool, and compared the results to outputs from the MIRI simulator. We also applied principal component analysis (PCA) to identify and isolate systematic effects, with and without molecular mapping. RESULTS: We studied various systematic effects and their impacts on signal and noise. PCA helped highlight and reduce straylight, fringes, and aliasing. We further compared observed and modeled companion spectra. CONCLUSIONS: FastCurves was improved to account for systematics and validated with real data. In high-flux regimes, systematics impose contrast limits even with molecular mapping. Our approach could benefit other instruments and inform the planning of future facilities like ELT/ANDES and ELT/PCS.
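
The PCA step has a generic low-rank-subtraction form; a sketch assuming a (spaxels x wavelengths) spectral cube, with the number of removed modes as a free, illustrative choice:

```python
import numpy as np
from sklearn.decomposition import PCA

def remove_systematics(cube, n_modes=5):
    """cube: (n_spaxels, n_wavelengths) spectra. Fit principal components
    across spectra and subtract the leading modes, which tend to capture
    straylight/fringing-like structure shared across spaxels."""
    pca = PCA(n_components=n_modes)
    amplitudes = pca.fit_transform(cube)            # per-spectrum mode amplitudes
    systematics = pca.inverse_transform(amplitudes) # low-rank reconstruction
    return cube - systematics + pca.mean_           # keep the mean spectrum
```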

A Dataset of Software Bill of Materials for Evaluating SBOM Consumption Tools

Authors:Rio Kishimoto, Tetsuya Kanda, Yuki Manabe, Katsuro Inoue, Shi Qiu, Yoshiki Higo
Date:2025-04-09 13:35:02

A Software Bill of Materials (SBOM) is becoming an essential tool for effective software dependency management. An SBOM is a list of components used in software, including details such as component names, versions, and licenses. Using SBOMs, developers can quickly identify software components and assess whether their software depends on vulnerable libraries. Numerous tools support software dependency management through SBOMs, which can be broadly categorized into two types: tools that generate SBOMs and tools that utilize SBOMs. A substantial collection of accurate SBOMs is required to evaluate tools that utilize SBOMs. However, there is no publicly available dataset specifically designed for this purpose, and research on SBOM consumption tools remains limited. In this paper, we present a dataset of SBOMs to address this gap. The dataset we constructed comprises 46 SBOMs generated from real-world Java projects, with plans to expand it to include a broader range of projects across various programming languages. Accurate and well-structured SBOMs enable researchers to evaluate the functionality of SBOM consumption tools and identify potential issues. We collected 3,271 Java projects from GitHub and generated SBOMs for 798 of them using Maven with an open-source SBOM generation tool. These SBOMs were refined through both automatic and manual corrections to ensure accuracy, currently resulting in 46 SBOMs that comply with the SPDX Lite profile, which defines minimal requirements tailored to practical workflows in industry. This process also revealed issues with the SBOM generation tools themselves. The dataset is publicly available on Zenodo (DOI: 10.5281/zenodo.14233415).

Inducing Programmatic Skills for Agentic Tasks

Authors:Zora Zhiruo Wang, Apurva Gandhi, Graham Neubig, Daniel Fried
Date:2025-04-09 12:25:37

To succeed in common digital tasks such as web navigation, agents must carry out a variety of specialized tasks such as searching for products or planning a travel route. To tackle these tasks, agents can bootstrap themselves by learning task-specific skills online through interaction with the web environment. In this work, we demonstrate that programs are an effective representation for skills. We propose agent skill induction (ASI), which allows agents to adapt themselves by inducing, verifying, and utilizing program-based skills on the fly. We start with an evaluation on the WebArena agent benchmark and show that ASI outperforms the static baseline agent and its text-skill counterpart by 23.5% and 11.3% in success rate, mainly thanks to the programmatic verification guarantee during the induction phase. ASI also improves efficiency by reducing 10.7-15.3% of the steps over baselines, by composing primitive actions (e.g., click) into higher-level skills (e.g., search product). We then highlight the efficacy of ASI in remaining efficient and accurate under scaled-up web activities. Finally, we examine the generalizability of induced skills when transferring between websites, and find that ASI can effectively reuse common skills while also updating incompatible skills in response to website changes.
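
A flavor of what an induced, program-based skill looks like: primitive actions composed into a reusable function with a built-in verification step. The primitives here are hypothetical stubs, not ASI's actual action API:

```python
def click(selector): ...               # hypothetical agent primitives; real ASI
def type_text(selector, text): ...     # skills call the web environment's actions
def current_url() -> str:
    return "https://shop.example/results?q=..."  # illustrative stub

def search_product(query: str):
    """An induced higher-level skill composing the primitives above."""
    click("#search-box")
    type_text("#search-box", query)
    click("#search-button")
    assert "results" in current_url()  # lightweight verification of success
```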