planning - 2026-04-30

Three-Step Nav: A Hierarchical Global-Local Planner for Zero-Shot Vision-and-Language Navigation

Authors:Wanrong Zheng, Yunhao Ge, Laurent Itti
Date:2026-04-29 17:55:05

Breakthrough progress in vision-based navigation through unknown environments has been achieved by using multimodal large language models (MLLMs). These models can plan a sequence of motions by evaluating the current view at each time step against the task and goal given to the agent. However, current zero-shot Vision-and-Language Navigation (VLN) agents powered by MLLMs still tend to drift off course, halt prematurely, and achieve low overall success rates. We propose Three-Step Nav to counteract these failures with a three-view protocol: First, "look forward" to extract global landmarks and sketch a coarse plan. Then, "look now" to align the current visual observation with the next sub-goal for fine-grained guidance. Finally, "look backward" to audit the entire trajectory and correct accumulated drift before stopping. Requiring no gradient updates or task-specific fine-tuning, our planner drops into existing VLN pipelines with minimal overhead. Three-Step Nav achieves state-of-the-art zero-shot performance on the R2R-CE and RxR-CE datasets. Our code is available at https://github.com/ZoeyZheng0/3-step-Nav.

Causal Learning with Neural Assemblies

Authors:Evangelia Kopadi, Dimitris Kalles
Date:2026-04-29 17:34:33

Can Neural Assemblies -- groups of neurons that fire together and strengthen through co-activation -- learn the direction of causal influence between variables? While established as a computationally general substrate for classification, parsing, and planning, neural assemblies have not yet been shown to internalize causal directionality. We demonstrate that the inherent operations of neural assemblies -- projection, local plasticity control, and sparse winner selection -- are sufficient for directional learning. We introduce DIRECT (DIRectional Edge Coupling/Training), a mechanism that co-activates source and target assemblies under an adaptive gain schedule to internalize directed relations. Unlike backpropagation-based methods, DIRECT relies solely on local plasticity, making the resulting causal claims auditable at the mechanism level. Our findings are verified through a dual-readout validation strategy: (i) synaptic-strength asymmetry, measuring the emergent weight gap between forward and reverse links, and (ii) functional propagation overlap, quantifying the reliability of directional signal flow. Across multiple domains, the framework achieves perfect structural recovery under a supervised, known-structure setting. These results establish neural assemblies as an auditable bridge between biologically plausible dynamics and formal causal models, offering an "explainable by design" framework where causal claims are traceable to specific neural winners and synaptic asymmetries.

Bi-Level Optimization for Contact and Motion Planning in Rope-Assisted Legged Robots

Authors:Ruben Malacarne, Ioannis Tsikelis, Enrico Mingo Hoffman, Michele Focchi
Date:2026-04-29 17:18:29

This paper presents a planning pipeline framework for locomotion in rope-assisted robots climbing vertical surfaces. The proposed framework is formulated as a bi-level optimization scheme that addresses a mixed-integer problem: selecting feasible terrain regions for landing while simultaneously optimizing the control inputs, namely rope tensions and leg forces, and landing location. The outer level of the optimization is solved using the Cross-Entropy Method, while the inner level relies on gradient-based nonlinear optimization to compute dynamically feasible motions. The approach is validated on a novel climbing robot platform, ALPINE, across a variety of challenging terrain configurations.
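
The outer-level search described above can be illustrated with a minimal sketch of the Cross-Entropy Method over a discrete set of candidate landing regions. This is not the authors' implementation: `inner_cost` is a hypothetical stand-in for the gradient-based inner solve, and the function name, sample counts, and smoothing factor are all assumptions chosen for illustration.

```python
import random
from typing import Callable, List, Tuple


def cem_select_region(
    inner_cost: Callable[[int], float],  # hypothetical: cost of the inner solve for a region
    n_regions: int,
    n_samples: int = 50,
    n_elite: int = 10,
    n_iters: int = 10,
    smoothing: float = 0.7,
    seed: int = 0,
) -> Tuple[int, List[float]]:
    """Categorical Cross-Entropy Method: returns (best region seen, final probabilities)."""
    rng = random.Random(seed)
    probs = [1.0 / n_regions] * n_regions  # start from a uniform distribution
    best_region, best_cost = -1, float("inf")
    for _ in range(n_iters):
        # Sample candidate regions from the current categorical distribution.
        samples = rng.choices(range(n_regions), weights=probs, k=n_samples)
        # Score each candidate with the (assumed) inner-level cost and sort ascending.
        scored = sorted((inner_cost(r), r) for r in samples)
        if scored[0][0] < best_cost:
            best_cost, best_region = scored[0]
        # Refit the distribution to the elite (lowest-cost) samples, with smoothing
        # so that no region's probability collapses to exactly zero.
        elites = [r for _, r in scored[:n_elite]]
        freq = [elites.count(r) / n_elite for r in range(n_regions)]
        probs = [smoothing * f + (1.0 - smoothing) * p for f, p in zip(freq, probs)]
    return best_region, probs


# Toy usage: four candidate regions with fixed, made-up inner costs.
toy_costs = [3.0, 1.0, 2.5, 4.0]
best, final_probs = cem_select_region(lambda r: toy_costs[r], n_regions=4)
```

In a real bi-level setup each call to `inner_cost` would run a nonlinear program, so the number of samples per iteration directly sets the computational budget.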

Safe Navigation using Neural Radiance Fields via Reachable Sets

Authors:Omanshu Thapliyal, Malarvizhi Sankaranarayanasamy, Ravigopal Vennelakanti
Date:2026-04-29 17:09:19

Safe navigation in cluttered environments is an important challenge for autonomous systems. Robots operating in obstacle-ridden scenarios must navigate safely in the presence of obstacles, goals, and ego objects of varying geometries. In this work, reachable set representations of the robot's real-time capabilities in the state space are used to capture safe navigation requirements, while neural radiance fields (NeRFs) are used to compute, store, and manipulate the volumetric representations of the obstacles or the ego vehicle as needed. The resulting path planning problem is posed as a constrained optimal control problem involving linear matrix inequality constraints. We present simulation results for path planning in the presence of numerous obstacles in two different scenarios. Safe navigation is demonstrated by using reachable sets in the corresponding constrained optimal control problems.

Improving Bias Correction Methods for Daily Rainfall Using a Markov Chain Approach

Authors:Danny Parsons, David Stern, Mouhamadou Bamba Sylla, James Musyoka, John Bagiliko, Lily Clements, John Mupuro, Denis Ndanguza
Date:2026-04-29 16:53:05

Accurate, localised rainfall information is essential for applications such as agricultural planning, climate risk assessment, and water resources management. Gridded climate products provide rainfall information over large areas but can lack the accuracy needed at local scales, often requiring bias correction before use in local impact studies. Bias correction of daily rainfall is particularly challenging due to its complex characteristics. Local intensity scaling (LOCI) and quantile mapping (QM) are two widely used bias correction methods which adjust both rainfall frequency and intensity, but do not account for the temporal structure of daily rainfall. This can lead to biases in the representation of wet and dry spells. This study proposes integrating a two-state first-order Markov chain directly into existing bias correction methods through state-dependent rain day thresholds and rainfall adjustments, aimed at improving the temporal structure of rainfall. Two implementations of this framework are presented: Markov chain local intensity scaling (MC LOCI) and Markov chain quantile mapping (MC QM). The proposed methods were applied to AgERA5 reanalysis data with rainfall data from five stations in Zimbabwe. Results showed that the Markov chain methods outperformed LOCI and QM by improving the representation of rainfall persistence, onset, and wet and dry spell characteristics, while maintaining improvements in rain day frequency and overall rainfall statistics. These results demonstrate that the proposed methods could be beneficial for applications such as crop simulation, hydrological modelling and other applications which rely on accurate representation of rainfall sequencing.
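
As a rough illustration of the two-state first-order Markov chain mentioned above, the sketch below estimates the probability of a wet day given the previous day's state from a daily rainfall series. It shows only the occurrence model, not the MC LOCI or MC QM corrections themselves; the function name and the 1.0 mm rain-day threshold are assumptions, not values taken from the study.

```python
from typing import Dict, Sequence


def wet_dry_transition_probs(
    rain_mm: Sequence[float], threshold_mm: float = 1.0  # threshold is an assumption
) -> Dict[str, float]:
    """Estimate P(wet today | yesterday's state) for a two-state first-order chain."""
    wet = [r >= threshold_mm for r in rain_mm]
    # counts[previous_state] = [number of dry days that followed, number of wet days that followed]
    counts = {False: [0, 0], True: [0, 0]}
    for prev, curr in zip(wet, wet[1:]):
        counts[prev][int(curr)] += 1
    probs = {}
    for prev, label in ((False, "p_wet_given_dry"), (True, "p_wet_given_wet")):
        total = sum(counts[prev])
        # NaN signals that the conditioning state never occurred in the series.
        probs[label] = counts[prev][1] / total if total else float("nan")
    return probs


# Toy usage with a made-up eight-day series (mm/day).
toy_series = [0.0, 0.0, 5.0, 3.0, 0.0, 2.0, 4.0, 0.0]
toy_probs = wet_dry_transition_probs(toy_series)
```

Comparing these two probabilities between a gridded product and station data is one simple way to see whether wet and dry spell persistence is misrepresented.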

Walk With Me: Long-Horizon Social Navigation for Human-Centric Outdoor Assistance

Authors:Lingfeng Zhang, Xiaoshuai Hao, Xizhou Bu, Yingbo Tang, Hongsheng Li, Jinghui Lu, Xiu-shen Wei, Jiayi Ma, Yu Liu, Jing Zhang, Hangjun Ye, Xiaojun Liang, Long Chen, Wenbo Ding
Date:2026-04-29 16:02:13

Assisting humans in open-world outdoor environments requires robots to translate high-level natural-language intentions into safe, long-horizon, and socially compliant navigation behavior. Existing map-based methods rely on costly pre-built HD maps, while learning-based policies are mostly limited to indoor and short-horizon settings. To bridge this gap, we propose Walk with Me, a map-free framework for long-horizon social navigation from high-level human instructions. Walk with Me leverages GPS context and lightweight candidate points-of-interest from a public map API for semantic destination grounding and waypoint proposal. A High-Level Vision-Language Model grounds abstract instructions into concrete destinations and plans coarse waypoint sequences. During execution, an observation-aware routing mechanism determines whether the Low-Level Vision-Language-Action policy can handle the current situation or whether explicit safety reasoning from the High-Level VLM is needed. Routine segments are executed by the Low-Level VLA, while complex situations such as crowded crossings trigger high-level reasoning and stop-and-wait behavior when unsafe. By combining semantic intent grounding, map-free long-horizon planning, safety-aware reasoning, and low-level action generation, Walk with Me enables practical outdoor social navigation for human-centric assistance.

Continuum contribution to charged-current absorption of low-energy $ν_e$ on $^{40}$Ar

Authors:Steven Gardiner, Pablo Barham Alzás, Alexis Nikolakopoulos, Luca H. Abu El-Haj, Natalie Jachowicz, Vishvas Pandey
Date:2026-04-29 15:30:22

Accurate modeling of the absorption of tens-of-MeV $ν_e$ on $^{40}$Ar is needed to enable measurements of astrophysical neutrinos using large liquid argon time projection chamber (LArTPC) detectors, such as those planned for the Deep Underground Neutrino Experiment (DUNE). We revisit the MARLEY neutrino interaction model used in present estimates of DUNE sensitivity to supernova and solar neutrino signals. Multiple theoretical refinements are pursued, especially in the unbound continuum region of nuclear excitation energy. Inclusive charged-current neutrino-argon cross sections are calculated using a hybrid strategy. Nuclear transitions to unbound states are treated using a Hartree-Fock Continuum Random Phase Approximation (HF-CRPA) model, including forbidden contributions. Allowed transitions to low-lying discrete levels are also included using indirect measurements and approximate corrections for the momentum transfer dependence. Exclusive predictions are obtained by coupling these calculations with a statistical nuclear de-excitation model. The impact on observables of interest for DUNE and similar experiments is examined in terms of both total and differential cross sections. Our refined calculations predict a lower allowed portion of the cross section relative to the prior MARLEY model. At neutrino energies appreciably below 100 MeV, the inclusion of forbidden transitions does not fully compensate for the loss of allowed strength. For a representative neutrino burst from a galactic core-collapse supernova, our results suggest that MARLEY 1.2.0 overestimates the event yield in a DUNE-like detector by approximately 20%. However, because this overestimation is more severe at backwards angles, use of the charged-current $ν_e$-$^{40}$Ar reaction for supernova pointing may be more feasible than previously expected.

Virtual-reality based patient-specific simulation of spine surgical procedures: A fast, highly automated and high-fidelity system for surgical education and planning

Authors:Raj Kumar Ranabhat, Tayler D Ross, Tony Jiao, Jeremie Larouche, Joel Finkelstein, Michael Hardisty
Date:2026-04-29 15:12:18

Surgical training involves didactic teaching, mentor-led learning, surgical skills laboratories, and direct exposure to surgery; however, increasing clinical pressures have limited operating room (OR) exposure. This work leverages virtual reality (VR) to provide a safe and immersive training environment. Existing VR training is often based on standardized scenarios not tailored to individual clinical cases. This study addresses this limitation using artificial intelligence (AI) based computer vision methods to generate patient-specific simulations from computed tomography (CT) and magnetic resonance imaging (MRI). This study focuses on patient-specific spinal decompression simulation for spinal stenosis in a virtual operating room. The objectives were (1) automatic creation of 3D anatomical models and (2) VR simulation of spinal decompression procedures including laminectomy, disc resection, and foraminotomy. Model construction required multimodal fusion (registration) of CT and MRI and segmentation of relevant structures. Segmentation was evaluated using the Dice Similarity Coefficient (DSC), and registration accuracy using Target Registration Error (TRE). Qualitative feedback was obtained from surgeons and trainees. High-fidelity patient-specific 3D models were generated efficiently (approximately 2.5 minutes per case, N = 15). Segmentation accuracy was high, with a DSC of 0.95 (+/- 0.03) for vertebral bone and 0.895 (+/- 0.02) for soft tissue structures. Registration accuracy showed a mean TRE of 1.73 (+/- 0.42) mm. Semi-structured interviews indicated improved spatial understanding, increased procedural confidence, and strong perceived educational value. This platform significantly reduced the time and costs of patient-specific modelling, thereby facilitating pre-operative planning, post-procedural assessments, and comprehensive surgical simulation.
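
For readers unfamiliar with the two evaluation metrics named above, the following minimal sketch computes a Dice Similarity Coefficient on flattened binary masks and a mean Target Registration Error over paired landmarks. The function names and toy inputs are illustrative assumptions; the study's own evaluation code is not described in the abstract.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """DSC = 2|A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    size = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if size == 0 else 2.0 * overlap / size


def mean_tre(registered: Sequence[Point], reference: Sequence[Point]) -> float:
    """Mean Euclidean distance between corresponding landmarks (same units as input)."""
    if len(registered) != len(reference) or not registered:
        raise ValueError("need equal, non-empty landmark lists")
    distances = [math.dist(a, b) for a, b in zip(registered, reference)]
    return sum(distances) / len(distances)


# Toy usage with made-up masks and landmarks (mm).
toy_dsc = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
toy_tre = mean_tre([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)])
```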

GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents

Authors:GLM-V Team, Wenyi Hong, Xiaotao Gu, Ziyang Pan, Zhen Yang, Yuting Wang, Yue Wang, Yuanchang Yue, Yu Wang, Yanling Wang, Yan Wang, Xijun Liu, Wenmeng Yu, Weihan Wang, Wei Li, Shuaiqi Duan, Sheng Yang, Ruiliang Lv, Mingdao Liu, Lihang Pan, Ke Ning, Junhui Ji, Jinjiang Wang, Jing Chen, Jiazheng Xu, Jiale Zhu, Jiale Cheng, Ji Qi, Guobing Gan, Guo Wang, Cong Yao, Zijun Dou, Zihao Zhou, Zihan Wang, Zhiqi Ge, Zhijie Li, Zhenyu Hou, Zhao Xue, Zehui Wang, Zehai He, Yusen Liu, Yukuo Cen, Yuchen Li, Yuan Wang, Yijian Lu, Yanzi Wang, Yadong Xue, Xinyu Zhang, Xinyu Liu, Wenkai Li, Tianyu Tong, Tianshu Zhang, Shengdong Yan, Qinkai Zheng, Mingde Xu, Licheng Bao, Jiaxing Xu, Jiaxin Fan, Jiawen Qian, Jiali Chen, Jiahui Lin, Haozhi Zheng, Haoran Wang, Haochen Li, Fan Yang, Dan Zhang, Chuangxin Zhao, Chengcheng Wu, Boyan Shi, Bowei Jia, Baoxu Wang, Peng Zhang, Debing Liu, Bin Xu, Juanzi Li, Minlie Huang, Yuxiao Dong, Jie Tang
Date:2026-04-29 14:49:37

We present GLM-5V-Turbo, a step toward native foundation models for multimodal agents. As foundation models are increasingly deployed in real environments, agentic capability depends not only on language reasoning, but also on the ability to perceive, interpret, and act over heterogeneous contexts such as images, videos, webpages, documents, and GUIs. GLM-5V-Turbo is built around this objective: multimodal perception is integrated as a core component of reasoning, planning, tool use, and execution, rather than as an auxiliary interface to a language model. This report summarizes the main improvements behind GLM-5V-Turbo across model design, multimodal training, reinforcement learning, toolchain expansion, and integration with agent frameworks. These developments lead to strong performance in multimodal coding, visual tool use, and framework-based agentic tasks, while preserving competitive text-only coding capability. More importantly, our development process offers practical insights for building multimodal agents, highlighting the central role of multimodal perception, hierarchical optimization, and reliable end-to-end verification.

Understanding the Skills Gap between Higher Education Institutions and the Software Engineering Industry

Authors:Huy Phan, Ievgeniia Kuzminykh, Bogdan Ghita
Date:2026-04-29 13:21:15

In the rapidly evolving field of software engineering, the skills required of graduates entering the job market are constantly changing. Several studies have identified a gap between the skills taught in university curricula and those demanded by the software engineering industry. This chapter investigates the technical skill and expertise gap between higher education institutions (HEIs) and the UK software engineering industry by mapping job descriptions to the skills included in computer science degree programmes. A custom web scraping and text analysis tool, utilising fuzzy matching, was developed to extract and categorise skills from 300 job postings and undergraduate curricula from 30 UK universities. The analysis showed that the curricula place a strong emphasis on Programming Languages (18%) and Database Management (12.83%). In contrast, the industry's most frequently requested skill category is Software Design and Planning, which appears in approximately 88.68% of job descriptions, highlighting its critical importance. General Programming Language and System Structures also show strong demand, present in over 78.30% and 66.04% of postings, respectively. The mapping indicates that areas such as System Structures and Software Domains are significantly underrepresented in curricula, while Database Management and Compiler Design may be overemphasised. These insights can support HEIs in aligning their programmes with industry needs, supporting the preparation of graduates for dynamic careers in software engineering.

SciHorizon-DataEVA: An Agentic System for AI-Readiness Evaluation of Heterogeneous Scientific Data

Authors:Dianyu Liu, Chuan Qin, Xi Chen, Xiaohan Li, Wenxi Xu, Yuyang Wang, Xin Chen, Yuanchun Zhou, Hengshu Zhu
Date:2026-04-29 13:11:53

AI-for-Science (AI4Science) is increasingly transforming scientific discovery by embedding machine learning models into prediction, simulation, and hypothesis generation workflows across domains. However, the effectiveness of these models is fundamentally constrained by the AI-readiness of scientific data, for which no scalable and systematic evaluation mechanism currently exists. In this work, we propose SciHorizon-DataEVA, a novel agentic system for scalable AI-readiness evaluation of heterogeneous scientific data. At the evaluation-criteria level, we introduce the Sci-TQA2 principles, which organize AI-readiness into four complementary dimensions: Governance Trustworthiness, Data Quality, AI Compatibility, and Scientific Adaptability. Each dimension is decomposed into measurable atomic elements that enable fine-grained and executable assessment. To operationalize these principles at scale, we develop Sci-TQA2-Eval, a hierarchical multi-agent evaluation approach orchestrated through a directed, cyclic workflow. Our Sci-TQA2-Eval dynamically constructs dataset-aware evaluation specifications by combining lightweight dataset profiling, applicability-aware metric activation, and knowledge-augmented planning grounded in domain constraints and dataset-paper signals. These specifications are executed through an adaptive, tool-centric evaluation mechanism with built-in verification and self-correction, enabling scalable and reliable assessment across heterogeneous scientific data. Extensive experiments on scientific datasets spanning multiple domains demonstrate the effectiveness and generality of SciHorizon-DataEVA for principled AI-readiness evaluation.

STAR-Filter: Efficient Convex Free-Space Approximation via Starshaped Set Filtering in Noisy Environments

Authors:Yuwei Wu, Yichen Zhao, Dexter Ong, Vijay Kumar
Date:2026-04-29 12:54:20

Approximating collision-free space is fundamental to robot planning in complex environments. Convex geometric representations, such as polytopes and ellipsoids, are widely employed due to their structural properties, which can be easily integrated with convex optimization. Iterative optimization-based inflation methods can generate large volume polytopes in cluttered environments, but their efficiency degrades as the obstacle set becomes more complex or when sensor data are noisy. These methods are also sensitive to initialization and often rely on accurate geometric models. In this paper, we propose the STAR-Filter, a lightweight framework that employs starshaped set construction as a fast filter for convex region generation in collision-free space. By identifying obstacle points as active supporting constraints, the proposed method significantly reduces redundant computation while preserving feasibility and robustness to sensor noise. We provide theoretical and numerical analyses that characterize the structural properties of the starshaped set and proposed pipeline in environments of varying complexity. Simulation results show that the proposed framework achieves the lowest computation time and reduces conservativeness in polytope generation for real-world noisy and large-scale data. We demonstrate the effectiveness of the framework for Safe Flight Corridor (SFC) generation and agile quadrotor planning in noisy environments.

Impact of Attitude and Bounded Rationality on Collective Behavioral Transitions

Authors:Chen Song, Vladimir Cvetkovic, Angela Fontan, Rong Su, Karl H. Johansson
Date:2026-04-29 12:44:24

The theory of planned behavior (TPB) is one of the most influential frameworks in social psychology, stating that a person's behavior is driven by intention, which is primarily shaped by attitude, subjective norms, and perceived behavioral control. Despite its strong empirical support, TPB remains a static conceptual framework without explicit mathematical formulations that capture the temporal evolution of its components. To address this gap, we develop a dynamic agent-based modeling framework that integrates the core principles of TPB with a behavior-to-attitude feedback mechanism. Specifically, we define behaviors based on their feedback effects on attitude and examine when the population undergoes collective transitions by either adopting a beneficial behavior or rejecting a harmful one. Results from our model demonstrate that collective transitions can be effectively controlled by adjusting two key behavioral parameters that reflect agents' attitude influence and decision rationality. These findings provide quantitative insights on TPB, highlighting the key factors that drive collective behavioral transitions and the need for further socio-psychological case studies.

HiPAN: Hierarchical Posture-Adaptive Navigation for Quadruped Robots in Unstructured 3D Environments

Authors:Jeil Jeong, Minsung Yoon, Seokryun Choi, Heechan Shin, Taegeun Yang, Sung-eui Yoon
Date:2026-04-29 10:08:48

Navigating quadruped robots in unstructured 3D environments poses significant challenges, requiring goal-directed motion, effective exploration to escape from local minima, and posture adaptation to traverse narrow, height-constrained spaces. Conventional approaches employ a sequential mapping-planning pipeline but suffer from accumulated perception errors and high computational overhead, restricting their applicability on resource-constrained platforms. To address these challenges, we propose Hierarchical Posture-Adaptive Navigation (HiPAN), a framework that operates directly on onboard depth images at deployment. HiPAN adopts a hierarchical design: a high-level policy generates strategic navigation commands (planar velocity and body posture), which are executed by a low-level, posture-adaptive locomotion controller. To mitigate myopic behaviors and facilitate long-horizon navigation, we introduce Path-Guided Curriculum Learning, which progressively extends the navigation horizon from reactive obstacle avoidance to strategic navigation. In simulation, HiPAN achieves higher navigation success rates and greater path efficiency than classical reactive planners and end-to-end baselines, while real-world experiments further validate its applicability across diverse, unstructured 3D environments.

Order-Sensitive Sequential Interventions on Ideal Lattices

Authors:Dmitry Pasechnyuk-Vilensky
Date:2026-04-29 09:29:10

We study sequential interventions under prerequisite constraints. In this setting, admissible intervention sequences are paths in the ideal lattice of a finite prerequisite poset rather than unconstrained action strings. We give an exact local-to-global theory of order sensitivity on this state space. First, we prove that any two admissible paths with the same endpoints differ by a finite sequence of elementary diamond swaps. Second, for edge-additive path valuations, we show that path-independence is equivalent to vanishing diamond curvature, yielding an endpoint potential with a canonical Möbius parameterization on the ideal lattice. Third, we prove that a local diamond field is induced by an edge-based path model if and only if it satisfies cube consistency, with uniqueness after fixing a reference-tree gauge. Under reduced-state longitudinal assumptions, supported reference paths identify reference-path scores, whereas local order effects require two-sided support of both orders on each diamond. These results yield exact planning consequences, including an order-insensitivity bound and dynamic programming on the truncated ideal lattice.
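
To make the state space concrete, the sketch below enumerates the order ideals (prerequisite-closed subsets) of a small prerequisite poset and the admissible intervention sequences, which correspond to maximal paths in the ideal lattice. It is a brute-force illustration only; the function names and the three-element toy poset are assumptions, and none of the paper's curvature or potential constructions are implemented.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple


def order_ideals(prereqs: Dict[str, Set[str]]) -> List[FrozenSet[str]]:
    """All subsets that contain the direct prerequisites of each of their members."""
    items = sorted(prereqs)
    ideals = []
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            candidate = frozenset(subset)
            if all(prereqs[x] <= candidate for x in candidate):
                ideals.append(candidate)
    return ideals


def admissible_sequences(prereqs: Dict[str, Set[str]]) -> List[Tuple[str, ...]]:
    """All orderings in which every intervention follows its prerequisites."""
    items = sorted(prereqs)
    results: List[Tuple[str, ...]] = []

    def extend(done: FrozenSet[str], path: List[str]) -> None:
        if len(path) == len(items):
            results.append(tuple(path))
            return
        for x in items:
            # x may be applied next only if it is new and its prerequisites are done.
            if x not in done and prereqs[x] <= done:
                extend(done | {x}, path + [x])

    extend(frozenset(), [])
    return results


# Toy poset: intervention "c" requires both "a" and "b".
toy_prereqs = {"a": set(), "b": set(), "c": {"a", "b"}}
toy_ideals = order_ideals(toy_prereqs)
toy_paths = admissible_sequences(toy_prereqs)
```

In this toy case the two admissible sequences differ by a single swap of "a" and "b", which is the smallest example of the diamond swap referred to in the abstract.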

A simple strategy for valid inference in target trial emulations

Authors:Mats Julius Stensrud
Date:2026-04-29 09:28:12

Target trial emulation has improved comparative effectiveness research by making the causal question, assumptions, and analysis plan explicit. However, target trial protocols are usually developed iteratively. After examining the data, investigators revise the protocol to reflect which target trials the observational data can realistically support. While this iterative procedure is part of normal scientific practice, it raises concerns about selective choices and invalid statistical inference. A simple procedure can address these concerns. This procedure is based on sample splitting. In the initial split, investigators explore the data to define a target trial protocol. When these choices are made, the target trial protocol is implemented on the second split. Although the investigators made data-informed choices to select the target trial protocol, the inference has the usual coverage guarantees. The procedure is created to mirror how trialists move from pilot studies to a phase 3 trial. First, they use data from pilots and early-phase trials to learn and decide on a final protocol. Then they implement this protocol and analyze a new set of data in a phase 3 trial.
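
The sample-splitting step described above can be sketched in a few lines. This is a minimal illustration, assuming records are independent units (for example, one record per individual) and that a simple random split is appropriate; the function name, the 50/50 default, and the fixed seed are assumptions rather than recommendations from the paper.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_sample(
    records: Sequence[T], explore_fraction: float = 0.5, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Randomly partition records into an exploration split and a confirmation split."""
    if not 0.0 < explore_fraction < 1.0:
        raise ValueError("explore_fraction must be strictly between 0 and 1")
    indices = list(range(len(records)))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    cut = int(round(explore_fraction * len(records)))
    explore = [records[i] for i in indices[:cut]]
    confirm = [records[i] for i in indices[cut:]]
    return explore, confirm


# Toy usage: ten made-up record identifiers.
explore_split, confirm_split = split_sample(list(range(10)))
# Protocol choices are made using explore_split only; the final protocol is
# then implemented and analysed on confirm_split, which was never inspected.
```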

The local Calderón problem and the determination at the boundary of a complex anisotropic admittivity

Authors:Jessica Crosse, Romina Gaburro
Date:2026-04-29 09:13:16

We address Calderón's problem of stably determining the anisotropic complex admittivity $σ$ in a domain $Ω\subset\mathbb{R}^n$, with $n\geq3$, representing a conducting medium, in terms of a Dirichlet-to-Neumann map locally prescribed on a non-empty portion $Σ$ of the boundary of $Ω$, $\partialΩ$. $σ$ is assumed to be of type $σ(\cdot)=A(\cdot,a(\cdot))$ in $Ω$, where the one-parameter family of complex-symmetric matrices $[λ^{-1},\:λ]\ni t\mapsto A(\cdot,\: t)$ is assumed to be a-priori known and the scalar function $a$ is unknown. We establish Lipschitz and Hölder stability estimates at the boundary for $σ$ and its derivatives of arbitrary order on $Σ$, respectively, in terms of the local map.

RV and TTV Measurements of Two Transiting Long-Period Giants around TOI-4600

Authors:Tong Hu, Zitao Lin, Sharon X. Wang, Mu-Tian Wang, Ismael Mireles, Jacob Bean, Madison Brady, Nina Brown, Qikang Feng, Tianjun Gan, Chengyang Ji, Xue Li, Jiayue Zhang, Ritvik Basant, Nikita Chazov, David Charbonneau, Karen A. Collins, Tanya Das, Diana Dragomir, Zahra Essack, Juliana Garcia-Mejia, Yang Huang, Jinzhong Liu, Christopher R. Mann, Hugh P. Osborn, Aleks Scholz, Andreas Seifahrt, Patrick Tamburo, Shuming Wang, Shahidin Yaqup
Date:2026-04-29 08:36:54

TOI-4600b and c, originally identified by the Transiting Exoplanet Survey Satellite (TESS) and reported by I. Mireles et al. (2023), are a rare pair of transiting long-period giant planets ($\rm P_b=82.7$ days, $\rm P_c=482.8$ days) orbiting an early K dwarf. In this work, we refine the orbital parameters of the TOI-4600 system by combining new TESS photometry, ground-based transit follow-up, and radial velocity (RV) observations from MAROON-X. We obtain improved constraints on planetary masses and eccentricities, and update other parameters, such as the stellar age. For TOI-4600b, we measure a mass of $M_p = 74.7^{+4.7}_{-4.4}\,M_{\oplus}$ and an eccentricity of $e=0.153^{+0.020}_{-0.018}$, and $M_p = 212.53^{+13.26}_{-13.03}\,M_{\oplus}$ and $e=0.219^{+0.015}_{-0.018}$ for TOI-4600c. We find significant transit timing variations (TTV) in both planets, with semi-amplitudes of approximately $1$\,hr. We derive Transit Spectroscopy Metric values of 16.87 for TOI-4600b and 10.09 for TOI-4600c, indicating that both planets are promising JWST targets for studying the atmospheres of temperate and cold Jupiters, a relatively poorly characterized sample thus far. These updated parameters and TTV ephemerides are important for planning and interpreting future photometric, spectroscopic, and dynamical studies of the TOI-4600 system.

Towards a Frugal Photosynthesis Sensing Toolkit for Data-Driven Plant Science Education and Exploration

Authors:Qitong Li, Raj Nileshbhai Dave, Rhema Amanda Phiri, Leo Zhang, Xiaoyu Zheng, Ariana Blake, Livia Ford, Sarah Jones, Susan R. Strickler, Nivedita Arora
Date:2026-04-29 05:10:13

Rapid environmental change and advances in data-driven analysis highlight the need not only to use computational tools, but also to foster understanding of the natural world and inspire creativity. Photosynthesis, the process that fuels nearly all life on Earth, provides a compelling context for such learning, particularly in understanding how plants alter their photosynthetic strategies in response to environmental changes. However, existing tools for studying photosynthesis are often inaccessible or limited to demonstrating its presence, rather than capturing its temporal dynamics. We present PhytoBits, a frugal in situ gas-exchange sensing toolkit for distinguishing and teaching photosynthetic strategies. PhytoBits combines leaf enclosure with accessible materials, an off-the-shelf CO\textsubscript{2} sensor, and a low-cost microcontroller, to support multi-day monitoring of plant gas-exchange in educational and research contexts. We validated PhytoBits against research-grade gas-exchange systems, confirming that it identifies C\textsubscript{3} and CAM (Crassulacean Acid Metabolism) photosynthetic pathways. In addition to obligate CAM, PhytoBits also resolves facultative CAM and developmental CAM dynamics in plants. This work presents an early-stage hardware validation; user deployment studies, open-source code dissemination, and automated pathway classification are planned as future work.

2D and 3D Grasp Planners for the GET Asymmetrical Gripper

Authors:Andrew Goldberg, Ethan Ransing, Anton Kourakin, Cael Magner, Edward H. Adelson, Ken Goldberg
Date:2026-04-29 01:31:56

In this paper, we introduce GET-2D-1.0, a fast grasp planner for the GET asymmetrical gripper that operates from a single-view RGB-D image, using the Ferrari-Canny metric and a novel sampling strategy, and GET-3D-1.0, a mesh-based method using a 3D gripper model and ray-tracing. We evaluate both grasp planners against baselines with physical experiments, which suggest that GET-2D-1.0 can improve over a bounding box baseline by over 40% in lift success, shake survival, and force resistance. Experiments with GET-3D-1.0 suggest a slight improvement over GET-2D-1.0 on lift success and shake survival, but GET-3D-1.0 is more computationally expensive, averaging 17 seconds of planning compared to 683 ms for GET-2D-1.0.

Lifting Embodied World Models for Planning and Control

Authors:Alex N. Wang, Trevor Darrell, Pavel Izmailov, Yutong Bai, Amir Bar
Date:2026-04-28 23:59:19

World models of embodied agents predict future observations conditioned on an action taken by the agent. For complex embodiments, action spaces are high-dimensional and difficult to specify: for example, precisely controlling a human agent requires specifying the motion of each joint. This makes the world model hard to control and expensive to plan with as search-based methods like CEM scale poorly with action dimensionality. To address this issue, we train a lightweight policy that maps high-level actions to sequences of low-level joint actions. Composing this policy with the frozen world model produces a lifted world model that predicts a sequence of future observations from a single high-level action. We instantiate this framework for a human-like embodiment, defining the high-level action space as a small set of 2D waypoints annotated on the current observation frame, each specifying a near-term goal position for a leaf joint (pelvis, head, hands). Waypoints are low-dimensional, visually interpretable, and easy to specify manually or to search over. We show that the lifted world model substantially outperforms searching directly in low-level joint space ($3.8\times$ lower mean joint error to the goal pose), while remaining more compute-efficient and generalizing to environments unseen by the policy.

Hardware-Efficient Quantum Optimization for Transportation Networks via Compressed Adiabatic Evolution

Authors:Talha Azfar, Ruimin Ke, Sean He, Cara Wang, José Holguín-Veras
Date:2026-04-28 23:45:54

Transportation systems such as urban logistics, vehicle routing, and infrastructure planning require solving large-scale combinatorial optimization problems under complex constraints. Problems such as the vehicle routing problem (VRP), traveling salesman problem (TSP), and facility location problem (FLP) involve large discrete search spaces and the need to generate multiple feasible solutions in real time. In this work, we develop a hardware-grounded hybrid quantum optimization framework that uses Approximate Quantum Compilation (AQC) to compress early segments of digitized adiabatic evolution into shallow circuits. The compressed prefix is combined with variational layers, enabling a systematic study of how initialization, circuit depth, and expressivity interact on near-term quantum hardware. All experiments are performed on an IBM gate-based quantum computer, and circuits are evaluated as stochastic generators of candidate transportation plans. Results show that moderate prefix compression reduces two-qubit gate depth while maintaining or improving feasible solution discovery, particularly for routing problems. These benefits depend on compatibility between the compressed prefix and the variational ansatz: while standard QAOA effectively leverages AQC initialization, linear-chain QAOA shows limited improvement. Overall, this work demonstrates that hybrid AQC-QAOA methods provide a practical pathway for hardware-efficient quantum optimization, positioning quantum algorithms as candidate generators within transportation decision-making workflows.

Budget-Constrained Causal Bandits: Bridging Uplift Modeling and Sequential Decision-Making

Authors:Abhirami Pillai
Date:2026-04-28 23:24:26

Treatment allocation under budget constraints is a central challenge in digital advertising: advertisers must decide which users to show ads to while spending a limited budget wisely. The standard approach follows a two-stage offline pipeline - first collect historical data to estimate heterogeneous treatment effects (HTE), then solve a constrained optimization to allocate the budget. This works well with abundant data, but fails in cold-start settings such as new campaigns, new markets, or new customer segments where little historical data exists. We propose Budget-Constrained Causal Bandits (BCCB), an online framework that learns which users respond to ads while simultaneously spending the budget, making treatment decisions one user at a time. BCCB unifies three components into a single sequential process: learning individual-level ad effectiveness, exploring users whose response is uncertain, and pacing the budget over time. We evaluate BCCB on the Criteo Uplift dataset, a large-scale advertising dataset from a real randomized controlled trial. Our key finding is a data-efficiency crossover: offline methods require approximately 10,000 historical observations to produce reliable results, while BCCB operates effectively from the very first user. Furthermore, BCCB exhibits 3-5x lower performance variance between runs, making it more practical for real campaign planning. Among purely online methods, BCCB consistently outperforms standard Thompson Sampling, budgeted Thompson Sampling, and greedy HTE estimation across all budget levels tested.
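
To make the three components concrete (learning, exploration, pacing), here is a toy sketch of a budget-paced Thompson Sampling loop on simulated segments. It is not the BCCB algorithm: the function `budgeted_thompson`, the per-segment Beta-Bernoulli model, the linear pacing rule, and the simulated conversion rates are all assumptions chosen for illustration.

```python
import random


def budgeted_thompson(treated_rate, control_rate, n_users, budget, cost=1.0, seed=0):
    """Toy budget-paced Thompson Sampling over user segments (illustrative only)."""
    rng = random.Random(seed)
    n_seg = len(treated_rate)
    # Beta(1, 1) priors per segment and arm, stored as [alpha, beta].
    post = [{"treat": [1, 1], "control": [1, 1]} for _ in range(n_seg)]
    spent = 0.0
    treated_counts = [0] * n_seg
    for t in range(n_users):
        seg = rng.randrange(n_seg)  # a user arrives from a random segment
        # Exploration: sample an uplift from the current posteriors.
        sampled_uplift = rng.betavariate(*post[seg]["treat"]) - rng.betavariate(
            *post[seg]["control"]
        )
        # Pacing (assumed rule): cumulative spend may grow only linearly in time.
        allowance = min(budget, budget * (t + 1) / n_users)
        treat = sampled_uplift > 0 and spent + cost <= allowance
        arm = "treat" if treat else "control"
        rate = treated_rate[seg] if treat else control_rate[seg]
        converted = rng.random() < rate  # simulated outcome
        # Learning: update the posterior of the arm that was actually played.
        post[seg][arm][0 if converted else 1] += 1
        if treat:
            spent += cost
            treated_counts[seg] += 1
    return spent, treated_counts


# Segment 0 responds to ads; segment 1 is hurt by them (simulated rates).
spent, treated = budgeted_thompson([0.30, 0.05], [0.05, 0.30], n_users=2000, budget=200)
```

In this toy setup the loop starts making decisions from the first user with no historical data, which mirrors the cold-start motivation above, though it says nothing about the reported results.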

Spatially-constrained clustering of geospatial features for heat vulnerability assessment of favelas in Rio de Janeiro

Authors:Baptiste Clemence, Thomas Hallopeau, Vanderlei Pascoal De Matos, Laurent Demagistri, Joris Guerin
Date:2026-04-28 21:45:22

Informal settlements face disproportionate exposure to climate-related health hazards. However, existing methodologies lack systematic approaches to link diverse settlement characteristics with environmental health outcomes. We develop a data-driven framework to assess heat vulnerability in Rio de Janeiro's favelas by combining spatially-constrained clustering with land surface temperature (LST) analysis. Using remote sensing and geospatial features, we identify two distinct favela typologies: recent, well-connected settlements on flat terrain (Cluster 0) and historical, poorly-connected communities on vegetated slopes (Cluster 1). Analysis of 16 extreme heat events reveals systematic temperature differences of 2--3$^\circ$C between clusters, with flat-terrain favelas experiencing significantly higher heat exposure. Our findings demonstrate that settlement morphology critically influences heat vulnerability, providing a replicable framework for targeted urban planning and public health interventions in informal settlements globally.

Evaluating Strategic Reasoning in Forecasting Agents

Authors:Tom Liptay, Dan Schwarz, Rafael Poyiadzi, Jack Wildman, Nikos I. Bosse
Date:2026-04-28 20:45:59

Forecasting benchmarks produce accuracy leaderboards but little insight into why some forecasters are more accurate than others. We introduce Bench to the Future 2 (BTF-2), 1,417 pastcasting questions with a frozen 15M-document research corpus in which agents reproducibly research and forecast offline, producing full reasoning traces. BTF-2 detects accuracy differences of 0.004 Brier score, and can distinguish differential agent strengths in research vs. judgment. We build a forecaster 0.011 Brier more accurate than any single frontier agent, and use it to evaluate agent strategic reasoning without hindsight bias. We find the better forecaster differs primarily in its pre-mortem analysis of its blind spots and consideration of black swans. Expert human forecasters found the dominant strategic reasoning failures of frontier agents are in assessing political and business leaders' incentives, judging their likelihood to follow through on stated plans, and modeling institutional processes.
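
As a reminder of the metric behind these numbers, the Brier score for binary questions is the mean squared difference between the forecast probability and the 0/1 outcome (lower is better). The helper name `brier_score` below is an illustrative assumption, not part of BTF-2.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and binary outcomes."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# An uninformative forecaster that always says 50% scores 0.25.
baseline = brier_score([0.5, 0.5], [1, 0])
```

Against that 0.25 baseline, differences on the order of 0.004 or 0.011 are small in absolute terms, which is why a large question set is needed to resolve them.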

Application-Aware Twin-in-the-Loop Planning for Federated Split Learning over Wireless Edge Networks

Authors:Zihao Ding, Beining Wu, Jun Huang, Shiwen Mao
Date:2026-04-28 20:42:57

We investigate task-success-oriented resource allocation for federated split learning (FSL) at the wireless edge. In this setting, the server must jointly determine bandwidth, transmit power, split-layer placement, compression level, and terminal participation under per-round deadline, memory, and spectrum constraints. These coupled decisions affect wireless transmission, model training, and task execution, which evolve at different time scales and cannot be efficiently evaluated through repeated real-world trials. To address this challenge, we propose TiLP, a twin-in-the-loop planner that evaluates candidate decisions through a cross-domain digital twin before execution. The twin integrates network, training, and task sub-twins, with each sub-twin calibrated at the time scale of the process it models. Based on this twin, TiLP performs receding-horizon cross-entropy method planning with actor-critic guidance to search over mixed continuous-discrete decisions. Experiments on LIBERO robotic manipulation tasks over a Sionna RT-simulated wireless network show that TiLP improves task success by 9.5 percentage points over the strongest single-axis baseline, while satisfying the per-round deadline and energy budget.

SWE-Edit: Rethinking Code Editing for Efficient SWE-Agent

Authors:Yikai Zhang, Jiaxin Pei, Kenan Li, Maoquan Wang, Jin Pan, Yu Kang, Shengyu Fu, Elsie Nallipogu, Junjie Hu, Yufan Huang, Zijian Jin
Date:2026-04-28 20:35:09

Large language model agents have achieved remarkable progress on software engineering tasks, yet current approaches suffer from a fundamental context coupling problem: the standard code editing interface conflates code inspection, modification planning, and edit execution within a single context window, forcing agents to interleave exploratory viewing with strictly formatted edit generation. This causes irrelevant information to accumulate and degrades agent performance. To address this, we propose SWE-Edit, which decomposes code editing into two specialized subagents: a Viewer that extracts task-relevant code on demand, and an Editor that executes modifications from high-level plans--allowing the main agent to focus on reasoning while delegating context-intensive operations to clean context windows. We further investigate what makes an effective editing model: observing that the prevalent find-and-replace format is error-prone, we train Qwen3-8B with GRPO to adaptively select editing modes, yielding improved editing efficiency over single-format baselines. On SWE-bench Verified, SWE-Edit improves resolved rate by 2.1% while reducing inference cost by 17.9%. We additionally propose a code editing benchmark that reliably predicts downstream agentic performance, providing practical guidance for editing model selection. Our code is publicly available at https://github.com/microsoft/SWE-Edit.
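
One way to see why a find-and-replace edit format can be error-prone is a strict applier that refuses to act unless the search text matches exactly once. This is a hypothetical sketch; `apply_find_replace` is not taken from the SWE-Edit repository.

```python
def apply_find_replace(source, find, replace):
    """Apply a find-and-replace edit only if `find` occurs exactly once."""
    if not find:
        raise ValueError("search text must be non-empty")
    count = source.count(find)
    if count != 1:
        # Zero matches: the model misquoted the code. Several: the edit is ambiguous.
        raise ValueError(f"expected exactly one match, found {count}")
    return source.replace(find, replace, 1)


patched = apply_find_replace("a = 1\nb = 2\n", "b = 2", "b = 3")
```

Both failure cases require the editing model to reproduce existing code verbatim and with enough context to be unique, which is the kind of strictly formatted generation the abstract describes.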

Variational Neural Belief Parameterizations for Robust Dexterous Grasping under Multimodal Uncertainty

Authors:Clinton Enwerem, Shreya Kalyanaraman, John S. Baras, Calin Belta
Date:2026-04-28 17:40:49

Contact variability, sensing uncertainty, and external disturbances make grasp execution stochastic. Expected-quality objectives ignore tail outcomes and often select grasps that fail under adverse contact realizations. Risk-sensitive POMDPs address this failure mode, but many use particle-filter beliefs that scale poorly, obstruct gradient-based optimization, and estimate Conditional Value-at-Risk (CVaR) with high-variance approximations. We instead formulate grasp acquisition as variational inference over latent contact parameters and object pose, representing the belief with a differentiable Gaussian mixture. We use Gumbel-Softmax component selection and location-scale reparameterization to express samples as smooth functions of the belief parameters, enabling pathwise gradients through a differentiable CVaR surrogate for direct optimization of tail robustness. In simulation, our variational neural belief improves robust grasp success under contact-parameter uncertainty and exogenous force perturbations while reducing planning time by roughly an order of magnitude relative to particle-filter model-predictive control. On a serial-chain robot arm with a multifingered hand, we validate grasp-and-lift success under object-pose uncertainty against a Gaussian baseline. Both methods succeed on the tested perturbations, but our controller terminates in fewer steps and less wall-clock time while achieving a higher tactile grasp-quality proxy. Our learned belief also calibrates risk more accurately, keeping mean absolute calibration error below 0.14 across tested simulation regimes, compared with 0.58 for a Cross-Entropy Method planner.
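
For context on the risk measure, CVaR at level alpha is the expected loss within the worst (1 - alpha) fraction of outcomes. The sketch below is the plain empirical estimator over sampled losses, not the differentiable surrogate the paper optimizes; the name `empirical_cvar` and the "higher loss is worse" convention are assumptions.

```python
import math


def empirical_cvar(losses, alpha=0.9):
    """Average of the worst (1 - alpha) fraction of losses (higher loss = worse)."""
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    ordered = sorted(losses, reverse=True)
    # Tail size; the small epsilon guards against floating-point round-up.
    k = max(1, math.ceil((1.0 - alpha) * len(ordered) - 1e-9))
    return sum(ordered[:k]) / k


losses = [float(i) for i in range(1, 11)]
tail_risk = empirical_cvar(losses, alpha=0.8)  # mean of the two largest losses
mean_loss = empirical_cvar(losses, alpha=0.0)  # reduces to the plain expectation
```

The gap between `tail_risk` and `mean_loss` illustrates why an expected-quality objective can look acceptable while tail outcomes remain poor.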

KinDER: A Physical Reasoning Benchmark for Robot Learning and Planning

Authors:Yixuan Huang, Bowen Li, Vaibhav Saxena, Yichao Liang, Utkarsh Aashu Mishra, Liang Ji, Lihan Zha, Jimmy Wu, Nishanth Kumar, Sebastian Scherer, Danfei Xu, Tom Silver
Date:2026-04-28 15:58:09

Robotic systems that interact with the physical world must reason about kinematic and dynamic constraints imposed by their own embodiment, their environment, and the task at hand. We introduce KinDER, a benchmark for Kinematic and Dynamic Embodied Reasoning that targets physical reasoning challenges arising in robot learning and planning. KinDER comprises 25 procedurally generated environments, a Gymnasium-compatible Python library with parameterized skills and demonstrations, and a standardized evaluation suite with 13 implemented baselines spanning task and motion planning, imitation learning, reinforcement learning, and foundation-model-based approaches. The environments are designed to isolate five core physical reasoning challenges: basic spatial relations, nonprehensile multi-object manipulation, tool use, combinatorial geometric constraints, and dynamic constraints, disentangled from perception, language understanding, and application-specific complexity. Empirical evaluation shows that existing methods struggle to solve many of the environments, indicating substantial gaps in current approaches to physical reasoning. We additionally include real-to-sim-to-real experiments on a mobile manipulator to assess the correspondence between simulation and real-world physical interaction. KinDER is fully open-sourced and intended to enable systematic comparison across diverse paradigms for advancing physical reasoning in robotics. Website and code: https://prpl-group.com/kinder-site/

Open Problems in Frontier AI Risk Management

Authors:Marta Ziosi, Miro Plueckebaum, Stephen Casper, Henry Papadatos, Ze Shen Chin, Peter Slattery, James Gealy, Tim G. J. Rudner, Brian Tse, Ariel Gil, Patricia Paskov, Maximilian Negele, Rokas Gipiškis, Nada Madkour, Vera Lummis, Rupal Jain, Luise Eder, Kristina Fort, Malou C. van Draanen Glismann, Inès Belhadj, Amin Oueslati, Anna K. Wisakanto, Richard Mallah, Koen Holtman, Ranj Zuhdi, Daniel S. Schiff, Jessica Newman, Malcolm Murray, Robert Trager
Date:2026-04-28 15:47:24

Frontier AI both amplifies existing risks and introduces qualitatively novel challenges. Not only is there a notable lack of stable scientific consensus resulting from the rapid pace of technological change, but emerging frontier AI safety practices are often misaligned with, or may undermine, established risk management frameworks. To address these challenges, we systematically surface open problems in frontier AI risk management. Adopting a problem-oriented approach, we examine each stage of the risk management process - risk planning, identification, analysis, evaluation, and mitigation - through a structured review of the literature, identifying unresolved challenges and the actors best positioned to address them. Recognising that different types of open problems call for different responses, we classify open problems according to whether they reflect (a) a lack of scientific or technical consensus, (b) misalignment with, or challenges to, established risk management frameworks, or (c) shortcomings in implementation despite apparent consensus and alignment. By mapping these open problems and identifying the actors best positioned to address them - including developers, deployers, regulators, standards bodies, researchers, and third-party evaluators - this work aims to clarify where progress is needed to enable robust and meaningful consensus on frontier AI risk management. The paper does not propose specific solutions; instead, it provides a problem-oriented, agenda-setting reference document, complemented by a living online repository, intended to support coordination, reduce duplication, and guide future research and governance efforts.