planning - 2025-11-05

Optimizing AI Agent Attacks With Synthetic Data

Authors: Chloe Loughridge, Paul Colognese, Avery Griffin, Tyler Tracy, Jon Kutasov, Joe Benton
Date: 2025-11-04 18:48:56

As AI deployments become more complex and high-stakes, it becomes increasingly important to be able to estimate their risk. AI control is one framework for doing so. However, good control evaluations require eliciting strong attack policies. This can be challenging in complex agentic environments where compute constraints leave us data-poor. In this work, we show how to optimize attack policies in SHADE-Arena, a dataset of diverse realistic control environments. We do this by decomposing attack capability into five constituent skills -- suspicion modeling, attack selection, plan synthesis, execution, and subtlety -- and optimizing each component individually. To get around the constraint of limited data, we develop a probabilistic model of attack dynamics, optimize our attack hyperparameters using this simulation, and then show that the results transfer to SHADE-Arena. This results in a substantial improvement in attack strength, reducing the safety score from a baseline of 0.87 to 0.41 using our scaffold.
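The data-efficiency trick described here (optimize against a cheap probabilistic simulation, then transfer) can be illustrated with a toy Monte Carlo model. The per-step detection and success probabilities below are invented for illustration and are not the paper's actual model of attack dynamics:

```python
import random

def simulate_attack(steps, p_detect_per_step, p_success_per_step, trials=10_000, seed=0):
    """Monte Carlo estimate of attack outcomes under a toy probabilistic model.

    At each step the attacker is either caught (monitor suspicion fires),
    completes its side task undetected, or continues. The returned "safety
    score" is the fraction of trials in which the attack is caught.
    """
    rng = random.Random(seed)
    caught = 0
    for _ in range(trials):
        for _ in range(steps):
            r = rng.random()
            if r < p_detect_per_step:
                caught += 1  # monitor flagged the attack
                break
            if r < p_detect_per_step + p_success_per_step:
                break  # attack succeeded undetected
    return caught / trials
```

Sweeping the two per-step probabilities over a grid of candidate attack hyperparameters and picking the combination that minimizes the simulated safety score mirrors, at a cartoon level, the simulate-then-transfer workflow the abstract describes.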

Assessing win strength in MLB win prediction models

Authors: Morgan Allen, Paul Savala
Date: 2025-11-04 18:40:10

In Major League Baseball, strategy and planning are major factors in determining the outcome of a game. Previous studies have aided this by building machine learning models for predicting the winning team of any given game. We extend this work by training a comprehensive set of machine learning models using a common dataset. In addition, we relate the win probabilities produced by these models to win strength as measured by score differential. In doing so we show that the most common machine learning models do indeed demonstrate a relationship between predicted win probability and the strength of the win. Finally, we analyze the results of using predicted win probabilities as a decision-making mechanism on run-line betting. We demonstrate positive returns when utilizing appropriate betting strategies, and show that naive use of machine learning models for betting leads to significant losses.
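The gap drawn here between "appropriate betting strategies" and "naive use" largely comes down to betting on expected value rather than simply backing the predicted winner. A minimal sketch (the margin threshold is an assumption for illustration, not a parameter from the paper):

```python
def expected_value(p_win: float, decimal_odds: float) -> float:
    """EV per unit staked for a back bet at the given decimal odds."""
    return p_win * decimal_odds - 1.0

def should_bet(p_win: float, decimal_odds: float, margin: float = 0.05) -> bool:
    """Bet only when the model's edge clears a safety margin,
    rather than naively backing every predicted winner."""
    return expected_value(p_win, decimal_odds) > margin
```

A model that is well calibrated but blindly backed at any odds can still lose money; the EV filter is what converts predictive skill into positive returns.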

ISAC Empowered Air-Sea Collaborative System: A UAV-USV Joint Inspection Framework

Authors: Rui Zhang, Fuwang Dong, Wei Wang
Date: 2025-11-04 14:12:32

In this paper, we construct an air-sea collaborative system framework based on the Integrated Sensing and Communication (ISAC) techniques, where the Unmanned Aerial Vehicle (UAV) and Unmanned Surface Vehicle (USV) jointly inspect targets of interest while keeping communication with each other simultaneously. First, we demonstrate the unique challenges encountered in this collaborative system, i.e., the coupling and heterogeneity of the UAV/USV's trajectories. Then, we formulate a total energy consumption minimization problem to jointly optimize the trajectories, flying and hovering times, target scheduling, and beamformers under the constraints of water currents, collision avoidance, and sensing and communication (S&C) requirements. To address the strong coupling of the variables, we divide the original problem into two subproblems, namely, the hover point selection and the joint trajectory planning and beamforming design. In the first subproblem, we propose a three-step hierarchical method including: (1) a virtual base station coverage (VBSC) and clustering algorithm to obtain the target scheduling and rough positions of hover points; (2) a Bi-traveling salesman problem with neighborhood (Bi-TSPN)-based algorithm to determine the visiting order of the hover points; (3) a hover point refinement and time allocation algorithm to further optimize the hover positions and time allocation. In the latter subproblem, we complete the remaining trajectory planning and beamforming design in each flying and hovering stage by developing a semi-definite relaxation (SDR) and successive convex approximation (SCA) method. Finally, we conduct a series of simulations to demonstrate the superiority of the proposed scheme over existing sequential access and leader-follower strategies.

SigmaCollab: An Application-Driven Dataset for Physically Situated Collaboration

Authors: Dan Bohus, Sean Andrist, Ann Paradiso, Nick Saw, Tim Schoonbeek, Maia Stiber
Date: 2025-11-04 13:30:15

We introduce SigmaCollab, a dataset enabling research on physically situated human-AI collaboration. The dataset consists of a set of 85 sessions in which untrained participants were guided by a mixed-reality assistive AI agent in performing procedural tasks in the physical world. SigmaCollab includes a set of rich, multimodal data streams, such as the participant and system audio, egocentric camera views from the head-mounted device, depth maps, head, hand and gaze tracking information, as well as additional annotations performed post-hoc. While the dataset is relatively small in size (~ 14 hours), its application-driven and interactive nature brings to the fore novel research challenges for human-AI collaboration, and provides more realistic testing grounds for various AI models operating in this space. In future work, we plan to use the dataset to construct a set of benchmarks for physically situated collaboration in mixed-reality task assistive scenarios. SigmaCollab is available at https://github.com/microsoft/SigmaCollab.

Sparse Source Identification in Transient Advection-Diffusion Problems with a Primal-Dual-Active-Point Strategy

Authors: Marco Mattuschka, Daniel Walter, Max von Danwitz, Alexander Popp
Date: 2025-11-04 13:04:40

This work presents a mathematical model to enable rapid prediction of airborne contaminant transport based on scarce sensor measurements. The method is designed for applications in critical infrastructure protection (CIP), such as evacuation planning following contaminant release. In such scenarios, timely and reliable decision-making is essential, despite limited observation data. To identify contaminant sources, we formulate an inverse problem governed by an advection-diffusion equation. Given the problem's underdetermined nature, we further employ a variational regularization ansatz and model the unknown contaminant sources as distributions over the spatial domain. To solve the arising inverse problem, we employ a problem-specific variant of the Primal-Dual-Active-Point (PDAP) algorithm, which efficiently approximates sparse minimizers by alternating between greedy location updates and source intensity optimization. The approach is demonstrated on two- and three-dimensional test cases involving both instantaneous and continuous contaminant sources and outperforms state-of-the-art techniques with $L^2$-regularization. Its effectiveness is further illustrated in complex domains with real-world building geometries imported from OpenStreetMap.
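The alternation PDAP performs (greedy location updates, then source intensity optimization) can be sketched in a few lines. This toy version uses projected gradient descent for the non-negative intensities and omits the paper's variational regularization and PDE machinery entirely:

```python
def pdap_sparse_recover(A, y, n_iters=5, inner=200, lr=0.1):
    """Toy Primal-Dual-Active-Point-style loop (illustrative, not the
    paper's algorithm): alternate a greedy source-location pick with
    non-negative intensity re-fitting over the active set.

    A: list of columns; A[j] gives the sensor responses to a unit source
       at candidate location j. y: observed sensor measurements.
    Returns a dict {location_index: intensity}.
    """
    m = len(y)
    support = {}
    for _ in range(n_iters):
        # residual of the current sparse model
        r = [y[i] - sum(A[j][i] * v for j, v in support.items()) for i in range(m)]
        # greedy step: candidate most correlated with the residual (dual variable)
        j_best = max(range(len(A)),
                     key=lambda j: abs(sum(A[j][i] * r[i] for i in range(m))))
        support.setdefault(j_best, 0.0)
        # intensity re-fit: projected gradient descent, intensities kept >= 0
        for _ in range(inner):
            r = [y[i] - sum(A[j][i] * v for j, v in support.items()) for i in range(m)]
            for j in support:
                g = -sum(A[j][i] * r[i] for i in range(m))
                support[j] = max(0.0, support[j] - lr * g)
    return support
```

The active set grows by one candidate per outer iteration, which is what makes this family of methods attractive for sparse source identification: the expensive fit only ever runs over a handful of locations.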

Agentic AI for Mobile Network RAN Management and Optimization

Authors: Jorge Pellejero, Luis A. Hernández Gómez, Luis Mendo Tomás, Zoraida Frias Barroso
Date: 2025-11-04 12:34:57

Agentic AI represents a new paradigm for automating complex systems by using Large AI Models (LAMs) to provide human-level cognitive abilities with multimodal perception, planning, memory, and reasoning capabilities. This will lead to a new generation of AI systems that autonomously decompose goals, retain context over time, learn continuously, operate across tools and environments, and adapt dynamically. The complexity of 5G and upcoming 6G networks renders manual optimization ineffective, pointing to Agentic AI as a method for automating decisions in dynamic RAN environments. However, despite its rapid advances, there is no established framework outlining the foundational components and operational principles of Agentic AI systems, nor a universally accepted definition. This paper contributes to ongoing research on Agentic AI in 5G and 6G networks by outlining its core concepts and then proposing a practical use case that applies Agentic principles to RAN optimization. We first introduce Agentic AI, tracing its evolution from classical agents and discussing the progress from workflows and simple AI agents to Agentic AI. Core design patterns -- reflection, planning, tool use, and multi-agent collaboration -- are then described to illustrate how intelligent behaviors are orchestrated. These theoretical concepts are grounded in the context of mobile networks, with a focus on RAN management and optimization. A practical 5G RAN case study shows how time-series analytics and LAM-driven agents collaborate for KPI-based autonomous decision-making.
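Of the design patterns listed, reflection is the easiest to make concrete: a draft is iteratively critiqued and revised until the critic is satisfied. A minimal sketch with placeholder callables standing in for LAM calls (the names and signatures are hypothetical, not any specific framework's API):

```python
def reflect_and_revise(task, generate, critique, max_rounds=3):
    """Minimal 'reflection' design pattern: generate a draft, then loop
    critique -> revise until the critic returns no feedback (None) or the
    round budget is exhausted. `generate` and `critique` stand in for
    calls to a Large AI Model."""
    draft = generate(task, feedback=None)
    for _ in range(max_rounds):
        feedback = critique(task, draft)
        if feedback is None:  # critic is satisfied with the draft
            return draft
        draft = generate(task, feedback=feedback)
    return draft
```

In a RAN-optimization setting, `generate` might propose a parameter-change plan from KPI time series and `critique` might check it against coverage and interference constraints; the loop structure is identical.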

Using ensemble learning with hybrid graph neural networks and transformers to predict traffic in cities

Authors: Ismail Zrigui, Samira Khoulji, Mohamed Larbi Kerkeb
Date: 2025-11-04 11:14:49

Intelligent transportation systems (ITS) still have a hard time accurately predicting traffic in cities, especially in big, multimodal settings with complicated spatiotemporal dynamics. This paper presents HybridST, a hybrid architecture that integrates Graph Neural Networks (GNNs), multi-head temporal Transformers, and supervised ensemble learning methods (XGBoost or Random Forest) to collectively capture spatial dependencies, long-range temporal patterns, and exogenous signals, including weather, calendar, or control states. We test our model on three public benchmark datasets: METR-LA, PEMS-BAY, and Seattle Loop. These datasets include situations ranging from freeway sensor networks to vehicle-infrastructure cooperative perception. Experimental results show that HybridST consistently beats classical baselines (LSTM, GCN, DCRNN, PDFormer) on important metrics like MAE and RMSE, while still being very scalable and easy to understand. The proposed framework presents a promising avenue for real-time urban mobility planning, energy optimization, and congestion alleviation strategies, especially within the framework of smart cities and significant events such as the 2030 FIFA World Cup.
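A simple stand-in for the ensemble step is weighting each base model (GNN, Transformer, etc.) by its inverse validation error. The paper instead trains XGBoost or Random Forest as the supervised combiner, so treat this as an illustrative baseline only:

```python
def blend_weights(val_errors):
    """Inverse-error ensemble weights: lower validation MAE -> higher weight.
    Weights are normalized to sum to 1."""
    inv = [1.0 / e for e in val_errors]
    s = sum(inv)
    return [w / s for w in inv]

def ensemble_predict(preds, weights):
    """Weighted average of per-model predictions for a single sample."""
    return sum(p * w for p, w in zip(preds, weights))
```

A learned meta-model (as in HybridST) generalizes this by letting the combination depend on the inputs themselves, e.g. trusting the Transformer more during rush hour and the GNN more off-peak.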

Self-Supervised Moving Object Segmentation of Sparse and Noisy Radar Point Clouds

Authors: Leon Schwarzer, Matthias Zeller, Daniel Casado Herraez, Simon Dierl, Michael Heidingsfeld, Cyrill Stachniss
Date: 2025-11-04 09:21:45

Moving object segmentation is a crucial task for safe and reliable autonomous mobile systems like self-driving cars, improving the reliability and robustness of subsequent tasks like SLAM or path planning. While the segmentation of camera or LiDAR data is widely researched and achieves great results, it often introduces an increased latency by requiring the accumulation of temporal sequences to gain the necessary temporal context. Radar sensors overcome this problem with their ability to provide a direct measurement of a point's Doppler velocity, which can be exploited for single-scan moving object segmentation. However, radar point clouds are often sparse and noisy, making data annotation for use in supervised learning very tedious, time-consuming, and cost-intensive. To overcome this problem, we address the task of self-supervised moving object segmentation of sparse and noisy radar point clouds. We follow a two-step approach of contrastive self-supervised representation learning with subsequent supervised fine-tuning using limited amounts of annotated data. We propose a novel clustering-based contrastive loss function with cluster refinement based on dynamic points removal to pretrain the network to produce motion-aware representations of the radar data. Our method improves label efficiency after fine-tuning, effectively boosting state-of-the-art performance by self-supervised pretraining.

Whole-body motion planning and safety-critical control for aerial manipulation

Authors: Lin Yang, Jinwoo Lee, Domenico Campolo, H. Jin Kim, Jeonghyun Byun
Date: 2025-11-04 08:00:59

Aerial manipulation combines the maneuverability of multirotors with the dexterity of robotic arms to perform complex tasks in cluttered spaces. Yet planning safe, dynamically feasible trajectories remains difficult due to whole-body collision avoidance and the conservativeness of common geometric abstractions such as bounding boxes or ellipsoids. We present a whole-body motion planning and safety-critical control framework for aerial manipulators built on superquadrics (SQs). Using an SQ-plus-proxy representation, we model both the vehicle and obstacles with differentiable, geometry-accurate surfaces. Leveraging this representation, we introduce a maximum-clearance planner that fuses Voronoi diagrams with an equilibrium-manifold formulation to generate smooth, collision-aware trajectories. We further design a safety-critical controller that jointly enforces thrust limits and collision avoidance via high-order control barrier functions. In simulation, our approach outperforms sampling-based planners in cluttered environments, producing faster, safer, and smoother trajectories and exceeding ellipsoid-based baselines in geometric fidelity. Actual experiments on a physical aerial-manipulation platform confirm feasibility and robustness, demonstrating consistent performance across simulation and hardware settings. The video can be found at https://youtu.be/hQYKwrWf1Ak.

ZJUNlict Extended Team Description Paper 2025

Authors: Zifei Wu, Lijie Wang, Zhe Yang, Shijie Yang, Liang Wang, Haoran Fu, Yinliang Cai, Rong Xiong
Date: 2025-11-04 07:00:34

This paper presents the ZJUNlict team's work over the past year, covering both hardware and software advancements. In the hardware domain, the integration of an IMU into the v2023 robot was completed to enhance posture accuracy and angular velocity planning. On the software side, key modules were optimized, including the strategy and CUDA modules, with significant improvements in decision-making efficiency, ball pursuit prediction, and ball possession prediction to adapt to high-tempo game dynamics.

Large-scale automatic carbon ion treatment planning for head and neck cancers via parallel multi-agent reinforcement learning

Authors: Jueye Zhang, Chao Yang, Youfang Lai, Kai-Wen Li, Wenting Yan, Yunzhou Xia, Haimei Zhang, Jingjing Zhou, Gen Yang, Chen Lin, Tian Li, Yibao Zhang
Date: 2025-11-04 06:57:31

Head-and-neck cancer (HNC) planning is difficult because multiple critical organs-at-risk (OARs) are close to complex targets. Intensity-modulated carbon-ion therapy (IMCT) offers superior dose conformity and OAR sparing but remains slow due to relative biological effectiveness (RBE) modeling, leading to laborious, experience-based, and often suboptimal tuning of many treatment-planning parameters (TPPs). Recent deep learning (DL) methods are limited by data bias and plan feasibility, while reinforcement learning (RL) struggles to efficiently explore the exponentially large TPP search space. We propose a scalable multi-agent RL (MARL) framework for parallel tuning of 45 TPPs in IMCT. It uses a centralized-training decentralized-execution (CTDE) QMIX backbone with Double DQN, Dueling DQN, and recurrent encoding (DRQN) for stable learning in a high-dimensional, non-stationary environment. To enhance efficiency, we (1) use compact historical DVH vectors as state inputs, (2) apply a linear action-to-value transform mapping small discrete actions to uniform parameter adjustments, and (3) design an absolute, clinically informed piecewise reward aligned with plan scores. A synchronous multi-process worker system interfaces with the PHOENIX TPS for parallel optimization and accelerated data collection. On a head-and-neck dataset (10 training, 10 testing), the method tuned 45 parameters simultaneously and produced plans comparable to or better than expert manual ones (relative plan score: RL $85.93\pm7.85\%$ vs Manual $85.02\pm6.92\%$), with significant (p-value $<$ 0.05) improvements for five OARs. The framework efficiently explores high-dimensional TPP spaces and generates clinically competitive IMCT plans through direct TPS interaction, notably improving OAR sparing.
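Point (2), the linear action-to-value transform, amounts to mapping a small discrete action set to uniform parameter adjustments so each agent only ever chooses among a few moves. A sketch (the step size and bounds are assumed values, not the paper's settings):

```python
def action_to_delta(action: int, step: float = 0.05) -> float:
    """Linear action-to-value transform: the discrete action set
    {0: decrease, 1: hold, 2: increase} maps to a uniform parameter
    adjustment of -step, 0, or +step."""
    return (action - 1) * step

def apply_actions(params, actions, lo=0.0, hi=1.0, step=0.05):
    """Each agent adjusts its own treatment-planning parameter; results
    are clipped to the allowed [lo, hi] range."""
    return [min(hi, max(lo, p + action_to_delta(a, step)))
            for p, a in zip(params, actions)]
```

Keeping the per-step adjustment uniform shrinks each agent's action space to three choices, which is what lets 45 parameters be tuned in parallel without a combinatorial blow-up.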

Natural Building Blocks for Structured World Models: Theory, Evidence, and Scaling

Authors: Lancelot Da Costa, Sanjeev Namjoshi, Mohammed Abbas Ansari, Bernhard Schölkopf
Date: 2025-11-03 22:02:04

The field of world modeling is fragmented, with researchers developing bespoke architectures that rarely build upon each other. We propose a framework that specifies the natural building blocks for structured world models based on the fundamental stochastic processes that any world model must capture: discrete processes (logic, symbols) and continuous processes (physics, dynamics); the world model is then defined by the hierarchical composition of these building blocks. We examine Hidden Markov Models (HMMs) and switching linear dynamical systems (sLDS) as natural building blocks for discrete and continuous modeling--which become partially-observable Markov decision processes (POMDPs) and controlled sLDS when augmented with actions. This modular approach supports both passive modeling (generation, forecasting) and active control (planning, decision-making) within the same architecture. We avoid the combinatorial explosion of traditional structure learning by largely fixing the causal architecture and searching over only four depth parameters. We review practical expressiveness through multimodal generative modeling (passive) and planning from pixels (active), with performance competitive to neural approaches while maintaining interpretability. The core outstanding challenge is scalable joint structure-parameter learning; current methods finesse this by cleverly growing structure and parameters incrementally, but are limited in their scalability. If solved, these natural building blocks could provide foundational infrastructure for world modeling, analogous to how standardized layers enabled progress in deep learning.

Characterizing the Reliability of a Novel Upright CT for Proton Therapy

Authors: Yuhao Yan, Jordan Slagowski, Jessica Miller, John Hayes, Carson Hoffman, Minglei Kang, Carri Glide-Hurst
Date: 2025-11-03 22:01:59

Purpose: To evaluate reliability of upright CT for proton dose calculation and feasibility of a simplified phantom configuration for accelerated routine QA. Methods: A calibration phantom was scanned on an upright CT following consensus guidelines for 14 sessions/7 months. CT number repeatability was assessed by standard deviation (SD). A stopping power ratio (SPR) look-up table was derived. Phantom size dependency was assessed. The simplified phantom configuration was scanned for 15 sessions/8 months. Repeatability was assessed. CT numbers and SPR were compared with the consensus configuration. Both configurations were scanned on a recumbent CT to validate the findings. An anthropomorphic phantom was scanned on upright and recumbent CT. Targets were drawn mimicking spine and prostate tumors. Proton plans were developed using pencil beam scanning techniques and robust optimization. Equivalence of dose calculations was assessed via controlled comparisons. Results: The simplified configuration measured all CT numbers in 1 scan vs 5 for consensus guidelines. Upright CT demonstrated excellent longitudinal stability (inter- and intrasession SD <4.9 HU and 1.6 HU, respectively). Size dependency was identified with significant (p<.05) differences in CT numbers, propagated to $\Delta$SPR <5.3%. Significant (p<.05) differences were found comparing upright CT numbers measured by the 2 configurations ($\Delta$SPR<2.6%). Recumbent CT showed smaller $\Delta$SPR (<0.7%). Both dosimetric comparisons showed local differences (<8% of prescription dose) while clinical equivalence was found with target coverage differences <0.2% and gamma pass rates=100% at 3 mm/3% for all controlled comparisons of different CT machines and phantom configurations. Conclusions: The upright CT demonstrated reliability to support adaptive proton therapy. The simplified configuration shows feasibility for rapid QA.
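The quoted gamma pass rate (100% at 3 mm/3%) combines a distance-to-agreement criterion and a dose-difference criterion into one index. An illustrative 1-D global gamma computation (a didactic sketch, not the evaluation software used in the study):

```python
def gamma_pass_rate(ref, eval_, spacing, dta_mm=3.0, dd_frac=0.03):
    """1-D global gamma analysis (illustrative): a reference point passes
    if some evaluated point is simultaneously close in distance (scaled by
    dta_mm) and in dose (scaled by dd_frac of the reference maximum)."""
    d_max = max(ref)
    passed = 0
    for i, dr in enumerate(ref):
        best = min(
            ((((i - j) * spacing) / dta_mm) ** 2
             + ((de - dr) / (dd_frac * d_max)) ** 2) ** 0.5
            for j, de in enumerate(eval_)
        )
        passed += best <= 1.0  # gamma <= 1 means the point passes
    return passed / len(ref)
```

In clinical software the same search runs over 2-D or 3-D dose grids with interpolation, but the pass/fail logic per point is exactly this combined distance-dose metric.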

Human-AI Co-Embodied Intelligence for Scientific Experimentation and Manufacturing

Authors: Xinyi Lin, Yuyang Zhang, Yuanhang Gan, Juntao Chen, Hao Shen, Yichun He, Lijun Li, Ze Yuan, Shuang Wang, Chaohao Wang, Rui Zhang, Na Li, Jia Liu
Date: 2025-11-03 21:12:48

Scientific experimentation and manufacturing rely on complex, multi-step procedures that demand continuous human expertise for precise execution and decision-making. Despite advances in machine learning and automation, conventional models remain confined to virtual domains, while real-world experimentation and manufacturing still rely on human supervision and expertise. This gap between machine intelligence and physical execution limits reproducibility, scalability, and accessibility across scientific and manufacturing workflows. Here, we introduce human-AI co-embodied intelligence, a new form of physical AI that unites human users, agentic AI, and wearable hardware into an integrated system for real-world experimentation and intelligent manufacturing. In this paradigm, humans provide precise execution and control, while agentic AI contributes memory, contextual reasoning, adaptive planning, and real-time feedback. The wearable interface continuously captures the experimental and manufacturing processes and facilitates seamless communication between humans and AI for corrective guidance and interpretable collaboration. As a demonstration, we present the Agentic-Physical Experimentation (APEX) system, coupling agentic reasoning with physical execution through mixed reality. APEX observes and interprets human actions, aligns them with standard operating procedures, provides 3D visual guidance, and analyzes every step. Implemented in a cleanroom for flexible electronics fabrication, the APEX system achieves context-aware reasoning with accuracy exceeding general multimodal large language models, corrects errors in real time, and transfers expertise to beginners. These results establish a new class of agentic-physical-human intelligence that extends agentic reasoning beyond computation into the physical domain, transforming scientific research and manufacturing into autonomous, traceable, interpretable, and scalable processes.

Machine learning in LHCb Simulation: From fast to flash

Authors: Michał Mazurek
Date: 2025-11-03 19:49:12

Monte Carlo simulations are essential for physics analyses in high-energy physics, but their computational demands are continuously increasing. In LHCb, 90% of computing resources are used for simulations, with the calorimeter simulation being the most computationally intensive part. Fast simulations and flash simulations, leveraging machine learning techniques, offer promising solutions to this challenge with different levels of detail and speed. The CaloML framework accelerates electromagnetic shower propagation of photons and electrons in the LHCb calorimeter by up to two orders of magnitude, achieving a systematic error on reconstructed energies as low as 0.01%. Lamarr is an in-house flash simulation framework that reduces CPU time of the whole simulation phase by two orders of magnitude compared to traditional Geant4-based methods. In this paper, these two approaches are presented, highlighting their methodologies, performance, and validation results, as well as future development plans.

Found in Translation: at the limits of the Hudetz program

Authors: Toby Meadows
Date: 2025-11-03 19:30:43

This paper aims to provide an analysis of what it means when we say that a pair of theories, very generously construed, are equivalent in the sense that they are interdefinable. With regard to theories articulated in first order logic, we already have a natural and well-understood device for addressing this problem: the theory of relative interpretability as based on translation. However, many important theories in the sciences and mathematics (and, in particular, physics) are precisely formulated but are not naturally articulated in first order logic or any obvious language at all. In this paper, we plan to generalize the ordinary theory of interpretation to accommodate such theories by offering an account where definability does not mean definability relative to a particular structure, but rather definability without such reservations: definable in the language of mathematics.

Avoiding Blindness in Baryon Number Violating Processes: Free-Beam and Intranuclear Paths to Neutron-Antineutron Transitions

Authors: Joshua L. Barrow, Peter Fierlinger, Yuri Kamyshkov, Bernhard Meirose, David Milstead, Rabindra N. Mohapatra, Valentina Santoro
Date: 2025-11-03 18:34:18

Experimental searches for neutron--antineutron ($n \rightarrow \bar n$) transitions can be considered via two approaches: conversion in free-neutron beams and intranuclear transformation leading to matter instability in large-mass detectors. Plans for next-generation searches make it timely to highlight the complementarity, necessity, and limitations of each method. Converting the bound neutron limit into one for free neutrons traditionally utilizes nucleus-specific estimates of the in-medium suppression of $n \rightarrow \bar n$, obtained within mean-field theory under a single-operator assumption. This paper highlights how this suppression can be scenario-dependent, which can lead to deviations from the standard approach that can span several orders of magnitude. A further goal of the paper is to point out the need for a broader phenomenology program for $n\rightarrow \bar{n}$ that is akin to those developed for electric dipole moments and other systems for which short-distance new physics must be studied in-medium.

MOBIUS: A Multi-Modal Bipedal Robot that can Walk, Crawl, Climb, and Roll

Authors: Alexander Schperberg, Yusuke Tanaka, Stefano Di Cairano, Dennis Hong
Date: 2025-11-03 17:28:38

This article presents a Multi-Modal Bipedal Intelligent Urban Scout robot (MOBIUS) capable of walking, crawling, climbing, and rolling. MOBIUS features four limbs--two 6-DoF arms with two-finger grippers for manipulation and climbing, and two 4-DoF legs for locomotion--enabling smooth transitions across diverse terrains without reconfiguration. A hybrid control architecture combines reinforcement learning-based locomotion with model-based predictive and admittance control, enhanced for safety by a reference governor to achieve compliant contact interactions. A high-level MIQCP planner autonomously selects locomotion modes to balance stability and energy efficiency. Hardware experiments demonstrate robust gait transitions, dynamic climbing, and full-body load support via pinch grasp. Overall, MOBIUS demonstrates the importance of tight integration between morphology, high-level planning, and control to enable mobile loco-manipulation and grasping, substantially expanding its interaction capabilities, workspace, and traversability.

UniLION: Towards Unified Autonomous Driving Model with Linear Group RNNs

Authors: Zhe Liu, Jinghua Hou, Xiaoqing Ye, Jingdong Wang, Hengshuang Zhao, Xiang Bai
Date: 2025-11-03 17:24:19

Although transformers have demonstrated remarkable capabilities across various domains, their quadratic attention mechanisms introduce significant computational overhead when processing long-sequence data. In this paper, we present a unified autonomous driving model, UniLION, which efficiently handles large-scale LiDAR point clouds, high-resolution multi-view images, and even temporal sequences based on the linear group RNN operator (i.e., performs linear RNN for grouped features). Remarkably, UniLION serves as a single versatile architecture that can seamlessly support multiple specialized variants (i.e., LiDAR-only, temporal LiDAR, multi-modal, and multi-modal temporal fusion configurations) without requiring explicit temporal or multi-modal fusion modules. Moreover, UniLION consistently delivers competitive and even state-of-the-art performance across a wide range of core tasks, including 3D perception (e.g., 3D object detection, 3D object tracking, 3D occupancy prediction, BEV map segmentation), prediction (e.g., motion prediction), and planning (e.g., end-to-end planning). This unified paradigm naturally simplifies the design of multi-modal and multi-task autonomous driving systems while maintaining superior performance. Ultimately, we hope UniLION offers a fresh perspective on the development of 3D foundation models in autonomous driving. Code is available at https://github.com/happinesslz/UniLION
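The appeal of a linear group RNN operator over quadratic self-attention is that its recurrence costs O(T) in sequence length. A scalar sketch of such a linear scan, applied independently per feature group (purely illustrative; UniLION's actual operator works on high-dimensional grouped features):

```python
def linear_rnn_scan(xs, decay=0.9):
    """Minimal linear recurrence h_t = decay * h_{t-1} + x_t.
    One pass over the sequence: linear in length, unlike the quadratic
    cost of pairwise self-attention."""
    h, out = 0.0, []
    for x in xs:
        h = decay * h + x
        out.append(h)
    return out

def grouped_linear_rnn(groups, decay=0.9):
    """Apply the linear scan independently to each feature group,
    mirroring the 'linear RNN for grouped features' idea."""
    return [linear_rnn_scan(g, decay) for g in groups]
```

Because the state h is a fixed-size summary, the same operator can absorb long LiDAR sequences, multi-view image tokens, or temporal frames without any dedicated fusion module, which is the unification the abstract emphasizes.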

Dynamic Estimates of Displacement in Disaster Regions: A Policy-driven framework triangulating data

Authors: Elisabetta Pietrostefani, Matt Mason, Rodgers Iradukunda, Hong Tran-Jones, Iryna Loktieva, Francisco Rowe
Date: 2025-11-03 16:46:00

While traditional data systems remain fundamental to humanitarian response, they often lack the real-time responsiveness and spatial precision needed to capture increasingly complex patterns of displacement. Internal displacement reached an unprecedented 83.4 million people by the end of 2024, underscoring the urgent need for innovative, data-driven approaches to monitor and understand population movements. This report examines how integrating traditional data sources with emerging digital trace data, such as mobile phone GPS and social media activity, can enhance the accuracy, responsiveness, and granularity of displacement monitoring. Drawing on lessons from recent crises, including the escalation of the war in Ukraine and the 2022 floods in Pakistan, the report presents a structured pilot effort that tests the triangulation of multiple data streams to produce more robust and reliable displacement estimates. Statistical indicators derived from digital trace data are benchmarked against the International Organization for Migration's Displacement Tracking Matrix datasets to assess their validity, transparency, and scalability. The findings demonstrate how triangulated data approaches can deliver real-time, high-resolution insights into population movements, improving humanitarian resource allocation and intervention planning. The report includes a scalable framework for crisis monitoring that leverages digital innovation to strengthen humanitarian data systems and support evidence-based decision-making in complex emergencies.

Extending to the Submillimeter Universe with the CCAT Observatory

Authors: Eve M. Vavagiakis
Date: 2025-11-03 16:15:23

The CCAT Observatory's Fred Young Submillimeter Telescope, a novel, high-throughput, 6-meter aperture telescope, is scheduled for first light in 2026. Located at 5600 m on Cerro Chajnantor in the Chilean Atacama Desert, the CCAT site enables unprecedented submillimeter measurement capabilities, fully overlapping with millimeter-wave surveys like the Simons Observatory. CCAT will address a suite of science goals, from Big Bang cosmology, star formation, and line-intensity mapping of cosmic reionization, to galactic magnetic fields, transients, and galaxy evolution over cosmic time. We highlight CCAT's science goals with Prime-Cam, a first-generation science instrument for the Fred Young Submillimeter Telescope. Prime-Cam will field over 100,000 kinetic inductance detectors across seven instrument modules to enable over ten times faster mapping speed than previous submillimeter observatories in windows from 1.4 to 0.3 mm (220 to 850 GHz). We give an instrument summary, discuss the project status, and outline preliminary plans for early science.

Double white dwarf ZTF J0538+1953 as the brightest verification binary for space laser interferometers

Authors: Serguey Antipin, Alexander Belinski, Leonid Berdnikov, Alexandra Zubareva, Natalia Maslennikova, Konstantin Postnov, Ivan Strakhov
Date: 2025-11-03 15:07:54

A decrease in the orbital period of the ultrashort-period binary white dwarf ZTF J0538+1953, which is one of the Galactic verification binaries in the millihertz frequency range for planned space laser interferometers, has been measured. Based on photometric observations carried out on the 2.5-m telescope of the Caucasian Mountain Observatory of the Sternberg Astronomical Institute of Moscow State University (CMO SAI MSU), an O-C diagram is constructed. It can be described by quadratic elements of the brightness variation, which correspond to a decrease rate of the orbital period of the system of $dP/dt=-(1.16\pm 0.22)\times 10^{-11}$ s/s. The decrease rate of the orbital period in the quadrupole approximation for the emission of gravitational waves by a binary system corresponds to its chirp mass $\mathcal{M}=0.434\pm 0.05 M_\odot$, which turned out to be $\sim 30\%$ higher than the value obtained earlier from spectroscopic mass determination. The chirp mass of ZTF J0538+1953 inferred from the measured orbital decay rate makes this system the brightest Galactic verification binary for the LISA and TianQin space interferometers, with a signal-to-noise ratio of $\approx 119$ and $\approx 30$ over 5 years and 2.5 years of observations, respectively.
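For reference, the quadrupole-approximation relation behind the chirp-mass inference is the standard circular-binary orbital decay formula and its inversion (a textbook result, stated here for context rather than taken from the paper):

```latex
\dot{P} \;=\; -\,\frac{192\pi}{5}\left(\frac{2\pi}{P}\right)^{5/3}\frac{(G\mathcal{M})^{5/3}}{c^{5}},
\qquad
\mathcal{M} \;=\; \frac{c^{3}}{G}\left[\frac{5}{192\pi}\left(\frac{P}{2\pi}\right)^{5/3}\bigl|\dot{P}\bigr|\right]^{3/5},
```

where $\mathcal{M} = (m_1 m_2)^{3/5}/(m_1+m_2)^{1/5}$; measuring $P$ photometrically and $\dot{P}$ from the O-C diagram thus fixes $\mathcal{M}$ directly, independently of spectroscopic mass estimates.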

MARS: Multi-Agent Robotic System with Multimodal Large Language Models for Assistive Intelligence

Authors:Renjun Gao, Peiyan Zhong
Date:2025-11-03 13:58:37

Multimodal large language models (MLLMs) have shown remarkable capabilities in cross-modal understanding and reasoning, offering new opportunities for intelligent assistive systems, yet existing systems still struggle with risk-aware planning, user personalization, and grounding language plans into executable skills in cluttered homes. We introduce MARS - a Multi-Agent Robotic System powered by MLLMs for assistive intelligence and designed for smart home robots supporting people with disabilities. The system integrates four agents: a visual perception agent for extracting semantic and spatial features from environment images, a risk assessment agent for identifying and prioritizing hazards, a planning agent for generating executable action sequences, and an evaluation agent for iterative optimization. By combining multimodal perception with hierarchical multi-agent decision-making, the framework enables adaptive, risk-aware, and personalized assistance in dynamic indoor environments. Experiments on multiple datasets demonstrate the superior overall performance of the proposed system in risk-aware planning and coordinated multi-agent execution compared with state-of-the-art multimodal models. The proposed approach also highlights the potential of collaborative AI for practical assistive scenarios and provides a generalizable methodology for deploying MLLM-enabled multi-agent systems in real-world environments.
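
The perceive-assess-plan-evaluate decomposition can be pictured as a simple control loop. A toy Python sketch, with each agent reduced to a plain callable standing in for an MLLM query (names and signatures are hypothetical, not the paper's API):

```python
def run_assist_pipeline(image, user_profile, perceive, assess, plan, evaluate,
                        max_rounds=3):
    """Toy control loop for a MARS-style four-agent pipeline."""
    scene = perceive(image)                        # semantic + spatial features
    hazards = assess(scene, user_profile)          # prioritized risk list
    actions = plan(scene, hazards, user_profile)   # executable action sequence
    for _ in range(max_rounds):                    # iterative optimization
        ok, feedback = evaluate(scene, hazards, actions)
        if ok:
            break
        actions = plan(scene, hazards, user_profile, feedback=feedback)
    return actions
```

The evaluation agent closes the loop: failed plans are sent back to the planner with feedback, bounded by a fixed number of refinement rounds.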

Populations of tidal and pulsating variables in eclipsing binaries

Authors:Alex Kemp, Jasmine Vrancken, Joey S. G. Mombarg, Luc IJspeert, Mykyta Kilapets, Andrew Tkachenko, Conny Aerts
Date:2025-11-03 12:24:00

In this work, we seek to characterise a large sample of 14377 main-sequence eclipsing binaries in terms of their stellar, asteroseismic, and orbital properties. We conduct manual vetting on a 4000-target subset of our full 14377-target sample to identify targets with pressure (p) or gravity (g) modes. We infer stellar properties including the mass, convective core mass, radius, and central hydrogen fraction for the primary using Gaia Data Release 3 effective temperature and luminosity estimates and a grid of asteroseismically calibrated stellar models. We use surface brightness ratio and radius ratio estimates from previous eclipse analysis to study the effect of binarity on our results. Our manual vetting identifies 751 candidate g-mode pulsators, 131 p-mode pulsators, and a further 48 hybrid pulsators. The inferred stellar properties of the hybrid and p-mode pulsators are highly correlated, while the orbital properties of the hybrid pulsators align best with the g-mode pulsators. The g-mode pulsators themselves show a distribution that peaks around the classical γ Dor instability region but extends continuously towards higher masses, with no detectable divide between the classical γ Dor and SPB instability regions. There is evidence at the population level for a heightened level of tidal efficiency in stars showing g-mode or hybrid variability. Correcting the primary mass inference for binarity based on eclipse measurements of the surface brightness and radius ratios results in a relatively small shift towards lower masses. This work provides a working initial characterisation of this sample from which more detailed analyses folding in asteroseismic information can be built. It also provides a foundational understanding of the limitations and capabilities of this kind of rapid, scalable analysis that will be highly relevant in planning the exploitation of future large-scale binary surveys.

Floor Plan-Guided Visual Navigation Incorporating Depth and Directional Cues

Authors:Wei Huang, Jiaxin Li, Zang Wan, Huijun Di, Wei Liang, Zhu Yang
Date:2025-11-03 12:00:15

Guiding an agent to a specific target in indoor environments based solely on RGB inputs and a floor plan is a promising yet challenging problem. Although existing methods have made significant progress, two challenges remain unresolved. First, the modality gap between egocentric RGB observations and the floor plan hinders the integration of visual and spatial information for both local obstacle avoidance and global planning. Second, accurate localization is critical for navigation performance, but remains challenging at deployment in unseen environments due to the lack of explicit geometric alignment between RGB inputs and floor plans. We propose a novel diffusion-based policy, GlocDiff, which integrates global path planning from the floor plan with local depth-aware features derived from RGB observations. The floor plan offers explicit global guidance, while the depth features provide implicit geometric cues, collectively enabling precise prediction of optimal navigation directions and robust obstacle avoidance. Moreover, GlocDiff introduces noise perturbation during training to enhance robustness against pose estimation errors, and we find that combining this with a relatively stable visual odometry (VO) module during inference significantly improves navigation performance. Extensive experiments on the FloNa benchmark demonstrate GlocDiff's efficiency and effectiveness in achieving superior navigation performance, and successful real-world deployments further highlight its potential for widespread practical applications.
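
The noise-perturbation idea is straightforward to illustrate: corrupt the pose fed to the policy during training so it learns to tolerate localization error at test time. A hedged NumPy sketch with made-up noise scales (the paper's actual schedule is not stated in the abstract):

```python
import numpy as np

def perturb_pose(pose_xy_yaw, sigma_xy=0.10, sigma_yaw=0.05, rng=None):
    """Add Gaussian noise to an (x, y, yaw) training pose.

    sigma_xy [m] and sigma_yaw [rad] are illustrative values only.
    """
    rng = np.random.default_rng() if rng is None else rng
    pose = np.asarray(pose_xy_yaw, dtype=float)
    noise = rng.normal(0.0, [sigma_xy, sigma_xy, sigma_yaw])
    return pose + noise
```

At inference no noise is injected; the trained policy simply sees the (imperfect) VO estimate, which the training-time perturbation has taught it to absorb.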

MO-SeGMan: Rearrangement Planning Framework for Multi Objective Sequential and Guided Manipulation in Constrained Environments

Authors:Cankut Bora Tuncer, Marc Toussaint, Ozgur S. Oguz
Date:2025-11-03 11:38:57

In this work, we introduce MO-SeGMan, a Multi-Objective Sequential and Guided Manipulation planner for highly constrained rearrangement problems. MO-SeGMan generates object placement sequences that minimize both per-object replanning and robot travel distance while preserving critical dependency structures via a lazy evaluation method. To address highly cluttered, non-monotone scenarios, we propose a Selective Guided Forward Search (SGFS) that efficiently relocates only critical obstacles, and only to feasible relocation points. Furthermore, we adopt a refinement method for adaptive subgoal selection to eliminate unnecessary pick-and-place actions, thereby improving overall solution quality. Extensive evaluations on nine benchmark rearrangement tasks demonstrate that MO-SeGMan generates feasible motion plans in all cases, consistently achieving faster solution times and superior solution quality compared to the baselines. These results highlight the robustness and scalability of the proposed framework for complex rearrangement planning problems.

Edge-Enabled UAV Swarm Deployment for Rapid Post-Disaster Search and Rescue

Authors:Alaa Awad Abdellatif, Helder Fontes, Andre Coelho, Luis M. Pessoa, Rui Campos
Date:2025-11-03 11:19:37

This paper presents an optimized Joint Radar-Communication (JRC) system utilizing multiple Unmanned Aerial Vehicles (UAVs) to simultaneously achieve sensing and communication objectives. By leveraging UAVs equipped with dual radar and communication capabilities, the proposed framework aims to maximize radar sensing performance across all UAVs in challenging environments. The proposed approach focuses on formulating and solving a UAV positioning and power allocation problem to optimize multi-UAV sensing and communication performance over multiple targets within designated zones. Because the problem is NP-hard and combinatorial, we propose a Distributed JRC-based (DJRC) solution, which assigns an efficient reward to candidate actions and, at each step, selects the action that maximizes this reward while ensuring both communication and sensing performance. Simulation results demonstrate significant performance improvements of the proposed solution over state-of-the-art radar- or communication-centric trajectory planning methods, with complexity polynomial in the number of UAVs and linear in the iteration count.
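
The greedy reward-maximizing step can be sketched generically. The utility functions and the weight below are illustrative stand-ins, not the paper's actual reward:

```python
def select_action(candidates, sensing_utility, comms_utility, w=0.6):
    """One greedy step of a distributed JRC-style controller: score each
    candidate action by a weighted sum of a sensing utility and a
    communication utility, then take the argmax."""
    def reward(action):
        return w * sensing_utility(action) + (1.0 - w) * comms_utility(action)
    return max(candidates, key=reward)

# Toy example: candidate hover altitudes [m]; the radar term prefers flying
# low, the link term prefers flying high.
best = select_action([50, 100, 150],
                     sensing_utility=lambda h: 100.0 / h,
                     comms_utility=lambda h: h / 150.0)
print(best)  # 50
```

Each UAV running such a step locally, with rewards shaped to account for its neighbors' coverage, is what keeps the scheme distributed and its per-iteration cost low.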

CaRLi-V: Camera-RADAR-LiDAR Point-Wise 3D Velocity Estimation

Authors:Landson Guo, Andres M. Diaz Aguilar, William Talbot, Turcan Tuna, Marco Hutter, Cesar Cadena
Date:2025-11-03 09:32:59

Accurate point-wise velocity estimation in 3D is crucial for robot interaction with non-rigid, dynamic agents, such as humans, enabling robust performance in path planning, collision avoidance, and object manipulation in dynamic environments. To this end, this paper proposes a novel RADAR, LiDAR, and camera fusion pipeline for point-wise 3D velocity estimation named CaRLi-V. This pipeline leverages raw RADAR measurements to create a novel RADAR representation, the velocity cube, which densely represents radial velocities within the RADAR's field of view. By combining the velocity cube for radial velocity extraction, optical flow for tangential velocity estimation, and LiDAR for point-wise range measurements through a closed-form solution, our approach produces 3D velocity estimates for a dense array of points. Developed as an open-source ROS2 package, CaRLi-V has been field-tested on a custom dataset and shown to achieve low velocity error relative to ground truth, enabling point-wise velocity estimation for robotic applications.
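
One plausible closed-form combination of the three sensors for a single point, assuming a pinhole camera model (this is a sketch, not the paper's exact derivation):

```python
import numpy as np

def pointwise_velocity(ray, v_radial, flow_px_s, depth_m, fx, fy):
    """Fuse per-point measurements into a 3D velocity in the camera frame.

    - RADAR supplies the radial speed v_radial along the unit viewing ray.
    - Optical flow (du, dv) in px/s plus LiDAR depth give the tangential
      part via the pinhole model: v_t ~= depth * (du/fx, dv/fy).
    """
    ray = np.asarray(ray, dtype=float)
    ray /= np.linalg.norm(ray)
    du, dv = flow_px_s
    # Back-project image-plane motion to a metric tangential velocity.
    v_tan = np.array([depth_m * du / fx, depth_m * dv / fy, 0.0])
    # Remove any radial component leaked in by the flow term, then add the
    # RADAR-measured radial velocity.
    v_tan -= ray * (v_tan @ ray)
    return ray * v_radial + v_tan
```

Applying this per pixel over the depth-completed image yields the dense point-wise velocity field the abstract describes.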

MIQ-SAM3D: From Single-Point Prompt to Multi-Instance Segmentation via Competitive Query Refinement

Authors:Jierui Qu, Jianchun Zhao
Date:2025-11-03 08:48:28

Accurate segmentation of medical images is fundamental to tumor diagnosis and treatment planning. SAM-based interactive segmentation has gained attention for its strong generalization, but most methods follow a single-point-to-single-object paradigm, which limits multi-lesion segmentation. Moreover, ViT backbones capture global context but often miss high-fidelity local details. We propose MIQ-SAM3D, a multi-instance 3D segmentation framework with a competitive query optimization strategy that shifts from single-point-to-single-mask to single-point-to-multi-instance. A prompt-conditioned instance-query generator transforms a single point prompt into multiple specialized queries, enabling retrieval of all semantically similar lesions across the 3D volume from a single exemplar. A hybrid CNN-Transformer encoder injects CNN-derived boundary saliency into ViT self-attention via spatial gating. A competitively optimized query decoder then enables end-to-end, parallel, multi-instance prediction through inter-query competition. On the LiTS17 and KiTS21 datasets, MIQ-SAM3D achieves performance comparable to state-of-the-art methods and exhibits strong robustness to prompt selection, providing a practical solution for efficient annotation of clinically relevant multi-lesion cases.

Embodied Cognition Augmented End2End Autonomous Driving

Authors:Ling Niu, Xiaoji Zheng, Han Wang, Chen Zheng, Ziyuan Yang, Bokui Chen, Jiangtao Gong
Date:2025-11-03 08:34:44

In recent years, vision-based end-to-end autonomous driving has emerged as a new paradigm. However, popular end-to-end approaches typically rely on visual feature extraction networks trained under label supervision. This limited supervision framework restricts the generality and applicability of driving models. In this paper, we propose a novel paradigm termed $E^{3}AD$, which advocates contrastive learning between visual feature extraction networks and a general EEG large model, in order to learn latent human driving cognition for enhancing end-to-end planning. In this work, we collected a cognitive dataset for this contrastive learning process. Subsequently, we investigated the methods and potential mechanisms for enhancing end-to-end planning with human driving cognition, using popular driving models as baselines on publicly available autonomous driving datasets. Both open-loop and closed-loop tests are conducted for a comprehensive evaluation of planning performance. Experimental results demonstrate that the $E^{3}AD$ paradigm significantly enhances the end-to-end planning performance of baseline models. Ablation studies further validate the contribution of driving cognition and the effectiveness of the contrastive learning process. To the best of our knowledge, this is the first work to integrate human driving cognition for improving end-to-end autonomous driving planning. It represents an initial attempt to incorporate embodied cognitive data into end-to-end autonomous driving, providing valuable insights for future brain-inspired autonomous driving systems. Our code will be made available on GitHub.
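
Cross-modal alignment of this kind is commonly implemented as a symmetric InfoNCE objective over paired embeddings; a NumPy sketch under that assumption (the paper's actual loss is not specified in the abstract):

```python
import numpy as np

def info_nce(vis_emb, eeg_emb, tau=0.07):
    """Symmetric InfoNCE loss between a batch of paired visual and EEG
    embeddings: matched pairs (the diagonal of the similarity matrix)
    should score higher than all mismatched pairs."""
    vis = vis_emb / np.linalg.norm(vis_emb, axis=1, keepdims=True)
    eeg = eeg_emb / np.linalg.norm(eeg_emb, axis=1, keepdims=True)
    logits = vis @ eeg.T / tau                  # (B, B) similarity matrix
    idx = np.arange(len(logits))

    def cross_entropy(lg):
        lg = lg - lg.max(axis=1, keepdims=True)             # stability
        log_p = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -log_p[idx, idx].mean()                      # diagonal = matches

    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

Minimizing this pulls the visual features toward the EEG embedding of the same driving moment, which is one concrete way the latent driving cognition described above could be distilled into the planner's visual backbone.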