planning - 2025-08-22

Intern-S1: A Scientific Multimodal Foundation Model

Authors:Lei Bai, Zhongrui Cai, Maosong Cao, Weihan Cao, Chiyu Chen, Haojiong Chen, Kai Chen, Pengcheng Chen, Ying Chen, Yongkang Chen, Yu Cheng, Yu Cheng, Pei Chu, Tao Chu, Erfei Cui, Ganqu Cui, Long Cui, Ziyun Cui, Nianchen Deng, Ning Ding, Nanqin Dong, Peijie Dong, Shihan Dou, Sinan Du, Haodong Duan, Caihua Fan, Ben Gao, Changjiang Gao, Jianfei Gao, Songyang Gao, Yang Gao, Zhangwei Gao, Jiaye Ge, Qiming Ge, Lixin Gu, Yuzhe Gu, Aijia Guo, Qipeng Guo, Xu Guo, Conghui He, Junjun He, Yili Hong, Siyuan Hou, Caiyu Hu, Hanglei Hu, Jucheng Hu, Ming Hu, Zhouqi Hua, Haian Huang, Junhao Huang, Xu Huang, Zixian Huang, Zhe Jiang, Lingkai Kong, Linyang Li, Peiji Li, Pengze Li, Shuaibin Li, Tianbin Li, Wei Li, Yuqiang Li, Dahua Lin, Junyao Lin, Tianyi Lin, Zhishan Lin, Hongwei Liu, Jiangning Liu, Jiyao Liu, Junnan Liu, Kai Liu, Kaiwen Liu, Kuikun Liu, Shichun Liu, Shudong Liu, Wei Liu, Xinyao Liu, Yuhong Liu, Zhan Liu, Yinquan Lu, Haijun Lv, Hongxia Lv, Huijie Lv, Qidang Lv, Ying Lv, Chengqi Lyu, Chenglong Ma, Jianpeng Ma, Ren Ma, Runmin Ma, Runyuan Ma, Xinzhu Ma, Yichuan Ma, Zihan Ma, Sixuan Mi, Junzhi Ning, Wenchang Ning, Xinle Pang, Jiahui Peng, Runyu Peng, Yu Qiao, Jiantao Qiu, Xiaoye Qu, Yuan Qu, Yuchen Ren, Fukai Shang, Wenqi Shao, Junhao Shen, Shuaike Shen, Chunfeng Song, Demin Song, Diping Song, Chenlin Su, Weijie Su, Weigao Sun, Yu Sun, Qian Tan, Cheng Tang, Huanze Tang, Kexian Tang, Shixiang Tang, Jian Tong, Aoran Wang, Bin Wang, Dong Wang, Lintao Wang, Rui Wang, Weiyun Wang, Wenhai Wang, Yi Wang, Ziyi Wang, Ling-I Wu, Wen Wu, Yue Wu, Zijian Wu, Linchen Xiao, Shuhao Xing, Chao Xu, Huihui Xu, Jun Xu, Ruiliang Xu, Wanghan Xu, GanLin Yang, Yuming Yang, Haochen Ye, Jin Ye, Shenglong Ye, Jia Yu, Jiashuo Yu, Jing Yu, Fei Yuan, Bo Zhang, Chao Zhang, Chen Zhang, Hongjie Zhang, Jin Zhang, Qiaosheng Zhang, Qiuyinzhe Zhang, Songyang Zhang, Taolin Zhang, Wenlong Zhang, Wenwei Zhang, Yechen Zhang, Ziyang Zhang, Haiteng Zhao, Qian Zhao, Xiangyu Zhao, Xiangyu Zhao, Bowen Zhou, Dongzhan Zhou, Peiheng Zhou, Yuhao Zhou, Yunhua Zhou, Dongsheng Zhu, Lin Zhu, Yicheng Zou

Date:2025-08-21 17:58:00

In recent years, a plethora of open-source foundation models have emerged, achieving remarkable progress in some widely followed fields, with performance quite close to that of closed-source models. However, in high-value but more challenging scientific fields, either the fields still rely on expert models, or the progress of general foundation models lags significantly behind that in popular areas. This is far from sufficient for transforming scientific research and leaves a substantial gap between open-source and closed-source models in these scientific domains. To narrow this gap and take a step further toward Artificial General Intelligence (AGI), we introduce Intern-S1, a specialized generalist equipped with general understanding and reasoning capabilities and with the expertise to analyze data from multiple scientific modalities. Intern-S1 is a multimodal Mixture-of-Experts (MoE) model with 28 billion activated parameters and 241 billion total parameters, continually pre-trained on 5T tokens, including over 2.5T tokens from scientific domains. In the post-training stage, Intern-S1 undergoes offline and then online reinforcement learning (RL) in InternBootCamp, where we propose Mixture-of-Rewards (MoR) to synergize the RL training on more than 1000 tasks simultaneously. Through integrated innovations in algorithms, data, and training systems, Intern-S1 achieved top-tier performance in online RL training. On comprehensive evaluation benchmarks, Intern-S1 demonstrates competitive performance on general reasoning tasks among open-source models and significantly outperforms open-source models in scientific domains, surpassing closed-source state-of-the-art models in professional tasks such as molecular synthesis planning, reaction condition prediction, and prediction of thermodynamic stability for crystals. Our models are available at https://huggingface.co/internlm/Intern-S1.

Understanding and Utilizing Dynamic Coupling in Free-Floating Space Manipulators for On-Orbit Servicing

Authors:Gargi Das, Daegyun Choi, Donghoon Kim

Date:2025-08-21 17:20:52

This study proposes a dynamic coupling-informed trajectory optimization algorithm for free-floating space manipulator systems (SMSs). Dynamic coupling between the base and the manipulator arms plays a critical role in influencing the system's behavior. While prior research has predominantly focused on minimizing this coupling, often overlooking its potential advantages, this work investigates how dynamic coupling can instead be leveraged to improve trajectory planning. Singular value decomposition (SVD) of the dynamic coupling matrix is employed to identify the dominant components governing coupling behavior. A quantitative metric is then formulated to characterize the strength and directionality of the coupling and is incorporated into a trajectory optimization framework. To assess the feasibility of the optimized trajectory, a sliding mode control-based tracking controller is designed to generate the required joint torque inputs. Simulation results demonstrate that explicitly accounting for dynamic coupling in trajectory planning enables more informed and potentially more efficient operation, offering new directions for the control of free-floating SMSs.
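
The SVD step described above can be illustrated with a minimal sketch. The abstract does not give the coupling matrix or the exact metric, so the matrix `H` below and the choice of "largest singular value plus its left singular vector" as the strength and direction are assumptions made purely for illustration.

```python
import numpy as np


def coupling_metric(coupling_matrix: np.ndarray) -> tuple[float, np.ndarray]:
    """Return a (strength, direction) pair for a coupling matrix.

    Hypothetical metric: strength is the largest singular value and
    direction is the matching left singular vector. The paper's actual
    metric is not stated in the abstract.
    """
    u, s, _vt = np.linalg.svd(coupling_matrix, full_matrices=False)
    # Singular values come back sorted in descending order,
    # so index 0 is the dominant coupling component.
    return float(s[0]), u[:, 0]


# Illustrative 3x2 matrix mapping two joint rates to three base rates.
# The numbers are made up and carry no physical meaning.
H = np.array([[0.30, 0.05],
              [0.02, 0.20],
              [0.00, 0.01]])

strength, direction = coupling_metric(H)
print(f"dominant coupling strength: {strength:.4f}")
print("dominant base-motion direction:", np.round(direction, 4))
```

A trajectory optimizer could then penalize or reward motion along `direction`, depending on whether the coupling is to be suppressed or exploited.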

CM2LoD3: Reconstructing LoD3 Building Models Using Semantic Conflict Maps

Authors:Franz Hanke, Antonia Bieringer, Olaf Wysocki, Boris Jutzi

Date:2025-08-21 15:54:13

Detailed 3D building models are crucial for urban planning, digital twins, and disaster management applications. While Level of Detail 1 (LoD1) and LoD2 building models are widely available, they lack detailed facade elements essential for advanced urban analysis. In contrast, LoD3 models address this limitation by incorporating facade elements such as windows, doors, and underpasses. However, their generation has traditionally required manual modeling, making large-scale adoption challenging. In this contribution, CM2LoD3, we present a novel method for reconstructing LoD3 building models leveraging Conflict Maps (CMs) obtained from ray-to-model-prior analysis. Unlike previous works, we concentrate on semantically segmenting real-world CMs with synthetically generated CMs from our developed Semantic Conflict Map Generator (SCMG). We also observe that additional segmentation of textured models can be fused with CMs using confidence scores to further increase segmentation performance and thus increase 3D reconstruction accuracy. Experimental results demonstrate the effectiveness of our CM2LoD3 method in segmenting and reconstructing building openings, reaching 61% performance with uncertainty-aware fusion of segmented building textures. This research contributes to the advancement of automated LoD3 model reconstruction, paving the way for scalable and efficient 3D city modeling. Our project is available at https://github.com/InFraHank/CM2LoD3

Mind and Motion Aligned: A Joint Evaluation IsaacSim Benchmark for Task Planning and Low-Level Policies in Mobile Manipulation

Authors:Nikita Kachaev, Andrei Spiridonov, Andrey Gorodetsky, Kirill Muravyev, Nikita Oskolkov, Aditya Narendra, Vlad Shakhuro, Dmitry Makarov, Aleksandr I. Panov, Polina Fedotova, Alexey K. Kovalev

Date:2025-08-21 15:48:51

Benchmarks are crucial for evaluating progress in robotics and embodied AI. However, a significant gap exists between benchmarks designed for high-level language instruction following, which often assume perfect low-level execution, and those for low-level robot control, which rely on simple, one-step commands. This disconnect prevents a comprehensive evaluation of integrated systems where both task planning and physical execution are critical. To address this, we propose Kitchen-R, a novel benchmark that unifies the evaluation of task planning and low-level control within a simulated kitchen environment. Built as a digital twin using the Isaac Sim simulator and featuring more than 500 complex language instructions, Kitchen-R supports a mobile manipulator robot. We provide baseline methods for our benchmark, including a task-planning strategy based on a vision-language model and a low-level control policy based on diffusion policy. We also provide a trajectory collection system. Our benchmark offers a flexible framework for three evaluation modes: independent assessment of the planning module, independent assessment of the control policy, and, crucially, an integrated evaluation of the whole system. Kitchen-R bridges a key gap in embodied AI research, enabling more holistic and realistic benchmarking of language-guided robotic agents.

Adapting A Vector-Symbolic Memory for Lisp ACT-R

Authors:Meera Ray, Christopher L. Dancy

Date:2025-08-21 14:54:25

Holographic Declarative Memory (HDM) is a vector-symbolic alternative to ACT-R's Declarative Memory (DM) system that can bring advantages such as scalability and architecturally defined similarity between DM chunks. We adapted HDM to work with the most comprehensive and widely-used implementation of ACT-R (Lisp ACT-R) so extant ACT-R models designed with DM can be run with HDM without major changes. With this adaptation of HDM, we have developed vector-based versions of common ACT-R functions, set up a text processing pipeline to add the contents of large documents to ACT-R memory, and most significantly created a useful and novel mechanism to retrieve an entire chunk of memory based on a request using only vector representations of tokens. Preliminary results indicate that we can maintain vector-symbolic advantages of HDM (e.g., chunk recall without storing the actual chunk and other advantages with scaling) while also extending it so that previous ACT-R models may work with the system with little (or potentially no) modifications within the actual procedural and declarative memory portions of a model. As a part of iterative improvement of this newly translated holographic declarative memory module, we will continue to explore better time-context representations for vectors to improve the module's ability to reconstruct chunks during recall. To more fully test this translated HDM module, we also plan to develop decision-making models that use instance-based learning (IBL) theory, which is a useful application of HDM given the advantages of the system.

Multi-perspective monitoring of wildlife and human activities from camera traps and drones with deep learning models

Authors:Hao Chen, Fang Qiu, Li An, Douglas Stow, Eve Bohnett, Haitao Lyu, Shuang Tian

Date:2025-08-21 14:53:16

Wildlife and human activities are key components of landscape systems. Understanding their spatial distribution is essential for evaluating human-wildlife interactions and informing effective conservation planning. This study performs multi-perspective monitoring of wildlife and human activities by combining camera traps and drone imagery. Capturing the spatial patterns of their distributions allows the identification of overlapping activity zones and the assessment of the degree of human-wildlife conflict. The study was conducted in Chitwan National Park (CNP), Nepal, and adjacent regions. Images collected by visible and near-infrared camera traps and thermal infrared drones from February to July 2022 were processed to create training and testing datasets, which were used to build deep learning models to automatically identify wildlife and human activities. Drone-collected thermal imagery was used for detecting targets to provide an additional monitoring perspective. Spatial pattern analysis was performed to identify animal and resident activity hotspots and delineate potential human-wildlife conflict zones. Among the deep learning models tested, YOLOv11s achieved the highest performance with a precision of 96.2%, recall of 92.3%, mAP50 of 96.7%, and mAP50-95 of 81.3%, making it the most effective for detecting objects in camera trap imagery. Drone-based thermal imagery, analyzed with an enhanced Faster R-CNN model, added a complementary aerial viewpoint to camera trap detections. Spatial pattern analysis identified clear hotspots for both wildlife and human activities, and their overlap within certain areas of the CNP and its buffer zones indicates potential conflict. This study reveals human-wildlife conflicts within the conserved landscape. Integrating multi-perspective monitoring with automated object detection enhances wildlife surveillance and landscape management.

Neutralization of Levitated Charged Nanodiamond: Towards matter-wave interferometry with massive objects

Authors:Sela Liran, Or Dobkowski, Rafael Benjaminov, Peter Skakunenko, Michael Averbukh, Yaniv Bar-Haim, David Groswasser, Joshua H. Baraban, Ron Folman

Date:2025-08-21 14:50:58

Quantum mechanics (QM) and general relativity (GR), also known as the theory of gravity, are the two pillars of modern physics. A matter-wave interferometer with a massive particle can test numerous fundamental ideas, including the spatial superposition principle - a foundational concept in QM - in completely new regimes, as well as the interface between QM and GR, e.g., testing the quantization of gravity. Consequently, there exists an intensive effort to realize such an interferometer. While several paths are being pursued, we focus on utilizing nanodiamonds (NDs) as our particle, and a spin embedded in the ND together with Stern-Gerlach forces, to achieve a closed loop in space-time. There is a growing community of groups pursuing this path [1]. We are posting this technical note (as part of a series of seven such notes) to highlight our plans and solutions concerning various challenges in this ambitious endeavor, hoping this will support this growing community. In this work we demonstrate the neutralization of levitated nanodiamonds using ultraviolet photoemission, and characterize the dependence of this process on both the illumination wavelength and particle size. Furthermore, we demonstrate discrete, single-electron charge manipulation of a levitated nanodiamond in a needle Paul trap at a pressure of 0.5 Torr. Finally, we demonstrate fast neutralization of levitated nanodiamonds, achieving a neutralization rate much faster than the state of the art. As neutralization is crucial to avoid spatial decoherence, this constitutes a significant step towards the realization of a nanodiamond spatial interferometer. We would be happy to make available more details upon request.

The Digital Life of Parisian Parks: Multifunctionality and Urban Context Uncovered by Mobile Application Traffic

Authors:André Felipe Zanella, Linus W. Dietz, Sanja Šćepanović, Ke Zhou, Zbigniew Smoreda, Daniele Quercia

Date:2025-08-21 12:43:42

Landscape architecture typically considers urban parks through the lens of form and function. While past research on equitable access has focused mainly on form, studies of functions have been constrained by limited scale and coarse measurement. Existing efforts have partially quantified functions through small-scale surveys and movement data (e.g., GPS) or general usage records (e.g., CDR), but have not captured the activities and motivations underlying park visits. As a result, our understanding of the functional roles urban parks play remains incomplete. To address this gap, we introduce a method that refines mobile base station coverage using antenna azimuths, enabling clearer distinction of mobile traffic within parks versus surrounding areas. Using Paris as a case study, we analyze a large-scale set of passively collected per-app mobile network traffic - 492 million hourly records for 45 parks. We test two hypotheses: the central-city hypothesis, which posits multifunctional parks emerge in dense, high-rent areas due to land scarcity; and the socio-spatial hypothesis, which views parks as reflections of neighborhood routines and preferences. Our analysis shows that parks have distinctive mobile traffic signatures, differing from both their surroundings and from each other. By clustering parks on temporal and app usage patterns, we identify three functional types - lunchbreak, cultural, and recreational - with different visitation motivations. Centrally located parks (cultural and lunchbreak) display more diverse app use and temporal variation, while suburban (recreational) parks reflect digital behaviors of nearby communities, with app preferences aligned to neighborhood income. These findings demonstrate the value of mobile traffic as a proxy for studying urban green space functions, with implications for park planning, public health, and well-being.

Modeling of Light Production in Inorganic Scintillators

Authors:B. Kreider, I. Cox, R. Grzywacz, J. M. Allmond, A. Augustyn, N. Braukman, P. Brionnet, A. Esmaylzadeh, J. Fischer, N. Fukuda, G. Garcia De Lorenzo, S. Go, S. Hanai, D. Hoskins, N. Imai, T. T. King, N. Kitamura, K. Kolos, A. Korgul, C. Mazzocchi, S. Nishimura, K. Nishio, V. Phong, T. Ruland, K. P. Rykaczewski, A. Skruch, Z. Y. Xu, R. Yokoyama

Date:2025-08-21 12:38:53

In recent experiments, inorganic scintillators have been used to study the decays of exotic nuclei, providing an alternative to silicon detectors and enabling measurements that were previously impossible. However, proper use of these materials requires us to understand and quantify the scintillation process. In this work, we propose a framework based on that of Birks [Proc. Phys. Soc. A 64, 874] and Meyer and Murray [Phys. Rev. 128, 98] to model the light output of inorganic scintillators in response to beams of energetic heavy ions over a broad range of energies. Our model suggests that, for sufficiently heavy ions at high energies, the majority of the light output is associated with the creation of delta electrons, which are induced by the passage of the beam through the material. These delta electrons dramatically impact the response of detection systems when subject to ions with velocities typical of beams in modern fragmentation facilities. We test the accuracy of our model with data from Lutetium Yttrium Orthosilicate (LYSO:Ce), a common inorganic scintillator. We compare calculated light production and quenching factors with experimental data for heavy ions of varying mass and energy as well as make a quantitative estimate of the effects of delta rays on overall light output. The model presented herein will serve as a basic framework for further studies of scintillator response to heavy ions. Our results are crucial in planning future experiments where relativistic exotic nuclei are interacting with scintillator detectors.
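
For orientation, the classic Birks relation that the framework builds on can be written as dL/dx = S (dE/dx) / (1 + kB dE/dx). The sketch below implements only this textbook form, not the extended model of the paper; the function names and all numerical values are illustrative assumptions.

```python
def birks_light_yield(dedx: float, scint_eff: float, kb: float) -> float:
    """Light yield per unit path length from the classic Birks relation.

    dedx      : stopping power dE/dx (e.g. MeV/mm)
    scint_eff : scintillation efficiency S (light units per MeV)
    kb        : Birks quenching constant kB (e.g. mm/MeV)
    """
    return scint_eff * dedx / (1.0 + kb * dedx)


def quenching_factor(dedx: float, kb: float) -> float:
    """Ratio of quenched to unquenched light yield at the same dE/dx."""
    return 1.0 / (1.0 + kb * dedx)


# Illustrative values only; they are not taken from the paper.
for dedx in (1.0, 10.0, 100.0):
    print(dedx, birks_light_yield(dedx, scint_eff=1.0, kb=0.01),
          quenching_factor(dedx, kb=0.01))
```

The quenching factor falls as dE/dx grows, which is why a heavy ion with a dense track produces less light per unit energy than a lightly ionizing particle.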

Post-processing of ensemble photovoltaic power forecasts with distributional and quantile regression methods

Authors:Martin János Mayer, Ágnes Baran, Sebastian Lerch, Nina Horat, Dazhi Yang, Sándor Baran

Date:2025-08-21 12:35:29

Accurate and reliable forecasting of photovoltaic (PV) power generation is crucial for grid operations, electricity markets, and energy planning, as solar systems now contribute a significant share of the electricity supply in many countries. PV power forecasts are often generated by converting forecasts of relevant weather variables to power predictions via a model chain. The use of ensemble simulations from numerical weather prediction models results in probabilistic PV forecasts in the form of a forecast ensemble. However, weather forecasts often exhibit systematic errors that propagate through the model chain, leading to biased and/or uncalibrated PV power predictions. These deficiencies can be mitigated by statistical post-processing. Using PV production data and corresponding short-term PV power ensemble forecasts at seven utility-scale PV plants in Hungary, we systematically evaluate and compare seven state-of-the-art methods for post-processing PV power forecasts. These include both parametric and non-parametric techniques, as well as statistical and machine learning-based approaches. Our results show that compared to the raw PV power ensemble, any form of statistical post-processing significantly improves the predictive performance. Non-parametric methods outperform parametric models, with advanced nonlinear quantile regression models showing the best results. Furthermore, machine learning-based approaches surpass their traditional statistical counterparts.
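
Quantile forecasts of this kind are commonly scored with the pinball (quantile) loss. The sketch below is a generic illustration under that assumption: it takes the empirical quantiles of a raw ensemble as a baseline forecast and scores them. The ensemble values, the observation, and the function name are hypothetical and do not reproduce any of the seven methods compared in the study.

```python
import numpy as np


def pinball_loss(y_true, y_pred, tau: float) -> float:
    """Mean pinball loss of quantile forecasts y_pred at level tau."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))


# Hypothetical raw PV power ensemble (MW) for one lead time, plus the
# observed production. Values are made up for illustration.
ensemble = np.array([3.1, 3.4, 3.6, 3.9, 4.2, 4.4, 4.8])
observed = 4.6

levels = [0.1, 0.5, 0.9]
raw_quantiles = np.quantile(ensemble, levels)
for tau, q in zip(levels, raw_quantiles):
    print(f"tau={tau:.1f}  forecast={q:.2f}  loss={pinball_loss([observed], [q], tau):.3f}")
```

A post-processing method would replace `raw_quantiles` with calibrated quantiles, and a lower average pinball loss over a test period would indicate an improvement over the raw ensemble.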

Quantum control of Nitrogen-Vacancy spin in Diamonds: Towards matter-wave interferometry with massive objects

Authors:N. Levi, O. Feldman, Y. Rosenzweig, D. Groswasser, A. Elgarat, M. Gal-Katizri, R. Folman

Date:2025-08-21 12:31:11

Quantum mechanics (QM) and general relativity (GR), also known as the theory of gravity, are the two pillars of modern physics. A matter-wave interferometer with a massive particle can test numerous fundamental ideas, including the spatial superposition principle - a foundational concept in QM - in previously unexplored regimes. It also opens the possibility of probing the interface between QM and GR, such as testing the quantization of gravity. Consequently, there exists an intensive effort to realize such an interferometer. While several approaches are being explored, we focus on utilizing nanodiamonds (NDs) with embedded spins as test particles which, in combination with Stern-Gerlach forces, enable the realization of a closed-loop matter-wave interferometer in space-time. There is a growing community of groups pursuing this path [1]. We are posting this technical note (as part of a series of seven such notes) to highlight our plans and solutions concerning various challenges in this ambitious endeavor, hoping this will support this growing community. Here we present our work on quantum control of a nitrogen-vacancy spin system in bulk diamonds and in levitated diamonds as a step towards Stern-Gerlach interferometry (SGI) with levitated nanodiamonds. Our simulations show that the current state of the art for spin coherence time in nanodiamonds, a few tens of microseconds, is good enough to enable an SGI spatial splitting on the order of nanometers for an ND composed of 10^7 atoms. We would be happy to make available more details upon request.

LGMSNet: Thinning a medical image segmentation model via dual-level multiscale fusion

Authors:Chengqi Dong, Fenghe Tang, Rongge Mao, Xinpei Gao, S. Kevin Zhou

Date:2025-08-21 11:54:09

Medical image segmentation plays a pivotal role in disease diagnosis and treatment planning, particularly in resource-constrained clinical settings where lightweight and generalizable models are urgently needed. However, existing lightweight models often compromise performance for efficiency and rarely adopt computationally expensive attention mechanisms, severely restricting their global contextual perception capabilities. Additionally, current architectures neglect the channel redundancy issue under the same convolutional kernels in medical imaging, which hinders effective feature extraction. To address these challenges, we propose LGMSNet, a novel lightweight framework based on a local and global dual multiscale design that achieves state-of-the-art performance with minimal computational overhead. LGMSNet employs heterogeneous intra-layer kernels to extract local high-frequency information while mitigating channel redundancy. In addition, the model integrates sparse transformer-convolutional hybrid branches to capture low-frequency global information. Extensive experiments across six public datasets demonstrate LGMSNet's superiority over existing state-of-the-art methods. In particular, LGMSNet maintains exceptional performance in zero-shot generalization tests on four unseen datasets, underscoring its potential for real-world deployment in resource-limited medical scenarios. The project code is available at https://github.com/cq-dong/LGMSNet.

Lang2Lift: A Framework for Language-Guided Pallet Detection and Pose Estimation Integrated in Autonomous Outdoor Forklift Operation

Authors:Huy Hoang Nguyen, Johannes Huemer, Markus Murschitz, Tobias Glueck, Minh Nhat Vu, Andreas Kugi

Date:2025-08-21 10:28:39

The logistics and construction industries face persistent challenges in automating pallet handling, especially in outdoor environments with variable payloads, inconsistencies in pallet quality and dimensions, and unstructured surroundings. In this paper, we tackle automation of a critical step in pallet transport: the pallet pick-up operation. Our work is motivated by labor shortages, safety concerns, and inefficiencies in manually locating and retrieving pallets under such conditions. We present Lang2Lift, a framework that leverages foundation models for natural language-guided pallet detection and 6D pose estimation, enabling operators to specify targets through intuitive commands such as "pick up the steel beam pallet near the crane." The perception pipeline integrates Florence-2 and SAM-2 for language-grounded segmentation with FoundationPose for robust pose estimation in cluttered, multi-pallet outdoor scenes under variable lighting. The resulting poses feed into a motion planning module for fully autonomous forklift operation. We validate Lang2Lift on the ADAPT autonomous forklift platform, achieving 0.76 mIoU pallet segmentation accuracy on a real-world test dataset. Timing and error analysis demonstrate the system's robustness and confirm its feasibility for deployment in operational logistics and construction environments. Video demonstrations are available at https://eric-nguyen1402.github.io/lang2lift.github.io/

Planning with Minimal Disruption

Authors:Alberto Pozanco, Marianela Morales, Daniel Borrajo, Manuela Veloso

Date:2025-08-21 08:38:17

In many planning applications, we might be interested in finding plans that minimally modify the initial state to achieve the goals. We refer to this concept as plan disruption. In this paper, we formally introduce it, and define various planning-based compilations that aim to jointly optimize both the sum of action costs and plan disruption. Experimental results in different benchmarks show that the reformulated task can be effectively solved in practice to generate plans that balance both objectives.
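
To make the idea concrete, here is a minimal STRIPS-style sketch. It assumes that disruption is measured as the number of facts that differ between the initial and final states, and that the two objectives are combined as a weighted sum. Both choices, along with the `Action` type and the toy domain, are hypothetical; the abstract does not give the paper's formal definition or its compilations.

```python
from typing import FrozenSet, Iterable, NamedTuple


class Action(NamedTuple):
    name: str
    preconditions: FrozenSet[str]
    add_effects: FrozenSet[str]
    del_effects: FrozenSet[str]
    cost: float = 1.0


def apply_plan(initial: Iterable[str], plan: Iterable[Action]) -> FrozenSet[str]:
    """Execute the plan from the initial state and return the final state."""
    state = set(initial)
    for action in plan:
        if not action.preconditions <= state:
            raise ValueError(f"{action.name} is not applicable")
        state = (state - action.del_effects) | action.add_effects
    return frozenset(state)


def plan_disruption(initial: Iterable[str], plan: list) -> int:
    """Assumed measure: facts that differ between initial and final state."""
    initial = frozenset(initial)
    return len(initial ^ apply_plan(initial, plan))


def combined_objective(initial, plan, weight: float = 1.0) -> float:
    """Hypothetical weighted sum of action costs and plan disruption."""
    return sum(a.cost for a in plan) + weight * plan_disruption(initial, plan)


f = frozenset
open_door = Action("open-door", f({"door-closed"}), f({"door-open"}), f({"door-closed"}))
move = Action("move-a-b", f({"at-a", "door-open"}), f({"at-b"}), f({"at-a"}))
close_door = Action("close-door", f({"door-open"}), f({"door-closed"}), f({"door-open"}))

initial_state = {"at-a", "door-closed"}
plan_short = [open_door, move]               # cheaper, leaves the door open
plan_tidy = [open_door, move, close_door]    # costlier, restores the door

print(plan_disruption(initial_state, plan_short), combined_objective(initial_state, plan_short))
print(plan_disruption(initial_state, plan_tidy), combined_objective(initial_state, plan_tidy))
```

Both plans reach the goal `at-b`. The shorter plan has lower action cost, while the longer one changes fewer facts, which is the kind of trade-off a joint objective has to balance.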

RETAIL: Towards Real-world Travel Planning for Large Language Models

Authors:Bin Deng, Yizhe Feng, Zeming Liu, Qing Wei, Xiangrong Zhu, Shuai Chen, Yuanfang Guo, Yunhong Wang

Date:2025-08-21 08:08:38

Although large language models have enhanced automated travel planning abilities, current systems remain misaligned with real-world scenarios. First, they assume users provide explicit queries, while in reality requirements are often implicit. Second, existing solutions ignore diverse environmental factors and user preferences, limiting the feasibility of plans. Third, systems can only generate plans with basic POI arrangements, failing to provide all-in-one plans with rich details. To mitigate these challenges, we construct a novel dataset, RETAIL, which supports decision-making for implicit queries while covering explicit queries, both with and without revision needs. It also enables environmental awareness to ensure plan feasibility under real-world scenarios, while incorporating detailed POI information for all-in-one travel plans. Furthermore, we propose a topic-guided multi-agent framework, termed TGMA. Our experiments reveal that even the strongest existing model achieves merely a 1.0% pass rate, indicating real-world travel planning remains extremely challenging. In contrast, TGMA demonstrates substantially improved performance (2.72%), offering promising directions for real-world travel planning.

Dosimetric Evaluation of MapCHECK in MapPHAN for Conformal Arc SABR Quality Assurance

Authors:Nathan I. N. Henry, Christopher M. Thompson, Steven Marsh, Jack D. Aylward

Date:2025-08-21 05:36:46

This study investigates the sensitivity and specificity of the MapCHECK phantom housed in MapPHAN to errors in conformal arc Stereotactic Ablative Body Radiotherapy (SABR). A specific focus is applied to Organ At Risk (OAR) dosimetry. Multi-Leaf Collimator (MLC) class shift errors up to 2 mm, and isocentre shift errors of up to 1 mm, were introduced to 23 simulated 6 MV X-ray conformal arc lung SABR plans in Raystation. Overall, 198 plans were delivered on a Varian Clinac iX linear accelerator to an in-house modified MapCHECK in MapPHAN. Gamma analysis was used to compare the MapCHECK measurements to the simulated Raystation plans. Receiver Operating Characteristic (ROC) curves were generated from these results to determine the sensitivity and specificity of the measurement technique to the introduced errors. For MapCHECK in MapPHAN, a combination of 5%/1 mm with 95% gamma tolerance and 2%/1 mm with 90% tolerance provides good sensitivity and specificity for Quality Assurance (QA) of conformal arc SABR plans.
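
The ROC step can be sketched as follows, assuming a plan is flagged as erroneous whenever its gamma pass rate falls below a tolerance. The pass rates and error labels below are made-up illustrative values, not measurements from the study, and the function names are hypothetical.

```python
def sensitivity_specificity(pass_rates, has_error, tolerance):
    """Sensitivity and specificity when plans below `tolerance` are flagged."""
    tp = fn = tn = fp = 0
    for rate, err in zip(pass_rates, has_error):
        flagged = rate < tolerance
        if err and flagged:
            tp += 1          # error present and detected
        elif err:
            fn += 1          # error present but missed
        elif flagged:
            fp += 1          # clean plan wrongly flagged
        else:
            tn += 1          # clean plan correctly passed
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one erroneous and one clean plan")
    return tp / (tp + fn), tn / (tn + fp)


def roc_points(pass_rates, has_error, tolerances):
    """(false positive rate, true positive rate) for each tolerance."""
    points = []
    for tol in tolerances:
        sens, spec = sensitivity_specificity(pass_rates, has_error, tol)
        points.append((1.0 - spec, sens))
    return points


# Hypothetical gamma pass rates (%) and whether an error was introduced.
rates = [99.0, 97.5, 93.0, 88.0, 96.0, 94.0]
errors = [False, False, True, True, True, False]

print(sensitivity_specificity(rates, errors, tolerance=95.0))
print(roc_points(rates, errors, tolerances=[90.0, 95.0, 98.0]))
```

Sweeping the tolerance traces out the ROC curve, and the tolerance that best balances sensitivity against specificity is the natural candidate for a QA action level.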

Mobile-Agent-v3: Fundamental Agents for GUI Automation

Authors:Jiabo Ye, Xi Zhang, Haiyang Xu, Haowei Liu, Junyang Wang, Zhaoqing Zhu, Ziwei Zheng, Feiyu Gao, Junjie Cao, Zhengxi Lu, Jitong Liao, Qi Zheng, Fei Huang, Jingren Zhou, Ming Yan

Date:2025-08-21 00:39:12

This paper introduces GUI-Owl, a foundational GUI agent model that achieves state-of-the-art performance among open-source end-to-end models on ten GUI benchmarks across desktop and mobile environments, covering grounding, question answering, planning, decision-making, and procedural knowledge. GUI-Owl-7B achieves 66.4 on AndroidWorld and 29.4 on OSWorld. Building on this, we propose Mobile-Agent-v3, a general-purpose GUI agent framework that further improves performance to 73.3 on AndroidWorld and 37.7 on OSWorld, setting a new state-of-the-art for open-source GUI agent frameworks. GUI-Owl incorporates three key innovations: (1) Large-scale Environment Infrastructure: a cloud-based virtual environment spanning Android, Ubuntu, macOS, and Windows, enabling our Self-Evolving GUI Trajectory Production framework. This generates high-quality interaction data via automated query generation and correctness validation, leveraging GUI-Owl to refine trajectories iteratively, forming a self-improving loop. It supports diverse data pipelines and reduces manual annotation. (2) Diverse Foundational Agent Capabilities: by integrating UI grounding, planning, action semantics, and reasoning patterns, GUI-Owl supports end-to-end decision-making and can act as a modular component in multi-agent systems. (3) Scalable Environment RL: we develop a scalable reinforcement learning framework with fully asynchronous training for real-world alignment. We also introduce Trajectory-aware Relative Policy Optimization (TRPO) for online RL, achieving 34.9 on OSWorld. GUI-Owl and Mobile-Agent-v3 are open-sourced at https://github.com/X-PLUG/MobileAgent.

Condensation Clouds in Substellar Atmospheres with Virga

Authors:Natasha E. Batalha, Caoimhe M. Rooney, Channon Visscher, Sarah E. Moran, Mark S. Marley, Aditya R. Sengupta, Sven Kiefer, Matt G. Lodge, James Mang, Caroline V. Morley, Sagnick Mukherjee, Jonathan J. Fortney, Peter Gao, Nikole K. Lewis, L. C. Mayorga, Logan A. Pearce, Hannah R. Wakeford

Date:2025-08-20 22:40:38

Here we present an open-source cloud model for substellar atmospheres, called Virga. The Virga-v0 series has already been widely adopted in the literature. It is written in Python and descends from the Ackerman & Marley (2001) model (often referred to as eddysed), used to study clouds on both exoplanets and brown dwarfs. In the development of the official Virga-v1 we have retained all the original functionality of eddysed and updated or expanded several components, including the back-end optical constants data, calculations of the Mie properties, available condensate species, saturation vapor pressure curves, and the formalism for fall-speed calculations. Here we benchmark Virga by reproducing key results in the literature, including the SiO2 cloud detection in WASP-17 b and the brown dwarf Diamondback-Sonora model series. Development of Virga is ongoing, with future versions already planned and ready for release. We encourage community feedback and collaborations within the GitHub code repository.

Alpha Berkeley: A Scalable Framework for the Orchestration of Agentic Systems

Authors:Thorsten Hellert, João Montenegro, Antonin Sulc

Date:2025-08-20 20:57:13

Coordinating workflows across heterogeneous control systems remains a central challenge in safety-critical environments such as scientific facilities, industrial plants, and energy infrastructures. Language-model-driven agents offer a natural interface for these tasks, but existing approaches often lack scalability, reliability, and human oversight. We introduce the Alpha Berkeley Framework, a production-ready architecture for scalable agentic systems that integrate conversational context with robust tool orchestration. The framework features dynamic capability classification to select only relevant tools per task, a plan-first orchestration model that generates execution plans with explicit dependencies and optional human approval, context-aware task extraction that combines dialogue history with external memory and domain resources, and production-ready execution environments with checkpointing, artifact management, and modular deployment. We demonstrate its versatility through two case studies: a tutorial-style wind farm monitoring example and a deployment at the Advanced Light Source particle accelerator. These results establish Alpha Berkeley as a reliable and transparent framework for agentic systems in high-stakes domains.
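
The plan-first idea described above (an execution plan with explicit dependencies and optional human approval) can be illustrated with a minimal sketch. This is not the Alpha Berkeley API: the step names, the `execution_order` helper, and the `approve` callback are all hypothetical, and the ordering relies only on Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each step maps to the set of steps it depends on.
plan = {
    "fetch_turbine_data": set(),
    "detect_anomalies": {"fetch_turbine_data"},
    "summarize_report": {"detect_anomalies"},
}


def execution_order(plan, approve=lambda steps: True):
    """Return the steps in dependency order, or raise if the plan is rejected.

    `approve` stands in for an optional human-approval gate (an assumption,
    not part of any real framework): it sees the full ordered plan before
    anything would be executed.
    """
    order = list(TopologicalSorter(plan).static_order())
    if not approve(order):
        raise PermissionError("plan rejected by reviewer")
    return order


print(execution_order(plan))
```

Producing the whole ordered plan before running any step is what makes the approval gate possible: a reviewer can reject the plan without side effects having occurred.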

Estimating the spatial economic and environmental impact of planned offshore wind energy in the USA using Environmentally Extended Multiregional Input-Output analysis

Authors:Apoorva Bademi, Miriam Stevens, Isha Sura, Shweta Singh

Date:2025-08-20 19:04:33

There is a projected increase in offshore wind energy generation in the United States over the next three decades, driven by legislative commitments and government funding. Like other renewable technologies, the construction of offshore wind farms has environmental impacts and spillover effects that must be assessed. Developing offshore wind as a reliable domestic energy source requires a multiregional analysis of economic and environmental effects of constructing projects along lakefronts and coastal regions. Although no commercial offshore wind farms currently operate in the United States, seven states have announced capacity commitments exceeding 28 gigawatts by 2035. This study evaluates the spatial economic and environmental impacts of planned projects by linking the National Renewable Energy Laboratory Offshore Renewables Balance-of-system Installation Tool (ORBIT) with a multiregional input-output model of the U.S. economy developed in the Virtual Industrial Ecology Lab. ORBIT provides capital investment requirements for installation, which are combined with the model to estimate economic spillover effects. Environmental impacts are assessed using a newly developed multiregional greenhouse gas emissions dataset for the U.S. to capture supply chain emissions of offshore wind construction. The five projects analyzed require 16.3 billion dollars in capital investment and generate 27.6 billion dollars in direct and indirect economic impacts across the country. Emissions results show that states active in energy generation are most affected, but impacts can be reduced by decarbonizing the grid. A carbon payback analysis indicates the projects offset construction-phase emissions in less than a year. The framework highlights which states experience the greatest spillover effects in terms of emissions and economic activity required to support offshore wind expansion.
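
For readers unfamiliar with environmentally extended input-output analysis, the core calculation can be sketched as follows. This is a generic two-sector Leontief example, not the study's model: the coefficient matrix, final demand, emission intensities, and avoided-emissions figure are made-up numbers chosen only to show the arithmetic. Total output solves x = (I - A)^-1 y, supply-chain emissions are the intensity-weighted sum of x, and carbon payback is construction emissions divided by annual avoided emissions.

```python
def leontief_total_output(A, y):
    """Solve (I - A) x = y for a 2x2 technical-coefficient matrix A."""
    a, b = 1 - A[0][0], -A[0][1]
    c, d = -A[1][0], 1 - A[1][1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("I - A is singular")
    # Apply the closed-form inverse of a 2x2 matrix to the demand vector.
    x0 = (d * y[0] - b * y[1]) / det
    x1 = (-c * y[0] + a * y[1]) / det
    return [x0, x1]


def supply_chain_emissions(intensity, output):
    """Emissions = sum over sectors of (emissions per unit output) * output."""
    return sum(f * x for f, x in zip(intensity, output))


def carbon_payback_years(construction_emissions, annual_avoided_emissions):
    """Years of operation needed to offset construction-phase emissions."""
    return construction_emissions / annual_avoided_emissions


# Hypothetical inputs (illustrative only, not taken from the study).
A = [[0.2, 0.1], [0.3, 0.4]]   # inter-sector input requirements
y = [10.0, 0.0]                # direct capital spending by sector
x = leontief_total_output(A, y)
emissions = supply_chain_emissions([0.5, 2.0], x)
print(x, emissions, carbon_payback_years(emissions, 40.0))
```

In a multiregional model the same algebra applies with region-sector pairs stacked into one larger matrix, which is what lets spillovers be attributed to individual states.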

Systematic Evaluation of Wavelet-Based Denoising for MRI Brain Images: Optimal Configurations and Performance Benchmarks

Authors:Asadullah Bin Rahman, Masud Ibn Afjal, Md. Abdulla Al Mamun

Date:2025-08-20 19:04:32

Medical imaging modalities including magnetic resonance imaging (MRI), computed tomography (CT), and ultrasound are essential for accurate diagnosis and treatment planning in modern healthcare. However, noise contamination during image acquisition and processing frequently degrades image quality, obscuring critical diagnostic details and compromising clinical decision-making. Additionally, enhancement techniques such as histogram equalization may inadvertently amplify existing noise artifacts, including salt-and-pepper distortions. This study investigates wavelet transform-based denoising methods for effective noise mitigation in medical images, with the primary objective of identifying optimal combinations of threshold values, decomposition levels, and wavelet types to achieve superior denoising performance and enhanced diagnostic accuracy. Through systematic evaluation across various noise conditions, the research demonstrates that the bior6.8 biorthogonal wavelet with universal thresholding at decomposition levels 2-3 consistently achieves optimal denoising performance, providing significant noise reduction while preserving essential anatomical structures and diagnostic features critical for clinical applications.
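
The thresholding scheme named above can be sketched in a few lines. This is a simplified illustration, not the study's pipeline: it works on a 1D signal rather than a 2D image, and it swaps in the Haar wavelet for bior6.8 so that it needs only the standard library. The universal threshold is sigma * sqrt(2 ln n), with sigma estimated from the median absolute value of the finest detail coefficients; soft thresholding is an assumption, since the abstract does not say which thresholding rule was used.

```python
import math
import statistics


def haar_forward(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert one Haar level, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def soft_threshold(value, t):
    """Shrink a coefficient toward zero by t, zeroing it if |value| <= t."""
    return math.copysign(max(abs(value) - t, 0.0), value)


def denoise(signal, levels=2):
    """Multi-level Haar denoising with the universal threshold."""
    n = len(signal)
    if n == 0 or n % (2 ** levels) != 0:
        raise ValueError("signal length must be a positive multiple of 2**levels")
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_forward(approx)
        details.append(detail)
    # Robust noise estimate from the finest-scale detail coefficients.
    sigma = statistics.median([abs(c) for c in details[0]]) / 0.6745
    t = sigma * math.sqrt(2.0 * math.log(n))
    # Rebuild from the coarsest level down, shrinking details on the way.
    for detail in reversed(details):
        approx = haar_inverse(approx, [soft_threshold(c, t) for c in detail])
    return approx


noisy = [5.1, 4.9, 5.1, 4.9, 5.1, 4.9, 5.1, 4.9]
print(denoise(noisy, levels=2))
```

The `levels` default of 2 mirrors the decomposition levels 2-3 reported above; deeper decompositions smooth more aggressively and need longer signals.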

GraspQP: Differentiable Optimization of Force Closure for Diverse and Robust Dexterous Grasping

Authors:René Zurbrügg, Andrei Cramariuc, Marco Hutter

Date:2025-08-20 18:43:16

Dexterous robotic hands enable versatile interactions due to the flexibility and adaptability of multi-fingered designs, allowing for a wide range of task-specific grasp configurations in diverse environments. However, to fully exploit the capabilities of dexterous hands, access to diverse and high-quality grasp data is essential -- whether for developing grasp prediction models from point clouds, training manipulation policies, or supporting high-level task planning with broader action options. Existing approaches for dataset generation typically rely on sampling-based algorithms or simplified force-closure analysis, which tend to converge to power grasps and often exhibit limited diversity. In this work, we propose a method to synthesize large-scale, diverse, and physically feasible grasps that extend beyond simple power grasps to include refined manipulations, such as pinches and tri-finger precision grasps. We introduce a rigorous, differentiable energy formulation of force closure, implicitly defined through a Quadratic Program (QP). Additionally, we present an adjusted optimization method (MALA*) that improves performance by dynamically rejecting gradient steps based on the distribution of energy values across all samples. We extensively evaluate our approach and demonstrate significant improvements in both grasp diversity and the stability of final grasp predictions. Finally, we provide a new, large-scale grasp dataset for 5,700 objects from DexGraspNet, comprising five different grippers and three distinct grasp types. Dataset and code: https://graspqp.github.io/

Virtual Community: An Open World for Humans, Robots, and Society

Authors:Qinhong Zhou, Hongxin Zhang, Xiangye Lin, Zheyuan Zhang, Yutian Chen, Wenjun Liu, Zunzhe Zhang, Sunli Chen, Lixing Fang, Qiushi Lyu, Xinyu Sun, Jincheng Yang, Zeyuan Wang, Bao Chi Dang, Zhehuan Chen, Daksha Ladia, Jiageng Liu, Chuang Gan

Date:2025-08-20 17:59:32

The rapid progress in AI and robotics may lead to a profound societal transformation, as humans and robots begin to coexist within shared communities, introducing both opportunities and challenges. To explore this future, we present Virtual Community, an open-world platform for humans, robots, and society that is built on a universal physics engine and grounded in real-world 3D scenes. With Virtual Community, we aim to study embodied social intelligence at scale: 1) How robots can intelligently cooperate or compete; 2) How humans develop social relations and build community; 3) More importantly, how intelligent robots and humans can coexist in an open world. To support these studies, Virtual Community features: 1) An open-source multi-agent physics simulator that supports robots, humans, and their interactions within a society; 2) A large-scale, real-world aligned community generation pipeline, including vast outdoor space, diverse indoor scenes, and a community of grounded agents with rich characters and appearances. Leveraging Virtual Community, we propose two novel challenges. The Community Planning Challenge evaluates multi-agent reasoning and planning ability in open-world settings, such as cooperating to help agents with daily activities and efficiently connecting other agents. The Community Robot Challenge requires multiple heterogeneous robots to collaborate in solving complex open-world tasks. We evaluate various baselines on these tasks and demonstrate the challenges in both high-level open-world task planning and low-level cooperation controls. We hope that Virtual Community will unlock further study of human-robot coexistence within open-world environments.

Safe and Transparent Robots for Human-in-the-Loop Meat Processing

Authors:Sagar Parekh, Casey Grothoff, Ryan Wright, Robin White, Dylan P. Losey

Date:2025-08-20 15:10:01

Labor shortages have severely affected the meat processing sector. Automated technology has the potential to support the meat industry, assist workers, and enhance job quality. However, existing automation in meat processing is highly specialized, inflexible, and cost-intensive. Instead of forcing manufacturers to buy a separate device for each step of the process, our objective is to develop general-purpose robotic systems that work alongside humans to perform multiple meat processing tasks. Through a recently conducted survey of industry experts, we identified two main challenges associated with integrating these collaborative robots alongside human workers. First, there must be measures to ensure the safety of human coworkers; second, the coworkers need to understand what the robot is doing. This paper addresses both challenges by introducing a safety and transparency framework for general-purpose meat processing robots. For safety, we implement a hand-detection system that continuously monitors nearby humans. This system can halt the robot when a human comes into close proximity to the operating robot. We also develop an instrumented knife equipped with a force sensor that can distinguish contact with different objects such as meat, bone, or fixtures. For transparency, we introduce a method that detects the robot's uncertainty about its performance and uses an LED interface to communicate that uncertainty to the human. Additionally, we design a graphical interface that displays the robot's plans and allows the human to provide feedback on the planned cut. Overall, our framework can ensure safe operation while keeping human workers in the loop about the robot's actions, which we validate through a user study.

Design of high-efficiency UHV loading of nanodiamonds into a Paul trap: Towards Matter-Wave Interferometry with Massive Objects

Authors:Rafael Benjaminov, Sela Liran, Or Dobkowski, Yaniv Bar-Haim, Michael Averbukh, Ron Folman

Date:2025-08-20 14:01:50

Quantum mechanics (QM) and general relativity (GR), also known as the theory of gravity, are the two pillars of modern physics. A matter-wave interferometer with a massive particle can test numerous fundamental ideas, including the spatial superposition principle - a foundational concept in QM - in completely new regimes, as well as the interface between QM and GR, e.g., testing the quantization of gravity. Consequently, there exists an intensive effort to realize such an interferometer. While several paths are being pursued, we focus on utilizing nanodiamonds (NDs) as our particle, and a spin embedded in the ND together with Stern-Gerlach forces, to achieve a closed loop in space-time. There is a growing community of groups pursuing this path [1]. We are posting this technical note (as part of a series of seven such notes) to highlight our plans and solutions concerning various challenges in this ambitious endeavor, hoping this will support this growing community. In this work, we review current methods for loading nanodiamonds into a Paul trap, and their capabilities and limitations regarding our application. We also present our experiments on loading and launching nanodiamonds using a vibrating piezoelectric element and by electrical forces. Finally, we present our design of a novel nanodiamond loading method for ultra-high-vacuum experiments. As the production of highly accurate, high-purity nanodiamonds with a single NV required for interferometric measurements is expected to be expensive, we put emphasis on achieving high loading efficiency, while loading the charged ND into a Paul trap in ultra-high vacuum.

Rule-based Key-Point Extraction for MR-Guided Biomechanical Digital Twins of the Spine

Authors:Robert Graf, Tanja Lerchl, Kati Nispel, Hendrik Möller, Matan Atad, Julian McGinnis, Julius Maria Watrinet, Johannes Paetzold, Daniel Rueckert, Jan S. Kirschke

Date:2025-08-20 13:31:40

Digital twins offer a powerful framework for subject-specific simulation and clinical decision support, yet their development often hinges on accurate, individualized anatomical modeling. In this work, we present a rule-based approach for subpixel-accurate key-point extraction from MRI, adapted from prior CT-based methods. Our approach incorporates robust image alignment and vertebra-specific orientation estimation to generate anatomically meaningful landmarks that serve as boundary conditions and force application points, such as muscle and ligament insertions, in biomechanical models. These models enable the simulation of spinal mechanics considering the subject's individual anatomy, and thus support the development of tailored approaches in clinical diagnostics and treatment planning. By leveraging MR imaging, our method is radiation-free and well-suited for large-scale studies and use in underrepresented populations. This work contributes to the digital twin ecosystem by bridging the gap between precise medical image analysis and biomechanical simulation, and aligns with key themes in personalized modeling for healthcare.

Trapping and cooling of nanodiamonds in a Paul trap under ultra-high vacuum: Towards matter-wave interferometry with massive objects

Authors:Omer Feldman, Ben Baruch Shultz, Maria Muretova, Or Dobkowski, Yonathan Japha, David Grosswasser, Ron Folman

Date:2025-08-20 13:05:23

Quantum mechanics (QM) and general relativity (GR), also known as the theory of gravity, are the two pillars of modern physics. A matter-wave interferometer with a massive particle can test numerous fundamental ideas, including the spatial superposition principle - a foundational concept in QM - in previously unexplored regimes. It also opens the possibility of probing the interface between QM and GR, such as testing the quantization of gravity. Consequently, there exists an intensive effort to realize such an interferometer. While several approaches are being explored, we focus on utilizing nanodiamonds with embedded spins as test particles which, in combination with Stern-Gerlach forces, enable the realization of a closed-loop matter-wave interferometer in space-time. There is a growing community of groups pursuing this path [1]. We are posting this technical note (as part of a series of seven such notes) to highlight our plans and solutions concerning various challenges in this ambitious endeavor, hoping this will support this growing community. In this work, we detail the trapping of a nanodiamond at 10^-8 mbar, which is good enough for the realization of a short-duration Stern-Gerlach interferometer. We describe in detail the cooling we have performed to sub-Kelvin temperatures, and demonstrate that the nanodiamond remains confined within the trap even under high-intensity 1560 nm laser illumination. We would be happy to make available more details upon request.

An Informative Planning Framework for Target Tracking and Active Mapping in Dynamic Environments with ASVs

Authors:Sanjeev Ramkumar Sudha, Marija Popović, Erlend M. Coates

Date:2025-08-20 11:44:30

Mobile robot platforms are increasingly being used to automate information gathering tasks such as environmental monitoring. Efficient target tracking in dynamic environments is critical for applications such as search and rescue and pollutant cleanups. In this letter, we study active mapping of floating targets that drift due to environmental disturbances such as wind and currents. This is a challenging problem as it involves predicting both spatial and temporal variations in the map due to changing conditions. We propose an informative path planning framework to map an arbitrary number of moving targets with initially unknown positions in dynamic environments. A key component of our approach is a spatiotemporal prediction network that predicts target position distributions over time. We propose an adaptive planning objective for target tracking that leverages these predictions. Simulation experiments show that our proposed planning objective improves target tracking performance compared to existing methods that consider only entropy reduction as the planning objective. Finally, we validate our approach in field tests using an autonomous surface vehicle, showcasing its ability to track targets in real-world monitoring scenarios.

TRUST-Planner: Topology-guided Robust Trajectory Planner for AAVs with Uncertain Obstacle Spatial-temporal Avoidance

Authors:Junzhi Li, Teng Long, Jingliang Sun, Jianxin Zhong

Date:2025-08-20 10:52:28

Despite extensive developments in motion planning of autonomous aerial vehicles (AAVs), existing frameworks face the challenges of local minima and deadlock in complex dynamic environments, leading to increased collision risks. To address these challenges, we present TRUST-Planner, a topology-guided hierarchical planning framework for robust spatial-temporal obstacle avoidance. In the frontend, a dynamic enhanced visible probabilistic roadmap (DEV-PRM) is proposed to rapidly explore topological paths for global guidance. The backend utilizes a uniform terminal-free minimum control polynomial (UTF-MINCO) and dynamic distance field (DDF) to enable efficient predictive obstacle avoidance and fast parallel computation. Furthermore, an incremental multi-branch trajectory management framework is introduced to enable spatio-temporal topological decision-making, while efficiently leveraging historical information to reduce replanning time. Simulation results show that TRUST-Planner outperforms baseline competitors, achieving a 96% success rate and millisecond-level computation efficiency in tested complex environments. Real-world experiments further validate the feasibility and practicality of the proposed method.

Multi-Tier UAV Edge Computing for Low Altitude Networks Towards Long-Term Energy Stability

Authors:Yufei Ye, Shijian Gao, Xinhu Zheng, Liuqing Yang

Date:2025-08-20 10:37:22

This paper presents a novel multi-tier UAV-assisted edge computing system designed for low-altitude networks. The system comprises vehicle users, lightweight Low-Tier UAVs (L-UAVs), and a High-Tier UAV (H-UAV). L-UAVs function as small-scale edge servers positioned closer to vehicle users, while the H-UAV, equipped with a more powerful server and a larger-capacity battery, serves as a mobile backup server to address the limitations in endurance and computing resources of L-UAVs. The primary objective is to minimize task execution delays while ensuring long-term energy stability for L-UAVs. To address this challenge, the problem is first decoupled into a series of deterministic problems for each time slot using Lyapunov optimization. The priorities of task delay and energy consumption for L-UAVs are adaptively adjusted based on real-time energy status. The optimization tasks include assignment of tasks, allocation of computing resources, and trajectory planning for both L-UAVs and the H-UAV. Simulation results demonstrate that the proposed approach achieves a reduction of at least 26% in transmission energy for L-UAVs and exhibits superior energy stability compared to existing benchmarks.
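
The per-slot decoupling mentioned above can be illustrated with the generic drift-plus-penalty form of Lyapunov optimization. This is a textbook-style sketch, not the paper's formulation: the two candidate actions, the energy budget, and the weight `V` are hypothetical. A virtual queue tracks how far energy use has run above its budget, and each slot picks the action minimizing `V * delay + queue * energy`, so energy gains priority automatically as the deficit grows.

```python
def choose_action(actions, queue, V):
    """Pick the (delay, energy) option minimizing V*delay + queue*energy."""
    return min(actions, key=lambda a: V * a[0] + queue * a[1])


def simulate(actions, energy_budget, V, slots):
    """Run the per-slot rule and return the list of chosen (delay, energy)."""
    queue, history = 0.0, []
    for _ in range(slots):
        delay, energy = choose_action(actions, queue, V)
        # Virtual queue update: grows when a slot spends more than the budget.
        queue = max(queue + energy - energy_budget, 0.0)
        history.append((delay, energy))
    return history


# Hypothetical options: fast but energy-hungry, or slow but frugal.
actions = [(1.0, 3.0), (4.0, 1.0)]
history = simulate(actions, energy_budget=2.0, V=1.0, slots=100)
average_energy = sum(e for _, e in history) / len(history)
print(average_energy)
```

A larger `V` favors low delay and tolerates a longer energy backlog before switching to the frugal option; a smaller `V` keeps energy closer to budget at the cost of delay.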