planning - 2025-04-06

Handover and SINR-Aware Path Optimization in 5G-UAV mmWave Communication using DRL

Authors:Achilles Kiwanuka Machumilane, Alberto Gotta, Pietro Cassarà
Date:2025-04-03 15:28:04

Path planning and optimization for unmanned aerial vehicle (UAV)-assisted next-generation wireless networks is critical for mobility management and ensuring UAV safety and ubiquitous connectivity, especially in dense urban environments with street canyons and tall buildings. Traditional statistical and model-based techniques have been successfully used for path optimization in communication networks. However, when dynamic channel propagation characteristics such as line-of-sight (LOS), interference, handover, and signal-to-interference-plus-noise ratio (SINR) are included in path optimization, statistical and model-based path planning solutions become obsolete since they cannot adapt to the dynamic and time-varying wireless channels, especially in the mmWave bands. In this paper, we propose a novel model-free actor-critic deep reinforcement learning (AC-DRL) framework for path optimization in UAV-assisted 5G mmWave wireless networks, which combines four important aspects of UAV communication: flight time, handover, connectivity, and SINR. We train an AC-DRL agent that enables a UAV connected to a gNB to determine the optimal path to a desired destination in the shortest possible time with minimal gNB handover, while maintaining connectivity and the highest possible SINR. We train our model with data from a powerful ray tracing tool called Wireless InSite, which uses 3D images of the propagation environment and provides data that closely resembles the real propagation environment. The simulation results show that our system has superior performance in tracking high SINR compared to other selected RL algorithms.

Two-Stage nnU-Net for Automatic Multi-class Bi-Atrial Segmentation from LGE-MRIs

Authors:Y. On, C. Galazis, C. Chiu, M. Varela
Date:2025-04-03 15:08:33

Late gadolinium enhancement magnetic resonance imaging (LGE-MRI) is used to visualise atrial fibrosis and scars, providing important information for personalised atrial fibrillation (AF) treatments. Since manual analysis and delineations of these images can be both labour-intensive and subject to variability, we develop an automatic pipeline to perform segmentation of the left atrial (LA) cavity, the right atrial (RA) cavity, and the wall of both atria on LGE-MRI. Our method is based on a two-stage nnU-Net architecture, combining 2D and 3D convolutional networks, and incorporates adaptive histogram equalisation to improve tissue contrast in the input images and morphological operations on the output segmentation maps. We achieve Dice similarity coefficients of 0.92 +/- 0.03, 0.93 +/- 0.03, 0.71 +/- 0.05 and 95% Hausdorff distances of (3.89 +/- 6.67) mm, (4.42 +/- 1.66) mm and (3.94 +/- 1.83) mm for LA, RA, and wall, respectively. The accurate delineation of the LA, RA and the myocardial wall is the first step in analysing atrial structure in cardiovascular patients, especially those with AF. This can allow clinicians to provide adequate and personalised treatment plans in a timely manner.
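
As a rough illustration of the Dice similarity coefficient reported above, the following sketch computes a per-label Dice score between two integer label maps. The label numbering (1 = LA, 2 = RA, 3 = wall), the toy arrays, and the function name are assumptions made for this example only and are not taken from the paper.

```python
import numpy as np


def dice_coefficient(pred, target, label):
    """Dice similarity coefficient for one label in two integer label maps."""
    pred_mask = np.asarray(pred) == label
    target_mask = np.asarray(target) == label
    denom = pred_mask.sum() + target_mask.sum()
    if denom == 0:
        # Convention chosen here: two empty masks count as perfect agreement.
        return 1.0
    overlap = np.logical_and(pred_mask, target_mask).sum()
    return 2.0 * overlap / denom


# Toy 2D label maps; the label ids are hypothetical (1 = LA, 2 = RA, 3 = wall).
pred = np.array([[0, 1, 1],
                 [0, 2, 2],
                 [0, 0, 3]])
target = np.array([[0, 1, 1],
                   [0, 2, 0],
                   [0, 3, 3]])

scores = {name: dice_coefficient(pred, target, label)
          for label, name in [(1, "LA"), (2, "RA"), (3, "wall")]}
print(scores)
```

A thin structure such as the atrial wall loses a larger share of its overlap to a one-voxel boundary error than a large cavity does, which is one common reason wall Dice scores tend to be lower than cavity scores.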

Adaptive Frequency Enhancement Network for Remote Sensing Image Semantic Segmentation

Authors:Feng Gao, Miao Fu, Jingchao Cao, Junyu Dong, Qian Du
Date:2025-04-03 14:42:49

Semantic segmentation of high-resolution remote sensing images plays a crucial role in land-use monitoring and urban planning. Recent remarkable progress in deep learning-based methods makes it possible to generate satisfactory segmentation results. However, existing methods still face challenges in adapting network parameters to various land cover distributions and enhancing the interaction between spatial and frequency domain features. To address these challenges, we propose the Adaptive Frequency Enhancement Network (AFENet), which integrates two key components: the Adaptive Frequency and Spatial feature Interaction Module (AFSIM) and the Selective feature Fusion Module (SFM). AFSIM dynamically separates and modulates high- and low-frequency features according to the content of the input image. It adaptively generates two masks to separate high- and low-frequency components, thereby providing optimal details and contextual supplementary information for ground object feature representation. SFM selectively fuses global context and local detailed features to enhance the network's representation capability. Hence, the interactions between frequency and spatial features are further enhanced. Extensive experiments on three publicly available datasets demonstrate that the proposed AFENet outperforms state-of-the-art methods. In addition, we validate the effectiveness of AFSIM and SFM in managing diverse land cover types and complex scenarios. Our code is available at https://github.com/oucailab/AFENet.

The FCC integrated programme: a physics manifesto

Authors:Alain Blondel, Christophe Grojean, Patrick Janot, Srini Rajagopalan, Guy Wilkinson
Date:2025-04-03 14:31:24

The FCC integrated programme comprises an $\rm e^+e^-$ high-luminosity circular collider that will produce very large samples of data in an energy range $88 \le \sqrt{s} \le 365$ GeV, followed by a high-energy $\rm pp$ machine that, with the current baseline plan, will operate at a collision energy of around 85 TeV and deliver datasets an order of magnitude larger than those of the HL-LHC. This visionary project will allow for transformative measurements across a very broad range of topics, which in almost all cases will exceed in sensitivity the projections of any other proposed facility, and simultaneously provide the best possible opportunity for discovering physics beyond the Standard Model. The highlights of the physics programme are presented, together with discussion on the key attributes of the integrated project that enable the physics reach. It is noted that the baseline programme of FCC-ee, in particular, is both flexible and extendable, and also that the synergy and complementarity of the electron and proton machines, and the sharing of a common infrastructure, provide a remarkably efficient, timely and cost-effective approach to addressing the most pressing open questions in elementary particle physics.

The Homotopy Category of Strongly flat modules

Authors:Javad Asadollahi, Somayeh Sadeghi
Date:2025-04-03 14:03:52

In this paper, we plan to build upon significant results by Amnon Neeman regarding the homotopy category of flat modules to study ${\mathbb{K}}({S\rm{SF}}\mbox{-}R)$, the homotopy category of $S$-strongly flat modules, where $S$ is a multiplicatively closed subset of a commutative ring $R$. The category ${\mathbb{K}}({S\rm{SF}}\mbox{-}R)$ is an intermediate triangulated category that includes ${\mathbb{K}}({\rm{Prj}\mbox{-}} R)$, the homotopy category of projective $R$-modules, which is always well generated by a result of Neeman, and is included in ${\mathbb{K}}({\rm{Flat}}\mbox{-} R)$, the homotopy category of flat $R$-modules, which is well generated if and only if $R$ is perfect, by a result of Šťovíček. We analyze the corresponding inclusion functors and the existence of their adjoints. In this way, we provide a new, fully faithful embedding of the homotopy category of projectives into the homotopy category of $S$-strongly flat modules. We introduce the notion of $S$-almost well generated triangulated categories. If $R$ is an $S$-almost perfect ring, ${\mathbb{K}}({\rm{Flat}}\mbox{-} R)$ is $S$-almost well generated. We show that the converse is true under certain conditions on the ring $R$. We hope that this approach provides insights into the largely mysterious class of $S$-strongly flat modules.

Assessing Geographical and Seasonal Influences on Energy Efficiency of Electric Drayage Trucks

Authors:Ankur Shiledar, Manfredi Villani, Joseph N. E. Lucero, Ruixiao Sun, Vivek A. Sujan, Simona Onori, Giorgio Rizzoni
Date:2025-04-03 13:39:21

The electrification of heavy-duty vehicles is a critical pathway towards improved energy efficiency of the freight sector. The current battery electric truck technology poses several challenges to the operations of commercial vehicles, such as limited driving range, sensitivity to climate conditions, and long recharging times. Estimating the energy consumption of heavy-duty electric trucks is crucial to assess the feasibility of fleet electrification and its impact on the electric grid. This paper focuses on developing a model-based simulation approach to predict and analyze the energy consumption of drayage trucks used in port logistics operations, considering seasonal climate variations and geographical characteristics. The paper includes results for three major container ports within the United States, providing region-specific insights into driving range, payload capacity, and charging infrastructure requirements, which will inform decision-makers in integrating electric trucks into existing drayage operations and planning investments for electric grid development.

A Planning Framework for Stable Robust Multi-Contact Manipulation

Authors:Lin Yang, Sri Harsha Turlapati, Zhuoyi Lu, Chen Lv, Domenico Campolo
Date:2025-04-03 12:05:12

While modeling multi-contact manipulation as a quasi-static mechanical process transitioning between different contact equilibria, we propose formulating it as a planning and optimization problem, explicitly evaluating (i) contact stability and (ii) robustness to sensor noise. Specifically, we conduct a comprehensive study on multi-manipulator control strategies, focusing on dual-arm execution in a planar peg-in-hole task and extending it to the Multi-Manipulator Multiple Peg-in-Hole (MMPiH) problem to explore increased task complexity. Our framework employs Dynamic Movement Primitives (DMPs) to parameterize desired trajectories and Black-Box Optimization (BBO) with a comprehensive cost function incorporating friction cone constraints, squeeze forces, and stability considerations. By integrating parallel scenario training, we enhance the robustness of the learned policies. To evaluate the friction cone cost in experiments, we test the optimal trajectories computed for various contact surfaces, i.e., with different coefficients of friction. The stability cost is explained analytically, and its necessity is tested in simulation. The robustness performance is quantified through variations of hole pose and chamfer size in simulation and experiment. Results demonstrate that our approach achieves consistently high success rates in both the single peg-in-hole and multiple peg-in-hole tasks, confirming its effectiveness and generalizability. The video can be found at https://youtu.be/IU0pdnSd4tE.

Industrial Internet Robot Collaboration System and Edge Computing Optimization

Authors:Qian Zuo, Dajun Tao, Tian Qi, Jieyi Xie, Zijie Zhou, Zhen Tian, Yu Mingyu
Date:2025-04-03 11:15:10

In a complex environment, safely avoiding all obstacles without collision places high demands on a mobile robot's level of intelligence. Because information such as the position and geometric characteristics of obstacles is random, the control parameters of the robot, such as velocity and angular velocity, are also prone to random deviations. To address this issue in the framework of the Industrial Internet Robot Collaboration System, this paper proposes a global path control scheme for mobile robots based on deep learning. First, the dynamic equation of the mobile robot is established. According to the linear velocity and angular velocity of the mobile robot, its motion behaviors are divided into obstacle-avoidance behavior, target-turning behavior, and target-approaching behavior. Subsequently, the neural network method in deep learning is used to build a global path planning model for the robot. On this basis, a fuzzy controller is designed with the help of a fuzzy control algorithm to correct the deviations that occur during path planning, thereby achieving optimized control of the robot's global path. In addition, considering edge computing optimization, the proposed model can process local data at the edge device, reducing the communication burden between the robot and the central server and improving the real-time performance of path planning. The experimental results show that for the mobile robot controlled by the research method in this paper, the deviation distance of the path angle is within 5 cm, the deviation convergence can be completed within 10 ms, and the planned path is shorter. This indicates that the proposed scheme can effectively improve the global path planning ability of mobile robots in the industrial Internet environment and promote the collaborative operation of robots through edge computing optimization.

Adaptive path planning for efficient object search by UAVs in agricultural fields

Authors:Rick van Essen, Eldert van Henten, Lammert Kooistra, Gert Kootstra
Date:2025-04-03 10:47:31

This paper presents an adaptive path planner for object search in agricultural fields using UAVs. The path planner uses a high-altitude coverage flight path and plans additional low-altitude inspections when the detection network is uncertain. The path planner was evaluated in an offline simulation environment containing real-world images. We trained a YOLOv8 detection network to detect artificial plants placed in grass fields to showcase the potential of our path planner. We evaluated the effect of different detection certainty measures, optimized the path planning parameters, and investigated the effects of localization errors and different numbers of objects in the field. The YOLOv8 detection confidence worked best to differentiate between true and false positive detections and was therefore used in the adaptive planner. The optimal parameters of the path planner depended on the distribution of objects in the field. When the objects were uniformly distributed, more low-altitude inspections were needed compared to a non-uniform distribution of objects, resulting in a longer path length. The adaptive planner proved to be robust against localization uncertainty. When increasing the number of objects, the flight path length increased, especially when the objects were uniformly distributed. When the objects were non-uniformly distributed, the adaptive path planner yielded a shorter path than a low-altitude coverage path, even with a high number of objects. Overall, the presented adaptive path planner made it possible to find non-uniformly distributed objects in a field faster than a coverage path planner and resulted in a comparable detection accuracy. The path planner is made available at https://github.com/wur-abe/uav_adaptive_planner.

Benchmark of Segmentation Techniques for Pelvic Fracture in CT and X-ray: Summary of the PENGWIN 2024 Challenge

Authors:Yudi Sang, Yanzhen Liu, Sutuke Yibulayimu, Yunning Wang, Benjamin D. Killeen, Mingxu Liu, Ping-Cheng Ku, Ole Johannsen, Karol Gotkowski, Maximilian Zenk, Klaus Maier-Hein, Fabian Isensee, Peiyan Yue, Yi Wang, Haidong Yu, Zhaohong Pan, Yutong He, Xiaokun Liang, Daiqi Liu, Fuxin Fan, Artur Jurgas, Andrzej Skalski, Yuxi Ma, Jing Yang, Szymon Płotka, Rafał Litka, Gang Zhu, Yingchun Song, Mathias Unberath, Mehran Armand, Dan Ruan, S. Kevin Zhou, Qiyong Cao, Chunpeng Zhao, Xinbao Wu, Yu Wang
Date:2025-04-03 08:19:36

The segmentation of pelvic fracture fragments in CT and X-ray images is crucial for trauma diagnosis, surgical planning, and intraoperative guidance. However, accurately and efficiently delineating the bone fragments remains a significant challenge due to complex anatomy and imaging limitations. The PENGWIN challenge, organized as a MICCAI 2024 satellite event, aimed to advance automated fracture segmentation by benchmarking state-of-the-art algorithms on these complex tasks. A diverse dataset of 150 CT scans was collected from multiple clinical centers, and a large set of simulated X-ray images was generated using the DeepDRR method. Final submissions from 16 teams worldwide were evaluated under a rigorous multi-metric testing scheme. The top-performing CT algorithm achieved an average fragment-wise intersection over union (IoU) of 0.930, demonstrating satisfactory accuracy. However, in the X-ray task, the best algorithm attained an IoU of 0.774, highlighting the greater challenges posed by overlapping anatomical structures. Beyond the quantitative evaluation, the challenge revealed methodological diversity in algorithm design. Variations in instance representation, such as primary-secondary classification versus boundary-core separation, led to differing segmentation strategies. Despite promising results, the challenge also exposed inherent uncertainties in fragment definition, particularly in cases of incomplete fractures. These findings suggest that interactive segmentation approaches, integrating human decision-making with task-relevant information, may be essential for improving model reliability and clinical applicability.

A Comparative Study of MINLP and MPVC Formulations for Solving Complex Nonlinear Decision-Making Problems in Aerospace Applications

Authors:Andrea Ghezzi, Armin Nurkanović, Avishai Weiss, Moritz Diehl, Stefano Di Cairano
Date:2025-04-03 08:08:52

High-level decision-making for dynamical systems often involves performance and safety specifications that are activated or deactivated depending on conditions related to the system state and commands. Such decision-making problems can be naturally formulated as optimization problems where these conditional activations are regulated by discrete variables. However, solving these problems can be challenging numerically, even on powerful computing platforms, especially when the dynamics are nonlinear. In this work, we consider decision-making for nonlinear systems where certain constraints, as well as possible terms in the cost function, are activated or deactivated depending on the system state and commands. We show that these problems can be formulated either as mixed-integer nonlinear programs (MINLPs) or as mathematical programs with vanishing constraints (MPVCs), where the former formulation involves discrete decision variables, whereas the latter relies on continuous variables subject to structured nonconvex constraints. We discuss the different solution methods available for both formulations and demonstrate them on optimal trajectory planning problems in various aerospace applications. Finally, we compare the strengths and weaknesses of the MINLP and MPVC approaches through a focused case study on powered descent guidance with divert-feasible regions.
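
To make the contrast between the two formulations concrete, here is a generic textbook-style sketch of a single conditional constraint "if $H(x) > 0$ then $G(x) \le 0$". The symbols $f$, $G$, $H$, $z$, and the big-M constant $M$ are assumptions introduced for this illustration; the paper's own formulations may differ in detail.

```latex
% MINLP form: a binary variable z switches the constraint on or off
% (big-M encoding, assuming H(x) >= 0 and that M bounds G and H on the feasible set).
\begin{aligned}
\min_{x,\, z} \quad & f(x) \\
\text{s.t.} \quad & 0 \le H(x) \le M z, \\
                  & G(x) \le M (1 - z), \\
                  & z \in \{0, 1\}.
\end{aligned}

% MPVC form: the same logic with continuous variables only.
% The constraint G(x) <= 0 "vanishes" whenever H(x) = 0.
\begin{aligned}
\min_{x} \quad & f(x) \\
\text{s.t.} \quad & H(x) \ge 0, \\
                  & G(x)\, H(x) \le 0.
\end{aligned}
```

In the first form the combinatorial difficulty sits in $z$; in the second it is moved into the nonconvex product constraint $G(x) H(x) \le 0$, which is the structured nonconvexity the abstract refers to.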

A User-Tunable Machine Learning Framework for Step-Wise Synthesis Planning

Authors:Shivesh Prakash, Viki Kumar Prasad, Hans-Arno Jacobsen
Date:2025-04-03 00:23:21

We introduce MHNpath, a machine learning-driven retrosynthetic tool designed for computer-aided synthesis planning. Leveraging modern Hopfield networks and novel comparative metrics, MHNpath efficiently prioritizes reaction templates, improving the scalability and accuracy of retrosynthetic predictions. The tool incorporates a tunable scoring system that allows users to prioritize pathways based on cost, reaction temperature, and toxicity, thereby facilitating the design of greener and cost-effective reaction routes. We demonstrate its effectiveness through case studies involving complex molecules from ChemByDesign, showcasing its ability to predict novel synthetic and enzymatic pathways. Furthermore, we benchmark MHNpath against existing frameworks, replicating experimentally validated "gold-standard" pathways from PaRoutes. Our case studies reveal that the tool can generate shorter, cheaper, moderate-temperature routes employing green solvents, as exemplified by compounds such as dronabinol, arformoterol, and lupinine.

Model Predictive Control with Visibility Graphs for Humanoid Path Planning and Tracking Against Adversarial Opponents

Authors:Ruochen Hou, Gabriel I. Fernandez, Mingzhang Zhu, Dennis W. Hong
Date:2025-04-03 00:00:34

In this paper we detail the methods used for obstacle avoidance, path planning, and trajectory tracking that helped us win the adult-sized, autonomous humanoid soccer league in RoboCup 2024. Our team was undefeated for all seated matches and scored 45 goals over 6 games, winning the championship game 6 to 1. During the competition, a major challenge for collision avoidance was the measurement noise coming from bipedal locomotion and a limited field of view (FOV). Furthermore, obstacles would sporadically jump in and out of our planned trajectory. At times our estimator would place our robot inside a hard constraint. Any planner in this competition must also be computationally efficient enough to re-plan and react in real time. This motivated our approach to trajectory generation and tracking. In many scenarios, both long-term and short-term planning are needed. To efficiently find a long-term general path that avoids all obstacles, we developed DAVG (Dynamic Augmented Visibility Graphs). DAVG focuses on essential path planning by setting certain regions to be active based on obstacles and the desired goal pose. By augmenting the states in the graph, turning angles are considered, which is crucial for a large soccer-playing robot, as turning may be more costly. A trajectory is formed by linearly interpolating between discrete points generated by DAVG. A modified version of model predictive control (MPC), called cf-MPC (Collision-Free MPC), is then used to track this trajectory. This ensures short-term planning. Without having to switch formulations, cf-MPC takes into account the robot dynamics and collision-free constraints. Without a hard switch, the control input can smoothly transition in cases where the noise places our robot inside a constraint boundary. The nonlinear formulation runs at approximately 120 Hz, while the quadratic version achieves around 400 Hz.
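
The step of forming a trajectory by linear interpolation between discrete path points can be sketched as follows. This is a minimal illustration, not the authors' implementation: the waypoints, the `spacing` parameter, and the function name are hypothetical, and the sketch interpolates planar positions only, ignoring the heading states that DAVG adds to the graph.

```python
import numpy as np


def interpolate_waypoints(waypoints, spacing):
    """Resample a piecewise-linear path at roughly `spacing` arc-length intervals."""
    pts = np.asarray(waypoints, dtype=float)
    # Cumulative arc length at each waypoint.
    segment_lengths = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    arc = np.concatenate([[0.0], np.cumsum(segment_lengths)])
    # Number of samples, always including both endpoints.
    n_samples = max(int(np.ceil(arc[-1] / spacing)), 1) + 1
    query = np.linspace(0.0, arc[-1], n_samples)
    # Interpolate each coordinate independently against arc length.
    x = np.interp(query, arc, pts[:, 0])
    y = np.interp(query, arc, pts[:, 1])
    return np.column_stack([x, y])


# Hypothetical waypoints, e.g. corners returned by a visibility-graph search.
path = interpolate_waypoints([(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)], spacing=1.0)
print(path)
```

Parameterizing by arc length keeps the reference points evenly spaced along the path regardless of how far apart the graph vertices are, which is convenient when a tracking controller consumes the reference at a fixed rate.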

Preference-Driven Active 3D Scene Representation for Robotic Inspection in Nuclear Decommissioning

Authors:Zhen Meng, Kan Chen, Xiangmin Xu, Erwin Jose Lopez Pulgarin, Emma Li, Philip G. Zhao, David Flynn
Date:2025-04-02 22:20:48

Active 3D scene representation is pivotal in modern robotics applications, including remote inspection, manipulation, and telepresence. Traditional methods primarily optimize geometric fidelity or rendering accuracy, but often overlook operator-specific objectives, such as safety-critical coverage or task-driven viewpoints. This limitation leads to suboptimal viewpoint selection, particularly in constrained environments such as nuclear decommissioning. To bridge this gap, we introduce a novel framework that integrates expert operator preferences into the active 3D scene representation pipeline. Specifically, we employ Reinforcement Learning from Human Feedback (RLHF) to guide robotic path planning, reshaping the reward function based on expert input. To capture operator-specific priorities, we conduct interactive choice experiments that evaluate user preferences in 3D scene representation. We validate our framework using a UR3e robotic arm for reactor tile inspection in a nuclear decommissioning scenario. Compared to baseline methods, our approach enhances scene representation while optimizing trajectory efficiency. The RLHF-based policy consistently outperforms random selection, prioritizing task-critical details. By unifying explicit 3D geometric modeling with implicit human-in-the-loop optimization, this work establishes a foundation for adaptive, safety-critical robotic perception systems, paving the way for enhanced automation in nuclear decommissioning, remote maintenance, and other high-risk environments.

LakeVisage: Towards Scalable, Flexible and Interactive Visualization Recommendation for Data Discovery over Data Lakes

Authors:Yihao Hu, Jin Wang, Sajjadur Rahman
Date:2025-04-02 21:49:43

Data discovery from data lakes is an essential application in modern data science. While many previous studies focused on improving the efficiency and effectiveness of data discovery, little attention has been paid to the usability of such applications. In particular, exploring data discovery results can be cumbersome due to the cognitive load involved in understanding raw tabular results and identifying insights to draw conclusions. To address this challenge, we introduce a new problem, visualization recommendation for data discovery over data lakes, which aims at automatically identifying visualizations that highlight relevant or desired trends in the results returned by data discovery engines. We propose LakeVisage, an end-to-end framework, as the first solution to this problem. Given a data lake, a data discovery engine, and a user-specified query table, LakeVisage intelligently explores the space of visualizations and recommends the most useful and "interesting" visualization plans. To this end, we developed (i) approaches to smartly construct the candidate visualization plans from the results of the data discovery engine and (ii) effective pruning strategies to filter out less interesting plans so as to accelerate the visual analysis. Experimental results on real data lakes show that our proposed techniques can lead to an order of magnitude speedup in visualization recommendation. We also conduct a comprehensive user study to demonstrate that LakeVisage offers convenience to users in real data analysis applications by enabling them to get started with tasks seamlessly and to perform explorations flexibly.

Estimation of the complier causal hazard ratio under dependent censoring

Authors:Gilles Crommen, Jad Beyhum, Ingrid Van Keilegom
Date:2025-04-02 19:56:34

In this work, we are interested in studying the causal effect of an endogenous binary treatment on a dependently censored duration outcome. By dependent censoring, it is meant that the duration time ($T$) and right censoring time ($C$) are not statistically independent of each other, even after conditioning on the measured covariates. The endogeneity issue is handled by making use of a binary instrumental variable for the treatment. To deal with the dependent censoring problem, it is assumed that on the stratum of compliers: (i) $T$ follows a semiparametric proportional hazards model; (ii) $C$ follows a fully parametric model; and (iii) the relation between $T$ and $C$ is modeled by a parametric copula, such that the association parameter can be left unspecified. In this framework, the treatment effect of interest is the complier causal hazard ratio (CCHR). We devise an estimation procedure that is based on a weighted maximum likelihood approach, where the weights are the probabilities of an observation coming from a complier. The weights are estimated non-parametrically in a first stage, followed by the estimation of the CCHR. Novel conditions under which the model is identifiable are given, a two-step estimation procedure is proposed and some important asymptotic properties are established. Simulations are used to assess the validity and finite-sample performance of the estimation procedure. Finally, we apply the approach to estimate the CCHR of both job training programs on unemployment duration and periodic screening examinations on time until death from breast cancer. The data come from the National Job Training Partnership Act study and the Health Insurance Plan of Greater New York experiment respectively.

Evaluation of Flight Parameters in UAV-based 3D Reconstruction for Rooftop Infrastructure Assessment

Authors:Nick Chodura, Melissa Greeff, Joshua Woods
Date:2025-04-02 19:43:20

Rooftop 3D reconstruction using UAV-based photogrammetry offers a promising solution for infrastructure assessment, but existing methods often require high percentages of image overlap and extended flight times to ensure model accuracy when using autonomous flight paths. This study systematically evaluates two key flight parameters, ground sampling distance (GSD) and image overlap, to optimize the 3D reconstruction of complex rooftop infrastructure. Controlled UAV flights were conducted over a multi-segment rooftop at Queen's University using a DJI Phantom 4 Pro V2, with varied GSD and overlap settings. The collected data were processed using Reality Capture software and evaluated against ground truth models generated from UAV-based LiDAR and terrestrial laser scanning (TLS). Experimental results indicate that a GSD range of 0.75-1.26 cm combined with 85% image overlap achieves a high degree of model accuracy while minimizing the number of images collected and the flight time. These findings provide guidance for planning autonomous UAV flight paths for efficient rooftop assessments.
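
For readers planning similar flights, the standard nadir-camera relation between flight height and GSD, and between overlap and photo spacing, can be sketched as below. The camera values (13.2 mm sensor width, 8.8 mm focal length, 5472 x 3648 px images) are illustrative assumptions for a 1-inch-sensor camera and are not taken from the study; the function names are hypothetical.

```python
def ground_sampling_distance_cm(height_m, sensor_width_mm, focal_length_mm, image_width_px):
    """GSD in cm/pixel for a nadir camera at `height_m` above the surface."""
    return (sensor_width_mm * height_m * 100.0) / (focal_length_mm * image_width_px)


def height_for_gsd_m(gsd_cm, sensor_width_mm, focal_length_mm, image_width_px):
    """Height above the surface (m) that yields the requested GSD (inverse of the above)."""
    return gsd_cm * focal_length_mm * image_width_px / (sensor_width_mm * 100.0)


def photo_spacing_m(gsd_cm, image_extent_px, overlap):
    """Distance between exposures so consecutive footprints share `overlap` (0..1)."""
    footprint_m = gsd_cm / 100.0 * image_extent_px
    return footprint_m * (1.0 - overlap)


# Assumed camera parameters, for illustration only.
CAMERA = dict(sensor_width_mm=13.2, focal_length_mm=8.8, image_width_px=5472)

gsd_at_30m = ground_sampling_distance_cm(30.0, **CAMERA)
height_for_1cm = height_for_gsd_m(1.0, **CAMERA)
spacing_85 = photo_spacing_m(1.0, image_extent_px=3648, overlap=0.85)
print(gsd_at_30m, height_for_1cm, spacing_85)
```

Because GSD scales linearly with height above the surface, a rooftop with segments at different elevations will be imaged at different effective GSDs on a constant-altitude flight, which is one reason a GSD range rather than a single value is a natural way to report results.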

RoboAct-CLIP: Video-Driven Pre-training of Atomic Action Understanding for Robotics

Authors:Zhiyuan Zhang, Yuxin He, Yong Sun, Junyu Shi, Lijiang Liu, Qiang Nie
Date:2025-04-02 19:02:08

Visual Language Models (VLMs) have emerged as pivotal tools for robotic systems, enabling cross-task generalization, dynamic environmental interaction, and long-horizon planning through multimodal perception and semantic reasoning. However, existing open-source VLMs, predominantly trained for generic vision-language alignment tasks, fail to effectively model the temporally correlated action semantics that are crucial for robotic manipulation. While current image-based fine-tuning methods partially adapt VLMs to robotic applications, they fundamentally disregard temporal evolution patterns in video sequences and suffer from visual feature entanglement between robotic agents, manipulated objects, and environmental contexts, thereby limiting semantic decoupling capability for atomic actions and compromising model generalizability. To overcome these challenges, this work presents RoboAct-CLIP with dual technical contributions: 1) A dataset reconstruction framework that performs semantic-constrained action unit segmentation and re-annotation on open-source robotic videos, constructing purified training sets containing singular atomic actions (e.g., "grasp"); 2) A temporal-decoupling fine-tuning strategy based on Contrastive Language-Image Pretraining (CLIP) architecture, which disentangles temporal action features across video frames from object-centric characteristics to achieve hierarchical representation learning of robotic atomic actions. Experimental results in simulated environments demonstrate that the RoboAct-CLIP pretrained model achieves a 12% higher success rate than baseline VLMs, along with superior generalization in multi-object manipulation tasks.

Towards Operationalizing Heterogeneous Data Discovery

Authors:Jin Wang, Yanlin Feng, Chen Shen, Sajjadur Rahman, Eser Kandogan
Date:2025-04-02 18:38:38

Querying and exploring massive collections of data sources, such as data lakes, has been an essential research topic in the database community. Although much effort has been devoted to data discovery and data integration in data lakes, it has mainly focused on the scenario where the data lake consists of structured tables. However, real-world enterprise data lakes are always more complicated, where there might be silos of multi-modal data sources with structured, semi-structured and unstructured data. In this paper, we envision an end-to-end system with a declarative interface for querying and analyzing multi-modal data lakes. First, we propose a set of multi-modal operators, which form a unified interface that extends the relational operations with AI-composed ones to express analytical workloads over data sources in various modalities. In addition, we formally define the essential steps in the system, such as data discovery, query planning, query processing and results aggregation. On this basis, we then pinpoint the research challenges and discuss potential opportunities in realizing and optimizing these steps with advanced techniques brought by Large Language Models. Finally, we demonstrate our preliminary attempts to address this problem and suggest a future plan for this research topic.

Path planning with moving obstacles using stochastic optimal control

Authors:Seyyed Reza Jafari, Anders Hansson, Bo Wahlberg
Date:2025-04-02 18:34:25

Navigating a collision-free, optimal path for a robot poses a perpetual challenge, particularly in the presence of moving objects such as humans. In this study, we formulate the problem of finding an optimal path as a stochastic optimal control problem. However, obtaining a solution to this problem is nontrivial. Therefore, we consider a simplified problem, which is more tractable. For this simplified formulation, we are able to solve the corresponding Bellman equation. However, the solution obtained from the simplified problem does not sufficiently address the original problem of interest. To address the full problem, we propose a numerical procedure where we solve an optimization problem at each sampling instant. The solution to the simplified problem is integrated into the online formulation as a final-state penalty. We illustrate the efficiency of the proposed method using a numerical example.

Toward Real-world BEV Perception: Depth Uncertainty Estimation via Gaussian Splatting

Authors:Shu-Wei Lu, Yi-Hsuan Tsai, Yi-Ting Chen
Date:2025-04-02 17:59:38

Bird's-eye view (BEV) perception has gained significant attention because it provides a unified representation to fuse multiple view images and enables a wide range of downstream autonomous driving tasks, such as forecasting and planning. Recent state-of-the-art models utilize projection-based methods which formulate BEV perception as query learning to bypass explicit depth estimation. While we observe promising advancements in this paradigm, these models still fall short of real-world applications because of the lack of uncertainty modeling and expensive computational requirements. In this work, we introduce GaussianLSS, a novel uncertainty-aware BEV perception framework that revisits unprojection-based methods, specifically the Lift-Splat-Shoot (LSS) paradigm, and enhances them with depth uncertainty modeling. GaussianLSS represents spatial dispersion by learning a soft depth mean and computing the variance of the depth distribution, which implicitly captures object extents. We then transform the depth distribution into 3D Gaussians and rasterize them to construct uncertainty-aware BEV features. We evaluate GaussianLSS on the nuScenes dataset, achieving state-of-the-art performance compared to unprojection-based methods. In particular, it provides significant advantages in speed, running 2.5x faster, and in memory efficiency, using 0.3x less memory compared to projection-based methods, while achieving competitive performance with only a 0.4% IoU difference.
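
The "soft depth mean" and "variance of the depth distribution" mentioned above can be illustrated with a minimal sketch. It assumes a single pixel whose depth is predicted as scores over a few discrete depth bins; the function name `soft_depth_stats`, the bin centers, and the softmax normalization are illustrative assumptions, not details taken from the GaussianLSS paper.

```python
import math

def soft_depth_stats(logits, bin_centers):
    """Return (mean, variance) of a categorical depth distribution.

    logits      -- unnormalized scores, one per depth bin (hypothetical input)
    bin_centers -- depth value represented by each bin, e.g. in metres
    """
    # Numerically stable softmax over the depth bins.
    peak = max(logits)
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Soft depth mean: expectation of depth under the predicted distribution.
    mean = sum(p * d for p, d in zip(probs, bin_centers))
    # Variance: expected squared deviation from the mean, a measure of depth uncertainty.
    var = sum(p * (d - mean) ** 2 for p, d in zip(probs, bin_centers))
    return mean, var

# A flat distribution is maximally uncertain; a peaked one has near-zero variance.
flat_mean, flat_var = soft_depth_stats([0.0, 0.0, 0.0], [1.0, 2.0, 3.0])
peak_mean, peak_var = soft_depth_stats([0.0, 50.0, 0.0], [1.0, 2.0, 3.0])
```

In this sketch both distributions have the same mean, so only the variance distinguishes a confident depth estimate from an uncertain one, which is the kind of information a mean-only representation would discard.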

Deep Representation Learning for Unsupervised Clustering of Myocardial Fiber Trajectories in Cardiac Diffusion Tensor Imaging

Authors:Mohini Anand, Xavier Tricoche
Date:2025-04-02 17:56:57

Understanding the complex myocardial architecture is critical for diagnosing and treating heart disease. However, existing methods often struggle to accurately capture this intricate structure from Diffusion Tensor Imaging (DTI) data, particularly due to the lack of ground truth labels and the ambiguous, intertwined nature of fiber trajectories. We present a novel deep learning framework for unsupervised clustering of myocardial fibers, providing a data-driven approach to identifying distinct fiber bundles. We uniquely combine a Bidirectional Long Short-Term Memory network to capture local sequential information along fibers, with a Transformer autoencoder to learn global shape features, with pointwise incorporation of essential anatomical context. Clustering these representations using a density-based algorithm identifies 33 to 62 robust clusters, successfully capturing the subtle distinctions in fiber trajectories with varying levels of granularity. Our framework offers a new, flexible, and quantitative way to analyze myocardial structure, achieving a level of delineation that, to our knowledge, has not been previously achieved, with potential applications in improving surgical planning, characterizing disease-related remodeling, and ultimately, advancing personalized cardiac care.

End-to-End Driving with Online Trajectory Evaluation via BEV World Model

Authors:Yingyan Li, Yuqi Wang, Yang Liu, Jiawei He, Lue Fan, Zhaoxiang Zhang
Date:2025-04-02 17:47:23

End-to-end autonomous driving has achieved remarkable progress by integrating perception, prediction, and planning into a fully differentiable framework. Yet, to fully realize its potential, an effective online trajectory evaluation is indispensable to ensure safety. By forecasting the future outcomes of a given trajectory, trajectory evaluation becomes much more effective. This goal can be achieved by employing a world model to capture environmental dynamics and predict future states. Therefore, we propose an end-to-end driving framework WoTE, which leverages a BEV World model to predict future BEV states for Trajectory Evaluation. The proposed BEV world model is latency-efficient compared to image-level world models and can be seamlessly supervised using off-the-shelf BEV-space traffic simulators. We validate our framework on both the NAVSIM benchmark and the closed-loop Bench2Drive benchmark based on the CARLA simulator, achieving state-of-the-art performance. Code is released at https://github.com/liyingyanUCAS/WoTE.

Strengthening Multi-Robot Systems for SAR: Co-Designing Robotics and Communication Towards 6G

Authors:Juan Bravo-Arrabal, Ricardo Vázquez-Martín, J. J. Fernández-Lozano, Alfonso García-Cerezo
Date:2025-04-02 17:47:11

This paper presents field-tested use cases from Search and Rescue (SAR) missions, highlighting the co-design of mobile robots and communication systems to support Edge-Cloud architectures based on 5G Standalone (SA). The main goal is to contribute to the effective cooperation of multiple robots and first responders. Our field experience includes the development of Hybrid Wireless Sensor Networks (H-WSNs) for risk and victim detection, smartphones integrated into the Robot Operating System (ROS) as Edge devices for mission requests and path planning, real-time Simultaneous Localization and Mapping (SLAM) via Multi-Access Edge Computing (MEC), and implementation of Uncrewed Ground Vehicles (UGVs) for victim evacuation in different navigation modes. These experiments, conducted in collaboration with actual first responders, underscore the need for intelligent network resource management, balancing low-latency and high-bandwidth demands. Network slicing is key to ensuring critical emergency services are performed despite challenging communication conditions. The paper identifies architectural needs, lessons learned, and challenges to be addressed by 6G technologies to enhance emergency response capabilities.

The TELOS Collaboration Approach to Reproducibility and Open Science

Authors:Ed Bennett
Date:2025-04-02 16:32:29

The TELOS Collaboration is committed to producing and analysing lattice data reproducibly, and sharing its research openly. In this document, we set out the ways that we make this happen, where there is scope for improvement, and how we plan to achieve this. This is intended to work both as a statement of policy, and a guide to practice for those beginning to work with us. Some details and recommendations are specific to the context in which the Collaboration works (such as references to requirements imposed by funders in the United Kingdom); however, most recommendations may serve as a template for other collaborations looking to make their own work reproducible. Full tutorials on every aspect of reproducibility are beyond the scope of this document, but we refer to other resources for further information.

Virtual Target Trajectory Prediction for Stochastic Targets

Authors:Marc Schneider, Renato Loureiro, Torbjørn Cunis, Walter Fichter
Date:2025-04-02 16:02:43

Trajectory prediction of other vehicles is crucial for autonomous vehicles, with applications from missile guidance to UAV collision avoidance. Typically, target trajectories are assumed deterministic, but real-world aerial vehicles exhibit stochastic behavior, such as evasive maneuvers or gliders circling in thermals. This paper uses Conditional Normalizing Flows, an unsupervised Machine Learning technique, to learn and predict the stochastic behavior of targets of guided missiles using trajectory data. The trained model predicts the distribution of future target positions based on initial conditions and parameters of the dynamics. Samples from this distribution are clustered using a time series k-means algorithm to generate representative trajectories, termed virtual targets. The method is fast and target-agnostic, requiring only training data in the form of target trajectories. Thus, it serves as a drop-in replacement for deterministic trajectory predictions in guidance laws and path planning. Simulated scenarios demonstrate the approach's effectiveness for aerial vehicles with random maneuvers, bridging the gap between deterministic predictions and stochastic reality, advancing guidance and control algorithms for autonomous vehicles.
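
The clustering step described above (reducing many sampled trajectories to a few representative "virtual targets") can be sketched as follows. This is a rough illustration under stated assumptions: the sampler `sample_stand_in` is a hypothetical replacement for the trained Conditional Normalizing Flow, and plain Euclidean k-means over equal-length 2D trajectories with farthest-point initialization stands in for the time series k-means used in the paper.

```python
import random

def sq_dist(a, b):
    """Sum of squared point-wise distances between two equal-length 2D trajectories."""
    return sum((pa[0] - pb[0]) ** 2 + (pa[1] - pb[1]) ** 2 for pa, pb in zip(a, b))

def mean_trajectory(trajs):
    """Point-wise average of a non-empty list of equal-length trajectories."""
    n = len(trajs)
    steps = len(trajs[0])
    return [(sum(t[i][0] for t in trajs) / n, sum(t[i][1] for t in trajs) / n)
            for i in range(steps)]

def kmeans_trajectories(samples, k, iters=20):
    """Lloyd's algorithm; centroids play the role of virtual targets."""
    # Deterministic farthest-point initialization (an assumption of this sketch).
    centroids = [samples[0]]
    while len(centroids) < k:
        centroids.append(
            max(samples, key=lambda s: min(sq_dist(s, c) for c in centroids)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for s in samples:
            nearest = min(range(k), key=lambda j: sq_dist(s, centroids[j]))
            groups[nearest].append(s)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [mean_trajectory(g) if g else c
                     for g, c in zip(groups, centroids)]
    return centroids

def sample_stand_in(rng, steps=10):
    """Hypothetical sampler: a target that randomly turns left or right, plus noise."""
    turn = rng.choice([-1.0, 1.0])
    return [(float(t), turn * 0.1 * t * t + rng.gauss(0.0, 0.05))
            for t in range(steps)]

rng = random.Random(0)
samples = [sample_stand_in(rng) for _ in range(200)]
virtual_targets = kmeans_trajectories(samples, k=2)
```

With two clearly separated maneuver modes, the two centroids recover one left-turning and one right-turning trajectory, whereas a single averaged prediction would point straight ahead and match neither.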

US National Input to the European Strategy Update for Particle Physics

Authors:André de Gouvêa, Hitoshi Murayama, Mark Palmer, Heidi Schellman
Date:2025-04-02 15:12:04

In this document we summarize the output of the US community planning exercises for particle physics that were performed between 2020 and 2023 and comment upon progress made since then towards our common scientific goals. This document leans heavily on the formal report of the Particle Physics Project Prioritization Panel and other recent US planning documents, often quoting them verbatim to retain the community consensus.

How to write competitive proposals and job applications

Authors:Johan H. Knapen, Henri M. J. Boffin, Nushkia Chamba, Natashya Chamba
Date:2025-04-02 11:53:45

Writing proposals and job applications is arguably one of the most important tasks in the career of a scientist. The proposed ideas must be scientifically compelling, but how a proposal is planned, written, and presented can make an enormous difference. This Perspective is the third in a series aimed at training the writing skills of professional astronomers. In the first two papers we concentrated on the writing of papers; here we concentrate on how proposals and job applications can be optimally written and presented. We discuss how to select where to propose or apply, how to optimise your writing, and add notes on the potential use of artificial intelligence tools. This guide is aimed primarily at more junior researchers, but we hope that our observations and suggestions may also be helpful for more experienced applicants, as well as for reviewers and funding agencies.

The Mini-SiTian Array: Design and application of Master Control System

Authors:Zheng Wang, Jin-hang Zou, Liang Ge, Min He, Jian Li, Yi Hu, Jianfeng Tian
Date:2025-04-02 11:26:11

The SiTian Project represents a groundbreaking initiative in astronomy, aiming to deploy a global network of telescopes, each with a 1-meter aperture, for comprehensive time-domain sky surveys. The network's innovative architecture features multiple observational nodes, each comprising three strategically aligned telescopes equipped with filters. This design enables three-color (g, r, i) channel imaging within each node, facilitating precise and coordinated observations. As a pathfinder to the full-scale project, the Mini-SiTian Project serves as the scientific and technological validation platform, utilizing three 30-centimeter aperture telescopes to validate the methodologies and technologies planned for the broader SiTian network. This paper focuses on the development and implementation of the Master Control System (MCS), the central command hub for the Mini-SiTian array. The MCS is designed to facilitate seamless communication with the SiTian Brain, the project's central processing and decision-making unit, while ensuring accurate task allocation, real-time status monitoring, and optimized observational workflows. The system adopts a robust architecture that separates front-end and back-end functionalities. A key innovation of the MCS is its ability to dynamically adjust observation plans in response to transient source alerts, enabling rapid and coordinated scans of target sky regions...(abridged)

The Mini-SiTian Array: the mini-SiTian Realtime Image Processing pipeline (STRIP)

Authors:Hongrui Gu, Yang Huang, Yongkang Sun, Kai Xiao, Zhirui Li, Beichuan Wang, Zhou Fan, Chuanjie Zheng, Henggeng Han, Hu Zou, Wenxiong Li, Hong Wu, Jifeng Liu
Date:2025-04-02 11:25:59

This paper provides a comprehensive introduction to the Mini-SiTian Real-Time Image Processing pipeline (STRIP) and evaluates its operational performance. The STRIP pipeline is specifically designed for real-time alert triggering and light curve generation for transient sources. Applied to both simulated and real observational data of the Mini-SiTian survey, the STRIP pipeline successfully identified various types of variable sources, including stellar flares, supernovae, variable stars, and asteroids, while meeting the requirement of a reduction speed within 5 minutes. For the real observational dataset, the pipeline detected 1 flare event, 127 variable stars, and 14 asteroids from three monitored sky regions. Additionally, two datasets were generated: a real-bogus training dataset comprising 218,818 training samples, and a variable star light curve dataset with 421 instances. These datasets will be used to train machine learning algorithms, which are planned for future integration into STRIP.