Aerial robotic arms aim to enable inspection and environment interaction in otherwise hard-to-reach areas from the air. However, many aerial manipulators feature bulky or heavy robot manipulators mounted to large, high-payload aerial vehicles. Instead, we propose an aerial robotic arm with low mass and a small stowed configuration called a "flying vine". The flying vine consists of a small, maneuverable quadrotor equipped with a soft, growing, inflated beam as the arm. This soft robot arm is underactuated, and positioning of the end effector is achieved by controlling the coupled quadrotor-vine dynamics. In this work, we present the flying vine design and a modeling and control framework for tracking desired end effector trajectories. The dynamic model leverages data-driven modeling methods and introduces bilinear interpolation to account for time-varying dynamic parameters. We use trajectory optimization to plan quadrotor controls that produce desired end effector motions. Experimental results on a physical prototype demonstrate that our framework enables the flying vine to perform high-speed end effector tracking, laying a foundation for performing dynamic maneuvers with soft aerial manipulators.
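As an illustration of the interpolation step mentioned above, the sketch below shows one plausible way to blend dynamic parameters identified on a 2D grid of operating points; the grid, its axes, and all function names are hypothetical and not taken from the paper.

```python
import numpy as np

def bilinear_param(grid, x_axis, y_axis, x, y):
    """Bilinearly interpolate a dynamic parameter stored on a 2D grid.

    grid[i, j] holds the identified parameter value at operating point
    (x_axis[i], y_axis[j]); (x, y) is the query point, clamped to the grid.
    """
    x = np.clip(x, x_axis[0], x_axis[-1])
    y = np.clip(y, y_axis[0], y_axis[-1])
    i = int(np.clip(np.searchsorted(x_axis, x) - 1, 0, len(x_axis) - 2))
    j = int(np.clip(np.searchsorted(y_axis, y) - 1, 0, len(y_axis) - 2))
    tx = (x - x_axis[i]) / (x_axis[i + 1] - x_axis[i])
    ty = (y - y_axis[j]) / (y_axis[j + 1] - y_axis[j])
    return ((1 - tx) * (1 - ty) * grid[i, j]
            + tx * (1 - ty) * grid[i + 1, j]
            + (1 - tx) * ty * grid[i, j + 1]
            + tx * ty * grid[i + 1, j + 1])
```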
Accurate segmentation of nodules in both 2D breast ultrasound (BUS) and 3D automated breast ultrasound (ABUS) is crucial for clinical diagnosis and treatment planning. Therefore, developing an automated system for nodule segmentation can enhance user independence and expedite clinical analysis. Unlike fully-supervised learning, weakly-supervised segmentation (WSS) can streamline the laborious and intricate annotation process. However, current WSS methods face challenges in achieving precise nodule segmentation, as many of them depend on inaccurate activation maps or inefficient pseudo-mask generation algorithms. In this study, we introduce a novel multi-agent reinforcement learning-based WSS framework called Flip Learning, which relies solely on 2D/3D boxes for accurate segmentation. Specifically, multiple agents are employed to erase the target from the box to facilitate classification tag flipping, with the erased region serving as the predicted segmentation mask. The key contributions of this research are as follows: (1) Adoption of a superpixel/supervoxel-based approach to encode the standardized environment, capturing boundary priors and expediting the learning process. (2) Introduction of three meticulously designed rewards, comprising a classification score reward and two intensity distribution rewards, to steer the agents' erasing process precisely, thereby avoiding both under- and over-segmentation. (3) Implementation of a progressive curriculum learning strategy to enable agents to interact with the environment in a progressively challenging manner, thereby enhancing learning efficiency. Extensively validated on the large in-house BUS and ABUS datasets, our Flip Learning method outperforms state-of-the-art WSS methods and foundation models, and achieves performance comparable to that of fully-supervised learning algorithms.
Ensuring robustness against epistemic, possibly adversarial, perturbations is essential for reliable real-world decision-making. While the Probabilistic Ensembles with Trajectory Sampling (PETS) algorithm inherently handles uncertainty via ensemble-based probabilistic models, it lacks guarantees against structured adversarial or worst-case uncertainty distributions. To address this, we propose DR-PETS, a distributionally robust extension of PETS that certifies robustness against adversarial perturbations. We formalize uncertainty via a p-Wasserstein ambiguity set, enabling worst-case-aware planning through a min-max optimization framework. While PETS passively accounts for stochasticity, DR-PETS actively optimizes robustness via a tractable convex approximation integrated into the PETS planning loop. Experiments on pendulum stabilization and cart-pole balancing show that DR-PETS certifies robustness against adversarial parameter perturbations, achieving consistent performance in worst-case scenarios where PETS deteriorates.
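A highly simplified, hypothetical sketch of the worst-case planning idea follows. The paper uses a p-Wasserstein ambiguity set with a tractable convex approximation; this stand-in merely samples bounded parameter perturbations and plans against the worst sampled cost, so all names and the perturbation model are assumptions.

```python
import numpy as np

def worst_case_plan_cost(rollout_cost, nominal_params, action_seq,
                         radius=0.1, n_perturb=16, rng=None):
    """Approximate worst-case cost of an action sequence over an ambiguity set.

    rollout_cost(params, action_seq) -> scalar cost under perturbed dynamics.
    The ambiguity set is crudely approximated here by random bounded
    perturbations of the nominal dynamics parameters.
    """
    rng = rng or np.random.default_rng(0)
    costs = []
    for _ in range(n_perturb):
        delta = rng.uniform(-radius, radius, size=nominal_params.shape)
        costs.append(rollout_cost(nominal_params + delta, action_seq))
    return max(costs)  # a planner would minimize this worst-case value
```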
The rapid growth of photovoltaic (PV) systems in Austria's medium- and low-voltage grids has intensified challenges in grid access, with technical limits increasingly leading to restrictions on full feed-in power. This issue has sparked discussions about limiting PV feed-in power and the implications for both generated and curtailed PV energy. At the same time, expanding PV capacity remains critical to achieving future climate targets. However, robust methodologies are lacking for quantifying the impact of PV feed-in limitations when they are implemented in an optimization model. This impact concerns both the curtailed energy and the increase in maximum PV installation capacity and total energy production. To address this gap, we have developed a mathematical formulation of dynamic PV feed-in limitations and integrated it into an optimization model. This approach enables a comprehensive evaluation of its effects on PV integration potential and energy curtailment, validated through case studies on four representative real-world Austrian medium- and low-voltage grids. We analyzed maximum PV expansion, energy generation, and curtailment under feed-in constraints. The results highlight the potential for integrating up to 32% additional PV systems within existing infrastructure while keeping PV curtailment relatively low, i.e., at 2%. We provide actionable insights for grid operators and policymakers aiming to balance renewable energy expansion with grid reliability.
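To make the feed-in limitation concrete, the following toy sketch computes fed-in versus curtailed energy under a static cap expressed as a fraction of installed capacity. The paper formulates dynamic limits inside a full optimization model, so this is only an illustrative simplification with assumed variable names.

```python
import numpy as np

def curtailed_energy(pv_profile_kw_per_kwp, installed_kwp, feedin_limit_frac, dt_h=1.0):
    """Energy fed in vs. curtailed when export is capped at a fraction of capacity.

    pv_profile_kw_per_kwp: time series of available PV power per 1 kWp;
    feedin_limit_frac: e.g. 0.6 caps export at 60 % of installed kWp.
    Returns (fed-in energy, curtailed energy) in kWh.
    """
    available = installed_kwp * np.asarray(pv_profile_kw_per_kwp)
    cap = feedin_limit_frac * installed_kwp
    fed_in = np.minimum(available, cap)
    curtailed = available - fed_in
    return fed_in.sum() * dt_h, curtailed.sum() * dt_h
```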
Let $G$ be a connected graph with $N$ vertices. Let $k$ be the number of vertices in a longest path of $G$ such that every vertex on the path is a cut vertex of $G$, and every intermediate vertex of the path is a degree-two vertex of $G$. Let $P=\{1,\ldots,n\}$ be a set of pebbles with $n+k < N$. A \textit{configuration} of $P$ on $G$ is defined as a function $f$ from $V(G)$ to $\{0, 1, \ldots, n \}$ with $|f^{-1}(i)| = 1$ for $1 \le i \le n$, where $f^{-1}(i)$ is the vertex occupied by the $i$th pebble for $1 \le i \le n$ and $f^{-1}(0)$ is the set of unoccupied vertices. A \textit{move} is defined as shifting a pebble from a vertex to some unoccupied neighbor. The {\it pebble motion problem on the pair $(G,P)$} is to decide whether a given configuration of pebbles is reachable from another by executing a sequence of moves. In this paper, we show that the length of the shortest solution sequence of the pebble motion problem on the pair $(G,P)$ is in $O(Nn + n^2 \log(\min\{n,k\}))$ if $G$ is an $N$-vertex tree, and it is in $O(N^2 + \frac{n^3}{N-n} + n^2 \log(\min\{n,N-n\}))$ if $G$ is a connected general $N$-vertex graph. We provide an algorithm that obtains a solution sequence whose length satisfies these bounds, with computational complexity of the same order as the sequence length. Keywords: pebble motion, motion planning, multi-agent path finding, $15$-puzzle, tree
This paper considers multi-goal motion planning in unstructured, obstacle-rich environments where a robot is required to reach multiple regions while avoiding collisions. The planned motions must also satisfy the differential constraints imposed by the robot dynamics. To find solutions efficiently, this paper leverages machine learning, the Traveling Salesman Problem (TSP), and sampling-based motion planning. The approach expands a motion tree by adding collision-free and dynamically-feasible trajectories as branches. A TSP solver is used to compute a tour for each node to determine the order in which to reach the remaining goals by utilizing a cost matrix. An important aspect of the approach is that it leverages machine learning to construct the cost matrix by combining runtime and distance predictions for single-goal motion-planning problems. During the motion-tree expansion, priority is given to nodes associated with low-cost tours. Experiments with a vehicle model operating in obstacle-rich environments demonstrate the computational efficiency and scalability of the approach.
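A minimal, hypothetical sketch of how a learned cost matrix and a cheap tour heuristic could interact is given below; the paper's actual predictors, TSP solver, and weighting are not specified here, so the function names and the nearest-neighbor heuristic are assumptions.

```python
import numpy as np

def tour_cost_matrix(goals, predict_runtime, predict_distance, w=0.5):
    """Build a goal-to-goal cost matrix from learned predictions.

    predict_runtime / predict_distance map a (start, goal) pair to a scalar
    estimate of single-goal planning effort and path length; w blends them.
    """
    n = len(goals)
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                C[i, j] = (w * predict_runtime(goals[i], goals[j])
                           + (1 - w) * predict_distance(goals[i], goals[j]))
    return C

def greedy_tour(C, start=0):
    """Cheap nearest-neighbor tour used to prioritize motion-tree nodes."""
    n, visited, tour = len(C), {start}, [start]
    while len(tour) < n:
        last = tour[-1]
        nxt = min((j for j in range(n) if j not in visited), key=lambda j: C[last, j])
        tour.append(nxt)
        visited.add(nxt)
    return tour
```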
Most, if not all, robot navigation systems employ a decomposed planning framework that includes global and local planning. To trade off onboard computation and plan quality, current systems have to limit all robot dynamics considerations to the local planner, while leveraging an extremely simplified robot representation (e.g., a point-mass holonomic model without dynamics) at the global level. However, such an artificial decomposition based on either full or zero consideration of robot dynamics can lead to gaps between the two levels, e.g., a global path based on a holonomic point-mass model may not be realizable by a non-holonomic robot, especially in highly constrained obstacle environments. Motivated by this limitation, we propose a novel paradigm, Decremental Dynamics Planning (DDP), which integrates dynamic constraints into the entire planning process, with a focus on high-fidelity dynamics modeling at the beginning and a gradual fidelity reduction as planning progresses. To validate the effectiveness of this paradigm, we augment three different planners with DDP and show overall improved planning performance. We also develop a new DDP-based navigation system, which achieves first place in the simulation phase of the 2025 BARN Challenge. Both simulated and physical experiments validate DDP's hypothesized benefits.
Hybrid storage systems (HSS) combine multiple storage devices with diverse characteristics to achieve high performance and capacity at low cost. The performance of an HSS highly depends on the effectiveness of two key policies: (1) the data-placement policy, which determines the best-fit storage device for incoming data, and (2) the data-migration policy, which rearranges stored data across the devices to sustain high HSS performance. Prior works focus on improving only data placement or only data migration in HSS, which leads to sub-optimal HSS performance. Unfortunately, no prior work tries to optimize both policies together. Our goal is to design a holistic data-management technique for HSS that optimizes both data-placement and data-migration policies to fully exploit the potential of an HSS. We propose Harmonia, a multi-agent reinforcement learning (RL)-based data-management technique that employs two light-weight autonomous RL agents, a data-placement agent and a data-migration agent, which adapt their policies for the current workload and HSS configuration, and coordinate with each other to improve overall HSS performance. We evaluate Harmonia on a real HSS with up to four heterogeneous storage devices with diverse characteristics. Our evaluation using 17 data-intensive workloads on performance-optimized (cost-optimized) HSS with two storage devices shows that, on average, Harmonia (1) outperforms the best-performing prior approach by 49.5% (31.7%), (2) bridges the performance gap between the best-performing prior work and Oracle by 64.2% (64.3%). On an HSS with three (four) devices, Harmonia outperforms the best-performing prior work by 37.0% (42.0%). Harmonia's performance benefits come with low latency (240ns for inference) and storage overheads (206 KiB for both RL agents together). We plan to open-source Harmonia's implementation to aid future research on HSS.
We discuss a concept of a lower-energy version of the Large Hadron-electron Collider (LHeC), delivering electron-hadron collisions concurrently to the hadron-hadron collisions at the high-luminosity LHC at CERN. Assuming the use of a 20 GeV electron Energy Recovery Linac (ERL), we describe the optimised beam dynamics, accelerator technologies, and detector constraints required for such a "phase-one" LHeC. Finally, we also discuss the ERL configurations, the possibility of delivering electron-hadron collisions during the planned LHC Run5 and briefly outline the scientific potential of this proposal.
Accurate segmentation of glioma brain tumors is crucial for diagnosis and treatment planning. Deep learning techniques offer promising solutions, but optimal model architectures remain under investigation. We used the BraTS 2021 dataset, selecting T1 with contrast enhancement (T1CE), T2, and Fluid-Attenuated Inversion Recovery (FLAIR) sequences for model development. The proposed Attention Xception UNet (AXUNet) architecture integrates an Xception backbone with dot-product self-attention modules, inspired by state-of-the-art (SOTA) large language models such as Google Bard and OpenAI ChatGPT, within a UNet-shaped model. We compared AXUNet with SOTA models. Comparative evaluation on the test set demonstrated improved results over baseline models. Inception-UNet and Xception-UNet achieved mean Dice scores of 90.88 and 93.24, respectively. Attention ResUNet (AResUNet) attained a mean Dice score of 92.80, with the highest score of 84.92 for enhancing tumor (ET) among all models. Attention Gate UNet (AGUNet) yielded a mean Dice score of 90.38. AXUNet outperformed all models with a mean Dice score of 93.73. It demonstrated superior Dice scores across whole tumor (WT) and tumor core (TC) regions, achieving 92.59 for WT, 86.81 for TC, and 84.89 for ET. The integration of the Xception backbone and dot-product self-attention mechanisms in AXUNet showcases enhanced performance in capturing spatial and contextual information. The findings underscore the potential utility of AXUNet in facilitating precise tumor delineation.
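For readers unfamiliar with the attention module referenced above, the following minimal NumPy sketch shows standard scaled dot-product self-attention over flattened spatial tokens. It is a generic illustration, not the exact AXUNet module, and the projection matrices are assumed inputs.

```python
import numpy as np

def dot_product_self_attention(x, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a set of spatial tokens.

    x: (n_tokens, d) flattened feature map; Wq, Wk, Wv: (d, d_k) projection
    matrices. Returns the attended features with shape (n_tokens, d_k).
    """
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v
```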
Navigating in environments alongside humans requires agents to reason under uncertainty and account for the beliefs and intentions of those around them. Under a sequential decision-making framework, egocentric navigation can naturally be represented as a Markov Decision Process (MDP). However, social navigation additionally requires reasoning about the hidden beliefs of others, inherently leading to a Partially Observable Markov Decision Process (POMDP), where agents lack direct access to others' mental states. Inspired by Theory of Mind and Epistemic Planning, we propose (1) a neuro-symbolic model-based reinforcement learning architecture for social navigation, addressing the challenge of belief tracking in partially observable environments; and (2) a perspective-shift operator for belief estimation, leveraging recent work on Influence-based Abstractions (IBA) in structured multi-agent settings.
This paper addresses a generalization of Multi-Agent Pathfinding (MAPF), called Collaborative Task Sequencing - Multi-Agent Pathfinding (CTS-MAPF), where agents must plan collision-free paths and visit a series of intermediate task locations in a specific order before reaching their final destinations. To address this problem, we propose a new approach, Collaborative Task Sequencing - Conflict-Based Search (CTS-CBS), which conducts a two-level search. At the high level, it generates a search forest, where each tree corresponds to a joint task sequence derived from the jTSP solution. At the low level, CTS-CBS performs constrained single-agent path planning to generate paths for each agent while adhering to high-level constraints. We also provide theoretical guarantees of its completeness and optimality (or sub-optimality with a bounded parameter). To evaluate the performance of CTS-CBS, we create two datasets, CTS-MAPF and MG-MAPF, and conduct comprehensive experiments. The results show that CTS-CBS adaptations for MG-MAPF outperform baseline algorithms in terms of success rate (up to 20 times higher) and runtime (up to 100 times faster), with less than a 10% sacrifice in solution quality. Furthermore, CTS-CBS offers flexibility by allowing users to adjust the sub-optimality bound $\omega$ to balance solution quality and efficiency. Finally, practical robot tests demonstrate the algorithm's applicability in real-world scenarios.
Model-based reinforcement learning (MBRL) has demonstrated superior sample efficiency compared to model-free reinforcement learning (MFRL). However, the presence of inaccurate models can introduce biases during policy learning, resulting in misleading trajectories. The challenge lies in obtaining accurate models due to limited diverse training data, particularly in regions with limited visits (uncertain regions). Existing approaches passively quantify uncertainty after sample generation, failing to actively collect uncertain samples that could enhance state coverage and improve model accuracy. Moreover, MBRL often faces difficulties in making accurate multi-step predictions, thereby impacting overall performance. To address these limitations, we propose a novel framework for uncertainty-aware policy optimization with model-based exploratory planning. In the model-based planning phase, we introduce an uncertainty-aware k-step lookahead planning approach to guide action selection at each step. This process involves a trade-off analysis between model uncertainty and value function approximation error, effectively enhancing policy performance. In the policy optimization phase, we leverage an uncertainty-driven exploratory policy to actively collect diverse training samples, resulting in improved model accuracy and overall performance of the RL agent. Our approach offers flexibility and applicability to tasks with varying state/action spaces and reward structures. We validate its effectiveness through experiments on challenging robotic manipulation tasks and Atari games, surpassing state-of-the-art methods with fewer interactions, thereby leading to significant performance improvements.
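The k-step lookahead with an uncertainty penalty could look roughly like the sketch below, which scores each candidate action by an ensemble rollout and penalizes ensemble disagreement. The exact trade-off used in the paper is not reproduced here, the same action is repeated over the horizon for brevity, and all interfaces (ensemble_step, reward_fn, value_fn) are hypothetical.

```python
import numpy as np

def uncertainty_aware_lookahead(state, candidate_actions, ensemble_step,
                                reward_fn, value_fn, k=5, lam=1.0):
    """Score candidate actions with a k-step lookahead through a model ensemble.

    ensemble_step(state, action) -> list of next-state predictions, one per
    ensemble member; reward_fn and value_fn are learned estimates. Ensemble
    disagreement is used as the uncertainty penalty, weighted by lam.
    """
    best_a, best_score = None, -np.inf
    for a in candidate_actions:
        s, ret, unc = np.asarray(state, dtype=float), 0.0, 0.0
        for _ in range(k):
            preds = np.array(ensemble_step(s, a))
            unc += preds.std(axis=0).mean()   # model disagreement along the rollout
            ret += reward_fn(s, a)
            s = preds.mean(axis=0)            # continue from the mean prediction
        score = ret + value_fn(s) - lam * unc
        if score > best_score:
            best_a, best_score = a, score
    return best_a
```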
Reactive mobile robot navigation in unstructured environments is challenging when robots encounter unexpected obstacles that invalidate previously planned trajectories. Model predictive path integral control (MPPI) enables reactive planning, but still suffers from limited prediction horizons that lead to local minima traps near obstacles. Current solutions rely on heuristic cost design or scenario-specific pre-training, which often limits their adaptability to new environments. We introduce dynamic repulsive potential augmented MPPI (DRPA-MPPI), which dynamically detects potential entrapments on the predicted trajectories. Upon detecting local minima, DRPA-MPPI automatically switches between standard goal-oriented optimization and a modified cost function that generates repulsive forces away from local minima. Comprehensive testing in simulated obstacle-rich environments confirms DRPA-MPPI's superior navigation performance and safety compared to conventional methods with less computational burden.
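A rough sketch of the cost switching is shown below: a goal-oriented MPPI stage cost is augmented with a Gaussian repulsive potential once a local minimum is detected from lack of progress along the predicted trajectory. The detection rule, the potential shape, and all parameter names are assumptions, not the paper's exact formulation.

```python
import numpy as np

def drpa_cost(traj, goal, trap_center=None, k_rep=50.0, sigma=1.0):
    """Rollout cost for MPPI sampling, with an optional repulsive term.

    traj: (T, 2) predicted positions; goal: (2,). If a local-minimum centre
    has been detected (trap_center is not None), a Gaussian repulsive
    potential pushes sampled trajectories away from it.
    """
    goal_cost = np.linalg.norm(traj - goal, axis=1).sum()
    if trap_center is None:
        return goal_cost
    d2 = np.sum((traj - trap_center) ** 2, axis=1)
    repulsion = k_rep * np.exp(-d2 / (2 * sigma ** 2)).sum()
    return goal_cost + repulsion

def detect_entrapment(traj, goal, progress_eps=0.05):
    """Flag a potential local minimum when the horizon makes little progress toward the goal."""
    return (np.linalg.norm(traj[0] - goal)
            - np.linalg.norm(traj[-1] - goal)) < progress_eps
```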
Autonomous vehicle (AV) control systems increasingly rely on ML models for tasks such as perception and planning. Current practice is to run these models on the car's local hardware due to real-time latency constraints and reliability concerns, which limits model size and thus accuracy. Prior work has observed that we could augment current systems by running larger models in the cloud, relying on faster cloud runtimes to offset the cellular network latency. However, prior work does not account for an important practical constraint: limited cellular bandwidth. We show that, for typical bandwidth levels, proposed techniques for cloud-augmented AV models take too long to transfer data, thus mostly falling back to the on-car models and resulting in no accuracy improvement. In this work, we show that realizing cloud-augmented AV models requires intelligent use of this scarce bandwidth, i.e. carefully allocating bandwidth across tasks and providing multiple data compression and model options. We formulate this as a resource allocation problem to maximize car utility, and present our system \sysname which achieves an increase in average model accuracy by up to 15 percentage points on driving scenarios from the Waymo Open Dataset.
Modern reinforcement learning (RL) systems have demonstrated remarkable capabilities in complex environments, such as video games. However, they still fall short of achieving human-like sample efficiency and adaptability when learning new domains. Theory-based reinforcement learning (TBRL) is an algorithmic framework specifically designed to address this gap. Modeled on cognitive theories, TBRL leverages structured, causal world models - "theories" - as forward simulators for use in planning, generalization and exploration. Although current TBRL systems provide compelling explanations of how humans learn to play video games, they face several technical limitations: their theory languages are restrictive, and their planning algorithms are not scalable. To address these challenges, we introduce TheoryCoder, an instantiation of TBRL that exploits hierarchical representations of theories and efficient program synthesis methods for more powerful learning and planning. TheoryCoder equips agents with general-purpose abstractions (e.g., "move to"), which are then grounded in a particular environment by learning a low-level transition model (a Python program synthesized from observations by a large language model). A bilevel planning algorithm can exploit this hierarchical structure to solve large domains. We demonstrate that this approach can be successfully applied to diverse and challenging grid-world games, where approaches based on directly synthesizing a policy perform poorly. Ablation studies demonstrate the benefits of using hierarchical abstractions.
The planned NASA Habitable Worlds Observatory (HWO) flagship mission aims to image and spectroscopically characterise 25 Earth-size planets in the habitable zones of their stars. However, one giant planet in the habitable zone can ruin your whole day. Recent work has examined the current state of our knowledge on the presence or absence of such objects in samples of likely HWO targets, and that knowledge has been found wanting; even Saturn-mass planets remain undetectable in many of these systems. In this work, we present simulations assessing the degree to which new campaigns of high-cadence radial velocity observations can ameliorate this woeful state of affairs. In particular, we highlight the value of moderate-precision but highly flexibly-scheduled RV facilities in aiding this necessary HWO precursor science. We find that for a subset of Southern HWO stars, 6 years of new RVs from the Minerva-Australis telescope array in Australia can improve the median detection sensitivity in the habitable zones of 13 likely HWO targets to $\sim$50 Earth masses, an improvement of $\sim$44%.
Urban transportation networks are vital for the efficient movement of people and goods, necessitating effective traffic management and planning. An integral part of traffic management is understanding the turning movement counts (TMCs) at intersections. Accurate TMCs are crucial for traffic signal control, congestion mitigation, and road safety. In general, TMCs are obtained using physical sensors installed at intersections, but this approach can be cost-prohibitive and technically challenging, especially for cities with extensive road networks. Recent advancements in machine learning and data-driven approaches have offered promising alternatives for estimating TMCs. Traffic patterns can vary significantly across different intersections due to factors such as road geometry, traffic signal settings, and local driver behaviors. This domain discrepancy limits the generalizability and accuracy of machine learning models when applied to new or unseen intersections. In response to these limitations, this research proposes a novel framework leveraging domain adaptation (DA) to estimate TMCs at intersections by using traffic controller event-based data, road infrastructure data, and point-of-interest (POI) data. Evaluated on 30 intersections in Tucson, Arizona, the proposed DA framework was compared with state-of-the-art models and achieved the lowest Mean Absolute Error and Root Mean Square Error.
This paper tackles a novel problem, extendable long-horizon planning: enabling agents to plan trajectories longer than those in the training data without compounding errors. To address this, we propose the Hierarchical Multiscale Diffuser (HM-Diffuser) and Progressive Trajectory Extension (PTE), an augmentation method that iteratively generates longer trajectories by stitching shorter ones. HM-Diffuser trains on these extended trajectories using a hierarchical structure, efficiently handling tasks across multiple temporal scales. Additionally, we introduce Adaptive Plan Pondering and the Recursive HM-Diffuser, which consolidate hierarchical layers into a single model to process temporal scales recursively. Experimental results demonstrate the effectiveness of our approach, advancing diffusion-based planners for scalable long-horizon planning.
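One round of trajectory stitching in the spirit of PTE might look like the following simplified sketch, which concatenates trajectory pairs whose endpoints nearly coincide; the paper's actual stitching and augmentation procedure is more involved, and the tolerance and interfaces here are assumed.

```python
import numpy as np

def stitch_trajectories(trajs, join_eps=0.5):
    """One simplified round of Progressive Trajectory Extension.

    trajs: list of (T_i, d) state arrays. Any pair whose end/start states lie
    within join_eps is concatenated into a longer trajectory, which can then
    be fed back in for further rounds of extension.
    """
    extended = []
    for a in trajs:
        for b in trajs:
            if a is b:
                continue
            if np.linalg.norm(a[-1] - b[0]) < join_eps:
                extended.append(np.concatenate([a, b[1:]], axis=0))
    return extended
```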
Health plan deductibles are a form of cost-sharing that require patients to pay out-of-pocket before insurance pays for benefits. Deductible plans have become increasingly common in the United States to mitigate escalating healthcare costs. Quantifying the impact of increased cost-sharing from deductibles on utilization is a challenging empirical question because individuals and employers self-select into plans with deductibles. We evaluated the impact of cost-sharing in plans with deductibles by leveraging an accidental injury to a family member as an instrumental variable that strongly predicted the non-injured family member reaching their deductible maximum, which resulted in a reduction in cost-sharing. Our outcome measures examined utilization subject to cost-sharing as compared to utilization exempt from cost-sharing. Using data from the same healthcare system to control for quality and provider network, we found that reaching the deductible increased emergency department (ED) utilization by 10.0 percentage points (pp). Nearly one-quarter of the increased ED utilization was potentially avoidable. Wellness visits not subject to cost-sharing decreased by 5.7 pp. Results were similar for high-deductible plans and for families meeting their maximum out-of-pocket amount. These findings provide causal evidence that individuals enrolled in plans with deductibles change utilization patterns after an exogenous reduction in cost-sharing.
In this paper we review the physics opportunities at linear $e^+e^-$ colliders with a special focus on high centre-of-mass energies and beam polarisation, take a fresh look at the various accelerator technologies available or under development and, for the first time, discuss how a facility first equipped with a technology mature today could be upgraded with technologies of tomorrow to reach much higher energies and/or luminosities. In addition, we will discuss detectors and alternative collider modes, as well as opportunities for beyond-collider experiments and R\&D facilities as part of a linear collider facility (LCF). The material of this paper will support all plans for $e^+e^-$ linear colliders and additional opportunities they offer, independently of technology choice or proposed site, as well as R\&D for advanced accelerator technologies. This joint perspective on the physics goals, early technologies and upgrade strategies has been developed by the LCVision team based on an initial discussion at LCWS2024 in Tokyo and a follow-up at the LCVision Community Event at CERN in January 2025. It heavily builds on decades of achievements of the global linear collider community, in particular in the context of CLIC and ILC.
Intelligent organisms can solve truly novel problems which they have never encountered before, either in their lifetime or their evolution. An important component of this capacity is the ability to ``think'', that is, to mentally manipulate objects, concepts and behaviors in order to plan and evaluate possible solutions to novel problems, even without environment interaction. To generate problems that are truly qualitatively novel, while still solvable zero-shot (by mental simulation), we use the combinatorial nature of environments: we train the agent while withholding a specific combination of the environment's elements. The novel test task, based on this combination, is thus guaranteed to be truly novel, while still mentally simulable, since the agent has been exposed to each individual element (and their pairwise interactions) during training. We propose a method to train agents endowed with world models to make use of their mental simulation abilities, by selecting tasks based on the difference between the agent's pre-thinking and post-thinking performance. When tested on the novel, withheld problem, the resulting agent successfully simulated alternative scenarios and used the resulting information to guide its behavior in the actual environment, solving the novel task in a single real-environment trial (zero-shot).
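A bare-bones sketch of the task-selection idea is given below: tasks are ranked by the gap between post-thinking and pre-thinking performance, and the largest gaps are kept for training. The evaluation interface and hyperparameters are hypothetical, not the paper's.

```python
def select_training_tasks(tasks, evaluate, top_k=8):
    """Pick tasks where mental simulation ('thinking') helps the most.

    evaluate(task, think) -> success rate of the agent on the task, with the
    world-model rollout ('thinking') enabled or disabled. Tasks with the
    largest post-thinking minus pre-thinking gap are selected for training.
    """
    gaps = [(evaluate(t, think=True) - evaluate(t, think=False), t) for t in tasks]
    gaps.sort(key=lambda g: g[0], reverse=True)
    return [t for _, t in gaps[:top_k]]
```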
True Digital Orthophoto Maps (TDOMs) are characterized by high geometric precision and dense image features, and are in great demand for applications such as urban planning, infrastructure management, and environmental monitoring. Traditional TDOM generation methods require sophisticated processing steps, such as Digital Surface Model (DSM) construction and occlusion detection, which are computationally expensive and prone to errors. This work presents an alternative technique rooted in 2D Gaussian Splatting (2DGS), free of explicit DSM and occlusion detection. Through depth-map generation, spatial information is retrieved for every pixel within the TDOM, allowing the scene to be reconstructed with high precision. A divide-and-conquer strategy enables effective GS training and rendering of high-resolution TDOMs at a lower resource cost, preserving rendering quality on complex terrain and thin structures without a loss in efficiency. Experimental results demonstrate the efficiency of large-scale scene reconstruction and high-precision terrain modeling. This approach provides accurate spatial data, which assists users in better planning and decision-making based on maps.
In this study, we formulate the drone delivery problem as a control problem and solve it using Model Predictive Control (MPC). Two experiments are performed: the first in a less challenging grid-world environment with lower dimensionality, and the second with higher dimensionality and added complexity. The MPC method was benchmarked against three popular Multi-Agent Reinforcement Learning (MARL) methods: Independent $Q$-Learning (IQL), Joint Action Learners (JAL), and Value-Decomposition Networks (VDN). It was shown that the MPC method solved the problem more quickly and required a smaller number of drones to achieve a minimized cost and navigate the optimal path.
Reliable quantification of Ki-67, a key proliferation marker in breast cancer, is essential for molecular subtyping and informed treatment planning. Conventional approaches, including visual estimation and manual counting, suffer from interobserver variability and limited reproducibility. This study introduces an AI-assisted method using the YOLOv8 object detection framework for automated Ki-67 scoring. High-resolution digital images (40x magnification) of immunohistochemically stained tumor sections were captured from Ki-67 hotspot regions and manually annotated by a domain expert to distinguish Ki-67-positive and negative tumor cells. The dataset was augmented and divided into training (80%), validation (10%), and testing (10%) subsets. Among the YOLOv8 variants tested, the Medium model achieved the highest performance, with a mean Average Precision at 50% Intersection over Union (mAP50) exceeding 85% for Ki-67-positive cells. The proposed approach offers an efficient, scalable, and objective alternative to conventional scoring methods, supporting greater consistency in Ki-67 evaluation. Future directions include developing user-friendly clinical interfaces and expanding to multi-institutional datasets to enhance generalizability and facilitate broader adoption in diagnostic practice.
This paper introduces a multi-agent application system designed to enhance office collaboration efficiency and work quality. The system integrates artificial intelligence, machine learning, and natural language processing technologies, achieving functionalities such as task allocation, progress monitoring, and information sharing. The agents within the system are capable of providing personalized collaboration support based on team members' needs and incorporate data analysis tools to improve decision-making quality. The paper also proposes an intelligent agent architecture that separates Plan and Solver, and through techniques such as multi-turn query rewriting and business tool retrieval, it enhances the agent's multi-intent and multi-turn dialogue capabilities. Furthermore, the paper details the design of tools and multi-turn dialogue in the context of office collaboration scenarios, and validates the system's effectiveness through experiments and evaluations. Ultimately, the system has demonstrated outstanding performance in real business applications, particularly in query understanding, task planning, and tool calling. Looking forward, the system is expected to play a more significant role in addressing complex interaction issues within dynamic environments and large-scale multi-agent systems.
The ATLAS experiment is planning a complete replacement of its inner detector (ID) with a new all-silicon inner tracker (ITk) for the ATLAS Phase-2 upgrade. The ATLAS18 silicon strip sensors are designed to operate up to an integrated luminosity of 4000 fb$^{-1}$, which corresponds to a maximum fluence of $1.6 \times 10^{15} \, \text n_{\text{eq}} / \text{cm}^2$ (including a safety factor). To enhance the quality assurance (QA) program that monitors the key properties of the sensors, the strip sensor community is considering including the China Spallation Neutron Source (CSNS) as a proton irradiation site and the Institute of High Energy Physics (IHEP) as a QA test site. A total of 18 ATLAS18 ITk QA test pieces were irradiated with $6.0 \times 10^{14}$, $1.6 \times 10^{15}$, and $2.6 \times 10^{15} \, \text n_{\text{eq}} / \text{cm}^2$ protons at CSNS and measured at IHEP, including IV (leakage current-voltage), CV (capacitance-voltage) and CCE (charge collection efficiency) measurements. The upgraded irradiation setup at CSNS and the measurement setup at IHEP are described in this paper. Irradiated samples were exchanged between IHEP, Ljubljana and Birmingham to cross-check the CCE measurements.
In perioperative care, precise in-bed 3D patient pose and shape estimation (PSE) can be vital in optimizing patient positioning in preoperative planning, enabling accurate overlay of medical images for augmented reality-based surgical navigation, and mitigating risks of prolonged immobility during recovery. Conventional PSE methods relying on modalities such as RGB-D, infrared, or pressure maps often struggle with occlusions caused by bedding and complex patient positioning, leading to inaccurate estimation that can affect clinical outcomes. To address these challenges, we present the first multi-modal in-bed patient 3D PSE network that fuses detailed geometric features extracted from routinely acquired computed tomography (CT) scans with depth maps (mPSE-CT). mPSE-CT incorporates a shape estimation module that utilizes probabilistic correspondence alignment, a pose estimation module with a refined neural network, and a final parameters mixing module. This multi-modal network robustly reconstructs occluded body regions and enhances the accuracy of the estimated 3D human mesh model. We validated mPSE-CT using proprietary whole-body rigid phantom and volunteer datasets in clinical scenarios. mPSE-CT outperformed the best-performing prior method by 23% and 49.16% in pose and shape estimation respectively, demonstrating its potential for improving clinical outcomes in challenging perioperative environments.
This study evaluates the accuracy of nuclear fragmentation simulations using a quantum molecular dynamics (QMD) model based on relativistic mean field (RMF) theory for an energy range of 50-400 MeV/u, relevant to hadron therapy. A total of 16 parameter sets within the RMF framework are assessed based on their ability to reproduce ground-state properties such as the mean squared radius and binding energy, as obtained in QMD simulations. Among these, the NS2 parameter set is identified as the most suitable for describing stable nuclei over a wide mass range, with the use of an adaptive Gaussian wave packet width. Fragmentation cross sections of carbon ion projectiles on light nuclei targets (H, C, O, Al, Ti, and Cu) are simulated at incident energies of 50, 95, 290, and 400 MeV/u and compared with experimental data. The results indicate that the RQMD.RMF model reproduces fragmentation at the lower energies (50 and 95 MeV/u) better than the Light Ion QMD (LIQMD) model implemented in Geant4 version 11.2. At the higher energies (290 and 400 MeV/u), the RQMD.RMF model performs comparably to the LIQMD model. This study demonstrates that the RQMD.RMF model provides a reliable framework for analyzing nuclear fragmentation and holds potential for applications in the planning and quality assurance of hadron therapy.