Recent advancements in large language models have demonstrated how chain-of-thought (CoT) and reinforcement learning (RL) can improve performance. However, applying such reasoning strategies to the visual generation domain remains largely unexplored. In this paper, we present T2I-R1, a novel reasoning-enhanced text-to-image generation model, powered by RL with a bi-level CoT reasoning process. Specifically, we identify two levels of CoT that can be utilized to enhance different stages of generation: (1) the semantic-level CoT for high-level planning of the prompt and (2) the token-level CoT for low-level pixel processing during patch-by-patch generation. To better coordinate these two levels of CoT, we introduce BiCoT-GRPO with an ensemble of generation rewards, which seamlessly optimizes both generation CoTs within the same training step. By applying our reasoning strategies to the baseline model, Janus-Pro, we achieve superior performance with 13% improvement on T2I-CompBench and 19% improvement on the WISE benchmark, even surpassing the state-of-the-art model FLUX.1. Code is available at: https://github.com/CaraJ7/T2I-R1
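As a rough sketch of the GRPO-style objective described above (our illustration, not T2I-R1's actual code), the group-relative advantage under an ensemble of generation rewards can be computed as follows; the reward functions named in the usage comment are hypothetical placeholders:

```python
import numpy as np

def grpo_advantages(images, prompt, reward_fns, weights=None):
    """Group-relative advantages from an ensemble of generation rewards.

    Each sampled image is scored by every reward model; the weighted
    ensemble reward is then standardized within the group, which is
    the GRPO-style baseline subtraction.
    """
    weights = weights or [1.0 / len(reward_fns)] * len(reward_fns)
    rewards = np.array([
        sum(w * fn(img, prompt) for w, fn in zip(weights, reward_fns))
        for img in images
    ])
    # Group-relative normalization: subtract group mean, divide by std.
    return (rewards - rewards.mean()) / (rewards.std() + 1e-8)

# Usage with hypothetical reward models (names are placeholders):
# advs = grpo_advantages(samples, "a red cube on a table",
#                        [clip_score, aesthetic_score, detector_score])
```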
A sense of touch that allows robots to detect contact and measure interaction forces enables them to perform challenging tasks such as grasping fragile objects or using tools. Tactile sensors can, in theory, equip robots with such capabilities. However, the accuracy of the measured forces is not on a par with that of dedicated force sensors, owing to calibration challenges and noise. This has limited the value these sensors can offer in manipulation applications that require force control. In this paper, we introduce GeoDEx, a unified estimation, planning, and control framework using geometric primitives such as planes, cones, and ellipsoids, which enables dexterous as well as extrinsic manipulation in the presence of uncertain force readings. Through various experimental results, we show that while relying directly on inaccurate and noisy force readings from tactile sensors results in unstable or failed manipulation, our method enables successful grasping and extrinsic manipulation of different objects. Additionally, compared to directly running optimization using second-order cone programming (SOCP), planning and force estimation using our framework achieves a 14x speed-up.
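Abstracting the idea (this is our illustration, not GeoDEx's API): testing whether an estimated contact force lies inside a friction-cone primitive is a closed-form check, in contrast to solving a full SOCP:

```python
import numpy as np

def in_friction_cone(f, normal, mu):
    """Check whether contact force f lies inside the friction cone with
    axis `normal` (unit vector) and friction coefficient mu.

    The cone constraint ||f_t|| <= mu * f_n is the geometric primitive;
    evaluating it needs only a dot product and a norm.
    """
    f_n = float(np.dot(f, normal))          # normal component
    if f_n <= 0.0:                          # pulling forces are infeasible
        return False
    f_t = f - f_n * normal                  # tangential component
    return np.linalg.norm(f_t) <= mu * f_n

print(in_friction_cone(np.array([0.2, 0.0, 1.0]),
                       np.array([0.0, 0.0, 1.0]), mu=0.5))  # True
```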
Deterministic partially observable Markov decision processes (DetPOMDPs) often arise in planning problems where the agent is uncertain about its environmental state but can act and observe deterministically. In this paper, we propose DetMCVI, an adaptation of the Monte Carlo Value Iteration (MCVI) algorithm for DetPOMDPs, which builds policies in the form of finite-state controllers (FSCs). DetMCVI solves large problems with a high success rate, outperforming existing baselines for DetPOMDPs. We also verify the performance of the algorithm in a real-world mobile robot forest mapping scenario.
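As a minimal sketch of how an FSC policy of the kind DetMCVI builds might be executed (the controller encoding and the `env.step` interface are assumptions, not the paper's):

```python
def run_fsc(fsc, start_node, env, max_steps=100):
    """Execute a finite-state controller encoded as
    fsc[node] = (action, {observation: next_node}).

    A hypothetical `env.step(action)` returns (observation, done);
    in a DetPOMDP both the transition and the observation are
    deterministic given the hidden state.
    """
    node = start_node
    for _ in range(max_steps):
        action, transitions = fsc[node]
        obs, done = env.step(action)
        if done:
            return True
        if obs not in transitions:          # controller has no edge: fail
            return False
        node = transitions[obs]
    return False
```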
Automated parking is a critical feature of Advanced Driver Assistance Systems (ADAS), where accurate trajectory prediction is essential to bridge perception and planning modules. Despite its significance, research in this domain remains relatively limited, with most existing studies concentrating on single-modal trajectory prediction of vehicles. In this work, we propose ParkDiffusion, a novel approach that predicts the trajectories of both vehicles and pedestrians in automated parking scenarios. ParkDiffusion employs diffusion models to capture the inherent uncertainty and multi-modality of future trajectories, incorporating several key innovations. First, we propose a dual map encoder that processes soft semantic cues and hard geometric constraints using a two-step cross-attention mechanism. Second, we introduce an adaptive agent type embedding module, which dynamically conditions the prediction process on the distinct characteristics of vehicles and pedestrians. Third, to ensure kinematic feasibility, our model outputs control signals that are subsequently used within a kinematic framework to generate physically feasible trajectories. We evaluate ParkDiffusion on the Dragon Lake Parking (DLP) dataset and the Intersection Drone (inD) dataset. Our work establishes a new baseline for heterogeneous trajectory prediction in parking scenarios, outperforming existing methods by a considerable margin.
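To illustrate the third innovation, a minimal kinematic-feasibility layer might integrate predicted control signals through a kinematic bicycle model; the model choice and parameter values are assumptions, not details from the paper:

```python
import math

def rollout_bicycle(x, y, yaw, v, controls, dt=0.1, wheelbase=2.7):
    """Integrate predicted controls (acceleration, steering angle)
    through a kinematic bicycle model so every output trajectory is
    physically feasible by construction.
    """
    traj = [(x, y)]
    for accel, steer in controls:
        x += v * math.cos(yaw) * dt
        y += v * math.sin(yaw) * dt
        yaw += v / wheelbase * math.tan(steer) * dt
        v += accel * dt
        traj.append((x, y))
    return traj

# e.g. gentle constant acceleration with a slight left turn:
path = rollout_bicycle(0.0, 0.0, 0.0, 2.0, [(0.5, 0.05)] * 20)
```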
Multimodal magnetic resonance imaging (MRI) constitutes the first line of investigation for clinicians in the care of brain tumors, providing crucial insights for surgery planning, treatment monitoring, and biomarker identification. Pre-training on large datasets has been shown to help models learn transferable representations and adapt with minimal labeled data. This behavior is especially valuable in medical imaging, where annotations are often scarce. However, applying this paradigm to multimodal medical data introduces a challenge: most existing approaches assume that all imaging modalities are available during both pre-training and fine-tuning. In practice, missing modalities often occur due to acquisition issues, specialist unavailability, or specific experimental designs on small in-house datasets. Consequently, a common approach involves training a separate model for each desired modality combination, making the process both resource-intensive and impractical for clinical use. Therefore, we introduce BM-MAE, a masked image modeling pre-training strategy tailored for multimodal MRI data. The same pre-trained model seamlessly adapts to any combination of available modalities, extracting rich representations that capture both intra- and inter-modal information. This allows fine-tuning on any subset of modalities without requiring architectural changes, while still benefiting from a model pre-trained on the full set of modalities. Extensive experiments show that the proposed pre-training strategy outperforms or remains competitive with baselines that require separate pre-training for each modality subset, while substantially surpassing training from scratch on several downstream tasks. Additionally, it can quickly and efficiently reconstruct missing modalities, highlighting its practical value. Code and trained models are available at: https://github.com/Lucas-rbnt/bmmae
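A minimal sketch of the modality-subset idea (our illustration, not BM-MAE's code): randomly dropping whole modalities during pre-training forces the encoder to cope with any available combination at fine-tuning time:

```python
import torch

def mask_modalities(volumes, keep_prob=0.5):
    """Randomly drop whole MRI modalities during pre-training.

    `volumes` is a dict like {"t1": tensor, "t1ce": tensor, ...};
    at least one modality is always kept. Names are illustrative.
    """
    names = list(volumes)
    keep = [n for n in names if torch.rand(()) < keep_prob]
    if not keep:                              # never drop everything
        keep = [names[int(torch.randint(len(names), ()))]]
    return {n: volumes[n] for n in keep}

batch = {m: torch.randn(1, 128, 128, 128)
         for m in ("t1", "t1ce", "t2", "flair")}
print(sorted(mask_modalities(batch)))         # a random non-empty subset
```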
Learning to solve complex tasks with signal temporal logic (STL) specifications is crucial to many real-world applications. However, most previous works only consider fixed or parametrized STL specifications due to the lack of a diverse STL dataset and of encoders that effectively extract temporal logic information for downstream tasks. In this paper, we propose TeLoGraF, Temporal Logic Graph-encoded Flow, which utilizes a Graph Neural Network (GNN) encoder and flow matching to learn solutions for general STL specifications. We identify four commonly used STL templates and collect a total of 200K specifications with paired demonstrations. We conduct extensive experiments in five simulation environments, ranging from simple dynamical models in 2D space to the high-dimensional 7-DoF Franka Panda robot arm and Ant quadruped navigation. Results show that our method outperforms other baselines in the STL satisfaction rate. Compared to classical STL planning algorithms, our approach is 10-100X faster in inference and can work with any system dynamics. We further show our graph-encoding method's capability to solve complex STLs and its robustness to out-of-distribution STL specifications. Code is available at https://github.com/mengyuest/TeLoGraF
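As a rough illustration of graph-encoding an STL formula for a GNN (the operator vocabulary and the nested-tuple encoding are our assumptions, not TeLoGraF's format):

```python
import torch

def stl_to_graph(formula):
    """Flatten a nested STL formula such as
    ("until", ("pred", 0), ("eventually", ("pred", 1)))
    into one-hot node features and parent->child edge indices.
    """
    ops = {"pred": 0, "not": 1, "and": 2, "or": 3,
           "eventually": 4, "always": 5, "until": 6}
    nodes, edges = [], []

    def visit(node):
        idx = len(nodes)
        nodes.append(ops[node[0]])
        for child in node[1:]:
            if isinstance(child, tuple):
                edges.append((idx, visit(child)))
        return idx

    visit(formula)
    x = torch.nn.functional.one_hot(torch.tensor(nodes), len(ops)).float()
    edge_index = torch.tensor(edges, dtype=torch.long).t()
    return x, edge_index
```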
Collaborative robots must continually adapt to novel tasks and user preferences without overburdening the user. While prior interactive robot learning methods aim to reduce human effort, they are typically limited to single-task scenarios and are not well-suited for sustained, multi-task collaboration. We propose COIL (Cost-Optimal Interactive Learning) -- a multi-task interaction planner that minimizes human effort across a sequence of tasks by strategically selecting among three query types (skill, preference, and help). When user preferences are known, we formulate COIL as an uncapacitated facility location (UFL) problem, which enables bounded-suboptimal planning in polynomial time using off-the-shelf approximation algorithms. We extend our formulation to handle uncertainty in user preferences by incorporating one-step belief space planning, which uses these approximation algorithms as subroutines to maintain polynomial-time performance. Simulated and physical experiments on manipulation tasks show that our framework significantly reduces the amount of work allocated to the human while maintaining successful task completion.
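A minimal sketch of the greedy style of off-the-shelf UFL approximation the paper can call on; the mapping of facilities to queries and clients to tasks is our reading of the abstract, not the paper's exact reduction:

```python
def greedy_ufl(open_cost, assign_cost):
    """Greedy heuristic for uncapacitated facility location: repeatedly
    open the facility that most reduces
    total cost = opening costs + each client's cheapest assignment.

    open_cost[f]: cost of opening facility f;
    assign_cost[f][c]: cost of serving client c from facility f.
    """
    n_f, n_c = len(open_cost), len(assign_cost[0])
    opened = set()

    def total(extra=None):
        fs = opened | ({extra} if extra is not None else set())
        if not fs:
            return float("inf")
        assign = sum(min(assign_cost[f][c] for f in fs) for c in range(n_c))
        return sum(open_cost[f] for f in fs) + assign

    while True:
        current = total()
        candidates = [f for f in range(n_f) if f not in opened]
        if not candidates or total(min(candidates, key=total)) >= current:
            return opened, current
        opened.add(min(candidates, key=total))
```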
Automated GUI agents aim to facilitate user interaction by automatically performing complex tasks in digital environments such as web, mobile, and desktop devices. Such an agent receives a textual task instruction and a GUI description, and generates executable actions (\emph{e.g.}, click) and operation boxes step by step. Training a GUI agent mainly involves grounding and planning stages: GUI grounding focuses on finding the execution coordinates for the task, while the planning stage aims to predict the next action based on historical actions. However, previous work suffers from insufficient training data for GUI grounding and neglects backtracking over historical behaviors in GUI planning. To address these challenges, we propose ScaleTrack, a training framework that scales grounding data and adds backtracking planning for automated GUI agents. We carefully collected GUI samples with different synthesis criteria from a wide range of sources and unified them into a single template for training GUI grounding models. Moreover, we design a novel training strategy that predicts the next action from the current GUI image while also backtracking the historical actions that led to that image. In this way, ScaleTrack explains the correspondence between GUI images and actions, which effectively describes the evolution rules of the GUI environment. Extensive experimental results demonstrate the effectiveness of ScaleTrack. Data and code will be available at url.
This paper reviews two established formulations for modelling multi-year energy investments: the simple method, which aggregates all capacity regardless of commissioning year, and the vintage method, which explicitly tracks investments by year to capture differences in technical parameters over time. While the vintage method improves modelling fidelity, it significantly increases model size. To address this, we propose a novel compact formulation that maintains the ability to represent year-specific characteristics while reducing the dimensionality of the model. The proposed compact formulation is implemented in the open-source model TulipaEnergyModel.jl and offers a tractable alternative for detailed long-term energy system planning.
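One way to write the contrast, in symbols of our own choosing (the paper's notation may differ): the simple method aggregates all vintages into one capacity variable per milestone year, while the vintage method keeps per-vintage dispatch so technical parameters such as efficiency can differ by commissioning year $v$:

```latex
% Simple: one aggregated capacity variable per milestone year y,
% all vintages share the same parameters (e.g., efficiency \eta):
C_y \;=\; C_y^{\mathrm{init}} + \sum_{\substack{v \le y \\ y < v + L}} x_v,
\qquad p_y \;\le\; \eta\, C_y

% Vintage: dispatch g_{v,y} is tracked per commissioning year v,
% so year-specific parameters \eta_v and lifetimes L_v can differ:
p_y \;=\; \sum_{v} \eta_v\, g_{v,y},
\qquad g_{v,y} \;\le\; x_v \quad \forall\, v \le y < v + L_v
```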
Surgery plays an important role in the treatment of neuroblastoma, a common pediatric cancer. This requires careful planning, often via magnetic resonance imaging (MRI)-based anatomical 3D models. However, creating these models is often time-consuming and user-dependent. We organized the Surgical Planning in Pediatric Neuroblastoma (SPPIN) challenge to stimulate developments on this topic and to set a benchmark for fully automatic segmentation of neuroblastoma on multi-modal MRI. The challenge started with a training phase, where teams received 78 sets of MRI scans from 34 patients, consisting of both diagnostic and post-chemotherapy MRI scans. The final test phase, consisting of 18 MRI sets from 9 patients, determined the ranking of the teams. Ranking was based on the Dice similarity coefficient (Dice score), the 95th percentile of the Hausdorff distance (HD95), and the volumetric similarity (VS). The SPPIN challenge was hosted at MICCAI 2023. The final leaderboard consisted of 9 teams. The highest-ranking team achieved a median Dice score of 0.82, a median HD95 of 7.69 mm, and a VS of 0.91, utilizing a large, pretrained network called STU-Net. A significant difference in segmentation results between diagnostic and post-chemotherapy MRI scans was observed (Dice = 0.89 vs Dice = 0.59, P = 0.01) for the highest-ranking team. SPPIN is the first medical segmentation challenge in extracranial pediatric oncology. The highest-ranking team used a large pre-trained network, suggesting that pretraining can be of use on small, heterogeneous datasets. Although the results of the highest-ranking team were high for most patients, segmentation of small, pre-treated tumors in particular was insufficient. Therefore, more reliable segmentation methods are needed to create clinically applicable models to aid surgical planning in pediatric neuroblastoma.
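For reference, two of the three ranking metrics have simple closed forms on binary masks; a sketch (HD95 additionally needs surface distances, e.g., via SciPy's distance transforms, and is omitted here):

```python
import numpy as np

def dice_and_vs(pred, truth):
    """Dice similarity coefficient and volumetric similarity for two
    binary segmentation masks of equal shape."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    a, b = pred.sum(), truth.sum()
    dice = 2.0 * inter / (a + b) if (a + b) else 1.0
    vs = 1.0 - abs(int(a) - int(b)) / (a + b) if (a + b) else 1.0
    return dice, vs
```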
This paper proposes an integrated approach for the safe and efficient control of mobile robots in dynamic and uncertain environments. The approach consists of two key steps: one-shot multimodal motion prediction to anticipate motions of dynamic obstacles, and model predictive control to incorporate these predictions into the motion planning process. Motion prediction is driven by an energy-based neural network that generates high-resolution, multi-step predictions in a single operation. The prediction outcomes are further utilized to create geometric shapes formulated as mathematical constraints. Instead of treating each dynamic obstacle individually, predicted obstacles are grouped by proximity in an unsupervised way to improve performance and efficiency. The overall collision-free navigation is handled by model predictive control with a specific design for proactive dynamic obstacle avoidance. The proposed approach allows mobile robots to navigate effectively in dynamic environments. Its performance is assessed across various scenarios that represent typical warehouse settings. The results demonstrate that the proposed approach outperforms other existing dynamic obstacle avoidance methods.
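A sketch of the proximity-grouping step under assumed parameters (DBSCAN is one common unsupervised choice; the paper may use a different method). Each group is wrapped in a bounding circle that becomes a single MPC collision constraint of the form ||p - c|| >= r:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def group_predicted_obstacles(points, eps=0.8):
    """Group predicted obstacle positions (N x 2 array) by proximity
    and wrap each group in a bounding circle (center, radius)."""
    labels = DBSCAN(eps=eps, min_samples=1).fit_predict(points)
    circles = []
    for k in np.unique(labels):
        cluster = points[labels == k]
        center = cluster.mean(axis=0)
        radius = np.linalg.norm(cluster - center, axis=1).max()
        circles.append((center, radius))
    return circles
```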
While game-theoretic planning frameworks are effective at modeling multi-agent interactions, they require solving optimization problems with hundreds or thousands of variables, resulting in long computation times that limit their use in large-scale, real-time systems. To address this issue, we propose PSN Game: a novel game-theoretic planning framework that reduces runtime by learning a Player Selection Network (PSN). A PSN outputs a player selection mask that distinguishes influential players from less relevant ones, enabling the ego player to solve a smaller, masked game involving only selected players. By reducing the number of variables in the optimization problem, PSN directly lowers computation time. The PSN Game framework is more flexible than existing player selection methods as it i) relies solely on observations of players' past trajectories, without requiring full state, control, or other game-specific information; and ii) requires no online parameter tuning. We train PSNs in an unsupervised manner using a differentiable dynamic game solver, with reference trajectories from full-player games guiding the learning. Experiments in both simulated scenarios and human trajectory datasets demonstrate that i) PSNs outperform baseline selection methods in trajectory smoothness and length, while maintaining comparable safety and achieving a 10x speedup in runtime; and ii) PSNs generalize effectively to real-world scenarios without fine-tuning. By selecting only the most relevant players for decision-making, PSNs offer a general mechanism for reducing planning complexity that can be seamlessly integrated into existing multi-agent planning frameworks.
This article deals with the description and recognition of fiber bundles, in particular nerves, in medical images, based on the anatomical description of the fiber trajectories. To this end, we propose a logical formalization of this anatomical knowledge. The intrinsically imprecise description of nerves, as found in anatomical textbooks, leads us to propose fuzzy semantics combined with first-order logic. We define a language representing spatial entities, relations between these entities and quantifiers. A formula in this language is then a formalization of the natural language description. The semantics are given by fuzzy representations in a concrete domain and satisfaction degrees of relations. Based on this formalization, a spatial reasoning algorithm is proposed for segmentation and recognition of nerves from anatomical and diffusion magnetic resonance images, which is illustrated on pelvic nerves in pediatric imaging, enabling surgeons to plan surgery.
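As a toy illustration of fuzzy semantics for such imprecise descriptions (the membership shape and the min-conjunction are standard choices, not necessarily the paper's exact semantics):

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership function: 0 below a, 1 on [b, c],
    back to 0 at d. A standard choice for fuzzy spatial relations."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

def satisfaction(degrees):
    """Fuzzy conjunction of satisfaction degrees via min, one common
    semantics for 'and' in fuzzy first-order logic."""
    return min(degrees)

# e.g. "close to structure A (distance in mm) and anterior to B":
deg = satisfaction([trapezoid(4.0, 0, 0, 3, 8),  # "close": fully true < 3 mm
                    0.9])                        # degree of "anterior to B"
```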
Solving problems related to planning and operations of large-scale power systems is challenging on classical computers due to their inherent nature as mixed-integer and nonlinear problems. Quantum computing provides new avenues to approach these problems. We develop a hybrid quantum-classical algorithm for the Unit Commitment (UC) problem in power systems, which aims at minimizing the total cost while optimally allocating generating units to meet the hourly power demand. The hybrid algorithm combines a variational quantum algorithm (VQA) with a classical Benders-type heuristic. The resulting algorithm computes approximate solutions to UC in three stages: i) a collection of UC vectors capable of meeting the power demand at the lowest possible operating cost is generated using the VQA; ii) a classical sequential least squares programming (SLSQP) routine is leveraged to find the optimal power level for a predetermined number of candidate vectors; iii) in the last stage, the approximate UC solution is returned along with the power levels of the committed generating units. To demonstrate the effectiveness of the presented method, three different systems with 3 generating units, 10 generating units, and 26 generating units were tested for different time periods. In addition, convergence of the hybrid quantum-classical algorithm for select time periods is demonstrated on IonQ's Forte system.
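Stage (ii) maps directly onto SciPy's SLSQP routine. A sketch with an assumed quadratic cost per unit; the coefficients and interface are illustrative, not the paper's:

```python
import numpy as np
from scipy.optimize import minimize

def dispatch(uc, demand, p_min, p_max, a, b, c):
    """Economic dispatch for a fixed unit-commitment vector `uc`
    (0/1 per unit) via SLSQP, with cost a_i p^2 + b_i p + c_i
    per committed unit and a power-balance equality constraint."""
    on = np.flatnonzero(uc)
    cost = lambda p: np.sum(a[on] * p**2 + b[on] * p + c[on])
    cons = [{"type": "eq", "fun": lambda p: p.sum() - demand}]
    bounds = [(p_min[i], p_max[i]) for i in on]
    p0 = np.array([(p_min[i] + p_max[i]) / 2 for i in on])
    res = minimize(cost, p0, method="SLSQP", bounds=bounds,
                   constraints=cons)
    return res.x, res.fun

# 3-unit toy system with units 1 and 3 committed:
# p_opt, total_cost = dispatch(np.array([1, 0, 1]), 150.0, ...)
```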
Electronic-structure theory is the foundation of the description of materials including multiscale modeling of their properties and functions. Obviously, without sufficient accuracy at the base, reliable predictions are unlikely at any level that follows. The software package FHI-aims has proven to be a game changer for accurate free-energy calculations because of its scalability, numerical precision, and its efficient handling of density functional theory (DFT) with hybrid functionals and van der Waals interactions. It treats molecules, clusters, and extended systems (solids and liquids) on an equal footing. Besides DFT, FHI-aims also includes quantum-chemistry methods, descriptions for excited states and vibrations, and calculations of various types of transport. Recent advancements address the integration of FHI-aims into an increasing number of workflows and various artificial intelligence (AI) methods. This Roadmap describes the state-of-the-art of FHI-aims and advancements that are currently ongoing or planned.
India is a founder-member country participating in the construction of the international multipurpose accelerator facility, the Facility for Antiproton and Ion Research (FAIR), at Darmstadt, Germany. Bose Institute, Kolkata, has been designated as the Indian shareholder of FAIR GmbH and the nodal Indian institution for coordinating Indian participation in the FAIR programme. Indian participation in FAIR is twofold. The first is the advancement of knowledge in nuclear astrophysics and reactions, high-energy nuclear physics, atomic \& plasma physics, and applications, through the participation of Indian researchers, engineers, and students in various experiments planned at FAIR. In addition, India is contributing high-tech accelerator equipment as an in-kind contribution to FAIR. Our active involvement includes the design, manufacture, and supply of in-kind accelerator items, e.g., power converters, vacuum chambers, beam catchers, and IT diagnostic cables, as well as coordinating the participation of Indian scientists in the FAIR experiments, including detector development, physics simulation, and experimental data analysis. Indian researchers have been participating in two major experiments at FAIR, Nuclear Structure, Astrophysics and Reactions (NUSTAR) and Compressed Baryonic Matter (CBM); in particular, Bose Institute is involved in the CBM experiment, which studies and characterizes the matter created in relativistic nucleus-nucleus collisions at high net baryon density and relatively moderate temperature. In this article, a brief overview of the FAIR facility, the experiments at FAIR, and Indian participation is presented.
This paper reviews discounting approaches for modeling multi-year energy investments, focusing on total versus annualised cost formulations. We discuss how time value of money is handled, and how salvage value and milestone-year weighting can address mismatches between asset lifetimes and model horizons. These methods are implemented in the open-source TulipaEnergyModel to support transparent and tractable long-term energy system planning.
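In symbols of our own choosing (the model's notation may differ), the annualised investment cost and a typical salvage-value correction for assets outliving the model horizon look like:

```latex
% Annualised investment cost with discount rate r and asset lifetime L:
A \;=\; I \cdot \frac{r(1+r)^{L}}{(1+r)^{L}-1}

% If an asset built in year v outlives the model horizon H, a salvage
% value credits the unused (discounted) fraction of its lifetime:
S_v \;=\; I \cdot
\frac{\sum_{t=H+1}^{v+L-1}(1+r)^{-t}}{\sum_{t=v}^{v+L-1}(1+r)^{-t}}
```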
This article proposes a new path planning method for addressing multi-level terrain situations. The proposed method includes innovations in three aspects: 1) the pre-processing of point cloud maps with a multi-level skip-list structure and data-slimming algorithm for well-organized and simplified map formalization and management, 2) the direct acquisition of local traversability indexes through vehicle and point cloud interaction analysis, which saves work in surface fitting, and 3) the assignment of traversability indexes on a multi-level connectivity graph to generate a weighted traversability graph for generally search-based path planning. The A* algorithm is modified to utilize the traversability graph to generate a short and safe path. The effectiveness and reliability of the proposed method are verified through indoor and outdoor experiments conducted in various environments, including multi-floor buildings, woodland, and rugged mountainous regions. The results demonstrate that the proposed method can properly address 3D path planning problems for ground vehicles in a wide range of situations.
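For concreteness, a compact version of search-based planning over a weighted traversability graph (standard A*; the paper's modification additionally derives the edge weights from the vehicle-point-cloud interaction analysis):

```python
import heapq

def a_star(graph, start, goal, heuristic):
    """A* over a weighted traversability graph:
    graph[u] = [(v, w), ...], where w blends edge length with the
    traversability index so low-traversability edges cost more."""
    g, parent = {start: 0.0}, {start: None}
    frontier = [(heuristic(start), start)]
    while frontier:
        _, u = heapq.heappop(frontier)
        if u == goal:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v, w in graph.get(u, []):
            if g[u] + w < g.get(v, float("inf")):
                g[v], parent[v] = g[u] + w, u
                heapq.heappush(frontier, (g[v] + heuristic(v), v))
    return None
```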
This study examines the psychological impact of energy crises on households, utilising the Perceived Stress Scale-10 (PSS-10) to measure the stress induced by disruptions in electricity, gas, and fuel supply and pricing. Through a multivariate analysis incorporating Ordinary Least Squares (OLS) regression, Simultaneous-Quantile Regressions (SQR), Random Forest (RF) and Ordered Probit models, the research identifies the key socio-demographic and environmental factors influencing household stress. Our findings reveal that urban residency, low-income households, older individuals, and those with low environmental awareness are particularly vulnerable to stress during energy crises. Regional disparities and attitudes towards nuclear and renewable energy also significantly shape stress responses. The study emphasises the need for psychologically-informed energy policy, advocating for the inclusion of stress metrics in energy planning to enhance resilience and address the multi-dimensional nature of energy insecurity. This research contributes a novel, human-centric perspective to energy policy, urging policymakers to integrate psychosocial resilience alongside traditional technical and economic considerations in the design of energy interventions.
Cloud computing has become a pivotal platform for executing scientific workflows due to its scalable and cost-effective infrastructure. Scientific Cloud Service Providers (SCSPs) act as intermediaries that rent virtual machines (VMs) from Infrastructure-as-a-Service (IaaS) providers to meet users' workflow execution demands. The SCSP earns a profit from executing a scientific workflow if it completes the workflow before its specified deadline. This paper addresses two key challenges that impact the profitability of SCSPs: the cold start problem and the efficient management of diverse VM pricing models, namely reserved, on-demand, and spot instances. We propose a hybrid scheduling framework that integrates initial planning based on historical data with real-time adaptations informed by actual workload variations. In the initial phase, VMs are provisioned using reserved pricing based on predicted workloads and spot instances. During execution, the system dynamically adjusts by provisioning additional VMs through on-demand or spot instances to accommodate unexpected bursts in task arrivals. Our framework also incorporates a dependency-aware task scheduling strategy that accounts for cold start delays and spot pricing volatility. Experimental results on real-world benchmark datasets demonstrate that our approach outperforms state-of-the-art methods, achieving up to 20% improvement over cold-start-focused techniques and 15% over pricing-model-based VM provisioning strategies.
Precise manipulation tasks require accurate knowledge of payload inertial parameters. Unfortunately, identifying these parameters for unknown payloads while ensuring that the robotic system satisfies its input and state constraints and avoids collisions with the environment remains a significant challenge. This paper presents an integrated framework that enables robotic manipulators to safely and automatically identify payload parameters while maintaining operational safety guarantees. The framework consists of two synergistic components: an online trajectory planning and control framework that generates provably-safe exciting trajectories for system identification, which can be tracked while respecting robot constraints and avoiding obstacles, and a robust system identification method that computes rigorous over-approximative bounds on end-effector inertial parameters under the assumption of bounded sensor noise. Experimental validation on a robotic manipulator performing challenging tasks with various unknown payloads demonstrates the framework's effectiveness in establishing accurate parameter bounds while maintaining safety throughout the identification process. The code is available at our project webpage: https://roahmlab.github.io/OnlineSafeSysID/.
Continuous Software Engineering (CSE) is widely adopted in the industry, integrating practices such as Continuous Integration and Continuous Deployment (CI/CD). Beyond technical aspects, CSE also encompasses business activities like continuous planning, budgeting, and operational processes. Coordinating these activities in large-scale product development involves multiple stakeholders, increasing complexity. This study aims to address this complexity by identifying and analyzing critical dependencies in large-scale CSE. Based on 17 semi-structured interviews conducted at two Nordic fintech companies, our preliminary findings indicate that dependencies between software teams and support functions, as well as between software teams and external entities, are the primary sources of delays and bottlenecks. As a next step, we plan to further refine our understanding of critical dependencies in large-scale CSE and explore coordination mechanisms that can better support software development teams in managing these challenges.
We introduce Phi-4-reasoning, a 14-billion parameter reasoning model that achieves strong performance on complex reasoning tasks. Trained via supervised fine-tuning of Phi-4 on a carefully curated set of "teachable" prompts, selected for the right level of complexity and diversity, and on reasoning demonstrations generated using o3-mini, Phi-4-reasoning generates detailed reasoning chains that effectively leverage inference-time compute. We further develop Phi-4-reasoning-plus, a variant enhanced through a short phase of outcome-based reinforcement learning that offers higher performance by generating longer reasoning traces. Across a wide range of reasoning tasks, both models outperform significantly larger open-weight models such as DeepSeek-R1-Distill-Llama-70B and approach the performance levels of the full DeepSeek-R1 model. Our comprehensive evaluations span benchmarks in math and scientific reasoning, coding, algorithmic problem solving, planning, and spatial understanding. Interestingly, we observe a non-trivial transfer of improvements to general-purpose benchmarks as well. In this report, we provide insights into our training data, our training methodologies, and our evaluations. We show that the benefit of careful data curation for supervised fine-tuning (SFT) extends to reasoning language models, and can be further amplified by reinforcement learning (RL). Finally, our evaluation points to opportunities for improving how we assess the performance and robustness of reasoning models.
Alzheimer's Disease (AD) is marked by significant inter-individual variability in its progression, complicating accurate prognosis and personalized care planning. This heterogeneity underscores the critical need for predictive models capable of forecasting patient-specific disease trajectories. Artificial Intelligence (AI) offers powerful tools to address this challenge by analyzing complex, multi-modal, and longitudinal patient data. This paper provides a comprehensive survey of AI methodologies applied to personalized AD progression prediction. We review key approaches including state-space models for capturing temporal dynamics, deep learning techniques like Recurrent Neural Networks for sequence modeling, Graph Neural Networks (GNNs) for leveraging network structures, and the emerging concept of AI-driven digital twins for individualized simulation. Recognizing that data limitations often impede progress, we examine common challenges such as high dimensionality, missing data, and dataset imbalance. We further discuss AI-driven mitigation strategies, with a specific focus on synthetic data generation using Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) to augment and balance datasets. The survey synthesizes the strengths and limitations of current approaches, emphasizing the trend towards multimodal integration and the persistent need for model interpretability and generalizability. Finally, we identify critical open challenges, including robust external validation, clinical integration, and ethical considerations, and outline promising future research directions such as hybrid models, causal inference, and federated learning. This review aims to consolidate current knowledge and guide future efforts in developing clinically relevant AI tools for personalized AD prognostication.
While most heuristics studied in heuristic search depend only on the state, some accumulate information during search and thus also depend on the search history. Various existing approaches use such dynamic heuristics in $\mathrm{A}^*$-like algorithms and appeal to classic results for $\mathrm{A}^*$ to show optimality. However, doing so ignores the complexities of searching with a mutable heuristic. In this paper we formalize the idea of dynamic heuristics and use them in a generic algorithm framework. We study a particular instantiation that models $\mathrm{A}^*$ with dynamic heuristics and show general optimality results. Finally we show how existing approaches from classical planning can be viewed as special cases of this instantiation, making it possible to directly apply our optimality results.
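A sketch of the subtlety: with a heuristic that mutates as search-history information accumulates, a plain A* loop must tolerate re-pushing nodes, and the classic optimality argument no longer applies for free, which is what the paper's framework formalizes. The code below is our illustration and makes no optimality claim:

```python
import heapq

def a_star_dynamic(graph, start, goal, h):
    """A* with a dynamic heuristic h(state, info) that may return
    different values as the mutable search-history `info` grows.
    Nodes are re-pushed whenever a cheaper g-value is found; stale
    queue entries are detected and skipped."""
    info = {"expansions": 0}               # mutable search history
    g = {start: 0.0}
    frontier = [(h(start, info), 0.0, start)]
    while frontier:
        f, g_u, u = heapq.heappop(frontier)
        if g_u > g.get(u, float("inf")):   # stale entry
            continue
        if u == goal:
            return g_u
        info["expansions"] += 1            # h may now change its answers
        for v, w in graph.get(u, []):
            if g_u + w < g.get(v, float("inf")):
                g[v] = g_u + w
                heapq.heappush(frontier, (g[v] + h(v, info), g[v], v))
    return None
```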
Efficient mission planning for cooperative systems involving Unmanned Aerial Vehicles (UAVs) and Unmanned Ground Vehicles (UGVs) requires addressing energy constraints, scalability, and coordination challenges between agents. UAVs excel in rapidly covering large areas but are constrained by limited battery life, while UGVs, with their extended operational range and capability to serve as mobile recharging stations, are hindered by slower speeds. This heterogeneity makes coordination between UAVs and UGVs critical for achieving optimal mission outcomes. In this work, we propose a scalable deep reinforcement learning (DRL) framework to address the energy-constrained cooperative routing problem for multi-agent UAV-UGV teams, aiming to visit a set of task points in minimal time with UAVs relying on UGVs for recharging during the mission. The framework incorporates sortie-wise agent switching to efficiently manage multiple agents, by allocating task points and coordinating actions. Using an encoder-decoder transformer architecture, it optimizes routes and recharging rendezvous for the UAV-UGV team in the task scenario. Extensive computational experiments demonstrate the framework's superior performance over heuristic methods and a DRL baseline, delivering significant improvements in solution quality and runtime efficiency across diverse scenarios. Generalization studies validate its robustness, while a dynamic scenario case study highlights its adaptability to real-time changes. This work advances UAV-UGV cooperative routing by providing a scalable, efficient, and robust solution for multi-agent mission planning.
We study a variant of LTLf synthesis that synthesizes adaptive strategies for achieving a multi-tier goal, consisting of multiple increasingly challenging LTLf objectives in nondeterministic planning domains. Adaptive strategies are strategies that at any point of their execution (i) enforce the satisfaction of as many objectives as possible in the multi-tier goal, and (ii) exploit possible cooperation from the environment to satisfy as many as possible of the remaining ones. This happens dynamically: if the environment cooperates (ii) and an objective becomes enforceable (i), then our strategies will enforce it. We provide a game-theoretic technique to compute adaptive strategies that is sound and complete. Notably, our technique is polynomial, in fact quadratic, in the number of objectives. In other words, it handles multi-tier goals with only a minor overhead compared to standard LTLf synthesis.
Mechanical search (MS) in cluttered environments remains a significant challenge for autonomous manipulators, requiring long-horizon planning and robust state estimation under occlusions and partial observability. In this work, we introduce XPG-RL, a reinforcement learning framework that enables agents to efficiently perform MS tasks through explainable, priority-guided decision-making based on raw sensory inputs. XPG-RL integrates a task-driven action prioritization mechanism with a learned context-aware switching strategy that dynamically selects from a discrete set of action primitives such as target grasping, occlusion removal, and viewpoint adjustment. Within this strategy, a policy is optimized to output adaptive threshold values that govern the discrete selection among action primitives. The perception module fuses RGB-D inputs with semantic and geometric features to produce a structured scene representation for downstream decision-making. Extensive experiments in both simulation and real-world settings demonstrate that XPG-RL consistently outperforms baseline methods in task success rates and motion efficiency, achieving up to 4.5$\times$ higher efficiency in long-horizon tasks. These results underscore the benefits of integrating domain knowledge with learnable decision-making policies for robust and efficient robotic manipulation.
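Our reading of the switching strategy, as a sketch (the primitive names, priority order, and threshold gating are assumptions based on the abstract):

```python
def select_primitive(scores, thresholds):
    """Priority-guided action selection: try primitives in a fixed
    task-driven priority order and fire the first one whose perception
    score clears its (policy-learned) adaptive threshold."""
    priority = ["grasp_target", "remove_occluder", "adjust_viewpoint"]
    for name in priority:
        if scores[name] >= thresholds[name]:
            return name
    return "adjust_viewpoint"              # fallback: gather a new view

action = select_primitive(
    scores={"grasp_target": 0.3, "remove_occluder": 0.8,
            "adjust_viewpoint": 0.5},
    thresholds={"grasp_target": 0.7, "remove_occluder": 0.6,
                "adjust_viewpoint": 0.4},
)   # -> "remove_occluder"
```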
We propose an opinion-driven navigation framework for multi-robot traversal through a narrow corridor. Our approach leverages a multi-agent decision-making model known as the Nonlinear Opinion Dynamics (NOD) to address the narrow corridor passage problem, formulated as a multi-robot navigation game. By integrating the NOD model with a multi-robot path planning algorithm, we demonstrate that the framework effectively reduces the likelihood of deadlocks during corridor traversal. To ensure scalability with an increasing number of robots, we introduce a game reduction technique that enables efficient coordination in larger groups. Extensive simulation studies are conducted to validate the effectiveness of the proposed approach.
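For intuition, one standard form of the NOD model (cf. Bizyaeva et al.); the gains and the two-robot example below are our assumptions, not the paper's configuration:

```python
import numpy as np

def nod_step(z, u, b, A, dt=0.01, d=1.0, alpha=1.2, gamma=0.8):
    """One Euler step of nonlinear opinion dynamics: opinions decay,
    saturate through tanh, and couple through adjacency matrix A.

        z_i' = -d z_i + u_i tanh(alpha z_i + gamma sum_j A_ij z_j) + b_i
    """
    return z + dt * (-d * z + u * np.tanh(alpha * z + gamma * A @ z) + b)

# Two robots with tiny opposite biases and antagonistic coupling
# break the deadlock by committing to opposite passing decisions:
z = np.zeros(2)
for _ in range(2000):
    z = nod_step(z, u=np.array([2.0, 2.0]), b=np.array([1e-3, -1e-3]),
                 A=np.array([[0.0, -1.0], [-1.0, 0.0]]))
print(np.sign(z))   # e.g. [ 1. -1.]: opposite sides of the corridor
```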
Access to high-quality medical data is often restricted due to privacy concerns, posing significant challenges for training artificial intelligence (AI) algorithms within Electronic Health Record (EHR) applications. In this study, prompt engineering with the GPT-4 API was employed to generate high-quality synthetic datasets aimed at overcoming this limitation. The generated data encompassed a comprehensive array of patient admission information, including healthcare provider details, hospital departments, wards, bed assignments, patient demographics, emergency contacts, vital signs, immunizations, allergies, medical histories, appointments, hospital visits, laboratory tests, diagnoses, treatment plans, medications, clinical notes, visit logs, discharge summaries, and referrals. To ensure data quality and integrity, advanced validation techniques were implemented, utilizing models such as BERT's Next Sentence Prediction for sentence coherence, GPT-2 for overall plausibility, RoBERTa for logical consistency, and autoencoders for anomaly detection, alongside a diversity analysis. Synthetic data that met all validation criteria were integrated into a comprehensive PostgreSQL database, serving as the data management system for the EHR application. This approach demonstrates that leveraging generative AI models with rigorous validation can effectively produce high-quality synthetic medical data, facilitating the training of AI algorithms while addressing privacy concerns associated with real patient data.
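As one concrete piece of such a validation pipeline, GPT-2 perplexity is a common plausibility proxy; the acceptance threshold below is an assumption, not the paper's criterion:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text):
    """Plausibility proxy for a synthetic clinical note: lower GPT-2
    perplexity suggests more fluent, plausible text."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss      # mean token NLL
    return torch.exp(loss).item()

note = "Patient admitted with chest pain; vitals stable on arrival."
if perplexity(note) < 80.0:                     # assumed cutoff
    print("accepted for the synthetic EHR database")
```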