Foundation Models to the Rescue: Deadlock Resolution in Multi-Robot Systems

Massachusetts Institute of Technology

Overview of the hierarchical control framework where an LLM-based high-level planner assigns a leader for a multi-robot system, resulting in a leader-follower formation.

Abstract

Multi-agent robotic systems (MRS) are prone to deadlocks in obstacle-cluttered environments, where the robots can get stuck away from their desired locations under a smooth low-level control policy. Without an external intervention, often in the form of a high-level command, a low-level control policy cannot resolve such deadlocks. Utilizing the generalizability and low data requirements of foundation models, this paper explores the possibility of using text-based models, i.e., large language models (LLMs), and text-and-image-based models, i.e., vision-language models (VLMs), for deadlock resolution. We propose a hierarchical control framework in which a foundation model-based high-level planner resolves deadlocks by assigning a leader for the MRS and a set of waypoints for that leader. Then, a low-level distributed control policy based on graph neural networks is executed. We conduct extensive experiments on various MRS environments using the best available pre-trained LLMs and VLMs. We compare their performance with a grid-based planner in terms of effectiveness in assisting the MRS to reach their goal locations and computational time. Our results illustrate that foundation models can assist MRS operating in complex obstacle-cluttered environments to resolve deadlocks efficiently. In particular, compared to grid-based planners, the foundation models perform better in terms of goal-reaching rate and computational time in complex environments.

Prompt engineering

User prompts used for querying LLMs are given below.
    • Task description: We are working with a multi-robot system navigating in an obstacle environment to reach their respective goal locations. The objective is to move the robots toward their goal locations while maintaining safety with obstacles, safety with each other, and inter-agent connectivity. Safety is based on agents maintaining a minimum inter-agent "safety radius" while connectivity is based on connected agents remaining within a "connectivity radius". Your role as the helpful assistant to provide a high-level command when the system gets stuck near obstacles. The high-level command is in terms of a leader assignment for the multi-robot system and a direction of motion for the leader. An optimal choice of leader and moving direction minimizes the traveling distance of agents toward their goals and maintains safety and connectivity.
    • Environment state: The multi-robot environment description consists of the tuple: (Number of agents, Safety radius, Connectivity radius, Agent locations, Agent goals, Obstacles, Number of waypoints). The environment consists of robot Agents with information ("AgentId"=id, "current state"=(x,y), "goal location"=(xg,yg)). The obstacles are represented as a bottom-left corner, its width, and height. The obstacles are represented as a list of tuples [(x,y,w,h)]. In addition, there are global environment variables "Number of agents" = N, "Safety radius" = r, "Connectivity radius" = R. The task is to provide a leader assignment for the multi-robot system and a set of waypoints for the leader. The leader assignment is an integer value in the range (1, Number of agents) and the waypoints for the leader are (x,y) coordinates. The number of waypoints is described by the variable "Number of waypoints" = M.
    • Desired output: The expected output is a JSON format file with the keys "Leader" and "Waypoints". The key "Leader" can take integer values in the range (1, Number of agents) and "Waypoints" are of the form [(x1, y1), (x2, y2), ..., (xM, yM)]. The waypoints are ordered in the sequence the leader should visit them. The first point should NOT be the current location of the leader. All the waypoints should be at least 2r distance from all the obstacles. The waypoints should be such that the leader can move toward its goal location while maintaining safety with the obstacles. The path connecting the leader and the waypoints should NOT intersect with any of the obstacles. The waypoints should be in the free space of the environment, away from ALL the known obstacles. The waypoints can be chosen to wrap around the obstacles to allow the leader to move toward its goal location while evading the obstacles. If the leader cannot move directly in the direction of its goal location, the first waypoint should be to the left or right of the leader to avoid obstacles. The consecutive waypoints should be such that the leader moves toward its goal location while maintaining safety with the obstacles.
    • Example Scenario: An example environment description is as follows.
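The snippet below is not the example scenario from the original prompt. It is only a minimal sketch of how the environment-state tuple described above (number of agents, safety radius, connectivity radius, agent locations, agent goals, obstacles, number of waypoints) might be rendered as text for an LLM query. The `Agent` class, the `describe_environment` function, and the exact line layout are hypothetical and chosen for illustration; only the quoted field names come from the prompt.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical container for one robot's entry in the environment state."""
    agent_id: int
    state: tuple  # current (x, y)
    goal: tuple   # goal (xg, yg)


def describe_environment(agents, obstacles, safety_radius,
                         connectivity_radius, num_waypoints):
    """Render the environment tuple as plain text (assumed layout, not the authors')."""
    lines = [
        f'"Number of agents" = {len(agents)}',
        f'"Safety radius" = {safety_radius}',
        f'"Connectivity radius" = {connectivity_radius}',
        f'"Number of waypoints" = {num_waypoints}',
    ]
    for a in agents:
        lines.append(
            f'"AgentId"={a.agent_id}, "current state"={a.state}, '
            f'"goal location"={a.goal}'
        )
    # Obstacles are (x, y, w, h): bottom-left corner, width, height.
    lines.append(f"Obstacles = {obstacles}")
    return "\n".join(lines)


if __name__ == "__main__":
    demo_agents = [Agent(1, (0.0, 0.0), (5.0, 5.0)),
                   Agent(2, (1.0, 0.0), (6.0, 5.0))]
    print(describe_environment(demo_agents, [(1.0, 1.0, 0.5, 2.0)], 0.1, 1.0, 3))
```

The numeric values in the demo are made up; in practice they would come from the simulator state at the moment a deadlock is detected.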
User prompts used for querying VLMs are given below.
    • Task description: We are working with a multi-robot system navigating in an obstacle environment to reach their respective goal locations. The objective is to move the robots toward their goal locations while maintaining safety with obstacles, safety with each other, and inter-agent connectivity. Safety is based on agents maintaining a minimum inter-agent "safety radius" while connectivity is based on connected agents remaining within a "connectivity radius". Your role as the helpful assistant to provide a high-level command when the system gets stuck near obstacles. The high-level command is in terms of a leader assignment for the multi-robot system and a direction of motion for the leader. An optimal choice of leader and moving direction minimizes the traveling distance of agents toward their goals and maintains safety and connectivity.
    • Environment state: The input image represents a grid world where the obstacles are given in black color. The location of the agents are given in blue color and the goal locations are given in green color. The task is to provide a high-level command in terms of a leader assignment for the multi-robot system and a set of waypoints for the leader. The leader assignment is an integer value in the range (1, Number of agents) and the waypoints for the leader are (x,y) coordinates. The number of waypoints is described by the variable "Number of waypoints" = M.
    • Desired output: The expected output is a JSON format file with the keys "Leader" and "Waypoints". The key "Leader" can take integer values in the range (1, Number of agents) and "Waypoints" are of the form [(x1, y1), (x2, y2), ..., (xM, yM)]. The leader should be assigned as the agent that can move freely in the environment. The leader should not be assigned to an agent that is blocked by obstacles or other agents. The waypoints are ordered in the sequence the leader should visit them. The first point should NOT be the current location of the leader. All the waypoints should be at least 2r distance from all the obstacles. The consecutive waypoints should be such that the leader moves toward its goal location. The waypoints should be such that the leader can move toward its goal location while maintaining safety with the obstacles. The path connecting the leader and the waypoints should NOT intersect with any of the obstacles. The waypoints should be in the free space of the environment, away from ALL the known obstacles. The obstacles can be chosen to wrap around the obstacles to allow the leader to move toward its goal location while evading the obstacles. The leader assignment is based on agent being able to freely move. That means there should be no obstacle or other agents in its path connected to its goal. If the leader cannot move directly in the direction of its goal location, the first waypoint should be to the left or right of the leader to avoid obstacles. The consecutive waypoints should be such that the leader moves toward its goal location while maintaining safety with the obstacles.
    • Example Scenario: An example environment description is as follows.
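The "Desired output" constraints in the prompts above are stated in natural language, so a model's reply may still violate them. The following is a minimal sketch, not part of the described framework, of how a returned JSON command could be checked against those constraints: leader index in range, M waypoints, first waypoint different from the leader's position, every waypoint at least 2r from each rectangular obstacle, and no path segment crossing an obstacle. The function names, the `agent_positions` dictionary, and the reading of "range (1, Number of agents)" as inclusive are assumptions.

```python
import json
import math


def point_rect_distance(p, rect):
    """Distance from point p to an axis-aligned rectangle (x, y, w, h); 0 if inside."""
    x, y, w, h = rect
    dx = max(x - p[0], 0.0, p[0] - (x + w))
    dy = max(y - p[1], 0.0, p[1] - (y + h))
    return math.hypot(dx, dy)


def segment_hits_rect(p, q, rect):
    """Slab clipping test: True if segment p-q touches the rectangle."""
    x, y, w, h = rect
    t0, t1 = 0.0, 1.0
    for d, lo, hi, start in ((q[0] - p[0], x, x + w, p[0]),
                             (q[1] - p[1], y, y + h, p[1])):
        if d == 0:
            if start < lo or start > hi:
                return False  # parallel to this slab and outside it
        else:
            ta, tb = (lo - start) / d, (hi - start) / d
            if ta > tb:
                ta, tb = tb, ta
            t0, t1 = max(t0, ta), min(t1, tb)
            if t0 > t1:
                return False
    return True


def validate_command(reply, agent_positions, n_waypoints, r, obstacles):
    """Return a list of violated constraints; an empty list means the reply passes."""
    cmd = json.loads(reply)
    problems = []
    leader = cmd.get("Leader")
    # Assumption: leader ids are 1..N inclusive.
    if leader not in agent_positions:
        return ["leader out of range"]
    waypoints = [tuple(w) for w in cmd.get("Waypoints", [])]
    if len(waypoints) != n_waypoints:
        problems.append("wrong number of waypoints")
    start = tuple(agent_positions[leader])
    if waypoints and waypoints[0] == start:
        problems.append("first waypoint equals leader position")
    for w in waypoints:
        if any(point_rect_distance(w, o) < 2 * r for o in obstacles):
            problems.append(f"waypoint {w} closer than 2r to an obstacle")
    path = [start] + waypoints
    for a, b in zip(path, path[1:]):
        if any(segment_hits_rect(a, b, o) for o in obstacles):
            problems.append(f"segment {a}->{b} intersects an obstacle")
    return problems
```

A reply that fails such a check could, for example, be discarded and the model queried again; whether the original experiments do anything of this kind is not stated here.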

Evaluations

Performance of various high-level planners for "Room" environments with $N=5$ agents (top plots), "Maze" environments with $N=25$ agents (middle plots), and "Maze" environments with $N=50$ agents (bottom plots). From left to right: 1) the bar shows the ratio of trajectories in which all agents reach their goals to the total number of trajectories, and the orange dot shows the ratio of agents that reach their goals to all agents; 2) box plot of the number of times the high-level planner intervened; 3) box plot of the time spent per high-level planner intervention; and 4) box plot of the input + output tokens per intervention. In the box plots, the median values are in orange and the mean values are in green.
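The two goal-reaching ratios in the first panel differ in what they count: the bar counts whole trajectories in which every agent succeeds, while the dot counts individual agents. A small sketch of both quantities, assuming a hypothetical input format with one list of per-agent booleans per trajectory:

```python
def goal_reaching_rates(reached):
    """reached: list of trajectories, each a list of booleans (one per agent).

    Returns (trajectory_rate, agent_rate):
      trajectory_rate = trajectories where all agents reached their goals / all trajectories
      agent_rate      = agents that reached their goals / all agents
    """
    n_trajectories = len(reached)
    n_agents = sum(len(t) for t in reached)
    trajectory_rate = sum(all(t) for t in reached) / n_trajectories
    agent_rate = sum(sum(t) for t in reached) / n_agents
    return trajectory_rate, agent_rate


if __name__ == "__main__":
    # Two trajectories with two agents each; one agent fails in the second.
    print(goal_reaching_rates([[True, True], [True, False]]))
```

In this made-up example half of the trajectories succeed fully, while three of the four agents reach their goals, so the agent-level ratio is never lower than the trajectory-level one.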
Comparison of distance traveled by agents under various high-level planners.
LLM-based high-level planner in action. The LLM suggests a leader and a direction for it to move along when the agents get stuck in a deadlock.
Ablation on environment information: Performance of various high-level planners for "Maze" environments with $N=50$ agents with all known environment information and with partial information (the case with all known environment information is indicated with the suffix "-All", e.g., "GPT4-VLM-All").
Ablation on number of leaders: Performance of the Claude3-Sonnet-VLM planner for "Maze" environments with $N=50$ agents and of the GPT3.5-LLM planner for "Maze" environments with $N=25$ agents, with single-leader and multi-leader assignment (the single-leader case is indicated with the suffix "-One" and the multi-leader case with "-Multi").