
Multi-agent Path Finding

This Unity and ML-Agents project visualizes multi-agent pathfinding algorithms. Visualization is done in Unity, the algorithms are written in Python, and communication between Unity and the Python code is handled via ML-Agents.

There are two ways to work with the project:

  • using a Unity build
  • using the Unity editor

More details can be found in the Quick start section.

When using the Unity editor, note that the Unity version used in the project is 2021.3.7f1, together with ML-Agents release 17. When using Unity Hub, the setup of the Unity project and the corresponding packages is done automatically and you do not need to install or set up anything.

To run the Python algorithms it is important to use the correct versions of the Python packages. For the ML-Agents release used in this project, the required ml-agents packages are:

  • mlagents = "==0.26.0"
  • mlagents-envs = "==0.26.0"

For the existing Python algorithms we have added Pipfiles specifying the versions of all Python packages used.
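
For illustration, the relevant part of such a Pipfile looks roughly like this (a minimal sketch; the Pipfiles in the repository may pin additional packages):

[packages]
mlagents = "==0.26.0"
mlagents-envs = "==0.26.0"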

NOTE: If you are using Python 3.6 and encounter problems with pipenv, try a pipenv version <= 2022.4.8.

Extremely important: guidelines for incorporating your work in this project can be found in the Adding your own algorithms section. It is important that you do not alter the Unity code.

The repository has the following structure:

  • Algorithms: a folder containing algorithms written in Python that can be used in the different settings of the pathfinding problem; you can also add algorithms yourself.
  • PathFinding: the Unity code; this code should not be altered
  • VisualizeStats: code that you can use and/or adapt for statistics and visualizations

The Unity project consists of two scenes / problem settings which will be explained below.

  • Centralized
  • Decentralized

Quick start

Using a Unity build

The easiest way to start working with the project is to download the correct Unity build from the AI lab website. There are separate builds for the centralized and decentralized settings.

To run one of the examples in the Algorithms folder using the Unity build:

  • Locate the StreamingAssets folder in the downloaded build.
  • The file InputOutput.txt inside StreamingAssets specifies the file that will set up the environment and the file to which statistics should be written. These paths should be relative to the StreamingAssets folder. Do not change the name of InputOutput.txt or move it outside of StreamingAssets.
  • Open the Python project you want to run.
  • Make sure you have installed a Python virtual environment with the correct versions of the ml-agents packages (also specified in the Pipfile).
  • In the Python project where you set up the Unity environment, specify the path to the build:
UnityEnvironment(file_name=<path to build>, side_channels=[envChannel])
  • Run the main Python file. This will automatically open the Unity build and run the algorithm.
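
Putting these steps together, a minimal main file could start like this (a sketch: the build path is hypothetical, and EnvChannel is the side-channel helper used in the example algorithms):

from mlagents_envs.environment import UnityEnvironment

import EnvChannel  # side-channel helper, as in the example algorithms

# Hypothetical path to the downloaded build executable
BUILD_PATH = "Builds/Centralized/PathFinding.exe"

envChannel = EnvChannel.EnvChannel()
env = UnityEnvironment(file_name=BUILD_PATH, side_channels=[envChannel])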

Using the Unity editor

Open the Unity project folder PathFinding in Unity Hub. This is the folder that contains the subfolder Assets. The setup of the Unity project and the used packages is done automatically after opening the project. If necessary, Unity Hub will ask you to install the particular Unity editor version. Do not change the Unity version nor the versions of the packages that are being used.

After opening the project folder in Unity, you will see a 3D environment containing the two different scenes described below.

To run one of the examples in the Algorithms folder using the Unity editor:

  • Select the correct scene (centralized/decentralized) in the Unity editor.
  • Set up the configuration files in StreamingAssets which can be found in the Assets folder.
    • The file InputOutput.txt specifies the file that will set up the environment and the file to which statistics should be written. These paths should be relative to the StreamingAssets folder. Do not change the name of InputOutput.txt or move it outside of StreamingAssets.
  • Open the Python project you want to run.
  • Make sure you have installed a Python virtual environment with the correct versions of the ml-agents packages (also specified in the Pipfile).
  • In the Python project where you set up the Unity environment, do not specify a path for the build:
UnityEnvironment(file_name=None, side_channels=[envChannel])
  • Run the main Python file.
  • Press the play button in Unity.

Of course, when using the editor, you can also create a build yourself and use the build instead of the editor.

Overview of readme and next steps

More details about setting up the warehouse and running the simulation can be found in Initializing the warehouse and Running the environment.

Overview of this readme:

  • Problem definition
  • Initializing the warehouse
  • Running the environment
  • Metrics
  • Adding your own algorithms
  • Observations and actions
  • Example

Problem definition

Description

Given

  • a grid-based world of walls and tiles,
  • n robots,
  • n initial nodes (position in grid + rotation) for the robots and
  • n goal nodes (position in grid + rotation) for the robots (by default in the same order as the initial nodes)

plan a path for each robot from its initial to its goal node.

Centralized vs decentralized

Since we use ML-Agents to connect Unity with algorithms written in Python, a distinction has to be made between a centralized and a decentralized setting.

In the centralized setting, actions for all robots are created at once; in the decentralized setting, every robot requests and receives its actions separately.

Nodes

At every time step, each robot is in a node, which is a combination of a position and a rotation.

Positions in the warehouse are of the form (i,j) with i the row and j the column. The rotation is an integer in [0,360) where 0 corresponds to "north", 90 to "east", 180 to "south" and 270 to "west" and where the layout is of the form

(0,0) (0,1) ... (0,n)

(1,0) (1,1) ... (1,n)

...

(m,0) (m,1) ... (m,n)

Hence a robot at

  • location (1,1) and rotation 0 is facing location (0,1)
  • location (1,1) and rotation 90 is facing location (1,2)
  • location (1,1) and rotation 180 is facing location (2,1)
  • location (1,1) and rotation 270 is facing location (1,0)
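
This convention can be summarized in a small helper function (an illustrative sketch, not part of the project code):

# Maps a rotation (restricted to 0, 90, 180, 270 here) to a (row, column) offset
FACING_OFFSET = {0: (-1, 0), 90: (0, 1), 180: (1, 0), 270: (0, -1)}

def facing_cell(row, column, degrees):
    """Return the grid cell that a robot at (row, column) with this rotation faces."""
    d_row, d_column = FACING_OFFSET[degrees]
    return (row + d_row, column + d_column)

assert facing_cell(1, 1, 0) == (0, 1)    # facing "north"
assert facing_cell(1, 1, 90) == (1, 2)   # facing "east"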

A robot can rotate according to a unit of rotation; e.g. if this unit is 45 degrees, then at each location a robot has 8 possible rotations: 0, 45, 90, 135, 180, 225, 270, 315. If it is, for instance, at location (1,1) and rotation 45, it can move to

  • location (1,1) and rotation 0 and
  • location (1,1) and rotation 90

The idea behind using this unit of rotation is that the time it takes to move one tile forward without rotation (e.g. from (0,0), rotation 90 to (0,1), rotation 90) is equal to the time it takes to rotate over this unit of rotation.

We add the restriction that the unit of rotation must be a divisor of 90. This way a robot can move from one tile to another if its rotation is equal to 0,90,180 or 270.

Note: the Unity environment does not check which movements are "allowed". For instance, if you say that a robot should move from location (1,1) and rotation 0 to location (2,1) and rotation 0, the robot will make this move (as long as there are no obstacles). It will even move to a node that is not nearby, such as location (10,10) and rotation 0, in a single timestep. This also implies that you can use the environment even if you do not want to include rotations.

Initializing the warehouse

In the Unity editor select the correct scene:

  • Centralized
  • Decentralized

and specify where Unity can find the necessary input to set up the warehouse and where it should save statistics. You specify these paths in the file InputOutput.txt under /Pathfinding/Assets/StreamingAssets. Do not alter the name or the location of InputOutput.txt. The file must be in JSON format and must correspond to an instance of the class InputOutput.cs, i.e. you must use the following naming:

  • ConfigFile of type string: the name of the warehouse config file. This file must be placed under /Pathfinding/Assets/StreamingAssets.
  • StatsFile of type string: the name of the file to output the statistics to. The file will be generated under /Pathfinding/Assets/StreamingAssets.

A typical InputOutput file would look like this:

{
  "ConfigFile": "/ConfigurationFiles/config.txt",
  "StatsFile": "/Stats/stat.txt"
}

More information on which metrics are stored by the Unity environment can be found in section Metrics.

The file corresponding to ConfigFile must be in JSON format and must correspond to an instance of WarehouseParameters.cs. Examples can be found in ConfigurationFiles. These are the parameters you have to set:

  • Matrix of type int[][]: contains the grid based layout of the warehouse, the values that are accepted are as follows and can also be consulted in Constants.cs:

    • CodeWall: 1
    • CodeTile: 0
    • CodeTileArrowEast: 3
    • CodeTileArrowSouth: 4
    • CodeTileArrowWest: 5
    • CodeTileArrowNorth: 6

    We have walls, tiles and tiles with an arrow drawn on top.

  • Rotation of type int: this is the unit of rotation, i.e. the rotation in degrees that corresponds in time to moving forward (or backward) from one tile to an adjacent one

  • TimeRobotPerTile of type int: the time in ms that corresponds to moving one tile forward (or backward) or, equivalently, to rotating over one unit of rotation

  • NumberOfRobots of type int: the number of robots in the warehouse

  • InitialNodes of type Node[]: an array of initial nodes, the length of this array is equal to NumberOfRobots, InitialNodes[i] contains the initial node of robot i

  • GoalNodes of type Node[]: an array of goal nodes, the length of this array is equal to NumberOfRobots, if not otherwise specified GoalNodes[i] contains the goal node of robot i

Optionally you can also add the following field. If omitted, it will be set to False.

  • GoalNodesOrderNotFixed of type bool. If you set this property to True, your algorithm has to assign goal nodes to the different robots and you must send these assignments back to Unity. Important: these assignments must be sent back before the first action is sent to Unity!

Robots travel from node to node. These nodes are instances of the class Node.cs. Each node contains two properties

  • GridPosition: an instance of Position.cs corresponding to its position in the grid
  • Degree: an int corresponding to its rotation

The following checks are performed:

  • Matrix should have correct dimensions, i.e. every row has the same length and its entries should be valid integers.
  • Rotation should be a divisor of 90.
  • The length of InitialNodes and GoalNodes should be equal to NumberOfRobots.
  • The nodes in InitialNodes and GoalNodes should be valid nodes, i.e. the position may not correspond to a wall and the rotation should be a multiple of Rotation.
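
For illustration, a minimal config file passing these checks could look like the following sketch (a 3x4 warehouse with a single robot; all values are illustrative, and the node field names follow the JSON used in the goal-assignment example below):

{
  "Matrix": [[1, 1, 1, 1],
             [1, 0, 0, 1],
             [1, 1, 1, 1]],
  "Rotation": 90,
  "TimeRobotPerTile": 1000,
  "NumberOfRobots": 1,
  "InitialNodes": [{"GridPosition": {"Row": 1, "Column": 1}, "Degrees": 90}],
  "GoalNodes": [{"GridPosition": {"Row": 1, "Column": 2}, "Degrees": 90}]
}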

Running the environment

Using the Unity editor

See the quick start on how to run the environment.

You can change the camera view using the following keys:

  • W/S : closer/further
  • D/A : right/left
  • E/Q : down/up

You can also use the mouse to rotate.

Goal nodes will all get a different and random color and the robots will be colored in the same color as their assigned goal node.

If there is a collision between robots (or between a robot and a wall), a message "Collission" is shown and the simulation stops.

Metrics

In InputOutput.txt you have to specify to which file you want to write statistics when robots have reached their goal nodes.

Saving statistics is done by Statistics.cs. At every timestep the history is updated: for every robot we record whether it was at its goal node, not at its goal node, or at its goal node but received an action to leave it.

When all robots are at their goal locations, a JSON object containing the following elements will be saved to statsPath, the statistics file specified in InputOutput.txt.

  • elapsedTimeMilliSecondsUntilFirstAction of type float: total time in milliseconds between the moment the initial information on the environment is sent to Python and the moment the first action was sent back to Unity
  • makespan of type int: the length of the longest single-agent path, i.e. the maximum of the path lengths
  • sumOfCosts of type int: the sum of the lengths of all single-agent paths
  • lengthOfPaths of type Dictionary<int, int>: for every robotID, the number of actions received before arriving at its goal node
  • history of type Dictionary<int, Dictionary<string, List<int>>>: contains for each robot (instanceID) a dictionary with keys "all timesteps not at goal", "all timesteps at goal" and "timesteps not at goal node but previous node was goal node"

Of course you're allowed to add more metrics in your own local version.

Note that one timestep corresponds to TimeRobotPerTile.

In the Python project VisualizeStats you can find inspiration and examples on how to visualize some metrics using the above-mentioned JSON object.
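
As a starting point, the statistics file can be read back in Python as follows (a minimal sketch; the path matches the InputOutput example above, and JSON turns the integer keys of lengthOfPaths into strings):

import json

# Path follows the StatsFile entry in the InputOutput example above
with open("PathFinding/Assets/StreamingAssets/Stats/stat.txt") as f:
    stats = json.load(f)

print("makespan:", stats["makespan"])
print("sum of costs:", stats["sumOfCosts"])
for robot_id, length in stats["lengthOfPaths"].items():
    print(f"robot {robot_id} needed {length} actions")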

Adding your own algorithms

If you want to add your algorithm to this project, it is important that you do not alter the Unity code. You can submit changes to the Unity code through a merge request, but these will have to be approved, and approval will only happen if they do not change the environment and only address bugs or improve the overall flow.

You can add your own algorithms (written in Python) in the directory Algorithms. Two subdirectories corresponding to the two scenes have been added:

  • Centralized
  • Decentralized

In each subdirectory toy examples can be found. You can add a new algorithm by creating a new folder called /Algorithm_<description>. Add a readme that specifies your main file (see below for an example) and also contains a description of your algorithm.

Recall that it is important to use the correct version of the Python ML-agents packages.

Before integrating your changes into the main branch, check the following:

  1. For each new algorithm create a folder named /Algorithm_<description> in the correct subdirectory of Algorithms.
  2. Add a readme in the same folder /Algorithm_<description> that clearly documents the problem you are solving, the algorithm used, and the main file.
  3. Remember that you may not change anything in the Unity code.
  4. Before integrating your commits into the main branch, merge the main branch into your branch or rebase your branch onto the main branch.
  5. Test your algorithm using the configuration files in ConfigurationFiles.
  6. If nothing breaks, you can open a merge request.

Observations and actions

Observations and actions are fixed. An observation contains the current node of a robot. An action contains information about the next node. It is important to note that Unity does not check whether next nodes are reachable: if actions are incorrect, e.g. the next action is a node that is not reachable from the observed node, you will see weird things in Unity.

In the centralized versions, actions for all robots will be requested at the same time. In the decentralized versions, each robot will request individual actions and not necessarily at the same time.

An individual observation has size 4: an ID and a node. The order is fixed:

  • the instance ID (integer, uniquely generated by Unity)
  • the last seen row (integer)
  • the last seen column (integer)
  • the last seen degrees/rotation (integer)

Your algorithm then has to specify next actions. An individual action has size 4 and specifies the next node where the robot should move. The order is fixed:

  • the instance ID (integer)
  • the next row (integer)
  • the next column (integer)
  • the next degrees/rotation (integer)

In the centralized versions, observations sent to the Python code will have size 4*NumberOfRobots since they are grouped. Actions that have to be sent back must have size 4*NumberOfRobots.

In the decentralized versions, observations sent to the Python code will have size 4. Actions that have to be sent back must have size 4.
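
For illustration, an individual size-4 observation can be unpacked and a matching size-4 action built as follows (a sketch; observation is one individual observation received from Unity, and the chosen next node is arbitrary):

# Unpack one individual observation: [instanceID, row, column, degrees]
instance_id, row, column, degrees = (int(v) for v in observation)

# Build the corresponding action, here rotating in place by 90 degrees
action = [instance_id, row, column, (degrees + 90) % 360]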

Example

This section describes in detail how to add your own examples, a high level overview is given of how visualization and communication through ML-agents is done, but remember that you should not alter the Unity code.

You can also check toy examples that have been added for each scene. In these toy examples, if goal nodes have not been assigned yet in the configuration file, they are assigned in order. Next, a shortest path is computed for every robot but no check is done on collisions.

As a running example we use Toy example Centralized. However, the flow is the same in all toy examples and is as follows:

The file main.py sets up the side channel and the unity environment:

envChannel = EnvChannel.EnvChannel()
env = UnityEnvironment(file_name=None, side_channels=[envChannel])

If you are not using the Unity editor but a build, then you have to specify the path in file_name:

envChannel = EnvChannel.EnvChannel()
env = UnityEnvironment(file_name=<path to build>, side_channels=[envChannel])

It then sets up an instance of the class EnvironmentManager, which controls defining new actions and the communication with Unity:

envManager = EnvironmentManager.EnvironmentManager(envChannel.envData, env, envChannel)
  • envChannel.envData is an instance of EnvironmentData which holds all the info on the environment
  • env is the Unity environment
  • envChannel is the side channel that is used for the initial communication of environment info and also for communicating goal assignments

The initialization method of EnvironmentManager takes care of setting all attributes in envChannel.envData. This is done by calling self.reset(), which triggers the on_message_received method in EnvChannel; see also the ML-Agents documentation.

Communication via side channels

A side channel envChannel of type EnvSideChannel.cs is set up on the Unity side for sending initial information to the Python side. It is also used for sending information on the assignment of goal nodes back to Unity. In your Python folder you should have a file such as EnvChannel.py which processes the JSON object that is sent. Make sure that the UUID corresponds to ENV_SIDE_CHANNEL_ID in Constants.cs.

Initially the following information is sent from Unity to Python:

  • matrixLayout: matrix
  • rotationRobot: the unit of rotation
  • sizeOfAction: size of action for one robot, here this will always be 4
  • sizeOfObservation: size of observation for one robot, here this will always be 4
  • numberOfRobotsInEnv: number of robots in the environment
  • initial: array of initial nodes
  • goal: array of goal nodes
  • orderGoalNodesNotFixed: optional parameter that specifies whether goal nodes still have to be assigned (if True: the algorithm has to assign goal nodes using the side channel EnvChannel.py)
  • instanceIDS: the instance IDS of the robots

Arrays initial, goal and instanceIDS have size numberOfRobotsInEnv, and for every index i the instance ID instanceIDS[i] is the Unity instance ID of the robot that starts in initial node initial[i]. If orderGoalNodesNotFixed is false (the default), then its goal node is goal[i] and in Unity the robot will be colored in the same color as that goal node.

It is entirely up to you how you process this information (which is in JSON format). You cannot, however, change which information is sent; if you need extra information to be sent, please create a new issue or a merge request.
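
For orientation, a minimal EnvChannel.py could be structured as follows (a sketch based on the generic ML-Agents custom side-channel API; the UUID shown is a placeholder for the value of ENV_SIDE_CHANNEL_ID in Constants.cs, and in the toy examples envData is a dedicated EnvironmentData object rather than the plain dictionary used here):

import json
import uuid

from mlagents_envs.side_channel.side_channel import (
    IncomingMessage,
    OutgoingMessage,
    SideChannel,
)

class EnvChannel(SideChannel):
    def __init__(self):
        # Placeholder UUID: replace with ENV_SIDE_CHANNEL_ID from Constants.cs
        super().__init__(uuid.UUID("00000000-0000-0000-0000-000000000000"))
        self.envData = None

    def on_message_received(self, msg: IncomingMessage) -> None:
        # Unity sends the initial environment info as a JSON string
        self.envData = json.loads(msg.read_string())

    def send_string(self, data: str) -> None:
        # Queue a string (e.g. a goal assignment) to be sent to Unity
        msg = OutgoingMessage()
        msg.write_string(data)
        super().queue_message_to_send(msg)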

Assigning goal nodes to agents

If goal nodes still have to be assigned, the Python code has to assign a goal node to each of the robots. In this example we assign the goal nodes in the same order as the initial nodes.

In order to send this information back to Unity, we reuse the same side channel EnvChannel. Unity will then color the robots using the same color as their assigned goal node. It is very important to use the correct structure as specified on the Unity side in EnvSideChannel.cs:

private class GoalAssignment
{
    public int RobotID = -1; // instance ID of robot
    public Node GoalNode = null; // goal node that will be assigned
}

In our example envManager takes care of this by calling send_goal_nodes_to_unity() in its initialization method.

def send_goal_nodes_to_unity(self):
    for i in range(len(self.envData.goal_nodes)):
        goal_dict = {
            "RobotID": self.envData.instance_ids[i],
            "GoalNode": {
                "GridPosition": {
                    "Row": self.envData.goal_nodes[i]["GridPosition"]["Row"],
                    "Column": self.envData.goal_nodes[i]["GridPosition"]["Column"],
                },
                "Degrees": self.envData.goal_nodes[i]["Degrees"],
            },
        }
        self.envChannel.send_string(json.dumps(goal_dict))

For each goal assignment it uses a dictionary that contains the keys and values that GoalAssignment in EnvSideChannel.cs expects. Note that envChannel.send_string() accepts only one serialized goal_dict at a time: sending has to be done for each robot separately.

The queued assignments will be sent through the side channel the next time envManager.step() is called; see also the ML-Agents documentation.

Computing actions: the algorithm

In our main.py file we call

while simulation_going:
    envManager.step()

which checks whether there are new requests and then computes new actions. Currently, the control variable simulation_going is never set to False. If there is a new action available for a robot, it will be sent back; otherwise the current observation is sent back. You can change this behavior however you want, e.g. by calling envManager.close(), which terminates the communication. In Unity the robots will move towards the node specified in the action they received.

In our example we use precomputed paths. A dedicated data structure Node has been defined to represent nodes, and in the initialization method of EnvironmentManager a graph is created

self.graph = Graph(envData.matrix, envData.rotation, envData.can_go_backwards)

See Graph for implementation details. A path is then computed for each of the robots by calling

self.compute_shortest_paths()

and the paths are stored in the dictionary self.shortest_paths whose keys are the Unity robot IDs and the values the next nodes in the path.

Instance method step of EnvironmentManager then computes new actions and sends them to Unity. This is done as follows: Every time a request has been made, the information on the observation is stored in

decision_steps, terminal_steps = self.unityEnv.get_steps(self.behavior_name)
  • decision_steps.obs contains a list of numpy arrays containing the collected observations per group of agents.
  • decision_steps.agent_id contains a list with info on all agents that have requested an action; in particular, its length corresponds to the number of agents that have requested an action

The number of agents that have requested an action equals len(decision_steps.agent_id), and for each agent i in range(len(decision_steps.agent_id)) its observation can be found in

observation = decision_steps.obs[0][i]

Centralized

In the centralized versions, len(decision_steps.agent_id) will always be 1 since there is only one agent that collects all individual observations, and thus the observation can be found in

observation = decision_steps.obs[0][0]

In our toy example, we then extract the individual observations from observation and store them in a dictionary observations where the key is the instanceID and the value is a list containing the observation, i.e. the row, column and rotation:

observations = {}
for i in range(self.envData.number_of_robots):
    observation_i = observation[i * self.envData.size_of_observation:
                                i * self.envData.size_of_observation + self.envData.size_of_observation]
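    # observation_i[0] is the instance ID; observation_i[1:] is [row, column, degrees]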
    observations[observation_i[0]] = observation_i[1:]

Decentralized

In the decentralized versions, we can just work directly with decision_steps.obs[0][i].

Create actions to send via ML agents

In the toy example, the next action is found in the precomputed path.

We store all actions that need to be sent back to Unity in a list actions which will contain all individual actions. This list is not a nested list! In the centralized version its length will be self.envData.size_of_action * self.envData.number_of_robots. In the decentralized version its length will be self.envData.size_of_action * n, where n = len(decision_steps.agent_id) is the number of agents that have requested a decision.

Suppose we need to add the action for robot with ID robot_id_unity:

  • If the next action is going to node (next_action.x, next_action.y, next_action.g), then:
actions.extend([robot_id_unity, next_action.x, next_action.y, next_action.g])
  • If there are no more actions and obs contains the current observation [row, column, degrees] of the robot, then:
actions.extend([robot_id_unity, obs[0], obs[1], obs[2]])

Send actions back to Unity

In order to send these actions back to Unity, some manipulations have to be done; see also the ML-Agents documentation.

First we convert actions to a numpy array of the correct size: first dimension is the number of agents that requested a decision and the second is the action size for each of the agents:

  • The number of agents that have requested an action can be found in len(decision_steps.agent_id).
  • The action size is self.envData.size_of_action * self.envData.number_of_robots for centralized and self.envData.size_of_action for decentralized.

For centralized we have:

actions = np.array(actions)
actions.resize(len(decision_steps.agent_id), self.envData.size_of_action * self.envData.number_of_robots)

For decentralized we have:

actions = np.array(actions)
actions.resize(len(decision_steps.agent_id), self.envData.size_of_action)

Then we have to add the actions to an ActionTuple object as discrete actions:

action_tuple = ActionTuple()
action_tuple.add_discrete(actions)

Finally we set the actions and send them to Unity by calling the step method of self.unityEnv.

self.unityEnv.set_actions(self.behavior_name, action_tuple)
self.unityEnv.step() 

Recall that self.unityEnv.step() will also send queued information through the side channels.

Note on requesting actions

In the centralized approach, the script corresponding to the simulation manager extends the ML-Agents interface Agent. In the decentralized approach, the script corresponding to the robot extends Agent. Such an agent is an entity that observes its environment and executes the actions it receives from the external algorithm. In the centralized case it collects the observations of all robots and requests decisions for all of them at the same time. The corresponding GameObject must have a Behavior Parameters component added to it. This has already been set up in the Unity editor; if you haven't altered anything in the Unity editor, this should work correctly.