MOMAland is an open-source Python library for developing and comparing multi-objective multi-agent reinforcement learning (MOMARL) algorithms. It provides a standard API for communication between learning algorithms and environments, as well as a standard set of environments compliant with that API. Essentially, the environments follow the standard PettingZoo APIs, but return vectorized rewards as NumPy arrays instead of scalar values.
The documentation website is at https://momaland.farama.org/, and we have a public discord server (which we also use to coordinate development work) that you can join here.
MOMAland includes environments taken from the MOMARL literature, as well as multi-objective versions of classical environments, such as SISL or Butterfly. The full list of environments is available at https://momaland.farama.org/environments/all-envs/.
To install MOMAland, use:
pip install momaland
This does not include dependencies for all components of MOMAland (not everything is required for the basic usage, and some can be problematic to install on certain systems).
pip install "momaland[testing]" to install dependencies for API testing.
pip install "momaland[learning]" to install dependencies for the supplied learning algorithms.
pip install "momaland[all]" for all dependencies for all components.
Similar to PettingZoo, the MOMAland API models environments as simple Python env classes. Creating environment instances and interacting with them is straightforward. Here is an example using the "momultiwalker_v0" environment:
import numpy as np

from momaland.envs.momultiwalker import momultiwalker_v0 as _env
# Assumption: LinearizeReward is importable from this module; adjust if your version differs
from momaland.utils.aec_wrappers import LinearizeReward

# .env() function will return an AEC environment, as per PZ standard
env = _env.env(render_mode="human")

env.reset(seed=42)
for agent in env.agent_iter():
    # vec_reward is a numpy array
    observation, vec_reward, termination, truncation, info = env.last()

    if termination or truncation:
        action = None
    else:
        action = env.action_space(agent).sample()  # this is where you would insert your policy

    env.step(action)
env.close()

# optionally, you can scalarize the reward with weights
# Making the vector reward a scalar reward to shift to single-objective multi-agent (aka PettingZoo)
# We can assign different weights to the objectives of each agent.
weights = {
    "walker_0": np.array([0.1, 0.7, 0.2]),
    "walker_1": np.array([0.6, 0.1, 0.3]),
    "walker_2": np.array([0.2, 0.2, 0.6]),
}
env = LinearizeReward(env, weights)
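To make the scalarization step concrete, here is a minimal sketch of the arithmetic behind a linear scalarization: each agent's vector reward is collapsed into a scalar by a weighted sum. This is an illustration only and not MOMAland's wrapper implementation; the `linearize` helper and the sample reward values are hypothetical.

```python
import numpy as np


def linearize(vec_reward, weights):
    """Hypothetical helper: weighted sum of a reward vector (linear scalarization)."""
    vec_reward = np.asarray(vec_reward, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if vec_reward.shape != weights.shape:
        raise ValueError("reward vector and weights must have the same shape")
    return float(np.dot(weights, vec_reward))


# Per-agent weights, same as in the example above
weights = {
    "walker_0": np.array([0.1, 0.7, 0.2]),
    "walker_1": np.array([0.6, 0.1, 0.3]),
    "walker_2": np.array([0.2, 0.2, 0.6]),
}

# Made-up vector rewards for one step (illustrative values, not environment output)
vec_rewards = {
    "walker_0": np.array([1.0, -2.0, 0.5]),
    "walker_1": np.array([1.0, -2.0, 0.5]),
    "walker_2": np.array([1.0, -2.0, 0.5]),
}

# Each agent sees a different scalar because its weights differ
scalar_rewards = {
    agent: linearize(vec_rewards[agent], weights[agent]) for agent in weights
}
print(scalar_rewards)
```

With identical vector rewards, the three agents still receive different scalar rewards because each one weights the objectives differently.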
For details on multi-objective multi-agent RL definitions, see Multi-Objective Multi-Agent Decision Making: A Utility-based Analysis and Survey.
You can also check more examples in this colab notebook!
We provide a set of learning algorithms that are compatible with the MOMAland environments. The learning algorithms are implemented in the learning/ directory. To keep everything as self-contained as possible, each algorithm is implemented in a single file (close to cleanRL's philosophy).
Nevertheless, we reuse tools provided by other libraries, like multi-objective evaluations and performance indicators from MORL-Baselines.
Here is a list of algorithms that are currently implemented:
Name | Single/Multi-policy | Reward | Utility | Observation space | Action space | Paper |
---|---|---|---|---|---|---|
MOMAPPO (OLS) continuous, discrete | Multi | Team | Team / Linear | Any | Any | |
MOMAland keeps strict versioning for reproducibility reasons. All environments end in a suffix like "_v0". When changes are made to environments that might impact learning results, the number is increased by one to prevent potential confusion.
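As a small illustration of this naming convention, the sketch below splits an environment identifier into its base name and version number. The `split_env_version` helper is hypothetical and not part of MOMAland; it only assumes the "_v<number>" suffix described above.

```python
import re


def split_env_version(env_id: str) -> tuple[str, int]:
    """Hypothetical helper: split 'name_vN' into ('name', N)."""
    match = re.fullmatch(r"(?P<name>.+)_v(?P<version>\d+)", env_id)
    if match is None:
        raise ValueError(f"{env_id!r} does not end in a '_v<number>' suffix")
    return match.group("name"), int(match.group("version"))


name, version = split_env_version("momultiwalker_v0")
print(name, version)
```

Recording the full identifier, suffix included, alongside experiment results makes it clear which revision of an environment produced them.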
We have a roadmap for future development available here: TODO.
Project Managers: TODO
Maintenance for this project is also contributed by the broader Farama team: farama.org/team.
If you use this repository in your research, please cite:
@inproceedings{TODO}
Clone the repo and run pre-commit install to set up the pre-commit hooks.