diff --git a/docs/environments/all-envs.md b/docs/environments/all-envs.md
index d1f8c110..76d8fef1 100644
--- a/docs/environments/all-envs.md
+++ b/docs/environments/all-envs.md
@@ -15,18 +15,18 @@ firstpage:
 MOMAland includes environments taken from the MO/MARL literature, as well as multi-objective versions of environments from PettingZoo. More information are available in the TODO [MOMAland paper]().
-| Env | Cooperative/Adversarial | Obs/Action spaces | Objectives | Description |
-|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------|-------------------------|------------------------------------------------------------------||
-| [`catch-v0`](https://momaland.farama.org/environments/catch/) | Cooperative | Continuous / Continuous | `[distance_target, distance_other_drones]` | Agents must corner and catch a target drone while maintaining distance between themselves. |
-| [`escort-v0`](https://momaland.farama.org/environments/escort/) | Cooperative | Continuous / Continuous | `[distance_target, distance_other_drones]` | Agents must circle around a mobile target drone and escort it to its destination without breaking formation while maintaining distance between themselves. |
-| [`surround-v0`](https://momaland.farama.org/environments/surround/) | Cooperative | Continuous / Continuous | `[distance_target, distance_other_drones]` | Agents must surround a fixed target point while maintaining distance between themselves. |
-| [`mo-beach-v0`](https://momaland.farama.org/environments/mobeach/) | Any | Continuous / Discrete | `[occupation, mixture]` | Taken from [Mannion_2018](https://www.cambridge.org/core/journals/knowledge-engineering-review/article/reward-shaping-for-knowledgebased-multiobjective-multiagent-reinforcement-learning/75F1507F7CAC7C6625F87AE7CD344D52). MO-Beach is a game with two objectives, reflecting the enjoyment of tourists (agents) on their respective beach sections in terms of crowdedness and diversity of attendees. Each beach section is characterised by a capacity and each agent is characterised by a type. |
-| [`mo-breakthrough-v0`](https://momaland.farama.org/environments/mobreakthrough/) | Adversarial | Discrete / Discrete | `[win, fast win, capturing opponent's pieces, avoiding capture]` | Multi-objective version of the two-player, turn-based, board game [Breakthrough](https://en.wikipedia.org/wiki/Breakthrough_(board_game)). |
-| [`mo-congestion-v0`](https://momaland.farama.org/environments/mocongestion/) | Mixed | Continuous / Discrete | `[travel time, cost]` | MO-RouteChoice is a multi-objective extension of the route choice problem [Thomasini_2023](https://alaworkshop2023.github.io/papers/ALA2023_paper_69.pdf), where a number of self-interested drivers (agents) must navigate a road network. |
-| [`mo-connect4-v0`](https://momaland.farama.org/environments/moconnect4/) | Adversarial | Discrete / Discrete | `[win, fast win, [column #n]]` | MO version of [Connect 4](https://en.wikipedia.org/wiki/Connect_Four). Additional objectives are fast win and optionally one objective per column. |
-| [`mo-gem-mining-v0`](https://momaland.farama.org/environments/mogem_mining/) | Cooperative | Continuous / Discrete | `[#gems]` (configurable) | MO version of Gem Mining [Bargiacchi_2018](https://proceedings.mlr.press/v80/bargiacchi18a/bargiacchi18a.pdf). Agents go to different mines to extract different gems (objectives). There are restrictions on which mines can be reached for each agent. Agents also influence each other's producitivity. |
-| [`mo-ingenious-v0`](https://momaland.farama.org/environments/moingenious/) | Any | Discrete / Discrete | `[#colors]` (configurable) | MO adaptation of the zero-sum, turn-based board game [Ingenious](https://en.wikipedia.org/wiki/Ingenious_(board_game)). The game's original rules support 2-4 players collecting scores in multiple colors (objectives), with the goal of winning by maximizing the minimum score over all colors. |
-| [`mo-item-gathering-v0`](https://momaland.farama.org/environments/moitem_gathering/) | Adversarial | Discrete / Discrete | `[#objects]` (configurable) | Adapted from [Kallstrom_2019](https://www.diva-portal.org/smash/get/diva2:1362933/FULLTEXT01.pdf), is a multi-agent grid world, containing items of different colours. Each colour represents a different objective and the goal of the agents is to collect as many objects as possible. |
-| [`mo-multiwalker-stability-v0`](https://momaland.farama.org/environments/momultiwalker_stabilty/) | Cooperative | Continuous / Continuous | `[progress right, package stability]` | A MO version of [PZ's MultiWalker](https://pettingzoo.farama.org/environments/sisl/multiwalker/) introduced in [Gupta_2017](https://link.springer.com/chapter/10.1007/978-3-319-71682-4_5), where the agents also seek to keep the package steady. |
-| [`mo-pistonball-v0`](https://momaland.farama.org/environments/mopistonball/) | Cooperative | Continuous / Any | `[agent_#n_reward]` (configurable) | An MO version of [PZ's Pistonball](https://pettingzoo.farama.org/environments/butterfly/pistonball/) where the reward of each agent is kept separate. |
-| [`mo-same-game-v0`](https://momaland.farama.org/environments/mosame_game/) | Any | Discrete / Discrete | `[colors_n]` (configurable) | MO-SameGame is a multi-objective, multi-agent variant of the single-player, single-objective turn-based puzzle game called SameGame [Baier_2015](https://project.dke.maastrichtuniversity.nl/games/files/phd/Baier_thesis.pdf). The original single-player, single-objective SameGame rewards the player with $n^2$ points for removing any group of n tiles. MO-SameGame can extend this in two ways. Agents can either only get points for their own actions, leading to competition between them, or all rewards can be shared in ``team reward'' mode. Additionally, points for every colour can be counted as separate objectives, allowing for different trade-offs between colours, or they can be accumulated in a single objective like in the default game variant, essentially providing a single-objective wrapper for the game. |
+| Env | Cooperative/Adversarial | Obs/Action spaces | Objectives | Description |
+|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------|-------------------------|------------------------------------------------------------------||
+| [`mo-beach-v0`](https://momaland.farama.org/environments/mobeach/) | Any | Discrete / Discrete | `[occupation, mixture]` | Taken from [Mannion_2018](https://www.cambridge.org/core/journals/knowledge-engineering-review/article/reward-shaping-for-knowledgebased-multiobjective-multiagent-reinforcement-learning/75F1507F7CAC7C6625F87AE7CD344D52). MO-Beach is a game with two objectives, reflecting the enjoyment of tourists (agents) on their respective beach sections in terms of crowdedness and diversity of attendees. Each beach section is characterised by a capacity and each agent is characterised by a type. |
+| [`mo-item-gathering-v0`](https://momaland.farama.org/environments/moitem_gathering/) | Adversarial | Discrete / Discrete | `[#objects]` (configurable) | Adapted from [Kallstrom_2019](https://www.diva-portal.org/smash/get/diva2:1362933/FULLTEXT01.pdf), this is a multi-agent grid world containing items of different colours. Each colour represents a different objective, and the goal of the agents is to collect as many objects as possible. |
+| [`mo-gem-mining-v0`](https://momaland.farama.org/environments/mogem_mining/) | Cooperative | - / Discrete | `[#gems]` (configurable) | MO version of Gem Mining [Bargiacchi_2018](https://proceedings.mlr.press/v80/bargiacchi18a/bargiacchi18a.pdf). Agents go to different mines to extract different gems (objectives). There are restrictions on which mines can be reached for each agent. Agents also influence each other's productivity. |
+| [`mo-congestion-v0`](https://momaland.farama.org/environments/mocongestion/) | Adversarial | - / Discrete | `[travel time, cost]` | MO-RouteChoice is a multi-objective extension of the route choice problem [Thomasini_2023](https://alaworkshop2023.github.io/papers/ALA2023_paper_69.pdf), where a number of self-interested drivers (agents) must navigate a road network. |
+| [`mo-pistonball-v0`](https://momaland.farama.org/environments/mopistonball/) | Cooperative | Continuous / Any | `[agent_#n_reward]` (configurable) | An MO version of [PZ's Pistonball](https://pettingzoo.farama.org/environments/butterfly/pistonball/) where the reward of each agent is kept separate. |
+| [`mo-multiwalker-stability-v0`](https://momaland.farama.org/environments/momultiwalker_stabilty/) | Cooperative | Continuous / Continuous | `[progress right, package stability]` | An MO version of [PZ's MultiWalker](https://pettingzoo.farama.org/environments/sisl/multiwalker/) introduced in [Gupta_2017](https://link.springer.com/chapter/10.1007/978-3-319-71682-4_5), where the agents also seek to keep the package steady. |
+| [`catch-v0`](https://momaland.farama.org/environments/catch/) | Cooperative | Continuous / Continuous | `[distance_target, distance_other_drones]` | Agents must corner and catch a target drone while maintaining distance between themselves. |
+| [`escort-v0`](https://momaland.farama.org/environments/escort/) | Cooperative | Continuous / Continuous | `[distance_target, distance_other_drones]` | Agents must circle around a mobile target drone and escort it to its destination without breaking formation while maintaining distance between themselves. |
+| [`surround-v0`](https://momaland.farama.org/environments/surround/) | Cooperative | Continuous / Continuous | `[distance_target, distance_other_drones]` | Agents must surround a fixed target point while maintaining distance between themselves. |
+| [`mo-breakthrough-v0`](https://momaland.farama.org/environments/mobreakthrough/) | Adversarial | Discrete / Discrete | `[win, fast win, capturing opponent's pieces, avoiding capture]` | Multi-objective version of the two-player, turn-based, board game [Breakthrough](https://en.wikipedia.org/wiki/Breakthrough_(board_game)). |
+| [`mo-connect4-v0`](https://momaland.farama.org/environments/moconnect4/) | Adversarial | Discrete / Discrete | `[win, fast win, [column #n]]` | MO version of [Connect 4](https://en.wikipedia.org/wiki/Connect_Four). Additional objectives are fast win and optionally one objective per column. |
+| [`mo-ingenious-v0`](https://momaland.farama.org/environments/moingenious/) | Any | Discrete / Discrete | `[#colors]` (configurable) | MO adaptation of the zero-sum, turn-based board game [Ingenious](https://en.wikipedia.org/wiki/Ingenious_(board_game)). The game's original rules support 2-4 players collecting scores in multiple colors (objectives), with the goal of winning by maximizing the minimum score over all colors. In MO-Ingenious, we leave the utility wrapper up to the users and only return the vector of scores in each colour objective. |
+| [`mo-same-game-v0`](https://momaland.farama.org/environments/mosame_game/) | Any | Discrete / Discrete | `[colors_n]` (configurable) | MO-SameGame is a multi-objective, multi-agent variant of the single-player, single-objective turn-based puzzle game called SameGame [Baier_2015](https://project.dke.maastrichtuniversity.nl/games/files/phd/Baier_thesis.pdf). The original single-player, single-objective SameGame rewards the player with $n^2$ points for removing any group of $n$ tiles. MO-SameGame can extend this in two ways. Agents can either only get points for their own actions, leading to competition between them, or all rewards can be shared in "team reward" mode. Additionally, points for every colour can be counted as separate objectives, allowing for different trade-offs between colours, or they can be accumulated in a single objective like in the default game variant, essentially providing a single-objective wrapper for the game. |