GitHub - zxzzz0/DI-engine: OpenDILab Decision AI Engine

Updated on 2021.09.30 DI-engine-v0.2.0 (beta)

Introduction to DI-engine (beta)

DI-engine is a generalized Decision Intelligence engine. It supports most basic deep reinforcement learning (DRL) algorithms, such as DQN, PPO, and SAC, as well as domain-specific algorithms such as QMIX in multi-agent RL, GAIL in inverse RL, and RND in exploration problems. Various training pipelines and customized decision AI applications are also supported. Have fun with exploration and exploitation.

Application

Environment

GoBigger

System Optimization and Design

Other

Installation

You can simply install DI-engine from PyPI with the following command:

pip install DI-engine

If you use Anaconda or Miniconda, you can install DI-engine with conda from the opendilab channel through the following command:

conda install -c opendilab di-engine

For more information about installation, refer to the installation guide.
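After installing, it can be useful to confirm that the package is visible to the current Python environment. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and it assumes the distribution is registered under the name `DI-engine`, as in the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "DI-engine") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    The default name "DI-engine" is taken from the pip command in this README;
    a conda install may register the package under a different name.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("DI-engine is not installed in this environment")
    else:
        print(f"DI-engine {version} is installed")
```

If the function returns `None` after a conda install, checking with `conda list` is a reasonable next step, since the registered name may differ.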

Our DockerHub repo can be found here. We provide a base image and env images with common RL environments.

base: opendilab/ding:nightly
atari: opendilab/ding:nightly-atari
mujoco: opendilab/ding:nightly-mujoco
smac: opendilab/ding:nightly-smac
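The image tags above can be kept in a small lookup table when scripting environment setup. This is only a sketch: the tags are copied from the list above, while the `DING_IMAGES` mapping and the `pull_command` helper are hypothetical names introduced for illustration. The function only builds the `docker pull` argument list and does not run it.

```python
from typing import Dict, List

# Image tags copied verbatim from the list in this README.
DING_IMAGES: Dict[str, str] = {
    "base": "opendilab/ding:nightly",
    "atari": "opendilab/ding:nightly-atari",
    "mujoco": "opendilab/ding:nightly-mujoco",
    "smac": "opendilab/ding:nightly-smac",
}


def pull_command(env: str) -> List[str]:
    """Build the `docker pull` argument list for one of the listed images."""
    if env not in DING_IMAGES:
        known = ", ".join(sorted(DING_IMAGES))
        raise KeyError(f"unknown image '{env}', expected one of: {known}")
    return ["docker", "pull", DING_IMAGES[env]]


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for name in DING_IMAGES:
        print(" ".join(pull_command(name)))
```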

Documentation

The detailed documentation is hosted on doc (Chinese documentation).

Quick Start

3 Minutes Kickoff

3 Minutes Kickoff(colab)

3 Minutes Kickoff, Chinese version (kaggle)

Bonus: train an RL agent with one line of code:

ding -m serial -e cartpole -p dqn -s 0
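When launching the one-line command from a script, it may help to assemble it from named parts. The sketch below assumes that `-m` selects the pipeline mode, `-e` the environment, `-p` the policy, and `-s` the random seed; that reading is inferred from the command above rather than from the CLI reference. The helper names `build_ding_command` and `run_ding` are hypothetical.

```python
import shlex
import shutil
import subprocess
from typing import List


def build_ding_command(mode: str, env: str, policy: str, seed: int) -> List[str]:
    """Assemble the argument list for the one-line `ding` invocation."""
    return ["ding", "-m", mode, "-e", env, "-p", policy, "-s", str(seed)]


def run_ding(cmd: List[str], dry_run: bool = True) -> int:
    """Print the command and, unless dry_run is set, execute it.

    Returns the process exit code (0 for a dry run).
    """
    print(shlex.join(cmd))
    if dry_run:
        return 0
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"'{cmd[0]}' was not found on PATH")
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Dry run: only shows the command that would be executed.
    run_ding(build_ding_command("serial", "cartpole", "dqn", 0))
```

Passing the arguments as a list avoids shell quoting problems, and the dry-run default keeps the sketch safe to execute on a machine without DI-engine installed.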

Feature

Algorithm Versatility

No	Algorithm	Doc and Implementation	Runnable Demo
1	DQN	DQN Chinese doc policy/dqn	python3 -u cartpole_dqn_main.py / ding -m serial -c cartpole_dqn_config.py -s 0
2	C51	policy/c51	ding -m serial -c cartpole_c51_config.py -s 0
3	QRDQN	policy/qrdqn	ding -m serial -c cartpole_qrdqn_config.py -s 0
4	IQN	policy/iqn	ding -m serial -c cartpole_iqn_config.py -s 0
5	Rainbow	policy/rainbow	ding -m serial -c cartpole_rainbow_config.py -s 0
6	SQL	policy/sql	ding -m serial -c cartpole_sql_config.py -s 0
7	R2D2	policy/r2d2	ding -m serial -c cartpole_r2d2_config.py -s 0
8	A2C	policy/a2c	ding -m serial -c cartpole_a2c_config.py -s 0
9	PPO/MAPPO	policy/ppo	python3 -u cartpole_ppo_main.py / ding -m serial_onpolicy -c cartpole_ppo_config.py -s 0
10	PPG	policy/ppg	python3 -u cartpole_ppg_main.py
11	ACER	policy/acer	ding -m serial -c cartpole_acer_config.py -s 0
12	IMPALA	policy/impala	ding -m serial -c cartpole_impala_config.py -s 0
13	DDPG/PADDPG	policy/ddpg	ding -m serial -c pendulum_ddpg_config.py -s 0
14	TD3	policy/td3	python3 -u pendulum_td3_main.py / ding -m serial -c pendulum_td3_config.py -s 0
15	D4PG	policy/d4pg	python3 -u pendulum_d4pg_config.py
16	SAC	policy/sac	ding -m serial -c pendulum_sac_config.py -s 0
17	PDQN	policy/pdqn	ding -m serial -c gym_hybrid_pdqn_config.py -s 0
18	QMIX	policy/qmix	ding -m serial -c smac_3s5z_qmix_config.py -s 0
19	COMA	policy/coma	ding -m serial -c smac_3s5z_coma_config.py -s 0
20	QTran	policy/qtran	ding -m serial -c smac_3s5z_qtran_config.py -s 0
21	WQMIX	policy/wqmix	ding -m serial -c smac_3s5z_wqmix_config.py -s 0
22	CollaQ	policy/collaq	ding -m serial -c smac_3s5z_collaq_config.py -s 0
23	GAIL	reward_model/gail	ding -m serial_gail -c cartpole_dqn_gail_config.py -s 0
24	SQIL	entry/sqil	ding -m serial_sqil -c cartpole_sqil_config.py -s 0
25	DQFD	policy/dqfd	ding -m serial_dqfd -c cartpole_dqfd_config.py -s 0
26	HER	reward_model/her	python3 -u bitflip_her_dqn.py
27	RND	reward_model/rnd	python3 -u cartpole_ppo_rnd_main.py
28	CQL	policy/cql	python3 -u d4rl_cql_main.py
29	TD3BC	policy/td3_bc	python3 -u mujoco_td3_bc_main.py
30	MBPO	model/template/model_based/mbpo	python3 -u sac_halfcheetah_mopo_default_config.py
31	PER	worker/replay_buffer	`rainbow demo`
32	GAE	rl_utils/gae	`ppo demo`

means discrete action space, which is only labeled for the normal DRL algorithms (1-16)

means continuous action space, which is only labeled for the normal DRL algorithms (1-16)

means hybrid (discrete + continuous) action space (1-16)

means distributed training (collector-learner parallel) RL algorithm

means multi-agent RL algorithm

means RL algorithm which is related to exploration and sparse reward

means Imitation Learning, including Behaviour Cloning, Inverse RL, Adversarial Structured IL

means offline RL algorithm

means model-based RL algorithm

means other sub-direction algorithms, usually used as plug-ins in the whole pipeline

P.S. The .py files in Runnable Demo can be found in dizoo.
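Many of the demos in the table share the form `ding -m <mode> -c <config> -s <seed>`, so repeating a run over several seeds can be scripted. The sketch below only builds the command lists. The helper name `config_commands` is hypothetical, and it assumes the config file is reachable from the working directory (for example, after changing into the matching dizoo folder).

```python
from typing import Iterable, List


def config_commands(
    config: str, seeds: Iterable[int], mode: str = "serial"
) -> List[List[str]]:
    """Build one `ding` command per seed for a config-file based demo."""
    return [["ding", "-m", mode, "-c", config, "-s", str(seed)] for seed in seeds]


if __name__ == "__main__":
    # DQN uses the default serial mode, as in row 1 of the table.
    for cmd in config_commands("cartpole_dqn_config.py", seeds=range(3)):
        print(" ".join(cmd))

    # PPO is listed with the serial_onpolicy mode, as in row 9 of the table.
    for cmd in config_commands(
        "cartpole_ppo_config.py", seeds=[0], mode="serial_onpolicy"
    ):
        print(" ".join(cmd))
```

Each returned list can be passed to `subprocess.run` once DI-engine is installed.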

Environment Versatility

No	Environment	Code and Doc Links
1	atari	code link env tutorial environment guide (Chinese)
2	box2d/bipedalwalker	dizoo link
3	box2d/lunarlander	dizoo link
4	classic_control/cartpole	dizoo link
5	classic_control/pendulum	dizoo link
6	competitive_rl	dizoo link
7	gfootball	dizoo link
8	minigrid	dizoo link
9	mujoco	dizoo link
10	multiagent_particle	dizoo link
11	overcooked	dizoo link
12	procgen	dizoo link
13	pybullet	dizoo link
14	smac	dizoo link
15	d4rl	dizoo link
16	league_demo	dizoo link
17	pomdp atari	dizoo link
18	bsuite	dizoo link
19	ImageNet	dizoo link
20	slime_volleyball	dizoo link
21	gym_hybrid	dizoo link
22	GoBigger	opendilab link
23	gym_soccer	dizoo link

means discrete action space

means continuous action space

means hybrid (discrete + continuous) action space

means multi-agent RL environment

means environment which is related to exploration and sparse reward

means offline RL environment

means Imitation Learning or Supervised Learning Dataset

means environment that allows agent VS agent battle

P.S. Some environments in Atari, such as MontezumaRevenge, are also of the sparse-reward type.

Contribution

We appreciate all contributions that improve DI-engine, in both algorithms and system design. Please refer to CONTRIBUTING.md for more guidance. Our roadmap can be accessed through this link.

Users can also join our Slack communication channel or our forum for more detailed discussion.

For future plans or milestones, please refer to our GitHub Projects.

Citation

@misc{ding,
    title={{DI-engine: OpenDILab} Decision Intelligence Engine},
    author={DI-engine Contributors},
    publisher = {GitHub},
    howpublished = {\url{https://github.com/opendilab/DI-engine}},
    year={2021},
}

License

DI-engine is released under the Apache 2.0 license.
