
# ERNIE: A Robust MARL Algorithm

Repository for the paper *Robust Multi-Agent Reinforcement Learning via Adversarial Regularization: Theoretical Foundation and Stable Algorithms*, accepted to NeurIPS 2023.

The simplest version of ERNIE can be implemented as follows:

```python
perturbed_tensor = old_global_obs + torch.normal(torch.zeros_like(old_global_obs), torch.ones_like(old_global_obs) * 1e-3)
perturbed_tensor.requires_grad = True

for k in range(self.config.alg.perturb_num_steps):
    # Calculate the adversarial perturbation: take a gradient ascent step on the
    # distance between the policy's outputs at the original and perturbed observations
    distance_loss = torch.norm(self.global_net(old_global_obs) - self.global_net(perturbed_tensor), p="fro")
    grad = torch.autograd.grad(outputs=distance_loss, inputs=perturbed_tensor, retain_graph=True, create_graph=True)[0]
    perturbed_tensor = perturbed_tensor + self.config.alg.perturb_alpha * grad * torch.abs(old_global_obs.detach())

# Adversarial regularization loss: the policy's sensitivity to the perturbation found above
adv_reg_loss = torch.norm(self.global_net(old_global_obs) - self.global_net(perturbed_tensor), p="fro")
```

This loss can simply be added to your algorithm's training loss. Note that here ERNIE is applied to the global policy.
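
For concreteness, below is a minimal, self-contained toy sketch of the same idea in plain PyTorch. The network, batch, hyperparameter values, and the placeholder objective `policy_loss` are illustrative stand-ins, not names or settings from this repository:

```python
import torch
import torch.nn as nn

# Toy stand-ins (illustrative only): a small policy net and a fake batch
net = nn.Sequential(nn.Linear(8, 32), nn.Tanh(), nn.Linear(32, 4))
optimizer = torch.optim.Adam(net.parameters(), lr=3e-4)
obs = torch.randn(16, 8)  # batch of global observations

# Search for an adversarial perturbation of the observations
perturbed = obs + 1e-3 * torch.randn_like(obs)
perturbed.requires_grad = True
for _ in range(3):  # assumed perturb_num_steps
    dist = torch.norm(net(obs) - net(perturbed), p="fro")
    grad = torch.autograd.grad(dist, perturbed, retain_graph=True, create_graph=True)[0]
    perturbed = perturbed + 0.01 * grad * obs.abs().detach()  # assumed perturb_alpha

# Add the regularizer to a placeholder training objective
adv_reg_loss = torch.norm(net(obs) - net(perturbed), p="fro")
policy_loss = net(obs).pow(2).mean()  # stand-in for your algorithm's loss
total_loss = policy_loss + 0.1 * adv_reg_loss  # 0.1: assumed regularizer weight

optimizer.zero_grad()
total_loss.backward()
optimizer.step()
```

In practice the regularizer weight, perturbation step size, and number of perturbation steps are hyperparameters tuned per environment.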

To train policies in the traffic light control environment, first follow the installation instructions at https://flow-project.github.io. Then run the command

```bash
python train_policy.py
```

This results in the following training curve.

*(figure: training curve)*

## Cite

Please cite our paper if you use this code in your own work:

```bibtex
@article{bukharin2023robust,
  title={Robust Multi-Agent Reinforcement Learning via Adversarial Regularization: Theoretical Foundation and Stable Algorithms},
  author={Bukharin, Alexander and Li, Yan and Yu, Yue and Zhang, Qingru and Chen, Zhehui and Zuo, Simiao and Zhang, Chao and Zhang, Songan and Zhao, Tuo},
  journal={arXiv preprint arXiv:2310.10810},
  year={2023}
}
```