jduquevan/advantage-alignment

Advantage Alignment

Code for the fastest existing algorithm for scalable opponent shaping: https://arxiv.org/abs/2406.14662

Update log: 04/20/2024 by Tianyu

  1. include tournament.yaml and tournament.py. Usage: python tournament.py. The network configuration for the aa and ppo agents is defined here. All network-based agents must share the same architecture; tournaments between agents with different architectures are not supported yet.
  2. include agent load, save, eval, and train methods. Usage:
agent.to(cfg.device)
agent.save('agent.pth') # save all the nn.Module state_dict in the agent
agent.load('agent.pth') # load all the nn.Module state_dict in the agent
agent.eval()
agent.train()
  3. implement detach_and_move_to_cpu for the trajectory class so that trajectories can be saved to JSON.
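
The save/load and detach_and_move_to_cpu behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the repository's implementation: SketchAgent, SketchTrajectory, their fields, the to_json method, and the roundtrip helper are all hypothetical names chosen for the example. Only the method names save, load, eval, train, to, and detach_and_move_to_cpu come from the update log.

```python
import json
import os
import tempfile
from dataclasses import dataclass

import torch
import torch.nn as nn


class SketchAgent:
    """Hypothetical agent that owns several nn.Module attributes."""

    def __init__(self, obs_dim: int = 4, n_actions: int = 2):
        # Assumed layout: one actor and one critic network.
        self.actor = nn.Linear(obs_dim, n_actions)
        self.critic = nn.Linear(obs_dim, 1)

    def _networks(self):
        # Collect every attribute that is an nn.Module.
        return {k: v for k, v in vars(self).items() if isinstance(v, nn.Module)}

    def to(self, device):
        for net in self._networks().values():
            net.to(device)
        return self

    def save(self, path):
        # Store the state_dict of every nn.Module in a single file.
        torch.save({k: net.state_dict() for k, net in self._networks().items()}, path)

    def load(self, path):
        # Requires the same architecture as the agent that was saved.
        state = torch.load(path, map_location="cpu")
        for k, net in self._networks().items():
            net.load_state_dict(state[k])

    def eval(self):
        for net in self._networks().values():
            net.eval()

    def train(self):
        for net in self._networks().values():
            net.train()


@dataclass
class SketchTrajectory:
    """Hypothetical trajectory container with two tensor fields."""

    observations: torch.Tensor
    rewards: torch.Tensor

    def detach_and_move_to_cpu(self):
        # Drop the autograd graph and move the data to host memory.
        self.observations = self.observations.detach().cpu()
        self.rewards = self.rewards.detach().cpu()
        return self

    def to_json(self) -> str:
        # Tensors are converted to nested Python lists for JSON.
        return json.dumps({
            "observations": self.observations.tolist(),
            "rewards": self.rewards.tolist(),
        })


def roundtrip(agent: SketchAgent, fresh_agent: SketchAgent) -> SketchAgent:
    """Save one agent and load its weights into another of the same shape."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "agent.pth")
        agent.save(path)
        fresh_agent.load(path)
    return fresh_agent
```

Because load restores each state_dict by attribute name, it only works when both agents were built with the same architecture, which is consistent with the tournament restriction noted in item 1.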
