Advantage Alignment

Code for the fastest existing algorithm for scalable opponent shaping: https://arxiv.org/abs/2406.14662

Update log: 04/20/2024 by Tianyu

  1. Added tournament.yaml and tournament.py. Usage: python tournament.py. The network definitions for the aa and ppo agents are here. All network-based agents must share the same architecture. Tournaments between agents with different architectures are not supported yet.
  2. Added agent load, save, eval, and train methods. Check here. Usage:
agent.to(cfg.device)
agent.save('agent.pth') # save all the nn.Module state_dict in the agent
agent.load('agent.pth') # load all the nn.Module state_dict in the agent
agent.eval()
agent.train()
  3. Implemented detach_and_move_to_cpu for the trajectory class so that trajectories can be saved to JSON.
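
For readers who want a feel for how items 2 and 3 might work, here is a minimal sketch in PyTorch. It is not the repository's code: SketchAgent, SketchTrajectory, their attributes (actor, critic, data), and the to_json helper are hypothetical names made up for illustration. Only the method names to, save, load, eval, train, and detach_and_move_to_cpu come from the update log above.

```python
import json

import torch
import torch.nn as nn


class SketchAgent:
    """Hypothetical stand-in for an agent that owns several nn.Module objects."""

    def __init__(self, obs_dim=4, n_actions=2):
        # Assumed layout: one actor and one critic network.
        self.actor = nn.Linear(obs_dim, n_actions)
        self.critic = nn.Linear(obs_dim, 1)

    def _modules(self):
        # Collect every attribute that is an nn.Module.
        return {k: v for k, v in vars(self).items() if isinstance(v, nn.Module)}

    def to(self, device):
        for module in self._modules().values():
            module.to(device)
        return self

    def save(self, path):
        # Save the state_dict of every nn.Module in a single file.
        torch.save({k: m.state_dict() for k, m in self._modules().items()}, path)

    def load(self, path):
        # Load on CPU first; load_state_dict copies onto each module's device.
        state = torch.load(path, map_location="cpu")
        for name, module in self._modules().items():
            module.load_state_dict(state[name])

    def eval(self):
        for module in self._modules().values():
            module.eval()

    def train(self):
        for module in self._modules().values():
            module.train()


class SketchTrajectory:
    """Hypothetical trajectory container holding a dict of tensors."""

    def __init__(self, obs, rewards):
        self.data = {"obs": obs, "rewards": rewards}

    def detach_and_move_to_cpu(self):
        # Drop autograd history and move tensors to CPU before serializing.
        self.data = {k: v.detach().cpu() for k, v in self.data.items()}
        return self

    def to_json(self):
        # Tensors are not JSON serializable, so convert them to nested lists.
        return json.dumps({k: v.tolist() for k, v in self.data.items()})
```

Saving one dictionary of state_dicts keeps all networks of an agent in a single file, which also explains why agents in a tournament would need matching architectures: load_state_dict fails when the parameter names or shapes differ.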