Code for the fastest existing algorithm for scalable opponent shaping: https://arxiv.org/abs/2406.14662
- Include `tournament.yaml` and `tournament.py`. Usage: `python tournament.py`. The network definitions for the `aa` and `ppo` agents are here. All network-based agents must share the same architecture; tournaments between agents with different architectures are not supported yet.
- Include agent load, save, eval, and train. Check here. Usage:
```python
agent.to(cfg.device)
agent.save('agent.pth')  # save all the nn.Module state_dict in the agent
agent.load('agent.pth')  # load all the nn.Module state_dict in the agent
agent.eval()
agent.train()
```
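As a rough illustration of what "save all the nn.Module state_dict in the agent" could look like, here is a minimal sketch. The `SketchAgent` class, its `actor` and `critic` attributes, and the checkpoint layout are assumptions made for this example and are not taken from this repository.

```python
import os
import tempfile

import torch
import torch.nn as nn


class SketchAgent:
    """Hypothetical agent holding several nn.Module attributes."""

    def __init__(self, obs_dim: int = 4, n_actions: int = 2):
        self.actor = nn.Linear(obs_dim, n_actions)  # assumed policy head
        self.critic = nn.Linear(obs_dim, 1)         # assumed value head

    def _modules(self):
        # Collect every attribute of the agent that is an nn.Module.
        return {k: v for k, v in vars(self).items() if isinstance(v, nn.Module)}

    def to(self, device):
        for module in self._modules().values():
            module.to(device)
        return self

    def save(self, path):
        # Store one state_dict per module, keyed by attribute name.
        torch.save({k: m.state_dict() for k, m in self._modules().items()}, path)

    def load(self, path):
        state = torch.load(path, map_location="cpu")
        for name, module in self._modules().items():
            module.load_state_dict(state[name])

    def eval(self):
        for module in self._modules().values():
            module.eval()

    def train(self):
        for module in self._modules().values():
            module.train()


# Round trip: weights saved from one agent are restored into another.
src, dst = SketchAgent(), SketchAgent()
path = os.path.join(tempfile.mkdtemp(), "agent.pth")
src.save(path)
dst.load(path)
```

Keying the checkpoint by attribute name is one reason agents would need matching architectures: `load_state_dict` raises an error if the parameter names or shapes differ.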
- Implement `detach_and_move_to_cpu` for the `trajectory` class so that it can be saved to JSON.
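The reason for detaching and moving to CPU before saving is that tensors are not JSON serializable, and tensors that carry autograd history or live on a GPU should first be reduced to plain host data. A minimal sketch of the idea follows. The `SketchTrajectory` class, its field names, and the `to_json` helper are hypothetical and only illustrate the pattern; the repository's trajectory class may be organized differently.

```python
import json
from dataclasses import dataclass

import torch


@dataclass
class SketchTrajectory:
    """Hypothetical trajectory container; the field names are assumptions."""

    observations: torch.Tensor
    actions: torch.Tensor
    rewards: torch.Tensor

    def detach_and_move_to_cpu(self):
        # Drop autograd history and move every tensor field to host memory.
        for name, value in list(vars(self).items()):
            if isinstance(value, torch.Tensor):
                setattr(self, name, value.detach().cpu())
        return self

    def to_json(self) -> str:
        # Convert each tensor to nested Python lists, which json can encode.
        self.detach_and_move_to_cpu()
        return json.dumps({k: v.tolist() for k, v in vars(self).items()})


traj = SketchTrajectory(
    observations=torch.ones(2, 3, requires_grad=True),
    actions=torch.tensor([0, 1]),
    rewards=torch.tensor([0.5, 1.0]),
)
payload = traj.to_json()
```

Nested lists are verbose for long rollouts, so this form is best suited to inspection and small logs rather than bulk storage.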