Advantage Alignment

Code for the fastest existing algorithm for scalable opponent shaping: https://arxiv.org/abs/2406.14662

Update log: 04/20/2024 by Tianyu

Include tournament.yaml and tournament.py. Usage: python tournament.py. The network information for the aa and ppo agents is defined here. All network-based agents must share the same architecture. Tournaments between agents with different architectures are not supported yet.
Include agent load, save, eval, and train. Check here. Usage:

agent.to(cfg.device)
agent.save('agent.pth') # save all the nn.Module state_dict in the agent
agent.load('agent.pth') # load all the nn.Module state_dict in the agent
agent.eval()
agent.train()
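
The comments above say that save and load cover every nn.Module state_dict held by the agent. The following is a minimal sketch of how such methods might be written, not the repository's actual implementation. The class name SketchAgent, its actor and critic attributes, and the layer sizes are all hypothetical and chosen only for illustration.

```python
import os
import tempfile

import torch
import torch.nn as nn


class SketchAgent:
    """Hypothetical agent that holds several nn.Module attributes."""

    def __init__(self, obs_dim: int = 4, n_actions: int = 2):
        # Illustrative networks; the real agent's architecture is not shown here.
        self.actor = nn.Linear(obs_dim, n_actions)
        self.critic = nn.Linear(obs_dim, 1)

    def _modules(self) -> dict:
        # Collect every attribute that is an nn.Module, keyed by attribute name.
        return {k: v for k, v in vars(self).items() if isinstance(v, nn.Module)}

    def to(self, device):
        for module in self._modules().values():
            module.to(device)
        return self

    def save(self, path: str) -> None:
        # One checkpoint file containing a state_dict per module.
        torch.save({k: m.state_dict() for k, m in self._modules().items()}, path)

    def load(self, path: str) -> None:
        checkpoint = torch.load(path, map_location="cpu")
        for name, module in self._modules().items():
            module.load_state_dict(checkpoint[name])

    def eval(self) -> None:
        for module in self._modules().values():
            module.eval()

    def train(self) -> None:
        for module in self._modules().values():
            module.train()


# Round trip: a freshly initialised agent recovers the saved weights.
agent = SketchAgent()
path = os.path.join(tempfile.mkdtemp(), "agent.pth")
agent.save(path)
restored = SketchAgent()
restored.load(path)
```

Keying the checkpoint by attribute name means a load only succeeds when both agents expose the same modules with the same shapes, which fits the requirement that agents share one architecture.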

Implement detach_and_move_to_cpu for the trajectory class so that trajectories can be saved to JSON.
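
Tensors that carry autograd history or live on a GPU cannot be passed to a JSON encoder directly, which is presumably why this step is needed. Below is a rough sketch of the idea under stated assumptions: the SketchTrajectory class, its field names, and the to_json helper are hypothetical, and the repository's method may work in place rather than returning a new object.

```python
import json
from dataclasses import dataclass, fields

import torch


@dataclass
class SketchTrajectory:
    """Hypothetical trajectory container; field names are illustrative."""

    observations: torch.Tensor
    actions: torch.Tensor
    rewards: torch.Tensor

    def detach_and_move_to_cpu(self) -> "SketchTrajectory":
        # Drop autograd history and place every tensor field in host memory.
        return SketchTrajectory(
            **{f.name: getattr(self, f.name).detach().cpu() for f in fields(self)}
        )

    def to_json(self) -> str:
        # Convert CPU tensors to nested Python lists, which json can encode.
        cpu_traj = self.detach_and_move_to_cpu()
        return json.dumps(
            {f.name: getattr(cpu_traj, f.name).tolist() for f in fields(cpu_traj)}
        )


traj = SketchTrajectory(
    observations=torch.zeros(2, 3, requires_grad=True),
    actions=torch.tensor([0, 1]),
    rewards=torch.tensor([1.0, 0.5]),
)
payload = json.loads(traj.to_json())
```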

