Counterfactual Regret Minimization (CFR) without generating all the states a priori #1239

PURANJAY14 · 2024-06-12T21:22:03Z


Hi,
Is there a way to run counterfactual regret minimization (CFR) without first enumerating all the states (via get_all_states.py)? The intermediate policies that CFR updates often have sparse action probabilities, so traversing the entire game tree should be unnecessary.

Answered by lanctot


lanctot · 2024-06-22T09:18:47Z

Maintainer

CFR should not require getting all states.

I just checked cfr.py and I don't see an import for get_all_states.py or where it gets all the states. Can you point to where you think the CFR implementation gets all the states?

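Oh wait, do you mean information states? In that case you may want to look into Monte Carlo CFR (MCCFR), which samples parts of the game tree on each iteration instead of traversing all of it.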


PURANJAY14 · 2024-06-23T06:51:24Z

Author

It happens in policy.py, which cfr.py uses.


lanctot · 2024-06-23T13:16:43Z

Maintainer

Ah yes, but to be honest I really don't remember why it does it that way. There was a deep technical reason. @jblespiau, do you remember?

@PURANJAY14 if I were you, I would make an alternative tabular policy called TabularPolicyDict that simply wraps a Python dictionary but otherwise exposes the necessary methods of the Policy base class. It would initialize a state's policy to uniform and add it to the dictionary the first time that state is looked up. Then replace the tabular policies used in cfr.py with your custom dict-based one.

Hope this helps!
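For readers who want to try this, here is a minimal sketch of the lazy dictionary idea described above. It is not code from OpenSpiel: the class body, the `policy_for_state` method, and the `FakeState` stub are assumptions made for illustration, and the stub exists only so the example runs without OpenSpiel installed. A real version would presumably subclass the `Policy` base class in policy.py and would need whatever other methods cfr.py calls on its tabular policies.

```python
class TabularPolicyDict:
    """Dict-backed policy that creates entries lazily (illustrative sketch).

    Entries are keyed by information-state string and are initialized to a
    uniform distribution over legal actions the first time a state is seen,
    so no up-front enumeration of the game tree is needed.
    """

    def __init__(self):
        self._table = {}  # info-state string -> {action: probability}

    def policy_for_state(self, state, player_id=None):
        """Returns the mutable {action: prob} entry, creating it if missing."""
        player = state.current_player() if player_id is None else player_id
        key = state.information_state_string(player)
        if key not in self._table:
            legal = state.legal_actions(player)
            if not legal:
                raise ValueError("No legal actions; not a decision node.")
            self._table[key] = {a: 1.0 / len(legal) for a in legal}
        return self._table[key]

    def action_probabilities(self, state, player_id=None):
        """Returns a copy of the {action: prob} mapping for this state."""
        return dict(self.policy_for_state(state, player_id))

    def __len__(self):
        return len(self._table)


class FakeState:
    """Hypothetical stand-in for a game state, used only for this demo."""

    def __init__(self, info_state, actions, player=0):
        self._info_state = info_state
        self._actions = list(actions)
        self._player = player

    def current_player(self):
        return self._player

    def information_state_string(self, player):
        return self._info_state

    def legal_actions(self, player):
        return list(self._actions)


policy = TabularPolicyDict()
state = FakeState("p0:card=K", [0, 1])
print(policy.action_probabilities(state))  # {0: 0.5, 1: 0.5}
```

The table only grows with the information states that are actually visited, which is the point of the suggestion. The trade-off is that any code expecting a fixed, pre-indexed table of all states would have to be adapted.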

2 replies

PURANJAY14 Jun 24, 2024
Author

Thanks for the idea.

lanctot Jul 1, 2024
Maintainer

No problem. If you do it, I would encourage you to submit a PR so others might benefit from it.
