What hyperparameters are used for the Rainbow ablations? #11

Open
ianporada opened this issue Jan 7, 2025 · 1 comment

Comments

@ianporada

Very nice work!

Is it possible to share the exact hyperparameters used for the ablations on the Rainbow environment? I am trying to recreate these results using the smaller, default Transformer size (3 layers, 128 dim, 8 heads). However, I find that most problems are solved at exactly 64 nodes expanded. (Interestingly, there also seems to be a jump at 64 nodes expanded in Figure 4 of the paper.)

Here is what my current results look like:

image

Thanks so much!

@revalo
Owner

revalo commented Jan 10, 2025

Hey Ian! Thank you!

I just checked my wandb logs, and the hyperparameters you used are indeed correct. There is a small quirk in how the number of expanded nodes is counted across the different methods. In particular, for rejection sampling it isn't clear whether, say, the 20th sample in a parallel batch of 64 generated programs should be counted as 20 expanded nodes or as 64, since the compute for the whole batch was spent.

I forget how we handled this across the different methods. Let me take a look at the codebase and re-run these ablations to see whether there is a bug with the quantization.
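
To make the counting ambiguity concrete, here is a minimal sketch of the two conventions. It is not taken from the codebase: `nodes_expanded`, `first_success_index`, and `count_full_batch` are hypothetical names, and the assumption is that rejection sampling draws programs in parallel batches of a fixed size.

```python
import math


def nodes_expanded(first_success_index: int, batch_size: int, count_full_batch: bool) -> int:
    """Return the node count reported for a rejection-sampling run.

    first_success_index: 1-based index of the first sample that solves the problem.
    batch_size: number of programs generated in parallel per batch (assumed fixed).
    count_full_batch: if True, charge for every batch that was generated,
        since that compute was spent; if False, charge only up to the solving sample.
    """
    if first_success_index < 1 or batch_size < 1:
        raise ValueError("first_success_index and batch_size must be >= 1")
    if count_full_batch:
        # Round up to a whole number of batches.
        return math.ceil(first_success_index / batch_size) * batch_size
    return first_success_index


# The 20th sample of a 64-wide batch solves the problem.
print(nodes_expanded(20, 64, count_full_batch=False))  # 20
print(nodes_expanded(20, 64, count_full_batch=True))   # 64
```

Under the full-batch convention, every problem solved within the first batch is reported at exactly 64 nodes, which would be consistent with results clustering at 64.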
