Question about D4RL-gym dataset version #4

Open
FineArtz opened this issue Jan 12, 2022 · 1 comment

Comments

@FineArtz

Hi, I recently read your paper and it inspired me a lot; it is without doubt a good paper. However, I am confused about which version of the D4RL datasets was used for the baselines you compare against. In "Appendix C Baseline performance sources", the results for BC, MOPO (by the way, I couldn't find MOPO in your experiments section) and MBOP are taken from their original papers, all of which use the D4RL-gym-v0 datasets.
The performance of CQL on D4RL-gym-v0 [1] differs greatly from its performance on D4RL-gym-v2 [2] on several datasets. Since you compare these scores directly, I wonder whether the scores of the above baselines would also change greatly on D4RL-gym-v2, or whether you have evidence that they would not.

@jannerm
Owner

jannerm commented Feb 1, 2022

Nice catch!

BC on v2 performs 4.1 percentage points higher than on v0, with an average score of 51.8 versus 47.7 [1]. I'll update this in the next arXiv version.

I have reached out to the authors of MBOP to see if they can share code for reevaluation on the v2 datasets.
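
For readers comparing numbers across dataset versions, here is a minimal sketch of the arithmetic involved. It assumes the usual D4RL convention that a normalized score of 0 corresponds to a random policy and 100 to an expert policy. The function name and the reference returns in the usage lines are hypothetical placeholders, not values from D4RL or from the paper; only the two BC averages (47.7 and 51.8) come from this thread.

```python
def normalized_score(raw_return, random_return, expert_return):
    """D4RL-style normalization: 0 = random policy, 100 = expert policy."""
    return 100.0 * (raw_return - random_return) / (expert_return - random_return)


# Hypothetical reference returns, for illustration only.
example = normalized_score(raw_return=50.0, random_return=0.0, expert_return=100.0)

# Average BC scores quoted in this thread.
bc_v0_avg = 47.7
bc_v2_avg = 51.8

# Gap between dataset versions, rounded to one decimal to match the reported precision.
gap = round(bc_v2_avg - bc_v0_avg, 1)
print(f"BC gap between v2 and v0: {gap} points")
```

Because normalized scores are already on a 0 to 100 scale, the gap is a plain difference between the two averages rather than a relative change.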
