Update wandb reports (#10)
* Update wandb reports

* Update CQL report in docs

* Update DT report in docs
DT6A authored Nov 30, 2023
1 parent 2424df1 commit 14abd6e
Showing 3 changed files with 9 additions and 9 deletions.
12 changes: 6 additions & 6 deletions README.md
@@ -44,17 +44,17 @@ docker run --gpus=all -it --rm --name <container_name> <image_name>
| Algorithm | Variants Implemented | Wandb Report |
|--------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------| ----------- |
| **Offline and Offline-to-Online** | |
-| ✅ [Conservative Q-Learning for Offline Reinforcement Learning <br>(CQL)](https://arxiv.org/abs/2006.04779) | [`offline/cql.py`](algorithms/offline/cql.py) <br /> [`finetune/cql.py`](algorithms/finetune/cql.py) | [`Offline`](https://wandb.ai/tlab/CORL/reports/-Offline-CQL--VmlldzoyNzA2MTk5) <br /> <br /> [`Offline-to-online`](https://wandb.ai/tlab/CORL/reports/-Offline-to-Online-CQL--Vmlldzo0NTQ3NTMz)
-| ✅ [Accelerating Online Reinforcement Learning with Offline Datasets <br>(AWAC)](https://arxiv.org/abs/2006.09359) | [`offline/awac.py`](algorithms/offline/awac.py) <br /> [`finetune/awac.py`](algorithms/finetune/awac.py) | [`Offline`](https://wandb.ai/tlab/CORL/reports/-Offline-AWAC--VmlldzoyNzA2MjE3) <br /> <br /> [`Offline-to-online`](https://wandb.ai/tlab/CORL/reports/-Offline-to-Online-AWAC--VmlldzozODAyNzQz)
-| ✅ [Offline Reinforcement Learning with Implicit Q-Learning <br>(IQL)](https://arxiv.org/abs/2110.06169) | [`offline/iql.py`](algorithms/offline/iql.py) <br /> [`finetune/iql.py`](algorithms/finetune/iql.py) | [`Offline`](https://wandb.ai/tlab/CORL/reports/-Offline-IQL--VmlldzoyNzA2MTkx) <br /> <br /> [`Offline-to-online`](https://wandb.ai/tlab/CORL/reports/-Offline-to-Online-IQL--VmlldzozNzE1MTEy)
+| ✅ [Conservative Q-Learning for Offline Reinforcement Learning <br>(CQL)](https://arxiv.org/abs/2006.04779) | [`offline/cql.py`](algorithms/offline/cql.py) <br /> [`finetune/cql.py`](algorithms/finetune/cql.py) | [`Offline`](https://wandb.ai/tlab/CORL/reports/-Offline-CQL--Vmlldzo1MzM4MjY3) <br /> <br /> [`Offline-to-online`](https://wandb.ai/tlab/CORL/reports/-Offline-to-Online-CQL--Vmlldzo0NTQ3NTMz)
+| ✅ [Accelerating Online Reinforcement Learning with Offline Datasets <br>(AWAC)](https://arxiv.org/abs/2006.09359) | [`offline/awac.py`](algorithms/offline/awac.py) <br /> [`finetune/awac.py`](algorithms/finetune/awac.py) | [`Offline`](https://wandb.ai/tlab/CORL/reports/-Offline-AWAC--Vmlldzo1MzM4MTEy) <br /> <br /> [`Offline-to-online`](https://wandb.ai/tlab/CORL/reports/-Offline-to-Online-AWAC--VmlldzozODAyNzQz)
+| ✅ [Offline Reinforcement Learning with Implicit Q-Learning <br>(IQL)](https://arxiv.org/abs/2110.06169) | [`offline/iql.py`](algorithms/offline/iql.py) <br /> [`finetune/iql.py`](algorithms/finetune/iql.py) | [`Offline`](https://wandb.ai/tlab/CORL/reports/-Offline-IQL--Vmlldzo1MzM4MzQz) <br /> <br /> [`Offline-to-online`](https://wandb.ai/tlab/CORL/reports/-Offline-to-Online-IQL--VmlldzozNzE1MTEy)
| **Offline-to-Online only** | |
| ✅ [Supported Policy Optimization for Offline Reinforcement Learning <br>(SPOT)](https://arxiv.org/abs/2202.06239) | [`finetune/spot.py`](algorithms/finetune/spot.py) | [`Offline-to-online`](https://wandb.ai/tlab/CORL/reports/-Offline-to-Online-SPOT--VmlldzozODk5MTgx)
| ✅ [Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning <br>(Cal-QL)](https://arxiv.org/abs/2303.05479) | [`finetune/cal_ql.py`](algorithms/finetune/cal_ql.py) | [`Offline-to-online`](https://wandb.ai/tlab/CORL/reports/-Offline-to-Online-Cal-QL--Vmlldzo0NTQ3NDk5)
| **Offline only** | |
-| ✅ Behavioral Cloning <br>(BC) | [`offline/any_percent_bc.py`](algorithms/offline/any_percent_bc.py) | [`Offline`](https://wandb.ai/tlab/CORL/reports/-Offline-BC--VmlldzoyNzA2MjE1)
+| ✅ Behavioral Cloning <br>(BC) | [`offline/any_percent_bc.py`](algorithms/offline/any_percent_bc.py) | [`Offline`](https://wandb.ai/tlab/CORL/reports/-Offline-BC--Vmlldzo1MzM4MzU2)
| ✅ Behavioral Cloning-10% <br>(BC-10%) | [`offline/any_percent_bc.py`](algorithms/offline/any_percent_bc.py) | [`Offline`](https://wandb.ai/tlab/CORL/reports/-Offline-BC-10---VmlldzoyNzEwMjcx)
-| ✅ [A Minimalist Approach to Offline Reinforcement Learning <br>(TD3+BC)](https://arxiv.org/abs/2106.06860) | [`offline/td3_bc.py`](algorithms/offline/td3_bc.py) | [`Offline`](https://wandb.ai/tlab/CORL/reports/-Offline-TD3-BC--VmlldzoyNzA2MjA0)
-| ✅ [Decision Transformer: Reinforcement Learning via Sequence Modeling <br>(DT)](https://arxiv.org/abs/2106.01345) | [`offline/dt.py`](algorithms/offline/dt.py) | [`Offline`](https://wandb.ai/tlab/CORL/reports/-Offline-Decision-Transformer--VmlldzoyNzA2MTk3)
+| ✅ [A Minimalist Approach to Offline Reinforcement Learning <br>(TD3+BC)](https://arxiv.org/abs/2106.06860) | [`offline/td3_bc.py`](algorithms/offline/td3_bc.py) | [`Offline`](https://wandb.ai/tlab/CORL/reports/-Offline-TD3-BC--Vmlldzo1MzM4MDA2)
+| ✅ [Decision Transformer: Reinforcement Learning via Sequence Modeling <br>(DT)](https://arxiv.org/abs/2106.01345) | [`offline/dt.py`](algorithms/offline/dt.py) | [`Offline`](https://wandb.ai/tlab/CORL/reports/-Offline-Decision-Transformer--Vmlldzo1MzM3OTkx)
| ✅ [Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble <br>(SAC-N)](https://arxiv.org/abs/2110.01548) | [`offline/sac_n.py`](algorithms/offline/sac_n.py) | [`Offline`](https://wandb.ai/tlab/CORL/reports/-Offline-SAC-N--VmlldzoyNzA1NTY1)
| ✅ [Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble <br>(EDAC)](https://arxiv.org/abs/2110.01548) | [`offline/edac.py`](algorithms/offline/edac.py) | [`Offline`](https://wandb.ai/tlab/CORL/reports/-Offline-EDAC--VmlldzoyNzA5ODUw)
| ✅ [Revisiting the Minimalist Approach to Offline Reinforcement Learning <br>(ReBRAC)](https://arxiv.org/abs/2305.09836) | [`offline/rebrac.py`](algorithms/offline/rebrac.py) | [`Offline`](https://wandb.ai/tlab/CORL/reports/-Offline-ReBRAC--Vmlldzo0ODkzOTQ2)
4 changes: 2 additions & 2 deletions docs/algorithms/cql.md
@@ -82,7 +82,7 @@ Possible extensions:
For detailed scores on all benchmarked datasets see [benchmarks section](../benchmarks/offline.md).
Reports visually compare our reproduction results with original paper scores to make sure our implementation is working properly.

-<iframe src="https://wandb.ai/tlab/CORL/reports/-Offline-CQL--VmlldzoyNzA2MTk5" style="width:100%; height:500px" title="CQL Offline Report"></iframe>
+<iframe src="https://wandb.ai/tlab/CORL/reports/-Offline-CQL--Vmlldzo1MzM4MjY3" style="width:100%; height:500px" title="CQL Offline Report"></iframe>

<iframe src="https://wandb.ai/tlab/CORL/reports/-Offline-to-Online-CQL--Vmlldzo0NTQ3NTMz" style="width:100%; height:500px" title="CQL Finetune Report"></iframe>

@@ -239,4 +239,4 @@ TrainConfig:
--project str wandb project name (default: CORL)
--group str wandb group name (default: CQL-D4RL)
--name str wandb run name (default: CQL)
-```
+```
2 changes: 1 addition & 1 deletion docs/algorithms/dt.md
@@ -70,7 +70,7 @@ We'd be glad if someone would be interested in contributing them!
For detailed scores on all benchmarked datasets see [benchmarks section](../benchmarks/offline.md).
Reports visually compare our reproduction results with original paper scores to make sure our implementation is working properly.

-<iframe src="https://wandb.ai/tlab/CORL/reports/-Offline-Decision-Transformer--VmlldzoyNzA2MTk3" style="width:100%; height:500px" title="Decision Transformer Report"></iframe>
+<iframe src="https://wandb.ai/tlab/CORL/reports/-Offline-Decision-Transformer--Vmlldzo1MzM3OTkx" style="width:100%; height:500px" title="Decision Transformer Report"></iframe>

## Training options
