Most experiments were run on a Google Cloud Compute Engine VM with the following specs:
- NVIDIA T4 GPU
- 4 vCPU, 15 GB RAM (n1-standard-4)
- 200 GB disk
- Debian 10 based Deep Learning VM with M102 image
Once you've created the machine and installed the NVIDIA drivers (you will be prompted to do so on your first login), follow the steps below to configure the environment:
- Download and install Anaconda by running the following commands:
wget https://repo.anaconda.com/archive/Anaconda3-2022.10-Linux-x86_64.sh
sh Anaconda3-2022.10-Linux-x86_64.sh
- Install the wilds package and remove torch and torchvision (we need to install an older version of these packages because of issues with CUDA dynamic libraries). To run all the CGD experiments you will need to use our custom version of wilds, available in the wilds folder.
# Create a conda environment if you haven't done so already. Wilds suggests Python 3.8.5
conda create -n wilds python=3.8.5
conda activate wilds
# Install dependencies
# Install the modified version of wilds
pip install -e ./wilds
# Install additional libraries required by wilds
pip install transformers
pip install wandb # only if you're using WandB
# Remove torch and torchvision
pip uninstall torch
pip uninstall torchvision
- Install torch==1.12.1 and the related dependencies:
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
- Install torch-geometric:
pip install pyg-lib torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-1.12.0+cu113.html
pip install torch-geometric
Now you should be ready to run experiments!
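Before launching anything long-running, it can help to confirm that the pinned versions actually ended up in the environment. The script below is a minimal sketch of such a check, not part of the original setup: the `EXPECTED` table and the helper names are assumptions made for illustration, and the pins are copied from the install commands above. It relies only on the standard-library `importlib.metadata` module (available from Python 3.8), so it runs even if one of the packages failed to install.

```python
from importlib import metadata  # standard library from Python 3.8

# Versions pinned by the install commands in this README.
# torch-geometric and wilds are not pinned there, so they are only
# reported as present or missing, not compared against a version.
EXPECTED = {
    "torch": "1.12.1",
    "torchvision": "0.13.1",
    "torch-geometric": None,
    "wilds": None,
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches(installed, expected):
    """Compare versions while ignoring a local build tag such as '+cu113'."""
    if installed is None:
        return False
    if expected is None:
        return True
    return installed.split("+")[0] == expected


def check_environment(expected=EXPECTED):
    """Map each package name to a tuple (installed_version, ok_flag)."""
    report = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        report[package] = (found, matches(found, wanted))
    return report


if __name__ == "__main__":
    for package, (found, ok) in check_environment().items():
        status = "ok" if ok else "CHECK"
        print(f"{package:16s} {found or 'not installed':20s} {status}")
```

This only inspects package metadata. To confirm that the GPU is visible to PyTorch, call `torch.cuda.is_available()` from a Python session inside the activated environment.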
To run processes in the background (and keep them from being killed when the SSH session is terminated), run them as:
nohup <command> &
Then monitor their stdout with the following command (the shell prints the PID when the process starts):
tail -f /proc/<pid>/fd/1
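If you would rather start experiments from a Python script, a roughly similar effect can be sketched with the standard-library `subprocess` module. This is an alternative to `nohup`, not the method used for the experiments: the helper name `launch_detached` and the log file name are hypothetical, and the approach assumes a POSIX system. Instead of ignoring the hangup signal as `nohup` does, it starts the child in its own session, detached from the terminal, and writes its output to an explicit log file.

```python
import subprocess
import sys


def launch_detached(command, log_path):
    """Start `command` in its own session, appending stdout and stderr to log_path.

    Returns the Popen handle; its .pid attribute is the PID to monitor.
    """
    log_file = open(log_path, "ab")
    try:
        process = subprocess.Popen(
            command,
            stdin=subprocess.DEVNULL,
            stdout=log_file,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # POSIX only: the child calls setsid()
        )
    finally:
        # The child holds its own copy of the descriptor, so the parent can close.
        log_file.close()
    return process


if __name__ == "__main__":
    # Placeholder command; substitute the real experiment invocation.
    demo = [sys.executable, "-c", "print('experiment placeholder')"]
    proc = launch_detached(demo, "experiment.log")
    print(f"started PID {proc.pid}; follow it with: tail -f experiment.log")
```

Because the output goes to a named file, it can be followed with `tail -f` on that file without looking up the PID.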