1.1 MCMSEM in the cloud
Getting all requirements for MCMSEM to work properly on a cloud GPU can be tricky. This page is an attempt to make the process easier, but we cannot guarantee that it will always work 100%. Linux experience will definitely help.
Typically, cloud providers bill per second. This means that every second with an active cloud instance costs money, which includes not only the time spent actually running MCMSEM but also the time spent installing everything. Be prepared for this.
When your analysis is done, you are shutting down for the day, or you won't be using your VM for a while for whatever reason, make sure to stop it from your VM instances console (https://console.cloud.google.com/compute/instances), otherwise you will pay for server time you do not use.
If you are done with your analyses and won't be needing your VM again for a longer period, make sure to terminate it completely, and check that all associated disks are removed (as these are billed separately). If you are planning on using MCMSEM again in the future, I recommend making a custom machine image of the VM first, as this saves you from having to set up the VM again, and storing a machine image is significantly cheaper than storing a disk.
In short, this will cost money, and we are not responsible for any expenses incurred as a result of running MCMSEM, following this manual, or anything else.
There are a lot of options for cloud service providers, and the VM types they offer are subject to change. Of course, you should primarily select a cloud provider based on your budget and/or institutional requirements. If you have a choice between multiple providers, find one that offers single-VM instances with 1 GPU with as much VRAM as possible (or as much as you can afford). Additionally, be on the lookout for cloud providers that offer pre-built machine images with the correct CUDA and cuDNN versions.
At the time of writing, Google Cloud is the recommended provider: they have servers in multiple regions (USA/EU/Asia), they offer an instance type with a single GPU with 40GB of GPU memory (`a2-highgpu-1g`) and may soon have 80GB GPUs available (`a2-ultragpu-1g`), and they have a machine image that fits the current version of R torch. As a bonus, Google Cloud is relatively easy to set up and use compared to some others.
- First, create or set up a Google Cloud account with billing information
- Create a Google Cloud project
  - This should pop up once you visit https://console.cloud.google.com, or you can find projects in the top left-hand corner and create one there
- Next, request a quota increase from Google for the desired node types in the desired region (this may take a few days)
  - You will get a notification linking you to the correct page when you try to create a node outside your quota
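- If you have the gcloud CLI installed locally, you can also check your current GPU quota for a region from the command line. A minimal sketch; the region name is just an example, replace it with the region you plan to use:

  ```sh
  # Print the region's quotas and show the GPU-related entries
  gcloud compute regions describe us-central1 | grep -i -B1 -A1 gpu
  ```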
- Once this is approved (or to get to the request page), go to your VM instances: https://console.cloud.google.com/compute/instances
- Click `CREATE INSTANCE`
- In Machine configuration, select `GPU`, and select the preferred GPU and machine type.
  - I used and recommend NVIDIA A100 40GB (`a2-highgpu-1g`)
  - Important: Multiple GPUs will not benefit you, don't pay for stuff you won't use!
- Under Boot Disk, select `CHANGE`
  - Under Operating System, choose `Deep Learning on Linux`
  - Then, under Version, choose `Debian 10 based Deep Learning VM with M97`
    - At the time of writing this is the preferred image, but this is likely to change in the future. The point is that you choose an image with CUDA 11.3 and cuDNN 8.4 (or whatever R torch requires when you are reading this). This saves you from having to install CUDA and cuDNN yourself, saving you a lot of time, money and headaches.
- Make sure to set the disk size to an appropriate amount given your data; as an estimate, data size + 50GB should be fine
- Create the server by clicking `CREATE`
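- Alternatively, if you have the gcloud CLI installed locally, roughly the same VM can be created from the command line. This is only a sketch: the instance name, zone, disk size and especially the image family are assumptions, so first list the available Deep Learning VM images and pick a Debian 10 / CUDA 11.3 based one:

  ```sh
  # List the Deep Learning VM images offered by Google
  gcloud compute images list --project deeplearning-platform-release --no-standard-images

  # Create the VM; name, zone, image family and disk size are examples, adjust them to your needs
  gcloud compute instances create mcmsem-vm \
    --zone=us-central1-a \
    --machine-type=a2-highgpu-1g \
    --image-project=deeplearning-platform-release \
    --image-family=common-cu113 \
    --boot-disk-size=200GB \
    --maintenance-policy=TERMINATE \
    --metadata=install-nvidia-driver=True
  ```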
- In the list of your VM instances, wait until your VM is running.
- Once your VM is up and running, connect to it via the SSH button
- This connects you to your VM via SSH in your browser and has Google manage all required keys. If you really want to set this up yourself and not connect from the browser, see https://cloud.google.com/compute/docs/instances/connecting-advanced
- Once connected, you can start uploading your data (this may be slow depending on traffic) and begin the installation procedure below.
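- If you prefer to work from your own terminal rather than the browser window, the gcloud CLI can manage the SSH keys for you and can also be used to upload data. A sketch; replace `mcmsem-vm`, the zone and the file name with your own instance name, zone and data file:

  ```sh
  # SSH into the VM from your own terminal (gcloud takes care of the keys)
  gcloud compute ssh mcmsem-vm --zone=us-central1-a

  # Upload a local data file to your home directory on the VM
  gcloud compute scp ./mydata.csv mcmsem-vm:~/ --zone=us-central1-a
  ```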
- First, install R 4.2.1 (or a later version). R's own manual (https://cran.r-project.org/bin/linux/debian/#debian-buster-stable) is, at the time of writing, a bit vague on how to do this, but the general idea is:
- Import the CRAN key fingerprint:
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-key '95C0FAF38DB3CCAD0C080A7BDC78B2DDEABC47B7'
- Add `deb http://cloud.r-project.org/bin/linux/debian buster-cran40/` to /etc/apt/sources.list (note the buster suite: the Deep Learning VM image is Debian 10, i.e. buster, which matches the -t buster-cran40 flag below)
- Update apt and install R from the CRAN repository:
sudo apt update
sudo apt install -t buster-cran40 r-base
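- To verify that the CRAN build of R was installed rather than the older r-base that ships with Debian 10, a quick check:

  ```sh
  # Should report R version 4.2.x (or later)
  R --version

  # Shows which repository the installed r-base package came from
  apt policy r-base
  ```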
- Install system dependencies:
sudo apt install libxml2-dev libharfbuzz-dev libfribidi-dev libfreetype6-dev libpng-dev libtiff5-dev libjpeg-dev r-base-dev build-essential libssl-dev libcurl4-openssl-dev libcairo2-dev libgit2-dev
- Install the required R packages. Start R as root (sudo R) so the packages are installed system-wide; the first time torch is loaded it may prompt you to download additional libraries, accept this:
sudo R
install.packages("devtools")
install.packages("torch")
library(torch)
- Check that CUDA works; cuda_is_available() should return TRUE and the tensor should be created without errors:
cuda_is_available()
a <- torch_tensor(1, device=torch_device('cuda'))
- Odds are some of these steps will not work on the first go; your best sources of information to fix this are the usual suspects: Google and Stack Overflow.
- Once that is done, run `library(devtools)` and install the latest version of MCMSEM from its GitHub repository with `install_github()`.
- Then you are good to go. I recommend making a machine image of your VM at this point, so you will not have to repeat this setup should you need it again in the future.
- The cost of a machine image is significantly lower than the cost of an active disk, so if you are done with your current analysis but plan to use MCMSEM again in the future, save a machine image before you terminate your VM and remove the disk.
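- The stopping, imaging and terminating described above can also be done with the gcloud CLI. A sketch; `mcmsem-vm`, the image name and the zone are examples, use your own instance name and zone:

  ```sh
  # Stop the VM when you are done for now (you keep paying for the disk, but not for the running VM)
  gcloud compute instances stop mcmsem-vm --zone=us-central1-a

  # Save a machine image of the VM so the whole setup can be recreated later
  gcloud compute machine-images create mcmsem-image \
    --source-instance=mcmsem-vm \
    --source-instance-zone=us-central1-a

  # Terminate the VM once the machine image exists (by default its boot disk is deleted with it)
  gcloud compute instances delete mcmsem-vm --zone=us-central1-a
  ```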