Skip to content

Latest commit

 

History

History
32 lines (22 loc) · 1.22 KB

readme.md

File metadata and controls

32 lines (22 loc) · 1.22 KB

02-{5,7}12 Final Project: Simulating Latent Single-Cell Gene Expression of COVID-19 Patients with Variational Autoencoder and Implications for Augmenting Classification Models

Abstract

We present a Variational Autoencoder-based simulation of single cell gene expression from healthy and COVID-19 patients' PBMC cells. To demonstrate the utility of this simulation, we build a synthetic dataset seeded from rare cell types. We then train a classifier of ventilation severity and demonstrate that, by augmenting with simulated data, predictive performance declines slightly. However, based on phylogeny tree analysis, the simulated rare cell data does not differ dramatically from the original distribution, as desired. Regardless of these mixed results, we consider our simulation as a solid baseline and a promising future direction to aiding in single cell research of COVID-19.

Setup

Pull the code properly

git clone --recursive https://github.com/simonlevine/singlecell-sim-by-VAE.git

Install dependencies

python -m venv venv
./venv/bin/pip install -r requirements.txt
pip install --user dvc

Grab the data

source .envrc && make download_data

Reproduce

source .envrc && dvc repro