As part of the Allen Institute for Cell Science's mission to understand the principles by which human induced pluripotent stem cells establish and maintain robust dynamic localization of cellular structure, CytoDL aims to unify deep learning approaches for understanding 2D and 3D biological data as images, point clouds, and tabular data.
The bulk of CytoDL's underlying structure is based on the lightning-hydra-template organization. We highly recommend that you familiarize yourself with its (short) docs for detailed instructions on running training, overrides, etc.
Our currently available code is roughly split into two domains: image-to-image transformations and representation learning. The image-to-image code (denoted im2im) contains configuration files detailing how to train and predict using models for resolution enhancement using conditional GANs (e.g. predicting 100x images from 20x images), semantic and instance segmentation, and label-free prediction. Representation learning code includes a wide variety of Variational Auto Encoder (VAE) architectures. Due to dependency issues, equivariant autoencoders are not currently supported on Windows.
As we rely on recent versions of PyTorch, users wishing to train and run models on GPU hardware will need up-to-date NVIDIA drivers. Users with older GPUs should not expect the code to work out of the box. Similarly, we do not currently support training or predicting on Mac GPUs. In most cases, CPU-based training should work when GPU training fails.
For im2im models, we provide a handful of example 3D images for training the basic image-to-image transformation-type models, along with default model configuration files, so that users can become comfortable with the framework and prepare to train and apply these models on their own data. Note that these default models are very small and train on heavily downsampled data in order to make tests run efficiently - for best performance, the model size should be increased and downsampling removed from the data configuration.
Install dependencies. Dependencies are platform specific; please replace PLATFORM with your platform - either linux, windows, or mac.
# clone project
git clone https://github.com/AllenCellModeling/cyto-dl
cd cyto-dl
# [OPTIONAL] create conda environment
conda create -n myenv python=3.9
conda activate myenv
pip install -r requirements/PLATFORM/requirements.txt
# [OPTIONAL] install extra dependencies - equivariance related
pip install -r requirements/PLATFORM/equiv-requirements.txt
pip install -e .
# [OPTIONAL] download example data if you want to use the default experiments
python scripts/download_test_data.py
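If you script the installation, the PLATFORM placeholder can be filled in automatically. The following is a minimal sketch, assuming the requirements/PLATFORM/ layout shown in the commands above; the helper name `requirements_path` is hypothetical and not part of CytoDL.

```python
import platform
from pathlib import PurePosixPath

# Map the names returned by platform.system() to the folder names used above.
_PLATFORM_DIRS = {"Linux": "linux", "Windows": "windows", "Darwin": "mac"}


def requirements_path(system=None, equivariant=False):
    """Return the requirements file path for the given (or current) OS.

    Hypothetical helper for illustration only; not part of CytoDL.
    """
    system = system or platform.system()
    try:
        folder = _PLATFORM_DIRS[system]
    except KeyError:
        raise ValueError(f"Unsupported platform: {system!r}") from None
    # The equivariance extras live in a separate file in the same folder.
    filename = "equiv-requirements.txt" if equivariant else "requirements.txt"
    return str(PurePosixPath("requirements") / folder / filename)
```

For example, `requirements_path("Linux")` returns `requirements/linux/requirements.txt`, which can then be passed to `pip install -r`. Keep in mind that equivariant autoencoders are not currently supported on Windows, so the `equivariant=True` path is only useful on the other platforms.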
Train a model with your chosen experiment configuration from configs/experiment/:
# train on GPU
python cyto_dl/train.py experiment=im2im/experiment_name.yaml trainer=gpu
# train on CPU
python cyto_dl/train.py experiment=im2im/experiment_name.yaml trainer=cpu
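Because GPU training depends on drivers and hardware, it can be convenient to choose the trainer automatically. The sketch below assumes only that `torch.cuda.is_available()` reflects whether a usable NVIDIA GPU is present, and falls back to `trainer=cpu` otherwise. The function names `pick_trainer` and `train_command` are hypothetical, and `experiment_name.yaml` remains a placeholder as in the commands above.

```python
def pick_trainer():
    """Return "gpu" if PyTorch reports a usable CUDA device, else "cpu"."""
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return "cpu"
    return "gpu" if torch.cuda.is_available() else "cpu"


def train_command(experiment, trainer=None):
    """Build the argument list for cyto_dl/train.py (nothing is executed here)."""
    trainer = trainer or pick_trainer()
    return [
        "python",
        "cyto_dl/train.py",
        f"experiment={experiment}",
        f"trainer={trainer}",
    ]
```

The returned list can be handed to `subprocess.run` from the repository root. Returning a list rather than a single string avoids shell-quoting issues.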
You can override any parameter from the command line like this:
python cyto_dl/train.py trainer.max_epochs=20 datamodule.batch_size=64
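When launching many runs from a script, overrides like the ones above can be generated from a dictionary. This is a minimal sketch, assuming the plain `key=value` override form shown in the command above; `format_overrides` is a hypothetical helper, and the lowercase `true`/`false`/`null` spellings are chosen here to match YAML conventions.

```python
import shlex


def format_overrides(overrides):
    """Turn a dict into key=value command-line override arguments."""
    args = []
    for key, value in overrides.items():
        if isinstance(value, bool):
            value = "true" if value else "false"  # YAML-style booleans
        elif value is None:
            value = "null"  # YAML-style null
        args.append(f"{key}={value}")
    return args


overrides = {"trainer.max_epochs": 20, "datamodule.batch_size": 64}
command = ["python", "cyto_dl/train.py", *format_overrides(overrides)]

# Prints: python cyto_dl/train.py trainer.max_epochs=20 datamodule.batch_size=64
print(shlex.join(command))
```

Keeping the overrides in a dictionary makes it straightforward to loop over several settings, such as different batch sizes, without hand-editing command strings.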