This project provides a structured template for deep learning research and applications in PyTorch. The organized directory structure facilitates easy navigation, maintenance, and scalability.
The license file contains the licensing information for the project, specifying how others can use, modify, and distribute the code.
A collection of utility tools and scripts.
- hpc: Houses scripts related to High-Performance Computing. Currently, it contains:
  - `qsub.py`: A script for submitting jobs to an HPC cluster.
Configuration files and utilities for the project reside here.
- Model Configurations: Files like `PROJECTNAME_train_config.py` define hyperparameters and settings for different model trainings.
- Utilities:
  - `config_utils.py`: Contains utility functions that assist with configuration display and color-coded messaging.
  - `configlib.py`: Core library functions for configuration management, inspired by the guide referenced at the end of this README (a minimal sketch of the pattern follows this list).
  - `global_train_config.py`: A dynamic configuration handler that allows for a modular approach across projects. It provides general settings common to all training setups and selects project-specific configurations based on command-line arguments.
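To make this concrete, here is a minimal sketch of an argparse-based configlib pattern in the spirit of the guide cited above. All names below (`add_parser`, `parse`, the shared `config` dict) are illustrative assumptions, not necessarily the template's actual API.

```python
# Minimal sketch of an argparse-based configlib pattern (names are
# illustrative; the template's configlib.py may differ). Each config
# module registers an argument group on a shared parser, and parse()
# merges every registered option into one flat config dict.
import argparse
from typing import Any, Dict

parser = argparse.ArgumentParser(description="Shared project configuration")
config: Dict[str, Any] = {}

def add_parser(title: str):
    """Hand out a named argument group for one config module to fill."""
    return parser.add_argument_group(title)

def parse() -> Dict[str, Any]:
    """Parse known CLI args and fold them into the shared config dict."""
    args, _ = parser.parse_known_args()
    config.update(vars(args))
    return config
```

In this pattern, each project config would register its options through `add_parser`, and the training entry point would call `parse()` once at startup.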
The `logs` directory is reserved for storing logs generated during training, testing, or any other processes. Logs are essential for tracking the progress and performance of models.
Scripts and files related to data preprocessing, generally run on raw input data to prepare it for training and inference.
- `example_project/00.data_clean.py`: Script for data cleaning.

Modify the `save_root` and `src_root` paths in `00.data_clean.py` and run it as follows; it converts the NIfTI images into numpy arrays and saves them under `save_root`. You will need to write these preprocessing scripts yourself for your own project.

```bash
python 00.data_clean.py
```
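For orientation, here is a minimal sketch of the kind of conversion such a script performs; the actual `00.data_clean.py` is project-specific, and the paths below are placeholders. It assumes `nibabel` is installed.

```python
# Illustrative sketch of a NIfTI-to-numpy conversion in the spirit of
# 00.data_clean.py (the real script is project-specific). Requires nibabel.
import os
import numpy as np
import nibabel as nib

src_root = "./data/raw"      # directory holding the raw *.nii/*.nii.gz volumes
save_root = "./data/clean"   # destination for the converted .npy arrays
os.makedirs(save_root, exist_ok=True)

for fname in os.listdir(src_root):
    if not fname.endswith((".nii", ".nii.gz")):
        continue
    volume = nib.load(os.path.join(src_root, fname)).get_fdata()
    stem = fname[:-7] if fname.endswith(".nii.gz") else fname[:-4]
    np.save(os.path.join(save_root, stem + ".npy"), volume.astype(np.float32))
```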
The `requirements.txt` file lists all the Python packages and their specific versions required to run the project.
Contains shell scripts for various purposes, like starting training or testing processes, especially useful in an HPC environment.
- example_project/local:
  - `example_cv0.sh`: An example script for testing the `example` model with cross-validation fold set 0. It will need to be modified for your project and model.
The `src` directory contains the source code.
- data: Contains code related to data handling.
  - `dataloaders.py`: Functions to load data batches (see the loader sketch after this list).
  - `preprocess.py`: Functions for data preprocessing.
- model: All model-related code.
  - archs: Model architectures, which inherit from `baseArch.py` (see the architecture sketch after this list).
  - networks: Network configurations or implementations, like `local.py`.
  - Utility files: `functions.py` (general utility functions), `layers.py` (custom layers), `loss.py` (loss functions), and `metric.py` (evaluation metrics).
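As noted above, here is a minimal sketch of the kind of loader `dataloaders.py` might provide, built on the preprocessed `.npy` volumes; the class name, paths, and batch settings are illustrative assumptions.

```python
# Hypothetical loader in the style dataloaders.py might use: a Dataset
# over preprocessed .npy volumes, wrapped in a standard PyTorch DataLoader.
import os
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class NpyVolumeDataset(Dataset):
    """Serves each .npy volume as a single-channel float tensor."""
    def __init__(self, data_root: str):
        self.paths = sorted(
            os.path.join(data_root, f)
            for f in os.listdir(data_root)
            if f.endswith(".npy")
        )

    def __len__(self) -> int:
        return len(self.paths)

    def __getitem__(self, idx: int) -> torch.Tensor:
        volume = np.load(self.paths[idx]).astype(np.float32)
        return torch.from_numpy(volume).unsqueeze(0)  # add a channel dim

if __name__ == "__main__":
    loader = DataLoader(NpyVolumeDataset("./data/clean"), batch_size=2, shuffle=True)
    for batch in loader:
        print(batch.shape)  # e.g. (2, 1, D, H, W)
        break
```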
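Likewise, a sketch of the inheritance pattern around `baseArch.py`; the class names and constructor signature here are assumptions, so consult `baseArch.py` for the template's actual interface.

```python
# Hypothetical base-class pattern for src/model/archs (names assumed):
# a base module defines shared plumbing, and each architecture subclasses it.
import torch
import torch.nn as nn

class BaseArch(nn.Module):
    """Common interface every architecture is expected to implement."""
    def __init__(self, config: dict):
        super().__init__()
        self.config = config

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        raise NotImplementedError("Subclasses must implement forward().")

class SimpleConvArch(BaseArch):
    """Toy 3D conv net showing how a concrete arch subclasses BaseArch."""
    def __init__(self, config: dict):
        super().__init__(config)
        self.net = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(8, 1, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

if __name__ == "__main__":
    model = SimpleConvArch(config={})
    out = model(torch.zeros(1, 1, 8, 8, 8))
    print(out.shape)  # torch.Size([1, 1, 8, 8, 8])
```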
`test.py` is the script for testing trained models against validation or test datasets.
`train.py` is the main script for training deep learning models.
- Clone the Repository: Use the following command to clone the repository to your local machine:
```bash
git clone MyRepo
```
- Set Up the Environment: Navigate to the project directory and install the required packages:
```bash
cd deep-learning-project-template
python3 -m pip install -r requirements.txt
```
It is highly recommended that you set up a virtual environment to manage dependencies and avoid potential conflicts:
```bash
python3 -m venv envs/PROJECTNAME_env
source envs/PROJECTNAME_env/bin/activate  # On Windows use: envs\PROJECTNAME_env\Scripts\activate
python3 -m pip install -r envs/PROJECTNAME_requirements.txt
```
- Adjust Configurations:
  - Navigate to the `config` directory.
  - Review and adjust the project/model-specific configuration files of the form `PROJECTNAME_train_config.py` based on your requirements (a hypothetical example is sketched after this list).
  - For global settings that apply to all training setups, modify `global_train_config.py`.
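For illustration, here is a hypothetical project config written against the configlib sketch shown earlier; the real `PROJECTNAME_train_config.py` may be organized differently, and the option names below are assumptions.

```python
# Hypothetical PROJECTNAME_train_config.py following the configlib sketch
# shown earlier (option names are illustrative, not the template's actual set).
from config import configlib

group = configlib.add_parser("PROJECTNAME training options")
group.add_argument("--batch_size", type=int, default=4, help="training batch size")
group.add_argument("--lr", type=float, default=1e-4, help="learning rate")
group.add_argument("--max_epochs", type=int, default=300, help="number of epochs")
group.add_argument("--cv_fold", type=int, default=0, help="cross-validation fold")
```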
- Data Preprocessing:
  - Ensure your data is placed in the appropriate directories.
  - Run preprocessing scripts from the `preprocessing` directory, such as `00.data_clean.py`, to clean and prepare your data for training.
- Training:
  - Use `train.py` to start training your models, as in the example command below. Depending on your configurations, the trained models and logs will be saved in designated directories.
  - Monitor the `logs` directory for training progress and potential issues, or, if you are using an experiment management tool like Weights & Biases, monitor the training progress there.
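A launch might look like the following; the flag names here are assumptions based on the log path shown under Testing & Evaluation, so check your configuration files for the actual options.

```bash
python3 train.py --project PROJECTNAME --exp_name my_experiment
```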
- Testing & Evaluation:
  - After training finishes, you can copy the models back and run inference on a local machine to avoid queuing.
  - The models will be saved in `./logs/[--project]/[--exp_name]`.
  - Use the following command to run inference. It will use the best model if `[num_epoch]` is omitted:

```bash
python3 test.py ./logs/[--project_name]/[--exp_name] [gpu_id] [num_epoch]
```
- Explore Further:
  - Dive into the `src` directory to explore model architectures, utilities, and other functionalities.
  - Use scripts from the `scripts` directory for specific tasks, especially if you're working in an HPC environment.
For detailed explanations or if you encounter any issues, please refer to the individual READMEs in specific directories or raise an issue in the repository.
- Nuri Cingillioglu, "Argparse with multiple files to handle configuration in Python".
This project is based on original work by Qianye Yang. The current repository serves as a template, with specific project details stripped for generalization purposes. I deeply appreciate Qianye Yang for their invaluable contributions and expertise.
You can find the original project and more of Qianye Yang's work at https://github.com/QianyeYang/mpmrireg.