Cookiecutter Genomics Project Template

The goal of this template is to organize your computational genomics projects.

Requirements to use the cookiecutter template:


  • Python 3.5+
  • Cookiecutter Python package >= 1.4.0: This can be installed with pip or conda, depending on how you manage your Python packages:
$ pip install cookiecutter

or

$ conda config --add channels conda-forge
$ conda install cookiecutter

To start a new project, run:


cookiecutter https://github.com/santiago1234/cookiecutter-genomics-project
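If you would rather create projects from a script than from the interactive prompt, the same template can be driven through Cookiecutter's Python API. The following is a minimal sketch, not part of the template itself: the function name `new_genomics_project` is hypothetical, and it assumes that `python_module_name` (the variable that appears in the directory tree) is the only value you want to override, with all other template variables left at their defaults.

```python
TEMPLATE_URL = "https://github.com/santiago1234/cookiecutter-genomics-project"


def new_genomics_project(module_name, output_dir="."):
    """Generate a project from the template without interactive prompts.

    `module_name` fills the template's `python_module_name` variable.
    This helper is illustrative; it is not shipped with the template.
    """
    # Imported lazily so that this file can be loaded even on machines
    # where cookiecutter is not installed yet.
    from cookiecutter.main import cookiecutter

    cookiecutter(
        TEMPLATE_URL,
        no_input=True,  # accept the template defaults for every other variable
        extra_context={"python_module_name": module_name},
        output_dir=output_dir,
    )
```

For example, `new_genomics_project("popgen_tools")` would generate a project in the current directory, where `popgen_tools` is a placeholder module name.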

The resulting directory structure

The directory structure of your new project looks like this:

├── LICENSE
├── README.md          <- The top-level README for developers using this project.
├── TableOfContents.md <- The table of contents that points to specific/important
│                           analysis or data.
│
├── data               <- Directory for storing fixed data sets. 
│   │
│   ├── generated      <- Important data and results that are generated.
│   ├── external       <- Data from third party sources.
│   └── raw            <- The original, immutable data dump.
│
│                      Inside each data subfolder we use the naming convention:
│                           year (last two digits)-month (number, two digits)-short-description
│                           e.g. `2306-1kgpmetada`, `2307-OtherData`
│
├── docs               <- Project documentation.
│
├── experiments        <- Directory in which experiments are placed and run. To create a new
│                           experiment, use the cookiecutter template [cookiecutter-analysis-directory],
│                           which is based on Noble 2009, "Carrying Out a Single Experiment".
│
├── scratch            <- Temporary or intermediate files that can be easily
│                           regenerated. The contents of this folder are
│                           deleted periodically.
│
├── envs               <- Conda environments for reproducing experiments in the Project.
│   └── popgene.yaml   <- A conda environment.
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── bin                <- Project-specific scripts that provide generic functionality
│                           used by multiple experiments within the project.
│
├── setup.py           <- Makes the project pip-installable (pip install -e .) so that
│                           {{cookiecutter.python_module_name}} can be imported.
│
│
└── {{cookiecutter.python_module_name}} <- Source code for use in this project.
    ├── __init__.py    <- Makes {{cookiecutter.python_module_name}} a Python module
    │
    ├── data.py        <- Scripts to download or generate data
    │
    └── utils.py       <- Useful functions
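
The naming convention for data subfolders (two-digit year, two-digit month, short description) is easy to get wrong by hand. The sketch below shows one way to generate such names consistently. The helper names `data_dir_name` and `make_data_dir` are hypothetical and are not provided by the template; the rule that descriptions may not contain whitespace or path separators is also an assumption.

```python
import re
from datetime import date
from pathlib import Path

# The three data subfolders listed in the directory tree.
DATA_CATEGORIES = ("generated", "external", "raw")


def data_dir_name(description, when=None):
    """Return a folder name of the form YYMM-description."""
    when = when or date.today()
    # Assumption: reject empty descriptions, whitespace, and path separators.
    if not description or re.search(r"[\s/\\]", description):
        raise ValueError(f"invalid description: {description!r}")
    return f"{when:%y%m}-{description}"


def make_data_dir(project_root, category, description, when=None):
    """Create data/<category>/<YYMM-description> and return its path."""
    if category not in DATA_CATEGORIES:
        raise ValueError(f"category must be one of {DATA_CATEGORIES}")
    path = Path(project_root) / "data" / category / data_dir_name(description, when)
    path.mkdir(parents=True, exist_ok=True)
    return path
```

For instance, `data_dir_name("OtherData", date(2023, 7, 15))` returns `2307-OtherData`, matching the example in the tree above.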

Carrying out a single experiment

Experiments are performed inside the experiments directory. To set up a new experiment or analysis, use the cookiecutter-analysis-directory template:

cookiecutter https://github.com/santiago1234/cookiecutter-analysis-directory

References

This template is based on my personal experience working on bioinformatics projects, as well as the best practices and insights gained from: