MINING-D

MINING-D is a tool (written in Python 3.6.5) for inference of D genes using Rep-Seq data from diverse species. It takes as input a fasta file with consensus CDR3s from an immunosequencing dataset (see paper) and writes the inferred D genes in the output file in the fasta format.

There are two ways of running MINING-D.

From command line - To use default MINING-D parameters, use

$ python MINING_D.py -i <input_CDR_file> -o <output_file>

To check all available options, use python MINING-D.py --help.

$ python MINING_D.py --help 
Usage: MINING_D.py [OPTIONS]

Options:
-i, --input_cdr_file TEXT  Input file with consensus CDR3s  [required]
-o, --output_file TEXT     Output file to store inferred D genes  [required]
-k INTEGER                 Length of starting k-mers  [default: 10]
-n, --num_k_mers INTEGER   Number of most abundant k-mers to extend [default: 300]
-p, --p_val_th FLOAT       p-value threshold for the extension procedure [default: 4.5e-36]
-t, --n_cores INTEGER      Number of cores  [default: 10]
--bidir_alpha FLOAT        Alpha for filtering bidirectional extensions (see paper)  [default: 0.5]
-g, --cliq_th INTEGER      Similarity metric threshold for generating graphs [default: 2]
--help                     Show this message and exit.

Jupyter Notebook - If you would like to run MINING-D in an interactive environment, there is a Jupyter notebook named "interactive_MINING_D.ipynb" in the repo. Please specify the input file, output file, and other parameters at the top of the notebook like shown in the example.

Dependencies

Biopython
Networkx<2.4
Joblib
NumPy
SciPy
click

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Readme.md

Readme.md

MINING-D

Files

Readme.md

Latest commit

History

Readme.md

File metadata and controls

MINING-D