dbready/needler

Needler

Code and algorithm to support our paper Needler: An Algorithm to Develop a Comprehensive Targeted MS Method Capable of Monitoring the Human Proteome.

Steps

The Makefile replicates the needed steps, assuming a standard Linux environment (make, wget, etc.) and Python Poetry are available. Alternatively, use the provided .devcontainer, which creates an environment with the necessary tooling.

Makefile high-level outline:

  • make build_env: configure the Python virtual environment with Poetry
  • make download: download the required datasets
  • make munge: pre-process the datasets
    • extract liver proteins from Tissue Atlas study
    • extract human proteins from Uniprot
    • digest proteins into tryptic peptides
    • filter tryptic peptides
      • length of 5-30 residues
      • distinct (maps to a single protein)
      • does not contain an uncommon amino acid (X or U)
      • does not contain methionine (commonly oxidized, which makes it inappropriate for MS quantification)
      • each protein represented by >= 2 peptides (a common MS quantitation criterion)
    • assign predicted iRT and retention time values to all retained peptide sequences
    • produce sub-proteomes for study: liver, kinase, dub
  • make fit: run the Needler algorithm to produce targeted MS methods for each proteome. Invoking it with the -j <CPU_COUNT> option to run fits in parallel is recommended. This step is computationally expensive, potentially years of dedicated compute time, so running it on a single machine is not advised.
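
The peptide filtering criteria listed under make munge can be illustrated with a short Python sketch. This is not the repository's code: the function names digest and select_peptides are hypothetical, and the digestion rule (cleave after K or R unless the next residue is P, with no missed cleavages) is an assumption, since the README does not state which rule the pipeline uses.

```python
from collections import defaultdict


def digest(sequence):
    """Split a protein into fully tryptic peptides.

    Assumed rule (not confirmed by the README): cleave after K or R,
    except when the next residue is P; no missed cleavages.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # C-terminal peptide that does not end in K or R
        peptides.append(sequence[start:])
    return peptides


def select_peptides(proteins, min_len=5, max_len=30, min_per_protein=2):
    """Apply the README's filters to a dict of protein id -> sequence.

    Returns a dict of protein id -> sorted list of retained peptides.
    """
    # Record which proteins each peptide occurs in
    owners = defaultdict(set)
    for protein_id, sequence in proteins.items():
        for peptide in digest(sequence):
            owners[peptide].add(protein_id)

    kept = defaultdict(list)
    for peptide, protein_ids in owners.items():
        if not (min_len <= len(peptide) <= max_len):
            continue  # outside the 5-30 residue window
        if len(protein_ids) != 1:
            continue  # not distinct: found in more than one protein
        if set(peptide) & set("XUM"):
            continue  # uncommon residue (X, U) or methionine
        kept[next(iter(protein_ids))].append(peptide)

    # Keep only proteins represented by at least two peptides
    return {
        protein_id: sorted(peptides)
        for protein_id, peptides in kept.items()
        if len(peptides) >= min_per_protein
    }


# Toy sequences, made up for illustration only
example = {
    "P1": "AAAAAKCCCCCRGGK",
    "P2": "DDDDDKEEMEERFFFFFRGGK",
    "P3": "HHHHHKIIXIIR",
}
print(select_peptides(example))
# {'P1': ['AAAAAK', 'CCCCCR'], 'P2': ['DDDDDK', 'FFFFFR']}
```

In this toy input, GGK is dropped for being too short and shared, EEMEER for containing methionine, and IIXIIR for containing X. P3 is then left with a single peptide and is removed entirely.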

Data

Datasets contained within the repo itself:

License

Code is available under the Apache 2.0 license.
