This folder contains code used while working with MS2PIP.
Reads an MSF file (SQLite database), combines it with the matching MGF file(s), and writes a spectral library as a single MS2PIP PEPREC file and a single MGF file. PSMs are filtered at a given FDR threshold, using q-values either calculated from decoy hits or taken from Percolator.
Input: MSF and MGF files
Output: Matched PEPREC and MGF file (MS2PIP spectral library)
usage: MSF_to_MS2PIP_SpecLib.py [-h] [-s MSF_FOLDER] [-g MGF_FOLDER]
[-o OUTNAME] [-f FDR_CUTOFF] [-p] [-c]
Convert Sequest MSF and MGF to MS2PIP spectral library.
optional arguments:
-h, --help show this help message and exit
-s MSF_FOLDER Folder with Sequest MSF files (default: "msf")
-g MGF_FOLDER Folder with MGF spectrum files (default: "mgf")
-o OUTNAME Name for output files (default: "SpecLib")
-f FDR_CUTOFF FDR cut-off value to filter PSMs (default: 0.01)
-p Use Percolator q-values instead of calculating them
from TDS (default: False)
-c Combine multiple MSF files into one spectral library
(default: False)
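The FDR filtering described above can be illustrated with a minimal sketch of target-decoy q-value estimation. It is not taken from MSF_to_MS2PIP_SpecLib.py: the function names, the `(score, is_decoy)` input layout, and the decoys/targets FDR estimate are assumptions made for illustration, and the script's actual score column and formula may differ.

```python
def qvalues_from_decoys(psms):
    """Estimate q-values for PSMs given as (score, is_decoy) tuples.

    Assumption: a higher score means a better match.
    Returns (score, is_decoy, q_value) tuples sorted by descending score.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)

    # FDR at each score threshold: decoys seen so far / targets seen so far.
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the lowest FDR at this threshold or any more permissive one.
    qvalues = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvalues.append(running_min)
    qvalues.reverse()

    return [(score, is_decoy, q) for (score, is_decoy), q in zip(ranked, qvalues)]


def filter_by_fdr(psms, fdr_cutoff=0.01):
    """Keep target PSMs whose q-value does not exceed the cut-off."""
    return [
        (score, q)
        for score, is_decoy, q in qvalues_from_decoys(psms)
        if not is_decoy and q <= fdr_cutoff
    ]
```

With the default cut-off of 0.01, `filter_by_fdr` keeps only the target PSMs that rank above the point where decoys start to accumulate.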
Split MS2PIP spectral library (PEPREC and MGF file) into a train and test set.
Input: PEPREC and MGF file
Output: PEPREC and MGF files for both train and test data set.
usage: Split_MS2PIP_SpecLib.py [-h] [-o OUT_FILENAME] [-f TEST_FRACTION]
peprec_file mgf_file
Split MS2PIP spectral library (PEPREC and MGF file) into a train and test set.
positional arguments:
peprec_file PEPREC file input
mgf_file MGF file input
optional arguments:
-h, --help show this help message and exit
-o OUT_FILENAME Name for output files (default: "SpecLib")
-f TEST_FRACTION Fraction of input to use for test data set (default: 0.1)
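As a rough sketch of the splitting step, the example below divides a list of spectrum identifiers into a train and a test set. It assumes a random split by spectrum ID and that the same IDs identify both the PEPREC rows and the MGF spectra. Whether Split_MS2PIP_SpecLib.py splits randomly, uses a seed, or groups by peptide is not stated here, so `split_spec_ids` and its `seed` parameter are hypothetical.

```python
import random


def split_spec_ids(spec_ids, test_fraction=0.1, seed=42):
    """Randomly split spectrum IDs into disjoint train and test sets.

    The seed is a hypothetical parameter, added only to make the sketch
    reproducible.
    """
    ids = list(spec_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)

    n_test = round(len(ids) * test_fraction)
    test_ids = set(ids[:n_test])
    train_ids = set(ids[n_test:])
    return train_ids, test_ids
```

The two returned sets can then be used to select the matching PEPREC rows and MGF spectra for each output file.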
Adds an amino acid suffix to "Phospho" modifications in a PEPREC file, so that "Phospho" becomes, for instance, "PhosphoY". For unmodified peptides, a hyphen is added to the PEPREC file.
Input: Folder with PEPREC files
Output: PEPREC files with amino acid suffix added to "Phospho" modifications
usage: PEPREC_AddPhosphoSuffix.py [-h] [-f PEPREC_FOLDER] [-r]
Add amino acid suffix to "Phospho" in modifications column in PEPREC file(s).
optional arguments:
-h, --help show this help message and exit
-f PEPREC_FOLDER Folder with input PEPREC files (default: "")
-r Replace the original PEPREC files instead of writing a new
file (default: False)
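The renaming can be sketched as follows, assuming the modifications column holds pipe-separated `position|name` pairs with 1-based residue positions (for example `4|Phospho`). That layout, and the function name `add_phospho_suffix`, are assumptions made for illustration and are not taken from PEPREC_AddPhosphoSuffix.py.

```python
def add_phospho_suffix(peptide, modifications):
    """Append the modified residue to every "Phospho" entry.

    Assumes modifications look like "pos|name|pos|name" with 1-based
    positions. Empty or missing modifications become a hyphen.
    """
    if not modifications or modifications == "-":
        return "-"

    fields = modifications.split("|")
    renamed = []
    for position, name in zip(fields[0::2], fields[1::2]):
        index = int(position)
        # Only suffix modifications that sit on a residue in the sequence.
        if name == "Phospho" and 1 <= index <= len(peptide):
            name += peptide[index - 1]
        renamed.extend([position, name])
    return "|".join(renamed)
```

Other modification names pass through unchanged, so the function can be applied to every row of a PEPREC file.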
Takes every protein in a FASTA file and generates a PEPREC file with all tryptic peptides. Charge states, minimum and maximum peptide lengths, and the number of missed cleavages are set through global variables in the script.
Input: FASTA file
Output: PEPREC file
Requirements: Biopython to parse FASTA file; Pyteomics for in silico cleavage; tqdm for progress bar.
usage: fasta2peprec.py [-h] fasta_file
Generate a PEPREC file for all proteins in a fasta file
positional arguments:
fasta_file FASTA file with proteins of interest
optional arguments:
-h, --help show this help message and exit
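To illustrate the in silico digestion step, here is a standard-library sketch of tryptic cleavage (cut after K or R, but not before P) with missed cleavages and length limits. fasta2peprec.py relies on Pyteomics for this, so the code below is a stand-in rather than the script's implementation. The function name and the default values are hypothetical and do not reflect the script's global variables.

```python
import re


def tryptic_peptides(sequence, missed_cleavages=2, min_length=8, max_length=30):
    """Return the set of tryptic peptides of a protein sequence.

    Cleaves C-terminal to K or R unless the next residue is P.
    Default limits are illustrative only.
    """
    # Cleavage sites as string offsets, including both protein termini.
    sites = [0] + [m.end() for m in re.finditer(r"[KR](?!P)", sequence)]
    if sites[-1] != len(sequence):
        sites.append(len(sequence))

    peptides = set()
    last = len(sites) - 1
    for start in range(last):
        # Join up to `missed_cleavages` extra fragments onto this one.
        for end in range(start + 1, min(start + missed_cleavages + 1, last) + 1):
            peptide = sequence[sites[start]:sites[end]]
            if min_length <= len(peptide) <= max_length:
                peptides.add(peptide)
    return peptides
```

Each peptide would then be written to the PEPREC file once per charge state of interest.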
Takes spectra predicted by MS2PIP and writes them in the Skyline spectral library format (MS2 + SSL).
Input: MS2PIP predictions file
Output: MS2 and SSL files
Requirements: Pyteomics for mass calculations; tqdm for progress bar.
usage: prediction2ms2.py [-h] pep_file
Generate MS2 and SSL files from MS2PIP predictions
positional arguments:
pep_file PEPREC file used to generate predictions
optional arguments:
-h, --help show this help message and exit