zpliulab/PST-PRNA

PST

PST-PRNA: Prediction of RNA-Binding Sites Using Protein Surface Topography and Deep Learning.

Description

PST-PRNA is a method for identifying RNA-binding sites on a protein surface from the surface topography. It builds the topographies and applies deep learning to learn from them. For convenient use, please visit the web service at www.zpliulab.cn/PSTPRNA. To use the standalone offline version, install and run it as follows.

Standalone software prerequisites

  • Conda. Recommended for environment management.
  • Python (3.6).
  • reduce (3.23). To add protons to proteins.
  • MSMS (2.6.1). To compute the surface of proteins.
  • PSI-Blast (2.6.0+). To generate PSSM.
  • hhblits (3.3.0). To generate HMM.
  • CD-HIT (4.8.1). To cluster protein sequences.
  • DSSP. To standardize secondary structure assignment.

Some important Python packages

  • BioPython (1.78). To parse PDB files.
  • PyTorch (1.7.1), GPU version. Used to build, train, and evaluate the neural networks.
  • scikit-learn (0.24.1).

Specific usage

1 Download and install the standalone software listed above.

Set the paths of these executables in default_config/bin_path.py.
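Before running the pipeline, it can help to confirm that every configured executable actually resolves. The sketch below is not part of PST-PRNA: the dictionary name BIN_PATHS, its keys, and the executable names are assumptions for illustration, and the real variable names in default_config/bin_path.py may differ.

```python
import shutil
from typing import Dict, List

# Hypothetical mapping of tool name -> executable name or absolute path.
# The actual layout of default_config/bin_path.py may be different.
BIN_PATHS: Dict[str, str] = {
    "reduce": "reduce",
    "msms": "msms",
    "psiblast": "psiblast",
    "hhblits": "hhblits",
    "cd-hit": "cd-hit",
    "dssp": "mkdssp",
}


def find_missing(bin_paths: Dict[str, str]) -> List[str]:
    """Return the tool names whose path does not resolve to an executable."""
    return sorted(
        name for name, path in bin_paths.items() if shutil.which(path) is None
    )


if __name__ == "__main__":
    missing = find_missing(BIN_PATHS)
    if missing:
        print("Not found, check the configured paths:", ", ".join(missing))
    else:
        print("All configured executables were found.")
```

shutil.which accepts both bare command names (looked up on PATH) and full paths, so the same check works whichever form the configuration uses.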

2 Topography preparation

(1) The script 'protein.py' contains the class RBP, which integrates all the procedures needed to convert an RBP into topographies.

(2) Calculating the topographies takes tens of minutes per protein, so we recommend using a parallel computing tool such as Slurm. The bash script 'prepare_all.slurm' works together with the Python script 'prepare_all.py' to extract topographies in parallel.

(3) Users can also run 'prepare_all.py' on its own to preprocess data. The files containing the RBP IDs are in data/pdbid_chain. The path to the PDB ID list should be specified in the two 'prepare_all' scripts.
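One common way to spread an ID list over parallel jobs is to give each job a round-robin slice of the list. The sketch below illustrates that idea only; it is not how 'prepare_all.py' is confirmed to work. The function names read_ids and shard and the placeholder IDs are hypothetical, while SLURM_ARRAY_TASK_ID and SLURM_ARRAY_TASK_COUNT are the environment variables Slurm sets for job arrays.

```python
import os
from typing import List


def read_ids(path: str) -> List[str]:
    """Read one ID per line, skipping blank lines."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def shard(ids: List[str], task_index: int, n_tasks: int) -> List[str]:
    """Return the round-robin slice of ids that belongs to one task."""
    if n_tasks < 1 or not 0 <= task_index < n_tasks:
        raise ValueError("task_index must satisfy 0 <= task_index < n_tasks")
    return ids[task_index::n_tasks]


if __name__ == "__main__":
    # Outside a Slurm job array these default to a single task that gets everything.
    task_index = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    n_tasks = int(os.environ.get("SLURM_ARRAY_TASK_COUNT", "1"))
    demo_ids = ["id_1", "id_2", "id_3", "id_4", "id_5"]  # placeholder IDs
    print(shard(demo_ids, task_index % n_tasks, n_tasks))
```

Round-robin slicing keeps the shards within one item of each other in size, which matters when every protein takes tens of minutes.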

3 Training

To train an ab initio model, use the script 'train.py'. Specify the list of RBPs in default_config/dir_options, then run:

python train.py

4 Predicting

To predict new RNA-binding sites:

a. Set dir_opts['PDB_list_to_predict'] (in default_config) to the list file containing the PDB names, one name per line.
b. Move the PDB files to the folder dir_opts['raw_pdb_dir'].

Then run:

python prepare_all.py
python predict.py
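The list file for prediction can be generated from the contents of the raw PDB folder rather than typed by hand. The following is a minimal sketch under stated assumptions: the helper write_pdb_list is hypothetical, and it assumes each name is the file name without the .pdb extension. Check the lists in data/pdbid_chain for the exact name format PST-PRNA expects (for example, whether a chain identifier is required).

```python
from pathlib import Path
from typing import List


def write_pdb_list(raw_pdb_dir: str, list_path: str) -> List[str]:
    """Write the stem of every .pdb file in raw_pdb_dir to list_path, one per line."""
    names = sorted(p.stem for p in Path(raw_pdb_dir).glob("*.pdb"))
    Path(list_path).write_text("".join(name + "\n" for name in names))
    return names


# Example call (paths are placeholders; use the values configured in dir_opts):
# write_pdb_list("path/to/raw_pdb_dir", "path/to/pdb_list_to_predict.txt")
```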

License

PST-PRNA is released under the MIT License.

Reference

If you use this code, please cite it using the BibTeX entry in citation.bib.
