PST-PRNA is a method for deciphering RNA-binding sites on protein surfaces based on protein surface topography. To achieve this, PST-PRNA builds the surface topographies and applies deep learning methods to learn from them. For convenient use, please visit the web service at www.zpliulab.cn/PSTPRNA. To use the standalone offline version, install and run it as follows.
- Conda. Recommended for environment management.
- Python (3.6).
- reduce (3.23). To add protons to proteins.
- MSMS (2.6.1). To compute the surface of proteins.
- PSI-BLAST (2.6.0+). To generate PSSMs.
- hhblits (3.3.0). To generate HMMs.
- CD-HIT (4.8.1). To cluster protein sequences.
- DSSP. To standardize secondary structure assignment.
- BioPython (1.78). To parse PDB files.
- PyTorch (1.7.1), GPU version. Used to build, train, and evaluate the neural networks.
- scikit-learn (0.24.1).
Set the paths of these executables in default_config/bin_path.py.
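Before running the pipeline, it can help to confirm that every configured executable is actually reachable. The following is a minimal sketch, not the contents of default_config/bin_path.py: the dictionary name `bin_path`, its keys, and the helper `find_missing` are hypothetical, and the values should be replaced with the paths used on your system.

```python
import shutil

# Hypothetical mapping of tool name -> executable name or absolute path.
# The real variable names in default_config/bin_path.py may differ.
bin_path = {
    "reduce": "reduce",
    "msms": "msms",
    "psiblast": "psiblast",
    "hhblits": "hhblits",
    "cd-hit": "cd-hit",
    "dssp": "mkdssp",
}


def find_missing(paths):
    """Return the sorted names of tools whose executable cannot be located."""
    missing = []
    for name, exe in paths.items():
        # shutil.which searches PATH for bare names and checks
        # explicit paths directly; it returns None if nothing is found.
        if shutil.which(exe) is None:
            missing.append(name)
    return sorted(missing)


if __name__ == "__main__":
    missing = find_missing(bin_path)
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All executables found.")
```

Any tool reported as not found needs its entry corrected before the topographies can be computed.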
(1) The script 'protein.py' contains the class RBP, which integrates all procedures needed to convert an RBP into topographies.
(2) Calculating the topographies takes tens of minutes per protein, so we recommend using a parallel computing tool such as Slurm. The bash script 'prepare_all.slurm' works together with the Python script 'prepare_all.py' to extract topographies in parallel.
(3) Users can also run 'prepare_all.py' on its own to preprocess data. The files containing RBP IDs are in data/pdbid_chain, and the path of the PDB ID list should be specified in the two 'prepare_all' scripts.
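One common way to parallelize this kind of per-protein work is to split the ID list across the tasks of a Slurm array job. The sketch below illustrates that idea only; it is not how 'prepare_all.slurm' or 'prepare_all.py' are implemented. The helpers `read_ids`, `shard`, and `ids_for_this_task` are hypothetical, and the sketch assumes a zero-based array (for example `--array=0-9`), in which case Slurm sets the environment variables `SLURM_ARRAY_TASK_ID` and `SLURM_ARRAY_TASK_COUNT`.

```python
import os


def read_ids(list_path):
    """Read one ID per line from a list file, skipping blank lines."""
    with open(list_path) as handle:
        return [line.strip() for line in handle if line.strip()]


def shard(ids, task_id, task_count):
    """Return the IDs assigned to one task using a round-robin split."""
    return ids[task_id::task_count]


def ids_for_this_task(ids):
    """Pick this task's share of the IDs from the Slurm array variables.

    Outside of a Slurm array job the defaults (task 0 of 1) apply,
    so the full list is returned.
    """
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", 0))
    task_count = int(os.environ.get("SLURM_ARRAY_TASK_COUNT", 1))
    return shard(ids, task_id, task_count)
```

With this split, every ID is handled by exactly one task, and the tasks finish at roughly the same time when the proteins take similar time to process.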
To train a model from scratch, use the script 'train.py'. Specify the list of RBPs in default_config/dir_options:
python train.py
To predict new RNA-binding sites:
a. Set dir_opts['PDB_list_to_predict'] (in default_config) to the list file containing the PDB names (one name per line).
b. Move the PDB files to the folder dir_opts['raw_pdb_dir'].
Then execute:
a. python prepare_all.py
b. python predict.py
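The two setup steps for prediction (placing the PDB files and writing the list file) can be scripted. This is a minimal sketch under stated assumptions: the helper `stage_for_prediction` is hypothetical and not part of PST-PRNA, it copies rather than moves the files so the originals stay in place, and it writes each file's stem as the PDB name. If your list needs a different naming format (for example one that includes a chain identifier, as in the files under data/pdbid_chain), adjust the names accordingly.

```python
import shutil
from pathlib import Path


def stage_for_prediction(pdb_files, raw_pdb_dir, list_path):
    """Copy PDB files into raw_pdb_dir and write their names, one per line.

    raw_pdb_dir and list_path should match dir_opts['raw_pdb_dir'] and
    dir_opts['PDB_list_to_predict'] in default_config.
    """
    raw_dir = Path(raw_pdb_dir)
    raw_dir.mkdir(parents=True, exist_ok=True)

    names = []
    for src in map(Path, pdb_files):
        shutil.copy(src, raw_dir / src.name)
        # Assumption: the PDB name is the file name without its extension.
        names.append(src.stem)

    Path(list_path).write_text("".join(name + "\n" for name in names))
    return names
```

After staging, running 'prepare_all.py' followed by 'predict.py' proceeds as described above.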
PST-PRNA is released under an MIT License.
If you use this code, please cite it using the BibTeX entry in citation.bib.