Helix encoder: a compound-protein interaction prediction model specifically designed for class A GPCRs

Dependencies

Python = 3.7.10
PyTorch >= 1.2.0
NumPy = 1.20.2
RDKit = 2020.09.1
pandas = 1.3.4
Gensim >= 3.4.0

Setup

Clone the TransformerCPI repository.
Copy every file from this repository into the TransformerCPI directory.

Data

/csvData
- csv files of protein sequences, compound SMILES, and interaction data used in the experiments
/data
- Text data as input for mol_featurizer
- data format
  - A text file in which each line contains a compound SMILES string, the protein sequences, and the interaction label (0 or 1), in that order, separated by spaces. The sequences of the individual transmembrane regions and extracellular loop regions are likewise separated by spaces.

O=C(OCn1ncc(Br)c(Br)c1=O)c1c(F)cccc1F GLSVAASCLVVLENLLVLAAI LVNITLSDLLTGAAYLANVLL WFLREGLLFTALAASTFSLLF VYGFIGLCWLLAALLGMLPLL FCLVIFAGVLATIMGLYGAIF VLMILLAFLVCWGPLFGLLLA MDWILALAVLNSAVNPIIYSF 1
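The record layout above can be read back with a few lines of Python. This is a minimal sketch, not code from this repository: the names `Record` and `parse_record` are hypothetical, and it assumes that the first token is the SMILES string, the last token is the label, and every token in between is one protein region.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    """One line of the text input (hypothetical helper type)."""
    smiles: str
    segments: List[str]  # one sequence per transmembrane / loop region
    label: int           # 0 or 1


def parse_record(line: str) -> Record:
    """Split a space-separated line into SMILES, region sequences, and label."""
    tokens = line.split()
    if len(tokens) < 3:
        raise ValueError("expected a SMILES, at least one sequence, and a label")
    # First token is the compound, last token is the label,
    # everything in between is a protein region.
    smiles, *segments, label_text = tokens
    if label_text not in ("0", "1"):
        raise ValueError("label must be 0 or 1, got %r" % label_text)
    return Record(smiles, segments, int(label_text))


example = (
    "O=C(OCn1ncc(Br)c(Br)c1=O)c1c(F)cccc1F "
    "GLSVAASCLVVLENLLVLAAI LVNITLSDLLTGAAYLANVLL WFLREGLLFTALAASTFSLLF "
    "VYGFIGLCWLLAALLGMLPLL FCLVIFAGVLATIMGLYGAIF VLMILLAFLVCWGPLFGLLLA "
    "MDWILALAVLNSAVNPIIYSF 1"
)
record = parse_record(example)
print(record.smiles, len(record.segments), record.label)
```

For the example line, the parser yields seven region sequences and the label 1.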

/dataset
- Generated when cloning the TransformerCPI repository
- Directory where data embedded by mol_featurizer is stored

How to use

Embedding

Generate input for Helix encoder.
- python mol_featurizer_for_TM.py

Model training

Train the Helix encoder model.
- python helix_encoder_main.py
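The embedding and training commands can also be chained from one script so that training only starts if featurization succeeds. The sketch below is an illustration under assumptions: the helper names `build_commands` and `run_pipeline` are hypothetical, and it assumes both scripts are run from the TransformerCPI directory without extra arguments, exactly as written in this README.

```python
import subprocess
import sys
from typing import List

# Script names as given in this README, in the order they should run.
STEPS = ["mol_featurizer_for_TM.py", "helix_encoder_main.py"]


def build_commands(python_exe: str = sys.executable) -> List[List[str]]:
    """Return one argument list per step, ready for subprocess.run."""
    return [[python_exe, script] for script in STEPS]


def run_pipeline(repo_dir: str) -> None:
    """Run embedding, then training; stop at the first failing step."""
    for command in build_commands():
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, cwd=repo_dir, check=True)
```

Calling `run_pipeline("path/to/TransformerCPI")` (a placeholder path) would run both steps in order.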

Predict

A trained Helix encoder model (TM + ECL2) is included in this repository (/output/model/helixEncoder_TM_ECL2). To use it for prediction on your own data, follow these steps.

Place the data you want to predict in /data/.
Run mol_featurizer_for_TM.py so that the embedded vectors are written to /dataset/.
Run python predict.py to make the predictions.
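Your own compounds and sequences first need to be written in the text format described in the Data section. The following is a minimal sketch of that step, not part of this repository: `format_record` and `write_records` are hypothetical helpers, the output file name is left to the caller, and the default label of 0 is only a placeholder, on the assumption that the featurizer expects a label column even when the true interaction is unknown.

```python
from pathlib import Path
from typing import Iterable, Sequence, Tuple


def format_record(smiles: str, segments: Sequence[str], label: int = 0) -> str:
    """Join SMILES, region sequences, and label into one space-separated line."""
    fields = [smiles, *segments, str(label)]
    for field in fields:
        # Fields are space-separated, so they must be non-empty
        # and must not contain whitespace themselves.
        if not field or any(ch.isspace() for ch in field):
            raise ValueError("invalid field: %r" % field)
    return " ".join(fields)


def write_records(
    path: str, records: Iterable[Tuple[str, Sequence[str], int]]
) -> int:
    """Write one line per record and return the number of lines written."""
    lines = [format_record(s, segs, label) for s, segs, label in records]
    Path(path).write_text("\n".join(lines) + "\n")
    return len(lines)
```

The resulting file can then be placed in /data/ before running the featurizer.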

Citation

If you use this code, please cite the following paper:

@ARTICLE{10.3389/fbinf.2023.1193025,
    AUTHOR={Yamane, Haruki and Ishida, Takashi},
    TITLE={Helix encoder: a compound-protein interaction prediction model specifically designed for class A GPCRs},
    JOURNAL={Frontiers in Bioinformatics},
    VOLUME={3},
    YEAR={2023},
    URL={https://www.frontiersin.org/articles/10.3389/fbinf.2023.1193025},
    DOI={10.3389/fbinf.2023.1193025},
    ISSN={2673-7647},
}

Repository: Haru38/HelixEncoder
