DrugANNs

Global AI Challenge solution.

Overall pipeline:

preprocessed the data (removed unneeded parts of molecules)
generated Morgan, MACCS and EState fingerprints
applied the MolCLR graph neural network
applied a random forest to the fingerprints described above
merged and averaged the models' results
passed the averaged results to the Lipinski rule checker
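
The last two pipeline steps can be illustrated with a minimal sketch. It is not the code from this repository: the function names, the `Descriptors` container and the one-violation tolerance are assumptions made for illustration, and the descriptors are taken as already computed (for example with RDKit). The thresholds are the standard Lipinski rule-of-five limits. How the rule check is combined with the scores in the actual notebooks is not stated here, so the sketch only reports whether a molecule passes.

```python
from dataclasses import dataclass
from typing import List, Sequence


def average_predictions(*model_scores: Sequence[float]) -> List[float]:
    """Element-wise mean of per-molecule scores from several models."""
    if not model_scores:
        raise ValueError("need scores from at least one model")
    n_molecules = len(model_scores[0])
    if any(len(scores) != n_molecules for scores in model_scores):
        raise ValueError("all models must score the same molecules")
    return [sum(column) / len(model_scores) for column in zip(*model_scores)]


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    mol_weight: float   # molecular weight in daltons
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five limits are exceeded."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def passes_lipinski(d: Descriptors, max_violations: int = 1) -> bool:
    """Assumption: tolerate at most one violation, a common convention."""
    return lipinski_violations(d) <= max_violations


if __name__ == "__main__":
    # Illustrative scores only, not results from the challenge.
    rf_scores = [0.75, 0.25, 0.5]
    molclr_scores = [0.25, 0.75, 1.0]
    print(average_predictions(rf_scores, molclr_scores))
    print(passes_lipinski(Descriptors(650.0, 6.1, 2, 8)))
```

Averaging is done per molecule, so both models must score the molecules in the same order.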

Repository structure

notebooks - contains all the notebooks which were used during the analysis
data - folder with all the data we used
MolCLR - directory with MolCLR model
YouGraphRF - directory with random forest model

Model running

Run data_preprocessing.ipynb to produce canonical SMILES.
Run ogb-rdk-transform.ipynb to get the preprocessed dataset.
Go to YouGraphRF and run python random_forest.py --smiles_file ... --smiles_test_file ...
Take the predictions from rf_preds/rf_final_pred.npy.
Go to MolCLR.
Place the preprocessed molecule data in data/covid/COVID.csv and data/covid/COVID-test.csv for the train and test subsets, respectively.
Run python finetune_contrast.py
Finally, run predict-molclr.ipynb. Change the model path to point to your checkpoint, or use the checkpoint from the submission, which is in the finetune folder.
Pass the final predictions to lipinski_rule_application.ipynb.
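
Because the steps above hand files from one stage to the next, a small check can catch a skipped step. The sketch below is a hypothetical helper, not part of this repository. It assumes the paths in the steps are relative to the YouGraphRF and MolCLR directories, which the "Go to ..." instructions suggest but do not state outright.

```python
from pathlib import Path
from typing import Dict, List

# Files named in the steps above, grouped by the stage that involves them.
# Placing them under YouGraphRF/ and MolCLR/ is an assumption.
EXPECTED_FILES: Dict[str, List[str]] = {
    "random forest output": ["YouGraphRF/rf_preds/rf_final_pred.npy"],
    "MolCLR fine-tuning input": [
        "MolCLR/data/covid/COVID.csv",
        "MolCLR/data/covid/COVID-test.csv",
    ],
}


def missing_files(repo_root: str) -> Dict[str, List[str]]:
    """Return, per stage, the expected files that do not exist yet."""
    root = Path(repo_root)
    report: Dict[str, List[str]] = {}
    for stage, paths in EXPECTED_FILES.items():
        absent = [p for p in paths if not (root / p).is_file()]
        if absent:
            report[stage] = absent
    return report


if __name__ == "__main__":
    # Run from the repository root; prints nothing if every file is present.
    for stage, paths in missing_files(".").items():
        print(f"{stage}: missing {', '.join(paths)}")
```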

Requirements

The requirements are listed in the requirements.txt file.
