DanQ is a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences. This repository is a reimplementation in TensorFlow 2.0.
CNN + BiLSTM + Dense
Binary Cross Entropy
Adam
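To make the CNN + BiLSTM + Dense stack more concrete, here is a small sketch that traces tensor shapes through such a network. It uses plain Python only and does not build a real model. All hyperparameters (sequence length 1000, 320 kernels of width 26, pooling size 13, 320 LSTM units per direction, 925 dense units, 919 outputs) are assumptions taken from the original DanQ paper, not values confirmed by this repository, and the function names are hypothetical.

```python
# Shape walk-through for a CNN + BiLSTM + Dense stack.
# Every constant below is an assumption based on the original DanQ paper;
# the code in this repository may use different values.

SEQ_LEN = 1000       # input length in base pairs, one-hot encoded (4 channels)
CONV_FILTERS = 320   # number of convolution kernels
CONV_KERNEL = 26     # kernel width, "valid" padding, stride 1
POOL_SIZE = 13       # max-pool window, also used as the stride
LSTM_UNITS = 320     # units per direction in the bidirectional LSTM
DENSE_UNITS = 925    # hidden dense layer
NUM_TARGETS = 919    # sigmoid outputs, one per chromatin feature


def valid_output_length(length, window, stride=1):
    """Output length of a 1D convolution or pooling layer without padding."""
    return (length - window) // stride + 1


def trace_shapes():
    """Return (layer_name, per-sample output shape) pairs for the stack."""
    conv_len = valid_output_length(SEQ_LEN, CONV_KERNEL)
    pool_len = valid_output_length(conv_len, POOL_SIZE, stride=POOL_SIZE)
    return [
        ("input", (SEQ_LEN, 4)),
        ("conv1d", (conv_len, CONV_FILTERS)),
        ("max_pool", (pool_len, CONV_FILTERS)),
        # Forward and backward LSTM outputs are concatenated per time step.
        ("bi_lstm", (pool_len, 2 * LSTM_UNITS)),
        ("flatten", (pool_len * 2 * LSTM_UNITS,)),
        ("dense", (DENSE_UNITS,)),
        ("output_sigmoid", (NUM_TARGETS,)),
    ]


if __name__ == "__main__":
    for name, shape in trace_shapes():
        print(f"{name:15s} {shape}")
```

Because each output is an independent yes/no label, a sigmoid output layer trained with binary cross entropy is the natural pairing for this kind of multi-label task.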
We ran training on Ubuntu 18.04 LTS with a GTX 1080 Ti GPU.
Python (3.7.3) | TensorFlow (2.0.0) | CUDA (10.0) | cuDNN (7.6.0)
First, download the training, validation, and testing sets from DeepSEA. You can download the datasets from here. After you have extracted the contents of the tar.gz file, move the three .mat files into the data/ folder.
A model that I trained myself is available on Baidu Net Disk here.
Because my RAM is limited, I first convert the train.mat file into .tfrecord files.
python preprocess.py
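The idea behind splitting a large training set into several record files can be sketched as below. This is not the content of preprocess.py, which is not shown here; it only illustrates how a sample range might be cut into shards so that one shard at a time fits in memory. The function names, the shard size, and the file naming pattern are all hypothetical.

```python
# Illustrative sharding helper; names and file pattern are hypothetical
# and are not taken from preprocess.py.

def shard_ranges(num_samples, shard_size):
    """Yield (shard_index, start, stop) with at most shard_size samples each."""
    if shard_size <= 0:
        raise ValueError("shard_size must be positive")
    for index, start in enumerate(range(0, num_samples, shard_size)):
        # The last shard may be shorter than shard_size.
        yield index, start, min(start + shard_size, num_samples)


def shard_filename(index, prefix="train"):
    """Build a zero-padded file name for one shard."""
    return f"{prefix}-{index:05d}.tfrecord"


if __name__ == "__main__":
    for index, start, stop in shard_ranges(num_samples=10, shard_size=4):
        print(shard_filename(index), start, stop)
```

Each (start, stop) slice would then be read from the .mat file, serialized, and written to its own file before the next slice is loaded.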
Then you can train the model.
CUDA_VISIBLE_DEVICES=0 python main_DanQ.py -e train
Once training has finished, you can evaluate the model.
CUDA_VISIBLE_DEVICES=0 python main_DanQ.py -e test
You can find the results in the ./result/ directory.
For DanQ:
For DanQ-JASPAR:
We use two metrics to evaluate the model: AUROC (area under the ROC curve) and AUPR (area under the precision-recall curve).
For DanQ:
| Metric | DNase | TFBinding | HistoneMark | All |
|---|---|---|---|---|
| AUROC | 0.9022 | 0.9317 | 0.8303 | 0.9162 |
| AUPR | 0.4072 | 0.2984 | 0.3373 | 0.3176 |
For DanQ-JASPAR:
| Metric | DNase | TFBinding | HistoneMark | All |
|---|---|---|---|---|
| AUROC | 0.9124 | 0.9451 | 0.8395 | 0.9287 |
| AUPR | 0.4323 | 0.3271 | 0.3508 | 0.3441 |
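For readers unfamiliar with the two metrics in the tables above, here is a minimal pure-Python sketch of how AUROC and AUPR can be computed for a single binary label. It is an illustration only: it is not the evaluation code of this repository, and AUPR is computed here as step-wise average precision, which is an assumption (a trapezoidal area under the precision-recall curve gives slightly different numbers). The function names are hypothetical.

```python
# Minimal AUROC and AUPR (average precision) for one binary label.
# Labels must be 0 or 1; scores are predicted probabilities.

def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney U) formula with tie handling."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    pairs = sorted(zip(scores, labels))
    rank_sum = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        # Tied scores occupy ranks i+1 .. j and share their average rank.
        average_rank = (i + 1 + j) / 2.0
        rank_sum += average_rank * sum(label for _, label in pairs[i:j])
        i = j
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive")
    pairs = sorted(zip(scores, labels), reverse=True)
    true_pos = 0
    total = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        group_pos = sum(label for _, label in pairs[i:j])
        true_pos += group_pos
        # Precision at this threshold, weighted by the gain in recall.
        total += (group_pos / n_pos) * (true_pos / j)
        i = j
    return total


if __name__ == "__main__":
    y_true = [0, 0, 1, 1]
    y_score = [0.1, 0.4, 0.35, 0.8]
    print(auroc(y_true, y_score), average_precision(y_true, y_score))
```

AUPR is far more sensitive to class imbalance than AUROC, which is one reason a model can score above 0.9 AUROC while its AUPR stays much lower on sparse labels.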
DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences | Github