This repository has been archived by the owner on Nov 19, 2024. It is now read-only.
Hello, I appreciate your excellent work. I am also attempting to train this model on a larger dataset and have started with the CrossDocked2020-v1.3 dataset, filtered to RMSD < 2 Å. This dataset already includes clustered training and test splits, so I'm unsure why MMseqs2 clustering is needed. Could you please explain the reason for this step?
Additionally, the CrossDocked2020 dataset provides several types of docked poses, such as AutoDock Vina docked poses of ligands in the receptor and the first and second iterations of CNN-optimized poses. Which of these did you use in your training process?
Lastly, I noticed that the .PDB files in your training dataset are smaller than those in the CrossDocked2020 dataset. Did you perform any extra processing steps to obtain these smaller receptor files?
Thank you in advance for your time and assistance!
I noticed this as well. I believe the AutoDock Vina data was not used, and I also found that the files in the training dataset are smaller than those in the CrossDocked2020 dataset. How did you handle this dataset in the end?