Low hom_ref recall during model training for new organism #904
Comments
Hi @DanJeffries, two questions for you: 1. Have you already trained a model to the end, and completed a variant calling + hap.py evaluation? 2. How good are your truth set and confident regions?
Hi @pichuan, thanks for the quick response! Regarding your questions:
I'll post back here once I have explored these points further. Thanks! Dan
Thank you for your update!
Dear Devs,
I am currently training a model (starting from wgs.1.6.1) for use in a fish species. The programs are running well, I have confident regions and truth variants defined, and am currently tuning hyperparameters to optimise the training.
However, when tracking the model eval stats (specifically F1, precision, and recall), I notice that the hom_ref classifications are much less reliable than the hom_alt and het classes. My question is whether this is to be expected, or whether there might be something wrong with my training setup or with the examples.
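For reference, here is a minimal sketch of how per-class precision, recall, and F1 are usually defined for a three-class genotype classifier. This is not DeepVariant's own evaluation code; the function name `per_class_metrics` and the toy labels are made up for illustration, and the mapping of class 1 to het and class 2 to hom_alt is an assumption (only class 0 = hom_ref is stated in this issue).

```python
from collections import Counter

# Assumed label mapping; only 0 = hom_ref is confirmed in this thread.
CLASS_NAMES = {0: "hom_ref", 1: "het", 2: "hom_alt"}


def per_class_metrics(y_true, y_pred, classes=(0, 1, 2)):
    """Compute one-vs-rest precision, recall and F1 for each class."""
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    metrics = {}
    for c in classes:
        tp = pairs[(c, c)]
        # Examples of class c that were predicted as something else.
        fn = sum(n for (t, p), n in pairs.items() if t == c and p != c)
        # Examples of other classes that were predicted as class c.
        fp = sum(n for (t, p), n in pairs.items() if t != c and p == c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[c] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Toy labels (hypothetical): half of the true hom_ref examples are called het.
y_true = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 1, 1, 2, 2, 2]

results = per_class_metrics(y_true, y_pred)
for c, m in results.items():
    print(CLASS_NAMES[c], m)
```

In this toy case hom_ref recall is 0.5 while its precision is 1.0, and the missed hom_ref examples show up as lower precision for het. So low recall for class 0 means true hom_ref examples are being assigned to one of the other classes.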
The test example set I am using to tune the hyperparams looks like this:
The training command looks like this:
During other tests I have run training jobs with several other example sets (several times larger), for tens of thousands of steps and multiple epochs, and also using different learning rates and batch sizes. While these things of course make a difference to learning performance, the lower recall for class 0 (hom_ref) remains consistent.
Here are some lines from the log file during one such training run:
Thanks in advance for your help!
Dan