-
Hello,

As I am trying to use Topaz more, I figured I should take a closer look at the training results, so I decided to systematically plot the log file in addition to inspecting the picks visually. But I don't know much about machine learning, so I am not sure how to interpret these plots. Here is an attempt below; I would appreciate any comments. Thank you in advance. 😃

Loss
As I understand it, this metric should go down during training, so the plot of the loss looks OK to me.

GE penalty
Should this metric also go down (or stay down)? It looks like it does anyway.

Precision
This one takes values between 0 and 1. It will always approach 1 for the training set, because this is one of the quantities being optimized during training. What we want is for it to also approach 1 for the test set (micrographs and coordinates never seen in training, used to assess how well the model predicts on new data). By this metric, this training run doesn't look very good. Or could the low precision measured on the test set result from sparsely picked micrographs? (I mean sparsely picked by me when preparing the training set, so the model likely picked a lot more particles than were labeled in the test set.)

True/false positive rate
Both take values between 0 and 1. We want the true positive rate to be high and the false positive rate to be low.

Test set
By this metric, it looks to me that my trained model doesn't perform much better than the built-in one.

Training set
(plot only)

Area under precision/recall curve
Based on this article: https://en.wikipedia.org/wiki/Precision_and_recall
Both precision and recall vary in a range from 0 to 1, so when plotting one versus the other, the resulting curve can have a maximum area of 1. Training is optimal when the area under this curve approaches 1. It looks like this training run is far from optimal, unless in practice it's normal for this metric to stay well below 1 on the test set?
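For context, here is a minimal sketch of how such plots can be made from the log, assuming the table written by topaz train is whitespace-separated with an epoch column, a split column (train/test), and one column per metric; the exact column names may differ between Topaz versions, and the file name below is just a placeholder:

```python
# Rough sketch: plot per-epoch training/test metrics from a Topaz training log.
# Assumption: the log is a whitespace-separated table with columns 'epoch' and
# 'split' plus some subset of the metric columns listed below.
import pandas as pd
import matplotlib.pyplot as plt

log = pd.read_csv("training_log.txt", sep=r"\s+")

metrics = [c for c in ("loss", "ge_penalty", "precision", "tpr", "fpr", "auprc")
           if c in log.columns]

fig, axes = plt.subplots(1, len(metrics), figsize=(4 * len(metrics), 3), squeeze=False)
for ax, metric in zip(axes[0], metrics):
    for split, group in log.groupby("split"):
        # average over iterations within each epoch to smooth the curves
        per_epoch = group.groupby("epoch")[metric].mean()
        ax.plot(per_epoch.index, per_epoch.values, label=split)
    ax.set_xlabel("epoch")
    ax.set_ylabel(metric)
    ax.legend()
fig.tight_layout()
plt.show()
```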
-
An important thing to note for precision and AUPR/average-precision scores: these only go to 1 for a perfect classifier if all of the ground truth particles are labeled! Therefore, we should not expect this to go to 1. Here's a rough example:

Let A be the number of ground truth positives, let B be the number of predicted positives, and let TP be the number of true positives, that is, the number of ground truth positives that are also predicted positives (A ∩ B). The precision is TP/B. If all of the predicted positives are ground truth positives, then precision = 1.

Now, imagine that A is incompletely labeled. What if, instead of having all ground truth positives, A, we only have a labeled subsample, A'? Given a perfect predictor (i.e. B = A), the measured true positives TP' are just A', because we predict all of A but only A' is labeled. This means the measured precision is TP'/B = A'/A! Therefore, a perfect predictor would only achieve a precision of A'/A, which is the fraction of ground truth positives that are labeled. The AUPR is similarly upper bounded.
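To make this concrete, here is a tiny numerical sketch of the same argument (the numbers are made up for illustration):

```python
# Toy illustration of the precision ceiling with incompletely labeled ground truth.
# Suppose a test micrograph really contains 1000 particles (A), but only 200 of
# them were labeled when building the test set (A').  A perfect picker returns
# all 1000 true particles (B = A).
A = 1000          # true ground truth positives
A_labeled = 200   # labeled subsample A'
B = 1000          # predictions of a perfect picker (B = A)

TP_measured = A_labeled              # only labeled particles count as true positives
precision_measured = TP_measured / B
print(precision_measured)            # 0.2 == A'/A, even though the picker is perfect
```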
-
So, it looks like I was not downscaling the micrographs enough. I used a factor of 4 for the training run described above. When I used a factor of 10 (and a few more particles labeled in the training set too), I got much better tpr/fpr and auprc.
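For anyone hitting the same issue, the effect of the downsampling factor on the apparent particle size can be sanity-checked with a couple of lines. The pixel size and particle diameter below are hypothetical, not the values from this dataset:

```python
# Illustrative arithmetic only (made-up numbers): how the downsampling factor
# changes the particle size in pixels seen by the picker.
pixel_size_A = 1.2         # raw pixel size in Angstrom/pixel (hypothetical)
particle_diameter_A = 180  # particle diameter in Angstrom (hypothetical)

for factor in (1, 4, 10):
    diameter_px = particle_diameter_A / (pixel_size_A * factor)
    print(f"downsample x{factor}: particle ~{diameter_px:.0f} px across")
# With a factor of 4 the particle is still quite large in pixels; a factor of
# 10 brings it down to a size that the default Topaz model (which has a fixed
# receptive field) appears to handle better.
```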