'Recent' metrics: Should Metrics be Link-able? #498
Given that in many models the influence of older data tends to lessen, shouldn't we be able to apply the same logic to our metrics? For instance, the Accuracy of a model is somewhat interesting, but the recent accuracy is more interesting than the total accuracy. I thus thought I'd be able to do something like
to get a measure of recent accuracy, but alas, Accuracy is not a Link. So I'm reduced to doing a Rolling Accuracy over an arbitrarily chosen 'recent' sample size. Is there something better? Am I Doing It Wrong?
Replies: 2 comments 1 reply
To calculate the accuracy with an exponential average I would do it like this:
from river import stats
metric = stats.EWMean(alpha=0.5)
y_pred = [1, 1, 1, 1, 1]
y_true = [0, 0, 1, 1, 1]
for pred, true in zip(y_pred, y_true):
    metric.update(pred == true)
print(metric.get())
1.0
from river import metrics
metric = metrics.Rolling(metrics.Accuracy(), window_size=3)
y_pred = [1, 1, 1, 1, 1]
y_true = [0, 0, 1, 1, 1]
for pred, true in zip(y_pred, y_true):
    metric.update(y_pred=pred, y_true=true)
Rolling of size 3 Accuracy: 100%
Raphaël
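For readers who want to see what the two snippets compute without installing the library, here is a minimal plain-Python sketch of both estimators on the same data. The function names `ew_accuracy` and `rolling_accuracy` are hypothetical, and starting the exponential average at 0 is an assumption; a library may initialise it differently, so the exponential value here need not match the 1.0 printed above.

```python
from collections import deque


def ew_accuracy(y_true, y_pred, alpha=0.5):
    """Exponentially weighted accuracy: each new hit or miss gets weight alpha."""
    mean = 0.0  # assumption: start at 0; real libraries may initialise differently
    for true, pred in zip(y_true, y_pred):
        correct = float(pred == true)
        mean = alpha * correct + (1 - alpha) * mean
    return mean


def rolling_accuracy(y_true, y_pred, window_size=3):
    """Accuracy over the last `window_size` predictions only."""
    window = deque(maxlen=window_size)  # older entries fall off automatically
    for true, pred in zip(y_true, y_pred):
        window.append(pred == true)
    # Guard against an empty stream to avoid dividing by zero.
    return sum(window) / len(window) if window else 0.0


y_pred = [1, 1, 1, 1, 1]
y_true = [0, 0, 1, 1, 1]

print(ew_accuracy(y_true, y_pred))       # 0.875
print(rolling_accuracy(y_true, y_pred))  # 1.0
```

On this toy data the rolling window only sees the last three predictions, all correct, while the exponential average still carries some weight from the two early mistakes.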
Hey there. As @raphaelsty suggests (and it seems you already know), you can wrap a metric so that it is computed over a rolling window.
We could in theory allow you to compute exponentially weighted averages for metrics that are decomposable, but how would that be interpreted? I find it really confusing to say that the "exponentially weighted accuracy is equal to x". By contrast, communicating the accuracy over a window of fixed size is much easier to interpret. What do you think?
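On the interpretation question, one way to read an exponentially weighted accuracy is as a weighted average of per-sample hits and misses, which is presumably what "decomposable" refers to here. The sketch below lists the weight given to each of the most recent samples; the helper name `ew_weights` is hypothetical and a zero-initialised average is assumed. A fixed window would instead give 1/N to each of the last N samples and nothing to older ones.

```python
def ew_weights(n, alpha):
    """Weights of the last n samples, most recent first: alpha * (1 - alpha) ** k."""
    return [alpha * (1 - alpha) ** k for k in range(n)]


weights = ew_weights(5, alpha=0.5)
print(weights)  # [0.5, 0.25, 0.125, 0.0625, 0.03125]

# Per-sample correctness, oldest first (1 = correct, 0 = wrong).
correct = [0, 0, 1, 1, 1]

# Pair the largest weight with the most recent sample.
ew = sum(w * c for w, c in zip(weights, reversed(correct)))
print(ew)  # 0.875
```

Read this way, "exponentially weighted accuracy is x" means: the latest prediction counts for a share alpha of the score, the one before it for alpha * (1 - alpha), and so on.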
> To calculate the accuracy with an exponential average I would do it like this:
> 1.0
> So I'm reduced to doing a Rolling Accuracy over an arbitrarily chosen 'recent' sample size.
You will choose an equally arbitrary alpha coefficient for the exponential average. Personally, I find the RollingAccuracy more easily interpretable.
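To compare the two knobs, a window size can be translated into an alpha. The sketch below assumes the common "span" convention alpha = 2 / (N + 1), which is used, for example, by pandas' `ewm`; this convention is not from the thread, and the helper names are hypothetical. It also computes how much of the total weight an exponential average places on the last n samples, assuming the average has been running long enough that its initial value no longer matters.

```python
def alpha_from_span(span):
    """Assumed convention: alpha = 2 / (span + 1)."""
    return 2 / (span + 1)


def mass_in_last(n, alpha):
    """Share of total weight on the last n samples: 1 - (1 - alpha) ** n."""
    return 1 - (1 - alpha) ** n


alpha = alpha_from_span(3)
print(alpha)                   # 0.5
print(mass_in_last(3, alpha))  # 0.875
```

Under that convention a window of 3 corresponds to alpha = 0.5, and the exponential average then puts 87.5% of its weight on the last 3 samples rather than 100%. Either way a single parameter decides how much history counts.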