Reasoning behind choosing the same bucket length for different buckets in adwin #1477
Hi @LorenzoCutrupi, thanks for reporting. This does indeed look odd. I checked the original MOA code and the same thing happens there. I was not involved in the original port from MOA, nor in the conversion to Cython, so I will need to study it a bit more to understand the reason, if there is one. A possible explanation is that ADWIN organizes the buckets by levels and follows the exponential histogram idea, where all buckets at the same level have the same capacity and the capacity doubles from one level to the next (1, 2, 4, and so on). Maybe that is the reasoning, but I can't say for sure.
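To make the exponential histogram idea concrete, here is a minimal pure-Python sketch. It is not the River or MOA implementation: `ToyExponentialHistogram`, `bucket_capacity`, and `max_buckets_per_level` are hypothetical names chosen for illustration, and the sketch leaves out ADWIN's variance tracking and window shrinking. It only shows why the length of a bucket can be looked up from its level alone, regardless of which bucket inside that level is being read.

```python
def bucket_capacity(level: int) -> int:
    """Number of elements summarized by any bucket stored at `level`."""
    return 2 ** level


class ToyExponentialHistogram:
    """Toy structure: levels[i] holds bucket sums, oldest first."""

    def __init__(self, max_buckets_per_level: int = 2):
        self.max_buckets = max_buckets_per_level
        self.levels = [[]]

    def add(self, value: float) -> None:
        # A new element always enters as a capacity-1 bucket at level 0.
        self.levels[0].append(value)
        level = 0
        # When a level overflows, merge its two oldest buckets into one
        # bucket at the next level, which therefore has twice the capacity.
        while len(self.levels[level]) > self.max_buckets:
            if level + 1 == len(self.levels):
                self.levels.append([])
            oldest = self.levels[level].pop(0)
            second_oldest = self.levels[level].pop(0)
            self.levels[level + 1].append(oldest + second_oldest)
            level += 1

    def width(self) -> int:
        # Every bucket at level i contributes the same length, 2**i,
        # so only the level index is needed, not the bucket position.
        return sum(
            len(buckets) * bucket_capacity(i)
            for i, buckets in enumerate(self.levels)
        )


hist = ToyExponentialHistogram(max_buckets_per_level=2)
for v in range(1, 11):
    hist.add(float(v))
print(hist.width())
```

Under these assumptions, merging never changes the total number of elements covered, so `width()` always equals the number of values added, even though individual bucket lengths are never stored.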
You can also check a pure Python version I implemented a while ago if you want. I did not follow the Java code for this one, just the paper. We considered bringing it to River to replace the Cython version, as none of the current maintainers are very familiar with the current Java-ish code.
I was looking at the code of adwin, and I noticed something I didn't expect in a method of the adwin_c.pyx (for simplicity I'll put it below):
In particular I have doubts about the lines:
Why is the index the same if we want to pick the lengths of different buckets?