Implementation of the KNN-ICAD (KNN Inductive Conformal Anomaly Detection Algorithm) #1441
That's not true, we have EmpiricalPrecision :) I'd really rather not add NumPy-based implementations to River. It's just not our philosophy. This could go in river-extra though. In this case in particular, PySAD's implementation already supports mini-batches, so I don't see a lot of value in adding this to River. @hoanganhngo610 I encourage you to ask before implementing stuff. I don't want you to spend time polishing methods that might not be suited to River.
@MaxHalford I initially thought that Although PySAD has had this within its ecosystem, I really want to have one in our ecosystem to conduct any anomaly benchmarking, particularly when I was intending to implement more algorithms to the
You're right, it's just the inverse covariance matrix. But it could be generalized.
If the goal is to make a benchmark, what I would do is create a separate repository for those benchmarks. You could create a wrapper in there to unify PySAD and River under the same API. I don't think porting their code to River just for the sake of benchmarking is a good idea. Every model we add to River is one more model we need to maintain, and we already have a lot to maintain. For instance, we have unresolved issues for clustering methods that have been open for months. Anyway, I think if PySAD is an active project, there isn't any justification for adding their stuff to River, especially if it implies using NumPy. I would do it the other way round: make some benchmarks; if a model from PySAD really stands out, then yes, maybe let's add it to River. But let's please not add stuff to River for the sake of it. I'm sorry to be contrarian here, but I need to make sure we don't add too much stuff to River. Our users don't care too much if we have a lot of models. They just want a few models that work well.
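A minimal sketch of what such a wrapper could look like in a separate benchmark repository. It assumes the wrapped model exposes `fit_partial` and `score_partial` (the method names PySAD uses) and presents it through River-style `learn_one` / `score_one` calls. `PartialFitAdapter` and `RunningMeanDetector` are hypothetical names made up for this illustration; the stand-in detector only exists so the sketch runs without PySAD installed.

```python
from typing import Any, Callable, Dict, Optional


class PartialFitAdapter:
    """Hypothetical adapter: wraps a model with fit_partial/score_partial
    and exposes River-style learn_one/score_one on feature dicts."""

    def __init__(
        self,
        model: Any,
        to_vector: Optional[Callable[[Dict[str, float]], Any]] = None,
    ) -> None:
        self.model = model
        # Default conversion: feature values ordered by feature name.
        # With a real PySAD model you would likely pass a converter that
        # returns a NumPy array instead (assumption, not verified here).
        self.to_vector = to_vector or (lambda x: [x[key] for key in sorted(x)])

    def learn_one(self, x: Dict[str, float]) -> "PartialFitAdapter":
        self.model.fit_partial(self.to_vector(x))
        return self

    def score_one(self, x: Dict[str, float]) -> float:
        return float(self.model.score_partial(self.to_vector(x)))


class RunningMeanDetector:
    """Stand-in detector for the demo: scores a sample by its absolute
    distance from the running mean of the first feature."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0

    def fit_partial(self, X, y=None) -> "RunningMeanDetector":
        self.n += 1
        self.mean += (X[0] - self.mean) / self.n
        return self

    def score_partial(self, X) -> float:
        return abs(X[0] - self.mean)


detector = PartialFitAdapter(RunningMeanDetector())
for value in [1.0, 1.0, 1.0]:
    detector.learn_one({"value": value})

print(detector.score_one({"value": 1.0}))  # 0.0
print(detector.score_one({"value": 4.0}))  # 3.0
```

With an adapter like this, the benchmark loop can treat every detector the same way regardless of which library it comes from, and nothing has to be ported into River.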
@MaxHalford I totally understand your point. In that case, I will close the PR for now!
Thanks a lot for your understanding, Hoang
KNN-ICAD is a conformalized density- and distance-based anomaly detection algorithm for one-dimensional time-series data. It combines three components: a feature extraction method, a score that assesses whether a new observation differs significantly from previously observed data, and a probabilistic interpretation of this score based on the conformal paradigm.
This implementation is adapted from the implementation within PySAD (Python Streaming Anomaly Detection) and NAB (Numenta Anomaly Benchmark).
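The three components can be sketched in plain Python as follows. This is an illustration under simplifying assumptions, not the code from this PR, PySAD, or NAB: features are overlapping windows of the series, the nonconformity score is the sum of Euclidean distances to the k nearest training windows (Euclidean distance is swapped in here to keep the sketch dependency-free), and the conformal step turns that score into a p-value against a held-out calibration set. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def embed(series: Sequence[float], dim: int) -> List[List[float]]:
    """Feature extraction: overlapping windows of length `dim`."""
    return [list(series[i:i + dim]) for i in range(len(series) - dim + 1)]


def knn_score(x: Sequence[float], reference: List[List[float]], k: int) -> float:
    """Nonconformity: sum of distances from x to its k nearest reference windows."""
    distances = sorted(math.dist(x, r) for r in reference)
    return sum(distances[:k])


def conformal_anomaly_score(
    x: Sequence[float],
    training: List[List[float]],
    calibration_scores: List[float],
    k: int,
) -> float:
    """Return 1 - p, where p is the fraction of calibration scores that are
    at least as large as the score of the new window."""
    score = knn_score(x, training, k)
    p_value = sum(1 for s in calibration_scores if s >= score) / len(calibration_scores)
    return 1.0 - p_value


# Toy series that alternates 0, 1, 0, 1, ...
series = [float(i % 2) for i in range(40)]
windows = embed(series, dim=3)
training, calibration = windows[:20], windows[20:]

# Calibration scores are computed once against the training windows.
calibration_scores = [knn_score(w, training, k=2) for w in calibration]

print(conformal_anomaly_score([0.0, 1.0, 0.0], training, calibration_scores, k=2))  # 0.0
print(conformal_anomaly_score([5.0, 5.0, 5.0], training, calibration_scores, k=2))  # 1.0
```

A window that matches the learned pattern gets a score of 0.0, while one far from every training window gets 1.0. The output is a calibrated value in [0, 1] rather than a raw distance, which is what the conformal step contributes.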
This implementation relies heavily on numpy for the following reasons:
River currently does not provide a utility to calculate the inverse of a matrix, which is crucial in the calculation of the