
Detector components (costs, scores etc.) as classes #24

Open. Wants to merge 204 commits into main.
Conversation

@Tveten (Collaborator) commented Oct 20, 2024

Goal: Unify the detector components. Make them safer. Make the extension pattern simpler and clearer.

See #23 for discussions.

New features:

  • The BaseIntervalEvaluator class, inheriting from sktime.BaseEstimator. Public methods:

    • fit(self, X, y=None) -> self
    • evaluate(self, intervals: ArrayLike) -> np.ndarray
  • Four base subclasses, each inheriting from BaseIntervalEvaluator:

    • skchange.costs.BaseCost. Expects 2 columns in intervals: start, end.
    • skchange.change_scores.BaseChangeScore. Expects 3 columns in intervals: start, split, end.
    • skchange.anomaly_scores.BaseSaving. Expects 2 columns in intervals: start, end.
    • skchange.anomaly_scores.BaseLocalAnomalyScore. Expects 4 columns in intervals: outer_start, inner_start, inner_end, outer_end.
  • Classes for automatically converting costs to any of the three other score classes.

  • Convenience functions allowing either costs or an appropriate score to be used as input to all the detectors.

All existing functionality is reimplemented within the new design, plus the additions listed above.
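To make the intended API concrete, here is a minimal usage sketch. The cost class name (L2Cost) and import path are assumptions for illustration; only the fit/evaluate signatures and the two-column interval convention come from the list above:

    import numpy as np

    from skchange.costs import L2Cost  # assumed name and import path

    X = np.random.default_rng(0).normal(size=(200, 2))

    # fit(self, X, y=None) -> self
    cost = L2Cost().fit(X)

    # BaseCost expects 2 columns in `intervals`: start, end.
    intervals = np.array([[0, 100], [50, 150]])
    values = cost.evaluate(intervals)  # np.ndarray, one row per interval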

Tveten and others added 30 commits October 17, 2024 21:45. Commit messages include:

  • Leave it to the numba compiler for now. Checking it in a good way is complicated due to the generic classes such as CostBasedChangeScore.
  • Decoupled from numba and the detectors, as opposed to other suggested designs.
  • Error in previous commit.
  • Much better coverage of all costs used as scores in the detectors.
  • None means that the min_size is unknown, for example until it is fitted.
  • Necessary for setting good default penalties and thresholds.
  • Tag won't pass sktime conformance test.
@johannvk (Contributor) commented:

> Naming decisions to be made before merge:
>
> 1. `BaseIntervalEvaluator`. "Interval" or "Segment" or something else as the middle term? The point is to highlight that this class evaluates a function over intervals/slices/segments of the data. This is what enables strong computational performance compared to evaluating over arbitrary subsets. BaseSegmentEvaluator could be confused with a class trying to evaluate a segmentation.
>
> 2. The argument name to `evaluate`. Options:
>    a) `intervals` (current): Fits the natural language "evaluate over intervals". Could be misleading, as the inputs may include splitting information between the first and last column.
>    b) `segments`: Same as a), but in case the base class is renamed `BaseSegmentEvaluator`.
>    c) `splitter` or `splits`: Highlights the possible splitting information. Perhaps the most general option. Could be confusing, as the data isn't actually split into two or more parts whose union contains all the data.
>
> I'd be very happy for feedback and opinions here, @fkiraly and @johannvk.

For question 1, I am partial to the BaseSegmentEvaluator naming, on a subjective basis. I think making either "Segment" or "Interval" plural would better communicate that the evaluation is meant to be performed over multiple intervals with the same .evaluate call. So in that case I'd prefer BaseSegmentsEvaluator.

And regarding the calling convention for the .evaluate method, I'm a bit scared by how the subclasses implementing it would specify their own argument needs. As I currently understand your PR description, BaseLocalAnomalyScore would expect a certain number of columns in its intervals argument, but I'm afraid that the meaning of the different columns could easily be confused by a user of the library if the only way of determining/communicating their meaning is through the column ordering.

> skchange.change_scores.BaseChangeScore. Expects 3 columns in intervals: start, split, end.
> skchange.anomaly_scores.BaseSaving. Expects 2 columns in intervals: start, end.
> skchange.anomaly_scores.BaseLocalAnomalyScore. Expects 4 columns in intervals: outer_start, inner_start, inner_end, outer_end.

So if possible, I'm wondering whether a more explicit calling convention, where each column that defines the interval is a named argument to the .evaluate method, would leave less opportunity for confusion among library users?
I'm not entirely sure how to achieve this most easily. It may also be possible to instead require named columns in a DataFrame of some sort, to make it easy for users of skchange to pass the required arguments correctly, but I'm unsure if I like such an API.
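For concreteness, the DataFrame variant of this suggestion could look something like the following. This is purely illustrative; neither the column names nor the validation logic exist in the PR:

    import pandas as pd

    # Named columns make the meaning of each boundary explicit,
    # instead of relying on column ordering.
    intervals = pd.DataFrame(
        {
            "outer_start": [0, 100],
            "inner_start": [40, 140],
            "inner_end": [60, 160],
            "outer_end": [100, 200],
        }
    )

    # A BaseLocalAnomalyScore could then validate its input by column name:
    required = ["outer_start", "inner_start", "inner_end", "outer_end"]
    missing = [col for col in required if col not in intervals.columns]
    if missing:
        raise ValueError(f"intervals is missing columns: {missing}")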

@fkiraly commented Nov 25, 2024

Thanks for the pointer.

As I am not familiar with the code, and the names do not immediately suggest examples, may I request two examples (optimally directly here in the text) for each base API? E.g., what would an "interval evaluator" be?

Is this a performance metric, or something else? I suppose it is something else?

@Tveten (Collaborator, Author) commented Nov 25, 2024

Here's the CUSUM change score as an example:

import numpy as np
from numpy.typing import ArrayLike

# BaseChangeScore, as_2d_array, col_cumsum and cusum_score are skchange
# internals introduced or used by this PR; their imports are omitted in
# the original snippet.


class CUSUM(BaseChangeScore):
    """CUSUM change score for a change in the mean.

    The classical CUSUM test statistic for a change in the mean is calculated as the
    weighted difference between the mean before and after a split point within an
    interval. See e.g. Section 4 of [2]_; the idea goes back to [1]_.

    References
    ----------
    .. [1] Page, E. S. (1954). Continuous inspection schemes. Biometrika, 41(1/2),
    100-115.
    .. [2] Wang, D., Yu, Y., & Rinaldo, A. (2020). Univariate mean change point
    detection: Penalization, cusum and optimality.
    """

    def __init__(self):
        super().__init__()

    @property
    def min_size(self) -> int:
        """Minimum size of the interval to evaluate."""
        return 1

    def _fit(self, X: ArrayLike, y=None):
        """Fit the change score evaluator.

        Parameters
        ----------
        X : array-like
            Input data.
        y : None
            Ignored. Included for API consistency by convention.

        Returns
        -------
        self :
            Reference to self.
        """
        X = as_2d_array(X)
        self.sums_ = col_cumsum(X, init_zero=True)
        return self

    def _evaluate(self, intervals: np.ndarray):
        """Evaluate the change score on a set of intervals.

        Parameters
        ----------
        intervals : np.ndarray
            A 2D array with three columns of integer location-based intervals to
            evaluate. The difference between the subsets X[intervals[i, 0]:intervals[i, 1]]
            and X[intervals[i, 1]:intervals[i, 2]] is evaluated for
            i = 0, ..., len(intervals) - 1.

        Returns
        -------
        scores : np.ndarray
            A 2D array of change scores. One row for each interval. The number of
            columns is 1 if the change score is inherently multivariate. The number of
            columns is equal to the number of columns in the input data if the score is
            univariate. In this case, each column represents the univariate score for
            the corresponding input data column.
        """
        starts = intervals[:, 0]
        splits = intervals[:, 1]
        ends = intervals[:, 2]
        return cusum_score(starts, ends, splits, self.sums_)

    @classmethod
    def get_test_params(cls, parameter_set="default"):
        """Return testing parameter settings for the estimator.

        Parameters
        ----------
        parameter_set : str, default="default"
            Name of the set of test parameters to return, for use in tests. If no
            special parameters are defined for a value, will return `"default"` set.
            There are currently no reserved values for interval evaluators.

        Returns
        -------
        params : dict or list of dict, default = {}
            Parameters to create testing instances of the class.
            Each dict gives parameters to construct an "interesting" test instance, i.e.,
            `MyClass(**params)` or `MyClass(**params[i])` creates a valid test instance.
            `create_test_instance` uses the first (or only) dictionary in `params`.
        """
        # CUSUM does not have any parameters to set
        params = [{}, {}]
        return params
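Used on its own, the class above would be called roughly as follows (a sketch; the import path is an assumption):

    import numpy as np

    from skchange.change_scores import CUSUM  # assumed import path

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 1))
    X[50:] += 5.0  # mean shift halfway through the series

    score = CUSUM().fit(X)

    # BaseChangeScore expects 3 columns per row: start, split, end.
    intervals = np.array([[0, 50, 100], [0, 25, 50]])
    scores = score.evaluate(intervals)  # one row of scores per interval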

@fkiraly commented Nov 25, 2024

I see, thanks!

Major comment: could the top base class be the same as for evaluation metrics? I understand now that the objects here are used as components in the various change point algorithms; however, would an evaluation metric like windowed F1 also follow this interface? I would tend to "no", i.e., that should be a distinct class, but I just wanted to share this surface-level thought.

Minor comment: get_test_params does not need to return two parameter sets if there is no parameter to set (I believe the skbase test that usually asks for this is smart enough here).

@fkiraly commented Nov 25, 2024

More points:

  • for naming - BaseIntervalScorer? two of the derived base classes have "score" in the name already, and cost/saving is generic
  • for cost vs saving - do these need to be separate classes? The type interface looks the same. Could this be a boolean tag `lower_is_better`?
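For reference, such a tag would fit naturally into the skbase tag system that sktime estimators already use. A minimal sketch, with a stand-in base class since the PR's class is not reproduced here:

    from skbase.base import BaseObject  # skbase provides the tag machinery

    # Stand-in for the PR's BaseIntervalEvaluator, to keep the sketch
    # self-contained.
    class BaseIntervalEvaluator(BaseObject):
        pass

    class BaseCost(BaseIntervalEvaluator):
        """Cost evaluated over intervals; lower values are better."""

        # Boolean class tag, using the name suggested above.
        _tags = {"lower_is_better": True}

    # Tags are queryable without instantiating:
    assert BaseCost.get_class_tag("lower_is_better") is True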

@Tveten (Collaborator, Author) commented Nov 25, 2024

> Major comment: could the top base class be the same as for evaluation metrics? I understand now that the objects here are used as components in the various change point algorithms; however, would an evaluation metric like windowed F1 also follow this interface? I would tend to "no", i.e., that should be a distinct class, but I just wanted to share this surface-level thought.

Let's think about it in the future! For now, I think they should be distinct.

> Minor comment: get_test_params does not need to return two parameter sets if there is no parameter to set (I believe the skbase test that usually asks for this is smart enough here).

Ok, thanks!

> • for naming - BaseIntervalScorer? two of the derived base classes have "score" in the name already, and cost/saving is generic

I might go with this one. I don't like the very generic "Evaluator", so "Scorer" might be better. My hesitation is due to "scoring" being used in a lot of places, and to the fact that the way I've used it in the code so far means "higher is better", while costs are "lower is better".

> • for cost vs saving - do these need to be separate classes? The type interface looks the same. Could this be a boolean tag `lower_is_better`?

They should be separate classes. All detectors can take cost inputs, but only a few can take savings. The typing and input handling are much simpler with different classes, and they really serve different purposes.
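For illustration, the cost-to-score converter classes mentioned in the PR description would then cover the saving case; every name below is hypothetical, since the description only states that such converters exist:

    # Hypothetical names throughout.
    from skchange.costs import L2Cost            # assumed cost class
    from skchange.anomaly_scores import Saving   # assumed converter class

    # A saving is typically the cost under a fixed baseline parameter minus
    # the optimised cost over the interval, so it can be derived from a cost.
    saving = Saving(baseline_cost=L2Cost())  # assumed constructor argument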

@johannvk (Contributor) commented:

Regarding the calling conventions for .evaluate, perhaps we could specialize on the level of BaseCost, BaseSaving, etc.?

These base classes inherit from BaseIntervalEvaluator, but could in principle specialize their .evaluate methods. E.g., BaseLocalAnomalyScore could be expanded with:

class BaseLocalAnomalyScore(BaseIntervalEvaluator):
    """Base class template for local anomaly scores.

    Local anomaly scores are used to detect anomalies in a time series or sequence by
    evaluating the deviation of the data distribution within a subinterval of a larger,
    local interval.
    """

    [...]

    def evaluate(self, inner_intervals: ArrayLike, outer_intervals: ArrayLike) -> np.ndarray:
        """Evaluate on a set of inner and outer intervals.

        Parameters
        ----------
        inner_intervals : ArrayLike
            Integer location-based inner intervals to evaluate. If a 2D array,
            the subsets X[inner_intervals[i, 0]:inner_intervals[i, -1]] for
            i = 0, ..., len(inner_intervals) - 1 are evaluated.

        outer_intervals : ArrayLike
            Integer location-based outer intervals to evaluate. If a 2D array,
            the subsets X[outer_intervals[i, 0]:outer_intervals[i, -1]] for
            i = 0, ..., len(outer_intervals) - 1 are evaluated.

        Returns
        -------
        values : np.ndarray
            A 2D array of values. One row for each pair of inner and outer intervals.

        """
        self.check_is_fitted()

        # Optionally, check the lengths are the same.
        inner_intervals = as_2d_array(inner_intervals, vector_as_column=False)
        outer_intervals = as_2d_array(outer_intervals, vector_as_column=False)

        inner_intervals = self._check_intervals(inner_intervals)
        outer_intervals = self._check_intervals(outer_intervals)

        values = self._evaluate(inner_intervals=inner_intervals, outer_intervals=outer_intervals)
        return values

Then the meaning of the expected_interval_entries would change, but hopefully it would be clearer what the required inputs to the different interval evaluators are. The implementation of the LocalAnomalyScore._evaluate method would also simplify slightly, as the start of the method no longer extracts the inner_intervals and outer_intervals from a generic intervals array; they are passed in directly instead.

Current version:

    def _evaluate(self, intervals: np.ndarray) -> np.ndarray:
        """Evaluate the local anomaly score on a set of intervals.

        Parameters
        ----------
        intervals : np.ndarray
            A 2D array with four columns of integer location-based intervals to
            evaluate: outer_start, inner_start, inner_end, outer_end, for rows
            i = 0, ..., len(intervals) - 1.

        Returns
        -------
        scores : np.ndarray
            A 2D array of local anomaly scores. One row for each interval. The
            number of columns is 1 if the score is inherently multivariate. The
            number of columns is equal to the number of columns in the input data
            if the score is univariate. In this case, each column represents the
            univariate score for the corresponding input data column.
        """
        X = as_2d_array(self._X)

        inner_intervals = intervals[:, 1:3]
        outer_intervals = intervals[:, [0, 3]]
        inner_costs = self._interval_cost.evaluate(inner_intervals)
        outer_costs = self._interval_cost.evaluate(outer_intervals)

        surrounding_costs = np.zeros_like(outer_costs)
        for i, interval in enumerate(intervals):
            before_inner_interval = interval[0:2]
            after_inner_interval = interval[2:4]

            before_data = X[before_inner_interval[0] : before_inner_interval[1]]
            after_data = X[after_inner_interval[0] : after_inner_interval[1]]
            surrounding_data = np.concatenate((before_data, after_data))
            self._any_subset_cost.fit(surrounding_data)
            surrounding_costs[i] = self._any_subset_cost.evaluate(
                [0, surrounding_data.shape[0]]
            )

        anomaly_scores = outer_costs - (inner_costs + surrounding_costs)
        return np.array(anomaly_scores)

Potential version:

    def _evaluate(self, inner_intervals: np.ndarray, outer_intervals: np.ndarray) -> np.ndarray:
        """Evaluate the local anomaly score on a set of inner and outer intervals.

        Parameters
        ----------
        inner_intervals : np.ndarray
            A 2D array with two columns of integer location-based intervals to evaluate.
            The subsets X[inner_intervals[i, 0]:inner_intervals[i, 1]] for
            i = 0, ..., len(inner_intervals) - 1 are evaluated.

        outer_intervals : np.ndarray
            A 2D array with two columns of integer location-based intervals to evaluate.
            The subsets X[outer_intervals[i, 0]:outer_intervals[i, 1]] for
            i = 0, ..., len(outer_intervals) - 1 are evaluated.

        Returns
        -------
        scores : np.ndarray
            A 2D array of local anomaly scores. One row for each pair of inner
            and outer intervals. The number of columns is 1 if the score is
            inherently multivariate. The number of columns is equal to the number
            of columns in the input data if the score is univariate. In this
            case, each column represents the univariate score for the
            corresponding input data column.
        """
        X = as_2d_array(self._X)

        inner_costs = self._interval_cost.evaluate(inner_intervals)
        outer_costs = self._interval_cost.evaluate(outer_intervals)

        surrounding_costs = np.zeros_like(outer_costs)
        # Pair each inner interval with its outer interval row by row:
        # inner = [inner_start, inner_end], outer = [outer_start, outer_end].
        for i, (inner, outer) in enumerate(zip(inner_intervals, outer_intervals)):
            before_data = X[outer[0] : inner[0]]
            after_data = X[inner[1] : outer[1]]
            surrounding_data = np.concatenate((before_data, after_data))
            self._any_subset_cost.fit(surrounding_data)
            surrounding_costs[i] = self._any_subset_cost.evaluate(
                [0, surrounding_data.shape[0]]
            )

        anomaly_scores = outer_costs - (inner_costs + surrounding_costs)
        return np.array(anomaly_scores)
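Side by side, the two calling conventions would compare as follows, assuming `score` is a fitted local anomaly score (a sketch based on the column conventions above):

    import numpy as np

    # Current convention: one 4-column array per row, ordered as
    # outer_start, inner_start, inner_end, outer_end.
    score.evaluate(np.array([[0, 40, 60, 100]]))

    # Proposed convention: two 2-column arrays, paired row by row.
    score.evaluate(
        inner_intervals=np.array([[40, 60]]),
        outer_intervals=np.array([[0, 100]]),
    )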
