
Detector components (costs, scores etc.) as classes #24

Open. Wants to merge 204 commits into main.
Conversation

@Tveten (Collaborator) commented Oct 20, 2024

Goal: Unify the detector components. Make them safer. Make the extension pattern simpler and clearer.

See #23 for discussions.

New features:

  • The BaseIntervalEvaluator class, inheriting from sktime.BaseEstimator. Public methods:

    • fit(self, X, y=None) -> self
    • evaluate(self, intervals: ArrayLike) -> np.ndarray
  • Four base subclasses, each inheriting from BaseIntervalEvaluator:

    • skchange.costs.BaseCost. Expects 2 columns in intervals: start, end.
    • skchange.change_scores.BaseChangeScore. Expects 3 columns in intervals: start, split, end.
    • skchange.anomaly_scores.BaseSaving. Expects 2 columns in intervals: start, end.
    • skchange.anomaly_scores.BaseLocalAnomalyScore. Expects 4 columns in intervals: outer_start, inner_start, inner_end, outer_end.
  • Classes for automatically converting costs to any of the three other score classes.

  • Convenience functions allowing either costs or an appropriate score to be used as input to all the detectors.

All existing functionality is reimplemented within the new design, plus the additions listed above.
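To make the intended API concrete, here is a minimal usage sketch. The cost class name (L2Cost) and import path are assumptions for illustration; only the fit/evaluate signatures and the two-column interval convention come from the list above:

    import numpy as np

    from skchange.costs import L2Cost  # assumed name and import path

    X = np.random.default_rng(0).normal(size=(200, 2))

    # fit(self, X, y=None) -> self
    cost = L2Cost().fit(X)

    # BaseCost expects 2 columns in `intervals`: start, end.
    intervals = np.array([[0, 100], [50, 150]])
    values = cost.evaluate(intervals)  # np.ndarray, one row per interval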

Tveten and others added 30 commits October 17, 2024 21:45. Commit messages include:

  • Leave it to the numba compiler for now. Checking it in a good way is complicated due to the generic classes such as CostBasedChangeScore.
  • Decoupled from numba and the detectors, as opposed to other suggested designs.
  • Error in previous commit.
  • Much better coverage of all costs used as scores in the detectors.
  • None means that the min_size is unknown, for example until it is fitted.
  • Necessary for setting good default penalties and thresholds.
  • Tag won't pass sktime conformance test.
@johannvk (Contributor) commented:

> Naming decisions to be made before merge:
>
> 1. `BaseIntervalEvaluator`. "Interval" or "Segment" or something else as the middle term? The point is to highlight that this class evaluates a function over intervals/slices/segments of the data. This is what enables strong computational performance compared to evaluating over arbitrary subsets. BaseSegmentEvaluator could be confused with a class trying to evaluate a segmentation.
>
> 2. The argument name to `evaluate`. Options:
>    a) `intervals` (current): Fits the natural language "evaluate over intervals". Could be misleading, as the inputs may include splitting information between the first and last column.
>    b) `segments`: Same as a), but in case the base class is renamed `BaseSegmentEvaluator`.
>    c) `splitter` or `splits`: Highlights the possible splitting information. Perhaps the most general option. Could be confusing, as the data isn't actually split into two or more parts whose union contains all the data.
>
> I'd be very happy for feedback and opinions here, @fkiraly and @johannvk.

For question 1, I am partial to the BaseSegmentEvaluator naming, on a subjective basis. I think making either "Segment" or "Interval" plural would better communicate that the evaluation is meant to be performed over multiple intervals with the same .evaluate call. So in that case I'd prefer BaseSegmentsEvaluator.

And regarding the calling convention for the .evaluate method, I'm a bit scared by how the subclasses implementing it would specify their own argument needs. As I currently understand your PR description, BaseLocalAnomalyScore would expect a certain number of columns in its intervals argument, but I'm afraid that the meaning of the different columns could easily be confused by a user of the library if the only way of determining/communicating their meaning is through the column ordering.

> skchange.change_scores.BaseChangeScore. Expects 3 columns in intervals: start, split, end.
> skchange.anomaly_scores.BaseSaving. Expects 2 columns in intervals: start, end.
> skchange.anomaly_scores.BaseLocalAnomalyScore. Expects 4 columns in intervals: outer_start, inner_start, inner_end, outer_end.

So if possible, I'm wondering whether a more explicit calling convention, where each column that defines the interval is a named argument to the .evaluate method, would leave less opportunity for confusion among library users?
I'm not entirely sure how to achieve this most easily. It may also be possible to instead require named columns in a DataFrame of some sort, to make it easy for users of skchange to pass the required arguments correctly, but I'm unsure if I like such an API.
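For concreteness, the DataFrame variant of this suggestion could look something like the following. This is purely illustrative; neither the column names nor the validation logic exist in the PR:

    import pandas as pd

    # Named columns make the meaning of each boundary explicit,
    # instead of relying on column ordering.
    intervals = pd.DataFrame(
        {
            "outer_start": [0, 100],
            "inner_start": [40, 140],
            "inner_end": [60, 160],
            "outer_end": [100, 200],
        }
    )

    # A BaseLocalAnomalyScore could then validate its input by column name:
    required = ["outer_start", "inner_start", "inner_end", "outer_end"]
    missing = [col for col in required if col not in intervals.columns]
    if missing:
        raise ValueError(f"intervals is missing columns: {missing}")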

@fkiraly commented Nov 25, 2024

Thanks for the pointer.

As I am not familiar with the code, and the names do not immediately suggest examples, may I request two examples (optimally directly here in the text) for each base API? E.g., what would an "interval evaluator" be?

Is this a performance metric, or something else? I suppose it is something else?

@Tveten (Collaborator, Author) commented Nov 25, 2024

Here's the CUSUM change score as an example:

import numpy as np
from numpy.typing import ArrayLike

# BaseChangeScore, as_2d_array, col_cumsum and cusum_score are skchange
# internals introduced or used by this PR; their imports are omitted in
# the original snippet.


class CUSUM(BaseChangeScore):
    """CUSUM change score for a change in the mean.

    The classical CUSUM test statistic for a change in the mean is calculated as the
    weighted difference between the mean before and after a split point within an
    interval. See e.g. Section 4 of [2]_; the idea goes back to [1]_.

    References
    ----------
    .. [1] Page, E. S. (1954). Continuous inspection schemes. Biometrika, 41(1/2),
    100-115.
    .. [2] Wang, D., Yu, Y., & Rinaldo, A. (2020). Univariate mean change point
    detection: Penalization, cusum and optimality.
    """

    def __init__(self):
        super().__init__()

    @property
    def min_size(self) -> int:
        """Minimum size of the interval to evaluate."""
        return 1

    def _fit(self, X: ArrayLike, y=None):
        """Fit the change score evaluator.

        Parameters
        ----------
        X : array-like
            Input data.
        y : None
            Ignored. Included for API consistency by convention.

        Returns
        -------
        self :
            Reference to self.
        """
        X = as_2d_array(X)
        self.sums_ = col_cumsum(X, init_zero=True)
        return self

    def _evaluate(self, intervals: np.ndarray):
        """Evaluate the change score on a set of intervals.

        Parameters
        ----------
        intervals : np.ndarray
            A 2D array with three columns of integer location-based intervals to
            evaluate. The difference between the subsets X[intervals[i, 0]:intervals[i, 1]]
            and X[intervals[i, 1]:intervals[i, 2]] is evaluated for
            i = 0, ..., len(intervals) - 1.

        Returns
        -------
        scores : np.ndarray
            A 2D array of change scores. One row for each interval. The number of
            columns is 1 if the change score is inherently multivariate. The number of
            columns is equal to the number of columns in the input data if the score is
            univariate. In this case, each column represents the univariate score for
            the corresponding input data column.
        """
        starts = intervals[:, 0]
        splits = intervals[:, 1]
        ends = intervals[:, 2]
        return cusum_score(starts, ends, splits, self.sums_)

    @classmethod
    def get_test_params(cls, parameter_set="default"):
        """Return testing parameter settings for the estimator.

        Parameters
        ----------
        parameter_set : str, default="default"
            Name of the set of test parameters to return, for use in tests. If no
            special parameters are defined for a value, will return `"default"` set.
            There are currently no reserved values for interval evaluators.

        Returns
        -------
        params : dict or list of dict, default = {}
            Parameters to create testing instances of the class.
            Each dict gives parameters to construct an "interesting" test instance, i.e.,
            `MyClass(**params)` or `MyClass(**params[i])` creates a valid test instance.
            `create_test_instance` uses the first (or only) dictionary in `params`.
        """
        # CUSUM does not have any parameters to set
        params = [{}, {}]
        return params
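Used on its own, the class above would be called roughly as follows (a sketch; the import path is an assumption):

    import numpy as np

    from skchange.change_scores import CUSUM  # assumed import path

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 1))
    X[50:] += 5.0  # mean shift halfway through the series

    score = CUSUM().fit(X)

    # BaseChangeScore expects 3 columns per row: start, split, end.
    intervals = np.array([[0, 50, 100], [0, 25, 50]])
    scores = score.evaluate(intervals)  # one row of scores per interval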

@fkiraly commented Nov 25, 2024

I see, thanks!

Major comment: could the top base class be the same as for evaluation metrics? I understand now that the objects here are used as components in the various change point algorithms; however, would an evaluation metric like windowed F1 also follow this interface? I would tend to "no", i.e., that should be a distinct class, but I just wanted to share this surface-level thought.

Minor comment: get_test_params does not need to return two parameter sets if there is no parameter to set (I believe the skbase test that usually asks for this is smart enough here).

@fkiraly commented Nov 25, 2024

More points:

  • for naming - BaseIntervalScorer? two of the derived base classes have "score" in the name already, and cost/saving is generic
  • for cost vs saving - do these need to be separate classes? The type interface looks the same. Could this be a boolean tag `lower_is_better`?
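For reference, such a tag would fit naturally into the skbase tag system that sktime estimators already use. A minimal sketch, with a stand-in base class since the PR's class is not reproduced here:

    from skbase.base import BaseObject  # skbase provides the tag machinery

    # Stand-in for the PR's BaseIntervalEvaluator, to keep the sketch
    # self-contained.
    class BaseIntervalEvaluator(BaseObject):
        pass

    class BaseCost(BaseIntervalEvaluator):
        """Cost evaluated over intervals; lower values are better."""

        # Boolean class tag, using the name suggested above.
        _tags = {"lower_is_better": True}

    # Tags are queryable without instantiating:
    assert BaseCost.get_class_tag("lower_is_better") is True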

@Tveten (Collaborator, Author) commented Nov 25, 2024

> Major comment: could the top base class be the same as for evaluation metrics? I understand now that the objects here are used as components in the various change point algorithms; however, would an evaluation metric like windowed F1 also follow this interface? I would tend to "no", i.e., that should be a distinct class, but I just wanted to share this surface-level thought.

Let's think about it in the future! For now, I think they should be distinct.

> Minor comment: get_test_params does not need to return two parameter sets if there is no parameter to set (I believe the skbase test that usually asks for this is smart enough here).

Ok, thanks!

> • for naming - BaseIntervalScorer? two of the derived base classes have "score" in the name already, and cost/saving is generic

I might go with this one. I don't like the very generic "Evaluator", so "Scorer" might be better. My hesitation is due to "scoring" being used in a lot of places, and to the fact that the way I've used it in the code so far means "higher is better", while costs are "lower is better".

> • for cost vs saving - do these need to be separate classes? The type interface looks the same. Could this be a boolean tag `lower_is_better`?

They should be separate classes. All detectors can take cost inputs, but only a few can take savings. The typing and input handling are much simpler with different classes, and they really serve different purposes.
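For illustration, the cost-to-score converter classes mentioned in the PR description would then cover the saving case; every name below is hypothetical, since the description only states that such converters exist:

    # Hypothetical names throughout.
    from skchange.costs import L2Cost            # assumed cost class
    from skchange.anomaly_scores import Saving   # assumed converter class

    # A saving is typically the cost under a fixed baseline parameter minus
    # the optimised cost over the interval, so it can be derived from a cost.
    saving = Saving(baseline_cost=L2Cost())  # assumed constructor argument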

@johannvk (Contributor) commented:

Regarding the calling conventions for .evaluate, perhaps we could specialize on the level of BaseCost, BaseSaving, etc.?

These base classes inherit from BaseIntervalEvaluator, but could in principle specialize their .evaluate methods. E.g., BaseLocalAnomalyScore could be expanded with:

class BaseLocalAnomalyScore(BaseIntervalEvaluator):
    """Base class template for local anomaly scores.

    Local anomaly scores are used to detect anomalies in a time series or sequence by
    evaluating the deviation of the data distribution within a subinterval of a larger,
    local interval.
    """

    [...]

    def evaluate(self, inner_intervals: ArrayLike, outer_intervals: ArrayLike) -> np.ndarray:
        """Evaluate on a set of inner and outer intervals.

        Parameters
        ----------
        inner_intervals : ArrayLike
            Integer location-based inner intervals to evaluate. If a 2D array,
            the subsets X[inner_intervals[i, 0]:inner_intervals[i, -1]] for
            i = 0, ..., len(inner_intervals) - 1 are evaluated.

        outer_intervals : ArrayLike
            Integer location-based outer intervals to evaluate. If a 2D array,
            the subsets X[outer_intervals[i, 0]:outer_intervals[i, -1]] for
            i = 0, ..., len(outer_intervals) - 1 are evaluated.

        Returns
        -------
        values : np.ndarray
            A 2D array of values. One row for each pair of inner and outer intervals.

        """
        self.check_is_fitted()

        # Optionally, check the lengths are the same.
        inner_intervals = as_2d_array(inner_intervals, vector_as_column=False)
        outer_intervals = as_2d_array(outer_intervals, vector_as_column=False)

        inner_intervals = self._check_intervals(inner_intervals)
        outer_intervals = self._check_intervals(outer_intervals)

        values = self._evaluate(inner_intervals=inner_intervals, outer_intervals=outer_intervals)
        return values

Then the meaning of the expected_interval_entries would change, but hopefully it would be clearer what the required inputs to the different interval evaluators are. The implementation of the LocalAnomalyScore._evaluate method would also simplify slightly, as the start of the method no longer extracts the inner_intervals and outer_intervals from a generic intervals array; they are passed in directly instead.

Current version:

    def _evaluate(self, intervals: np.ndarray) -> np.ndarray:
        """Evaluate the local anomaly score on a set of intervals.

        Parameters
        ----------
        intervals : np.ndarray
            A 2D array with four columns of integer location-based intervals to
            evaluate: outer_start, inner_start, inner_end, outer_end, for rows
            i = 0, ..., len(intervals) - 1.

        Returns
        -------
        scores : np.ndarray
            A 2D array of local anomaly scores. One row for each interval. The
            number of columns is 1 if the score is inherently multivariate. The
            number of columns is equal to the number of columns in the input data
            if the score is univariate. In this case, each column represents the
            univariate score for the corresponding input data column.
        """
        X = as_2d_array(self._X)

        inner_intervals = intervals[:, 1:3]
        outer_intervals = intervals[:, [0, 3]]
        inner_costs = self._interval_cost.evaluate(inner_intervals)
        outer_costs = self._interval_cost.evaluate(outer_intervals)

        surrounding_costs = np.zeros_like(outer_costs)
        for i, interval in enumerate(intervals):
            before_inner_interval = interval[0:2]
            after_inner_interval = interval[2:4]

            before_data = X[before_inner_interval[0] : before_inner_interval[1]]
            after_data = X[after_inner_interval[0] : after_inner_interval[1]]
            surrounding_data = np.concatenate((before_data, after_data))
            self._any_subset_cost.fit(surrounding_data)
            surrounding_costs[i] = self._any_subset_cost.evaluate(
                [0, surrounding_data.shape[0]]
            )

        anomaly_scores = outer_costs - (inner_costs + surrounding_costs)
        return np.array(anomaly_scores)

Potential version:

    def _evaluate(self, inner_intervals: np.ndarray, outer_intervals: np.ndarray) -> np.ndarray:
        """Evaluate the local anomaly score on a set of inner and outer intervals.

        Parameters
        ----------
        inner_intervals : np.ndarray
            A 2D array with two columns of integer location-based intervals to evaluate.
            The subsets X[inner_intervals[i, 0]:inner_intervals[i, 1]] for
            i = 0, ..., len(inner_intervals) - 1 are evaluated.

        outer_intervals : np.ndarray
            A 2D array with two columns of integer location-based intervals to evaluate.
            The subsets X[outer_intervals[i, 0]:outer_intervals[i, 1]] for
            i = 0, ..., len(outer_intervals) - 1 are evaluated.

        Returns
        -------
        scores : np.ndarray
            A 2D array of local anomaly scores. One row for each pair of inner
            and outer intervals. The number of columns is 1 if the score is
            inherently multivariate. The number of columns is equal to the number
            of columns in the input data if the score is univariate. In this
            case, each column represents the univariate score for the
            corresponding input data column.
        """
        X = as_2d_array(self._X)

        inner_costs = self._interval_cost.evaluate(inner_intervals)
        outer_costs = self._interval_cost.evaluate(outer_intervals)

        surrounding_costs = np.zeros_like(outer_costs)
        # Pair each inner interval with its outer interval row by row:
        # inner = [inner_start, inner_end], outer = [outer_start, outer_end].
        for i, (inner, outer) in enumerate(zip(inner_intervals, outer_intervals)):
            before_data = X[outer[0] : inner[0]]
            after_data = X[inner[1] : outer[1]]
            surrounding_data = np.concatenate((before_data, after_data))
            self._any_subset_cost.fit(surrounding_data)
            surrounding_costs[i] = self._any_subset_cost.evaluate(
                [0, surrounding_data.shape[0]]
            )

        anomaly_scores = outer_costs - (inner_costs + surrounding_costs)
        return np.array(anomaly_scores)
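Side by side, the two calling conventions would compare as follows, assuming `score` is a fitted local anomaly score (a sketch based on the column conventions above):

    import numpy as np

    # Current convention: one 4-column array per row, ordered as
    # outer_start, inner_start, inner_end, outer_end.
    score.evaluate(np.array([[0, 40, 60, 100]]))

    # Proposed convention: two 2-column arrays, paired row by row.
    score.evaluate(
        inner_intervals=np.array([[40, 60]]),
        outer_intervals=np.array([[0, 100]]),
    )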
