Commit
Merge pull request #1197 from elementary-data/ele-1790-period-param-r…
…enaming-finalization-docs ELE-1790: Renaming, Detection Delay Docs
Showing 15 changed files with 178 additions and 147 deletions.
58 changes: 0 additions & 58 deletions
docs/guides/anomaly-detection-configuration/backfill-days.mdx
This file was deleted.
63 changes: 63 additions & 0 deletions
docs/guides/anomaly-detection-configuration/detection-period.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
--- | ||
title: "detection_period" | ||
sidebarTitle: "detection_period" | ||
--- | ||
|
||
``` | ||
detection_period: | ||
  period: < time period > # supported periods: day, week, month | ||
  count: < number of periods > | ||
``` | ||
|
||
Configuration to define the detection period. | ||
If `detection_period` is set to 2 days, only data points from the last 2 days are included in the detection period and can be flagged as anomalous. | ||
If `detection_period` is set to 7 days, the detection period will be 7 days long. | ||
|
||
For incremental models, this is also the period for re-calculating metrics. | ||
If metrics for buckets in the detection period were already calculated, Elementary will overwrite them. This makes it possible to monitor recent backfills of data, if there were any. | ||
Adjust this configuration according to your data delays. | ||
|
||
- _Default: 2 days_ | ||
- _Relevant tests: Anomaly detection tests with `timestamp_column`_ | ||
|
||
<img src="/pics/anomalies/detection-period.png" alt="Detection Period" /> | ||
|
||
<RequestExample> | ||
|
||
```yml test | ||
models: | ||
  - name: this_is_a_model | ||
    tests: | ||
      - elementary.volume_anomalies: | ||
          detection_period: | ||
            period: day | ||
            count: 30 | ||
``` | ||
```yml model | ||
models: | ||
  - name: this_is_a_model | ||
    config: | ||
      elementary: | ||
        detection_period: | ||
          period: month | ||
          count: 1 | ||
``` | ||
```yml dbt_project.yml | ||
vars: | ||
  detection_period: | ||
    period: week | ||
    count: 2 | ||
``` | ||
</RequestExample> | ||
#### How does it work?
The `detection_period` param only works for tests that have a `timestamp_column` configured. | ||
|
||
It works differently according to the table materialization: | ||
|
||
- **Regular tables and views** - `detection_period` defines the detection period. | ||
- **Incremental models and sources** - `detection_period` defines the detection period, and the period for which metrics will be re-calculated. |
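
The window arithmetic described above can be sketched in Python. This is only an illustration, not Elementary's implementation: the helper names `detection_window_start` and `in_detection_period` are hypothetical, the window is assumed to end at the run time, and only `day` and `week` periods are handled (calendar-month arithmetic is left out).

```python
from datetime import datetime, timedelta

# Hypothetical helpers, not part of Elementary. They illustrate which data
# points fall inside a detection period of `count` periods ending at `run_time`.
DAYS_PER_PERIOD = {"day": 1, "week": 7}  # "month" is omitted in this sketch


def detection_window_start(run_time: datetime, period: str, count: int) -> datetime:
    """Return the earliest timestamp still inside the detection period."""
    return run_time - timedelta(days=DAYS_PER_PERIOD[period] * count)


def in_detection_period(ts: datetime, run_time: datetime,
                        period: str = "day", count: int = 2) -> bool:
    """True if `ts` could be flagged as anomalous under this configuration."""
    return detection_window_start(run_time, period, count) <= ts <= run_time


run_time = datetime(2023, 10, 10)
# One day before the run: inside the default 2-day detection period.
print(in_detection_period(datetime(2023, 10, 9), run_time))
# Nine days before the run: outside the detection period.
print(in_detection_period(datetime(2023, 10, 1), run_time))
```

For incremental models and sources, the same window would also be the range for which metrics are re-calculated on each run.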
69 changes: 69 additions & 0 deletions
docs/guides/anomaly-detection-configuration/training-period.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
--- | ||
title: "training_period" | ||
sidebarTitle: "training_period" | ||
--- | ||
|
||
``` | ||
training_period: | ||
  period: < time period > # supported periods: day, week, month | ||
  count: < number of periods > | ||
``` | ||
|
||
The maximum timeframe for which the test will collect data. | ||
This timeframe includes the training period and the detection period. If a detection delay is defined, the whole training period is delayed. | ||
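
As a rough sketch of how these periods relate, the Python below splits a collected timeframe into its training and detection parts. The helper `collection_window` is hypothetical and all values are in days. It assumes the timeframe ends at the run time minus the detection delay and that the detection period is the most recent slice of it. That is one reading of the description above, not a confirmed account of Elementary's internals.

```python
from datetime import datetime, timedelta


def collection_window(run_time, training_days=14, detection_days=2, delay_days=0):
    """Hypothetical helper: return (window_start, detection_start, window_end)."""
    # Assumption: a detection delay shifts the whole timeframe back in time.
    window_end = run_time - timedelta(days=delay_days)
    # The full timeframe the test collects data for.
    window_start = window_end - timedelta(days=training_days)
    # Assumption: the detection period is the last part of that timeframe.
    detection_start = window_end - timedelta(days=detection_days)
    return window_start, detection_start, window_end


start, detection_start, end = collection_window(datetime(2023, 10, 15), delay_days=1)
print(start, detection_start, end)
```

With the defaults from these pages (14 training days, 2 detection days) and a one-day delay, the sketch collects data from September 30 up to October 14 and treats October 12 onward as the detection period.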
|
||
- _Default: 14 days_ | ||
- _Relevant tests: Anomaly detection tests with `timestamp_column`_ | ||
|
||
<img src="/pics/anomalies/training-period.png" alt="Training Period" /> | ||
|
||
<RequestExample> | ||
|
||
```yml test | ||
models: | ||
  - name: this_is_a_model | ||
    tests: | ||
      - elementary.volume_anomalies: | ||
          training_period: | ||
            period: day | ||
            count: 30 | ||
``` | ||
```yml model | ||
models: | ||
  - name: this_is_a_model | ||
    config: | ||
      elementary: | ||
        training_period: | ||
          period: week | ||
          count: 1 | ||
``` | ||
```yml dbt_project.yml | ||
vars: | ||
  training_period: | ||
    period: month | ||
    count: 1 | ||
``` | ||
</RequestExample> | ||
#### How does it work?
The `training_period` param only works for tests that have a `timestamp_column` configured. | ||
|
||
It works differently according to the table materialization: | ||
|
||
- **Regular tables and views** - The values for the full `training_period` are calculated on each run. | ||
- **Incremental models and sources** - The values for the full `training_period` are calculated on the first test run and on full refresh. Subsequent test runs only calculate the values for the `detection_period`. | ||
|
||
**Changes from default:** | ||
|
||
- **Full time buckets** - Elementary will increase the `training_period` automatically to ensure full time buckets. For example, if the `time_bucket` of the test is `period: week` and a 14-day `training_period` starts on a Tuesday, the test will collect 2 more days back to complete a week (starting on Sunday). | ||
- **Seasonality training set** - If seasonality is configured, Elementary will increase the `training_period` automatically to ensure there are enough training set values to calculate an anomaly. For example if the `seasonality` of the test is `day_of_week`, `training_period` will be increased to ensure enough Sundays, Mondays, Tuesdays, etc. to calculate an anomaly for each. | ||
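
The full-time-bucket example above can be worked through in Python. This is a minimal sketch, assuming weekly buckets that start on Sunday as in the example. The helper `align_to_week_start` is hypothetical and is not Elementary's code.

```python
from datetime import date, timedelta


def align_to_week_start(day: date) -> date:
    """Move `day` back to the most recent Sunday (weeks assumed to start on Sunday)."""
    # Python's weekday() returns Monday=0 ... Sunday=6.
    days_since_sunday = (day.weekday() + 1) % 7
    return day - timedelta(days=days_since_sunday)


run_day = date(2023, 10, 17)                     # a Tuesday
raw_start = run_day - timedelta(days=14)         # 14-day training_period: Tuesday 2023-10-03
aligned_start = align_to_week_start(raw_start)   # extended back to Sunday 2023-10-01
extra_days = (raw_start - aligned_start).days    # days added to complete the first week
print(raw_start, aligned_start, extra_days)
```

Here the 14-day window starts on a Tuesday, so two extra days are collected to reach the preceding Sunday, which matches the example in the list.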
|
||
#### The impact of changing `training_period` | ||
|
||
If you **increase `training_period`**, your test training set will be larger. This means a larger sample size for calculating the expected range, which should make the test less sensitive to outliers. You should see fewer false positive anomalies, but the test will also be less sensitive, because anomalies have a higher threshold. | ||
|
||
If you **decrease `training_period`**, your test training set will be smaller. This means a smaller sample size for calculating the expected range, which might make the test more sensitive to outliers. You may see more false positive anomalies, but the test will also be more sensitive, because anomalies have a lower threshold. |