From e0d4143034bf59c8a3de4c54cd622e6ba7244e1b Mon Sep 17 00:00:00 2001
From: Cristhian Garcia
Date: Mon, 22 Apr 2024 12:11:18 -0500
Subject: [PATCH] docs: add data lifecycly policy

---
 docs/concepts/data_lifecycle_policy.rst | 14 ++++++++++++++
 docs/concepts/index.rst                 |  1 +
 2 files changed, 15 insertions(+)
 create mode 100644 docs/concepts/data_lifecycle_policy.rst

diff --git a/docs/concepts/data_lifecycle_policy.rst b/docs/concepts/data_lifecycle_policy.rst
new file mode 100644
index 0000000..9e1a847
--- /dev/null
+++ b/docs/concepts/data_lifecycle_policy.rst
@@ -0,0 +1,14 @@
+.. _data-lifecycle-policy:
+
+Data Lifecycle Policy
+*********************
+
+What it is
+##########
+
+Aspects is a data pipeline that captures, transforms, and aggregates tracking logs from the Open edX platform into xAPI statements and stores them in a ClickHouse database.
+However, the data is not stored indefinitely. By default it is kept for 1 year, but the site operator can adjust this via the `ASPECTS_DATA_TTL_EXPRESSION` setting in the Tutor plugin.
+
+The setting value is a ClickHouse expression that defines the time-to-live (TTL) policy for the data. The expression is evaluated for each row in the table and should return a date. Rows whose date is in the past are deleted. You can read more about the TTL policy in the ClickHouse documentation: https://clickhouse.tech/docs/en/engines/table-engines/mergetree-family/mergetree/#ttl
+
+The data is partitioned by month, so the TTL policy is applied per partition. Make sure the TTL expression is compatible with the monthly partitioning, e.g. `ASPECTS_DATA_TTL_EXPRESSION: toDateTime(emission_time) + INTERVAL 2 MONTH` or `ASPECTS_DATA_TTL_EXPRESSION: toDateTime(emission_time) + INTERVAL 2 YEAR`.
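+
+As a minimal sketch, assuming the setting is stored in Tutor's `config.yml` like other plugin settings (this placement is an assumption, not something confirmed on this page), keeping the data for two years could look like this:
+
+.. code-block:: yaml
+
+    # Assumed location: config.yml in the Tutor project root.
+    # Each row expires two years after its emission_time.
+    ASPECTS_DATA_TTL_EXPRESSION: "toDateTime(emission_time) + INTERVAL 2 YEAR"
+
+Quoting the value is optional here, but it avoids surprises if the expression later contains characters that are special in YAML. How and when an updated expression is applied to existing tables depends on the plugin and is not shown in this sketch.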
diff --git a/docs/concepts/index.rst b/docs/concepts/index.rst
index bd70e12..ea9a4f3 100644
--- a/docs/concepts/index.rst
+++ b/docs/concepts/index.rst
@@ -9,6 +9,7 @@ Concepts
    xAPI
    Tracking Logs
    Clickhouse
+   Data Lifecycle Policy
    dbt
    Ralph
    Vector