Standard Satellite v0
dpolatscalefree edited this page Feb 23, 2023 · 11 revisions
This macro creates a standard satellite version 0, which should be materialized as an incremental table. It should be applied 'on top' of the staging layer and is connected to either a Hub or a Link. On top of each version 0 satellite, a version 1 satellite should be created using the sat_v1 macro. The v1 satellite extends the v0 satellite by a virtually calculated load end date. Each satellite can be loaded by only one source model, since we typically recommend a satellite split by source system.
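To picture what a "virtually calculated load end date" means, here is a minimal Python sketch. It is not the sat_v1 macro itself. It assumes that a version ends one microsecond before the next version of the same hashkey is loaded, and that the current version gets a far-future placeholder. The column names `hk`, `hd`, `ldts` and `ledts`, the one-microsecond offset, and the value of `END_OF_ALL_TIMES` are all illustrative assumptions.

```python
from datetime import datetime, timedelta

# Assumed placeholder meaning "still current"; the real value is a package setting.
END_OF_ALL_TIMES = datetime(8888, 12, 31, 23, 59, 59)


def add_load_end_dates(rows):
    """Return v0 rows extended by a derived 'ledts' (load end date).

    rows: list of dicts with at least 'hk' (parent hashkey) and 'ldts'.
    Nothing is stored; the end date is computed from the next row's ldts.
    """
    by_key = {}
    for row in sorted(rows, key=lambda r: (r["hk"], r["ldts"])):
        by_key.setdefault(row["hk"], []).append(row)

    result = []
    for versions in by_key.values():
        for i, current in enumerate(versions):
            if i + 1 < len(versions):
                # Ends just before the next version of the same key was loaded.
                ledts = versions[i + 1]["ldts"] - timedelta(microseconds=1)
            else:
                # Latest version of this key: still valid.
                ledts = END_OF_ALL_TIMES
            result.append({**current, "ledts": ledts})
    return result


v0_rows = [
    {"hk": "A", "ldts": datetime(2023, 1, 1), "hd": "h1"},
    {"hk": "A", "ldts": datetime(2023, 1, 5), "hd": "h2"},
    {"hk": "B", "ldts": datetime(2023, 1, 3), "hd": "h9"},
]
v1_rows = add_load_end_dates(v0_rows)
```

Because the end date is derived at query time, the v0 table itself never needs to be updated when a newer version arrives.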
Features:
- Can handle multiple updates per batch without losing intermediate changes. Therefore, initial loading is supported.
- Uses a dynamic high-water mark to optimize the loading performance of multiple loads.
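The two features above can be sketched in Python. This is an assumption-laden illustration of the idea, not the SQL that the macro generates. The function `rows_to_insert`, the column names `hk`, `hd` and `ldts`, and the integer load dates are hypothetical. The high-water mark is taken to be the newest load date already present in the satellite, and a stage row is kept only if its hashdiff differs from the previous version of the same hashkey.

```python
def rows_to_insert(stage_rows, sat_rows):
    """Pick the stage rows that represent a real change for the satellite.

    Both inputs are lists of dicts with 'hk' (parent hashkey),
    'hd' (hashdiff) and 'ldts' (load date, any comparable type).
    """
    # High-water mark: the newest load date already in the satellite.
    hwm = max((r["ldts"] for r in sat_rows), default=None)

    # Latest known hashdiff per parent hashkey in the satellite.
    latest = {}
    for r in sorted(sat_rows, key=lambda r: r["ldts"]):
        latest[r["hk"]] = r["hd"]

    # Only rows newer than the high-water mark need to be examined.
    candidates = [r for r in stage_rows if hwm is None or r["ldts"] > hwm]

    inserts = []
    # Walk each key's rows in load order so intermediate changes are kept.
    for r in sorted(candidates, key=lambda r: (r["hk"], r["ldts"])):
        if latest.get(r["hk"]) != r["hd"]:
            inserts.append(r)
            latest[r["hk"]] = r["hd"]
    return inserts


satellite = [{"hk": "A", "hd": "h1", "ldts": 1}]
stage = [
    {"hk": "A", "hd": "h1", "ldts": 1},  # not newer than the high-water mark
    {"hk": "A", "hd": "h2", "ldts": 2},  # changed: kept
    {"hk": "A", "hd": "h2", "ldts": 3},  # same as previous version: skipped
    {"hk": "A", "hd": "h1", "ldts": 4},  # changed back: kept
    {"hk": "B", "hd": "h7", "ldts": 2},  # new key: kept
]
incremental = rows_to_insert(stage, satellite)
initial = rows_to_insert(stage, [])  # empty satellite, i.e. an initial load
```

With an empty satellite there is no high-water mark, so every stage row is examined. That is why the same logic can serve an initial load containing several versions per key.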
Parameters | Data Type | Explanation |
---|---|---|
parent_hashkey | string | Name of the hashkey column inside the stage of the object that this satellite is attached to. |
src_hashdiff | string | Name of the hashdiff column of this satellite, which is created inside the staging area and calculated from the entire payload of this satellite. The stage must hold one hashdiff per satellite entity. |
src_payload | list of strings | A list of all the descriptive attributes that should be included in this satellite. These must be the columns that are fed into the hashdiff calculation of this satellite. |
source_model | string | Name of the underlying staging model. It must be available inside dbt as a model. |
src_ldts | string | Name of the ldts column inside the source model. Optional; defaults to the global variable 'datavault4dbt.ldts_alias'. Must match the column name defined as an alias inside the staging model. |
src_rsrc | string | Name of the rsrc column inside the source model. Optional; defaults to the global variable 'datavault4dbt.rsrc_alias'. Must match the column name defined as an alias inside the staging model. |
{{ config(materialized='incremental') }}
{%- set yaml_metadata -%}
parent_hashkey: 'hk_account_h'
src_hashdiff: 'hd_account_s'
src_payload:
- name
- address
- phone
- email
source_model: 'stage_account'
{%- endset -%}
{%- set metadata_dict = fromyaml(yaml_metadata) -%}
{%- set parent_hashkey = metadata_dict['parent_hashkey'] -%}
{%- set src_hashdiff = metadata_dict['src_hashdiff'] -%}
{%- set source_model = metadata_dict['source_model'] -%}
{%- set src_payload = metadata_dict['src_payload'] -%}
{{ datavault4dbt.sat_v0(parent_hashkey=parent_hashkey,
src_hashdiff=src_hashdiff,
source_model=source_model,
src_payload=src_payload) }}
- parent_hashkey:
  - hk_account_h: The satellite would be attached to the hub account, which has the column 'hk_account_h' as its hashkey column.
- src_hashdiff:
  - hd_account_s: Since we recommend naming the hashdiff column after the satellite entity, just with a prefix, this would be the hashdiff column of the data satellite for account.
- src_payload: This satellite would hold the columns 'name', 'address', 'phone' and 'email', coming out of the underlying staging area.
- source_model:
  - stage_account: This satellite is loaded out of the stage for account.
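To make the relationship between src_payload and src_hashdiff concrete, the following Python sketch hashes the four payload columns of the example into one value. The real hashdiff is produced in the staging layer, not by this macro. The MD5 algorithm, the `||` delimiter, the trim-and-uppercase normalization, and the `hashdiff` helper itself are assumptions chosen for illustration only.

```python
import hashlib


def hashdiff(record, payload_columns, delimiter="||"):
    """Hash the payload columns of one record into a single value.

    Assumed normalization: None becomes an empty string, other values
    are stringified, trimmed and uppercased before concatenation.
    """
    parts = [
        "" if record.get(col) is None else str(record[col]).strip().upper()
        for col in payload_columns
    ]
    return hashlib.md5(delimiter.join(parts).encode("utf-8")).hexdigest()


# Same columns as src_payload in the example above.
src_payload = ["name", "address", "phone", "email"]

old = {"name": "Acme Corp", "address": "1 Main St",
       "phone": "555-0100", "email": "info@acme.example"}
new = {**old, "phone": "555-0199"}  # one descriptive attribute changed

hd_old = hashdiff(old, src_payload)
hd_new = hashdiff(new, src_payload)
```

Only columns listed in `src_payload` influence the result, which is why the satellite's payload and the hashdiff input should be the same set of columns.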