-
Notifications
You must be signed in to change notification settings - Fork 3
Multi active Satellite v0
This node creates the multi-active satellite version 0. A multi-active satellite contains multiple records that are active/valid for the same business key at a point in time.
In the 'Config' section on the right hand-side, expand section 'Data Vault' to modify the following configurations for the node:
Config Name | Config Type | Explanation |
---|---|---|
Hashkey Column | columnDropdownSelector | Select the parent hashkey of the satellite from the drop-down menu. |
Multi-active Key Column | columnSelector | Select one (or multiple) multi-active key columns from the list and bring it to the right side. |
Hashdiff Column | columnDropdownSelector | Select the hashdiff of this satellite from the drop-down menu. |
No Sys Columns are added inside a Multi-active Satellite v0 Node.
In this example we're modelling a multi-active satellite v0. The source for this satellite contains customer addresses. The table contains the business key Customer Key (CA_CustKey) and multiple addresses per customer - each address has an unique address identifier (CA_ADDRESS_ID).
Prerequisite to model the multi-active satellite: A DV stage table 'DVSTG_CUSTOMER_ADDRESS' with proper config for multi-active data. FIXME add reference to Stage.
Create the node:
- Right-click on the stage node DVSTG_CUSTOMER_ADDRESS, select "Add node" → select "Datavault by Scalefree: Satellite v0".
- Rename the satellite using your naming conventions. For this example, we're renaming it to "SATMV0_CUSTOMER_ADDRESS".
- Open the newly created Multi-active Satellite node and remove all irrelevant columns, so that you only keep the Hash Key, Hash Diff, Load Date Timestamp and Record Source as well as relevant descriptive attributes for the satellite.
- In the 'Config' section on the right hand-side, expand section 'Data Vault' to modify the following configurations for the node:
- Hashkey Column: Select the hash key column of this satellite from the drop-down menu.
- Multi-active key columns: Select one (or multiple) multi-active key columns from the list and bring it to the right side. In this example, there's only one Multi-active key column CA_ADDRESS_ID.
- Hashdiff Column: Select the hash diff column of this satellite from the drop-down menu.
- Create and run the node
CREATE OR REPLACE TABLE
"COALESCE_WORKSHOP"."CORE"."SATMV0_CUSTOMER_ADDRESS" (
"HK_CUSTOMER_H" STRING,
"LDTS" TIMESTAMP
COMMENT 'The Load Date Timestamp (LDTS) describes when this data first arrived in the Data Warehouse.',
"RSRC" STRING
COMMENT 'The Record Source (RSRC) describes the source of this data.',
"HD_CUSTOMER_ADDRESS_P_S" STRING,
"CA_ADDRESS_ID" NUMBER (5, 0),
"CA_NAME" VARCHAR,
"CA_ADDRESS_STREET" VARCHAR,
"CA_ADDRESS_POSTAL_CODE" NUMBER (5, 0),
"CA_NATIONKEY" NUMBER (2, 0),
"CA_MKTSEGMENT" VARIANT,
"CA_COMMENT" NUMBER
)
INSERT INTO
"COALESCE_WORKSHOP"."CORE"."SATMV0_CUSTOMER_ADDRESS"
WITH
latest_entries_in_sat AS (
/* get current rows from satellite */
SELECT
"HK_CUSTOMER_H",
"HD_CUSTOMER_ADDRESS_P_S"
FROM
"COALESCE_WORKSHOP"."CORE"."SATMV0_CUSTOMER_ADDRESS" QUALIFY ROW_NUMBER() OVER (
PARTITION BY
"HK_CUSTOMER_H"
ORDER BY
"LDTS" DESC
) = 1
),
deduplicated_numbered_source AS (
SELECT
"DVSTG_CUSTOMER_ADDRESS"."HK_CUSTOMER_H" AS HK_CUSTOMER_H,
"DVSTG_CUSTOMER_ADDRESS"."LDTS" AS LDTS,
"DVSTG_CUSTOMER_ADDRESS"."RSRC" AS RSRC,
"DVSTG_CUSTOMER_ADDRESS"."HD_CUSTOMER_ADDRESS_P_S" AS HD_CUSTOMER_ADDRESS_P_S,
"DVSTG_CUSTOMER_ADDRESS"."CA_ADDRESS_ID" AS CA_ADDRESS_ID,
"DVSTG_CUSTOMER_ADDRESS"."CA_NAME" AS CA_NAME,
"DVSTG_CUSTOMER_ADDRESS"."CA_ADDRESS_STREET" AS CA_ADDRESS_STREET,
"DVSTG_CUSTOMER_ADDRESS"."CA_ADDRESS_POSTAL_CODE" AS CA_ADDRESS_POSTAL_CODE,
"DVSTG_CUSTOMER_ADDRESS"."CA_NATIONKEY" AS CA_NATIONKEY,
"DVSTG_CUSTOMER_ADDRESS"."CA_MKTSEGMENT" AS CA_MKTSEGMENT,
"DVSTG_CUSTOMER_ADDRESS"."CA_COMMENT" AS CA_COMMENT,
ROW_NUMBER() OVER (
PARTITION BY
"HK_CUSTOMER_H"
ORDER BY
"LDTS"
) as rn
FROM
"COALESCE_WORKSHOP"."CORE"."DVSTG_CUSTOMER_ADDRESS" "DVSTG_CUSTOMER_ADDRESS"
WHERE
"LDTS" > (
SELECT
COALESCE(
MAX("LDTS"),
TO_TIMESTAMP('0001-01-01T00:00:01', 'YYYY-MM-DDTHH24:MI:SS')
)
FROM
"COALESCE_WORKSHOP"."CORE"."SATMV0_CUSTOMER_ADDRESS"
WHERE
"LDTS" != TO_TIMESTAMP('8888-12-31T23:59:59', 'YYYY-MM-DDTHH24:MI:SS')
) QUALIFY CASE
WHEN "HD_CUSTOMER_ADDRESS_P_S" = LAG("HD_CUSTOMER_ADDRESS_P_S") OVER (
PARTITION BY
"HK_CUSTOMER_H"
ORDER BY
"LDTS"
) THEN FALSE
ELSE TRUE
END
)
SELECT DISTINCT
HK_CUSTOMER_H,
LDTS,
RSRC,
HD_CUSTOMER_ADDRESS_P_S,
CA_ADDRESS_ID,
CA_NAME,
CA_ADDRESS_STREET,
CA_ADDRESS_POSTAL_CODE,
CA_NATIONKEY,
CA_MKTSEGMENT,
CA_COMMENT
FROM
deduplicated_numbered_source
WHERE
NOT EXISTS (
SELECT
1
FROM
latest_entries_in_sat
WHERE
deduplicated_numbered_source.HD_CUSTOMER_ADDRESS_P_S = latest_entries_in_sat."HD_CUSTOMER_ADDRESS_P_S"
AND deduplicated_numbered_source.HK_CUSTOMER_H = latest_entries_in_sat."HK_CUSTOMER_H"
AND deduplicated_numbered_source.rn = 1
)
Table of Content
- Staging
- DV-Entities
- Hubs
- Links
- Satellites
- Standard Satellite
- Multi-Active Satellite
- Non-Historized Satellite
- Record-Tracking Satellite
- Business Vault
- PIT
- Snapshot Control
- Global Variables