Skip to content

Standard Link

tta-scalefree edited this page Sep 29, 2023 · 13 revisions

The "Datavault4Coalesce: Link" node creates a link entity, connecting two or more entities, or an entity with itself.

User Configuration

In the "Config" section on the right hand-side, expand section "Data Vault" to modify the following configurations for the node:

Config Name Config Type Explanation
Link Hashkey Column string Select the Link hashkey of this Link and bring it to the right side.

System Columns

No Sys Columns are added inside a Link Node.

Example

In this example the Link 'NATION_REGION_L' is being created. The source table is 'STG_NATION'.

Link_neu_example_1

Create the node:

  • Select the 'STG_NATION' and create the node 'NATION_REGION_L' with the node type 'Datavault by Scalefree: Link'

Link_neu_example_2



  • The Links name will be created automatically. Rename the Link using your naming conventions. In this case rename it to 'NATION_REGION_L'. Also while having the Note open like in the Screenshot, delete all unnecessary columns, so that you only keep the Hashkey of the Link, the foreign hashkeys of the entities and the ldts aswell as the rsrc.



Link_neu_example_3



  • In the Config section on the right there is a tab called 'Data Vault'
  • Expand this and you get a column selector list 'Link Hashkey Column'
    • Select the hashkey of this Link and bring it to the right side in the 'Link Hashkey Column' section, by ticking the box and clicking on the arrow to the right

Link_neu_example_4



  • Create and run the node


Description:

  • Source: The source table that you want to use for this Link is the referring Staging Table that the entities use, that you want to create a relation between.

  • LDTS: The ldts is a system genereted column. It is set in the staging table.

  • RSRC: The rsrc is a system genereted column. The value is set in the 'Transform' field, in this example with the Jinja Expression '{{recordsource()}}' which refers to th source table.

  • hashed_columns:

    • HK_NATION_H: A hashkey called 'hk_nation_h' is defined, that is taken from the staging table and used to refer to one of the entities.

    • HK_REGION_H: A hashkey called 'hk_region_h' is defined, that is taken from the staging table and used to refer to one of the entities.

    • HK_NATION_REGION_L: The hashkey of the link. It is defined in the staging table and is calculated out of the business keys 'N_NATIONKEY' and 'N_REGIONKEY'.


Generated SQL Code of Example

CREATE OR REPLACE TABLE
    "COALESCE_WORKSHOP"."CORE"."NATION_REGION_L" (
        "HK_NATION_REGION_L" STRING,
        "HK_NATION_H" STRING,
        "HK_REGION_H" STRING,
        "LDTS" TIMESTAMP
        COMMENT 'Search for existing column in source. Only in case that nothing is available, functions like GETDATE() should be used.',
        "RSRC" STRING
        COMMENT 'If available, replace with existing column from source.'
    )
    COMMENT = 'Nation data as defined by TPC-H'

INSERT INTO
"COALESCE_WORKSHOP"."CORE"."NATION_REGION_L"
WITH
incoming AS (
    SELECT DISTINCT
    "STG_NATION"."HK_NATION_REGION_L" AS "HK_NATION_REGION_L",
    "STG_NATION"."HK_NATION_H" AS "HK_NATION_H",
    "STG_NATION"."HK_REGION_H" AS "HK_REGION_H",
    "STG_NATION"."LDTS" AS "LDTS",
    "STG_NATION"."RSRC" AS "RSRC"
    FROM
    "COALESCE_WORKSHOP"."CORE"."STG_NATION" "STG_NATION"
),
new_records AS (
    SELECT
    "SRC".*
    FROM
    incoming "SRC"
    WHERE
    NOT EXISTS (
        SELECT
        1
        FROM
        "COALESCE_WORKSHOP"."CORE"."NATION_REGION_L" "TGT"
        WHERE
        "SRC"."HK_NATION_REGION_L" = "TGT"."HK_NATION_REGION_L"
    ) QUALIFY ROW_NUMBER() OVER (
        PARTITION BY
        "HK_NATION_REGION_L"
        ORDER BY
        "LDTS"
    ) = 1
)
SELECT
*
FROM
new_records