update macro used for passing through all columns to ensure quoting #129

Merged
merged 30 commits into from
Oct 16, 2024
Changes from 29 commits
e40bb44
updates apply
fivetran-reneeli Oct 4, 2024
dc38bc4
rm the test field in seed data since not all warehouses like the syntax
fivetran-reneeli Oct 4, 2024
d5198a5
docs
fivetran-reneeli Oct 4, 2024
b047a69
new schema
fivetran-reneeli Oct 4, 2024
896498b
try to adapt add pass through column macro
fivetran-reneeli Oct 8, 2024
fb9ee19
adjust add pass through macro
fivetran-reneeli Oct 8, 2024
6886a74
Update macros/hubspot_add_pass_through_columns.sql
fivetran-reneeli Oct 9, 2024
a56799f
use custom hubspot pass thru columns, add company seed data for postg…
fivetran-reneeli Oct 9, 2024
fa09f4a
docs
fivetran-reneeli Oct 9, 2024
1ed3424
add databricks and snowflake seed, adjust configs
fivetran-reneeli Oct 9, 2024
403cf3a
disable
fivetran-reneeli Oct 9, 2024
e08ad5a
change schema
fivetran-reneeli Oct 9, 2024
16cd3dc
new schema
fivetran-reneeli Oct 10, 2024
5d8d1bb
column type config fix
fivetran-reneeli Oct 10, 2024
81e26f2
new schema
fivetran-reneeli Oct 10, 2024
77c4261
rm the hubspot_source prefix
fivetran-reneeli Oct 10, 2024
6742939
new schema
fivetran-reneeli Oct 10, 2024
afa5879
Update CHANGELOG.md
fivetran-reneeli Oct 15, 2024
ea99532
Update CHANGELOG.md
fivetran-reneeli Oct 15, 2024
bd58f38
update versioning and some docs
fivetran-reneeli Oct 15, 2024
900fdc9
more updates
fivetran-reneeli Oct 15, 2024
c7819c9
new schema
fivetran-reneeli Oct 15, 2024
e34eded
docs
fivetran-reneeli Oct 15, 2024
bdccd3d
rm validation test config from source
fivetran-reneeli Oct 15, 2024
019f8e3
new schema
fivetran-reneeli Oct 15, 2024
00c6305
fix databricks seed error?
fivetran-reneeli Oct 15, 2024
caca80b
try again
fivetran-reneeli Oct 16, 2024
ab9a102
Update CHANGELOG.md
fivetran-reneeli Oct 16, 2024
5bf7c7d
changelog updates
fivetran-reneeli Oct 16, 2024
1169162
quote some test configs
fivetran-reneeli Oct 16, 2024
14 changes: 7 additions & 7 deletions .buildkite/scripts/run_models.sh
@@ -17,13 +17,13 @@ echo `pwd`
cd integration_tests
dbt deps
if [ "$db" = "databricks-sql" ]; then
-dbt seed --vars '{hubspot_schema: hubspot_sqlw_tests}' --target "$db" --full-refresh
-dbt compile --vars '{hubspot_schema: hubspot_sqlw_tests}' --target "$db"
-dbt run --vars '{hubspot_schema: hubspot_sqlw_tests}' --target "$db" --full-refresh
-dbt test --vars '{hubspot_schema: hubspot_sqlw_tests}' --target "$db"
-dbt run --vars '{hubspot_schema: hubspot_sqlw_tests, hubspot_marketing_enabled: true, hubspot_contact_merge_audit_enabled: true, hubspot_sales_enabled: false}' --target "$db"
-dbt run --vars '{hubspot_schema: hubspot_sqlw_tests, hubspot_marketing_enabled: false, hubspot_sales_enabled: true, hubspot_merged_deal_enabled: true, hubspot__pass_through_all_columns: true, hubspot_using_all_email_events: false, hubspot_owner_enabled: false}' --target "$db"
-dbt test --vars '{hubspot_schema: hubspot_sqlw_tests}' --target "$db"
+dbt seed --vars '{hubspot_schema: hubspot_sqlw_tests_1}' --target "$db" --full-refresh
+dbt compile --vars '{hubspot_schema: hubspot_sqlw_tests_1}' --target "$db"
+dbt run --vars '{hubspot_schema: hubspot_sqlw_tests_1}' --target "$db" --full-refresh
+dbt test --vars '{hubspot_schema: hubspot_sqlw_tests_1}' --target "$db"
+dbt run --vars '{hubspot_schema: hubspot_sqlw_tests_1, hubspot_marketing_enabled: true, hubspot_contact_merge_audit_enabled: true, hubspot_sales_enabled: false}' --target "$db"
+dbt run --vars '{hubspot_schema: hubspot_sqlw_tests_1, hubspot_marketing_enabled: false, hubspot_sales_enabled: true, hubspot_merged_deal_enabled: true, hubspot__pass_through_all_columns: true, hubspot_using_all_email_events: false, hubspot_owner_enabled: false}' --target "$db"
+dbt test --vars '{hubspot_schema: hubspot_sqlw_tests_1}' --target "$db"
else
dbt seed --target "$db" --full-refresh
dbt run --target "$db" --full-refresh
16 changes: 16 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,19 @@
# dbt_hubspot_source v0.16.0
[PR #129](https://github.com/fivetran/dbt_hubspot_source/pull/129) includes the following updates:

## Breaking Changes
- Switched from the `fivetran_utils.remove_prefix_from_columns` macro to the `hubspot_source.remove_duplicate_and_prefix_from_columns` macro when `hubspot__pass_through_all_columns` is enabled, that is, when you pass through all columns in the `stg_hubspot__company`, `stg_hubspot__contact`, and `stg_hubspot__deal` models. This also ensures that the source fields passed through are quoted from the outset. This is a breaking change because the new macro can remove duplicate fields, which may change your schema.
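
To make the schema impact concrete, here is a minimal Python sketch of one plausible reading of "remove duplicates and the prefix". It is not the package's Jinja macro: the function name, the `property_` prefix default, and the case-insensitive comparison are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple


def remove_duplicate_and_prefix(
    columns: Iterable[str],
    prefix: str = "property_",  # assumption: the prefix stripped from HubSpot property fields
) -> List[Tuple[str, str]]:
    """Return (source_name, output_name) pairs.

    The prefix is stripped from each source name, and a column is dropped
    when its output name was already produced by an earlier column.
    """
    seen = set()
    result = []
    for source_name in columns:
        if source_name.startswith(prefix):
            output_name = source_name[len(prefix):]
        else:
            output_name = source_name
        key = output_name.lower()  # assumption: names are compared case-insensitively
        if key in seen:
            continue  # duplicate output name: skip it, so the column leaves the schema
        seen.add(key)
        result.append((source_name, output_name))
    return result


pairs = remove_duplicate_and_prefix(
    ["id", "property_name", "property_id", "property_Name"]
)
for source_name, output_name in pairs:
    # Source names are quoted; output names are plain identifiers here.
    print(f'"{source_name}" as {output_name}')
```

In this sketch `property_id` and `property_Name` are dropped because `id` and `name` were already emitted, which is the kind of column disappearance that makes the change breaking for downstream models.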

## Bug Fixes
- Introduced a HubSpot-specific version of the `fivetran_utils.pass_through_columns` macro, titled `hubspot_add_pass_through_columns`, which quotes the source fields brought in as passthrough columns. This ensures that your warehouse reads the SQL correctly, particularly when a field name contains special characters or syntax. The macro is now used in the respective `get_<>_columns` macros of the following models:
- `stg_hubspot__company`
- `stg_hubspot__contact`
- `stg_hubspot__deal`
- `stg_hubspot__ticket`
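
The effect of quoting can be illustrated with a short Python sketch. This is not the macro itself: `quote_identifier` and `pass_through_select` are hypothetical helpers, and doubling an embedded quote character is the standard SQL convention for double-quoted identifiers, which may not match every warehouse dialect. The column entry mirrors the `name`/`alias` shape used by the pass-through variables in this PR's integration test config.

```python
from typing import Dict, List, Optional


def quote_identifier(name: str, quote_char: str = '"') -> str:
    """Wrap a column name in quotes, doubling any embedded quote character."""
    escaped = name.replace(quote_char, quote_char * 2)
    return f"{quote_char}{escaped}{quote_char}"


def pass_through_select(
    columns: List[Dict[str, Optional[str]]],
    quote_char: str = '"',
) -> List[str]:
    """Build one select expression per pass-through column.

    The source field is always quoted; the alias, when present, is emitted as-is.
    """
    expressions = []
    for column in columns:
        quoted = quote_identifier(column["name"], quote_char)
        alias = column.get("alias")
        expressions.append(f"{quoted} as {alias}" if alias else quoted)
    return expressions


columns = [{"name": "property_hs_all-funky-a9384-syntax", "alias": "funky_field"}]
print(pass_through_select(columns))
```

Without the quotes, a warehouse would typically parse the hyphens in `property_hs_all-funky-a9384-syntax` as subtraction operators rather than as part of the field name.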

## Under the Hood
- Updated seed data to include fields with special syntax in order to test the above changes.

fivetran-joemarkiewicz marked this conversation as resolved.
# dbt_hubspot_source v0.15.0
[PR #126](https://github.com/fivetran/dbt_hubspot_source/pull/126) includes the following updates:

2 changes: 1 addition & 1 deletion README.md
@@ -44,7 +44,7 @@ Include the following hubspot_source package version in your `packages.yml` file
```yaml
packages:
- package: fivetran/hubspot_source
-version: [">=0.15.0", "<0.16.0"]
+version: [">=0.16.0", "<0.17.0"]
```
### Step 3: Define database and schema variables
By default, this package runs using your destination and the `hubspot` schema. If this is not where your HubSpot data is (for example, if your HubSpot schema is named `hubspot_fivetran`), add the following configuration to your root `dbt_project.yml` file:
2 changes: 1 addition & 1 deletion dbt_project.yml
@@ -1,5 +1,5 @@
name: 'hubspot_source'
-version: '0.15.0'
+version: '0.16.0'
config-version: 2
require-dbt-version: [">=1.3.0", "<2.0.0"]
models:
2 changes: 1 addition & 1 deletion docs/catalog.json

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions docs/index.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/manifest.json

Large diffs are not rendered by default.

1 change: 0 additions & 1 deletion docs/run_results.json

This file was deleted.

12 changes: 6 additions & 6 deletions integration_tests/ci/sample.profiles.yml
@@ -16,13 +16,13 @@ integration_tests:
pass: "{{ env_var('CI_REDSHIFT_DBT_PASS') }}"
dbname: "{{ env_var('CI_REDSHIFT_DBT_DBNAME') }}"
port: 5439
-schema: hubspot_source_integration_tests_21
+schema: hubspot_source_integration_tests_28
threads: 8
bigquery:
type: bigquery
method: service-account-json
project: 'dbt-package-testing'
-schema: hubspot_source_integration_tests_21
+schema: hubspot_source_integration_tests_28
threads: 8
keyfile_json: "{{ env_var('GCLOUD_SERVICE_KEY') | as_native }}"
snowflake:
@@ -33,7 +33,7 @@ integration_tests:
role: "{{ env_var('CI_SNOWFLAKE_DBT_ROLE') }}"
database: "{{ env_var('CI_SNOWFLAKE_DBT_DATABASE') }}"
warehouse: "{{ env_var('CI_SNOWFLAKE_DBT_WAREHOUSE') }}"
-schema: hubspot_source_integration_tests_21
+schema: hubspot_source_integration_tests_28
threads: 8
postgres:
type: postgres
@@ -42,21 +42,21 @@ integration_tests:
pass: "{{ env_var('CI_POSTGRES_DBT_PASS') }}"
dbname: "{{ env_var('CI_POSTGRES_DBT_DBNAME') }}"
port: 5432
-schema: hubspot_source_integration_tests_21
+schema: hubspot_source_integration_tests_28
threads: 8
databricks:
catalog: "{{ env_var('CI_DATABRICKS_DBT_CATALOG') }}"
host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
http_path: "{{ env_var('CI_DATABRICKS_DBT_HTTP_PATH') }}"
-schema: hubspot_source_integration_tests_21
+schema: hubspot_source_integration_tests_28
threads: 8
token: "{{ env_var('CI_DATABRICKS_DBT_TOKEN') }}"
type: databricks
databricks-sql:
catalog: "{{ env_var('CI_DATABRICKS_DBT_CATALOG') }}"
host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
http_path: "{{ env_var('CI_DATABRICKS_SQL_DBT_HTTP_PATH') }}"
-schema: hubspot_sqlw_tests
+schema: hubspot_sqlw_tests_1
threads: 8
token: "{{ env_var('CI_DATABRICKS_SQL_DBT_TOKEN') }}"
type: databricks
36 changes: 30 additions & 6 deletions integration_tests/dbt_project.yml
@@ -1,17 +1,19 @@
name: 'hubspot_source_integration_tests'
-version: '0.15.0'
+version: '0.16.0'
profile: 'integration_tests'
config-version: 2

models:
+schema: "{{ 'hubspot_sqlw_tests' if target.name == 'databricks-sql' else 'hubspot' }}"
# +schema: "hubspot_{{ var('directed_schema','dev') }}" ## To be used for validation testing
+schema: "{{ 'hubspot_sqlw_tests_1' if target.name == 'databricks-sql' else 'hubspot' }}"

vars:
-hubspot_schema: hubspot_source_integration_tests_21
+hubspot_schema: hubspot_source_integration_tests_28
hubspot_source:
hubspot_service_enabled: true
# hubspot_sales_enabled: true # enable when generating docs
# hubspot_service_enabled: true # enable when generating docs
# hubspot_deal_enabled: true # enable when generating docs
# hubspot_contact_enabled: true # enable when generating docs
hubspot_sales_enabled: true # enable when generating docs
Contributor

Should lines 15-16 be commented out before merging?

Contributor Author

Thanks, Avinash. Commented out.

hubspot_company_enabled: true # enable when generating docs
# hubspot_marketing_enabled: true # enable when generating docs
# hubspot_contact_merge_audit_enabled: true # enable when generating docs
# hubspot_using_all_email_events: true # enable when generating docs
@@ -65,15 +67,37 @@ vars:
hubspot_email_event_dropped_identifier: "email_event_dropped_data"
hubspot_merged_deal_identifier: "merged_deal_data"

# hubspot__pass_through_all_columns: true
hubspot__company_pass_through_columns:
Contributor

Should lines 71-73 be commented out (or removed entirely) now that we've thoroughly tested and validated that this solution works, or is there a reason to keep them in?

Contributor Author

No particular reason for keeping them in. I will comment them out.

- name: "property_hs_all-funky-a9384-syntax"
alias: "funky_field"


seeds:
hubspot_source_integration_tests:
+quote_columns: "{{ true if target.type == 'redshift' else false }}"
owner_data:
+column_types:
owner_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
company_data:
+enabled: "{{ true if target.type not in ('postgres','snowflake','databricks') else false }}"
+column_types:
id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
company_data_postgres:
+alias: company_data
+enabled: "{{ true if target.type == 'postgres' else false }}"
+column_types:
id: bigint
company_data_snowflake:
+alias: company_data
+enabled: "{{ true if target.type == 'snowflake' else false }}"
+column_types:
id: bigint
company_data_databricks:
+alias: company_data
+enabled: "{{ true if target.type in ('databricks','databricks-sql') else false }}"
+column_types:
id: bigint
deal_data:
+column_types:
deal_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
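
The `+enabled` expressions for the company seeds above are meant to leave a single seed aliased to `company_data` per warehouse. A minimal Python sketch can check that invariant. The rule table is transcribed from the config in this diff; `COMPANY_SEED_RULES` and `enabled_company_seeds` are hypothetical names, and the list of target types tested is an assumption based on the profiles in `sample.profiles.yml`.

```python
from typing import Callable, Dict, List

# Each rule mirrors the Jinja condition on target.type for one seed.
COMPANY_SEED_RULES: Dict[str, Callable[[str], bool]] = {
    "company_data": lambda t: t not in ("postgres", "snowflake", "databricks"),
    "company_data_postgres": lambda t: t == "postgres",
    "company_data_snowflake": lambda t: t == "snowflake",
    "company_data_databricks": lambda t: t in ("databricks", "databricks-sql"),
}


def enabled_company_seeds(target_type: str) -> List[str]:
    """Return the company seeds whose +enabled rule is true for this target type."""
    return [seed for seed, rule in COMPANY_SEED_RULES.items() if rule(target_type)]


for target_type in ("bigquery", "redshift", "postgres", "snowflake", "databricks"):
    seeds = enabled_company_seeds(target_type)
    # Exactly one seed should build the company_data table on each warehouse.
    assert len(seeds) == 1, (target_type, seeds)
    print(target_type, "->", seeds[0])
```

Note that `databricks-sql` is a target name in `sample.profiles.yml`, where that target is declared with `type: databricks`. Under that profile `target.type` evaluates to `databricks`, so the `'databricks-sql'` entry in the last rule is never matched, but it is harmless.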