Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cast as int bug #140

Merged
merged 24 commits into from
Apr 16, 2024
Merged
Show file tree
Hide file tree
Changes from 19 commits
Commits
Show all changes
24 commits
Select commit Hold shift + click to select a range
2ced1bd
cast as bigint
fivetran-reneeli Apr 8, 2024
4ff078a
version
fivetran-reneeli Apr 8, 2024
942f6fb
update integration tests
fivetran-reneeli Apr 8, 2024
a56b239
data types
fivetran-reneeli Apr 8, 2024
d914af4
forgot column type label
fivetran-reneeli Apr 8, 2024
e6d74a1
cast as string JUST contact merge ids
fivetran-reneeli Apr 9, 2024
93bf2cd
try
fivetran-reneeli Apr 9, 2024
9871671
changelog
fivetran-reneeli Apr 9, 2024
9dad07e
new schema
fivetran-reneeli Apr 9, 2024
d99f759
try new schema for databricks
fivetran-reneeli Apr 10, 2024
1128a8a
try by removing the dbt compile in run script
fivetran-reneeli Apr 10, 2024
d32fefb
limit databricks adapter version and add package lock to gitignore
fivetran-reneeli Apr 10, 2024
65977e4
databricks adapter version
fivetran-reneeli Apr 10, 2024
9bfd04b
try new schema
fivetran-reneeli Apr 10, 2024
5c50c50
that didnt work
fivetran-reneeli Apr 10, 2024
6f5481a
test/jm-databricks-version-change
fivetran-joemarkiewicz Apr 10, 2024
c443197
change schema
fivetran-joemarkiewicz Apr 10, 2024
d15a750
add back dbt compile to run script
fivetran-reneeli Apr 11, 2024
029de2d
revert irrelevant seed data changes
fivetran-reneeli Apr 11, 2024
3f988f6
Update CHANGELOG.md
fivetran-reneeli Apr 12, 2024
0c06f96
changelog updates
fivetran-reneeli Apr 12, 2024
a0c7d6f
docs and comment out the enabled vars in the config to allow default
fivetran-reneeli Apr 12, 2024
e7826d1
changelog update
fivetran-reneeli Apr 15, 2024
f06b65e
changelog
fivetran-reneeli Apr 16, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 4 additions & 3 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@

target/
dbt_modules/
logs/

dbt_packages/
env/
dbt_packages/
.DS_Store
package-lock.yml
9 changes: 9 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,12 @@
# dbt_hubspot v0.17.1
[PR #140](https://github.com/fivetran/dbt_hubspot/pull/140) includes the following updates:

## Under the Hood
fivetran-reneeli marked this conversation as resolved.
Show resolved Hide resolved
- In `int_hubspot__contact_merge_adjust`, updates casting of `vid_to_merge` as `{{ dbt.type_int() }}` to `{{ dbt.type_string() }}` in the join of `contact_merge_audit` to `contacts`. Previously, casting only to `int` caused model failures resulting from integer fields that exceeded the range allowed in certain warehouses. In addition, for the case where the `contact_merge_audit` table is not present, the parsed `calculated_merged_vids` from the contact table are outputted as strings, therefore requiring the titular datatype cast in the join.

For context, the [Nov 2022 release of the Hubspot connector](https://fivetran.com/docs/connectors/applications/hubspot/changelog#november2022) should not have the `contact_merge_audit` table as that was deprecated in place of storing `property_hs_calculated_merged_vids` in the `contact` table.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels quite lengthy and crowded. I believe we can make this a bit more streamlined and direct. What are your thoughts on the following?

Suggested change
- In `int_hubspot__contact_merge_adjust`, updates casting of `vid_to_merge` as `{{ dbt.type_int() }}` to `{{ dbt.type_string() }}` in the join of `contact_merge_audit` to `contacts`. Previously, casting only to `int` caused model failures resulting from integer fields that exceeded the range allowed in certain warehouses. In addition, for the case where the `contact_merge_audit` table is not present, the parsed `calculated_merged_vids` from the contact table are outputted as strings, therefore requiring the titular datatype cast in the join.
For context, the [Nov 2022 release of the Hubspot connector](https://fivetran.com/docs/connectors/applications/hubspot/changelog#november2022) should not have the `contact_merge_audit` table as that was deprecated in place of storing `property_hs_calculated_merged_vids` in the `contact` table.
- Included explicit datatype casts to `{{ dbt.type_string() }}` within the join of `contact_merge_audit.vid_to_merge` to `contacts.contact_id` in the `int_hubspot__contact_merge_adjust` model.
- This update was required to address a bug where the IDs in the join would overflow to bigint or be interpreted as strings. This change ensures the join fields have matching datatypes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good! I would still add that snippet referring to the nov 2022 api update, so that people reading this who don't have contact_merge_audit table will know it's still relevant, but I'm going to update that part to make it more clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update-- never mind, will not mention the nov 2022 api update since it's been a while since then



# dbt_hubspot v0.17.0
[PR #137](https://github.com/fivetran/dbt_hubspot/pull/137) includes the following updates:

Expand Down
2 changes: 1 addition & 1 deletion dbt_project.yml
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
name: 'hubspot'
version: '0.17.0'
version: '0.17.1'

config-version: 2
require-dbt-version: [">=1.3.0", "<2.0.0"]
Expand Down
3 changes: 2 additions & 1 deletion integration_tests/.gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -4,4 +4,5 @@ dbt_modules/
logs/
.DS_Store
dbt_packages/
env/
env/
package-lock.yml
10 changes: 5 additions & 5 deletions integration_tests/ci/sample.profiles.yml
Original file line number Diff line number Diff line change
Expand Up @@ -16,13 +16,13 @@ integration_tests:
pass: "{{ env_var('CI_REDSHIFT_DBT_PASS') }}"
dbname: "{{ env_var('CI_REDSHIFT_DBT_DBNAME') }}"
port: 5439
schema: hubspot_integration_tests_11
schema: hubspot_integration_tests_55
threads: 8
bigquery:
type: bigquery
method: service-account-json
project: 'dbt-package-testing'
schema: hubspot_integration_tests_11
schema: hubspot_integration_tests_55
threads: 8
keyfile_json: "{{ env_var('GCLOUD_SERVICE_KEY') | as_native }}"
snowflake:
Expand All @@ -33,7 +33,7 @@ integration_tests:
role: "{{ env_var('CI_SNOWFLAKE_DBT_ROLE') }}"
database: "{{ env_var('CI_SNOWFLAKE_DBT_DATABASE') }}"
warehouse: "{{ env_var('CI_SNOWFLAKE_DBT_WAREHOUSE') }}"
schema: hubspot_integration_tests_11
schema: hubspot_integration_tests_55
threads: 8
postgres:
type: postgres
Expand All @@ -42,13 +42,13 @@ integration_tests:
pass: "{{ env_var('CI_POSTGRES_DBT_PASS') }}"
dbname: "{{ env_var('CI_POSTGRES_DBT_DBNAME') }}"
port: 5432
schema: hubspot_integration_tests_11
schema: hubspot_integration_tests_55
threads: 8
databricks:
catalog: "{{ env_var('CI_DATABRICKS_DBT_CATALOG') }}"
host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
http_path: "{{ env_var('CI_DATABRICKS_DBT_HTTP_PATH') }}"
schema: hubspot_integration_tests_11
schema: hubspot_integration_tests_55
threads: 8
token: "{{ env_var('CI_DATABRICKS_DBT_TOKEN') }}"
type: databricks
8 changes: 6 additions & 2 deletions integration_tests/dbt_project.yml
Original file line number Diff line number Diff line change
@@ -1,10 +1,11 @@
name: 'hubspot_integration_tests'
version: '0.17.0'
version: '0.17.1'

profile: 'integration_tests'
config-version: 2
vars:
hubspot_schema: hubspot_integration_tests_11
hubspot_schema: hubspot_integration_tests_55
# hubspot_contact_merge_audit_enabled: true
fivetran-joemarkiewicz marked this conversation as resolved.
Show resolved Hide resolved
hubspot_source:
hubspot_service_enabled: true
hubspot_sales_enabled: true # enable when generating docs
Expand Down Expand Up @@ -64,6 +65,9 @@ vars:
seeds:
hubspot_integration_tests:
+quote_columns: "{{ true if target.type == 'redshift' else false }}"
contact_merge_audit_data:
+column_types:
vid_to_merge: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
owner_data:
+column_types:
owner_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
Expand Down
2 changes: 1 addition & 1 deletion integration_tests/requirements.txt
Original file line number Diff line number Diff line change
Expand Up @@ -4,4 +4,4 @@ dbt-redshift>=1.3.0,<2.0.0
dbt-postgres>=1.3.0,<2.0.0
dbt-spark>=1.3.0,<2.0.0
dbt-spark[PyHive]>=1.3.0,<2.0.0
dbt-databricks>=1.3.0,<2.0.0
dbt-databricks>=1.3.0,<1.7.13
2 changes: 1 addition & 1 deletion integration_tests/seeds/contact_merge_audit_data.csv
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
canonical_vid,contact_id,vid_to_merge,_fivetran_synced,entity_id,first_name,last_name,num_properties_moved,timestamp,user_id
5555,5555,11111,2020-12-30 11:20:31,json,,,2,2019-10-08 15:50:57,
5555,5555,2147483648,2020-12-30 11:20:31,json,,,2,2019-10-08 15:50:57,
4444,4444,7777,2021-01-09 11:16:00,json,,,2,2019-09-03 15:39:29,
3333,3333,8888,2020-12-29 11:16:02,json,,,2,2019-11-20 20:17:59,
2222,2222,9999,2020-12-30 11:16:58,json,,,2,2019-12-10 18:14:34,
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ with contacts as (
from contacts

left join contact_merge_audit
on contacts.contact_id = cast(contact_merge_audit.vid_to_merge as {{ dbt.type_int() }})
on cast(contacts.contact_id as {{ dbt.type_string() }}) = cast(contact_merge_audit.vid_to_merge as {{ dbt.type_string() }})

where contact_merge_audit.vid_to_merge is null
)
Expand Down
Loading