Handle BigQuery non-string option 'max_staleness' #237

Merged
merged 2 commits into from
Mar 11, 2024

Conversation

marcbllv
Contributor

@marcbllv marcbllv commented Oct 25, 2023

Description & motivation

resolves: #231

In BigQuery, the max_staleness option must be passed as an INTERVAL, not a string.
But the YAML config can only express it as a string, so it gets wrapped in quotes.

As a result, the compiled SQL is:

create or replace external table `... my_table_name`
options (
    max_staleness = 'INTERVAL 1 HOUR'  -- <-- Quotes make BigQuery angry here
)

while it should be:

create or replace external table `... my_table_name`
options (
    max_staleness = INTERVAL 1 HOUR  -- <-- No quotes, BigQuery's happy
)
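
The intended behavior can be sketched in Python. This is only an illustration under stated assumptions, not the macro changed in this PR: `NON_STRING_OPTIONS`, `render_option` and `render_options` are hypothetical names, and the assumed rule is that string values get single quotes unless the option key is known to take a non-string type. Escaping of quotes inside values is not handled here.

```python
# Hypothetical sketch; names are illustrative, not taken from dbt-external-tables.
NON_STRING_OPTIONS = {"max_staleness"}  # options whose YAML string must stay unquoted


def render_option(key, value):
    """Render one `key = value` entry for a BigQuery OPTIONS clause."""
    if isinstance(value, str) and key not in NON_STRING_OPTIONS:
        return f"{key} = '{value}'"  # ordinary string option: wrap in quotes
    return f"{key} = {value}"  # INTERVAL-like or non-string value: emit as-is


def render_options(options):
    """Join all rendered options into an `options (...)` clause."""
    body = ",\n    ".join(render_option(k, v) for k, v in options.items())
    return f"options (\n    {body}\n)"
```

With this rule, `render_option("max_staleness", "INTERVAL 1 HOUR")` leaves the interval unquoted, while other string options keep the quoting they get today.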

Checklist

  • I have verified that these changes work locally
  • I have updated the README.md (if applicable)
  • I have added an integration test for my fix/feature (if applicable)

@marcbllv marcbllv requested a review from jeremyyeo as a code owner October 25, 2023 10:12
@cbini

cbini commented Nov 1, 2023

just ran into this myself! hoping this can get merged into a release soon

@marcbllv
Contributor Author

marcbllv commented Nov 8, 2023

Thank you @cbini! Do you know when it could be merged and a new version released?

@cbini

cbini commented Nov 8, 2023

@marcbllv nope! I'm just another user who's having the same problem. Wanted to bump this to call attention to it.

@thomas-vl
Contributor

@marcbllv I think a cleaner solution is to just remove the check for whether it's a string and the added quotes. If you need to pass a string into the options, do so explicitly by putting the quotes in the YAML configuration. The solution in this PR adds complexity instead of reducing it.

@marcbllv
Contributor Author

marcbllv commented Feb 8, 2024

@thomas-vl thanks for your answer! I agree that it adds complexity to the code, but IMO that's fine since it reduces complexity in the YAML configuration for users: you can't write INTERVAL types in YAML, so users simply write strings and dbt-external-tables does the conversion (adding quotes or not, depending on the option's type in BigQuery). No need to worry about quoting in YAML.

Also, removing both the check and the quotes would break backwards compatibility: every user would have to explicitly write quotes in YAML (which is super painful since you need to escape them).

IMO the added complexity is OK: it's not that large, and it allows users to do something that can't be done otherwise!
WDYT?
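
To make the backwards-compatibility concern concrete, here is a rough Python sketch of the alternative (emit every value verbatim, with no string check). `render_verbatim` is a hypothetical helper, not code from this repo, and the option names are only examples.

```python
# Hypothetical sketch of the "no string check" alternative; not the package's code.
def render_verbatim(key, value):
    """Emit the option value exactly as written in YAML, never adding quotes."""
    return f"{key} = {value}"


# An INTERVAL now renders correctly without quotes...
interval_sql = render_verbatim("max_staleness", "INTERVAL 1 HOUR")

# ...but an existing plain-string config loses the quotes it used to get.
broken_sql = render_verbatim("description", "my table")

# Users would have to embed the SQL quotes inside the YAML string themselves.
fixed_sql = render_verbatim("description", "'my table'")
```

Here `broken_sql` comes out as `description = my table`, which is why every existing string option would need its quotes added by hand in YAML.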

@cbini

cbini commented Feb 8, 2024

I think another thing you could do is add max_staleness to the if in the for loop like uris.

edit: nope! that'll just skip it entirely

@thomas-vl
Contributor

@marcbllv OK, makes sense. We have an open discussion anyway about changing how BigQuery options are set.

Contributor

@thomas-vl thomas-vl left a comment

LGTM 🧙‍♂️

@thomas-vl
Contributor

@jeremyyeo much needed feature, let's merge it 🥇

@thomas-vl
Contributor

@dataders I saw you merged another PR, could you also take a look at this one?

@dataders dataders merged commit 383ee3c into dbt-labs:main Mar 11, 2024
@dataders dataders linked an issue Apr 5, 2024 that may be closed by this pull request
Development

Successfully merging this pull request may close these issues.

Bigquery max_staleness configuration
4 participants