docs: Add docs on dbt Cloud integration #1763
base: master
Conversation
- Job ID: The unique identifier for the dbt job you wish to trigger.
- Account ID: Your dbt account identifier.
- API Key: The dbt API key associated with your account. This allows Estuary Flow to authenticate with dbt Cloud and trigger jobs on your behalf.
They also need an Access URL; it is mandatory. I know the connector marks it as non-required, but that's only because we previously had Account Prefix and, for backward compatibility, had to keep the new field marked as non-required. We always validate that one of the two is present.
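For readers unfamiliar with how these values fit together: the connector authenticates against dbt Cloud's Administrative API and triggers the configured job. The sketch below is illustrative only, not the connector's implementation; it assumes the standard v2 trigger-job-run endpoint, and the Access URL, Account ID, Job ID, and API Key values are placeholders.

```python
import requests

# Placeholders; in Flow these come from the materialization's dbt Cloud
# job trigger configuration.
ACCESS_URL = "https://cloud.getdbt.com"   # dbt access URL (see Account Settings)
ACCOUNT_ID = 12345                        # dbt account identifier
JOB_ID = 67890                            # the dbt job to trigger
API_KEY = "dbt-api-key"                   # dbt API key used for authentication

# Trigger a run of the job via dbt Cloud's v2 Administrative API.
resp = requests.post(
    f"{ACCESS_URL}/api/v2/accounts/{ACCOUNT_ID}/jobs/{JOB_ID}/run/",
    headers={"Authorization": f"Token {API_KEY}"},
    json={"cause": "Triggered by Estuary Flow"},
)
resp.raise_for_status()
print("queued run id:", resp.json()["data"]["id"])
```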
site/docs/guides/dbt-integration.md
Outdated
### Optional Parameters

- Access URL: The dbt access URL can be found in your dbt Account Settings. Use this URL if your dbt account requires a custom access URL.
I think it is worth adding a note here, since a few customers have hit this issue: if they can't find their Access URL in their dashboard, it's because they are older customers who have not yet migrated to the new API. In that case their Access URL is https://cloud.getdbt.com/.
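A minimal sketch of the fallback described above, using a hypothetical resolve_access_url helper (this is not the connector's config schema, just an illustration of applying the legacy default):

```python
DEFAULT_ACCESS_URL = "https://cloud.getdbt.com"  # legacy host for accounts not yet migrated

def resolve_access_url(configured_url: str | None) -> str:
    """Use the configured Access URL when present; otherwise fall back to the legacy default."""
    return (configured_url or DEFAULT_ACCESS_URL).rstrip("/")

assert resolve_access_url(None) == "https://cloud.getdbt.com"
assert resolve_access_url("https://ab123.us1.dbt.com/") == "https://ab123.us1.dbt.com"
```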
site/docs/guides/dbt-integration.md
Outdated
### Job Management

If you want to avoid triggering multiple overlapping dbt jobs, set Job Trigger Mode to skip. This way, if a job is already running, a new run will not be triggered.
I think it's worth mentioning this is the default behavior
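As a rough illustration of what skip mode implies, the hypothetical sketch below checks for an in-progress run before triggering a new one. It is not the connector's actual code, and the runs-listing filter parameters and status codes are assumptions about dbt Cloud's v2 API.

```python
import requests

IN_PROGRESS = {1, 2, 3}  # assumed dbt Cloud run statuses: Queued, Starting, Running

def trigger_unless_running(access_url: str, account_id: int, job_id: int, api_key: str) -> bool:
    """Skip-style trigger: only start a new run if the job has no run in progress."""
    headers = {"Authorization": f"Token {api_key}"}

    # Look at the most recent runs for this job and bail out if any is still active.
    runs = requests.get(
        f"{access_url}/api/v2/accounts/{account_id}/runs/",
        headers=headers,
        params={"job_definition_id": job_id, "order_by": "-created_at", "limit": 10},
    ).json()["data"]
    if any(run["status"] in IN_PROGRESS for run in runs):
        return False  # an overlapping run exists; skip this trigger

    requests.post(
        f"{access_url}/api/v2/accounts/{account_id}/jobs/{job_id}/run/",
        headers=headers,
        json={"cause": "Triggered by Estuary Flow"},
    ).raise_for_status()
    return True
```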
site/docs/guides/dbt-integration.md
Outdated
### Regular Data Transformation on New Data

Suppose you have a data pipeline that ingests data into a warehouse every hour (configured via a Sync Frequency), and you want a dbt job to transform each new batch of data after it lands.
The dbt Cloud trigger starts its timer as soon as the first data arrives at the connector, and any subsequent timers are also started when data arrives.
If a connector has a delay of 1 hour, it looks like this:
Connector starts up -> runs a first dbt job trigger (to ensure consistency when the connector restarts) -> materializes one small chunk -> starts a timer to trigger the dbt job in N minutes -> materializes the rest of the chunks -> starts the connector's 1-hour delay if it is not backfilling -> triggers the dbt job once N minutes have passed since the timer started (this includes during backfills).
So it is best that the dbt job trigger interval is not very long. The default is 30 minutes, meaning the job fires 30 minutes after the first bulk of data is committed. It is not very short, to avoid triggering many jobs during backfills, but during non-backfill periods it means we wait 30 minutes after the first commit before triggering a job. How much latency this creates between the final data point being materialized and the dbt job triggering depends on how long it takes for the data to be materialized to the destination.
This is the current compromise that lets us enforce a minimum interval between dbt job triggers, support connectors that don't use a Sync Interval, and support use cases where data arrival is very sparse (once a day, for example).
Thanks for the detailed writeup, I tried to incorporate this as best I could.
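To make the timing behavior described in the comment above concrete, here is a minimal model of the trigger pattern (an illustration only, not the connector's code; the class and method names are invented): one trigger on startup, then a timer armed by the first commit after each trigger, firing once the configured interval elapses.

```python
import time

TRIGGER_INTERVAL_SECONDS = 30 * 60  # default: 30 minutes after the first committed chunk

class DbtJobTriggerModel:
    """Illustrative model: one run on startup, then at most one run per interval,
    with the timer armed by the first commit after the previous trigger."""

    def __init__(self, trigger_job):
        self._trigger_job = trigger_job  # e.g. a call to the dbt Cloud run endpoint
        self._timer_started_at = None

    def on_startup(self):
        # A first trigger on startup ensures consistency across connector restarts.
        self._trigger_job()

    def on_commit(self):
        # The first commit after the last trigger arms the timer; later commits
        # within the same interval (e.g. during a backfill) do not reset it.
        if self._timer_started_at is None:
            self._timer_started_at = time.monotonic()

    def poll(self):
        # Called periodically: fire once the interval has elapsed, then disarm
        # so the next commit starts a fresh interval.
        if (
            self._timer_started_at is not None
            and time.monotonic() - self._timer_started_at >= TRIGGER_INTERVAL_SECONDS
        ):
            self._trigger_job()
            self._timer_started_at = None
```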
Description:
Adds a new guide describing the dbt Cloud job trigger integration: required parameters (Job ID, Account ID, API Key), the Access URL, job trigger modes, and how job triggers interact with sync schedules.
Workflow steps:
(How does one use this feature, and how has it changed)
Documentation links affected:
site/docs/guides/dbt-integration.md (new guide)
Notes for reviewers:
(anything that might help someone review this PR)