Describe the feature
Staging external sources (`stage_external_sources`) would run a lot faster if it used multiple threads to stage multiple tables in parallel.
Describe alternatives you've considered
I previously used a pre-hook on each model that referenced an external table. Because the hooks were part of the models, they ran in parallel. This implementation was a bit messy, though: the external table did not appear in the DAG, and each model had to include a `CREATE OR REPLACE EXTERNAL TABLE ...` statement.
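For illustration, the pre-hook workaround might look roughly like the model below. This is only a sketch: the model path, project, dataset, table, and bucket names are hypothetical, and the files are assumed to be CSV with a header row.

```sql
-- models/staging/stg_events.sql (hypothetical model name)
-- The pre-hook recreates the external table just before this model builds,
-- so the DDL runs on whichever thread dbt assigns to the model.
{{ config(
    materialized='view',
    pre_hook="
        CREATE OR REPLACE EXTERNAL TABLE `my_project.raw.events_ext`
        OPTIONS (
            format = 'CSV',
            uris = ['gs://my-bucket/events/*.csv'],
            skip_leading_rows = 1
        )
    "
) }}

-- The external table is referenced by name, not through a source or ref call,
-- which is why it does not appear in the DAG.
select *
from `my_project.raw.events_ext`
```

With one such model per external table, the DDL statements are spread across the configured threads, at the cost of repeating the external table definition inside each model.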
Additional context
I have only used this on BigQuery.
Who will this benefit?
Anyone with a lot of external tables to stage before each build. I have 10 and staging takes over a minute, and the time will scale linearly with the number of external tables.
Hey @azdoherty, definitely move that discussion over there. FWIW, this is probably a dbt-core library issue: it's not possible to run SQL statements from an operation in parallel today, with the dbt-external-tables package or otherwise. I've suggested the same workaround you used (hooks, since models can run in parallel), plus some other funky patterns using custom materializations: https://gist.github.com/jeremyyeo/b61655a3e5a52eb27640363650c79a1e. The idea is the same, though: models run in parallel (up to the `threads` config), so use that mechanism to run operations in parallel instead.
However, this is primarily a dbt-core / dbt-adapters library issue, IMHO.