diff --git a/website/blog/2021-11-23-how-to-upgrade-dbt-versions.md b/website/blog/2021-11-23-how-to-upgrade-dbt-versions.md index 6a9889d3033..0b1f1fe26bd 100644 --- a/website/blog/2021-11-23-how-to-upgrade-dbt-versions.md +++ b/website/blog/2021-11-23-how-to-upgrade-dbt-versions.md @@ -12,17 +12,14 @@ date: 2021-11-29 is_featured: true --- -import Latest from '/snippets/_release-stages-from-versionless.md' - - - :::tip February 2024 Update It's been a few years since dbt-core turned 1.0! Since then, we've committed to releasing zero breaking changes whenever possible and it's become much easier to upgrade dbt Core versions. In 2024, we're taking this promise further by: + - Stabilizing interfaces for everyone — adapter maintainers, metadata consumers, and (of course) people writing dbt code everywhere — as discussed in [our November 2023 roadmap update](https://github.com/dbt-labs/dbt-core/blob/main/docs/roadmap/2023-11-dbt-tng.md). -- Introducing **Latest** release track in dbt Cloud. No more manual upgrades and no need for _a second sandbox project_ just to try out new features in development. For more details, refer to [Upgrade Core version in Cloud](/docs/dbt-versions/upgrade-dbt-version-in-cloud). +- Introducing [Release tracks](/docs/dbt-versions/cloud-release-tracks) (formerly known as Versionless) to dbt Cloud. No more manual upgrades and no need for _a second sandbox project_ just to try out new features in development. For more details, refer to [Upgrade Core version in Cloud](/docs/dbt-versions/upgrade-dbt-version-in-cloud). We're leaving the rest of this post as is, so we can all remember how it used to be. Enjoy a stroll down memory lane. 
diff --git a/website/blog/2024-04-22-extended-attributes.md b/website/blog/2024-04-22-extended-attributes.md index 9013af81d47..57636cc8f6b 100644 --- a/website/blog/2024-04-22-extended-attributes.md +++ b/website/blog/2024-04-22-extended-attributes.md @@ -12,10 +12,6 @@ date: 2024-04-22 is_featured: true --- -import Latest from '/snippets/_release-stages-from-versionless.md' - - - dbt Cloud now includes a suite of new features that enable configuring precise and unique connections to data platforms at the environment and user level. These enable more sophisticated setups, like connecting a project to multiple warehouse accounts, first-class support for [staging environments](/docs/deploy/deploy-environments#staging-environment), and user-level [overrides for specific dbt versions](/docs/dbt-versions/upgrade-dbt-version-in-cloud#override-dbt-version). This gives dbt Cloud developers the features they need to tackle more complex tasks, like Write-Audit-Publish (WAP) workflows and safely testing dbt version upgrades. While you still configure a default connection at the project level and per-developer, you now have tools to get more advanced in a secure way. Soon, dbt Cloud will take this even further allowing multiple connections to be set globally and reused with _global connections_. @@ -84,7 +80,7 @@ All you need to do is configure an environment as staging and enable the **Defer ## Upgrading on a curve -Lastly, let’s consider a more specialized use case. Imagine we have a "tiger team" (consisting of a lone analytics engineer named Dave) tasked with upgrading from dbt version 1.6 to the new **Latest release track**, to take advantage of new features and performance improvements. We want to keep the rest of the data team being productive in dbt 1.6 for the time being, while enabling Dave to upgrade and do his work with Latest (and greatest) dbt. +Lastly, let’s consider a more specialized use case. 
Imagine we have a "tiger team" (consisting of a lone analytics engineer named Dave) tasked with upgrading from dbt version 1.6 to the new **[Latest release track](/docs/dbt-versions/cloud-release-tracks)**, to take advantage of new features and performance improvements. We want to keep the rest of the data team being productive in dbt 1.6 for the time being, while enabling Dave to upgrade and do his work with Latest (and greatest) dbt. ### Development environment diff --git a/website/blog/2024-06-12-putting-your-dag-on-the-internet.md b/website/blog/2024-06-12-putting-your-dag-on-the-internet.md index a8c3bebb61f..54864916d0e 100644 --- a/website/blog/2024-06-12-putting-your-dag-on-the-internet.md +++ b/website/blog/2024-06-12-putting-your-dag-on-the-internet.md @@ -12,11 +12,7 @@ date: 2024-06-14 is_featured: true --- -import Latest from '/snippets/_release-stages-from-versionless.md' - - - -**New in dbt: allow Snowflake Python models to access the internet** +## New in dbt: allow Snowflake Python models to access the internet With dbt 1.8, dbt released support for Snowflake’s [external access integrations](https://docs.snowflake.com/en/developer-guide/external-network-access/external-network-access-overview) further enabling the use of dbt + AI to enrich your data. This allows querying of external APIs within dbt Python models, a functionality that was required for dbt Cloud customer, [EQT AB](https://eqtgroup.com/). Learn about why they needed it and how they helped build the feature and get it shipped! 
@@ -49,7 +45,7 @@ This API is open and if it requires an API key, handle it similarly to managing For simplicity’s sake, we will show how to create them using [pre-hooks](/reference/resource-configs/pre-hook-post-hook) in a model configuration yml file: -``` +```yml models: - name: external_access_sample config: @@ -61,7 +57,7 @@ models: Then we can simply use the new external_access_integrations configuration parameter to use our network rule within a Python model (called external_access_sample.py): -``` +```python import snowflake.snowpark as snowpark def model(dbt, session: snowpark.Session): dbt.config( @@ -79,7 +75,7 @@ def model(dbt, session: snowpark.Session): The result is a model with some json I can parse, for example, in a SQL model to extract some information: -``` +```sql {{ config( materialized='incremental', @@ -112,12 +108,12 @@ The result is a model that will keep track of dbt invocations, and the current U This is a very new area to Snowflake and dbt -- something special about SQL and dbt is that it’s very resistant to external entropy. The second we rely on API calls, Python packages and other external dependencies, we open up to a lot more external entropy. APIs will change, break, and your models could fail. -Traditionally dbt is the T in ELT (dbt overview [here](https://docs.getdbt.com/terms/elt)), and this functionality unlocks brand new EL capabilities for which best practices do not yet exist. What’s clear is that EL workloads should be separated from T workloads, perhaps in a different modeling layer. Note also that unless using incremental models, your historical data can easily be deleted. dbt has seen a lot of use cases for this, including this AI example as outlined in this external [engineering blog post](https://klimmy.hashnode.dev/enhancing-your-dbt-project-with-large-language-models). 
+Traditionally dbt is the T in ELT (dbt overview [here](https://docs.getdbt.com/terms/elt)), and this functionality unlocks brand new EL capabilities for which best practices do not yet exist. What’s clear is that EL workloads should be separated from T workloads, perhaps in a different modeling layer. Note also that unless using incremental models, your historical data can easily be deleted. dbt has seen a lot of use cases for this, including this AI example as outlined in this external [engineering blog post](https://klimmy.hashnode.dev/enhancing-your-dbt-project-with-large-language-models). -**A few words about the power of Commercial Open Source Software** +## A few words about the power of Commercial Open Source Software In order to get this functionality shipped quickly, EQT opened a pull request, Snowflake helped with some problems we had with CI and a member of dbt Labs helped write the tests and merge the code in! -dbt now features this functionality in dbt 1.8+ and the "Latest" release track in dbt Cloud (dbt overview [here](/docs/dbt-versions/cloud-release-tracks)). +dbt now features this functionality in dbt 1.8+ and all [Release tracks](/docs/dbt-versions/cloud-release-tracks) in dbt Cloud. dbt Labs staff and community members would love to chat more about it in the [#db-snowflake](https://getdbt.slack.com/archives/CJN7XRF1B) slack channel. 
diff --git a/website/dbt-versions.js b/website/dbt-versions.js index 13ce565d354..3e59b926b80 100644 --- a/website/dbt-versions.js +++ b/website/dbt-versions.js @@ -20,6 +20,7 @@ exports.versions = [ }, { version: "1.9", + EOLDate: "2025-12-08", }, { version: "1.8", diff --git a/website/docs/docs/build/incremental-microbatch.md b/website/docs/docs/build/incremental-microbatch.md index 55c7dc92367..901f59a167c 100644 --- a/website/docs/docs/build/incremental-microbatch.md +++ b/website/docs/docs/build/incremental-microbatch.md @@ -25,7 +25,8 @@ Incremental models in dbt are a [materialization](/docs/build/materializations) Microbatch is an incremental strategy designed for large time-series datasets: - It relies solely on a time column ([`event_time`](/reference/resource-configs/event-time)) to define time-based ranges for filtering. Set the `event_time` column for your microbatch model and its direct parents (upstream models). Note, this is different to `partition_by`, which groups rows into partitions. - It complements, rather than replaces, existing incremental strategies by focusing on efficiency and simplicity in batch processing. -- Unlike traditional incremental strategies, microbatch doesn't require implementing complex conditional logic for [backfilling](#backfills). +- Unlike traditional incremental strategies, microbatch enables you to [reprocess failed batches](/docs/build/incremental-microbatch#retry), auto-detect [parallel batch execution](#parallel-batch-execution), and eliminate the need to implement complex conditional logic for [backfilling](#backfills). + - Note, microbatch might not be the best strategy for all use cases. Consider other strategies for use cases such as not having a reliable `event_time` column or if you want more control over the incremental logic. Read more in [How `microbatch` compares to other incremental strategies](#how-microbatch-compares-to-other-incremental-strategies). 
### How microbatch works @@ -179,12 +180,14 @@ It does not matter whether the table already contains data for that day. Given t Several configurations are relevant to microbatch models, and some are required: -| Config | Type | Description | Default | -|----------|------|---------------|---------| -| [`event_time`](/reference/resource-configs/event-time) | Column (required) | The column indicating "at what time did the row occur." Required for your microbatch model and any direct parents that should be filtered. | N/A | -| [`begin`](/reference/resource-configs/begin) | Date (required) | The "beginning of time" for the microbatch model. This is the starting point for any initial or full-refresh builds. For example, a daily-grain microbatch model run on `2024-10-01` with `begin = '2023-10-01` will process 366 batches (it's a leap year!) plus the batch for "today." | N/A | -| [`batch_size`](/reference/resource-configs/batch-size) | String (required) | The granularity of your batches. Supported values are `hour`, `day`, `month`, and `year` | N/A | -| [`lookback`](/reference/resource-configs/lookback) | Integer (optional) | Process X batches prior to the latest bookmark to capture late-arriving records. | `1` | + +| Config | Description | Default | Type | Required | +|----------|---------------|---------|------|---------| +| [`event_time`](/reference/resource-configs/event-time) | The column indicating "at what time did the row occur." Required for your microbatch model and any direct parents that should be filtered. | N/A | Column | Required | +| [`begin`](/reference/resource-configs/begin) | The "beginning of time" for the microbatch model. This is the starting point for any initial or full-refresh builds. For example, a daily-grain microbatch model run on `2024-10-01` with `begin = '2023-10-01'` will process 366 batches (it's a leap year!) plus the batch for "today." 
| N/A | Date | Required | +| [`batch_size`](/reference/resource-configs/batch-size) | The granularity of your batches. Supported values are `hour`, `day`, `month`, and `year` | N/A | String | Required | +| [`lookback`](/reference/resource-configs/lookback) | Process X batches prior to the latest bookmark to capture late-arriving records. | `1` | Integer | Optional | +| [`concurrent_batches`](/reference/resource-properties/concurrent_batches) | An override for whether batches run concurrently (at the same time) or sequentially (one after the other). | `None` | Boolean | Optional | @@ -280,7 +283,127 @@ For now, dbt assumes that all values supplied are in UTC: While we may consider adding support for custom time zones in the future, we also believe that defining these values in UTC makes everyone's lives easier. -## How `microbatch` compares to other incremental strategies? +## Parallel batch execution + +The microbatch strategy offers the benefit of updating a model in smaller, more manageable batches. Depending on your use case, configuring your microbatch models to run in parallel offers faster processing, in comparison to running batches sequentially. + +Parallel batch execution means that multiple batches are processed at the same time, instead of one after the other (sequentially) for faster processing of your microbatch models. + +dbt automatically detects whether a batch can be run in parallel in most cases, which means you don’t need to configure this setting. However, the [`concurrent_batches` config](/reference/resource-properties/concurrent_batches) is available as an override (not a gate), allowing you to specify whether batches should or shouldn’t be run in parallel in specific cases. + +For example, if you have a microbatch model with 12 batches, you can execute those batches to run in parallel. Specifically they'll run in parallel limited by the number of [available threads](/docs/running-a-dbt-project/using-threads). 
+ +### Prerequisites + +To enable parallel execution, you must: + +- Use a supported adapter: + - Snowflake + - Databricks + - More adapters coming soon! + - We'll continue to test and add concurrency support for adapters. This means that some adapters might get concurrency support _after_ the 1.9 initial release. + +- Meet [additional conditions](#how-parallel-batch-execution-works) described in the following section. + +### How parallel batch execution works + +A batch can only run in parallel if all of these conditions are met: + +| Condition | Parallel execution | Sequential execution| +| ---------------| :------------------: | :----------: | +| **Not** the first batch | ✅ | - | +| **Not** the last batch | ✅ | - | +| [Adapter supports](#prerequisites) parallel batches | ✅ | - | + + +After checking the conditions in the previous table, and if the `concurrent_batches` value isn't set, dbt auto-detects whether the model invokes the [`{{ this }}`](/reference/dbt-jinja-functions/this) Jinja function. If the model references `{{ this }}`, the batches run sequentially, because `{{ this }}` represents the current model's relation in the database and batches that reference the same relation can conflict. + +Otherwise, if `{{ this }}` isn't detected (and the other conditions are met), the batches run in parallel. You can override this behavior when you [set a value for `concurrent_batches`](/reference/resource-properties/concurrent_batches). + +### Parallel or sequential execution + +Choosing between parallel batch execution and sequential processing depends on the specific requirements of your use case. + +- Parallel batch execution is faster but requires logic independent of batch execution order. For example, if you're developing a data pipeline for a system that processes user transactions in batches, each batch is executed in parallel for better performance. 
However, the logic used to process each transaction shouldn't depend on the order in which batches are executed or completed. +- Sequential processing is slower but essential for calculations like [cumulative metrics](/docs/build/cumulative) in microbatch models. It processes data in the correct order, allowing each step to build on the previous one. + + + +### Configure `concurrent_batches` + +By default, dbt auto-detects whether batches can run in parallel for microbatch models, and this works correctly in most cases. However, you can override dbt's detection by setting the [`concurrent_batches` config](/reference/resource-properties/concurrent_batches) in your `dbt_project.yml` or model `.sql` file to specify parallel or sequential execution, given you meet all the [conditions](#prerequisites): + + + + + + +```yaml +models: + +concurrent_batches: true # value set to true to run batches in parallel +``` + + + + + + + + +```sql +{{ + config( + materialized='incremental', + incremental_strategy='microbatch', + event_time='session_start', + begin='2020-01-01', + batch_size='day', + concurrent_batches=true, # value set to true to run batches in parallel + ... + ) +}} + +select ... +``` + + + + +## How microbatch compares to other incremental strategies + +As data warehouses roll out new operations for concurrently replacing/upserting data partitions, we may find that the new operation for the data warehouse is more efficient than what the adapter uses for microbatch. In such instances, we reserve the right to update the default operation for microbatch, so long as it works as intended/documented for models that fit the microbatch paradigm. Most incremental models rely on the end user (you) to explicitly tell dbt what "new" means, in the context of each model, by writing a filter in an `{% if is_incremental() %}` conditional block. 
You are responsible for crafting this SQL in a way that queries [`{{ this }}`](/reference/dbt-jinja-functions/this) to check when the most recent record was last loaded, with an optional look-back window for late-arriving records. diff --git a/website/docs/docs/build/incremental-strategy.md b/website/docs/docs/build/incremental-strategy.md index 9a8f8358f0f..9176e962a3a 100644 --- a/website/docs/docs/build/incremental-strategy.md +++ b/website/docs/docs/build/incremental-strategy.md @@ -33,7 +33,7 @@ Click the name of the adapter in the below table for more information about supp | [dbt-databricks](/reference/resource-configs/databricks-configs#incremental-models) | ✅ | ✅ | | ✅ | ✅ | | [dbt-snowflake](/reference/resource-configs/snowflake-configs#merge-behavior-incremental-models) | ✅ | ✅ | ✅ | | ✅ | | [dbt-trino](/reference/resource-configs/trino-configs#incremental) | ✅ | ✅ | ✅ | | | -| [dbt-fabric](/reference/resource-configs/fabric-configs#incremental) | ✅ | ✅ | ✅ | | | +| [dbt-fabric](/reference/resource-configs/fabric-configs#incremental) | ✅ | | ✅ | | | | [dbt-athena](/reference/resource-configs/athena-configs#incremental-models) | ✅ | ✅ | | ✅ | | ### Configuring incremental strategy diff --git a/website/docs/docs/build/unit-tests.md b/website/docs/docs/build/unit-tests.md index 69d89ad30e6..a81fc088de7 100644 --- a/website/docs/docs/build/unit-tests.md +++ b/website/docs/docs/build/unit-tests.md @@ -26,7 +26,7 @@ Starting in dbt Core v1.8, we have introduced an additional type of test to dbt - We currently _don't_ support unit testing models that use recursive SQL. - If your model has multiple versions, by default the unit test will run on *all* versions of your model. Read [unit testing versioned models](/reference/resource-properties/unit-testing-versions) for more information. - Unit tests must be defined in a YML file in your [`models/` directory](/reference/project-configs/model-paths). 
-- Table names must be [aliased](/docs/build/custom-aliases) in order to unit test `join` logic. +- Table names must be aliased in order to unit test `join` logic. - Include all [`ref`](/reference/dbt-jinja-functions/ref) or [`source`](/reference/dbt-jinja-functions/source) model references in the unit test configuration as `input`s to avoid "node not found" errors during compilation. #### Adapter-specific caveats diff --git a/website/docs/docs/cloud/cloud-cli-installation.md b/website/docs/docs/cloud/cloud-cli-installation.md index 8a34401cd08..a80f1a587e0 100644 --- a/website/docs/docs/cloud/cloud-cli-installation.md +++ b/website/docs/docs/cloud/cloud-cli-installation.md @@ -319,3 +319,10 @@ This alias will allow you to use the dbt-cloud command to invoke th If you've ran a dbt command and receive a Session occupied error, you can reattach to your existing session with dbt reattach and then press Control-C and choose to cancel the invocation. + + + + +The Cloud CLI allows only one command that writes to the data warehouse at a time. If you attempt to run multiple write commands simultaneously (for example, `dbt run` and `dbt build`), you will encounter a `stuck session` error. To resolve this, cancel the specific invocation by passing its ID to the cancel command. For more information, refer to [parallel execution](/reference/dbt-commands#parallel-execution). + + \ No newline at end of file diff --git a/website/docs/docs/cloud/secure/databricks-privatelink.md b/website/docs/docs/cloud/secure/databricks-privatelink.md index d754f2b76c4..aaa6e0c6eb7 100644 --- a/website/docs/docs/cloud/secure/databricks-privatelink.md +++ b/website/docs/docs/cloud/secure/databricks-privatelink.md @@ -34,7 +34,7 @@ The following steps will walk you through the setup of a Databricks AWS PrivateL 1. 
Once dbt Cloud support has notified you that setup is complete, [register the VPC endpoint in Databricks](https://docs.databricks.com/administration-guide/cloud-configurations/aws/privatelink.html#step-3-register-privatelink-objects-and-attach-them-to-a-workspace) and attach it to the workspace: - [Register your VPC endpoint](https://docs.databricks.com/en/security/network/classic/vpc-endpoints.html) — Register the VPC endpoint using the VPC endpoint ID provided by dbt Support. - [Create a Private Access Settings object](https://docs.databricks.com/en/security/network/classic/private-access-settings.html) — Create a Private Access Settings (PAS) object with your desired public access settings, and setting Private Access Level to **Endpoint**. Choose the registered endpoint created in the previous step. - - [Create or update your workspace](https://docs.databricks.com/en/security/network/classic/privatelink.html#step-3d-create-or-update-the-workspace-front-end-back-end-or-both) — Create a workspace, or update your an existing workspace. Under **Advanced configurations → Private Link** choose the private access settings object created in the previous step. + - [Create or update your workspace](https://docs.databricks.com/en/security/network/classic/privatelink.html#step-3d-create-or-update-the-workspace-front-end-back-end-or-both) — Create a workspace, or update an existing workspace. Under **Advanced configurations → Private Link** choose the private access settings object created in the previous step. :::warning If using an existing Databricks workspace, all workloads running in the workspace need to be stopped to enable Private Link. Workloads also can't be started for another 20 minutes after making changes. 
From the [Databricks documentation](https://docs.databricks.com/en/security/network/classic/privatelink.html#step-3d-create-or-update-the-workspace-front-end-back-end-or-both): diff --git a/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.9.md b/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.9.md index 6ade3d5013f..9a4712af528 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.9.md +++ b/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.9.md @@ -49,6 +49,8 @@ Starting in Core 1.9, you can use the new [microbatch strategy](/docs/build/incr - Simplified query design: Write your model query for a single batch of data. dbt will use your `event_time`, `lookback`, and `batch_size` configurations to automatically generate the necessary filters for you, making the process more streamlined and reducing the need for you to manage these details. - Independent batch processing: dbt automatically breaks down the data to load into smaller batches based on the specified `batch_size` and processes each batch independently, improving efficiency and reducing the risk of query timeouts. If some of your batches fail, you can use `dbt retry` to load only the failed batches. - Targeted reprocessing: To load a *specific* batch or batches, you can use the CLI arguments `--event-time-start` and `--event-time-end`. +- [Automatic parallel batch execution](/docs/build/incremental-microbatch#parallel-batch-execution): Process multiple batches at the same time, instead of one after the other (sequentially) for faster processing of your microbatch models. dbt intelligently auto-detects if your batches can run in parallel, while also allowing you to manually override parallel execution with the [`concurrent_batches` config](/reference/resource-properties/concurrent_batches). 
+ Currently microbatch is supported on these adapters with more to come: * postgres diff --git a/website/docs/docs/deploy/deploy-environments.md b/website/docs/docs/deploy/deploy-environments.md index dd9d066d545..e8c7816979a 100644 --- a/website/docs/docs/deploy/deploy-environments.md +++ b/website/docs/docs/deploy/deploy-environments.md @@ -35,7 +35,7 @@ To create a new dbt Cloud deployment environment, navigate to **Deploy** -> **En In dbt Cloud, each project can have one designated deployment environment, which serves as its production environment. This production environment is _essential_ for using features like dbt Explorer and cross-project references. It acts as the source of truth for the project's production state in dbt Cloud. - + ### Semantic Layer diff --git a/website/docs/reference/dbt-jinja-functions/config.md b/website/docs/reference/dbt-jinja-functions/config.md index 3903c82eef7..8083ea2a124 100644 --- a/website/docs/reference/dbt-jinja-functions/config.md +++ b/website/docs/reference/dbt-jinja-functions/config.md @@ -34,13 +34,21 @@ __Args__: The `config.get` function is used to get configurations for a model from the end-user. Configs defined in this way are optional, and a default value can be provided. +There are 3 cases: +1. The configuration variable exists, it is not `None` +1. The configuration variable exists, it is `None` +1. The configuration variable does not exist + Example usage: ```sql {% materialization incremental, default -%} -- Example w/ no default. unique_key will be None if the user does not provide this configuration {%- set unique_key = config.get('unique_key') -%} - -- Example w/ default value. Default to 'id' if 'unique_key' not provided + -- Example w/ alternate value. Use alternative of 'id' if 'unique_key' config is provided, but it is None + {%- set unique_key = config.get('unique_key') or 'id' -%} + + -- Example w/ default value. 
Default to 'id' if the 'unique_key' config does not exist {%- set unique_key = config.get('unique_key', default='id') -%} ... ``` diff --git a/website/docs/reference/global-configs/indirect-selection.md b/website/docs/reference/global-configs/indirect-selection.md index 729176a1ff4..03048b57119 100644 --- a/website/docs/reference/global-configs/indirect-selection.md +++ b/website/docs/reference/global-configs/indirect-selection.md @@ -6,7 +6,7 @@ sidebar: "Indirect selection" import IndirSelect from '/snippets/_indirect-selection-definitions.md'; -Use the `--indirect_selection` flag to `dbt test` or `dbt build` to configure which tests to run for the nodes you specify. You can set this as a CLI flag or an environment variable. In dbt Core, you can also configure user configurations in [YAML selectors](/reference/node-selection/yaml-selectors) or in the `flags:` block of `dbt_project.yml`, which sets project-level flags. +Use the `--indirect-selection` flag to `dbt test` or `dbt build` to configure which tests to run for the nodes you specify. You can set this as a CLI flag or an environment variable. In dbt Core, you can also configure user configurations in [YAML selectors](/reference/node-selection/yaml-selectors) or in the `flags:` block of `dbt_project.yml`, which sets project-level flags. When all flags are set, the order of precedence is as follows. Refer to [About global configs](/reference/global-configs/about-global-configs) for more details: diff --git a/website/docs/reference/resource-configs/batch_size.md b/website/docs/reference/resource-configs/batch_size.md index fa632bcd44d..4001545778a 100644 --- a/website/docs/reference/resource-configs/batch_size.md +++ b/website/docs/reference/resource-configs/batch_size.md @@ -7,7 +7,7 @@ description: "dbt uses `batch_size` to determine how large batches are when runn datatype: hour | day | month | year --- -Available in dbt Cloud Versionless and dbt Core v1.9 and higher. 
+Available in the [dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks) and dbt Core v1.9 and higher. ## Definition diff --git a/website/docs/reference/resource-configs/begin.md b/website/docs/reference/resource-configs/begin.md index d73ce02145b..dd47419be21 100644 --- a/website/docs/reference/resource-configs/begin.md +++ b/website/docs/reference/resource-configs/begin.md @@ -7,7 +7,7 @@ description: "dbt uses `begin` to determine when a microbatch incremental model datatype: string --- -Available in dbt Cloud Versionless and dbt Core v1.9 and higher. +Available in the [dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks) and dbt Core v1.9 and higher. ## Definition diff --git a/website/docs/reference/resource-configs/lookback.md b/website/docs/reference/resource-configs/lookback.md index 75d33ac5aa7..037ffdeb68f 100644 --- a/website/docs/reference/resource-configs/lookback.md +++ b/website/docs/reference/resource-configs/lookback.md @@ -7,7 +7,7 @@ description: "dbt uses `lookback` to detrmine how many 'batches' of `batch_size` datatype: int --- -Available in dbt Cloud Versionless and dbt Core v1.9 and higher. +Available in the [dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks) and dbt Core v1.9 and higher. ## Definition diff --git a/website/docs/reference/resource-properties/concurrent_batches.md b/website/docs/reference/resource-properties/concurrent_batches.md new file mode 100644 index 00000000000..4d6b2ea0af4 --- /dev/null +++ b/website/docs/reference/resource-properties/concurrent_batches.md @@ -0,0 +1,90 @@ +--- +title: "concurrent_batches" +resource_types: [models] +datatype: model_name +description: "Learn about concurrent_batches in dbt." +--- + +:::note + +Available in dbt Core v1.9+ or the [dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks). 
+ +::: + + + + + + +```yaml +models: + +concurrent_batches: true +``` + + + + + + + + + + +```sql +{{ + config( + materialized='incremental', + concurrent_batches=true, + incremental_strategy='microbatch', + ... + ) +}} +select ... +``` + + + + + + +## Definition + +`concurrent_batches` is an override that lets you decide whether batches run in parallel or sequentially (one at a time). + +For more information, refer to [how batch execution works](/docs/build/incremental-microbatch#how-parallel-batch-execution-works). +## Example + +By default, dbt auto-detects whether batches can run in parallel for microbatch models. However, you can override dbt's detection by setting the `concurrent_batches` config to `false` in your `dbt_project.yml` or model `.sql` file to force sequential execution, given you meet these conditions: +* You've configured a microbatch incremental strategy. +* You're working with cumulative metrics or any logic that depends on batch order. + +Set the `concurrent_batches` config to `false` to ensure batches are processed sequentially. For example: + + + +```yaml +models: + my_project: + cumulative_metrics_model: + +concurrent_batches: false +``` + + + + + +```sql +{{ + config( + materialized='incremental', + incremental_strategy='microbatch', + concurrent_batches=false + ) +}} +select ... 
+ +``` + + + diff --git a/website/sidebars.js b/website/sidebars.js index 5d6e0582765..08494e4c713 100644 --- a/website/sidebars.js +++ b/website/sidebars.js @@ -956,6 +956,7 @@ const sidebarSettings = { "reference/resource-configs/materialized", "reference/resource-configs/on_configuration_change", "reference/resource-configs/sql_header", + "reference/resource-properties/concurrent_batches", ], }, { diff --git a/website/snippets/core-versions-table.md b/website/snippets/core-versions-table.md index 899c3dddc28..0d82ab35573 100644 --- a/website/snippets/core-versions-table.md +++ b/website/snippets/core-versions-table.md @@ -2,8 +2,8 @@ | dbt Core | Initial release | Support level and end date | |:-------------------------------------------------------------:|:---------------:|:-------------------------------------:| -| [**v1.9**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.9) | Release candidate | TBA | -| [**v1.8**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.8) | May 9 2024 | Active Support — May 8, 2025| +| [**v1.9**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.9) | Dec 9, 2024 | Active Support — Dec 8, 2025| +| [**v1.8**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.8) | May 9, 2024 | Active Support — May 8, 2025| | [**v1.7**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.7) | Nov 2, 2023 |
**dbt Core and dbt Cloud Developer & Team customers:** End of Life
**dbt Cloud Enterprise customers:** Critical Support until further notice 1
| | [**v1.6**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.6) | Jul 31, 2023 | End of Life ⚠️ | | [**v1.5**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.5) | Apr 27, 2023 | End of Life ⚠️ | diff --git a/website/static/img/docs/dbt-cloud/using-dbt-cloud/prod-settings-1.png b/website/static/img/docs/dbt-cloud/using-dbt-cloud/prod-settings-1.png new file mode 100644 index 00000000000..581c4ca6cbc Binary files /dev/null and b/website/static/img/docs/dbt-cloud/using-dbt-cloud/prod-settings-1.png differ