diff --git a/src/current/_includes/v23.1/misc/tooling.md b/src/current/_includes/v23.1/misc/tooling.md index 7a3dab08975..4d6eb55492a 100644 --- a/src/current/_includes/v23.1/misc/tooling.md +++ b/src/current/_includes/v23.1/misc/tooling.md @@ -93,6 +93,7 @@ For a list of tools supported by the CockroachDB community, see [Third-Party Too | [Qlik Replicate](https://www.qlik.com/us/products/qlik-replicate) | November 2022 | Full | [Migrate and Replicate Data with Qlik Replicate]({% link {{ page.version.version }}/qlik.md %}) | [Striim](https://www.striim.com) | 4.1.2 | Full | [Migrate and Replicate Data with Striim]({% link {{ page.version.version }}/striim.md %}) | [Oracle GoldenGate](https://www.oracle.com/integration/goldengate/) | 21.3 | Partial | [Migrate and Replicate Data with Oracle GoldenGate]({% link {{ page.version.version }}/goldengate.md %}) +| [Debezium](https://debezium.io/) | 2.4 | Full | [Migrate Data with Debezium]({% link {{ page.version.version }}/debezium.md %}) ## Provisioning tools | Tool | Latest tested version | Support level | Documentation | diff --git a/src/current/_includes/v23.1/sidebar-data/migrate.json b/src/current/_includes/v23.1/sidebar-data/migrate.json index f64c8fd989d..949b2754d81 100644 --- a/src/current/_includes/v23.1/sidebar-data/migrate.json +++ b/src/current/_includes/v23.1/sidebar-data/migrate.json @@ -52,6 +52,12 @@ "urls": [ "/${VERSION}/goldengate.html" ] + }, + { + "title": "Debezium", + "urls": [ + "/${VERSION}/debezium.html" + ] } ] }, diff --git a/src/current/_includes/v23.2/misc/tooling.md b/src/current/_includes/v23.2/misc/tooling.md index 7a3dab08975..4d6eb55492a 100644 --- a/src/current/_includes/v23.2/misc/tooling.md +++ b/src/current/_includes/v23.2/misc/tooling.md @@ -93,6 +93,7 @@ For a list of tools supported by the CockroachDB community, see [Third-Party Too | [Qlik Replicate](https://www.qlik.com/us/products/qlik-replicate) | November 2022 | Full | [Migrate and Replicate Data with Qlik Replicate]({% link 
{{ page.version.version }}/qlik.md %}) | [Striim](https://www.striim.com) | 4.1.2 | Full | [Migrate and Replicate Data with Striim]({% link {{ page.version.version }}/striim.md %}) | [Oracle GoldenGate](https://www.oracle.com/integration/goldengate/) | 21.3 | Partial | [Migrate and Replicate Data with Oracle GoldenGate]({% link {{ page.version.version }}/goldengate.md %}) +| [Debezium](https://debezium.io/) | 2.4 | Full | [Migrate Data with Debezium]({% link {{ page.version.version }}/debezium.md %}) ## Provisioning tools | Tool | Latest tested version | Support level | Documentation | diff --git a/src/current/_includes/v23.2/sidebar-data/migrate.json b/src/current/_includes/v23.2/sidebar-data/migrate.json index f64c8fd989d..949b2754d81 100644 --- a/src/current/_includes/v23.2/sidebar-data/migrate.json +++ b/src/current/_includes/v23.2/sidebar-data/migrate.json @@ -52,6 +52,12 @@ "urls": [ "/${VERSION}/goldengate.html" ] + }, + { + "title": "Debezium", + "urls": [ + "/${VERSION}/debezium.html" + ] } ] }, diff --git a/src/current/v23.1/debezium.md b/src/current/v23.1/debezium.md new file mode 100644 index 00000000000..a8649d31bdc --- /dev/null +++ b/src/current/v23.1/debezium.md @@ -0,0 +1,94 @@ +--- +title: Migrate Data with Debezium +summary: Use Debezium to migrate data to a CockroachDB cluster. +toc: true +docs_area: migrate +--- + +[Debezium](https://debezium.io/) is a self-hosted distributed platform that can read data from a variety of sources and import it into Kafka. You can use Debezium to [migrate data to CockroachDB](#migrate-data-to-cockroachdb) from another database that is accessible over the public internet. 
+ +As of this writing, Debezium supports the following database [sources](https://debezium.io/documentation/reference/stable/connectors/index.html): + +- MongoDB +- MySQL +- PostgreSQL +- SQL Server +- Oracle +- Db2 +- Cassandra +- Vitess (incubating) +- Spanner (incubating) +- JDBC (incubating) + +{{site.data.alerts.callout_info}} +Migrating with Debezium requires familiarity with Kafka. Refer to the [Debezium documentation](https://debezium.io/documentation/reference/stable/architecture.html) for information on how Debezium is deployed with Kafka Connect. +{{site.data.alerts.end}} + +## Before you begin + +Complete the following items before using Debezium: + +- Configure a secure [publicly-accessible]({% link cockroachcloud/network-authorization.md %}) CockroachDB cluster running the latest **{{ page.version.version }}** [production release](https://www.cockroachlabs.com/docs/releases/{{ page.version.version }}) with at least one [SQL user]({% link {{ page.version.version }}/security-reference/authorization.md %}#sql-users). Make a note of the credentials for the SQL user. +- Install and configure [Debezium](https://debezium.io/), [Kafka Connect](https://docs.confluent.io/platform/current/connect/index.html), and [Kafka](https://kafka.apache.org/). This documentation assumes you have already added data from your [source database](https://debezium.io/documentation/reference/stable/connectors/index.html) to a Kafka topic. + +## Migrate data to CockroachDB + +Once you have completed the [prerequisite steps](#before-you-begin), you can use Debezium to migrate data to CockroachDB. + +1. To write data from Kafka to CockroachDB, use the Confluent JDBC Sink Connector. 
First, use the following `Dockerfile` to create a custom image with the [JDBC driver](https://www.confluent.io/hub/confluentinc/kafka-connect-jdbc): + + {% include_cached copy-clipboard.html %} + ~~~ + FROM quay.io/debezium/connect:latest + ENV KAFKA_CONNECT_JDBC_DIR=$KAFKA_CONNECT_PLUGINS_DIR/kafka-connect-jdbc + + + ARG POSTGRES_VERSION=latest + ARG KAFKA_JDBC_VERSION=latest + + + # Deploy PostgreSQL JDBC Driver + RUN cd /kafka/libs && curl -sO https://jdbc.postgresql.org/download/postgresql-$POSTGRES_VERSION.jar + + + # Deploy Kafka Connect JDBC + RUN mkdir $KAFKA_CONNECT_JDBC_DIR && cd $KAFKA_CONNECT_JDBC_DIR &&\ + curl -sO https://packages.confluent.io/maven/io/confluent/kafka-connect-jdbc/$KAFKA_JDBC_VERSION/kafka-connect-jdbc-$KAFKA_JDBC_VERSION.jar + ~~~ + +1. Create the JSON configuration file that you will use to create the sink: + + {% include_cached copy-clipboard.html %} + ~~~ json + { + "name": "pg-sink", + "config": { + "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector", + "tasks.max": "10", + "topics": "{topic.example.table}", + "connection.url": "jdbc:postgresql://{host}:{port}/{database}?sslmode=require", + "connection.user": "{username}", + "connection.password": "{password}", + "insert.mode": "upsert", + "pk.mode": "record_value", + "pk.fields": "id", + "database.time_zone": "UTC", + "auto.create": true, + "auto.evolve": false, + "transforms": "unwrap", + "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState" + } + } + ~~~ + + Specify the **Connection URL** in [JDBC format]({% link {{ page.version.version }}/connect-to-the-database.md %}?filters=java&#step-5-connect-to-the-cluster). For information about where to find the CockroachDB connection parameters, see [Connect to a CockroachDB Cluster]({% link {{ page.version.version }}/connect-to-the-database.md %}). + +1. To create the sink, `POST` the JSON configuration file to the Kafka Connect `/connectors` endpoint. 
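 For example, if the configuration above is saved as `pg-sink.json`, the sink can be registered with a `curl` call like the following. This is a sketch, not part of the official procedure: the filename, the `localhost:8083` Connect REST address, and the minimal config written by the heredoc are assumptions; substitute the values for your deployment.

```shell
# Hypothetical filename and Connect REST address; adjust for your deployment.
CONFIG=pg-sink.json
CONNECT_URL=http://localhost:8083

# For illustration, write a minimal connector config. In practice, the
# full configuration from the previous step goes here instead.
cat > "$CONFIG" <<'EOF'
{
  "name": "pg-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector"
  }
}
EOF

# Sanity-check that the file is valid JSON before submitting it.
python3 -m json.tool "$CONFIG" > /dev/null && echo "config OK"

# Register the sink. On success, Kafka Connect returns the created
# connector definition as JSON with HTTP 201. Guarded so the snippet
# is safe to run where no Connect instance is listening.
curl -sS -X POST -H "Content-Type: application/json" \
  --data @"$CONFIG" "$CONNECT_URL/connectors" \
  || echo "Kafka Connect not reachable at $CONNECT_URL"
```

The same call with `-X PUT` against `/connectors/pg-sink/config` updates an existing connector rather than creating a new one.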
Refer to the [Kafka Connect API documentation](https://kafka.apache.org/documentation/#connect_rest) for more information. + +## See also + +- [Migration Overview]({% link {{ page.version.version }}/migration-overview.md %}) +- [Schema Conversion Tool](https://www.cockroachlabs.com/docs/cockroachcloud/migrations-page) +- [Change Data Capture Overview]({% link {{ page.version.version }}/change-data-capture-overview.md %}) +- [Third-Party Tools Supported by Cockroach Labs]({% link {{ page.version.version }}/third-party-database-tools.md %}) +- [Stream a Changefeed to a Confluent Cloud Kafka Cluster]({% link {{ page.version.version }}/stream-a-changefeed-to-a-confluent-cloud-kafka-cluster.md %}) diff --git a/src/current/v23.2/debezium.md b/src/current/v23.2/debezium.md new file mode 100644 index 00000000000..914ce3c7e6a --- /dev/null +++ b/src/current/v23.2/debezium.md @@ -0,0 +1,94 @@ +--- +title: Migrate Data with Debezium +summary: Use Debezium to migrate data to a CockroachDB cluster. +toc: true +docs_area: migrate +--- + +[Debezium](https://debezium.io/) is a self-hosted distributed platform that can read data from a variety of sources and import it into Kafka. You can use Debezium to [migrate data to CockroachDB](#migrate-data-to-cockroachdb) from another database that is accessible over the public internet. + +As of this writing, Debezium supports the following database [sources](https://debezium.io/documentation/reference/stable/connectors/index.html): + +- MongoDB +- MySQL +- PostgreSQL +- SQL Server +- Oracle +- Db2 +- Cassandra +- Vitess (incubating) +- Spanner (incubating) +- JDBC (incubating) + +{{site.data.alerts.callout_info}} +Migrating with Debezium requires familiarity with Kafka. Refer to the [Debezium documentation](https://debezium.io/documentation/reference/stable/architecture.html) for information on how Debezium is deployed with Kafka Connect. 
+{{site.data.alerts.end}} + +## Before you begin + +Complete the following items before using Debezium: + +- Configure a secure [publicly-accessible]({% link cockroachcloud/network-authorization.md %}) CockroachDB cluster running the latest **{{ page.version.version }}** [production release](https://www.cockroachlabs.com/docs/releases/{{ page.version.version }}) with at least one [SQL user]({% link {{ page.version.version }}/security-reference/authorization.md %}#sql-users). Make a note of the credentials for the SQL user. +- Install and configure [Debezium](https://debezium.io/), [Kafka Connect](https://docs.confluent.io/platform/current/connect/index.html), and [Kafka](https://kafka.apache.org/). This documentation assumes you have already added data from your [source database](https://debezium.io/documentation/reference/stable/connectors/index.html) to a Kafka topic. + +## Migrate data to CockroachDB + +Once you have completed the [prerequisite steps](#before-you-begin), you can use Debezium to migrate data to CockroachDB. + +1. To write data from Kafka to CockroachDB, use the Confluent JDBC Sink Connector. First, use the following `Dockerfile` to create a custom image with the [JDBC driver](https://www.confluent.io/hub/confluentinc/kafka-connect-jdbc): + + {% include_cached copy-clipboard.html %} + ~~~ + FROM quay.io/debezium/connect:latest + ENV KAFKA_CONNECT_JDBC_DIR=$KAFKA_CONNECT_PLUGINS_DIR/kafka-connect-jdbc + + + ARG POSTGRES_VERSION=latest + ARG KAFKA_JDBC_VERSION=latest + + + # Deploy PostgreSQL JDBC Driver + RUN cd /kafka/libs && curl -sO https://jdbc.postgresql.org/download/postgresql-$POSTGRES_VERSION.jar + + + # Deploy Kafka Connect JDBC + RUN mkdir $KAFKA_CONNECT_JDBC_DIR && cd $KAFKA_CONNECT_JDBC_DIR &&\ + curl -sO https://packages.confluent.io/maven/io/confluent/kafka-connect-jdbc/$KAFKA_JDBC_VERSION/kafka-connect-jdbc-$KAFKA_JDBC_VERSION.jar + ~~~ + +1. 
Create the JSON configuration file that you will use to create the sink: + + {% include_cached copy-clipboard.html %} + ~~~ json + { + "name": "pg-sink", + "config": { + "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector", + "tasks.max": "10", + "topics": "{topic.example.table}", + "connection.url": "jdbc:postgresql://{host}:{port}/{database}?sslmode=require", + "connection.user": "{username}", + "connection.password": "{password}", + "insert.mode": "upsert", + "pk.mode": "record_value", + "pk.fields": "id", + "database.time_zone": "UTC", + "auto.create": true, + "auto.evolve": false, + "transforms": "unwrap", + "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState" + } + } + ~~~ + + Specify the **Connection URL** in [JDBC format]({% link {{ page.version.version }}/connect-to-the-database.md %}?filters=java&#step-5-connect-to-the-cluster). For information about where to find the CockroachDB connection parameters, see [Connect to a CockroachDB Cluster]({% link {{ page.version.version }}/connect-to-the-database.md %}). + +1. To create the sink, `POST` the JSON configuration file to the Kafka Connect `/connectors` endpoint. Refer to the [Kafka Connect API documentation](https://kafka.apache.org/documentation/#connect_rest) for more information. + +## See also + +- [Migration Overview]({% link {{ page.version.version }}/migration-overview.md %}) +- [Schema Conversion Tool](https://www.cockroachlabs.com/docs/cockroachcloud/migrations-page) +- [Change Data Capture Overview]({% link {{ page.version.version }}/change-data-capture-overview.md %}) +- [Third-Party Tools Supported by Cockroach Labs]({% link {{ page.version.version }}/third-party-database-tools.md %}) +- [Stream a Changefeed to a Confluent Cloud Kafka Cluster]({% link {{ page.version.version }}/stream-a-changefeed-to-a-confluent-cloud-kafka-cluster.md %})