Debezium migration (#17945)
* placeholders

* drafting

* dustin's revision

* shell and clipboard

* review

* jdbc link
gemma-shay authored Oct 10, 2023
1 parent efa06de commit 8fc4b42
Showing 6 changed files with 202 additions and 0 deletions.
1 change: 1 addition & 0 deletions src/current/_includes/v23.1/misc/tooling.md
@@ -93,6 +93,7 @@ For a list of tools supported by the CockroachDB community, see [Third-Party Too
| [Qlik Replicate](https://www.qlik.com/us/products/qlik-replicate) | November 2022 | Full | [Migrate and Replicate Data with Qlik Replicate]({% link {{ page.version.version }}/qlik.md %})
| [Striim](https://www.striim.com) | 4.1.2 | Full | [Migrate and Replicate Data with Striim]({% link {{ page.version.version }}/striim.md %})
| [Oracle GoldenGate](https://www.oracle.com/integration/goldengate/) | 21.3 | Partial | [Migrate and Replicate Data with Oracle GoldenGate]({% link {{ page.version.version }}/goldengate.md %})
| [Debezium](https://debezium.io/) | 2.4 | Full | [Migrate Data with Debezium]({% link {{ page.version.version }}/debezium.md %})

## Provisioning tools
| Tool | Latest tested version | Support level | Documentation |
6 changes: 6 additions & 0 deletions src/current/_includes/v23.1/sidebar-data/migrate.json
@@ -52,6 +52,12 @@
"urls": [
"/${VERSION}/goldengate.html"
]
},
{
"title": "Debezium",
"urls": [
"/${VERSION}/debezium.html"
]
}
]
},
1 change: 1 addition & 0 deletions src/current/_includes/v23.2/misc/tooling.md
@@ -93,6 +93,7 @@ For a list of tools supported by the CockroachDB community, see [Third-Party Too
| [Qlik Replicate](https://www.qlik.com/us/products/qlik-replicate) | November 2022 | Full | [Migrate and Replicate Data with Qlik Replicate]({% link {{ page.version.version }}/qlik.md %})
| [Striim](https://www.striim.com) | 4.1.2 | Full | [Migrate and Replicate Data with Striim]({% link {{ page.version.version }}/striim.md %})
| [Oracle GoldenGate](https://www.oracle.com/integration/goldengate/) | 21.3 | Partial | [Migrate and Replicate Data with Oracle GoldenGate]({% link {{ page.version.version }}/goldengate.md %})
| [Debezium](https://debezium.io/) | 2.4 | Full | [Migrate Data with Debezium]({% link {{ page.version.version }}/debezium.md %})

## Provisioning tools
| Tool | Latest tested version | Support level | Documentation |
6 changes: 6 additions & 0 deletions src/current/_includes/v23.2/sidebar-data/migrate.json
@@ -52,6 +52,12 @@
"urls": [
"/${VERSION}/goldengate.html"
]
},
{
"title": "Debezium",
"urls": [
"/${VERSION}/debezium.html"
]
}
]
},
94 changes: 94 additions & 0 deletions src/current/v23.1/debezium.md
@@ -0,0 +1,94 @@
---
title: Migrate Data with Debezium
summary: Use Debezium to migrate data to a CockroachDB cluster.
toc: true
docs_area: migrate
---

[Debezium](https://debezium.io/) is a self-hosted distributed platform that can read data from a variety of sources and import it into Kafka. You can use Debezium to [migrate data to CockroachDB](#migrate-data-to-cockroachdb) from another database that is accessible over the public internet.

As of this writing, Debezium supports the following database [sources](https://debezium.io/documentation/reference/stable/connectors/index.html):

- MongoDB
- MySQL
- PostgreSQL
- SQL Server
- Oracle
- Db2
- Cassandra
- Vitess (incubating)
- Spanner (incubating)
- JDBC (incubating)

{{site.data.alerts.callout_info}}
Migrating with Debezium requires familiarity with Kafka. Refer to the [Debezium documentation](https://debezium.io/documentation/reference/stable/architecture.html) for information on how Debezium is deployed with Kafka Connect.
{{site.data.alerts.end}}

## Before you begin

Complete the following items before using Debezium:

- Configure a secure [publicly-accessible]({% link cockroachcloud/network-authorization.md %}) CockroachDB cluster running the latest **{{ page.version.version }}** [production release](https://www.cockroachlabs.com/docs/releases/{{ page.version.version }}) with at least one [SQL user]({% link {{ page.version.version }}/security-reference/authorization.md %}#sql-users). Make a note of the credentials for the SQL user.
- Install and configure [Debezium](https://debezium.io/), [Kafka Connect](https://docs.confluent.io/platform/current/connect/index.html), and [Kafka](https://kafka.apache.org/). This documentation assumes you have already added data from your [source database](https://debezium.io/documentation/reference/stable/connectors/index.html) to a Kafka topic.

## Migrate data to CockroachDB

Once all of the [prerequisite steps](#before-you-begin) are completed, you can use Debezium to migrate data to CockroachDB.

1. To write data from Kafka to CockroachDB, use the Confluent JDBC Sink Connector. First, use the following Dockerfile to create a custom image that includes the [JDBC driver](https://www.confluent.io/hub/confluentinc/kafka-connect-jdbc):

{% include_cached copy-clipboard.html %}
~~~ dockerfile
FROM quay.io/debezium/connect:latest
ENV KAFKA_CONNECT_JDBC_DIR=$KAFKA_CONNECT_PLUGINS_DIR/kafka-connect-jdbc
ARG POSTGRES_VERSION=latest
ARG KAFKA_JDBC_VERSION=latest
# Deploy the PostgreSQL JDBC driver
RUN cd /kafka/libs && curl -sO https://jdbc.postgresql.org/download/postgresql-$POSTGRES_VERSION.jar
# Deploy the Kafka Connect JDBC connector
RUN mkdir $KAFKA_CONNECT_JDBC_DIR && cd $KAFKA_CONNECT_JDBC_DIR && \
    curl -sO https://packages.confluent.io/maven/io/confluent/kafka-connect-jdbc/$KAFKA_JDBC_VERSION/kafka-connect-jdbc-$KAFKA_JDBC_VERSION.jar
~~~
1. Create the JSON configuration file that you will use to create the sink:
{% include_cached copy-clipboard.html %}
~~~ json
{
  "name": "pg-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "tasks.max": "10",
    "topics": "{topic.example.table}",
    "connection.url": "jdbc:postgresql://{host}:{port}/{database}?sslmode=require",
    "connection.user": "{username}",
    "connection.password": "{password}",
    "insert.mode": "upsert",
    "pk.mode": "record_value",
    "pk.fields": "id",
    "database.time_zone": "UTC",
    "auto.create": true,
    "auto.evolve": false,
    "transforms": "unwrap",
    "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState"
  }
}
~~~
Specify the **Connection URL** in [JDBC format]({% link {{ page.version.version }}/connect-to-the-database.md %}?filters=java&#step-5-connect-to-the-cluster). For information about where to find the CockroachDB connection parameters, see [Connect to a CockroachDB Cluster]({% link {{ page.version.version }}/connect-to-the-database.md %}).
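    As an illustrative sketch, the connection URL can be assembled from the individual CockroachDB connection parameters. The host, port, and database values below are placeholder assumptions; substitute your own:

    ```shell
    # Assemble the JDBC connection URL from the CockroachDB connection
    # parameters. All values below are placeholder assumptions.
    HOST="my-cluster.aws-us-east-1.cockroachlabs.cloud"
    PORT=26257
    DATABASE="defaultdb"

    CONNECTION_URL="jdbc:postgresql://${HOST}:${PORT}/${DATABASE}?sslmode=require"
    echo "${CONNECTION_URL}"
    ```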
1. To create the sink, `POST` the JSON configuration file to the Kafka Connect `/connectors` endpoint. Refer to the [Kafka Connect API documentation](https://kafka.apache.org/documentation/#connect_rest) for more information.
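    This step can be sketched as follows, assuming the configuration was saved as `pg-sink.json` and Kafka Connect is listening on `localhost:8083` (both the file name and the endpoint are assumptions; adjust for your deployment):

    ```shell
    # Write an abbreviated sink configuration to pg-sink.json for illustration.
    # In practice, use the full configuration from the previous step.
    cat > pg-sink.json <<'EOF'
    {
      "name": "pg-sink",
      "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector"
      }
    }
    EOF

    # Sanity-check that the file is valid JSON before sending it.
    python3 -m json.tool pg-sink.json > /dev/null && echo "pg-sink.json: valid JSON"

    # Register the connector (run once Kafka Connect is reachable):
    # curl -X POST -H "Content-Type: application/json" \
    #      --data @pg-sink.json http://localhost:8083/connectors
    ```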

## See also
- [Migration Overview]({% link {{ page.version.version }}/migration-overview.md %})
- [Schema Conversion Tool](https://www.cockroachlabs.com/docs/cockroachcloud/migrations-page)
- [Change Data Capture Overview]({% link {{ page.version.version }}/change-data-capture-overview.md %})
- [Third-Party Tools Supported by Cockroach Labs]({% link {{ page.version.version }}/third-party-database-tools.md %})
- [Stream a Changefeed to a Confluent Cloud Kafka Cluster]({% link {{ page.version.version }}/stream-a-changefeed-to-a-confluent-cloud-kafka-cluster.md %})
94 changes: 94 additions & 0 deletions src/current/v23.2/debezium.md
@@ -0,0 +1,94 @@
---
title: Migrate Data with Debezium
summary: Use Debezium to migrate data to a CockroachDB cluster.
toc: true
docs_area: migrate
---

[Debezium](https://debezium.io/) is a self-hosted distributed platform that can read data from a variety of sources and import it into Kafka. You can use Debezium to [migrate data to CockroachDB](#migrate-data-to-cockroachdb) from another database that is accessible over the public internet.

As of this writing, Debezium supports the following database [sources](https://debezium.io/documentation/reference/stable/connectors/index.html):

- MongoDB
- MySQL
- PostgreSQL
- SQL Server
- Oracle
- Db2
- Cassandra
- Vitess (incubating)
- Spanner (incubating)
- JDBC (incubating)

{{site.data.alerts.callout_info}}
Migrating with Debezium requires familiarity with Kafka. Refer to the [Debezium documentation](https://debezium.io/documentation/reference/stable/architecture.html) for information on how Debezium is deployed with Kafka Connect.
{{site.data.alerts.end}}

## Before you begin

Complete the following items before using Debezium:

- Configure a secure [publicly-accessible]({% link cockroachcloud/network-authorization.md %}) CockroachDB cluster running the latest **{{ page.version.version }}** [production release](https://www.cockroachlabs.com/docs/releases/{{ page.version.version }}) with at least one [SQL user]({% link {{ page.version.version }}/security-reference/authorization.md %}#sql-users). Make a note of the credentials for the SQL user.
- Install and configure [Debezium](https://debezium.io/), [Kafka Connect](https://docs.confluent.io/platform/current/connect/index.html), and [Kafka](https://kafka.apache.org/). This documentation assumes you have already added data from your [source database](https://debezium.io/documentation/reference/stable/connectors/index.html) to a Kafka topic.

## Migrate data to CockroachDB

Once all of the [prerequisite steps](#before-you-begin) are completed, you can use Debezium to migrate data to CockroachDB.

1. To write data from Kafka to CockroachDB, use the Confluent JDBC Sink Connector. First, use the following Dockerfile to create a custom image that includes the [JDBC driver](https://www.confluent.io/hub/confluentinc/kafka-connect-jdbc):

{% include_cached copy-clipboard.html %}
~~~ dockerfile
FROM quay.io/debezium/connect:latest
ENV KAFKA_CONNECT_JDBC_DIR=$KAFKA_CONNECT_PLUGINS_DIR/kafka-connect-jdbc
ARG POSTGRES_VERSION=latest
ARG KAFKA_JDBC_VERSION=latest
# Deploy the PostgreSQL JDBC driver
RUN cd /kafka/libs && curl -sO https://jdbc.postgresql.org/download/postgresql-$POSTGRES_VERSION.jar
# Deploy the Kafka Connect JDBC connector
RUN mkdir $KAFKA_CONNECT_JDBC_DIR && cd $KAFKA_CONNECT_JDBC_DIR && \
    curl -sO https://packages.confluent.io/maven/io/confluent/kafka-connect-jdbc/$KAFKA_JDBC_VERSION/kafka-connect-jdbc-$KAFKA_JDBC_VERSION.jar
~~~
1. Create the JSON configuration file that you will use to create the sink:
{% include_cached copy-clipboard.html %}
~~~ json
{
  "name": "pg-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "tasks.max": "10",
    "topics": "{topic.example.table}",
    "connection.url": "jdbc:postgresql://{host}:{port}/{database}?sslmode=require",
    "connection.user": "{username}",
    "connection.password": "{password}",
    "insert.mode": "upsert",
    "pk.mode": "record_value",
    "pk.fields": "id",
    "database.time_zone": "UTC",
    "auto.create": true,
    "auto.evolve": false,
    "transforms": "unwrap",
    "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState"
  }
}
~~~
Specify the **Connection URL** in [JDBC format]({% link {{ page.version.version }}/connect-to-the-database.md %}?filters=java&#step-5-connect-to-the-cluster). For information about where to find the CockroachDB connection parameters, see [Connect to a CockroachDB Cluster]({% link {{ page.version.version }}/connect-to-the-database.md %}).
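    As an illustrative sketch, the connection URL can be assembled from the individual CockroachDB connection parameters. The host, port, and database values below are placeholder assumptions; substitute your own:

    ```shell
    # Assemble the JDBC connection URL from the CockroachDB connection
    # parameters. All values below are placeholder assumptions.
    HOST="my-cluster.aws-us-east-1.cockroachlabs.cloud"
    PORT=26257
    DATABASE="defaultdb"

    CONNECTION_URL="jdbc:postgresql://${HOST}:${PORT}/${DATABASE}?sslmode=require"
    echo "${CONNECTION_URL}"
    ```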
1. To create the sink, `POST` the JSON configuration file to the Kafka Connect `/connectors` endpoint. Refer to the [Kafka Connect API documentation](https://kafka.apache.org/documentation/#connect_rest) for more information.
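    This step can be sketched as follows, assuming the configuration was saved as `pg-sink.json` and Kafka Connect is listening on `localhost:8083` (both the file name and the endpoint are assumptions; adjust for your deployment):

    ```shell
    # Write an abbreviated sink configuration to pg-sink.json for illustration.
    # In practice, use the full configuration from the previous step.
    cat > pg-sink.json <<'EOF'
    {
      "name": "pg-sink",
      "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector"
      }
    }
    EOF

    # Sanity-check that the file is valid JSON before sending it.
    python3 -m json.tool pg-sink.json > /dev/null && echo "pg-sink.json: valid JSON"

    # Register the connector (run once Kafka Connect is reachable):
    # curl -X POST -H "Content-Type: application/json" \
    #      --data @pg-sink.json http://localhost:8083/connectors
    ```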

## See also
- [Migration Overview]({% link {{ page.version.version }}/migration-overview.md %})
- [Schema Conversion Tool](https://www.cockroachlabs.com/docs/cockroachcloud/migrations-page)
- [Change Data Capture Overview]({% link {{ page.version.version }}/change-data-capture-overview.md %})
- [Third-Party Tools Supported by Cockroach Labs]({% link {{ page.version.version }}/third-party-database-tools.md %})
- [Stream a Changefeed to a Confluent Cloud Kafka Cluster]({% link {{ page.version.version }}/stream-a-changefeed-to-a-confluent-cloud-kafka-cluster.md %})
