
Docs: Add clickhouse dekaf docs #1775

Merged: 1 commit, Nov 14, 2024
10 changes: 5 additions & 5 deletions site/docs/guides/dekaf_reading_collections_from_kafka.md
@@ -30,8 +30,8 @@ walk you through the steps to connect to Estuary Flow using Dekaf and its schema

To connect to Estuary Flow via Dekaf, you need the following connection details:

- **Broker Address**: `dekaf.estuary.dev`
- **Schema Registry Address**: `https://dekaf.estuary.dev`
- **Broker Address**: `dekaf.estuary-data.com`
- **Schema Registry Address**: `https://dekaf.estuary-data.com`
- **Security Protocol**: `SASL_SSL`
- **SASL Mechanism**: `PLAIN`
- **SASL Username**: `{}`
@@ -57,7 +57,7 @@ from kafka import KafkaConsumer

# Configuration details
conf = {
'bootstrap_servers': 'dekaf.estuary.dev:9092',
'bootstrap_servers': 'dekaf.estuary-data.com:9092',
'security_protocol': 'SASL_SSL',
'sasl_mechanism': 'PLAIN',
'sasl_plain_username': '{}',
@@ -100,10 +100,10 @@ kcat -C \
-X sasl.mechanism=PLAIN \
-X sasl.username="{}" \
-X sasl.password="Your_Estuary_Refresh_Token" \
-b dekaf.estuary.dev:9092 \
-b dekaf.estuary-data.com:9092 \
-t "full/name/of/estuary-collection" \
-p 0 \
-o beginning \
-s avro \
-r https://{}:{Your_Estuary_Refresh_Token}@dekaf.estuary.dev
-r https://{}:{Your_Estuary_Refresh_Token}@dekaf.estuary-data.com
```
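The kcat flags above map one-to-one onto Kafka client settings. As a hedged illustration, a minimal Python sketch assembling the same configuration for the kafka-python client used earlier in this guide (the token and topic are placeholders, not real credentials):

```python
# Hypothetical helper: builds the kafka-python consumer settings shown above.
# "{}" is the literal SASL username Dekaf expects; the refresh token is the password.
def dekaf_consumer_config(refresh_token: str) -> dict:
    return {
        "bootstrap_servers": "dekaf.estuary-data.com:9092",
        "security_protocol": "SASL_SSL",
        "sasl_mechanism": "PLAIN",
        "sasl_plain_username": "{}",
        "sasl_plain_password": refresh_token,
    }

config = dekaf_consumer_config("Your_Estuary_Refresh_Token")
# KafkaConsumer("full/name/of/your/collection", **config) would then consume the topic.
```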
3 changes: 2 additions & 1 deletion site/docs/reference/Connectors/dekaf/README.md
@@ -11,4 +11,5 @@ functionality enables integrations with the Kafka ecosystem.
- [StarTree](/reference/Connectors/dekaf/dekaf-startree)
- [SingleStore](/reference/Connectors/dekaf/dekaf-singlestore)
- [Imply](/reference/Connectors/dekaf/dekaf-imply)
- [Bytewax](/reference/Connectors/dekaf/dekaf-bytewax)
- [Bytewax](/reference/Connectors/dekaf/dekaf-bytewax)
- [ClickHouse](/reference/Connectors/dekaf/dekaf-clickhouse)
2 changes: 1 addition & 1 deletion site/docs/reference/Connectors/dekaf/dekaf-bytewax.md
@@ -27,7 +27,7 @@ high-throughput, low-latency data processing tasks.
from bytewax.window import TumblingWindowConfig, SystemClockConfig

# Estuary Flow Dekaf configuration
KAFKA_BOOTSTRAP_SERVERS = "dekaf.estuary.dev:9092"
KAFKA_BOOTSTRAP_SERVERS = "dekaf.estuary-data.com:9092"
KAFKA_TOPIC = "/full/nameof/your/collection"

# Parse incoming messages
55 changes: 55 additions & 0 deletions site/docs/reference/Connectors/dekaf/dekaf-clickhouse.md
@@ -0,0 +1,55 @@
# Integrating ClickHouse Cloud with Estuary Flow via Dekaf

## Overview

This guide covers how to integrate ClickHouse Cloud with Estuary Flow using Dekaf, Estuary’s Kafka API compatibility
layer, and ClickPipes for real-time analytics. This integration allows ClickHouse Cloud users to stream data from a vast
array of sources supported by Estuary Flow directly into ClickHouse, using Dekaf for Kafka compatibility.

## Prerequisites

- **[ClickHouse Cloud](https://clickhouse.com/) account** with permissions to configure ClickPipes for data ingestion.
- **[Estuary Flow account](https://dashboard.estuary.dev/register)** with access to Dekaf and the necessary connectors (e.g., Salesforce, databases).
- **Estuary Flow Refresh Token** to authenticate with Dekaf.

---

## Step 1: Configure Data Source in Estuary Flow

1. **Generate a Refresh Token** ([how to generate a refresh token](/guides/how_to_generate_refresh_token)):
- To access the Kafka-compatible topics, create a refresh token in the Estuary Flow dashboard. This token will act
as the password for both the broker and schema registry.

2. **Connect to Dekaf**:
- Estuary Flow will automatically expose your collections as Kafka-compatible topics through Dekaf. No additional
configuration is required.
- Dekaf provides the following connection details:

```
Broker Address: dekaf.estuary-data.com:9092
Schema Registry Address: https://dekaf.estuary-data.com
Security Protocol: SASL_SSL
SASL Mechanism: PLAIN
SASL Username: {}
SASL Password: <Estuary Refresh Token>
Schema Registry Username: {}
Schema Registry Password: <Estuary Refresh Token>
```
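Some Kafka clients expect the schema registry credentials embedded directly in the URL, as kcat's `-r` flag does elsewhere in these docs. A small Python sketch of that construction — the token value is a placeholder, and the `{}` username is URL-encoded because it contains reserved characters:

```python
from urllib.parse import quote

# Hypothetical sketch: build a schema registry URL with inline basic-auth
# credentials. The username is literally "{}"; the refresh token is the password.
def schema_registry_url(refresh_token: str) -> str:
    user = quote("{}", safe="")          # "{}" -> "%7B%7D"
    token = quote(refresh_token, safe="")
    return f"https://{user}:{token}@dekaf.estuary-data.com"

url = schema_registry_url("MY_TOKEN")
```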

---

## Step 2: Configure ClickPipes in ClickHouse Cloud

1. **Set Up ClickPipes**:
- In ClickHouse Cloud, go to **Integrations** and select **Apache Kafka** as the data source.

2. **Enter Connection Details**:
- Use the connection parameters from the previous step to configure access to Estuary Flow.

3. **Map Data Fields**:
- Ensure that ClickHouse can parse the incoming data properly. Use ClickHouse’s mapping interface to align fields
between Estuary Flow collections and ClickHouse tables.

4. **Provision the ClickPipe**:
- Kick off the integration and allow ClickPipes to set up the pipeline (should complete within a few seconds).
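The field-mapping step (step 3) amounts to aligning a decoded record's fields with a table's column order. A purely illustrative Python sketch — the record shape and column names below are hypothetical, not taken from any real collection:

```python
# Hypothetical sketch of field mapping: reorder a decoded Estuary record
# (a dict) into a ClickHouse table's column order. Missing fields become None.
def to_clickhouse_row(record: dict, columns: list) -> tuple:
    return tuple(record.get(col) for col in columns)

row = to_clickhouse_row(
    {"id": 1, "title": "hello", "server_name": "en.wikipedia.org"},
    ["id", "server_name", "title"],
)
```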
4 changes: 2 additions & 2 deletions site/docs/reference/Connectors/dekaf/dekaf-imply.md
@@ -18,7 +18,7 @@ Druid, designed for real-time analytics on streaming and batch data.

5. In the Kafka configuration section, enter the following details:

- **Bootstrap Servers**: `dekaf.estuary.dev:9092`
- **Bootstrap Servers**: `dekaf.estuary-data.com:9092`
- **Topic**: Your Estuary Flow collection name (e.g., `/my-organization/my-collection`)
- **Security Protocol**: `SASL_SSL`
- **SASL Mechanism**: `PLAIN`
@@ -28,7 +28,7 @@ Druid, designed for real-time analytics on streaming and batch data.
6. For the "Input Format", select "avro".

7. Configure the Schema Registry settings:
- **Schema Registry URL**: `https://dekaf.estuary.dev`
- **Schema Registry URL**: `https://dekaf.estuary-data.com`
- **Schema Registry Username**: `{}` (same as SASL Username)
- **Schema Registry Password**: `The same Estuary Refresh Token as above`

4 changes: 2 additions & 2 deletions site/docs/reference/Connectors/dekaf/dekaf-materialize.md
@@ -20,7 +20,7 @@ for defining transformations and queries.

CREATE
CONNECTION estuary_connection TO KAFKA (
BROKER 'dekaf.estuary.dev',
BROKER 'dekaf.estuary-data.com',
SECURITY PROTOCOL = 'SASL_SSL',
SASL MECHANISMS = 'PLAIN',
SASL USERNAME = '{}',
@@ -29,7 +29,7 @@ for defining transformations and queries.

CREATE
CONNECTION csr_estuary_connection TO CONFLUENT SCHEMA REGISTRY (
URL 'https://dekaf.estuary.dev',
URL 'https://dekaf.estuary-data.com',
USERNAME = '{}',
PASSWORD = SECRET estuary_refresh_token
);
4 changes: 2 additions & 2 deletions site/docs/reference/Connectors/dekaf/dekaf-singlestore.md
@@ -20,7 +20,7 @@ offering high performance for both transactional and analytical workloads.
CREATE TABLE test_table (id NUMERIC, server_name VARCHAR(255), title VARCHAR(255));

CREATE PIPELINE test AS
LOAD DATA KAFKA "dekaf.estuary.dev:9092/demo/wikipedia/recentchange-sampled"
LOAD DATA KAFKA "dekaf.estuary-data.com:9092/demo/wikipedia/recentchange-sampled"
CONFIG '{
"security.protocol":"SASL_SSL",
"sasl.mechanism":"PLAIN",
@@ -34,7 +34,7 @@ offering high performance for both transactional and analytical workloads.
"schema.registry.password": "ESTUARY_ACCESS_TOKEN"
}'
INTO table test_table
FORMAT AVRO SCHEMA REGISTRY 'https://dekaf.estuary.dev'
FORMAT AVRO SCHEMA REGISTRY 'https://dekaf.estuary-data.com'
( id <- id, server_name <- server_name, title <- title );
```
4. Your pipeline should now start ingesting data from Estuary Flow into SingleStore.
4 changes: 2 additions & 2 deletions site/docs/reference/Connectors/dekaf/dekaf-startree.md
@@ -18,15 +18,15 @@ low-latency analytics on large-scale data.

![Create StarTree Connection](https://storage.googleapis.com/estuary-marketing-strapi-uploads/uploads//startree_create_connection_548379d134/startree_create_connection_548379d134.png)

- **Bootstrap Servers**: `dekaf.estuary.dev`
- **Bootstrap Servers**: `dekaf.estuary-data.com`
- **Security Protocol**: `SASL_SSL`
- **SASL Mechanism**: `PLAIN`
- **SASL Username**: `{}`
- **SASL Password**: `Your generated Estuary Refresh Token`

5. **Configure Schema Registry**: To decode Avro messages, enable schema registry settings:

- **Schema Registry URL**: `https://dekaf.estuary.dev`
- **Schema Registry URL**: `https://dekaf.estuary-data.com`
- **Schema Registry Username**: `{}` (same as SASL Username)
- **Schema Registry Password**: `The same Estuary Refresh Token as above`

4 changes: 2 additions & 2 deletions site/docs/reference/Connectors/dekaf/dekaf-tinybird.md
@@ -14,14 +14,14 @@ In this guide, you'll learn how to use Estuary Flow to push data streams to Tiny

To configure the connection details, use the following settings.

Bootstrap servers: `dekaf.estuary.dev`
Bootstrap servers: `dekaf.estuary-data.com`
SASL Mechanism: `PLAIN`
SASL Username: `{}`
SASL Password: `Estuary Refresh Token` (Generate your token in the Estuary Admin Dashboard)

Tick the "Decode Avro messages with Schema Registry" box, and use the following settings:

- URL: `https://dekaf.estuary.dev`
- URL: `https://dekaf.estuary-data.com`
- Username: `{}`
- Password: `The same Estuary Refresh Token as above`
