
## Overview

This demo showcases an AWS Kinesis -> Confluent Cloud -> cloud storage pipeline. You are free to choose the cloud storage provider that best fits your business needs. Benefits of Confluent Cloud:

- Build business applications on a full event streaming platform
- Span multiple cloud providers (AWS, GCP, Azure) and on-prem datacenters
- Use Kafka to aggregate data into a single source of truth
- Harness the power of KSQL


## End-to-end Streaming ETL

This demo showcases an entire end-to-end streaming ETL deployment that runs 100% on cloud services. It is built on the Confluent Platform, including:

- Kinesis source connector: reads from a Kinesis stream and writes the data to a Kafka topic (a config sketch follows this list)
- KSQL: a streaming SQL engine that enables real-time data processing against Kafka
- GCS or S3 sink connector: pushes data from Kafka topics to cloud storage
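`start.sh` creates the Kinesis source connector for you, so none of the following needs to be run by hand. Purely as a minimal sketch of what such a configuration might look like, assuming Confluent's Kinesis source connector, the stream `s1` and topic `locations` match the table below; the region value is a placeholder for your own AWS region:

```bash
# Sketch only: the demo generates its own connector configuration.
# Property names follow Confluent's Kinesis source connector;
# the region value is a placeholder, not taken from the demo.
cat > kinesis-source.properties <<EOF
name=demo-KinesisSource
connector.class=io.confluent.connect.kinesis.KinesisSourceConnector
tasks.max=1
kinesis.stream=s1
kinesis.region=us-west-2
kafka.topic=locations
EOF
```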
| Component | Consumes From | Produces To |
|-----------|---------------|-------------|
| Kinesis source connector | Kinesis stream `s1` | Kafka topic `locations` |
| KSQL | `locations` | KSQL streams and tables (see `ksql.commands`) |
| GCS (or S3) sink connector | KSQL tables `COUNT_PER_CITY` and `SUM_PER_CITY` | GCS (or S3) |
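
The KSQL statements the demo actually runs live in `ksql.commands`. As a flavor of what creating the `COUNT_PER_CITY` and `SUM_PER_CITY` tables might look like, here is a minimal sketch that assumes the `locations` records carry a `city` field and a numeric `amount`, and that a KSQL server is listening locally:

```bash
# Sketch only: the real statements are in ksql.commands; the record
# schema and local server URL are assumptions for illustration.
ksql http://localhost:8088 <<EOF
CREATE STREAM locations (city VARCHAR, amount BIGINT)
    WITH (KAFKA_TOPIC='locations', VALUE_FORMAT='JSON');
CREATE TABLE COUNT_PER_CITY AS
    SELECT city, COUNT(*) AS count_per_city FROM locations GROUP BY city;
CREATE TABLE SUM_PER_CITY AS
    SELECT city, SUM(amount) AS sum_per_city FROM locations GROUP BY city;
EOF
```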

## Prerequisites

### Local

As with the other demos in this repo, you can run the entire demo end-to-end with `./start.sh`, which runs against your local Confluent Platform install. This requires the following:

## Run the demo

1. Configure the cloud storage provider and other demo parameters in the `config/demo.cfg` file. In particular, be sure to set the `DESTINATION_STORAGE` parameter appropriately for Google GCS or AWS S3, and set the appropriate region (a config sketch follows these steps).

2. Run the demo:

   ```bash
   $ ./start.sh
   ```

3. View all the Kinesis, Kafka, and cloud storage data after running the demo:

   ```bash
   $ ./read-data.sh
   ```

4. Stop the demo and clean up:

   ```bash
   $ ./stop.sh
   ```
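
For orientation, the sketch below shows roughly what the relevant lines of `config/demo.cfg` could look like. Only `DESTINATION_STORAGE` is named in step 1; the region parameter names here are hypothetical, so check the shipped `config/demo.cfg` for the actual ones.

```bash
# Sketch of config/demo.cfg -- DESTINATION_STORAGE is named in step 1;
# the region parameter names below are hypothetical.
export DESTINATION_STORAGE='s3'      # or 'gcs'
export S3_REGION='us-west-2'         # hypothetical parameter name
export GCS_REGION='us-central1'      # hypothetical parameter name
```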