Near Realtime Analytics of DynamoDB Data with Redshift Streaming Ingestion

This demo shows how to use Redshift Streaming Ingestion (Preview) to sync DynamoDB data with Redshift in near real time for ETL, analytics, and reporting, all using SQL.

Motivation

There are plenty of great tools for streaming ETL, but if you already know SQL, why complicate things? You can use the tools you are already familiar with to load data in near real time for analytics.

Architecture Diagram

img

Data Flow Diagram

img

Deployment

Requirements

  • AWS CLI
  • Node.js
  • npm
  • jq
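Before deploying, it can help to confirm that the tools listed above are available on your PATH. The following is a minimal sketch; `missing_tools` is a hypothetical helper that is not part of this repository, and the command names `aws` and `node` are the usual executables for the AWS CLI and Node.js.

```shell
#!/usr/bin/env bash
# Preflight sketch: print the name of every required tool that is not on PATH.
# missing_tools is a hypothetical helper, not part of this repository.
missing_tools() {
  local tool
  for tool in "$@"; do
    # command -v succeeds only when the tool can be resolved
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing="$(missing_tools aws node npm jq)"
if [ -n "$missing" ]; then
  # Unquoted on purpose so the names print on one line
  echo "Missing tools:" $missing
else
  echo "All required tools found."
fi
```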

Deploy Infrastructure

Install dependencies

npm install

Deploy the DynamoDB table, data generator Lambda function, Kinesis Data Stream, VPC, Redshift cluster, and Redshift IAM role

npm run deploy

Setup Redshift

Note: this step reads the outputs.json file generated by the deploy step above.

bash scripts/setup_redshift.sh
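If the setup script cannot find a value it needs, inspecting outputs.json with jq may help. The sketch below assumes the file uses a two-level layout (stack name, then output key, then value); that layout is an assumption and is not confirmed by this README. `list_outputs` is a hypothetical helper, not part of this repository.

```shell
#!/usr/bin/env bash
# Sketch: flatten a two-level outputs file into "Stack.Key=value" lines.
# ASSUMPTION: the file looks like {"StackName": {"OutputKey": "value"}}.
# list_outputs is a hypothetical helper, not part of this repository.
list_outputs() {
  jq -r '
    to_entries[]
    | .key as $stack
    | .value
    | to_entries[]
    | "\($stack).\(.key)=\(.value)"
  ' "$1"
}

# Only run when the file exists, so the snippet is safe before deployment
if [ -f outputs.json ]; then
  list_outputs outputs.json
fi
```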

Export DynamoDB Table and Initial Redshift Data Load

bash scripts/export_dynamodb_backup.sh
bash scripts/initial_load_from_export.sh -a <export_arn>
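A DynamoDB export runs asynchronously, so the initial load should start only after the export has completed. The sketch below polls the export status with the AWS CLI `describe-export` command. `wait_for_export` and the `POLL_SECONDS` variable are hypothetical names, not part of this repository, and whether the repository scripts already wait for completion is not stated here.

```shell
#!/usr/bin/env bash
# Sketch: block until a DynamoDB export reaches COMPLETED or FAILED.
# wait_for_export and POLL_SECONDS are hypothetical, not part of this repository.
wait_for_export() {
  local export_arn="$1" status
  while true; do
    # Give up if the CLI call itself fails (for example, missing credentials)
    status="$(aws dynamodb describe-export \
      --export-arn "$export_arn" \
      --query 'ExportDescription.ExportStatus' \
      --output text)" || return 1
    case "$status" in
      COMPLETED) return 0 ;;
      FAILED)    echo "Export failed: $export_arn" >&2; return 1 ;;
      *)         sleep "${POLL_SECONDS:-30}" ;;  # still IN_PROGRESS
    esac
  done
}

# Usage (replace <export_arn> with the ARN of your export):
#   wait_for_export "<export_arn>" && \
#     bash scripts/initial_load_from_export.sh -a "<export_arn>"
```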

Test Incremental Sync of New Member Records

bash scripts/test_sync_time.sh

Log in to the Redshift query editor v2 to explore

Go to https://us-east-1.console.aws.amazon.com/sqlworkbench/home?region=us-east-1#/client and log in to your AWS account.

To connect to the database, select temporary credentials and use admin as the user.

Clean up

npm run destroy