This project is an overview of a Sales Data Projection data pipeline that performs near-real-time data ingestion and transformation with Change Data Capture (CDC) functionality. We will design a system using AWS services such as S3, Lambda, Glue, DynamoDB, Kinesis Stream, Kinesis Firehose, and EventBridge to ingest and transform the data, capture changes, load the results into S3, and query them with Athena for analytical purposes.
- Note: Grant the attached IAM role permission to access DynamoDB and Kinesis:
- AmazonDynamoDBFullAccess
- AmazonKinesisFullAccess
- Create a Kinesis Firehose that reads the data from the Kinesis Stream, transforms it with a Lambda function, and loads it into S3 in batches
- Kinesis Firehose
- Lambda Function for Transformation
- S3 Bucket "kinesis-firehose-destination-yb" for data load destination
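
As an illustration of the transformation step, here is a minimal sketch of a Firehose transformation Lambda. It assumes each incoming record carries a DynamoDB change event in the format DynamoDB uses when streaming to Kinesis (`eventName`, plus `dynamodb.NewImage` and `dynamodb.Keys` with typed attributes). The output column names `cdc_operation` and `cdc_timestamp` and the helper `flatten_image` are hypothetical, not taken from the project. Only the `recordId` / `result` / `data` response shape is the contract Firehose expects.

```python
import base64
import json


def flatten_image(image: dict) -> dict:
    """Turn DynamoDB-typed attributes ({"S": "x"}, {"N": "1"}) into plain values."""
    flat = {}
    for name, typed in image.items():
        (type_code, value), = typed.items()  # each attribute has exactly one type key
        if type_code == "N":
            # DynamoDB sends numbers as strings; parse them back.
            flat[name] = float(value) if any(c in value for c in ".eE") else int(value)
        else:
            # Strings are kept; other types are passed through unchanged in this sketch.
            flat[name] = value
    return flat


def lambda_handler(event, context):
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            change = payload.get("dynamodb", {})
            # REMOVE events have no NewImage, so fall back to the key attributes.
            row = flatten_image(change.get("NewImage") or change.get("Keys", {}))
            row["cdc_operation"] = payload.get("eventName")  # INSERT / MODIFY / REMOVE
            row["cdc_timestamp"] = change.get("ApproximateCreationDateTime")
            # One JSON object per line keeps the S3 files easy to crawl and query.
            data = (json.dumps(row) + "\n").encode("utf-8")
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(data).decode("utf-8"),
            })
        except (ValueError, KeyError, AttributeError):
            # Return the original payload so Firehose can route it to its error output.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Keeping the CDC operation as a column, rather than applying deletes in the Lambda, leaves the S3 data append-only; the latest state per key can then be derived at query time.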
- The data is generated by a mock data generator script and flows from DynamoDB to the Kinesis Stream. From the Kinesis Stream, the data flows into Kinesis Firehose, is transformed by the Lambda function, and is stored in the destination S3 bucket
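
The first stage of this flow, the mock data generator, could look roughly like the sketch below. The project's actual script is not shown here, so the field names (`orderid`, `product_name`, `quantity`, `price`), the product list, and the `table_name` argument are all assumptions for illustration.

```python
import random
import time
from decimal import Decimal

PRODUCTS = ["Laptop", "Phone", "Tablet", "Headphones"]  # hypothetical catalog


def generate_order(order_id: int, rng: random.Random) -> dict:
    """Build one mock sales order; every field name here is an assumption."""
    return {
        "orderid": str(order_id),
        "product_name": rng.choice(PRODUCTS),
        "quantity": rng.randint(1, 5),
        # boto3 rejects Python floats for DynamoDB numbers, so use Decimal.
        "price": Decimal(rng.randint(1000, 50000)) / Decimal(100),
    }


def run(table_name: str, count: int, delay_seconds: float = 1.0) -> None:
    """Write `count` mock orders to the given DynamoDB table, one per interval."""
    import boto3  # imported lazily so generate_order works without AWS access

    table = boto3.resource("dynamodb").Table(table_name)
    rng = random.Random()
    for order_id in range(1, count + 1):
        # Each put_item produces a change event that DynamoDB forwards to Kinesis.
        table.put_item(Item=generate_order(order_id, rng))
        time.sleep(delay_seconds)
```

Writing the same `orderid` again with different values would produce a MODIFY event rather than an INSERT, which is a simple way to exercise the change data capture path.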
- Schema fetched by Crawler