These pipelines demonstrate how to bulk ingest data from Oracle 19c and process Change Data Capture (CDC) into Databricks Delta Lake.
- StreamSets Data Collector 3.16.0 or higher. You can deploy Data Collector on the cloud provider of your choice, or download it for local use.
- Access to a Databricks cluster running Databricks Runtime 6.3 or higher
- Ensure the prerequisites for Databricks Delta Lake are satisfied
- Access to an Oracle 19c database
- Download the pipelines and import them into your Data Collector or Control Hub
- After importing the pipelines into your environment and before running them, update the pipeline parameters with:
  - your Oracle 19c JDBC URL
  - your Databricks cluster JDBC URL
  - staging information on the Databricks Delta Lake destination >> Staging tab
  - table/key column information on the Databricks Delta Lake destination >> Data tab
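As a rough illustration of the two JDBC URLs you will supply, the values typically follow the Oracle thin-driver and Databricks cluster JDBC formats shown below. The hostnames, port, service name, and httpPath here are placeholders (not taken from the sample pipelines), and the parameter names on the left are hypothetical; use the parameter names defined in the imported pipelines.

```text
# Illustrative values only — replace every placeholder with your own environment's details.
# Oracle 19c via the thin driver: jdbc:oracle:thin:@//<host>:<port>/<service_name>
oracle.jdbc.url=jdbc:oracle:thin:@//oracle-host.example.com:1521/ORCLPDB1

# Databricks cluster JDBC URL, copied from the cluster's JDBC/ODBC connection details
databricks.jdbc.url=jdbc:spark://dbc-example.cloud.databricks.com:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/0/0123-456789-example
```

The exact Databricks URL (including the httpPath value) should be copied from your cluster's JDBC/ODBC tab in the Databricks workspace rather than constructed by hand.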
- Start your Databricks cluster
For technical details and a deeper explanation, refer to this blog.