
Shannon-Data/heatwave-tpch

 
 


HeatWave TPC-H

MySQL HeatWave is a fully managed and highly scalable in-memory database service that provides a cost-efficient solution for OLTP, OLAP and Machine Learning. It is available on both Oracle Cloud Infrastructure (OCI) and Amazon Web Services (AWS).

HeatWave is tightly integrated with the MySQL database and is optimized for the underlying infrastructure. You can run analytics on your MySQL data without requiring ETL and without any change to your applications. Your applications connect to the HeatWave cluster through standard MySQL protocols. The MySQL Database is built on the MySQL Enterprise Edition Server, which allows developers to quickly create and deploy secure cloud native applications using the world's most popular open source database.

MySQL HeatWave Lakehouse extends MySQL HeatWave so that you can run analytics workloads on data stored in object storage, at a scale of hundreds of terabytes. HeatWave Lakehouse provides the same high query performance as MySQL HeatWave and lets you run queries across data from both MySQL and object storage, which makes it a good fit for data warehouse and data lake use cases.

This repository contains SQL scripts derived from TPC Benchmark™H (TPC-H). The scripts contain TPC-H schema generation statements and queries derived from the TPC-H benchmark, adapted specifically for MySQL HeatWave and MySQL HeatWave Lakehouse.

Software prerequisites:

  1. TPC-H data generation tool, to generate a TPC-H dataset for the workload size of your choice
  2. MySQL Shell, to import the generated TPC-H dataset into the MySQL Database System

Required services:

  1. Oracle Cloud Infrastructure
  2. MySQL HeatWave for OCI or MySQL HeatWave on AWS

Repository

  • TPCH - a collection of scripts for TPC-H schema and queries specific to MySQL Database System
  • HeatWave - a collection of scripts to configure HeatWave to run TPC-H queries

Getting started

To run TPC-H queries in HeatWave

  1. Generate TPC-H data using the TPC-H data generation tool
  2. Provision a MySQL Database System
  3. Run create_tables.sql to create the TPC-H schema on the MySQL Database System
  4. Import the generated TPC-H data into the MySQL Database System. See the MySQL Shell Parallel Table Import Utility documentation
  5. Add a HeatWave cluster to the MySQL Database System. See the HeatWave documentation
  6. Run secondary_load.sql to configure the HeatWave cluster and load the data into it
  7. You are now ready to run the queries derived from TPC-H
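
As an illustration of step 6, HeatWave loads a table in two statements: `ALTER TABLE ... SECONDARY_ENGINE = RAPID` marks the table for HeatWave, and `ALTER TABLE ... SECONDARY_LOAD` loads it into the cluster. The sketch below only generates those statements for the eight TPC-H tables. It does not reproduce secondary_load.sql, which may set additional options, and the schema name `tpch` is an assumption for the example.

```python
# The eight base tables defined by the TPC-H specification.
TPCH_TABLES = [
    "nation", "region", "part", "supplier",
    "partsupp", "customer", "orders", "lineitem",
]


def secondary_load_statements(schema, tables=TPCH_TABLES):
    """Return ALTER TABLE statements that mark each table for HeatWave
    (SECONDARY_ENGINE = RAPID) and then load it (SECONDARY_LOAD)."""
    statements = []
    for table in tables:
        name = f"`{schema}`.`{table}`"
        statements.append(f"ALTER TABLE {name} SECONDARY_ENGINE = RAPID;")
        statements.append(f"ALTER TABLE {name} SECONDARY_LOAD;")
    return statements


if __name__ == "__main__":
    # "tpch" is a hypothetical schema name; use the one your tables live in.
    print("\n".join(secondary_load_statements("tpch")))
```

The printed statements can be reviewed and then run from any MySQL client connected to the MySQL Database System.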

To run TPC-H queries in MySQL HeatWave Lakehouse

  1. Generate TPC-H data using the TPC-H data generation tool
  2. Store the generated data in an Object Storage bucket in OCI, in the same region where the MySQL Database System will be provisioned. Note down the namespace and bucket information.
  3. Provision a MySQL Database System. See Getting Started with MySQL Database Service
  4. Add a HeatWave cluster to the MySQL Database System. See the HeatWave documentation
  5. Run create_tables_lakehouse.sql to create the TPC-H schema for MySQL HeatWave Lakehouse on the MySQL Database System. Make sure to fill in the appropriate <region>, <namespace>, <bucket> and <name> information in the script.
  6. For larger scale TPC-H datasets, you might need to modify the table definitions in create_tables_lakehouse.sql to account for larger data values in certain columns (BIGINT instead of INTEGER).
  7. Run secondary_load_lakehouse.sql to configure the HeatWave cluster and load the data into it
  8. You are now ready to run the queries derived from TPC-H
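
Steps 5 and 6 can be sanity-checked before running the script. The sketch below is a minimal helper, not part of this repository. `fill_placeholders` substitutes the `<region>`, `<namespace>`, `<bucket>` and `<name>` markers in a script's text. `columns_needing_bigint` estimates which key columns exceed the signed 32-bit INTEGER range at a given scale factor. The per-scale-factor key ranges are assumptions taken from the TPC-H table cardinalities (order keys are sparse, so their range is about 6,000,000 per scale-factor unit); the columns that actually need changing in create_tables_lakehouse.sql may differ.

```python
INT_MAX = 2**31 - 1  # largest value of a signed 32-bit INTEGER column

# Approximate largest key value per unit of scale factor (SF), assumed from
# the TPC-H cardinalities. Order keys are sparse: 1,500,000 * SF orders are
# spread over a key range of about 6,000,000 * SF.
MAX_KEY_PER_SF = {
    "p_partkey": 200_000,
    "s_suppkey": 10_000,
    "c_custkey": 150_000,
    "o_orderkey": 6_000_000,
    "l_orderkey": 6_000_000,
}


def columns_needing_bigint(scale_factor):
    """Return key columns whose largest value would overflow INTEGER."""
    return sorted(
        column
        for column, per_sf in MAX_KEY_PER_SF.items()
        if per_sf * scale_factor > INT_MAX
    )


def fill_placeholders(script_text, values):
    """Replace each <key> marker in script_text with values[key]."""
    for key, value in values.items():
        script_text = script_text.replace(f"<{key}>", value)
    return script_text


if __name__ == "__main__":
    print(columns_needing_bigint(100))   # scale factor 100
    print(columns_needing_bigint(1000))  # scale factor 1000
    # Region and bucket values below are hypothetical examples.
    print(fill_placeholders(
        '{"region": "<region>", "bucket": "<bucket>"}',
        {"region": "us-ashburn-1", "bucket": "tpch-data"},
    ))
```

Under these assumptions the order key columns are the first to outgrow INTEGER, which happens once the scale factor exceeds roughly 357.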

TPC Benchmark™, TPC-H, QppH, QthH, and QphH are trademarks of the Transaction Processing Performance Council.

All parties are granted permission to copy and distribute to any party without fee all or part of this material provided that: 1) copying and distribution is done for the primary purpose of disseminating TPC material; 2) the TPC copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the Transaction Processing Performance Council.

Benchmark queries are derived from the TPC-H benchmark, but results are not comparable to published TPC-H benchmark results since they do not comply with the TPC-H specification.

Contributing

This project is not accepting external contributions at this time. For bugs or enhancement requests, please file a GitHub issue unless it’s security related. When filing a bug, remember that the better written the bug is, the more likely it is to be fixed. If you think you’ve found a security vulnerability, do not raise a GitHub issue; instead, follow the instructions in our security policy.

Security

Please consult the security guide for our responsible security vulnerability disclosure process.

License

Copyright (c) 2020, 2023 Oracle and/or its affiliates.

Released under the Universal Permissive License v1.0 as shown at https://oss.oracle.com/licenses/upl/.
