Redis SQL Trino

Redis SQL Trino is a SQL interface for Redis Stack, Redis Cloud, and Redis Enterprise.



Redis SQL Trino lets you easily integrate with visualization frameworks, such as Tableau and Superset, and with platforms that support JDBC-compatible databases (e.g., MuleSoft). Query support includes SELECT statements across secondary indexes on both Redis hashes and JSON, aggregations (e.g., count, min, max, avg), ordering, and more.

Trino is a distributed SQL engine designed to query large data sets across one or more heterogeneous data sources. Trino does include a connector for Redis OSS, but that connector is limited to SCAN and subsequent HGET operations, which do not scale well in high-throughput scenarios. This is where Redis SQL Trino shines: it pushes the entire query down to the data atomically, which eliminates the many network hops and follow-up operations that the SCAN-based approach requires.

Background

Redis is an in-memory data store designed to serve data with the fastest possible response times. For this reason, Redis is frequently used for caching OLTP-style application queries and as a serving layer in data pipeline architectures (e.g., lambda architectures and online feature stores). Redis Stack is an extension to Redis that, among other things, lets you index your data on secondary attributes and then efficiently query it using a custom query language.

We built the Redis SQL Trino connector so that you can query Redis using SQL. This is useful for any application compatible with JDBC. For example, Redis SQL Trino lets you query and visualize your Redis data from Tableau.

Requirements

Redis SQL Trino requires a Redis deployment that includes RediSearch. RediSearch is a Redis module that adds querying and secondary indexing to Redis.

Redis deployments that bundle RediSearch include:

  • Redis Cloud: Fully-managed, enterprise-grade Redis deployed on AWS, Azure, or GCP.

  • Redis Enterprise: Enterprise-grade Redis for on-premises and private cloud deployment.

  • Redis Stack: Redis distribution that includes RediSearch, RedisJSON, RedisGraph, RedisTimeSeries, and RedisBloom.

Quick start

To understand how Redis SQL Trino works, it’s best to try it for yourself by following the steps below.

First, clone this Git repository:

git clone https://github.com/redis-field-engineering/redis-sql-trino.git
cd redis-sql-trino

Next, use Docker Compose to launch containers for Trino and Redis Stack:

docker-compose up

This example uses a small data set describing a collection of beers. To load the data set, you’ll need to have riot-file installed locally (see the riot-file installation instructions).

Then use riot-file to import the sample data set into Redis:

riot-file -h localhost import https://storage.googleapis.com/jrx/beers.json \
  hset --keyspace beer --keys id

Each beer is represented as a Redis hash. Start the Redis CLI to examine this data. For example, here’s how you can view the "Beer Town Brown" record:

docker exec -it redis /opt/redis-stack/bin/redis-cli
127.0.0.1:6379> hgetall beer:190

Next, create an index on the beer data. While still in the Redis CLI, you can create the required index by running the following FT.CREATE command:

127.0.0.1:6379> FT.CREATE beers ON HASH PREFIX 1 beer: SCHEMA id TAG SORTABLE brewery_id TAG SORTABLE name TEXT SORTABLE abv NUMERIC SORTABLE descript TEXT style_name TAG SORTABLE cat_name TAG SORTABLE
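
Before moving on to Trino, it can help to confirm that the index actually picked up the beer hashes. The sketch below wraps the `docker exec` call from the step above in a small shell helper. `redis_cli` is a hypothetical name, not part of this repository, and the sketch assumes the container is still named `redis`. `FT.INFO` and `FT.SEARCH` are standard RediSearch commands. The `FT.SEARCH` line is only a rough native counterpart of an ABV filter, not necessarily the query the connector generates.

```shell
# Hypothetical helper: run one redis-cli command inside the "redis" container
# started by Docker Compose (container name and CLI path taken from the step above).
redis_cli() {
  docker exec redis /opt/redis-stack/bin/redis-cli "$@"
}

# Example calls (uncomment to run against the quick-start containers):
# redis_cli FT.INFO beers      # shows the index definition and document count
# redis_cli FT.SEARCH beers '@abv:[(3.2 +inf]' SORTBY abv DESC LIMIT 0 5
```

If `FT.INFO` reports zero documents, the most likely cause is that the key prefix in `FT.CREATE` does not match the keyspace used during import.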

Now that you’ve indexed the data set, you can query it using SQL statements through Trino. Start the Trino CLI:

docker exec -it trino trino --catalog redisearch --schema default

View "Beer Town Brown" using SQL:

trino:default> select * from beers where id = '190';

Show all beers with an ABV greater than 3.2%:

trino:default> select * from beers where abv > 3.2 order by abv desc;
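
The same kind of query can also be run non-interactively, which is convenient for scripting. The sketch below assumes the `trino` container from the Docker Compose step and relies on the Trino CLI's `--execute` option. `trino_sql` is a hypothetical helper name, and the aggregation query is an illustration built from the columns indexed earlier; its results have not been checked against this data set.

```shell
# Hypothetical helper: run one SQL statement through the Trino CLI inside the
# "trino" container and print the result to stdout.
trino_sql() {
  docker exec trino trino --catalog redisearch --schema default --execute "$1"
}

# Example call (uncomment to run): count beers and average ABV per style.
# trino_sql "select style_name, count(*) as beers, avg(abv) as avg_abv
#            from beers group by style_name order by avg_abv desc"
```

Dropping the `-it` flags is deliberate here: a batch query needs no interactive terminal, so the helper also works from cron jobs or CI scripts.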

Installation

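To run Redis SQL Trino in production, you’ll need a Trino installation, the Redis SQL Trino connector, and a Redis deployment that includes RediSearch, as described in the sections below.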

Trino

First, you’ll need a working Trino installation.

See the Trino installation and deployment guide for details. Trino recommends a container-based deployment using your orchestration platform of choice. If you run Kubernetes, see the Trino Helm chart.

Redis SQL Trino Connector

Next, you’ll need to install the Redis SQL Trino plugin and configure it. See our documentation for plugin installation and plugin configuration.

Redis installation

For a self-managed deployment, or for testing locally, install Redis Stack or spin up a free Redis Cloud instance. If you need a fully-managed, cloud-based deployment of Redis on AWS, GCP, or Azure, see all of the Redis Cloud offerings. For deployment in your own private cloud or data center, consider Redis Enterprise.

Documentation

Redis SQL Trino documentation is available at https://redis-field-engineering.github.io/redis-sql-trino

Usage

The example above uses the Trino CLI to access your data.

Most real-world applications will use the Trino JDBC driver to issue queries. See the Redis SQL Trino documentation for details.

Support

Redis SQL Trino is supported by Redis, Inc. on a good-faith effort basis. To report bugs, request features, or receive assistance, please file an issue.

License

Redis SQL Trino is licensed under the MIT License. Copyright © 2023 Redis, Inc.
