Apache Superset: Verify connectivity to CrateDB
Add a few basic integration tests having conversations with the Apache
Superset UI and HTTP API.
amotl committed Jan 7, 2024
1 parent f3016da commit 8aa6650
Showing 11 changed files with 455 additions and 1 deletion.
75 changes: 75 additions & 0 deletions .github/workflows/apache-superset.yml
@@ -0,0 +1,75 @@
name: Apache Superset

on:
  pull_request:
    branches: ~
    paths:
    - '.github/workflows/apache-superset.yml'
    - 'framework/apache-superset/**'
    - 'requirements.txt'
  push:
    branches: [ main ]
    paths:
    - '.github/workflows/apache-superset.yml'
    - 'framework/apache-superset/**'
    - 'requirements.txt'

  # Allow job to be triggered manually.
  workflow_dispatch:

  # Run job each night after CrateDB nightly has been published.
  schedule:
    - cron: '0 3 * * *'

# Cancel in-progress jobs when pushing to the same branch.
concurrency:
  cancel-in-progress: true
  group: ${{ github.workflow }}-${{ github.ref }}

jobs:

  tests:
    runs-on: ${{ matrix.os }}

    strategy:
      fail-fast: false
      matrix:
        os: [ ubuntu-22.04 ]
        superset-spec: [ "<3" ]
        python-version: [ "3.11" ]

    services:
      cratedb:
        image: crate/crate:nightly
        ports:
          - 4200:4200
          - 5432:5432

    name: Python ${{ matrix.python-version }} on OS ${{ matrix.os }}
    steps:

      - name: Acquire sources
        uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
          architecture: x64
          cache: "pip"
          cache-dependency-path: |
            pyproject.toml
            requirements.txt
            requirements-test.txt

      - name: Install utilities
        run: |
          pip install -r requirements.txt

      - name: Install Apache Superset ${{ matrix.superset-spec }}
        run: |
          pip install 'apache-superset${{ matrix.superset-spec }}'

      - name: Validate framework/apache-superset
        run: |
          ngr test --accept-no-venv framework/apache-superset
3 changes: 2 additions & 1 deletion .gitignore
@@ -2,7 +2,8 @@
 .venv*
 __pycache__
 .coverage
+.DS_Store
 coverage.xml
 mlruns/
 archive/
-logs.log
+logs.log
58 changes: 58 additions & 0 deletions framework/apache-superset/README.md
@@ -0,0 +1,58 @@
# Verify Apache Superset with CrateDB

## About

This folder includes software integration tests for verifying
that Apache Superset works well together with CrateDB.

## Setup

You can also exercise the configuration and setup steps manually.

Start CrateDB.
```bash
docker run --rm -it --name=cratedb \
--publish=4200:4200 --publish=5432:5432 \
--env=CRATE_HEAP_SIZE=4g crate:latest -Cdiscovery.type=single-node
```

Set up a sandbox and install packages.
```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

Configure and initialize Apache Superset.
```bash
export FLASK_APP=superset
export SUPERSET_CONFIG_PATH=superset_config.py
superset db upgrade
superset fab create-admin --username=admin --password=admin --firstname=admin --lastname=admin [email protected]
superset init
```

Run Superset server.
```bash
superset run -p 8088 --with-threads
open http://127.0.0.1:8088/
```
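
The server needs a few seconds before it accepts requests. The following is a minimal sketch, assuming Superset's `/health` endpoint is reachable on port 8088, that polls until the server answers instead of waiting for a fixed time. The helper names `wait_until` and `superset_is_up` are hypothetical and not part of this folder.

```python
import time
import urllib.error
import urllib.request


def wait_until(probe, timeout=30.0, interval=0.5):
    """Call `probe()` repeatedly until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def superset_is_up(url="http://127.0.0.1:8088/health"):
    """Return True if the (assumed) health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error: not ready yet.
        return False
```

Calling `wait_until(superset_is_up)` returns `True` as soon as the server responds, or `False` once the timeout expires.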

## API Usage

```bash
# Authenticate and acquire a JWT token.
AUTH_TOKEN=$(http http://localhost:8088/api/v1/security/login username=admin password=admin provider=db | jq -r .access_token)

# Create a data source item / database connection.
http http://localhost:8088/api/v1/database/ database_name="CrateDB Testdrive" engine=crate sqlalchemy_uri=crate://crate@localhost:4200 Authorization:"Bearer ${AUTH_TOKEN}"
```
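
The same two calls can be sketched in Python using only the standard library. This is an illustration under assumptions, not code from this folder: `BASE_URL`, `post_json`, `login`, and `create_database` are hypothetical names, and the payloads simply mirror the HTTPie invocations above.

```python
import json
import urllib.request

# Assumption: Superset runs locally, as started in the steps above.
BASE_URL = "http://localhost:8088"


def post_json(url, payload, token=None):
    """POST `payload` as JSON and return the decoded JSON response."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def login(post=post_json):
    """Authenticate and return the JWT access token."""
    body = post(
        f"{BASE_URL}/api/v1/security/login",
        {"username": "admin", "password": "admin", "provider": "db"},
    )
    return body["access_token"]


def create_database(token, post=post_json):
    """Create the database connection object for CrateDB."""
    return post(
        f"{BASE_URL}/api/v1/database/",
        {
            "database_name": "CrateDB Testdrive",
            "engine": "crate",
            "sqlalchemy_uri": "crate://crate@localhost:4200",
        },
        token=token,
    )
```

The `post` parameter exists only so the HTTP call can be swapped out, for example with a stub in a unit test. With a running server, `create_database(login())` performs both steps.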

```bash
# Create datasets and probe them.
crash < data.sql
http http://127.0.0.1:8088/api/v1/dataset/ Authorization:"Bearer ${AUTH_TOKEN}" database=1 schema=doc table_name=devices_info
http http://127.0.0.1:8088/api/v1/dataset/ Authorization:"Bearer ${AUTH_TOKEN}" database=1 schema=doc table_name=devices_readings
cat probe-1.json | http http://127.0.0.1:8088/api/v1/chart/data Authorization:"Bearer ${AUTH_TOKEN}"
cat probe-2.json | http http://127.0.0.1:8088/api/v1/chart/data Authorization:"Bearer ${AUTH_TOKEN}"
```
131 changes: 131 additions & 0 deletions framework/apache-superset/conftest.py
@@ -0,0 +1,131 @@
import os
import shlex
import shutil
import subprocess
import time

import pytest
import requests

from util import get_auth_headers


superset_env = {
    "FLASK_APP": "superset",
    "SUPERSET_CONFIG_PATH": "superset_config.py",
}
superset_bin = shutil.which("superset")


uri_database = "http://localhost:8088/api/v1/database/"


# Utility functions.

def invoke_superset(command: str):
    """
    Invoke `superset` command.
    """
    command = f"{superset_bin} {command}"
    subprocess.check_call(shlex.split(command), env=superset_env, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)


# Test suite fixtures.

@pytest.fixture(scope="session")
def fix_greenlet():
    """
    Install more recent greenlet, because Playwright installs version 3.0.1, which breaks Superset.
    """
    os.system("pip install --upgrade greenlet")


@pytest.fixture(scope="session")
def playwright_install_firefox():
    """
    Playwright needs a browser.
    """
    os.system("playwright install firefox")


@pytest.fixture(scope="session")
def initialize_superset():
    """
    Run the Apache Superset setup procedure.
    """
    invoke_superset("db upgrade")
    invoke_superset("fab create-admin --username=admin --password=admin --firstname=admin --lastname=admin [email protected]")
    invoke_superset("init")


@pytest.fixture(scope="session")
def reset_superset():
    """
    Reset database connections and datasets.
    """
    resources_to_delete = [
        "http://localhost:8088/api/v1/dataset/1",
        "http://localhost:8088/api/v1/dataset/2",
        "http://localhost:8088/api/v1/database/1",
        "http://localhost:8088/api/v1/database/2",
    ]
    for resource_to_delete in resources_to_delete:
        response = requests.delete(resource_to_delete, headers=get_auth_headers())
        assert response.status_code in [200, 404], response.json()


@pytest.fixture(scope="session")
def start_superset():
    """
    Start the Apache Superset server.
    """
    command = f"{superset_bin} run -p 8088 --with-threads"
    daemon = subprocess.Popen(shlex.split(command), env=superset_env, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # Give the server time to start
    time.sleep(4)
    # Check it started successfully: `poll()` returns `None` while the process
    # is still running, and the exit code (including 0) once it has terminated.
    assert daemon.poll() is None, daemon.stdout.read().decode("utf-8")
    yield daemon
    # Shut it down at the end of the pytest session
    daemon.terminate()


@pytest.fixture(scope="session")
def provision_superset(start_superset):
    """
    Provision Superset by creating a database connection object for CrateDB.
    """

    # Create a data source item / database connection.
    response = requests.post(
        uri_database,
        headers=get_auth_headers(),
        json={"database_name": "CrateDB Testdrive", "engine": "crate", "sqlalchemy_uri": "crate://crate@localhost:4200"},
    )

    assert response.status_code == 201
    assert response.json() == {
        "id": 1,
        "result": {
            "configuration_method": "sqlalchemy_form",
            "database_name": "CrateDB Testdrive",
            "driver": "crate-python",
            "expose_in_sqllab": True,
            "sqlalchemy_uri": "crate://crate@localhost:4200",
        },
    }


@pytest.fixture(scope="session", autouse=True)
def do_setup(
    fix_greenlet,
    playwright_install_firefox,
    initialize_superset,
    start_superset,
    reset_superset,
    provision_superset,
):
    """
    Provide a fully configured and provisioned Apache Superset instance to the test suite.
    """
    pass
36 changes: 36 additions & 0 deletions framework/apache-superset/data.sql
@@ -0,0 +1,36 @@
-- https://github.com/crate/cratedb-datasets

CREATE TABLE IF NOT EXISTS devices_readings (
    "ts" TIMESTAMP WITH TIME ZONE,
    "device_id" TEXT,
    "battery" OBJECT(DYNAMIC) AS (
        "level" BIGINT,
        "status" TEXT,
        "temperature" DOUBLE PRECISION
    ),
    "cpu" OBJECT(DYNAMIC) AS (
        "avg_1min" DOUBLE PRECISION,
        "avg_5min" DOUBLE PRECISION,
        "avg_15min" DOUBLE PRECISION
    ),
    "memory" OBJECT(DYNAMIC) AS (
        "free" BIGINT,
        "used" BIGINT
    )
);

CREATE TABLE IF NOT EXISTS devices_info (
    "device_id" TEXT,
    "api_version" TEXT,
    "manufacturer" TEXT,
    "model" TEXT,
    "os_name" TEXT
);

COPY "devices_readings"
FROM 'https://github.com/crate/cratedb-datasets/raw/main/cloud-tutorials/devices_readings.json.gz'
WITH (compression = 'gzip');

COPY "devices_info"
FROM 'https://github.com/crate/cratedb-datasets/raw/main/cloud-tutorials/devices_info.json.gz'
WITH (compression = 'gzip');
11 changes: 11 additions & 0 deletions framework/apache-superset/pyproject.toml
@@ -0,0 +1,11 @@
[tool.pytest.ini_options]
minversion = "2.0"
addopts = """
-rfEXs -p pytester --strict-markers --verbosity=3
"""
log_level = "DEBUG"
log_cli_level = "DEBUG"
testpaths = ["*.py"]
xfail_strict = true
markers = [
]
3 changes: 3 additions & 0 deletions framework/apache-superset/requirements-test.txt
@@ -0,0 +1,3 @@
playwright<2
pytest<8
requests<3
3 changes: 3 additions & 0 deletions framework/apache-superset/requirements.txt
@@ -0,0 +1,3 @@
apache-superset<3
crate[sqlalchemy]==0.34.0
marshmallow_enum<2 # Seems to be missing from `apache-superset`?
32 changes: 32 additions & 0 deletions framework/apache-superset/superset_config.py
@@ -0,0 +1,32 @@
# Superset specific config
ROW_LIMIT = 5000

# Flask App Builder configuration
# Your App secret key will be used for securely signing the session cookie
# and encrypting sensitive information on the database
# Make sure you are changing this key for your deployment with a strong key.
# Alternatively you can set it with `SUPERSET_SECRET_KEY` environment variable.
# You MUST set this for production environments, or the server will refuse
# to start and you will see an error in the logs accordingly.
SECRET_KEY = 'VcKzHS4g2h+dP33tCbqOghtKaU37wvFECMhVqrfccaoI/17qh/j3+VDV'

# The SQLAlchemy connection string to your database backend
# This connection defines the path to the database that stores your
# superset metadata (slices, connections, tables, dashboards, ...).
# Note that the connection information to connect to the datasources
# you want to explore are managed directly in the web UI
# The check_same_thread=false property ensures the sqlite client does not attempt
# to enforce single-threaded access, which may be problematic in some edge cases
# When not configured, the default location is `~/.superset/superset.db`.
# See also https://superset.apache.org/docs/installation/configuring-superset/.
# SQLALCHEMY_DATABASE_URI = 'sqlite:////path/to/superset.db?check_same_thread=false'

# Flask-WTF flag for CSRF
WTF_CSRF_ENABLED = True
# Add endpoints that need to be exempt from CSRF protection
WTF_CSRF_EXEMPT_LIST = []
# A CSRF token that expires in 1 year
WTF_CSRF_TIME_LIMIT = 60 * 60 * 24 * 365

# Set this API key to enable Mapbox visualizations
MAPBOX_API_KEY = ''