Apache Superset: Verify connectivity to CrateDB with basic integration tests #217

Merged
merged 2 commits into from
Jan 8, 2024
75 changes: 75 additions & 0 deletions .github/workflows/apache-superset.yml
@@ -0,0 +1,75 @@
name: Apache Superset

on:
  pull_request:
    branches: ~
    paths:
    - '.github/workflows/apache-superset.yml'
    - 'framework/apache-superset/**'
    - 'requirements.txt'
  push:
    branches: [ main ]
    paths:
    - '.github/workflows/apache-superset.yml'
    - 'framework/apache-superset/**'
    - 'requirements.txt'

  # Allow job to be triggered manually.
  workflow_dispatch:

  # Run job each night after CrateDB nightly has been published.
  schedule:
    - cron: '0 3 * * *'

# Cancel in-progress jobs when pushing to the same branch.
concurrency:
  cancel-in-progress: true
  group: ${{ github.workflow }}-${{ github.ref }}

jobs:

  tests:
    runs-on: ${{ matrix.os }}

    strategy:
      fail-fast: false
      matrix:
        os: [ ubuntu-22.04 ]
        superset-version: [ "2.*", "3.*" ]
        python-version: [ "3.11" ]

    services:
      cratedb:
        image: crate/crate:nightly
        ports:
          - 4200:4200
          - 5432:5432

    name: Superset ${{ matrix.superset-version }}, Python ${{ matrix.python-version }}
    steps:

      - name: Acquire sources
        uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
          architecture: x64
          cache: "pip"
          cache-dependency-path: |
            pyproject.toml
            requirements.txt
            requirements-test.txt

      - name: Install utilities
        run: |
          pip install -r requirements.txt

      - name: Install Apache Superset ${{ matrix.superset-version }}
        run: |
          pip install 'apache-superset==${{ matrix.superset-version }}'

      - name: Validate framework/apache-superset
        run: |
          ngr test --accept-no-venv framework/apache-superset
3 changes: 2 additions & 1 deletion .gitignore
@@ -2,7 +2,8 @@
.venv*
__pycache__
.coverage
.DS_Store
coverage.xml
mlruns/
archive/
logs.log
logs.log
58 changes: 58 additions & 0 deletions framework/apache-superset/README.md
@@ -0,0 +1,58 @@
# Verify Apache Superset with CrateDB

## About

This folder includes software integration tests for verifying
that Apache Superset works well together with CrateDB.

## Setup

You can also exercise the configuration and setup steps manually.

Start CrateDB.
```bash
docker run --rm -it --name=cratedb \
--publish=4200:4200 --publish=5432:5432 \
--env=CRATE_HEAP_SIZE=4g crate:latest -Cdiscovery.type=single-node
```

Set up a sandbox and install packages.
```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

Configure and initialize Apache Superset.
```bash
export FLASK_APP=superset
export SUPERSET_CONFIG_PATH=superset_config.py
superset db upgrade
superset fab create-admin --username=admin --password=admin --firstname=admin --lastname=admin [email protected]
superset init
```

Run Superset server.
```bash
superset run -p 8088 --with-threads
open http://127.0.0.1:8088/
```

## API Usage

```bash
# Authenticate and acquire a JWT token.
AUTH_TOKEN=$(http http://localhost:8088/api/v1/security/login username=admin password=admin provider=db | jq -r .access_token)

# Create a data source item / database connection.
http http://localhost:8088/api/v1/database/ database_name="CrateDB Testdrive" engine=crate sqlalchemy_uri=crate://crate@localhost:4200 Authorization:"Bearer ${AUTH_TOKEN}"
```
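
The two HTTPie calls above can also be sketched in Python. This is a minimal, hedged example using only the standard library; the helper names (`json_request`, `login_request`, `create_database_request`, `send`) are hypothetical and not part of this repository, and `BASE_URL` assumes Superset is listening on `localhost:8088` as in the steps above.

```python
import json
import urllib.request
from typing import Optional

# Assumption: Superset runs locally on port 8088, as in the README steps.
BASE_URL = "http://localhost:8088"


def json_request(url: str, payload: dict, token: Optional[str] = None) -> urllib.request.Request:
    """Build a JSON POST request, optionally carrying a Bearer token."""
    headers = {"Content-Type": "application/json"}
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


def login_request(username: str = "admin", password: str = "admin") -> urllib.request.Request:
    """Request for the login endpoint, mirroring the first HTTPie call."""
    payload = {"username": username, "password": password, "provider": "db"}
    return json_request(f"{BASE_URL}/api/v1/security/login", payload)


def create_database_request(token: str) -> urllib.request.Request:
    """Request that registers the CrateDB connection, mirroring the second HTTPie call."""
    payload = {
        "database_name": "CrateDB Testdrive",
        "engine": "crate",
        "sqlalchemy_uri": "crate://crate@localhost:4200",
    }
    return json_request(f"{BASE_URL}/api/v1/database/", payload, token=token)


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a running server, `token = send(login_request())["access_token"]` followed by `send(create_database_request(token))` would correspond to the shell commands above.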

```bash
# Create datasets and probe them.
crash < data.sql
http http://127.0.0.1:8088/api/v1/dataset/ Authorization:"Bearer ${AUTH_TOKEN}" database=1 schema=doc table_name=devices_info
http http://127.0.0.1:8088/api/v1/dataset/ Authorization:"Bearer ${AUTH_TOKEN}" database=1 schema=doc table_name=devices_readings
cat probe-1.json | http http://127.0.0.1:8088/api/v1/chart/data Authorization:"Bearer ${AUTH_TOKEN}"
cat probe-2.json | http http://127.0.0.1:8088/api/v1/chart/data Authorization:"Bearer ${AUTH_TOKEN}"
```
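
Note that `database=1` in the dataset calls refers to the id of the database connection created earlier, so it only holds on a fresh Superset instance. As a small sketch, the request bodies for both tables could be built like this; `dataset_payload` and `dataset_payloads` are hypothetical helper names used for illustration only.

```python
from typing import List, Sequence

# Table names taken from `data.sql`.
TABLES = ("devices_info", "devices_readings")


def dataset_payload(table_name: str, database_id: int = 1, schema: str = "doc") -> dict:
    """Build the JSON body for `POST /api/v1/dataset/`."""
    return {"database": database_id, "schema": schema, "table_name": table_name}


def dataset_payloads(tables: Sequence[str] = TABLES, database_id: int = 1) -> List[dict]:
    """Build one dataset payload per table, all bound to the same database id."""
    return [dataset_payload(name, database_id=database_id) for name in tables]
```

Passing the `id` returned by the database-creation call as `database_id` avoids relying on the value `1`.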
138 changes: 138 additions & 0 deletions framework/apache-superset/conftest.py
@@ -0,0 +1,138 @@
import os
import shlex
import shutil
import subprocess
import time

import pytest
import requests

from util import get_auth_headers


superset_env = {
    "FLASK_APP": "superset",
    "SUPERSET_CONFIG_PATH": "superset_config.py",
}
superset_bin = shutil.which("superset")


uri_database = "http://localhost:8088/api/v1/database/"


# Utility functions.

def invoke_superset(command: str):
    """
    Invoke `superset` command.
    """
    command = f"{superset_bin} {command}"
    subprocess.check_call(shlex.split(command), env=superset_env, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)


# Test suite fixtures.

@pytest.fixture(scope="session")
def fix_greenlet():
    """
    Install more recent greenlet, because Playwright installs version 3.0.1, which breaks Superset.
    """
    os.system("pip install --upgrade greenlet")
Review comment on lines +35 to +40 (Member, Author):
Well. There is always something. May need a report or patch on/for Playwright.
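
As a possible variation on the `fix_greenlet` workaround, the sketch below runs pip through the interpreter that executes the tests and raises if the upgrade fails, whereas `os.system` ignores the exit status. The helper names `pip_upgrade_command` and `upgrade_package` are hypothetical and not part of this PR.

```python
import subprocess
import sys
from typing import List


def pip_upgrade_command(package: str) -> List[str]:
    """Build a `pip install --upgrade` command bound to the running interpreter."""
    return [sys.executable, "-m", "pip", "install", "--upgrade", package]


def upgrade_package(package: str) -> None:
    """Upgrade `package`; raises `subprocess.CalledProcessError` if pip exits non-zero."""
    subprocess.check_call(pip_upgrade_command(package))
```

Inside the fixture, `upgrade_package("greenlet")` would take the place of the `os.system` call.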

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.



@pytest.fixture(scope="session")
def playwright_install_firefox():
    """
    Playwright needs a browser.
    """
    os.system("playwright install firefox")


@pytest.fixture(scope="session")
def initialize_superset():
    """
    Run the Apache Superset setup procedure.
    """
    invoke_superset("db upgrade")
    invoke_superset("fab create-admin --username=admin --password=admin --firstname=admin --lastname=admin [email protected]")
    invoke_superset("init")


@pytest.fixture(scope="session")
def reset_superset():
    """
    Reset database connections and datasets.
    """
    resources_to_delete = [
        "http://localhost:8088/api/v1/dataset/1",
        "http://localhost:8088/api/v1/dataset/2",
        "http://localhost:8088/api/v1/database/1",
        "http://localhost:8088/api/v1/database/2",
    ]
    for resource_to_delete in resources_to_delete:
        response = requests.delete(resource_to_delete, headers=get_auth_headers())
        assert response.status_code in [200, 404], response.json()


@pytest.fixture(scope="session")
def start_superset():
    """
    Start the Apache Superset server.
    """
    command = f"{superset_bin} run -p 8088 --with-threads"
    daemon = subprocess.Popen(shlex.split(command), env=superset_env, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # Give the server time to start.
    time.sleep(4)
    # Check it is still running: `poll()` returns `None` while the process is alive,
    # and an exit code (including 0) once it has terminated.
    assert daemon.poll() is None, daemon.stdout.read().decode("utf-8")
    yield daemon
    # Shut it down at the end of the pytest session.
    daemon.terminate()
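
The fixed `time.sleep(4)` in `start_superset` may be too short on a slow runner and needlessly long on a fast one. A hedged alternative is to poll until the server answers. The sketch below is not part of this PR: `wait_until_ready` and `superset_is_up` are hypothetical names, and it assumes Superset's `/health` endpoint is reachable on port 8088.

```python
import time
import urllib.request
from typing import Callable


def wait_until_ready(probe: Callable[[], bool], timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Call `probe` repeatedly until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if probe():
                return True
        except OSError:
            # Connection refused and similar errors mean "not ready yet".
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def superset_is_up(url: str = "http://localhost:8088/health") -> bool:
    """Probe the health endpoint; assumption: it responds with HTTP 200 once the server is up."""
    with urllib.request.urlopen(url, timeout=2) as response:
        return response.status == 200
```

In the fixture, `assert wait_until_ready(superset_is_up)` could stand in for the sleep, while the `daemon.poll()` check still catches a server that exited early.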


@pytest.fixture(scope="session")
def provision_superset(start_superset):
    """
    Provision Superset by creating a database connection object for CrateDB.
    """

    # Create a data source item / database connection.
    response = requests.post(
        uri_database,
        headers=get_auth_headers(),
        json={"database_name": "CrateDB Testdrive", "engine": "crate", "sqlalchemy_uri": "crate://crate@localhost:4200"},
    )

    assert response.status_code == 201
    payload = response.json()

    # Superset 3 uses UUIDs to identify resources.
    # Remove them for comparison purposes.
    if "uuid" in payload["result"]:
        del payload["result"]["uuid"]

    assert payload == {
        "id": 1,
        "result": {
            "configuration_method": "sqlalchemy_form",
            "database_name": "CrateDB Testdrive",
            "driver": "crate-python",
            "expose_in_sqllab": True,
            "sqlalchemy_uri": "crate://crate@localhost:4200",
        },
    }
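
The UUID handling in `provision_superset` mutates the response payload in place. If more version-dependent fields show up later, a small helper that returns a cleaned copy may be easier to reuse. This is only a sketch; `without_volatile_keys` is a hypothetical name and not part of this PR.

```python
import copy
from typing import Iterable


def without_volatile_keys(payload: dict, volatile: Iterable[str] = ("uuid",)) -> dict:
    """Return a deep copy of `payload` with volatile keys removed from its `result` section."""
    cleaned = copy.deepcopy(payload)
    result = cleaned.get("result", {})
    for key in volatile:
        # `pop` with a default tolerates responses that lack the key (e.g. Superset 2).
        result.pop(key, None)
    return cleaned
```

The assertion would then compare `without_volatile_keys(response.json())` against the expected dictionary, leaving the original response untouched.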


@pytest.fixture(scope="session", autouse=True)
def do_setup(
    fix_greenlet,
    playwright_install_firefox,
    initialize_superset,
    start_superset,
    reset_superset,
    provision_superset,
):
    """
    Provide a fully configured and provisioned Apache Superset instance to the test suite.
    """
    pass
36 changes: 36 additions & 0 deletions framework/apache-superset/data.sql
@@ -0,0 +1,36 @@
-- https://github.com/crate/cratedb-datasets

CREATE TABLE IF NOT EXISTS devices_readings (
   "ts" TIMESTAMP WITH TIME ZONE,
   "device_id" TEXT,
   "battery" OBJECT(DYNAMIC) AS (
      "level" BIGINT,
      "status" TEXT,
      "temperature" DOUBLE PRECISION
   ),
   "cpu" OBJECT(DYNAMIC) AS (
      "avg_1min" DOUBLE PRECISION,
      "avg_5min" DOUBLE PRECISION,
      "avg_15min" DOUBLE PRECISION
   ),
   "memory" OBJECT(DYNAMIC) AS (
      "free" BIGINT,
      "used" BIGINT
   )
);

CREATE TABLE IF NOT EXISTS devices_info (
   "device_id" TEXT,
   "api_version" TEXT,
   "manufacturer" TEXT,
   "model" TEXT,
   "os_name" TEXT
);

COPY "devices_readings"
   FROM 'https://github.com/crate/cratedb-datasets/raw/main/cloud-tutorials/devices_readings.json.gz'
   WITH (compression = 'gzip');

COPY "devices_info"
   FROM 'https://github.com/crate/cratedb-datasets/raw/main/cloud-tutorials/devices_info.json.gz'
   WITH (compression = 'gzip');
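
To check that the `COPY` statements actually loaded rows, one option is to query CrateDB over its HTTP endpoint. The following is a rough sketch, assuming CrateDB listens on `localhost:4200` without authentication; the helper names (`sql_request`, `count_statement`, `run`) are hypothetical and not part of this PR.

```python
import json
import urllib.request

# Assumption: CrateDB's HTTP endpoint is published on port 4200, as in the workflow and README.
CRATEDB_SQL_URL = "http://localhost:4200/_sql"


def sql_request(statement: str) -> urllib.request.Request:
    """Build a POST request that submits one SQL statement to the `/_sql` endpoint."""
    body = json.dumps({"stmt": statement}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(CRATEDB_SQL_URL, data=body, headers=headers, method="POST")


def count_statement(table: str) -> str:
    """SQL statement that counts the rows of `table`."""
    return f'SELECT COUNT(*) FROM "{table}"'


def run(statement: str) -> list:
    """Execute the statement and return the `rows` part of the JSON response."""
    with urllib.request.urlopen(sql_request(statement)) as response:
        return json.loads(response.read().decode("utf-8"))["rows"]
```

Running `run('REFRESH TABLE "devices_info"')` before `run(count_statement("devices_info"))` would make freshly imported rows visible to the count.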
11 changes: 11 additions & 0 deletions framework/apache-superset/pyproject.toml
@@ -0,0 +1,11 @@
[tool.pytest.ini_options]
minversion = "2.0"
addopts = """
-rfEXs -p pytester --strict-markers --verbosity=3
"""
log_level = "DEBUG"
log_cli_level = "DEBUG"
testpaths = ["*.py"]
xfail_strict = true
markers = [
]
3 changes: 3 additions & 0 deletions framework/apache-superset/requirements-test.txt
@@ -0,0 +1,3 @@
playwright<2
pytest<8
requests<3
3 changes: 3 additions & 0 deletions framework/apache-superset/requirements.txt
@@ -0,0 +1,3 @@
apache-superset
crate[sqlalchemy]==0.34.0
marshmallow_enum<2 # Seems to be missing from `apache-superset`?
32 changes: 32 additions & 0 deletions framework/apache-superset/superset_config.py
@@ -0,0 +1,32 @@
# Superset specific config
ROW_LIMIT = 5000

# Flask App Builder configuration
# Your App secret key will be used for securely signing the session cookie
# and encrypting sensitive information on the database
# Make sure you are changing this key for your deployment with a strong key.
# Alternatively you can set it with `SUPERSET_SECRET_KEY` environment variable.
# You MUST set this for production environments or the server will refuse
# to start and you will see an error in the logs accordingly.
SECRET_KEY = 'VcKzHS4g2h+dP33tCbqOghtKaU37wvFECMhVqrfccaoI/17qh/j3+VDV'

# The SQLAlchemy connection string to your database backend
# This connection defines the path to the database that stores your
# superset metadata (slices, connections, tables, dashboards, ...).
# Note that the connection information to connect to the datasources
# you want to explore are managed directly in the web UI
# The check_same_thread=false property ensures the sqlite client does not attempt
# to enforce single-threaded access, which may be problematic in some edge cases
# When not configured, the default location is `~/.superset/superset.db`.
# See also https://superset.apache.org/docs/installation/configuring-superset/.
# SQLALCHEMY_DATABASE_URI = 'sqlite:////path/to/superset.db?check_same_thread=false'

# Flask-WTF flag for CSRF
WTF_CSRF_ENABLED = False
# Add endpoints that need to be exempt from CSRF protection
WTF_CSRF_EXEMPT_LIST = []
# A CSRF token that expires in 1 year
WTF_CSRF_TIME_LIMIT = 60 * 60 * 24 * 365

# Set this API key to enable Mapbox visualizations
MAPBOX_API_KEY = ''
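
The `SECRET_KEY` above is committed to the repository, which is fine for a test fixture but not for a real deployment. As a hedged sketch, a fresh key of the same shape (42 random bytes, Base64-encoded) could be generated as follows and supplied through the `SUPERSET_SECRET_KEY` environment variable mentioned in the comments. `generate_secret_key` is a hypothetical helper, not part of Superset or this PR.

```python
import base64
import secrets


def generate_secret_key(num_bytes: int = 42) -> str:
    """Return `num_bytes` of cryptographically strong randomness, Base64-encoded."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")
```

With the default of 42 bytes, the result is a 56-character string, the same length as the key in this file.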