The sparql-loadtest is a command-line tool to measure SPARQL query performance on the Manetu platform. The basic premise is that you may specify an arbitrary query with optional initial-bindings, and the tool will run a fixed number of iterations with the specified concurrency. After the test, the tool will compute and report various metrics such as the min/ave/max latency and the throughput rate.
This utility helps measure your cluster's overall query performance and helps optimize your SPARQL query.
- JDK (tested with JDK22)
- make
$ make
$ java -jar ./target/uberjar/app.jar -h
Usage: manetu-sparql-loadtest [options]
Measures the performance metrics of concurrent SPARQL queries to the Manetu platform
Options:
-h, --help
-v, --version Print the version and exit
-u, --url URL The connection URL
--insecure Disable TLS host checking
--[no-]progress true Enable/disable progress output (default: enabled)
-l, --log-level LEVEL :info Select the logging verbosity level from: [trace, debug, info, error]
-c, --concurrency NUM 64 The number of parallel jobs to run
-d, --driver DRIVER :gql Select the driver from: [null, gql]
-q, --query PATH The path to a file containing a SPARQL query to use in test
-b, --bindings FILE (Optional) The path to a CSV file to cycle through as input bindings to each SPARQL query
-n, --nr COUNT 10000 The number of queries to execute
In addition to --url and optionally --insecure, you must set the environment variable MANETU_TOKEN to a personal access token issued from your Manetu cluster.
The parameters of the test include the level of concurrency (--concurrency), the number of iterations (--nr), the query (--query) specified as a path to a file containing a SPARQL expression, and optionally a set of initial bindings (--bindings) defined as a path to a CSV file.
The SPARQL expression must be legal SPARQL grammar. For example:
PREFIX manetu: <http://manetu.com/manetu/>
PREFIX mmeta: <http://manetu.io/rdf/metadata/0.1/>
SELECT ?label
WHERE {
?s manetu:email ?email ;
mmeta:vaultLabel ?label .
}
The expression may optionally contain input bindings that will be satisfied by the CSV input file specified by --bindings. The utility will submit the column header of the CSV file as a binding verbatim, with the requisite "?" prefix. For example, for the following CSV:
id,name,email
1,alice,[email protected]
2,bob,[email protected]
This would generate bindings such as:
{"?id": "1", "?name": "alice", "?email": "[email protected]"}
The utility will cycle through the rows for cases where the number of iterations (--nr) exceeds the rows in the --bindings. For example, if the initial bindings contains 4 rows (numbered 1-4) and the test is executed with --nr 10, the utility will generate queries with the rows cycled e.g. [1 2 3 4 1 2 3 4 1 2].
For this example, we will use a query to search the graph by email to return a vault label. We will use several files from this repository for our query and bindings. Running this example assumes you have onboarded some mock-data using the Manetu data-loader utility.
$ java -jar ./target/uberjar/app.jar -u https://manetu.example.com --insecure -q ./examples/by-email/label-by-email.sparql --bindings ./examples/by-email/bindings.csv -n10000 -c64
If successful, the test should display something similar to the following:
2024-10-18T19:54:24.398Z INFO processing 10000 requests with concurrency 64
2024-10-18T19:54:24.403Z INFO Loading bindings from: ./examples/by-email/bindings.csv
10000/10000 100% [==================================================] ETA: 00:00
|-------------+----------------+-------+--------+--------+--------+--------+--------+--------+--------|
| Description | Count | Min | Mean | Stddev | P50 | P90 | P99 | Max | Rate |
|-------------+----------------+-------+--------+--------+--------+--------+--------+--------+--------|
| Errors | 0 (0.0%) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Not found | 0 (0.0%) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Successes | 10000 (100.0%) | 29.05 | 110.79 | 35.83 | 111.48 | 147.47 | 194.54 | 419.91 | 490.11 |
| Total | 10000 (100.0%) | 29.05 | 110.79 | 35.83 | 111.48 | 147.47 | 194.54 | 419.91 | 490.11 |
|-------------+----------------+-------+--------+--------+--------+--------+--------+--------+--------|
Total Duration: 20403.423232msecs
The following instructions allow you to deploy this tool into a Kubernetes instance. You would typically use the Kubernetes option to run the tool in close proximity to a Manetu instance running in the same cluster, though this is not strictly required.
Manetu hosts a Docker-based version of this tool on Dockerhub:
https://hub.docker.com/repository/docker/manetuops/sparql-loadtest/general
We will use this to deploy into Kubernetes.
You must inject a Personal Access Token to your Manetu instance as a Kubernetes Secret into your cluster for the tool to use, like so:
kubectl create secret generic manetu-sparql-loadtest --from-literal=MANETU_TOKEN=<your token>
You must also deploy the query/bindings you wish to use as a Kubernetes ConfigMap. The ConfigMap should have two bindings named 'bindings.csv' and 'query.sparql', which we will use to inject the files into our deployment in the next step.
kubectl create configmap manetu-sparql-loadtest --from-file=bindings.csv=examples/by-uuid/bindings.csv --from-file=query.sparql=examples/by-uuid/query.sparql
Next, we can define a Kubernetes Job that leverages our secret/configmap like so:
apiVersion: batch/v1
kind: Job
metadata:
name: manetu-sparql-loadtest
spec:
template:
spec:
restartPolicy: Never
containers:
- name: sparql-loadtest
image: manetuops/sparql-loadtest:latest
imagePullPolicy: Always
env:
- name: MANETU_URL
value: https://ingress.manetu-platform
- name: LOG_LEVEL
value: info
- name: LOADTEST_CONCURRENCY
value: "64"
- name: LOADTEST_NR
value: "10000"
- name: LOADTEST_QUERY
value: "/etc/manetu/loadtest/query.sparql"
- name: LOADTEST_BINDINGS
value: "/etc/manetu/loadtest/bindings.csv"
envFrom:
- secretRef:
name: manetu-sparql-loadtest
volumeMounts:
- name: data
mountPath: "/etc/manetu/loadtest"
readOnly: true
volumes:
- name: data
configMap:
name: manetu-sparql-loadtest
For convenience, this file is available in this repository as kubernetes/job.yaml. You may apply this like so:
kubectl apply -f kubernetes/job.yaml
Once the job is completed, you may use 'kubectl logs' to obtain the test results. First, obtain the name of the pod, like so:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
manetu-sparql-loadtest-4bmkh 0/1 Completed 0 28m
Then, query for the job logs, like so:
$ kubectl logs manetu-sparql-loadtest-4bmkh
+ exec java -jar /usr/local/app.jar -u https://ingress.manetu-platform -l info --no-progress --insecure --concurrency 64 --nr 10000 --query /etc/manetu/loadtest/examples/label-by-email.sparql --bindings /etc/manetu/loadtest/examples/bindings.csv
2024-10-18T23:51:00.921Z INFO processing 10000 requests with concurrency 64
2024-10-18T23:51:00.943Z INFO Loading bindings from: /etc/manetu/loadtest/examples/bindings.csv
|-------------+----------------+-------+--------+--------+--------+--------+--------+--------+--------|
| Description | Count | Min | Mean | Stddev | P50 | P90 | P99 | Max | Rate |
|-------------+----------------+-------+--------+--------+--------+--------+--------+--------+--------|
| Errors | 0 (0.0%) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Not found | 0 (0.0%) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Successes | 10000 (100.0%) | 26.37 | 109.17 | 42.2 | 109.07 | 144.32 | 186.83 | 586.19 | 489.78 |
| Total | 10000 (100.0%) | 26.37 | 109.17 | 42.2 | 109.07 | 144.32 | 186.83 | 586.19 | 489.78 |
|-------------+----------------+-------+--------+--------+--------+--------+--------+--------+--------|
Total Duration: 20417.142784msecs
Tip: If the job is experiencing errors, you may set LOG_LEVEL to 'trace' to diagnose the problem.