feat(interactive): Introduce a new benchmark tool for GIE (#4245)

## What do these changes do?


As titled. The main features of the new benchmark tool include:
* **Support for Multiple Query Languages**. The tool accommodates various graph query languages, including Gremlin and Cypher, allowing each system to be configured according to the query languages it supports.
* **Support for Different Graph Systems**. It supports comparison among
multiple graph systems, such as GraphScope GIE and KuzuDB. More systems
will be integrated in the future.
* **Support for Versatile Workloads**. The tool supports various workloads, including LDBC IC/BI, LSQB, and JOB.
* **Results Evaluation**. It enables correctness validation and
performance benchmarking for detailed comparisons.

The results of the output comparison are illustrated as follows:


![image](https://github.com/user-attachments/assets/94e42d11-26a7-47e2-9410-3585cb67d029)


## Related issue number


Fixes #3862, #4014

---------

Co-authored-by: Longbin Lai <[email protected]>
BingqingLyu and longbinlai authored Sep 24, 2024
1 parent 15c6a6c commit abca708
Showing 179 changed files with 4,036 additions and 404 deletions.
1 change: 1 addition & 0 deletions docs/index.rst
@@ -74,6 +74,7 @@ and the vineyard store that offers efficient in-memory data transfers.
interactive_engine/tinkerpop_eco
interactive_engine/neo4j_eco
interactive_engine/gopt
interactive_engine/benchmark_tool
.. interactive_engine/guide_and_examples
interactive_engine/design_of_gie
.. interactive_engine/supported_gremlin_steps
161 changes: 161 additions & 0 deletions docs/interactive_engine/benchmark_tool.md
@@ -0,0 +1,161 @@
# A Generic Benchmark Tool

We provide a benchmarking tool to evaluate the performance of the Interactive Engine. This tool acts as multiple clients that send queries (Gremlin or Cypher) to the server through the corresponding endpoint exposed by the engine. It reports performance metrics such as latency, throughput, and query results.

Notably, the tool has recently been enhanced to support comprehensive comparisons of different systems and a variety of benchmark workloads, enabling thorough assessments and comparison of query correctness and performance.

## Benchmark Tool Overview

Here are some key features of the benchmark tool:

* **Multiple Query Languages**. The tool accommodates various graph query languages, including Gremlin and Cypher, allowing each system to be configured according to the query languages it supports.
* **Different Graph Systems**. It supports comparison among multiple graph systems, such as GraphScope GIE and KuzuDB. More systems will be integrated in the future.
* **Versatile Workloads**. The tool supports various workloads, including [LDBC IC](https://ldbcouncil.org/benchmarks/snb-interactive/) and [BI](https://ldbcouncil.org/benchmarks/snb-bi/), [LSQB](https://github.com/ldbc/lsqb), and [JOB](https://github.com/gregrahn/join-order-benchmark).
* **Results Evaluation**. It enables correctness validation and performance benchmarking for detailed comparisons.

## Benchmark Tool Usage

The benchmark tool is provided [here](https://github.com/alibaba/GraphScope/tree/main/interactive_engine/benchmark).
The benchmark program sends mixed queries to the server by reading query templates from [queries](https://github.com/alibaba/GraphScope/tree/main/interactive_engine/benchmark/queries) and filling in their parameters using [substitution_parameters](https://github.com/alibaba/GraphScope/tree/main/interactive_engine/benchmark/data/substitution_parameters).
The program uses a round-robin strategy to iterate over all the **enabled** queries with the corresponding parameters.
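
As a rough illustration of how a template and its substitution parameters fit together (a hypothetical example; the placeholder syntax and parameter-file layout of the shipped templates may differ), an LDBC IC-style Cypher template and one parameter row could look like:

```
// ldbc_query_1.cypher -- hypothetical template; $personId and $firstName are filled in per run
MATCH (p:PERSON {id: $personId})-[:KNOWS*1..3]-(friend:PERSON {firstName: $firstName})
RETURN friend.id, friend.lastName
LIMIT 20;

// one parameter row (illustrative layout): personId|firstName
17592186223433|John
```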

### Repository contents

```
- bin
- bench.sh // script for running benchmark for queries
- collect.sh // script for collecting benchmark results
- config
- interactive-benchmark.properties // configurations for running benchmark
- data
- substitution_parameters // query parameter files used to fill the query templates
- expected_results // expected query results for the running queries
- queries // query templates including LDBC queries, LSQB queries, JOB queries, customized queries, etc.
- dbs // Other graph systems for comparison. Currently, KuzuDB is supported.
- example // an example to compare GraphScope GIE and Kuzu
- src // source code of benchmark program
```

_Note:_ queries with the prefix _ldbc_query_ are implementations of the official LDBC interactive complex reads,
queries with the prefix _bi_query_ are implementations of the official LDBC business intelligence workload,
queries with the prefix _lsqb_query_ are implementations of LDBC's labelled subgraph query benchmark,
and queries with the prefix _job_ are implementations of the JOB benchmark.
Gremlin queries should use the suffix _.gremlin_, and Cypher queries the suffix _.cypher_.
The corresponding parameters (scale factor 1) for the LDBC queries are generated by the [LDBC official tools](http://github.com/ldbc/ldbc_snb_datagen).
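
For example, template file names follow these conventions (the names below are illustrative, not an exhaustive listing of the shipped templates):

```
ldbc_query_1.cypher    # LDBC interactive complex read 1
bi_query_1.cypher      # LDBC BI query 1
lsqb_query_1.gremlin   # LSQB query 1, written in Gremlin
job_13a.cypher         # JOB query 13a
```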

### Building

Build benchmark program using Maven:

```bash
mvn clean package
```

All the binaries and queries will be packed into _target/benchmark-0.0.1-SNAPSHOT-dist.tar.gz_,
and you can deploy the package anywhere that can connect to the engine's endpoint (which should be provided in interactive-benchmark.properties).
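
If you prefer to run from the unpacked distribution directly, a typical sequence is as follows (note that the project's Makefile names the archive _gaia-benchmark-0.0.1-SNAPSHOT-dist.tar.gz_, so adjust the file name to whatever your build actually produces):

```bash
cd target
tar zxvf gaia-benchmark-0.0.1-SNAPSHOT-dist.tar.gz   # or the benchmark-* archive mentioned above
cd gaia-benchmark-0.0.1-SNAPSHOT
```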

### Running the benchmark

```bash
./bin/bench.sh # run the benchmark program with the provided properties
```

With the example configuration file ``example/job_benchmark.properties``, which compares GraphScope GIE and KuzuDB while executing the JOB benchmark, example results are as follows:

```
Start to benchmark system: GIE
QueryName[13a], Parameter[{}], ResultCount[1], ExecuteTimeMS[3638].
QueryName[32a], Parameter[{}], ResultCount[1], ExecuteTimeMS[266].
QueryName[9a], Parameter[{}], ResultCount[1], ExecuteTimeMS[3669].
QueryName[5c], Parameter[{}], ResultCount[1], ExecuteTimeMS[8603].
QueryName[3a], Parameter[{}], ResultCount[1], ExecuteTimeMS[613].
...
System: GIE; query count: 35; execute time(ms): xxx qps: xxx
Start to benchmark system: KuzuDb
QueryName[13a], Parameter[{}], ResultCount[1], ExecuteTimeMS[7068].
QueryName[32a], Parameter[{}], ResultCount[1], ExecuteTimeMS[253].
QueryName[9a], Parameter[{}], ResultCount[1], ExecuteTimeMS[5122].
QueryName[5c], Parameter[{}], ResultCount[1], ExecuteTimeMS[13623].
QueryName[3a], Parameter[{}], ResultCount[1], ExecuteTimeMS[4676].
...
System: KuzuDB; query count: 35; execute time(ms): xxx qps: xxx
```

### Collecting the results

```bash
./bin/collect.sh # run the result collection program to collect the results and generate a performance comparison table
```

Based on the benchmark results, the collected data and the final performance comparison table are as follows:


| QueryName | GIE Avg | GIE P50 | GIE P90 | GIE P95 | GIE P99 | GIE Count | KuzuDb Avg | KuzuDb P50 | KuzuDb P90 | KuzuDb P95 | KuzuDb P99 | KuzuDb Count |
| --------- | ------- | ------- | ------- | ------- | ------- | --------- | ---------- | ---------- | ---------- | ---------- | ---------- | ------------ |
| 3a | 613.00 | 613 | 613 | 613 | 613 | 1 | 4676.00 | 4676 | 4676 | 4676 | 4676 | 1 |
| 5c | 8603.00 | 8603 | 8603 | 8603 | 8603 | 1 | 13623.00 | 13623 | 13623 | 13623 | 13623 | 1 |
| 9a | 3669.00 | 3669 | 3669 | 3669 | 3669 | 1 | 5122.00 | 5122 | 5122 | 5122 | 5122 | 1 |
| 13a | 3638.00 | 3638 | 3638 | 3638 | 3638 | 1 | 7068.00 | 7068 | 7068 | 7068 | 7068 | 1 |
| 32a | 266.00 | 266 | 266 | 266 | 266 | 1 | 253.00 | 253 | 253 | 253 | 253 | 1 |

A more detailed end-to-end example is provided [here](https://github.com/alibaba/GraphScope/tree/main/interactive_engine/benchmark/example).

## Configurations

All detailed configurations can be found in ``config/interactive-benchmark.properties``.

Below we highlight some key settings.

### Configure Compared Systems

We facilitate comparisons between various graph systems. For instance, to compare the GIE and Kuzu systems, the interactive-benchmark.properties file can be configured as follows; the benchmark tool will then send queries to both GIE and Kuzu and gather and analyze their results.

```
# The configuration for the compared systems.
# Currently, the supported systems include GIE and KuzuDb.
# For each system, from system.1 to system.n, the following configurations are needed:
# name: the name of the system, e.g., GIE, KuzuDb.
# client: the client of the system, e.g., for GIE, it can be cypher or gremlin; for KuzuDb, it should be kuzu.
# endpoint (optional): the endpoint of the system if the system provides a service endpoint, e.g., for GIE gremlin, it is 127.0.0.1:8182 by default.
# path (optional): the path of the system's database if the system is a local database accessed by path, e.g., for KuzuDb, it can be /path_to_db/example_db.
# Either endpoint or path needs to be provided, depending on the access method of the system.
system.1.name = GIE
system.1.client = cypher
system.1.endpoint = 127.0.0.1:7687
system.1.path =
system.2.name = KuzuDb
system.2.client = kuzu
system.2.endpoint =
system.2.path = ./job_db
```
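
As a variant, a hypothetical setup that benchmarks GIE through its Gremlin endpoint instead of Cypher (127.0.0.1:8182 is the default mentioned in the comments above) would only change the first system's entries:

```
system.1.name = GIE
system.1.client = gremlin
system.1.endpoint = 127.0.0.1:8182
system.1.path =
```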

### Configure Workloads

Currently, we provide commonly used benchmark workloads including ic, bi, lsqb, and job. Users can also add their own benchmarking queries to [queries](https://github.com/alibaba/GraphScope/tree/main/interactive_engine/benchmark/queries) and add the corresponding substitution parameters to [substitution_parameters](https://github.com/alibaba/GraphScope/tree/main/interactive_engine/benchmark/data/substitution_parameters). Note that the file names of user-defined query templates should start with the prefix _custom_query_ or _custom_constant_query_; the difference between the two is that the latter has no corresponding parameters (see the illustrative file names below).
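
As an illustration (the file names are hypothetical), a parameterized and a constant user-defined query could be laid out as follows:

```
queries/custom_query_friends.cypher            # parameterized; expects matching parameters under data/substitution_parameters
queries/custom_constant_query_count_all.cypher # constant; no parameter file needed
```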

Taking the JOB benchmark as an example, the related configuration is as follows:

```
# The configuration for the benchmarking workloads.
# the directory of query templates
query.dir = ./queries/cypher_queries/job
# the directory of query parameters. If the queries do not have parameters, leave it empty.
query.parameters.dir =
# query file suffix, e.g., cypher (ldbc_query.cypher), gremlin (ldbc_query.gremlin), txt (ldbc_query.txt), etc.
query.file.suffix=cypher
# specify which kind of queries are sent.
# if query.all.enable is true, the benchmark will send all the queries in the query.dir.
query.all.enable=true
```
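
As another sketch, a workload that uses substitution parameters, such as LDBC IC, would additionally point `query.parameters.dir` at the parameter files; the directory names below are assumptions modeled on the JOB example and may differ in your checkout:

```
query.dir = ./queries/cypher_queries/ic
query.parameters.dir = ./data/substitution_parameters
query.file.suffix=cypher
query.all.enable=true
```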

### Configure Results Collection

By default, benchmark results will be output to the `interactive-benchmark.log` and `interactive-benchmark-report.md` files, as exemplified in the sections "Running the benchmark" and "Collecting the results" above. In addition, if you want to further compare query correctness under the current workloads, you can provide the corresponding configuration:

```
# the path of the expected query results (optional). If provided, the benchmark results will be compared against the expected results.
query.expected.path = ./data/expected_results/job_expected.json
```

The benchmark tool will automatically execute the queries and compare the results for correctness.
35 changes: 35 additions & 0 deletions interactive_engine/benchmark/Makefile
@@ -0,0 +1,35 @@
OPT?=poc

CUR_DIR:=$(shell dirname $(realpath $(firstword $(MAKEFILE_LIST))))

ifeq ($(JAVA_HOME),)
java:=java
else
java:=$(JAVA_HOME)/bin/java
endif

UNAME_S := $(shell uname -s)
UNAME_M := $(shell uname -m)

config.path:=config/interactive-benchmark.properties
QUIET_OPT := --quiet

build:
cd $(CUR_DIR) && mvn clean package ${QUIET_OPT} && \
cd target && \
tar zxvf gaia-benchmark-0.0.1-SNAPSHOT-dist.tar.gz > /dev/null

clean:
cd $(CUR_DIR) && mvn clean

run:
cd $(CUR_DIR) && $(java) \
-cp "$(CUR_DIR)/target/gaia-benchmark-0.0.1-SNAPSHOT/lib/*" \
com.alibaba.graphscope.gaia.benchmark.InteractiveBenchmark ${config.path}

collect:
cd $(CUR_DIR) && $(java) \
-cp "$(CUR_DIR)/target/gaia-benchmark-0.0.1-SNAPSHOT/lib/*" \
com.alibaba.graphscope.gaia.benchmark.CollectResult ${config.path}

.PHONY: build clean run collect
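
For reference, the Makefile targets above can be driven roughly as follows; `config.path` is the variable the Makefile already defines, and the value shown is simply its default:

```bash
make build                                                         # package and unpack the distribution
make run config.path=config/interactive-benchmark.properties      # run the benchmark
make collect config.path=config/interactive-benchmark.properties  # collect results and build the comparison table
```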
82 changes: 50 additions & 32 deletions interactive_engine/benchmark/README.md
@@ -1,26 +1,30 @@
## Benchmark Tool Usage

In this directory is a tool that can be used to benchmark GAIA. It serves as multiple clients to send
queries to gremlin server through the gremlin endpoint exposed by the engine, and report the performance numbers
(e.g., latency, throughput, query results).
The benchmark program sends mixed queries to the server by reading query templates from [queries](queries) with filling the parameters in the query templates
using [substitution_parameters](data/substitution_parameters).
This directory contains a benchmarking tool for GraphScope GIE and other specified systems. It functions as multiple clients, sending queries through the engine's exposed endpoint or directly to the database, depending on the querying method for each system. The tool reports performance metrics such as latency, throughput, and query results.
The benchmark program sends mixed queries to the server by reading query templates from [queries](queries) and filling in their parameters using [substitution_parameters](data/substitution_parameters).
The program uses a round-robin strategy to iterate all the **enabled** queries with corresponding parameters.

### Repository contents
```
- bin
- bench.sh // script for running benchmark for queries
- collect.sh // script for collecting benchmark results
- config
- interactive-benchmark.properties // configurations for running benchmark
- data
- substitution_parameters // query parameter files used to fill the query templates
- queries // query templates including LDBC queries, K-hop queries and user-defined queries
- scripts
- benchmark.sh // script for running benchmark
- cal.py // script for calculating benchmark results
- expected_results // expected query results for the running queries
- queries // query templates including LDBC queries, LSQB queries, Job queries, customized queries, etc.
- dbs // Other graph systems for comparison. Currently, KuzuDB is supported.
- example // an example to compare GraphScope GIE and Kuzu
- src // source code of benchmark program
```
_Note:_ the queries here with the prefix _ldbc_query_ are implementations of LDBC official interactive complex reads,
and the corresponding parameters (factor 1) are generated by [LDBC official tools](http://github.com/ldbc/ldbc_snb_datagen).
the queries with the prefix _bi_query_ are implementations of the official LDBC business intelligence workload,
the queries with the prefix _lsqb_query_ are implementations of LDBC's labelled subgraph query benchmark,
and the queries with the prefix _job_ are implementations of the JOB benchmark.
Gremlin queries should use the suffix _.gremlin_, and Cypher queries the suffix _.cypher_.
The corresponding parameters (scale factor 1) for the LDBC queries are generated by the [LDBC official tools](http://github.com/ldbc/ldbc_snb_datagen).

### Building

@@ -29,36 +33,50 @@ Build benchmark program using Maven:
mvn clean package
```
All the binaries and queries will be packed into _target/benchmark-0.0.1-SNAPSHOT-dist.tar.gz_,
and you can use deploy the package to anywhere could connect to the gremlin endpoint.
and you can deploy the package anywhere that can connect to the endpoint (which should be provided in interactive-benchmark.properties).

### Running the benchmark

```bash
cd target
tar -xvf gaia-benchmark-0.0.1-SNAPSHOT-dist.tar.gz
cd gaia-benchmark-0.0.1-SNAPSHOT
vim config/interactive-benchmark.properties # specify the gremlin endpoint of your server and modify running configurations
chmod +x ./scripts/benchmark.sh
./scripts/benchmark.sh # run the benchmark program
./bin/bench.sh # run the benchmark program with the provided properties
```
With the example configuration file ``example/job_benchmark.properties``, which compares GraphScope-GIE and KuzuDB while executing the JOB Benchmark, the results are as follows:
```
Start to benchmark system: GIE
QueryName[13a], Parameter[{}], ResultCount[1], ExecuteTimeMS[3638].
QueryName[32a], Parameter[{}], ResultCount[1], ExecuteTimeMS[266].
QueryName[9a], Parameter[{}], ResultCount[1], ExecuteTimeMS[3669].
QueryName[5c], Parameter[{}], ResultCount[1], ExecuteTimeMS[8603].
QueryName[3a], Parameter[{}], ResultCount[1], ExecuteTimeMS[613].
...
System: GIE; query count: 35; execute time(ms): xxx qps: xxx
Benchmark reports numbers as following:
Start to benchmark system: KuzuDb
QueryName[13a], Parameter[{}], ResultCount[1], ExecuteTimeMS[7068].
QueryName[32a], Parameter[{}], ResultCount[1], ExecuteTimeMS[253].
QueryName[9a], Parameter[{}], ResultCount[1], ExecuteTimeMS[5122].
QueryName[5c], Parameter[{}], ResultCount[1], ExecuteTimeMS[13623].
QueryName[3a], Parameter[{}], ResultCount[1], ExecuteTimeMS[4676].
...
System: KuzuDB; query count: 35; execute time(ms): xxx qps: xxx
```
QueryName[LDBC_QUERY_1], Parameter[{firstName=John, personId=17592186223433}], ResultCount[87], ExecuteTimeMS[ 1266 ].
QueryName[LDBC_QUERY_12], Parameter[{tagClassName=Judge, personId=19791209469071}], ResultCount[0], ExecuteTimeMS[ 259 ].
QueryName[LDBC_QUERY_11], Parameter[{workFromYear=2001, personId=32985348901156, countryName=Bolivia}], ResultCount[0], ExecuteTimeMS[ 60 ].
QueryName[LDBC_QUERY_9], Parameter[{personId=10995116420051, maxDate=20121128080000000}], ResultCount[20], ExecuteTimeMS[ 55755 ].
QueryName[LDBC_QUERY_8], Parameter[{personId=67523}], ResultCount[20], ExecuteTimeMS[ 148 ].
QueryName[LDBC_QUERY_7], Parameter[{personId=26388279199350}], ResultCount[0], ExecuteTimeMS[ 10 ].
QueryName[LDBC_QUERY_6], Parameter[{personId=26388279148519, tagName=Vallabhbhai_Patel}], ResultCount[0], ExecuteTimeMS[ 12837 ].
QueryName[LDBC_QUERY_5], Parameter[{minDate=20120814080000000, personId=2199023436754}], ResultCount[0], ExecuteTimeMS[ 11268 ].
QueryName[LDBC_QUERY_3], Parameter[{durationDays=30, endDate=20110701080000000, countryXName=Mongolia, countryYName=Namibia, personId=8796093204429, startDate=20110601080000000}], ResultCount[20], ExecuteTimeMS[ 21474 ].
QueryName[LDBC_QUERY_2], Parameter[{personId=28587302394490, maxDate=20121128080000000}], ResultCount[20], ExecuteTimeMS[ 331 ].
query count: 10; execute time(ms): ...; qps: ...

### Collecting the results

```bash
./bin/collect.sh # run the result collection program to collect the results and generate a performance comparison table
```
Based on the benchmark results, the collected data and the final performance comparison table are as follows:

And the comparison result after collection is as follows:
| QueryName | GIE Avg | GIE P50 | GIE P90 | GIE P95 | GIE P99 | GIE Count | KuzuDb Avg | KuzuDb P50 | KuzuDb P90 | KuzuDb P95 | KuzuDb P99 | KuzuDb Count |
| --------- | --------- | --------- | --------- | --------- | --------- | --------- | --------- | --------- | --------- | --------- | --------- | --------- |
| 3a | 613.00 | 613 | 613 | 613 | 613 | 1 | 4676.00 | 4676 | 4676 | 4676 | 4676 | 1 |
| 5c | 8603.00 | 8603 | 8603 | 8603 | 8603 | 1 | 13623.00 | 13623 | 13623 | 13623 | 13623 | 1 |
| 9a | 3669.00 | 3669 | 3669 | 3669 | 3669 | 1 | 5122.00 | 5122 | 5122 | 5122 | 5122 | 1 |
| 13a | 3638.00 | 3638 | 3638 | 3638 | 3638 | 1 | 7068.00 | 7068 | 7068 | 7068 | 7068 | 1 |
| 32a | 266.00 | 266 | 266 | 266 | 266 | 1 | 253.00 | 253 | 253 | 253 | 253 | 1 |

### User-defined Benchmarking Queries
Users can add their own benchmarking queries to [queries](queries) and add substitution parameters of queries to [substitution_parameters](data/substitution_parameters).
Note that the file name of user-defined query templates should follow the prefix _custom_query_ or _custom_constant_query_. The difference between custom_query and
custom_constant_query is that the latter has no corresponding parameters.
Note that the file name of user-defined query templates should follow the prefix _custom_query_ or _custom_constant_query_. The difference between custom_query and custom_constant_query is that the latter has no corresponding parameters.
