Façade-based Data Access Benchmark

This folder provides a benchmark derived from GTFS-Madrid-Bench for evaluating Façade-based Data Access (FBDA) engines, such as SPARQL Anything.

The extension consists of:

- a set of query templates that translate the GTFS-Madrid-Bench queries and RML mappings into FBDA queries;
- a query executor which fires the queries and measures the performance of the FBDA engines under four experimental regimes:
  - In-memory execution over a complete materialised view (in-memory+complete);
  - In-memory execution optimised by a triple-filtering approach (in-memory+triple-filtering);
  - In-memory execution over a sliced materialised view and optimised by triple-filtering (sliced+triple-filtering);
  - On-disk execution optimised by triple-filtering (on-disk+triple-filtering).

More details can be found in this article.

Requirements for the use

Java 11 or later must be installed locally.

Using FBDA Benchmark

Generate data using GTFS-Madrid-Bench and move the result folder it generates into the experiments folder. At the moment, only the csv, json and xml formats are supported.
Generate the FBDA queries for the scales passed to GTFS-Madrid-Bench (e.g. 1, 10, 100):

./generate_queries.sh "1 10 100" "TMP_FOLDER" "xml csv json"

where:

TMP_FOLDER is the path to a temporary folder that will be used during the experiments
"xml csv json" are the formats passed to GTFS-Madrid-Bench

Download the executable jar file of the FBDA engine to evaluate (e.g. SPARQL Anything v0.9.0).
Run the queries:

./execute_queries.sh /path/to/fbda_engine.jar "1 10 100" "xml csv json" "/path/to/results" "TMP_FOLDER"

where:

"1 10 100" are the scales passed to GTFS-Madrid-Bench
"xml csv json" are the formats passed to GTFS-Madrid-Bench
"/path/to/results" is the path to a folder where the results of the execution of the queries (i.e. measures) will be stored
TMP_FOLDER is the path to a temporary folder that will be used during the experiments

Analysing the results

The execution of the queries generates two TSV files for each query executed on a given format, namely time_q<query_id>_<format>.tsv and mem_q<query_id>_<format>.tsv. These files trace the execution of the queries in terms of computational resources used by the engine (i.e. memory footprint, CPU and time).

The files are stored in the directory /path/to/results passed as argument of execute_queries.sh.
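
As a starting point for analysis, the sketch below pairs up the time and mem files found in the results directory, using the two file-name patterns described above. The function names (`parse_result_name`, `index_results`) are hypothetical helpers and not part of the benchmark, and the sketch assumes that the format part of the file name contains no underscore.

```python
import re
from pathlib import Path

# Matches time_q<query_id>_<format>.tsv and mem_q<query_id>_<format>.tsv.
# Assumption: <format> contains no underscore (e.g. csv, json, xml).
RESULT_NAME = re.compile(
    r"^(?P<kind>time|mem)_q(?P<query>.+)_(?P<format>[^_]+)\.tsv$"
)


def parse_result_name(file_name):
    """Return (kind, query_id, format) for a result file name, or None."""
    match = RESULT_NAME.match(file_name)
    if match is None:
        return None
    return match["kind"], match["query"], match["format"]


def index_results(results_dir):
    """Map (query_id, format) to {"time": Path, "mem": Path} for one folder."""
    index = {}
    for path in sorted(Path(results_dir).glob("*.tsv")):
        parsed = parse_result_name(path.name)
        if parsed is None:
            continue  # not a benchmark result file
        kind, query_id, fmt = parsed
        index.setdefault((query_id, fmt), {})[kind] = path
    return index
```

Calling `index_results("/path/to/results")` returns a dictionary keyed by query identifier and format, so that the time and mem files of the same query can be loaded together.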

The time_q<query_id>_<format>.tsv file keeps track of the execution time of the queries on a given format. The table has the following structure:

Query	InputSize	Strategy	Slice	Ondisk	MemoryLimit	Run	Time	Unit	Status	STDErr
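
A minimal sketch of how the time table could be summarised is given below. It averages the Time column over the runs of each (InputSize, Strategy, Slice, Ondisk) combination. It rests on assumptions that are not stated in this README: that the Time column holds a plain number, and that rows whose Time cannot be parsed (for instance failed runs) should simply be skipped. Whether the file starts with a header row is also an assumption, so it is exposed as a parameter. `mean_times` is a hypothetical helper name.

```python
import csv
from collections import defaultdict
from statistics import mean

# Column names as listed in the table structure above.
TIME_COLUMNS = ["Query", "InputSize", "Strategy", "Slice", "Ondisk",
                "MemoryLimit", "Run", "Time", "Unit", "Status", "STDErr"]
GROUP_COLUMNS = ("InputSize", "Strategy", "Slice", "Ondisk")


def mean_times(tsv_path, has_header=True):
    """Average the Time column per (InputSize, Strategy, Slice, Ondisk)."""
    samples = defaultdict(list)
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(
            handle,
            fieldnames=None if has_header else TIME_COLUMNS,
            delimiter="\t",
            quoting=csv.QUOTE_NONE,  # treat quotes in STDErr as plain text
        )
        for row in reader:
            try:
                elapsed = float(row["Time"])
            except (KeyError, TypeError, ValueError):
                continue  # assumption: unparsable times are skipped
            key = tuple(row.get(column) for column in GROUP_COLUMNS)
            samples[key].append(elapsed)
    return {key: mean(values) for key, values in samples.items()}
```

The averages are expressed in whatever unit the Unit column reports; the sketch does not convert between units.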

The mem_q<query_id>_<format>.tsv file keeps track of the engine's CPU and memory usage during the evaluation of the queries. The table has the following structure:

Query	InputSize	Strategy	Slice	Ondisk	MemoryLimit	Run	PID	%cpu	%mem	vsz	rss
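
The mem table can be summarised in a similar way. The sketch below reports the peak rss and peak %cpu observed for each run, assuming that a run contributes several sampled rows to the file and that both columns hold plain numbers; neither assumption is confirmed by this README. Values are kept in the units written by the executor. `peak_usage` is a hypothetical helper name.

```python
import csv

# Column names as listed in the table structure above.
MEM_COLUMNS = ["Query", "InputSize", "Strategy", "Slice", "Ondisk",
               "MemoryLimit", "Run", "PID", "%cpu", "%mem", "vsz", "rss"]
RUN_COLUMNS = ("InputSize", "Strategy", "Slice", "Ondisk", "Run")


def peak_usage(tsv_path, has_header=True):
    """Return the peak rss and %cpu per (InputSize, Strategy, Slice, Ondisk, Run)."""
    peaks = {}
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(
            handle,
            fieldnames=None if has_header else MEM_COLUMNS,
            delimiter="\t",
            quoting=csv.QUOTE_NONE,
        )
        for row in reader:
            try:
                rss = float(row["rss"])
                cpu = float(row["%cpu"])
            except (KeyError, TypeError, ValueError):
                continue  # assumption: malformed samples are skipped
            key = tuple(row.get(column) for column in RUN_COLUMNS)
            previous = peaks.get(key, {"rss": rss, "%cpu": cpu})
            peaks[key] = {
                "rss": max(previous["rss"], rss),
                "%cpu": max(previous["%cpu"], cpu),
            }
    return peaks
```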
