DQP Distributed Query Process

Erwan Demairy edited this page Mar 8, 2016 · 6 revisions

Introduction

The purpose of this extension is to split a SPARQL query and execute its subparts on several SPARQL endpoints.

Comparison between a local request and a distributed request

Local Request

import fr.inria.edelweiss.kgram.core.Mappings;
import fr.inria.edelweiss.kgraph.core.Graph;
import fr.inria.edelweiss.kgraph.query.QueryProcess;
import fr.inria.edelweiss.kgtool.load.Load;
import java.util.Collection;

Collection<String> queries = ...;
// Create an empty local graph and a query processor bound to it.
Graph graph = Graph.create();
QueryProcess exec = QueryProcess.create(graph);
// Load the data to query into the local graph.
Load ld = Load.create(graph);
ld.load( data_uri );

// Evaluate each query against the local graph.
for (String q : queries) {
  Mappings results = exec.query(q);
}

Distributed Request

The main difference lies in the exec object, which is a QueryProcessDQP instead of a QueryProcess. The user then registers the remote endpoints to query with the addRemote method.

import fr.inria.acacia.corese.exceptions.EngineException;
import fr.inria.edelweiss.kgdqp.core.Messages;
import fr.inria.edelweiss.kgdqp.core.ProviderImplCostMonitoring;
import fr.inria.edelweiss.kgdqp.core.QueryProcessDQP;
import fr.inria.edelweiss.kgdqp.core.Util;
import fr.inria.edelweiss.kgdqp.core.WSImplem;
import fr.inria.edelweiss.kgram.core.Mappings;
import fr.inria.edelweiss.kgraph.core.Graph;
import fr.inria.edelweiss.kgraph.query.QueryProcess;
import fr.inria.edelweiss.kgtool.load.Load;
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.logging.Level;
import org.apache.commons.lang.time.StopWatch;
import org.apache.log4j.Logger;
	
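// sProv and queries are assumed to be defined elsewhere; given the imports above,
// sProv is presumably a ProviderImplCostMonitoring instance.
Graph graph = Graph.create(false);
QueryProcessDQP execDQP = QueryProcessDQP.create(graph, sProv, true);
execDQP.setGroupingEnabled(true); // @ToBeDocumented
// Register each remote SPARQL endpoint that should take part in the federation.
execDQP.addRemote(new URL(...), WSImplem.REST);
// Run each query through the distributed query processor.
for (String query : queries) {
  Mappings map = execDQP.query(query);
}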

REST API of Corese Server

/dqp/configureDatasources (POST)

Adds a new SPARQL endpoint URL to the federation engine. The "endpointUrl" parameter contains the URL of the endpoint to add.
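
As an illustration, the request for this service could be built as in the sketch below, which uses only the standard java.net.http API. Several points are assumptions rather than documented facts: the server address http://localhost:8080, the absence of any additional path prefix, and the choice to pass "endpointUrl" as a form-encoded body. The class name DqpAddEndpoint and the example endpoint are hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds (but does not send) a POST request
// for /dqp/configureDatasources.
final class DqpAddEndpoint {

    // Encodes the "endpointUrl" parameter as a form body.
    // Assumption: the server reads the parameter from a form-encoded body.
    static String formBody(String endpointUrl) {
        return "endpointUrl=" + URLEncoder.encode(endpointUrl, StandardCharsets.UTF_8);
    }

    // baseUrl is an assumed server address, e.g. "http://localhost:8080".
    static HttpRequest build(String baseUrl, String endpointUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/dqp/configureDatasources"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(endpointUrl)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("http://localhost:8080", "http://example.org/sparql");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(formBody("http://example.org/sparql"));
    }
}
```

The request can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). If the server expects the parameter in the query string instead, only formBody and the builder call need to change.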

/dqp/sparqlprov (GET)

Web service for federated query processing with provenance-documented results. It takes the same parameters as /dqp/sparql.

/dqp/reset (POST)

Resets the local KGRAM graph.

/dqp/sparql (GET)

Submits a query for processing. The parameters are:

  • "query" (String): the SPARQL query;
  • "slicing" (Integer): threshold for the size of intermediate blocks. When the mappings exceed this threshold, they are sent to the Corese Server;
  • "tpgrouping" (Boolean): enables or disables the grouping of triple patterns into SERVICE clauses; false by default.
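
The three parameters above can be combined into a request URL as in the following sketch. It assumes that the parameters are passed in the query string of the GET request and that the server is reachable at http://localhost:8080 with no extra path prefix; neither is stated on this page. The class name DqpSparqlRequest and the sample query are hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds the GET URI for /dqp/sparql.
final class DqpSparqlRequest {

    // baseUrl is an assumed server address, e.g. "http://localhost:8080".
    // The SPARQL query is URL-encoded so that characters such as
    // '?', '{' and spaces survive in the query string.
    static URI build(String baseUrl, String query, int slicing, boolean tpGrouping) {
        String encodedQuery = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/dqp/sparql"
                + "?query=" + encodedQuery
                + "&slicing=" + slicing
                + "&tpgrouping=" + tpGrouping);
    }

    public static void main(String[] args) {
        URI uri = build("http://localhost:8080",
                "SELECT * WHERE { ?s ?p ?o } LIMIT 10", 100, true);
        System.out.println(uri);
    }
}
```

Since /dqp/sparqlprov is described as taking the same parameters, the same construction should apply to it with a different path.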

/dqp/getCost (GET)

Returns the currently monitored cost of distributed query processing.

/dqp/testDatasources (POST)

Checks whether a SPARQL endpoint is available. The "endpointUrl" parameter indicates the endpoint to test.