Add a new Mockpath Dataset
The dataset will contain a list of records with the fields
itemA, itemB, nodes_on_path, edges_on_path, metapathinstances
describing the meta paths between two sets of nodes.
We need to define the two input sets A and B.
For this we require two lists of node ids, which correspond to nodes in the Freebase graph.
They can easily be obtained by querying manually. Example for programming languages:
MATCH (n) WHERE n.name = "PYTHON" RETURN ID(n);
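If many ids are needed, the lookup above can be generated for a whole list of names at once. The following is a minimal sketch, not part of the framework: the function name build_id_lookup_query and the returned column aliases are assumptions made for illustration.

```python
import json


def build_id_lookup_query(names):
    """Return a Cypher query listing the internal id of every node whose name is in `names`.

    Hypothetical helper: it only builds the query text, it does not talk to Neo4j.
    """
    # json.dumps produces a double-quoted, escaped list literal that Cypher accepts.
    name_list = json.dumps(list(names))
    return f"MATCH (n) WHERE n.name IN {name_list} RETURN n.name AS name, ID(n) AS id;"


print(build_id_lookup_query(["PYTHON", "JAVA"]))
```

The printed query can be pasted into cypher-shell or the Neo4j browser in the same way as the single-name lookup.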
Once these ids are collected, insert them into the two id lists of a Cypher query like the following, and replace
/computer/programming_language/
with your domain. Note that the 3
in [*1..3] is the maximum allowed meta path length; you might want to change this.
MATCH p = (a)-[*1..3]-(b)
WHERE
id(a) IN [33260702,23580293,70267249]
and id(b) IN [16521328,31640738,60021106]
and all(x in relationships(p) WHERE type(x)=~'/computer/programming_language/.*')
and all(x in nodes(p)[1..(size(nodes(p))-1)] WHERE (id(x) <> id(a) and id(x) <> id(b)))
RETURN a.name as a_set, b.name as b_set, extract(n IN nodes(p)| labels(n)) AS nodes_types, extract(r IN relationships(p)| type(r)) AS relationship_types, count(*) as path_count;
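The three things that change between datasets are the id lists, the domain prefix, and the maximum path length, so the query can be filled in from a small template. This is a sketch under assumptions: build_path_query and its parameter names are hypothetical, and the query text is copied from the example above.

```python
def build_path_query(ids_a, ids_b, domain, max_length=3):
    """Fill the meta path query template with two id lists, a domain prefix and a length limit.

    Hypothetical helper: it returns the query text to be saved as query.cypher.
    """
    if max_length < 1:
        raise ValueError("max_length must be at least 1")
    # int() rejects anything that is not a plain node id before it reaches the query.
    list_a = ",".join(str(int(i)) for i in ids_a)
    list_b = ",".join(str(int(i)) for i in ids_b)
    return (
        f"MATCH p = (a)-[*1..{max_length}]-(b)\n"
        "WHERE\n"
        f"id(a) IN [{list_a}]\n"
        f"and id(b) IN [{list_b}]\n"
        f"and all(x in relationships(p) WHERE type(x)=~'{domain}.*')\n"
        "and all(x in nodes(p)[1..(size(nodes(p))-1)] "
        "WHERE (id(x) <> id(a) and id(x) <> id(b)))\n"
        "RETURN a.name as a_set, b.name as b_set, "
        "extract(n IN nodes(p)| labels(n)) AS nodes_types, "
        "extract(r IN relationships(p)| type(r)) AS relationship_types, "
        "count(*) as path_count;\n"
    )


query = build_path_query(
    [33260702, 23580293, 70267249],
    [16521328, 31640738, 60021106],
    "/computer/programming_language/",
)
print(query)
```

Writing the returned string to a file gives the query.cypher used in the next step.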
Save this query as query.cypher in your own folder under /home/bp/dataset-extraction
on watson.
You can execute it (best in a screen session) with
cat query.cypher | /usr/bin/cypher-shell -u neo4j -a bolt://localhost:PORT_NUMBER --format plain > output.csv
Change PORT_NUMBER to the bolt port number of your database.
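The same invocation can be scripted from Python, which avoids retyping the port number. This is a minimal sketch, assuming cypher-shell is installed at the path shown above; the function names are hypothetical, and 7687 is only Neo4j's default bolt port, so your database may use a different one.

```python
import subprocess


def cypher_shell_command(port, user="neo4j", binary="/usr/bin/cypher-shell"):
    """Build the argument list for the cypher-shell call shown above (hypothetical helper)."""
    return [binary, "-u", user, "-a", f"bolt://localhost:{port}", "--format", "plain"]


def run_query(query_path, output_path, port):
    """Feed query_path to cypher-shell on stdin and write its output to output_path."""
    with open(query_path) as query_file, open(output_path, "w") as output_file:
        # check=True raises CalledProcessError if cypher-shell exits with an error.
        subprocess.run(
            cypher_shell_command(port),
            stdin=query_file,
            stdout=output_file,
            check=True,
        )


# 7687 is Neo4j's default bolt port; replace it with the port of your database.
print(" ".join(cypher_shell_command(7687)))
```

Calling run_query("query.cypher", "output.csv", 7687) inside a screen session would then be equivalent to the shell pipeline.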
Once the query has finished, copy the output file into the Python framework so that it is accessible in
32de-python/tests/data
You still need to tell Python how to find it. In util/meta_path_loader_dispatcher.py,
add your dataset to the class MetaPathLoaderDispatcher,
in the variables available_datasets
(name and description) and dataset_to_loader
(relative path to the output file you just created).
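As an illustration of the registration step, the sketch below shows one way the two variables might look. The actual layout of MetaPathLoaderDispatcher is not shown here, so everything in this block is an assumption: the class name, the dictionary keys, the dataset name, and the relative path are all hypothetical and should be adapted to what you find in util/meta_path_loader_dispatcher.py.

```python
class MetaPathLoaderDispatcherSketch:
    """Hypothetical stand-in; the real class in the framework may be structured differently."""

    # Assumed shape: one entry per dataset with its name and description.
    available_datasets = [
        {
            "name": "Programming Languages",
            "description": "Meta paths between two sets of programming languages",
        },
    ]

    # Assumed shape: dataset name mapped to the relative path of the output file.
    dataset_to_loader = {
        "Programming Languages": "tests/data/output.csv",
    }


def unregistered_datasets(dispatcher):
    """Return names listed in available_datasets that have no entry in dataset_to_loader."""
    return [
        dataset["name"]
        for dataset in dispatcher.available_datasets
        if dataset["name"] not in dispatcher.dataset_to_loader
    ]


print(unregistered_datasets(MetaPathLoaderDispatcherSketch))
```

A check like unregistered_datasets can catch the common mistake of adding a dataset to one variable but not the other.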
That's it!