Concerning the statistics 6.6.2.6 #121

yayamamo · 2015-03-10T16:23:43Z

Hi,
The spec in 6.6.2.6 defines the numbers of distinct subjects and distinct objects with respect to a predicate.
This shows one aspect of the triples connecting two classes, but another aspect cannot be obtained from it:
the number of distinct triples connecting the two classes.
More precisely, this is the number of unique triples that connect typed subjects to typed objects, where the subjects and objects each belong to a specified class.

One extreme example is 100 different subjects that all have the same property with the same object.
The former reports 100 distinctSubjects and 1 distinctObjects, while the latter reports 100 triples.
Another example is 10 different subjects that each have the same set of 10 objects for the property.
The former reports 10 distinctSubjects and 10 distinctObjects, while the latter reports 100 triples.

I think the latter statistic is also useful for understanding the characteristics of the target dataset, and I believe it was in the document before, wasn't it?

micheldumontier · 2015-03-10T18:52:37Z

The intent of 6.6.2.6 is to capture the total number of triples between subjects and objects of a specified type, e.g. 100 distinct subjects may be connected to 10 distinct objects via 100 triples.
One way of handling the total number of triples between subjects and objects of a certain type would be to simply declare a property partition on "rdfs:property".

yayamamo · 2015-03-11T15:29:41Z

I do not fully understand what it means to declare a property partition on "rdfs:property".
My previous comment may have been vague. The statistic I would like to know is the number of triples with a certain predicate that connect specific classes (i.e., :c1 and :c2 in the example below). If the predicate connects only these classes, this number is identical to the total number of triples with that predicate.

SELECT ?p (COUNT(?p) AS ?rc)
WHERE {
  GRAPH :graph {
    ?s ?p ?o .
    ?s a :c1 .
    ?o a :c2 .
  }
}
GROUP BY ?p

micheldumontier · 2015-03-11T16:30:15Z

6.6.2.6 does just this, does it not?

http://htmlpreview.github.io/?https://github.com/indiedotkim/HCLSDatasetDescriptions/blob/master/Overview.html#s6_6

yayamamo · 2015-03-11T16:41:04Z

I don't think so.
The difference is what I wrote at the top of this issue.
The former is 6.6.2.6, and the latter is the query I wrote just above.

count(distinct ?s) = 100, count(distinct ?o) = 1, count(?p) = 100

One extreme example is 100 different subjects that all have the same property with the same object.
The former reports 100 distinctSubjects and 1 distinctObjects, while the latter reports 100 triples.

count(distinct ?s) = 10, count(distinct ?o) = 10, count(?p) = 100

Another example is 10 different subjects that each have the same set of 10 objects for the property.
The former reports 10 distinctSubjects and 10 distinctObjects, while the latter reports 100 triples.
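
To make the two cases concrete, here is a minimal plain-Python sketch that rebuilds them as toy data and computes the three numbers. It is only an illustration: the node names (`s0`, `o0`, `p`) and the `stats` helper are made up for this comment and are not part of the spec or of any dataset.

```python
# Toy triples as (subject, predicate, object) tuples; all names are hypothetical.

def stats(triples):
    """Return (distinct subjects, distinct objects, triple count)."""
    unique = set(triples)  # an RDF graph is a set, so duplicate triples collapse
    subjects = {s for s, _, _ in unique}
    objects = {o for _, _, o in unique}
    return len(subjects), len(objects), len(unique)

# Case 1: 100 subjects all point at the same object.
case1 = [(f"s{i}", "p", "o0") for i in range(100)]

# Case 2: each of 10 subjects points at the same 10 objects.
case2 = [(f"s{i}", "p", f"o{j}") for i in range(10) for j in range(10)]

print(stats(case1))  # (100, 1, 100)
print(stats(case2))  # (10, 10, 100)
```

The distinct counts alone do not determine the triple count: 10 distinct subjects and 10 distinct objects are compatible with anything from 10 to 100 triples, which is why the extra number is informative.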

micheldumontier · 2015-03-11T17:23:02Z

So 6.6.2.2 covers properties and their number of triples. The query below is not, however, limited to the subject and object being of some arbitrary type - we imagine that this is necessarily true.

SELECT ?p (COUNT(?p) AS ?triples)
{ ?s ?p ?o }
GROUP BY ?p

yayamamo · 2015-03-12T16:30:26Z

That is to say, would 6.6.2.6 be as follows?

:rdfdataset
    void:propertyPartition [
        void:property <property-uri> ;
        void:triples "###"^^xsd:integer ;
        void:classPartition [
            void:class <subject-class-uri> ;
            void:distinctSubjects "###"^^xsd:integer ;
        ];
        void-ext:objectClassPartition [
            void:class <object-class-uri> ;
            void:distinctObjects "###"^^xsd:integer ;
        ];
    ] .

SELECT (COUNT(DISTINCT ?s) AS ?scount) ?stype ?p (COUNT(?p) AS ?pcount) ?otype (COUNT(DISTINCT ?o) AS ?ocount)
WHERE {
  ?s ?p ?o .
  ?s a ?stype .
  ?o a ?otype .
}
GROUP BY ?p ?stype ?otype
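
As a rough cross-check of what the grouped query should return, the same grouping can be sketched in plain Python over in-memory data. This is an assumption-laden illustration, not an implementation of the spec: `class_pair_stats`, the `types` mapping and all node and class names are hypothetical, and the rdf:type statements are kept in a separate mapping rather than in the triple list.

```python
from collections import defaultdict

def class_pair_stats(triples, types):
    """Group triples by (predicate, subject class, object class).

    `triples` is an iterable of (s, p, o) tuples and `types` maps a node to
    the set of classes it is typed with. Both are hypothetical stand-ins for
    the dataset. Returns {(p, stype, otype): (scount, pcount, ocount)}.
    """
    groups = defaultdict(
        lambda: {"subjects": set(), "objects": set(), "triples": set()}
    )
    for s, p, o in set(triples):
        # Untyped subjects or objects (e.g. literals) contribute nothing,
        # just as they would fail to match "?s a ?stype" / "?o a ?otype".
        for stype in types.get(s, ()):
            for otype in types.get(o, ()):
                group = groups[(p, stype, otype)]
                group["subjects"].add(s)
                group["objects"].add(o)
                group["triples"].add((s, p, o))
    return {
        key: (len(g["subjects"]), len(g["triples"]), len(g["objects"]))
        for key, g in groups.items()
    }

# 10 subjects of class c1, each linked to the same 10 objects of class c2,
# plus one extra link from s0 to a node of class c3.
triples = [(f"s{i}", "p", f"o{j}") for i in range(10) for j in range(10)]
triples.append(("s0", "p", "x"))
types = {f"s{i}": {"c1"} for i in range(10)}
types.update({f"o{j}": {"c2"} for j in range(10)})
types["x"] = {"c3"}

result = class_pair_stats(triples, types)
print(result[("p", "c1", "c2")])  # (10, 100, 10)
print(result[("p", "c1", "c3")])  # (1, 1, 1)
```

Note that a node typed with several classes is counted once in each matching group, so the per-group triple counts can add up to more than the total number of triples for the predicate.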

micheldumontier · 2015-03-12T16:41:30Z

yes that's right
