ktym edited this page May 19, 2014 · 11 revisions

RDF model

  • Improved the INSDC-to-RDF converter to use SIO and nicer URIs.

Notes on Turtle: http://www.w3.org/TR/turtle/

  • D.4 Changes from January 2008 Team Submission to First Public Working Draft
    • Adopted SPARQL's syntax for prefixed names (see editor's draft):
      • '.'s in names in all positions of a local name apart from the first or last, e.g. ex:first.name.
      • digits in the first character of the PN_LOCAL lexical token, e.g. ex:7tm.
  • 6.4 Escape Sequences
    • reserved character escape sequences consist of a '\' followed by one of ~.-!$&'()*+,;=/?#@%_ and represent the character to the right of the '\'.
[141]  	PNAME_LN	  ::=  	PNAME_NS PN_LOCAL
[169]  	PN_LOCAL	  ::=  	(PN_CHARS_U | ':' | [0-9] | PLX ) ((PN_CHARS | '.' | ':' | PLX)* (PN_CHARS | ':' | PLX) )?
[167]  	PN_CHARS	  ::=  	PN_CHARS_U | '-' | [0-9] | #x00B7 | [#x0300-#x036F] | [#x203F-#x2040]
[173]  	PN_LOCAL_ESC	  ::=  	'\' ( '_' | '~' | '.' | '-' | '!' | '$' | '&' | "'" | '(' | ')' | '*' | '+' | ',' | ';' | '=' | '/' | '?' | '#' | '@' | '%' )
[165]  	PN_CHARS_U	  ::=  	PN_CHARS_BASE | '_'
[170]  	PLX	  ::=  	PERCENT | PN_LOCAL_ESC
[140]  	PNAME_NS	  ::=  	PN_PREFIX? ':'
[168]  	PN_PREFIX	  ::=  	PN_CHARS_BASE ((PN_CHARS|'.')* PN_CHARS)?
[164]  	PN_CHARS_BASE	  ::=  	[A-Z] | [a-z] | [#x00C0-#x00D6] | [#x00D8-#x00F6] | [#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
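
To check whether a candidate local name for a nicer URI is legal in Turtle, the productions above can be transcribed into a regular expression. The following is a minimal sketch, not part of the converter: the function name `is_valid_local_name` is hypothetical, and the `PERCENT` production (`'%' HEX HEX`), which is not listed above, is filled in from the Turtle specification.

```python
import re

# Character classes transcribed from the productions listed above.
PN_CHARS_BASE = (
    r"A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D"
    r"\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
    r"\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
PN_CHARS_U = PN_CHARS_BASE + "_"
PN_CHARS = PN_CHARS_U + r"\-0-9\u00B7\u0300-\u036F\u203F-\u2040"

# PLX ::= PERCENT | PN_LOCAL_ESC
# Assumption: PERCENT ::= '%' HEX HEX, taken from the Turtle spec.
PLX = r"(?:%[0-9A-Fa-f]{2}|\\[_~.\-!$&'()*+,;=/?#@%])"

# PN_LOCAL: '.' is allowed anywhere except the first and last position.
PN_LOCAL = (
    f"(?:[{PN_CHARS_U}:0-9]|{PLX})"
    f"(?:(?:[{PN_CHARS}.:]|{PLX})*(?:[{PN_CHARS}:]|{PLX}))?"
)
PN_LOCAL_RE = re.compile(PN_LOCAL)


def is_valid_local_name(name):
    """Return True if `name` matches the PN_LOCAL production."""
    return PN_LOCAL_RE.fullmatch(name) is not None
```

With this sketch, `first.name` and `7tm` are accepted, `name.` is rejected because of the trailing dot, and a slash has to be written as `\/` or `%2F`.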

Ensembl

  • Core Ensembl human genes/transcripts were converted to RDF. (Conversion took only 20 min and produced 26 million triples, 3 GB.)
  • The generated RDF was loaded into Virtuoso for SPARQL query testing. (Loading took 1 min.)
  • Discussed how to deal with variations.
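
A simple first SPARQL test after loading is to count the triples and compare the result with the number reported by the converter. The sketch below only builds the request URL for such a query using the SPARQL Protocol's `query` parameter. The endpoint address is an assumption (a local Virtuoso instance on its usual port), and `sparql_get_url` is a hypothetical helper, not part of any tool mentioned here.

```python
from urllib.parse import urlencode

# Assumption: a local Virtuoso instance exposing its SPARQL endpoint here.
ENDPOINT = "http://localhost:8890/sparql"

# Counts every triple visible to the endpoint; no Ensembl-specific
# vocabulary is needed for this sanity check.
COUNT_QUERY = "SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }"


def sparql_get_url(endpoint, query):
    """Build a SPARQL Protocol GET URL for the given query."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, COUNT_QUERY)
```

The resulting URL can be fetched with any HTTP client. If other graphs are loaded into the same store, the count will include them, so restricting the query to one graph may be needed.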

UniProt

  • UniProt uses Ensembl identifiers for genes.

Identifiers.org

  • Content negotiation
  • Collection URIs
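
Content negotiation means the client states the representation it wants in the HTTP `Accept` header, and the server answers with that format if it can. A minimal sketch of such a request follows. The Ensembl gene URI and the `application/rdf+xml` media type are illustrative assumptions, and `rdf_request` is a hypothetical helper. Whether a given Identifiers.org URI actually returns RDF has to be checked against the service.

```python
from urllib.request import Request


def rdf_request(uri, media_type="application/rdf+xml"):
    """Prepare a GET request that asks for an RDF representation."""
    return Request(uri, headers={"Accept": media_type})


# Illustrative URI only; nothing is sent until the request is opened.
req = rdf_request("http://identifiers.org/ensembl/ENSG00000139618")
```

Passing `req` to `urllib.request.urlopen` would send the request. The `Content-Type` header of the response shows which representation the server chose.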

GFVO

  • Started rewriting BioInterchange to use GFVO and SIO.
  • Corrected the GFVO ontology.
  • Created a script to collect statistics on RDF data.
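
One basic statistic for RDF data is how often each predicate is used. The sketch below is not the script mentioned above. It is a minimal illustration that assumes the data is serialized as N-Triples (one triple per line), and the name `predicate_counts` is hypothetical.

```python
from collections import Counter


def predicate_counts(lines):
    """Count predicate usage in an iterable of N-Triples lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Subject and predicate never contain whitespace in N-Triples,
        # so the second whitespace-separated token is the predicate.
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # not a complete triple
        counts[parts[1]] += 1
    return counts


# Made-up sample data using example.org URIs.
sample = [
    "<http://example.org/g1> <http://example.org/label> \"gene one\" .",
    "<http://example.org/g2> <http://example.org/label> \"gene two\" .",
    "# a comment line",
    "<http://example.org/g1> <http://example.org/partOf> <http://example.org/c1> .",
]
stats = predicate_counts(sample)
```

Because the function takes any iterable of lines, an open file handle can be passed directly, which keeps memory use flat for multi-gigabyte dumps.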