20140518
skwsm edited this page May 19, 2014 · 11 revisions
- Improved INSDC to RDF converter using SIO and nicer URIs.
Note on Turtle: http://www.w3.org/TR/turtle/
- D.4 Changes from January 2008 Team Submission to First Public Working Draft
- Adopted SPARQL's syntax for prefixed names (see editor's draft):
- '.'s in names in all positions of a local name apart from the first or last, e.g. ex:first.name.
- digits in the first character of the PN_LOCAL lexical token, e.g. ex:7tm.
- 6.4 Escape Sequences
- reserved character escape sequences consist of a '\' followed by one of ~.-!$&'()*+,;=/?#@%_ and represent the character to the right of the '\'.
[141] PNAME_LN ::= PNAME_NS PN_LOCAL
[169] PN_LOCAL ::= (PN_CHARS_U | ':' | [0-9] | PLX ) ((PN_CHARS | '.' | ':' | PLX)* (PN_CHARS | ':' | PLX) )?
[167] PN_CHARS ::= PN_CHARS_U | '-' | [0-9] | #x00B7 | [#x0300-#x036F] | [#x203F-#x2040]
[165] PN_CHARS_U ::= PN_CHARS_BASE | '_'
[170] PLX ::= PERCENT | PN_LOCAL_ESC
[170s] PERCENT ::= '%' HEX HEX
[171s] HEX ::= [0-9] | [A-F] | [a-f]
[173] PN_LOCAL_ESC ::= '\' ( '_' | '~' | '.' | '-' | '!' | '$' | '&' | "'" | '(' | ')' | '*' | '+' | ',' | ';' | '=' | '/' | '?' | '#' | '@' | '%' )
[140] PNAME_NS ::= PN_PREFIX? ':'
[168] PN_PREFIX ::= PN_CHARS_BASE ((PN_CHARS|'.')* PN_CHARS)?
[164] PN_CHARS_BASE ::= [A-Z] | [a-z] | [#x00C0-#x00D6] | [#x00D8-#x00F6] | [#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
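-> In practice, apart from letters, digits and '_', only ':', '.' and '-' can appear unescaped in the local part ('.' not as the first or last character, '-' not as the first character); the other reserved characters listed in PN_LOCAL_ESC need a '\' escape.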
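The grammar above can be turned into a small helper that escapes an identifier for use as the local part of a prefixed name. The following is only a sketch under simplifying assumptions: it handles ASCII input only (the non-ASCII ranges of PN_CHARS_BASE and PN_CHARS are rejected rather than passed through), it does not try to recognise existing percent-encoded sequences, and the function name `escape_local_name` is made up for illustration.

```python
# Characters that PN_LOCAL_ESC allows after a backslash.
ESCAPABLE = set("_~.-!$&'()*+,;=/?#@%")


def escape_local_name(local):
    """Escape `local` for the local part of a Turtle prefixed name (ASCII-only sketch)."""
    out = []
    last = len(local) - 1
    for i, ch in enumerate(local):
        if (ch.isascii() and ch.isalnum()) or ch in "_:":
            # Letters, digits, '_' and ':' are allowed in every position.
            out.append(ch)
        elif ch == "-" and i > 0:
            # '-' is in PN_CHARS, so it is fine anywhere except the first position.
            out.append(ch)
        elif ch == "." and 0 < i < last:
            # '.' is allowed only between the first and last characters.
            out.append(ch)
        elif ch in ESCAPABLE:
            # Everything else in the PN_LOCAL_ESC set needs a backslash.
            out.append("\\" + ch)
        else:
            # Assumption of this sketch: give up and let the caller write a full <IRI>.
            raise ValueError("cannot express %r in a prefixed name" % ch)
    return "".join(out)


if __name__ == "__main__":
    for name in ["first.name", "7tm", "chr1:100-200", "a/b", "x."]:
        print(name, "->", escape_local_name(name))
```

Names such as `first.name`, `7tm` and `chr1:100-200` pass through unchanged, while `a/b` and a trailing '.' pick up a backslash. Raising an error for anything outside the escapable set is a design choice of this sketch; a converter could instead fall back to writing the full IRI in angle brackets.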
- Core Ensembl human genes/transcripts were converted to RDF. (Conversion took just 20 min; 26 million triples; 3 GB)
- The generated RDF was loaded into Virtuoso for SPARQL query testing. (Loading took 1 min)
- Discussed how to deal with variations.
- UniProt uses Ensembl identifiers for genes.
- Content negotiation
- Collection URIs
- Started rewriting BioInterchange to use GFVO and SIO.
- Corrected the GFVO ontology.
- Created a script to collect statistics of RDF data.
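The statistics script itself is not reproduced on this page. As a rough idea of what such a script might compute, here is a minimal sketch that assumes the data is available as N-Triples (one triple per line) and counts triples, distinct subjects and predicate usage. The function name `rdf_stats` and the example.org IRIs are hypothetical and not taken from the actual script.

```python
from collections import Counter


def rdf_stats(lines):
    """Collect simple counts from an iterable of N-Triples lines."""
    triples = 0
    subjects = set()
    predicates = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Subject and predicate never contain whitespace in N-Triples,
        # so splitting twice leaves the object (and final '.') in the rest.
        subject, predicate, _rest = line.split(None, 2)
        triples += 1
        subjects.add(subject)
        predicates[predicate] += 1
    return {"triples": triples, "subjects": len(subjects), "predicates": predicates}


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample = [
        '<http://example.org/gene/1> <http://example.org/label> "gene one" .',
        "<http://example.org/gene/1> <http://example.org/partOf> <http://example.org/chr/1> .",
        '<http://example.org/gene/2> <http://example.org/label> "gene two" .',
    ]
    stats = rdf_stats(sample)
    print(stats["triples"], stats["subjects"], stats["predicates"].most_common())
```

Because the function takes any iterable of lines, it can be fed an open file object directly, which keeps memory use modest for everything except the set of distinct subjects.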