Hirokazu Chiba edited this page Jun 16, 2023 · 1 revision

Alternative ontologies

  • tio:Ncbigene
  • obo:RO_0002162 (in taxon)
$ diff Homo_sapiens.gene_info.ttl Homo_sapiens.gene_info_2.ttl
2a3
> @prefix obo: <http://purl.obolibrary.org/obo/> .
3a5
> @prefix tio: <http://togoid.dbcls.jp/ontology#> .
13c15,16
< ncbigene:1 a nuc:Gene ;
---
> ncbigene:1 a tio:Ncbigene ;
>     obo:RO_0002162 taxid:9606 ; # in taxon
31d33
<     :taxid taxid:9606 ;
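
The diff above replaces `nuc:Gene` with `tio:Ncbigene` and the `:taxid` property with `obo:RO_0002162`. A minimal sketch of how a generator might emit those two lines follows. The helper name `subjectLines` is hypothetical and is not taken from `make_rdf`; only the output format comes from the diff.

```javascript
// Hypothetical helper (not part of make_rdf): builds the opening lines of a
// gene record using tio:Ncbigene and obo:RO_0002162, as shown in the diff.
function subjectLines(geneId, taxId) {
  return [
    `ncbigene:${geneId} a tio:Ncbigene ;`,
    `    obo:RO_0002162 taxid:${taxId} ; # in taxon`,
  ].join('\n');
}

console.log(subjectLines('1', '9606'));
```
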
  • SIO or SO
  • sio:SIO_010035 (gene)
// Gene type string -> ontology class (SIO or SO).
const GENE_TYPE_CLASSES = new Map([
  ['protein-coding', 'sio:SIO_000985'],
  ['pseudo', 'sio:SIO_000988'],
  ['ncRNA', 'sio:SIO_000790'],
  ['tRNA', 'sio:SIO_001230'],
  ['rRNA', 'sio:SIO_001182'],
  ['snoRNA', 'sio:SIO_001229'],
  ['snRNA', 'sio:SIO_001228'],
  ['scRNA', 'sio:SIO_001227'],
  ['biological-region', 'obo:SO_0001411'],
]);

// Returns the class for a gene type, or undefined when none is mapped.
function hasClass(str) {
  return GENE_TYPE_CLASSES.get(str);
}
  • No classes were found for "other", "unknown", or "miscRNA"; hasClass returns undefined for these.
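
One possible way to handle the unmapped types is to fall back to the generic gene class. The sketch below assumes that `sio:SIO_010035` (listed above as "gene") is an acceptable fallback, which these notes do not confirm. `classOrFallback` is a hypothetical name, and the `hasClass` shown here is an abbreviated stand-in with only two of the entries.

```javascript
// Abbreviated stand-in for hasClass (two entries only, for illustration).
function hasClass(str) {
  if (str === 'protein-coding') return 'sio:SIO_000985';
  if (str === 'tRNA') return 'sio:SIO_001230';
  return undefined;
}

// Assumption: the generic gene class is used when no specific class exists.
const FALLBACK_CLASS = 'sio:SIO_010035';

// Hypothetical wrapper: specific class if mapped, otherwise the fallback.
function classOrFallback(geneType) {
  return hasClass(geneType) || FALLBACK_CLASS;
}

console.log(classOrFallback('tRNA'));    // sio:SIO_001230
console.log(classOrFallback('miscRNA')); // sio:SIO_010035
```
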

Check for valid string?

// Returns true if str is non-empty and contains only the allowed characters.
function isValidString(str) {
  return /^[\-\w @\.'/+:,();>?\[\]#*&~{}=\^]+$/.test(str);
}
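
A few sample inputs show what the character class accepts and rejects. The regular expression is copied from above; the sample strings are made up for illustration. Note that the double quote and the backslash, both of which need escaping inside a Turtle string literal, are rejected.

```javascript
// Same check as above, repeated so the example runs on its own.
function isValidString(str) {
  return /^[\-\w @\.'/+:,();>?\[\]#*&~{}=\^]+$/.test(str);
}

// Illustrative inputs (not taken from gene_info).
const samples = [
  'alpha-1-B glycoprotein', // letters, digits, hyphen, space: accepted
  'A1BG-AS1',               // accepted
  'say "hi"',               // double quote: rejected
  'a\\b',                   // backslash: rejected
  'p<q',                    // "<" is not in the class (">" is): rejected
  '',                       // empty string: rejected, since + needs one char
];

for (const s of samples) {
  console.log(JSON.stringify(s), isValidString(s));
}
```
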

Nim

./bin/linux/make_rdf.nim original_data/Homo_sapiens.gene_info  8.99s user 0.52s system 99% cpu 9.543 total
./bin/linux/make_rdf.nim original_data/Homo_sapiens.gene_info  8.98s user 0.52s system 99% cpu 9.522 total
./bin/linux/make_rdf.nim original_data/Homo_sapiens.gene_info  9.09s user 0.55s system 99% cpu 9.666 total
./bin/linux/make_rdf.nim original_data/Homo_sapiens.gene_info  8.90s user 0.52s system 99% cpu 9.444 total
./bin/linux/make_rdf.nim original_data/Homo_sapiens.gene_info  8.91s user 0.50s system 99% cpu 9.430 total
./bin/linux/make_rdf.nim.release original_data/Homo_sapiens.gene_info  8.87s user 0.59s system 99% cpu 9.486 total
./bin/linux/make_rdf.nim.release original_data/Homo_sapiens.gene_info  8.89s user 0.53s system 99% cpu 9.441 total
./bin/linux/make_rdf.nim.release original_data/Homo_sapiens.gene_info  8.98s user 0.50s system 99% cpu 9.510 total
./bin/linux/make_rdf.nim.release original_data/Homo_sapiens.gene_info  8.93s user 0.54s system 99% cpu 9.500 total
./bin/linux/make_rdf.nim.release original_data/Homo_sapiens.gene_info  8.90s user 0.53s system 99% cpu 9.460 total
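
Averaging the `total` column makes the two sets of runs easier to compare. The values below are copied from the timings above; `mean` is a hypothetical helper.

```javascript
// Wall-clock totals in seconds, copied from the runs above.
const totals = {
  'make_rdf.nim': [9.543, 9.522, 9.666, 9.444, 9.430],
  'make_rdf.nim.release': [9.486, 9.441, 9.510, 9.500, 9.460],
};

// Arithmetic mean of a non-empty array of numbers.
function mean(xs) {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

for (const [name, xs] of Object.entries(totals)) {
  console.log(`${name}: ${mean(xs).toFixed(3)} s`);
}
```

With these numbers the two means differ by roughly 0.04 s, which is less than 1% of the run time.
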

Misc

$ wc gene_info_tax9606.2022-09-30.ttl
1196039
$ wc gene_info.2022-09-30.ttl
415710247
$ cat gene_info_tax9606.2022-09-30 | cut -f10 | fr
  20598 protein-coding
  17302 pseudo
  16227 biological-region
    803 rRNA
    658 tRNA
  22217 ncRNA
    174 snRNA
      4 scRNA
   1202 snoRNA
    847 other
   1386 unknown
$ cat gene_info.2022-09-30 | cut -f10 | fr
32582211 protein-coding
1940482 pseudo
  17130 biological-region
 431078 rRNA
1825591 tRNA
2506933 ncRNA
 286040 snRNA
     21 scRNA
 320544 snoRNA
   6025 miscRNA
  82910 other
  60671 unknown
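
The `fr` command is not defined on this page; it appears to be a frequency-counting helper. Under that assumption, a rough JavaScript equivalent of `cut -f10 | fr` counts the values in column 10 of tab-separated input. `countColumn` is a hypothetical name, the sample rows are invented (only the tenth field matters), and skipping `#` header lines is an assumption about the input.

```javascript
// Count how often each value appears in a 1-based column of TSV text.
function countColumn(text, column) {
  const counts = new Map();
  for (const line of text.split('\n')) {
    // Assumption: blank lines and '#' header lines are not data rows.
    if (line === '' || line.startsWith('#')) continue;
    const value = line.split('\t')[column - 1];
    counts.set(value, (counts.get(value) || 0) + 1);
  }
  return counts;
}

// Build a made-up row with nine placeholder fields and a gene type in field 10.
const row = (type) => [...Array(9).fill('-'), type].join('\t');
const sample = [
  '#header line',
  row('protein-coding'),
  row('pseudo'),
  row('protein-coding'),
].join('\n');

for (const [value, n] of countColumn(sample, 10)) {
  console.log(n, value);
}
```
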