gorule-0000010.md

---
layout: rule
id: GORULE:0000010
title: PubMed reference formatting must be correct
contact:
status: Proposed
fail_mode: soft
implementations:
  - language: SQL
    source:
    code: |-
      SELECT gene_product.symbol,
             CONCAT(gpx.xref_dbname, ':', gpx.xref_key) AS gpxref,
             IF(association.is_not=1,"NOT","") AS 'not',
             term.acc,
             term.name,
             evidence.code,
             CONCAT(dbxref.xref_dbname, ':', dbxref.xref_key) AS evxref,
             db.name AS assigned_by
      FROM association
      INNER JOIN evidence ON association.id = evidence.association_id
      INNER JOIN gene_product ON association.gene_product_id = gene_product.id
      INNER JOIN term ON association.term_id = term.id
      INNER JOIN dbxref ON evidence.dbxref_id = dbxref.id
      INNER JOIN dbxref AS gpx ON gene_product.dbxref_id = gpx.id
      INNER JOIN db ON association.source_db_id=db.id
      WHERE dbxref.xref_dbname = 'PMID'
        AND dbxref.xref_key REGEXP '^[^0-9]'
  - language: regex
    source:
    code: |-
      /^(.*?\t){5}([^\t]\|)*PMID:(?!\d+)/
---

References in column 6 of the GAF should have the format db_name:db_key|PMID:12345678, e.g. SGD_REF:S000047763|PMID:2676709. No other format is acceptable for PubMed references; the following examples are invalid:

  • PMID:PMID:14561399
  • PMID:unpublished
  • PMID:.
  • PMID:0

This is proposed as a HARD QC check: incorrectly formatted references will be removed.
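
As an illustration only, the check described above could be sketched in Python as follows. This is not one of the rule's listed implementations: the function name `invalid_pmid_refs` and the pattern `PMID_PATTERN` are hypothetical, and the sketch assumes that a valid PubMed reference is `PMID:` followed by a positive integer with no leading zero (the rule text only states that `PMID:0` is invalid).

```python
import re

# Assumption: a well-formed PubMed reference is "PMID:" plus a positive
# integer without a leading zero. The rule itself only lists PMID:0,
# PMID:., PMID:unpublished and PMID:PMID:... as invalid examples.
PMID_PATTERN = re.compile(r"PMID:[1-9][0-9]*")


def invalid_pmid_refs(gaf_line: str) -> list[str]:
    """Return the badly formatted PMID references in column 6 of a GAF line."""
    columns = gaf_line.rstrip("\n").split("\t")
    if len(columns) < 6:
        # Not enough columns to contain a reference field; nothing to report.
        return []
    # Column 6 (index 5) holds one or more references separated by "|".
    references = columns[5].split("|")
    return [
        ref
        for ref in references
        # Only references that claim to be PubMed IDs are checked here.
        if ref.startswith("PMID:") and not PMID_PATTERN.fullmatch(ref)
    ]
```

A caller could drop or report any annotation line for which `invalid_pmid_refs` returns a non-empty list, depending on whether the check is run as a soft or hard failure.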