This repository has been archived by the owner on Apr 12, 2023. It is now read-only.

Consider displaying or documenting more detail about association sources #1604

Open
mbrush opened this issue Jun 20, 2018 · 9 comments

@mbrush
Member

mbrush commented Jun 20, 2018

MGI was surprised (displeased?) to find themselves referenced as a source of certain phenotype annotations on genes that they did not assert, e.g. for Bmpr1a:

[Screenshot: bmpr1aassociation_002]

This annotation is derived/inferred by Monarch from a 'GO:heart development' annotation with an IMP evidence code. MGI curators make this GO annotation and report it on their site, but MGI does not make or report the assertion that Bmpr1a is associated with a heart development phenotype.
They raised this concern in a recent meeting, i.e. that it looked on our site like this assertion came from MGI, and that they may not want credit for this.

It may be worth considering how to describe provenance at a more granular level, to be clear about what source made the actual assertion, vs what sources provided information that was used in some way to support the assertion. In this example, it would be nice to know that this phenotype association was made by Monarch, as inferred from an IMP-based GO annotation made by MGI that Bmpr1a is involved in GO:heart development.
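As a rough illustration of that distinction, here is a minimal sketch of a record that keeps the asserting source separate from the sources whose data merely supported the inference. All class and field names (`Association`, `SupportingEvidence`, `asserted_by`, `derived_from`) are hypothetical and are not Monarch's actual data model; the values come from the Bmpr1a example above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class SupportingEvidence:
    """Upstream statement that was used to infer an association (hypothetical model)."""
    source: str         # who made the upstream statement, e.g. "MGI"
    statement: str      # what they actually asserted
    evidence_code: str  # e.g. "IMP"


@dataclass
class Association:
    """Gene-phenotype association with granular provenance (hypothetical model)."""
    subject: str
    phenotype: str
    asserted_by: str  # the source that made THIS assertion
    derived_from: List[SupportingEvidence] = field(default_factory=list)

    def describe(self) -> str:
        # First line credits the asserting source; indented lines list the inputs.
        line = f"{self.asserted_by} asserts: {self.subject} -> {self.phenotype}"
        for ev in self.derived_from:
            line += f"\n  inferred from {ev.source}: {ev.statement} [{ev.evidence_code}]"
        return line


assoc = Association(
    subject="Bmpr1a",
    phenotype="heart development phenotype",
    asserted_by="Monarch",
    derived_from=[
        SupportingEvidence(
            source="MGI",
            statement="Bmpr1a is involved in GO:heart development",
            evidence_code="IMP",
        )
    ],
)
print(assoc.describe())
```

With a structure like this, a display could credit Monarch for the phenotype association and list MGI only as the provider of the underlying GO annotation.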

@mbrush mbrush added the AGR label Jun 20, 2018
@selewis

selewis commented Jun 20, 2018 via email

@pnrobinson
Member

pnrobinson commented Jun 21, 2018 via email

@mellybelly

I have described this issue as far back as ~2007 perhaps ;-). I do think the provenance description as Matt suggests is a good idea. At ZFIN they have some general curation guidelines for when to make one annotation and not another; perhaps a survey of MODs to assess this information would be helpful at this time.

@cindyJax

My concern goes deeper than provenance. MGI deliberately did not go down this path of asserting phenotype annotations from GO IMP annotations. Mouse often differs from other organisms in that cell-based phenotype assertions frequently do not translate accurately to whole-mouse phenotypes.

A couple of existing examples of this in your data that I found quickly include:

  1. Fam126a - A GO IMP annotation from PMID:26571211 asserts a role in myelination based on cellular studies, and a derived phenotype assertion has been made by you. The same reference, PMID:26571211, asserts that mutant mice do NOT exhibit an obvious abnormal phenotype or specific defects in myelination as evaluated by light and electron microscopy analysis.

  2. Sp9 - A GO IMP annotation from PMID:15358670 asserts a role in embryonic limb morphogenesis. PMID:27452460 shows that mutant mice are born and survive with no limb defects described. They have neurological defects and die after 2 weeks of age.

I can get you more examples...

I think you will get better accuracy if there is a mouse mutant allele ID in the annotation assertion that the data is inferred from - this indicates that an actual mouse was the source of the GO annotation.
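A minimal sketch of that filter, assuming each GO annotation record carries an optional allele identifier. The `GoAnnotation` type, the `allele_id` field, and the gene and allele values below are hypothetical placeholders, not real MGI data or an existing Monarch API.

```python
from typing import List, NamedTuple, Optional


class GoAnnotation(NamedTuple):
    """Hypothetical GO annotation record; field names are illustrative only."""
    gene: str
    go_term: str
    evidence_code: str
    allele_id: Optional[str]  # mouse mutant allele ID, if the annotation has one


def eligible_for_phenotype_inference(annotations: List[GoAnnotation]) -> List[GoAnnotation]:
    """Keep only IMP annotations that reference a mouse mutant allele."""
    return [
        a for a in annotations
        if a.evidence_code == "IMP" and a.allele_id
    ]


# Placeholder records: GeneA was annotated from an actual mutant mouse,
# GeneB from cellular studies with no allele recorded.
annotations = [
    GoAnnotation("GeneA", "GO:heart development", "IMP", "ALLELE:placeholder-1"),
    GoAnnotation("GeneB", "GO:myelination", "IMP", None),
]
kept = eligible_for_phenotype_inference(annotations)
print([a.gene for a in kept])
```

Annotations without an allele ID are dropped here rather than flagged; whether to drop or flag them is a design choice this sketch does not settle.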

Matt's Bmpr1a example above was not necessarily egregious or wrong, but it duplicates existing phenotype data that has more detailed information (more specific phenotype term, specific mouse mutation, mouse strain background, etc...). I'm not sure what value the derived annotation from the GO is adding in this case.

Happy to discuss further.

@pnrobinson
Member

Sorry, reading the entire thread now I may have misunderstood initially. It would be great to discuss this with Cindy as to how to best represent this information!

@kshefchek
Contributor

@cindyJax this is very informative. My understanding is that the GO-phenotype annotations are for normal phenotypes, which fall under the Biological process phenotype subtree in the uber pheno ontology. Could you link to the Sp9 and Fam126a examples? I'm not able to track them down myself.

The provenance for these won't be obvious from examining the graph alone, but I agree we need to document why this choice is made and make this available for these annotations.

@cindyJax

@selewis I appreciate that you have concerns about the separate streams for GO and phenotype annotation. There are simpler ways to solve the disconnect problem than having MGI upend our entire curation workflows. In fact, we had a QC process that was set up to tag papers with GO IMP annotations for high-priority evaluation by the phenotypes group. I had a closer look at this and the process is clearly broken and has been for quite some time. We will take care of this.

@kshefchek Where did you want me to link to?

@cmungall
Member

@cindyJax thanks for your comments. I think the phenotypes derived from IMPs should be switched off (more generally, we should improve the provenance reporting, but in this case it's not an issue if we simply switch these off).

@kshefchek
Contributor

@cindyJax I was referring to your statement:

A couple of existing examples of this in your data that I found quickly include (...Fam126a, Sp9)

and was hoping you could point me to these examples in our data (were they on our site, flat files?)

@cmungall my understanding is that this is required by our R24 - monarch-initiative/dipper#410 (comment)
