-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Represent FMS expression submission model in LinkML #231
Conversation
…p; clean up notes
|
||
# DQM NOTE: Each FMS expression annotation is limited to one stage. However, | ||
# in the LinkML model, it's possible to specify a range. | ||
ExpressionPattern: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abecerra Have you considered how these referenced classes (ExpressionPattern, TemporalContext, AnatomicalSite) are going to be loaded and/or stored? Are they going to be shared objects that can be referenced by many annotations and therefore only editable in the context of their own tables (like ConditionRelation and ExperimentalCondition), or are they created and edited in the context of the expression annotation and not shared between annotations (like CrossReference and Note). That may determine whether we want uniqueId slots.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer the second option: only created and edited in the context of the expression annotation, with a uniqueId
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why with a uniqueId? A single expression annotation has a single ExpressionPattern attached, and that has a single TemporalContext and single AnatomicalSite. You're either creating the objects or updating the ones attached to the annotation - there's no lookup to be done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant, the uniqueId would be for the annotation. The ExpressionPattern, TemporalContext and AnatomicalSite would be loaded and stored
model/schema/expression.yaml
Outdated
@@ -233,58 +421,60 @@ slots: | |||
domain: TemporalContext | |||
range: StageTerm | |||
multivalued: false | |||
required: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is at odds with the old JSON schema, so we may need to remove this requirement while loading from the FMS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've had a closer look at the FMS requirements. I think I've updated to be mostly in line now - see notes
for the GeneExpressionAnnotation
. I should note, however, that the AnatomicalSite
postconditions are still not quite as restrictive as the FMS whereExpressed
constraints - see * NOTE: For the FMS whereExpressed component ...
for details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Less restrictive is not a problem, for loading from the FMS it's only an issue if the LinkML model requires a field that the FMS doesn't require
…ine with FMS data, rename figure-related slots
model/schema/expression.yaml
Outdated
# from_expression_experiment # For DTO only; for database, relate via non-DTO ExpressionExperiment.expression_annotations instead. | ||
- expression_annotation_subject | ||
- expression_pattern | ||
- expression_assay_used |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@markquintontulloch @abecerra @chris-grove
*** NOTE - new change**. I have just added expression_assay_used
to the ExpressionAnnotation
object. This slot is redundantly present in the ExpressionExperiment
object. My original rationale for keeping the slot only in the expt object was that I imagined DQMs would be submitting both types of objects. Now that the A-Team is simply mapping FMS data to LinkML, the old assay_used
slot was the only one not present in the primary ExpressionAnnotation
object, and it seemed unnecessarily complicated to have A-Team derive ExpressionExperiment
objects just so they had a place to put the FMS assay
data; any ExpressionExperiment
objects that the A-Team could derive would also be very scant compared to what a DQM could provide (e.g., IDs for these expts). Anyway, just a suggestion but we can change that back if preferred.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abecerra I presume the intention is to create ExpressionExperiment objects from the FMS submission, so this duplication is not required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gildossantos OK makes sense to me. I'll let @abecerra address Mark's question. I think the question comes down to whether or not we have sufficient information in the FMS EXPRESSION data to infer all defining properties of an ExpressionExperiment, which I can't answer off hand. Ideally, we would create the ExpressionExperiment objects now so we don't have to create them later and then transfer the data from one table to the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the duplication is not required, Thanks @gildossantos! We would create the ExpressionExperiments when loading @markquintontulloch, thanks!
…lot must be filled in
model/schema/expression.yaml
Outdated
slots: | ||
- curie |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curie, mod_entity_id, mod_internal_id, and data_provider are all inherited from SubmittedObject so don't need to be defined here
model/schema/expression.yaml
Outdated
rules: | ||
- postconditions: | ||
any_of: | ||
- slot_conditions: | ||
biological_entity_assayed: | ||
entity_assayed: | ||
required: true | ||
- slot_conditions: | ||
reagents_used: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reagents_used needs to be swapped with detection_reagents
description: >- | ||
An ExpressionExperiment for a Gene. | ||
slot_usage: | ||
entity_assayed: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this field needs to be explicitly stated as required here. I see the post_conditions on the parent ExpressionExperiment class, but when I generate the documentation and the JSON model, the GeneExpressionExperiment does not indicate that entity_assayed is required:
https://github.com/alliance-genome/agr_curation_schema/blob/xprn1/generated/jsonschema/allianceModel.schema.json#L12481
including all qualifiers. | ||
slots: | ||
- when_expressed | ||
- where_expressed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that where_expressed is considered a required slot but when_expressed is not. Is it true that an AnatomicalSite would always be required, but a TemporalContext would not? Maybe this is meant to have a post_conditions statement indicating that either one or the other is required?
No description provided.