
Tree status in synthesis

Karen Cranston edited this page Sep 30, 2016 · 11 revisions

In the curation application, we want to make it easier for curators to:

  • add trees to synthesis
  • see which trees are in synthesis / queued for synthesis
  • explicitly prevent trees from going into synthesis.

Making these changes may also affect the NexSON structure and / or the synthesis pipeline.

Current status

(As of September 2016)

Preferred trees

In the UI, users can set the status of any individual tree to "Preferred". This setting adds the treeId to the list of trees in the study-level ot:candidateTreeForSynthesis property, but this flag is not used during synthesis.

Do-not-include studies

Until recently, there was a 'This study should not contribute to synthesis' checkbox on the Metadata tab which, if selected, set the study-level ot:notIntendedForSynthesis property to True. The only effect of this property was to reduce the stringency of study quality validation in the curator app. Since we were not using the property, we recently removed the option from the UI.

Synthesis collections

To include a tree in synthesis, a user must add the tree to one of the opentreeoflife collections (or to one of the user-owned collections used in synthesis). Adding a tree to a collection does not change the study NexSON. When running propinquity, we specify a list of collections, and all trees in those collections are included in synthesis.
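As a rough illustration of that last step, the sketch below gathers the trees named in a list of collections. It is not propinquity's code: the function name `trees_for_synthesis` is hypothetical, and the collection layout (a `decisions` list whose entries carry `studyID` and `treeID`) is an assumption made for the example, as is skipping a tree that already appeared in an earlier collection.

```python
from typing import List, Tuple


def trees_for_synthesis(collections: List[dict]) -> List[Tuple[str, str]]:
    """Return (study_id, tree_id) pairs in the order they are encountered.

    Assumed layout (hypothetical): each collection is a dict with a
    "decisions" list, and each decision has "studyID" and "treeID" keys.
    """
    seen = set()
    ordered = []
    for collection in collections:
        for decision in collection.get("decisions", []):
            key = (decision["studyID"], decision["treeID"])
            # Assumption: a tree listed in several collections is used once,
            # at the position of its first appearance.
            if key not in seen:
                seen.add(key)
                ordered.append(key)
    return ordered
```

Note that nothing in this lookup reads the study NexSON, which is why the NexSON properties described above currently have no effect on synthesis.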

Issues with current system

Note that a first pass at an update may not address all of these issues.

  • we have NexSON properties that imply an effect on synthesis, but synthesis decisions are actually dependent on which trees are in synthesis collections
  • putting a tree into synthesis is too hard, i.e. adding a tree to a synthesis collection is non-intuitive and has too many steps
  • the 'Preferred' tree checkbox and the 'This study should not contribute to synthesis' study checkbox suggest an effect on the synthesis pipeline, but no effect exists
  • there is no way for a user to explicitly say "do not include this tree / study in synthesis"
  • we do not validate the curation status of a tree when a user performs an action that is (or appears to be) linked to synthesis: either (1) marking the tree as preferred or (2) adding it to one of the synthesis collections
  • we are not clear about which non-opentreeoflife collections go into synthesis, or how to add a custom collection to synthesis

Proposed new workflow(s)

For some additional background, you may want to look at this opentree issue about synthesis status, which references the study homepage mockup, this issue about nexson properties, and this diagram about curator / collection interactions.

For each tree in a study, a curator can select one of four statuses, which trigger the following listed actions:

  • Include
    • check curation status (or, alternatively, grey this option out if the tree is insufficiently curated)
    • if the tree passes validation, add it to the end of the default synthesis collection
    • provide some help text to the user about how to up-rank a tree (or, alternatively, provide a list of synthesis collections to choose from)
    • change the tree property in the NexSON to ot:candidateForSynthesis : ot:include
  • Do not include
    • check if the tree exists in synthesis. If so, remove from collection(s)
    • change the tree property in the NexSON to ot:candidateForSynthesis : ot:doNotInclude
  • Needs curation
    • change the tree property in the NexSON to ot:candidateForSynthesis : ot:needsCuration
  • Not reviewed (default)
    • change the tree property in the NexSON to ot:candidateForSynthesis : ot:notReviewed
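The four statuses and their actions could be sketched roughly as follows. This is an illustration of the proposal, not existing curator code: `set_tree_status`, `SynthesisStatus`, the flat `tree` dict, the in-memory `collections` mapping, and the `passes_validation` callback are all hypothetical stand-ins. Raising an error when validation fails is one possible choice; greying out the option in the UI is the alternative mentioned above.

```python
from enum import Enum
from typing import Callable, Dict, List, Tuple

TreeKey = Tuple[str, str]  # (study id, tree id)
STATUS_PROPERTY = "ot:candidateForSynthesis"


class SynthesisStatus(Enum):
    INCLUDE = "ot:include"
    DO_NOT_INCLUDE = "ot:doNotInclude"
    NEEDS_CURATION = "ot:needsCuration"
    NOT_REVIEWED = "ot:notReviewed"  # the proposed default


def set_tree_status(
    tree: dict,
    key: TreeKey,
    status: SynthesisStatus,
    collections: Dict[str, List[TreeKey]],
    default_collection: str,
    passes_validation: Callable[[dict], bool],
) -> None:
    """Apply a status change to a tree and to the synthesis collections."""
    if status is SynthesisStatus.INCLUDE:
        # Check curation status before touching anything.
        if not passes_validation(tree):
            raise ValueError("tree is not sufficiently curated for synthesis")
        members = collections.setdefault(default_collection, [])
        if key not in members:
            members.append(key)  # added at the end of the default collection
    elif status is SynthesisStatus.DO_NOT_INCLUDE:
        # Remove the tree from every synthesis collection that lists it.
        for members in collections.values():
            members[:] = [m for m in members if m != key]
    # All four statuses record the choice on the tree itself.
    tree[STATUS_PROPERTY] = status.value
```

In this sketch the caller has to pass in every collection used for synthesis, which is the same knowledge the curator app would need in practice.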

Concerns with new workflow

  • the connection between the listed status of a tree, the presence of a tree in a collection, and the use of a tree for synthesis might be confusing
  • information about trees proposed for synthesis is stored twice: once in the NexSON with ot:candidateForSynthesis : ot:include and a second time by the presence of the tree in a synthesis collection
  • there could be drift over time between the NexSON property and the collection status
  • the curator needs some way of knowing what collections are being used for synthesis in order to remove trees flagged as ot:doNotInclude (or, potentially, to provide users a list of collections when changing status to ot:include)
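To make the drift concern concrete, here is a minimal sketch of a consistency check between the two places where the information would live. The function name `find_drift` and both input shapes are assumptions for illustration: `tree_statuses` maps a (study id, tree id) pair to its ot:candidateForSynthesis value, and `collection_members` is the set of pairs found in any synthesis collection.

```python
from typing import Dict, List, Set, Tuple

TreeKey = Tuple[str, str]  # (study id, tree id)


def find_drift(
    tree_statuses: Dict[TreeKey, str],
    collection_members: Set[TreeKey],
) -> Dict[str, List[TreeKey]]:
    """Report trees whose NexSON status and collection membership disagree."""
    flagged = {key for key, status in tree_statuses.items() if status == "ot:include"}
    return {
        # Marked ot:include in the NexSON but absent from every collection.
        "flagged_but_not_collected": sorted(flagged - collection_members),
        # Present in a collection but not marked ot:include in the NexSON.
        "collected_but_not_flagged": sorted(collection_members - flagged),
    }
```

A check like this could only report disagreements; deciding which side is correct would still be a policy question.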

Alternate approaches

  • we could remove all synthesis status info from the NexSONs and rely entirely on the collections
    • PRO: no redundancy or chance of drift
    • CON: can't capture ot:doNotInclude vs ot:notReviewed
  • we could update only the NexSON from the curator; then, at some point later in the pipeline, move ot:include trees to the synthesis collections
    • PRO: curator doesn't need to know about synthesis status of collections
    • CON: unclear when and where to put this step; collections UI? propinquity?
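For the second alternative, the later pipeline step might look something like the sketch below, wherever it ends up living. Everything here is hypothetical: the name `sync_collection`, the input shapes, and the policy choices (trees with no recorded status stay where they are, and newly included trees are appended in sorted order so the result is deterministic).

```python
from typing import Dict, List, Tuple

TreeKey = Tuple[str, str]  # (study id, tree id)


def sync_collection(
    collection: List[TreeKey],
    statuses: Dict[TreeKey, str],
) -> List[TreeKey]:
    """Return an updated, ordered tree list for a synthesis collection."""
    # Keep existing entries in their current rank unless explicitly excluded.
    kept = [key for key in collection if statuses.get(key) != "ot:doNotInclude"]
    present = set(kept)
    # Append trees newly marked ot:include at the end of the collection.
    added = sorted(
        key
        for key, status in statuses.items()
        if status == "ot:include" and key not in present
    )
    return kept + added
```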