Tree status in synthesis

Karen Cranston edited this page Sep 30, 2016 · 11 revisions

In the curation application, we want to make it easier for curators to:

  • add trees to synthesis
  • see which trees are in synthesis / queued for synthesis
  • explicitly prevent trees from going into synthesis.

Making these changes may also affect the NexSON structure and / or the synthesis pipeline.

Current status

(As of September 2016)

Preferred trees

In the UI, users can set the status of any individual tree as "Preferred". This setting adds the treeId to the list of trees in the study-level ot:candidateTreeForSynthesis property, but this flag is not used during synthesis.

Do-not-include studies

Until recently, there was a 'This study should not contribute to synthesis' checkbox on the Metadata tab, which set the study-level ot:notIntendedForSynthesis property. The only effect of this property was to reduce the stringency of study quality validation in the curator app. Since we were not using the property, we recently removed the option from the UI.

Synthesis collections

To include a tree in synthesis, a user must add the tree to one of the opentreeoflife collections (or one of the user-owned collections used in synthesis). Adding a tree to a collection does not change the study NexSON. When running propinquity, we specify a list of collections, and all trees in those collections are included in synthesis.
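As a rough illustration of the rule above (the synthesis input is simply every tree in the listed collections), here is a minimal Python sketch. The data shapes are assumptions made for the example: each collection is modeled as an ordered list of hypothetical `(study_id, tree_id)` pairs, and `trees_for_synthesis` is an illustrative name, not a function from propinquity or the curation app.

```python
def trees_for_synthesis(collections, selected_names):
    """Return every tree found in the selected collections.

    collections: dict mapping a collection name to an ordered list of
        (study_id, tree_id) pairs (a simplified, hypothetical shape).
    selected_names: the collection names passed to the synthesis run.
    A tree that appears in several collections is kept once, at its
    first position.
    """
    seen = set()
    ordered = []
    for name in selected_names:
        for tree_ref in collections[name]:
            if tree_ref not in seen:
                seen.add(tree_ref)
                ordered.append(tree_ref)
    return ordered
```

Note that nothing in this sketch reads the study NexSON: under the current system, membership in a listed collection is the only thing that decides whether a tree is used.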

Issues with current system

Note that a first pass at an update may not address all of these issues.

  • putting a tree into synthesis is too hard, i.e. adding a tree to a synthesis collection is non-intuitive and has too many steps
  • the 'Preferred' tree checkbox and the 'This study should not contribute to synthesis' study checkbox in the UI suggest an effect on the synthesis pipeline, but no effect exists
  • we store properties in the NexSON that suggest an effect on synthesis, but synthesis decisions are entirely dependent on which trees are in collections
  • there is no way for a user to explicitly say "do not include this tree / study in synthesis"
  • we do not validate the curation status of trees when a user chooses an action that is (or appears to be) linked to synthesis: either (a) marking a tree as preferred or (b) adding a tree to one of the synthesis collections
  • we are not clear about which collections go into synthesis, or how to add a non-opentreeoflife collection to synthesis

Proposed new workflow(s)

(For some additional background, you may want to look at this opentree issue about synthesis status, which references the study homepage mockup, this issue about nexson properties, and this diagram about curator / collection interactions.)

For each tree in a study, a curator can select one of four statuses, which trigger the following listed actions:

  • Include
    • check curation status (or, alternatively, grey this option out if the tree is insufficiently curated)
    • if the tree passes validation, add it to the end of the default synthesis collection
    • provide some help text to the user about how to up-rank a tree (or, alternatively, provide a list of synthesis collections to choose from)
    • change the tree property in the NexSON to ot:candidateForSynthesis : ot:include
  • Do not include
    • change the tree property in the NexSON to ot:candidateForSynthesis : ot:doNotInclude
    • check whether the tree exists in synthesis; if so, remove it from the collection(s)
  • Needs curation
    • change the tree property in the NexSON to ot:candidateForSynthesis : ot:needsCuration
  • Not reviewed (default)
    • change the tree property in the NexSON to ot:candidateForSynthesis : ot:notReviewed
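The four statuses and their listed actions can be sketched as a single handler. This is only an illustration under stated assumptions: the tree is a plain dict with hypothetical `study_id` and `tree_id` keys, collections are lists of `(study_id, tree_id)` pairs, and `passes_validation` is a caller-supplied stand-in for whatever curation check is eventually chosen. Only the `ot:candidateForSynthesis` property name and its four values come from the proposal above.

```python
from enum import Enum


class SynthStatus(Enum):
    INCLUDE = "ot:include"
    DO_NOT_INCLUDE = "ot:doNotInclude"
    NEEDS_CURATION = "ot:needsCuration"
    NOT_REVIEWED = "ot:notReviewed"  # default


def set_tree_status(tree, status, collections, default_collection,
                    passes_validation):
    """Apply a status change; return False if validation blocks it.

    tree: dict with hypothetical 'study_id' and 'tree_id' keys.
    collections: dict of collection name -> list of (study_id, tree_id).
    default_collection: name of the default synthesis collection.
    passes_validation: callable(tree) -> bool (placeholder check).
    """
    tree_ref = (tree["study_id"], tree["tree_id"])
    if status is SynthStatus.INCLUDE:
        if not passes_validation(tree):
            return False  # leave the property and collections untouched
        members = collections[default_collection]
        if tree_ref not in members:
            members.append(tree_ref)  # new trees go to the end
    elif status is SynthStatus.DO_NOT_INCLUDE:
        # Remove the tree from every synthesis collection that holds it.
        for members in collections.values():
            members[:] = [ref for ref in members if ref != tree_ref]
    # NEEDS_CURATION and NOT_REVIEWED only update the NexSON property.
    tree["ot:candidateForSynthesis"] = status.value
    return True
```

Keeping the NexSON write as the last step means a failed validation changes nothing, which is one way to avoid the property and the collection disagreeing from the start.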

Concerns with new workflow

  • the connection between the listed status of a tree, the presence of a tree in a collection, and the use of a tree for synthesis might be confusing
  • information about trees proposed for synthesis is stored twice: once in the NexSON with ot:candidateForSynthesis : ot:include and a second time by the presence of the tree in a synthesis collection
  • there could be drift over time between the NexSON property and the collection status
  • the curator needs some way of knowing what collections are being used for synthesis in order to remove trees flagged as ot:doNotInclude (or, potentially, to provide users a list of collections when changing status to ot:include)
  • what is the effect of the ot:needsCuration option?
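To make the drift concern concrete, the following Python sketch compares the two places where the information would live. It assumes a hypothetical mapping from `(study_id, tree_id)` to the tree's `ot:candidateForSynthesis` value and a flat list of the trees currently in synthesis collections. Neither structure is an existing API, and `find_drift` is an illustrative name.

```python
def find_drift(statuses, collection_members):
    """Report disagreements between NexSON flags and collections.

    statuses: dict mapping (study_id, tree_id) to the value of the
        tree's ot:candidateForSynthesis property (hypothetical shape).
    collection_members: iterable of (study_id, tree_id) pairs found
        in the synthesis collections.
    """
    flagged = {ref for ref, status in statuses.items()
               if status == "ot:include"}
    members = set(collection_members)
    return {
        # Marked ot:include in the NexSON but absent from collections.
        "flagged_not_in_collection": sorted(flagged - members),
        # Present in a collection without an ot:include flag.
        "in_collection_not_flagged": sorted(members - flagged),
    }
```

Two empty lists would mean the NexSON property and the collections agree. Anything else is the drift described above.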

Alternate approaches

  1. We could remove all synthesis status info from the NexSONs and rely entirely on the collections.
    • PRO: no redundancy or chance of drift
    • PRO: no modifications to current collections-based synthesis procedure
    • CON: can't capture ot:doNotInclude vs ot:notReviewed; only boolean In / Not In collection
  2. We could update only the NexSON from the curator; then, at some point later in the pipeline, update the collections based on changes in the ot:candidateForSynthesis property (removing trees without ot:include and adding trees that have ot:include).
    • PRO: curator logic simpler when user changes status
    • CON: unclear when and where to put this step; collections UI? propinquity?
    • CON: chance of drift if collection modified without updating NexSONs
Note that the curator needs to know something about synthesis collections in either scenario, either to display the correct status in Option 1 or to know whether to add / remove trees to / from collections in Option 2.
