asijbers edited this page Sep 19, 2016 · 12 revisions

#Gut-commensals with a health benefit#

##Use case description#

Goal: Identify novel relationships between gut-commensals and health benefits.

Workflow: the query starts from a list of bacteria and searches for direct and indirect relations with a list of health benefits. Intermediate concepts relevant to this search are other bacteria and metabolites or food components. Possible extensions are bacterial metabolic pathways, human physiological pathways, and genes.
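
The workflow can be illustrated with a minimal sketch, assuming relations are available as plain (subject, predicate, object) triples. The concept names, the `find_relations` helper and the toy data are hypothetical and only illustrate the idea; this is not the EKP API.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, predicate, object); not taken from EKP.
TRIPLES = [
    ("bacterium_A", "produces", "metabolite_X"),
    ("metabolite_X", "affects", "benefit_1"),
    ("bacterium_B", "associated_with", "benefit_2"),
    ("bacterium_A", "interacts_with", "bacterium_B"),
]


def build_neighbors(triples):
    """Index triples as an undirected map: concept -> set of neighbouring concepts."""
    neighbors = defaultdict(set)
    for subj, _pred, obj in triples:
        neighbors[subj].add(obj)
        neighbors[obj].add(subj)
    return neighbors


def find_relations(bacteria, benefits, triples):
    """Return (direct, indirect) relations between the two input lists.

    direct:   set of (bacterium, benefit) pairs linked by a single triple
    indirect: set of (bacterium, intermediate, benefit) paths via one intermediate
    """
    neighbors = build_neighbors(triples)
    direct, indirect = set(), set()
    for bacterium in bacteria:
        for benefit in benefits:
            if benefit in neighbors[bacterium]:
                direct.add((bacterium, benefit))
            # Intermediates are concepts linked to both ends of the pair.
            shared = neighbors[bacterium] & neighbors[benefit]
            for mid in shared - {bacterium, benefit}:
                indirect.add((bacterium, mid, benefit))
    return direct, indirect


direct, indirect = find_relations(
    ["bacterium_A", "bacterium_B"], ["benefit_1", "benefit_2"], TRIPLES
)
print("direct:", sorted(direct))
print("indirect:", sorted(indirect))
```

In this toy data the intermediate can be a metabolite (`metabolite_X`) or another bacterium (`bacterium_B`), matching the two kinds of intermediate concepts named above. Predicates and relation direction are ignored here for brevity.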

Search result output: clickable HTML pages; interactive graph.

Additional functionality: an automated search and reporting system that generates state-of-the-art overviews on a regular basis, supplemented with an alerting system for new findings.

##Pilot workflow#

We started with a list of # commensals and # health benefit terms. A pilot workflow (old API) revealed only two direct relations, as only a few concepts matched the terms in the input lists. Possible reasons for the low recall:

  • automatic mapping of input terms to concepts in the Euretos Knowledge Platform (EKP) returned a result for # out of # terms.
  • UMLS is focussed on human disease rather than health benefits.
  • UMLS has concepts for bacteria at species level, but only a few (?) at strain level.
  • the current version of EKP is biased towards data related to human disease and pharmacological treatment.
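
The first point, term-to-concept mapping coverage, can be checked with a small sketch. It assumes a simple case-insensitive exact match against concept labels; the `map_terms` helper, the label index and the `concept-…` identifiers are hypothetical placeholders, not real UMLS CUIs or EKP calls.

```python
def map_terms(terms, concept_index):
    """Map each input term to a concept ID by case-insensitive exact label match.

    Returns (mapped, unmapped): a dict term -> concept ID, and a list of
    terms for which no concept was found.
    """
    normalized = {label.lower(): cid for label, cid in concept_index.items()}
    mapped, unmapped = {}, []
    for term in terms:
        cid = normalized.get(term.strip().lower())
        if cid is None:
            unmapped.append(term)
        else:
            mapped[term] = cid
    return mapped, unmapped


# Hypothetical label index with placeholder IDs (not real CUIs).
CONCEPT_INDEX = {
    "Escherichia coli": "concept-001",
    "Lactobacillus": "concept-002",
}

terms = ["escherichia coli", "Lactobacillus rhamnosus GG", "gut barrier function"]
mapped, unmapped = map_terms(terms, CONCEPT_INDEX)
print(f"{len(mapped)} of {len(terms)} terms mapped")
print("unmapped:", unmapped)
```

Listing the unmapped terms makes it easier to see whether the gaps come from strain-level names or from health benefit terms that have no matching concept.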

To identify indirect interactions, compounds () or human diseases () were added to the workflow as intermediate concepts. In addition, these concepts had to have a label/attribute 'gut' OR 'intestine'. This search yielded more results (relations); however, most compounds identified were very general, such as 'DNA', 'glucose' and 'growth factor', and none of the indirect relations analyzed made sense. For example, E. coli was used as a vehicle to produce a certain growth factor. This relation was connected through the concept 'growth factor' to a relation involving a different growth factor that may have a certain health benefit.
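
The 'gut' OR 'intestine' restriction on intermediate concepts can be sketched as a filter over indirect paths. This assumes each path is a (bacterium, intermediate, benefit) tuple and that labels per concept are available as a plain dictionary; `filter_paths`, the concept names and the labels are hypothetical.

```python
GUT_KEYWORDS = ("gut", "intestine")


def filter_paths(paths, concept_labels, keywords=GUT_KEYWORDS):
    """Keep paths whose intermediate concept has a label containing any keyword.

    Matching is a case-insensitive substring test, so 'Gut microbiome'
    matches 'gut', while 'intestinal' does not match 'intestine'.
    """
    kept = []
    for bacterium, mid, benefit in paths:
        labels = concept_labels.get(mid, ())
        if any(k in label.lower() for label in labels for k in keywords):
            kept.append((bacterium, mid, benefit))
    return kept


# Hypothetical paths and labels, for illustration only.
paths = [
    ("bacterium_A", "metabolite_X", "benefit_1"),
    ("bacterium_A", "compound_Y", "benefit_2"),
]
concept_labels = {
    "metabolite_X": ["Gut microbiome"],
    "compound_Y": ["plasma"],
}

print(filter_paths(paths, concept_labels))
```

A label filter of this kind only narrows the context of the intermediate. It does not remove very general intermediates such as 'growth factor', which is consistent with the spurious connections observed in the pilot.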

Notably, Semantic Medline, the data source of these relations, is a repository of semantic predications (subject-predicate-object triples) extracted from Medline. Semantic Medline does not include bacteria - health benefit relations, as extraction is restricted to specific concept-relation-concept patterns.

Based on the findings above, the term lists have been supplemented with manually assigned UMLS CUIs and semantic categories. Furthermore, additional manually curated data sources have been identified for ingestion in EKP to include data more relevant to the use case.