asijbers edited this page Apr 13, 2017 · 12 revisions

Gut-commensals with a health benefit

Background

The human body carries about 100 trillion microorganisms in its intestines, a number ten times greater than the total number of human cells in the body. These microorganisms perform several useful functions, such as fermenting unused energy substrates, training the immune system, preventing growth of harmful, pathogenic bacteria, regulating the development of the gut, producing vitamins for the host, such as vitamins B and K, and producing hormones to direct the host to store fats. In return, these microorganisms procure within the host a protected, nutrient-rich environment in which they can thrive.

An ever-growing number of studies have demonstrated that changes in the composition of our microbiomes correlate with numerous disease states, raising the possibility that manipulation of these communities could be used to treat disease. The gut microbiota has been implicated not only in diseases of the gastrointestinal tract but also in several extra-intestinal disorders, such as obesity, diabetes, and metabolic syndrome.

The key question to be addressed is which commensal bacteria, and which of their metabolites, may have a prebiotic or health-improving function associated with gut health.
Fig.1 Gut health use case

Use case description

Goal: Identify novel relationships between gut-commensals and health benefits.

Workflow: the query starts from a list of bacteria and searches for direct (Fig.2) and indirect (Fig.3) relations with a list of health benefits.
Intermediate concepts relevant to this search: other bacteria and/or metabolites/food compounds (Fig.4).
Possible extensions: bacterial metabolic pathways, human physiological pathways and genes (Fig.5).
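
The direct and indirect searches in this workflow can be sketched as one-hop and two-hop lookups over subject-predicate-object triples. This is a minimal illustration only: the triples, the predicate names and the helper functions below are hypothetical and do not represent the EKP API or its data.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object); not real EKP data.
TRIPLES = [
    ("Bacterium A", "associated_with", "Benefit X"),
    ("Bacterium A", "produces", "Metabolite M"),
    ("Metabolite M", "associated_with", "Benefit Y"),
    ("Bacterium B", "produces", "Metabolite N"),
]


def build_index(triples):
    """Map each subject to the set of (predicate, object) pairs it has."""
    index = defaultdict(set)
    for subj, pred, obj in triples:
        index[subj].add((pred, obj))
    return index


def direct_relations(index, bacteria, benefits):
    """One hop: bacterium -> health benefit."""
    return sorted(
        (b, pred, obj)
        for b in bacteria
        for pred, obj in index.get(b, ())
        if obj in benefits
    )


def indirect_relations(index, bacteria, benefits):
    """Two hops: bacterium -> intermediate concept -> health benefit."""
    paths = []
    for b in bacteria:
        for _, mid in index.get(b, ()):
            if mid in benefits:
                continue  # already reported as a direct relation
            for _, obj in index.get(mid, ()):
                if obj in benefits:
                    paths.append((b, mid, obj))
    return sorted(paths)


bacteria = ["Bacterium A", "Bacterium B"]
benefits = {"Benefit X", "Benefit Y"}
index = build_index(TRIPLES)
direct = direct_relations(index, bacteria, benefits)
indirect = indirect_relations(index, bacteria, benefits)
```

Keeping the two searches separate mirrors the split between Fig.2 and Fig.3: an intermediate concept only appears in the result when no direct link explains the path.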

Search result output: clickable HTML pages; interactive graph.

Additional functionality: an automated search and reporting system generating state-of-the-art overviews on a regular basis supplemented with an alerting system of new findings.

Fig.2 Direct relations scheme

Fig.3 Indirect relations scheme

Zooming in on Figure 3, Figure 4 shows an overview of the metabolites and associated metabolic pathways, foods, human genes and pathways (including the relevant data sources).

Fig.4 Linking metabolite/food compound data

Fig.5 The bigger picture

Pilot workflow

A pilot workflow (using the old EKP API) was built to search for relations between 10 commensal terms and 6 health benefit terms. These input terms were automatically mapped to concepts in EKP, and the workflow returned two direct relations. For comparison, co-occurrence text-mining of Medline abstracts revealed 16 relations between the 10 commensals and 4 (out of the 6) health benefit terms. Possible explanations for the difference in results:

  • Automatic mapping of input terms to concepts in EKP matches only 30-80% of the terms.
  • UMLS is focussed on human disease rather than health benefits.
  • UMLS has concepts for bacteria at species level and apparently only a few at strain level.
  • Semantic Medline, the data source of these relations, is a repository of semantic predications (subject-predicate-object triples) extracted from Medline. Semantic Medline does not contain bacterium - health benefit relations, as extraction is restricted to specific concept-relation-concept patterns.
  • The current version of the Euretos Knowledge Platform is biased towards data related to human disease and pharmacological treatment.
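
The co-occurrence comparison mentioned above can be sketched as follows. This is a rough illustration, assuming that "co-occurrence" means a commensal term and a health benefit term appearing in the same abstract; the abstracts, identifiers, term lists and function names are all hypothetical, and the matching used in the pilot may have differed.

```python
import re
from itertools import product

# Hypothetical abstracts keyed by a made-up identifier.
ABSTRACTS = {
    "pmid-1": "Bacterium A increased barrier function in mice.",
    "pmid-2": "Bacterium B was detected; barrier function was not measured. "
              "Bacterium A was absent.",
    "pmid-3": "Vitamin synthesis by Bacterium B was observed.",
}


def mentions(term, text):
    """Case-insensitive whole-term match."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def cooccurrences(abstracts, commensals, benefits):
    """Return {(commensal, benefit): [abstract ids]} for pairs seen together."""
    hits = {}
    for commensal, benefit in product(commensals, benefits):
        ids = sorted(
            pid
            for pid, text in abstracts.items()
            if mentions(commensal, text) and mentions(benefit, text)
        )
        if ids:
            hits[(commensal, benefit)] = ids
    return hits


commensals = ["Bacterium A", "Bacterium B"]
benefits = ["barrier function", "vitamin synthesis"]
pairs = cooccurrences(ABSTRACTS, commensals, benefits)
```

Note that the second hypothetical abstract pairs both bacteria with "barrier function" although it asserts no relation between them. Co-occurrence counts terms that appear together, whereas semantic predications require a specific concept-relation-concept pattern, which is consistent with co-occurrence returning more relations than the workflow.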

To identify indirect interactions, compounds or human diseases were added to the workflow as intermediate concepts. In addition, these intermediate concepts were required to have a label/attribute 'gut' OR 'intestines'. This search produced more relations; however, most of the compounds identified were very general, such as 'DNA', 'glucose' and 'growth factor'. None of the indirect relations analyzed made sense. For example, in one abstract E. coli was used as a vehicle to produce a certain growth factor, and the concept 'growth factor' linked this to a relation between a different growth factor and a certain health benefit in another abstract.
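
The intermediate-concept constraint, and the way a generic concept can join two unrelated statements, can be illustrated with a small sketch. The data classes, label sets, abstract identifiers and relations below are hypothetical and are not taken from EKP; the flag for differing sources is an illustrative addition, not part of the pilot workflow.

```python
from dataclasses import dataclass, field

GUT_LABELS = {"gut", "intestines"}


@dataclass(frozen=True)
class Concept:
    name: str
    labels: frozenset = field(default_factory=frozenset)


@dataclass(frozen=True)
class Relation:
    subject: str
    obj: str
    source: str  # hypothetical abstract identifier


def is_gut_related(concept):
    """True if the concept carries the label 'gut' OR 'intestines'."""
    return bool(concept.labels & GUT_LABELS)


def indirect_paths(first_legs, second_legs, concepts):
    """Join bacterium->intermediate and intermediate->benefit relations.

    Each result is (bacterium, intermediate, benefit, sources_differ);
    sources_differ marks paths whose two legs come from different abstracts.
    """
    paths = []
    for first in first_legs:
        mid = concepts.get(first.obj)
        if mid is None or not is_gut_related(mid):
            continue  # intermediate lacks the required label
        for second in second_legs:
            if second.subject == first.obj:
                differ = first.source != second.source
                paths.append((first.subject, first.obj, second.obj, differ))
    return paths


concepts = {
    "growth factor": Concept("growth factor", frozenset({"gut"})),
    "Compound Q": Concept("Compound Q", frozenset({"liver"})),
}
first_legs = [
    Relation("E. coli", "growth factor", "abstract-1"),
    Relation("E. coli", "Compound Q", "abstract-1"),
]
second_legs = [
    Relation("growth factor", "Benefit X", "abstract-2"),
    Relation("Compound Q", "Benefit X", "abstract-3"),
]
paths = indirect_paths(first_legs, second_legs, concepts)
```

In this toy example the label filter removes 'Compound Q' but lets the generic 'growth factor' through, and the resulting path combines statements from two different abstracts. Paths of that kind are the ones that needed manual review in the pilot.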

Based on the findings above, the term lists were supplemented with manually assigned UMLS CUIs and semantic types. Furthermore, additional manually curated data sources were identified for ingestion in EKP to include data more relevant to the use case.