External standard subsystem for each reaction to make it comparable from different GEMs #19

Open
hongzhonglu opened this issue Aug 24, 2020 · 10 comments
Labels
discussion Discussion for the optimal solutions

Comments

@hongzhonglu

I have run into an issue when comparing models: the subsystem assigned to the same reaction differs between models. I think that for standard-GEM, subsystems should be comparable. Currently, the GEMs from modelSeed use a set of standard subsystems, which makes life easier when working with those models.

@haowang-bioinfo
Member

@hongzhonglu It does make sense to standardize subsystems and make them comparable across GEMs.

The majority of subsystems in available GEMs were adapted from the names of KEGG pathways. How similar are the modelSeed subsystems to KEGG pathways?

@edkerk
Collaborator

edkerk commented Aug 24, 2020

While some standardization of this could indeed be useful, I would argue that this is outside of the scope of standard-GEM. People will have arguments for using different ways of naming subSystems, and most likely none of these naming systems will be satisfactory to all. Instead, one can annotate reactions to e.g. KEGG pathways (support should be added to COBRAToolbox, similar to opencobra/cobratoolbox#1591). Are modelSeed subsystems also in identifiers.org?

We are also not instructing what reaction identifiers to use. Allowing flexibility in standard-GEM (unless there are clear standards, like identifiers.org annotations) is part of its appeal.

@mihai-sysbio
Member

I agree with @edkerk - the focus of standard-GEM at the moment essentially stops at the file tree of a repository. There have been many approaches targeting standardization of specific file formats, and the content of the respective files. Maybe this is a place where standard-GEM could contribute at some point in the far future.

Subsystem comparison sounds like a nice tool for a website. @hongzhonglu I hope you don't mind that we borrow this idea for the roadmap of Metabolic Atlas. Such feature requests are much appreciated over at the Met Atlas repository.

@haowang-bioinfo
Member

the focus of standard-GEM at the moment essentially stops at the file tree of a repository.

This was not stated before, and should be moved to somewhere more obvious (README or issue template) for contributors.

@mihai-sysbio
Member

The issue template is "locked" because whatever issue template is defined on the main branch is used both for the creation of new issues, and also passed on to people who Use this template to set up their own repository. Similarly, README.md is set up for templating purposes.

Even though this specific issue is something we cannot focus on at the moment, I still think it's valuable to have an open discussion. Therefore I would propose to not "exclude" such issues, or close them, but to keep them in the Backlog.

@Hao-Chalmers how about creating a new issue that describes the current focus of standard-GEM, and pinning it at the top of the Issues page?

@haowang-bioinfo
Member

how about creating a new issue that described the current focus of standard-GEM, and pin it at the top of the Issues page?

Great idea

mihai-sysbio added the discussion label Sep 8, 2020

@haowang-bioinfo
Member

@hongzhonglu I wonder if you have any ideas or plans for standardizing subsystems. Are you suggesting adopting the modelSeed approach?

@hongzhonglu
Author

Hi Hao, I am not sure whether modelSeed should be used. Would it be better to also use KEGG or MetaCyc?

@draeger
Collaborator

draeger commented Nov 29, 2020

To answer @hongzhonglu's original question, let me take one step back and describe how the subsystem information has been stored in SBML before proposing a more structured way of dealing with them.

Originally, subsystem information was written into the reaction's notes, resulting in the redundant mentioning of the same subsystem in every reaction belonging to that subsystem. The new release of the BiGG Models database in 2015 introduced a new approach to redefining the use of subsystems in SBML. BiGG then used the groups extension from SBML to declare a group of reactions for every subsystem. Those groups contain members, each of which points to a reaction within that subsystem. With this approach, each subsystem was only specified once, and at the same time, one reaction could now belong to multiple subsystems.
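
To illustrate the difference, here is a minimal Python sketch that turns per-reaction subsystem labels into a groups-style list, with one group per subsystem and one member per reaction. The reaction identifiers, subsystem names, and the `build_groups` helper are hypothetical; the element and attribute names follow my reading of the SBML groups package and should be checked against the specification before use.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace of the SBML Level 3 groups package, version 1.
GROUPS_NS = "http://www.sbml.org/sbml/level3/version1/groups/version1"
ET.register_namespace("groups", GROUPS_NS)

# Hypothetical per-reaction labels, as they would be repeated in reaction notes.
reaction_subsystems = {
    "R_PGI": ["Glycolysis"],
    "R_PFK": ["Glycolysis"],
    "R_CS": ["Citric acid cycle"],
    "R_PYK": ["Glycolysis", "Pyruvate metabolism"],  # one reaction, two subsystems
}


def q(tag):
    """Qualify a tag or attribute name with the groups namespace."""
    return f"{{{GROUPS_NS}}}{tag}"


def build_groups(reaction_subsystems):
    """Invert reaction -> subsystems into one group element per subsystem."""
    members = defaultdict(list)
    for rxn_id, subsystems in reaction_subsystems.items():
        for name in subsystems:
            members[name].append(rxn_id)

    list_of_groups = ET.Element(q("listOfGroups"))
    for index, (name, rxn_ids) in enumerate(sorted(members.items()), start=1):
        group = ET.SubElement(
            list_of_groups,
            q("group"),
            {q("id"): f"g{index}", q("name"): name, q("kind"): "partonomy"},
        )
        list_of_members = ET.SubElement(group, q("listOfMembers"))
        for rxn_id in rxn_ids:
            ET.SubElement(list_of_members, q("member"), {q("idRef"): rxn_id})
    return list_of_groups


print(ET.tostring(build_groups(reaction_subsystems), encoding="unicode"))
```

Each subsystem name appears exactly once in the output, and `R_PYK` is listed as a member of two groups.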

As an additional advantage, we can annotate every group that represents such a subsystem. This means that we can add references in the form of controlled vocabulary terms to the subsystems and refer to KEGG and diverse other pathway databases. With this, it is possible to identify canonical pathways or subsystems across multiple models.
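
As a rough sketch of what such a comparison could look like: if the groups of two models carry pathway annotations, subsystems can be matched on those annotations instead of on their free-text names. The two model dictionaries and the `match_subsystems` helper below are made up for illustration; the identifiers are written in the compact `kegg.pathway:` style, and only the KEGG reference maps for glycolysis and the citrate cycle are used.

```python
# Hypothetical groups of two models: local subsystem name -> pathway annotations.
model_a = {
    "Glycolysis": {"kegg.pathway:map00010"},
    "TCA cycle": {"kegg.pathway:map00020"},
    "Biomass": set(),  # no annotation, so it cannot be matched
}
model_b = {
    "Glycolysis / Gluconeogenesis": {"kegg.pathway:map00010"},
    "Citrate cycle": {"kegg.pathway:map00020"},
    "Transport": set(),
}


def match_subsystems(groups_a, groups_b):
    """Pair up subsystems that share at least one annotation."""
    # Index the second model by annotation for direct lookup.
    by_annotation = {}
    for name_b, annotations in groups_b.items():
        for annotation in annotations:
            by_annotation.setdefault(annotation, []).append(name_b)

    matches = []
    for name_a, annotations in groups_a.items():
        for annotation in sorted(annotations):
            for name_b in by_annotation.get(annotation, []):
                matches.append((annotation, name_a, name_b))
    return matches


for annotation, name_a, name_b in match_subsystems(model_a, model_b):
    print(f"{annotation}: '{name_a}' <-> '{name_b}'")
```

The subsystem names differ between the two models, but the shared annotation still identifies them as the same pathway.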

Furthermore, the idea of separating annotations from models (see https://doi.org/10.1093/bib/bby087) allows us to even store subsystem annotations in an external glossary file. Maybe those techniques can help solve such problems and make models more comparable. I suggest using SBML groups and annotating them, possibly in separate glossary files.
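
A glossary file could take many forms. The sketch below assumes a simple two-column, tab-separated layout (group identifier, annotation), which is purely an assumption for illustration and not a format defined by the cited paper or by standard-GEM. The group identifiers and the `load_glossary` helper are hypothetical as well.

```python
import csv
import io

# Hypothetical glossary content; in practice this would live in its own file
# next to the model rather than inside the SBML document.
GLOSSARY_TSV = (
    "group_id\tannotation\n"
    "g1\tkegg.pathway:map00020\n"
    "g2\tkegg.pathway:map00010\n"
    "g3\tkegg.pathway:map00620\n"
)


def load_glossary(handle):
    """Read group_id -> set of annotations from a tab-separated file handle."""
    glossary = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        glossary.setdefault(row["group_id"], set()).add(row["annotation"])
    return glossary


# Group names as they might be declared in the model itself (hypothetical).
model_groups = {
    "g1": "Citric acid cycle",
    "g2": "Glycolysis",
    "g3": "Pyruvate metabolism",
}

glossary = load_glossary(io.StringIO(GLOSSARY_TSV))
for group_id, name in model_groups.items():
    print(group_id, name, sorted(glossary.get(group_id, set())))
```

Keeping the annotations in a separate file means they can be updated or swapped for another naming scheme without touching the model file.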

@haowang-bioinfo
Member

@draeger separating annotations from models into separate glossary files sounds like a great idea.


5 participants