Complete file import pipeline from EcoCyc to WCM #1067

Open
11 of 17 tasks
ggsun opened this issue May 18, 2021 · 0 comments

ggsun commented May 18, 2021

This issue tracks the remaining tasks needed to complete the file import pipeline from EcoCyc to WCM, which was kicked off with PR #1065.

As of May 18th, 2021, the issues are:

  • Add back TFs that were removed.
  • Test all analysis scripts.
  • Emit warnings for molecules in reactions whose specifications are missing from the molecule files.
  • Add a way to check which rows are being dropped from data files due to missing components.
  • Add script to automatically pull latest versions of files from EcoCyc's API.
  • Add patch notes listing the manual changes made while incorporating these files.
  • Clean up models/ecoli/analysis/causality_network/build_network.py for improved robustness.
  • Incorporate missing compartment IDs into compartments.tsv.
  • Add support for multiple proteins being translated from a single RNA through frameshifting (should be implemented in line with how operons are handled).
  • Dynamically calculate molecular weights of modified proteins in 2CS.
  • Add hardcoded values in reconstruction/ecoli/dataclasses/process/metabolism.py as raw flat files.
  • Automatically add complexed versions of enzymes as functional enzymes for each metabolic reaction.
  • Remove hardcoded compartment tags from file reconstruction/ecoli/flat/endoRNases.tsv.
  • Pull coordinates of oriC and terC directly from EcoCyc (to be added as a separate file that lists non-genomic locales on the chromosome).
  • Exclude unused metabolite-compartment pairs from BulkMolecules.
  • Remove redundant common_names flat files.
  • Check if the genome sequence needs to be updated.
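
A rough sketch of how two of the checks above (warning on molecules with missing specifications, and reporting which rows get dropped) could fit together. The function name `find_dropped_rows`, the row layout, and the reaction and molecule IDs are placeholders for illustration, not the actual WCM flat-file schema or loader code:

```python
import warnings


def find_dropped_rows(rows, known_molecule_ids, key="stoichiometry"):
    """Split rows into kept and dropped.

    A row is dropped when row[key] references a molecule ID that is not
    in known_molecule_ids. One warning is emitted per dropped row.
    """
    kept, dropped = [], []
    for row in rows:
        # row[key] is assumed to map molecule IDs to coefficients.
        missing = sorted(set(row[key]) - known_molecule_ids)
        if missing:
            warnings.warn(
                f"Reaction {row['id']!r} references molecules with no "
                f"specification: {', '.join(missing)}"
            )
            dropped.append((row["id"], missing))
        else:
            kept.append(row)
    return kept, dropped


# Hypothetical example data, not taken from the actual flat files.
reactions = [
    {"id": "RXN-A", "stoichiometry": {"ATP[c]": -1, "ADP[c]": 1}},
    {"id": "RXN-B", "stoichiometry": {"ATP[c]": -1, "UNKNOWN[c]": 1}},
]
molecules = {"ATP[c]", "ADP[c]"}

kept, dropped = find_dropped_rows(reactions, molecules)
for reaction_id, missing in dropped:
    print(f"dropped {reaction_id}: missing {missing}")
```

Returning the dropped rows alongside the kept ones, rather than only warning, would let the same result feed a summary report or a test that fails when unexpected rows disappear.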