Update of our Species table with ingestion of NCBI taxonomy data #2122

Open
14 tasks
only1chunts opened this issue Dec 5, 2024 · 0 comments

User story

As a curator
I want to be able to bulk update the Species table in GigaDB from the NCBI taxonomy
So that we can maintain an up-to-date Species table

Acceptance criteria

Given the Species table in GigaDB is out of date
When I request an update
Then anything new in the NCBI taxonomy is imported into the GigaDB Species table
This should include the Taxon ID, Scientific name, and Genbank Name fields
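
As a starting point for discussion, here is a minimal sketch of how the three fields could be pulled out of `names.dmp`, the names file inside the NCBI `taxdump` archive available from the FTP area linked below. It assumes that "Scientific name" corresponds to the NCBI name class `scientific name` and that "Genbank Name" corresponds to `genbank common name`; that mapping, the function name `parse_names_dmp`, and the dictionary keys are all illustrative and not taken from the GigaDB code base.

```python
from typing import Dict, Iterable, Optional

# Assumed mapping from NCBI name classes to the fields named in this issue.
# The keys on the right are hypothetical, not actual GigaDB column names.
WANTED_CLASSES = {
    "scientific name": "scientific_name",
    "genbank common name": "genbank_name",
}


def parse_names_dmp(lines: Iterable[str]) -> Dict[int, Dict[str, Optional[str]]]:
    """Collect scientific and GenBank common names per taxon ID.

    Each names.dmp row has four fields (tax_id, name_txt, unique name,
    name class) separated by "\t|\t" and terminated by "\t|".
    """
    species: Dict[int, Dict[str, Optional[str]]] = {}
    for line in lines:
        line = line.rstrip("\n")
        if line.endswith("\t|"):
            line = line[:-2]  # drop the row terminator
        fields = line.split("\t|\t")
        if len(fields) < 4:
            continue  # skip malformed or empty rows
        tax_id, name_txt, _unique_name, name_class = fields[:4]
        key = WANTED_CLASSES.get(name_class.strip())
        if key is None:
            continue  # synonyms, authorities, etc. are ignored here
        record = species.setdefault(
            int(tax_id), {"scientific_name": None, "genbank_name": None}
        )
        if record[key] is None:  # keep the first name seen for each class
            record[key] = name_txt.strip()
    return species
```

Taxa without a GenBank common name simply keep `None` in that field, so the import step can decide how to store a missing value.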

Additional Info

The Species table was imported a long time ago and, as far as I know, has never been updated. We don't need to update it very often, but it would be useful to be able to update it periodically or on request.

FTP link: https://ftp.ncbi.nih.gov/pub/taxonomy/
API: https://www.ncbi.nlm.nih.gov/home/develop/api/

Note: the Species table currently contains one non-NCBI species ID that we need to keep (-1, Non assigned), as it is used for a special internal purpose.
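
One possible way to honour this during a bulk update is sketched below: only taxon IDs that are missing from the current table are selected for insertion, and the internal -1 entry is never touched. The function name `plan_species_update`, the `INTERNAL_IDS` constant, and the dictionary shapes are hypothetical and only meant to illustrate the rule, not to describe the existing import code.

```python
from typing import Dict, Set, Tuple

# IDs reserved for internal use that an NCBI import must never add or overwrite.
# -1 ("Non assigned") is the only one mentioned in this issue.
INTERNAL_IDS: Set[int] = {-1}


def plan_species_update(
    existing: Dict[int, dict], ncbi: Dict[int, dict]
) -> Tuple[Dict[int, dict], Set[int]]:
    """Return (rows to insert, internal IDs preserved as-is).

    `existing` maps taxon IDs already in the Species table to their rows;
    `ncbi` maps taxon IDs parsed from the NCBI taxonomy to candidate rows.
    """
    to_insert = {
        tax_id: row
        for tax_id, row in ncbi.items()
        if tax_id not in existing and tax_id not in INTERNAL_IDS
    }
    preserved = {tax_id for tax_id in existing if tax_id in INTERNAL_IDS}
    return to_insert, preserved
```

Because the plan only ever adds rows, running it twice against the same NCBI data yields an empty insert set the second time, which should make an on-request update safe to repeat.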

Product Backlog Item Ready Checklist

  • Business value is clearly articulated
  • Item is understood enough by the IT team so it can make an informed decision as to whether it can complete this item
  • Dependencies are identified and no external dependencies would block this item from being completed
  • At the time of the scheduled sprint, the IT team has the appropriate composition to complete this item
  • This item is estimated and small enough to comfortably be completed in one sprint
  • Acceptance criteria are clear and testable
  • Performance criteria, if any, are defined and testable
  • The Scrum team understands how to demonstrate this item at the sprint review

Product Backlog Item Done Checklist

  • Item(s) in increment pass all Acceptance Criteria
  • Code is refactored to best practices and coding standards
  • Documentation is updated as needed
  • Data security has not been compromised (with particular reference to the personal information we hold in GigaDB)
  • No deviation from the team technology stack and software architecture has been introduced
  • The product is in a releasable state (i.e. the increment has not broken anything)
Projects
Status: To Estimate

2 participants