Make a plan for using NewtDB (an extension of ZODB for PostgreSQL) #22

Open
tnigon opened this issue May 6, 2020 · 0 comments
Labels
enhancement New feature or request

Comments

Contributor

tnigon commented May 6, 2020

NewtDB docs

As I understand it, NewtDB is well suited to storing Python objects in a database. After feature selection, tuning, and training, we will want to store the resulting object and retrieve it later to make predictions on new data. Once an object is stored, we can choose to load the entire object or only its data, and which is preferable depends on what we are trying to achieve.
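
A rough sketch of the "whole object vs. just the data" idea, for discussion. It does not use NewtDB or PostgreSQL: it relies only on the standard library, with in-memory SQLite standing in for the database, a pickle column holding the full object, and a JSON column holding its data. `TunedModel`, the `models` table, and the helper names are made up for illustration; NewtDB's actual API and schema should be taken from its docs.

```python
import json
import pickle
import sqlite3


class TunedModel:
    """Hypothetical stand-in for a tuned and trained model object."""

    def __init__(self, params, test_accuracy):
        self.params = params
        self.test_accuracy = test_accuracy

    def predict(self, x):
        # Toy linear prediction so the example stays dependency-free.
        return self.params["slope"] * x + self.params["intercept"]


# In-memory SQLite is an assumption standing in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE models (name TEXT PRIMARY KEY, pickled BLOB, state TEXT)"
)


def save(name, obj):
    """Store the full pickle plus a JSON copy of the object's attributes."""
    conn.execute(
        "INSERT OR REPLACE INTO models VALUES (?, ?, ?)",
        (name, pickle.dumps(obj), json.dumps(vars(obj))),
    )
    conn.commit()


def load_object(name):
    """Rebuild the whole object, methods included, from its pickle."""
    row = conn.execute(
        "SELECT pickled FROM models WHERE name = ?", (name,)
    ).fetchone()
    return pickle.loads(row[0])


def load_data(name):
    """Read only the JSON data; the class does not need to be importable."""
    row = conn.execute(
        "SELECT state FROM models WHERE name = ?", (name,)
    ).fetchone()
    return json.loads(row[0])


save("model_v1", TunedModel({"slope": 2.0, "intercept": 1.0}, test_accuracy=0.9))
model = load_object("model_v1")  # usable for predictions
data = load_data("model_v1")     # plain dict of parameters and metrics
```

Loading the object is what we would need for predictions, while the JSON copy is enough for things like comparing parameters or test accuracy across stored models.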

To Do

  1. #XX - Demonstrate a minimal working example of NewtDB with the Tuning API and a PostgreSQL database. Show how an object can be saved and then loaded back, and also show how to retrieve just the data.
  2. #XX - Develop a plan for how Tuning objects should be stored in PostgreSQL. Things to consider include retuning/retraining with the same data or with new data, easy access to parameters, test accuracy, etc., and how this relates to the way we bring new data stored in a DB into that object (same DB or different DB).
  3. #XX - Determine in which cases it is most appropriate to create an entirely new object, and in which cases it is okay to grab the object and re-run one of its functions (perhaps saving over the object in the DB or saving it as a new object in the DB). In the latter case, maybe it's best to just create a new object with the new data and retrain, retune, etc. On the other hand, we may be able to save a lot of CPU and/or DB storage space by simply updating an object at the training step rather than starting from scratch and redoing the feature selection, tuning, and training.
  4. #XX - When implementing this, be sure objects are "persistent" and that the appropriate coding is used to keep track of updates to the objects. See this ZODB link.
  5. #XX - Use Zope or similar to manage generations of the objects being stored in the DB. As I understand it, this can be used to ensure we use a particular library version when accessing an object from the DB so that there aren't inconsistencies with, e.g., attribute names that may have changed in a later library version.
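
To make item 4 more concrete, here is a toy sketch of why update tracking needs care. ZODB notices attribute assignment on persistent objects but not in-place mutation of an ordinary dict or list held by one. The `TrackedObject` class below only imitates that behaviour with plain Python so it runs without ZODB installed; it is not ZODB's implementation, and `Tuning` is a hypothetical placeholder for our own class.

```python
class TrackedObject:
    """Toy imitation of attribute-level change tracking (not ZODB itself)."""

    def __init__(self):
        object.__setattr__(self, "_p_changed", False)

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        # Any attribute assignment other than the flag itself marks the
        # object as changed, i.e. needing to be written on the next commit.
        if name != "_p_changed":
            object.__setattr__(self, "_p_changed", True)


class Tuning(TrackedObject):
    """Hypothetical tuning object holding parameters in a plain dict."""

    def __init__(self):
        super().__init__()
        self.params = {}
        self._p_changed = False  # pretend the object was just committed


tuning = Tuning()

# In-place mutation of the dict never calls __setattr__ on the object,
# so the change goes unnoticed and would not be saved.
tuning.params["alpha"] = 0.1
missed = tuning._p_changed

# Reassigning the attribute is noticed.
tuning.params = {**tuning.params, "alpha": 0.2}
caught = tuning._p_changed
```

With real ZODB objects the usual remedies are the same in spirit: reassign the attribute, set `_p_changed = True` explicitly after mutating, or hold the data in a persistent container such as `PersistentMapping`.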
@tnigon added the enhancement (New feature or request) label May 6, 2020
@tnigon changed the title from "Make a plan for using NewtDB (an extension of ZODB for PostgreSQL" to "Make a plan for using NewtDB (an extension of ZODB for PostgreSQL)" May 6, 2020