-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Database batch update API #81
Conversation
reccmp/isledecomp/compare/db.py
Outdated
return self | ||
|
||
def __exit__(self, exc_type, exc_value, exc_traceback): | ||
self.commit() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://docs.python.org/3/reference/datamodel.html#object.exit
This will unconditionally commit on exit. Shouldn't it roll back, or cancel the operations when the exception argument is non-null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also remember that there is some helper function and syntax trick involving yield
to efficiently write these scopes. Can't remember from the top of my head, though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also remember that there is some helper function and syntax trick involving
yield
to efficiently write these scopes. Can't remember from the top of my head, though
You're probably thinking about contextlib.contextmanager (or similar functions in that module)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thank you. I added some basic handling to __exit__
.
commit
might be a poor name because the batch operation itself is only loosely atomic. Only the bulk_match
function uses a SQL transaction because it requires two statements.
I just added a SQL transaction to |
I am rather busy at the moment. Not sure if I can review before Sunday. If everything works and @madebr hasn't found anything else, feel free to merge. |
Updating the sqlite db using
executemany
and generator functions is a necessary evil for performance. This change abstracts away the implementation details and provides a much friendlier interface:You can use the batch object with the context manager as shown, or manually call
batch.commit()
.The batch object provides five methods. When you commit, the changes are applied in this order (regardless of what order the methods were called):
insert_orig
andinsert_recomp
will set an entity in the database, but only if there is no existing data. (i.e.INSERT
.)set_orig
andset_recomp
will insert or update data for the entity. (i.e.INSERT or REPLACE
.)match
will combine an orig address with a recomp address if neither is already matched.The other big change is that matching will now combine two entities if both the orig and recomp addresses already exist but neither are matched. Until now, it was only possible to match by setting an unused orig address on an existing recomp entity. This will finally allow us to have inserting data and matching entities as two distinct operations. I have new modularized match functions mostly done in the stash. We can add these incrementally and now finally have some good tests on the matching behavior.
I only used the batch in a few places in
core.py
for now. These new matching functions will use it extensively and we can refactor the existing code that works with thunks, vtordisps, and strings.