Added db.select_from_TABLE methods #1828
Conversation
I was going to not add the …
@Nick-Hall and other reviewers, I've been thinking for a very long time about how to add a select-style method to Gramps while keeping a Python interface. This PR represents the best that I can come up with. Note that the syntax for the … The strings are parsed by Python into an Abstract Syntax Tree (AST) that is then used to generate the SQL syntax. I wrote the Evaluator with different DB engines in mind, in case they use different syntax for JSON extraction, etc. The code is fairly minimal and low in complexity, to make it easy to maintain and extend. Let me know if you have concerns or ideas for improvement.
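(Not code from this PR: a minimal sketch, assuming a json_data column holding the JSON row and single-level attributes only, of how a Python comparison string can be parsed with the ast module and turned into an SQLite-style json_extract clause.)

```python
import ast

def to_sql(expr: str) -> str:
    """Convert a Python comparison like "person.gender == 1" into SQL."""
    tree = ast.parse(expr, mode="eval")
    return walk(tree.body)

def walk(node):
    if isinstance(node, ast.Compare):
        ops = {ast.Eq: "=", ast.NotEq: "!=", ast.Lt: "<", ast.Gt: ">"}
        left = walk(node.left)
        right = walk(node.comparators[0])
        return f"{left} {ops[type(node.ops[0])]} {right}"
    if isinstance(node, ast.Attribute):
        # person.gender -> extract the field from the stored JSON row;
        # other DB engines could emit different extraction syntax here.
        return f"json_extract(json_data, '$.{node.attr}')"
    if isinstance(node, ast.Constant):
        return repr(node.value)
    raise ValueError(f"unsupported expression: {ast.dump(node)}")

print(to_sql("person.gender == 1"))
# -> json_extract(json_data, '$.gender') = 1
```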
@DavidMStraub this could serve as a replacement in gramps-web for both gramps-ql and object-ql, since it is converted into SQL. (It doesn't yet allow everything that the others do, though.)
One thing that I realize this doesn't respect is filters and proxies. But I think that can be fixed. Some options: …
Other ideas?
@Nick-Hall, actually, I'm realizing that we have a bigger issue: if you have a proxy/filter in place, then you might not be able to access all of the items in the JSON data. That means that person_data.family_list != person_object.family_list if a family does not appear in the filter/proxy. It could be that if we have a filter or proxy, we must force the DataDict to generate the object through methods like …
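(A hypothetical illustration of the mismatch, assuming a proxy database that hides some families; the get_raw_* / get_*_from_handle access style follows the existing API, but the scenario itself is illustrative.)

```python
# Illustrative only: with a proxy/filter in place, the raw JSON data
# still lists every family handle, while the proxy-built object may
# have some removed, so the two views of family_list can disagree.
person_data = db.get_raw_person_data(handle)             # unfiltered JSON row
person_object = proxy_db.get_person_from_handle(handle)  # filtered object

print(person_data["family_list"])   # all families
print(person_object.family_list)    # possibly fewer families
```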
If I may ... I think it's really great that so much refactoring and improvement is happening, but I find it a bit strange that so many things are merged so quickly without (sorry, at least my impression) considering all the implications (I was triggered by the example with proxies and filters), while at the same time my simple PR, which does nothing but enable static type checking, has been open for half a year. Static type checking would make the refactoring less dangerous.
@DavidMStraub, nothing has been merged yet that has any effect on the implications I have raised above. The implication is for the things being considered for merging. It would be great if we had more developers (like yourself) who would be able to comment on such implications. So, no, things aren't being merged "too quickly" and without thinking about consequences. Working on what is next gives us insight into complex issues. So no need to get triggered by such a realization. Regarding type checking: yes, I would have merged that PR many months ago because I am very familiar with the benefits of typing, and realize there are no downsides. But also, the implication above is the realization that a "type" (e.g., …) In any event, we need to refactor this PR, and the filter refactor PR. And probably adjust the …
One of your optimisations is to keep the data in a …
@dsblank This PR reminds me of the db.collection.find method in MongoDB. It may be worth a quick look if you are unfamiliar with it. You may get some ideas. I like how you have made the query pythonic. This is better than previous SQL-like designs and the JSON queries of MongoDB.

@DavidMStraub We seem to have been discussing this on and off for about 7 or 8 years now, so I don't think that the progress is too fast. There have also been a couple of prototypes. The static type checking PR makes changes to 51 files. I tend to leave this type of change until fairly close to release in order to avoid potential conflicts when merging up fixes from the maintenance branch. Also, the smaller changes tend to be easier to fit in when I have time available. Your PR is on my schedule, though.

@stevenyoungs Yes. Proxies are mainly used in the report and export code. I don't mind if these are not optimised to use the new code, but we must make sure that they don't run significantly slower than at present. Some people already have to wait a long time for certain reports to run. I don't regard this PR as essential for the next release, but it may be worth continuing to investigate our options.
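(For readers unfamiliar with it, a small pymongo sketch of the db.collection.find style mentioned above; the database and field names here are hypothetical.)

```python
from pymongo import MongoClient

client = MongoClient()
db = client["genealogy"]  # hypothetical database

# MongoDB expresses the query as a JSON-style document rather than a
# Python expression string; 1 stands in for a "male" gender code here.
for doc in db.person.find({"gender": 1}):
    print(doc["handle"])
```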
#1839 will allow efficient get_raw_* functions in proxies.
Asked for comments on the gramps-dev mailing list. |
The mailing list thread is called "A DB select method that can be engine-optimized". I'll start the discussion if nobody replies, but I didn't want to influence people by posting my opinions first.
Off topic, but I think this also applies to the "Switch from pickled blobs to JSON data" change (#1786): https://github.com/gramps-project/gramps/pull/1786/files
That would be great.
This PR adds methods designed to be implemented in a low-level DB system, like SQL. The human-facing code is all Python, and gets parsed into SQL. All of the code that is converted into SQL is written as strings. This allows coders to write in the same syntax that is supported by the DataDict interface (minus the object-creation variation). For example, you could select all of the male people with:
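(Reconstructed sketch: the original snippet was lost in extraction, so the exact call signature is an assumption based on the description.)

```python
# Reconstructed example -- the exact signature in the PR may differ.
from gramps.gen.lib import Person

for person in db.select_from_person("person.gender == Person.MALE"):
    print(person.gramps_id)
```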
(Person is defined in the environment in which the strings are evaluated.)
By default, the method returns a Gramps object per row, but you can optionally select one attribute ("person.handle") or a list of attributes (["person.handle", "person.gramps_id"]) using the what parameter. All arguments are optional.
Further Examples:
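(The original examples were also lost in extraction; the sketches below are reconstructions based on the description above. Only the "what" parameter name comes from the text; the rest of the signature is an assumption.)

```python
# Hedged sketches of the kinds of calls the description implies.

# Select a single attribute per row instead of a full object:
handles = db.select_from_person(
    "person.gender == Person.MALE",
    what="person.handle",
)

# Select a list of attributes per row:
rows = db.select_from_person(
    "person.gender == Person.MALE",
    what=["person.handle", "person.gramps_id"],
)

# All arguments are optional; with none, select every person:
everyone = db.select_from_person()
```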