Silly question about ZODB #73

cristianocca · 2017-06-07T20:51:50Z

Reading the docs seems this library is made to work to ZODB, however, I'm trying to use this simply as an in memory BTree to store some ordered data and be able to efficiently query it after, since it's implemented in C (or the critical parts at least) it seems to yield slightly better results than using a simple python list with a bisect algorithm, not to mention a much cleaner syntax for range searches.

Am I using this wrong since it's optimized around ZODB? Should I consider some other implementation?

jamadden · 2017-06-07T20:55:01Z

BTrees work fine in-memory without being persisted to ZODB (but note that you typically can't pickle them). One place where they're especially useful is if you're storing integers or floating point keys or values, since those are unwrapped to the C data type and as such use much less memory than Python objects.

Your application may have different trade-offs than those that persist BTrees, so you might want to check the node sizes and possibly adust them.

cristianocca · 2017-06-07T21:06:00Z

I'm curious about the node sizes and if I should use different values if I know the tree will never be persisted. Does it mean that for an Object-Object tree, the max amount of values to be set on the tree is 30*250? Seems quite low.

Basically I'm loading some possibly large sorted list of objects indexed by a datetime value that needs to be queried thousands of times by a date range (reason it can't hit the database every time) and this BTree implementation seems to fit the job perfectly.

jamadden · 2017-06-07T21:11:56Z

Does it mean that for an Object-Object tree, the max amount of values to be set on the tree is 30*250?

IIRC the node sizes given are the maximum before the node splits into another level, thus impacting the height of the tree. Nodes split automatically when needed and AFAIK there is no artificial maximum number of keys one tree can hold.

possibly large sorted list of objects indexed by a datetime value

My app has a similar index. We turn the datetimes into their floating point timestamps so we can use an optimized tree (actually, we turn them into an integer representation of their floating point timestamp so we can normalize and use an even more optimized tree; this is a common approach, although it may not directly work for you).

cristianocca · 2017-06-07T21:20:11Z

I see, I guess I will need to test various values although I think it shouldn't really matter.

Using float or ints rather than datetimes is a good idea, the overhead of converting from datetime might be too much and negate any improvements since it's more about speed than actual memory usage, but I will give it a try, thanks for the help!

jamadden · 2017-06-07T21:25:49Z

I think it shouldn't really matter.

Probably not much in your case, it's more about persistence, but there is a performance component to it (e.g., larger buckets mean less pointer chasing means better cache locality, but maybe more searches)?

Using float or ints rather than datetimes is a good idea, the overhead of converting from datetime might be too much and negate any improvements since it's more about speed than actual memory usage, but I will give it a try

Right, as in all things optimization: profile first! (Actually, profile third!)

I'm going to go ahead and close this now.

jamadden closed this as completed Jun 7, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Silly question about ZODB #73

Silly question about ZODB #73

cristianocca commented Jun 7, 2017

jamadden commented Jun 7, 2017 •

edited

Loading

cristianocca commented Jun 7, 2017

jamadden commented Jun 7, 2017

cristianocca commented Jun 7, 2017

jamadden commented Jun 7, 2017

Silly question about ZODB #73

Silly question about ZODB #73

Comments

cristianocca commented Jun 7, 2017

jamadden commented Jun 7, 2017 • edited Loading

cristianocca commented Jun 7, 2017

jamadden commented Jun 7, 2017

cristianocca commented Jun 7, 2017

jamadden commented Jun 7, 2017

jamadden commented Jun 7, 2017 •

edited

Loading