Using Python 3.7's shelve with the default dbm, I run into the same size limitation noted at http://jamesls.com/semidbm-a-pure-python-dbm.html (notably "HASH: Out of overflow pages. Increase page size") on a Mac. Having installed gdbm, it still doesn't appear with my Conda Pythons.
semidbm came to the rescue via the code snippet below. The class and function are lifted directly from Python's shelve.py. I see no speed difference, but I do see an ability to scale to more objects than dbm allowed. gdbm should have provided a similar solution, but on my Anaconda distribution I can't get it to work (for reference, import dbm.gnu raises ModuleNotFoundError: No module named '_gdbm').
Thank you for this package! I hope the snippet below helps others who use shelve on a large dataset.
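For anyone hitting the same ModuleNotFoundError, a minimal sketch of a check for whether the gdbm backend is importable in a given environment (the has_gdbm name is just illustrative):

```python
# Probe for the gdbm backend; on builds without the _gdbm extension,
# importing dbm.gnu raises ModuleNotFoundError.
try:
    import dbm.gnu
    has_gdbm = True
except ModuleNotFoundError:
    has_gdbm = False

print("gdbm available:", has_gdbm)
```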
from shelve import Shelf
import semidbm

class DbfilenameShelfSemidbm(Shelf):
    """Shelf implementation backed by semidbm instead of the generic dbm interface.
    This is initialized with the filename for the semidbm database.
    See the shelve module's __doc__ string for an overview of the interface.
    """
    def __init__(self, filename, flag='c', protocol=None, writeback=False):
        Shelf.__init__(self, semidbm.open(filename, flag), protocol, writeback)

def open_semidbm(filename, flag='c', protocol=None, writeback=False):
    """Open a persistent dictionary for reading and writing.
    The filename parameter is the base filename for the underlying
    database. As a side-effect, an extension may be added to the
    filename and more than one file may be created. The optional flag
    parameter has the same interpretation as the flag parameter of
    dbm.open(). The optional protocol parameter specifies the
    version of the pickle protocol.
    See the shelve module's __doc__ string for an overview of the interface.
    """
    return DbfilenameShelfSemidbm(filename, flag, protocol, writeback)
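The same Shelf-wrapping pattern works with any dict-style dbm backend. As a self-contained illustration using only the standard library, here is the identical wrapper around dbm.dumb (Python's pure-Python dbm); the class name and database path below are illustrative, not part of the snippet above:

```python
import dbm.dumb
import os
import tempfile
from shelve import Shelf

class DbfilenameShelfDumb(Shelf):
    """Shelf backed by dbm.dumb, the stdlib's pure-Python dbm."""
    def __init__(self, filename, flag='c', protocol=None, writeback=False):
        Shelf.__init__(self, dbm.dumb.open(filename, flag), protocol, writeback)

# Store and reload an arbitrary picklable object.
path = os.path.join(tempfile.mkdtemp(), "demo_db")
with DbfilenameShelfDumb(path) as db:
    db["answer"] = {"value": 42}

with DbfilenameShelfDumb(path) as db:
    print(db["answer"]["value"])  # prints 42
```

Swapping dbm.dumb.open for semidbm.open gives exactly the class in the snippet above; only the backing store changes.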
Timing on a smaller client task (prior to the HASH error above):
dbm inside a default shelve: 1m20
semidbm inside the derived shelve: 1m20 (the same as dbm)