Support large files #31

Open

jreadey opened this issue Aug 27, 2015 · 3 comments

Comments

@jreadey
Member

jreadey commented Aug 27, 2015

h5tojson.py and jsontoh5.py can't convert files whose size is comparable to the amount of physical memory on the machine the converter is running on.

@jreadey
Member Author

jreadey commented Aug 27, 2015

I'm tagging this as an "enhancement" rather than a bug since it was a known limitation of the design.

It may be worth investigating an alternative JSON parser such as ijson: https://pypi.python.org/pypi/ijson/.
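
For reference, ijson exposes an event-based streaming parser, so a large JSON file can be walked without loading the whole document into memory. A minimal sketch (the file name and key handling are illustrative, not tied to the hdf5-json layout):

```python
import ijson

# Walk a large JSON document as a stream of parse events instead of
# loading it whole with json.load(); memory use stays roughly constant.
with open("big_file.json", "rb") as f:
    for prefix, event, value in ijson.parse(f):
        # prefix is the dotted path to the current element; a "map_key"
        # event fires for every object key as it is encountered.
        if event == "map_key":
            print(prefix, value)
```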

Would it make more sense to tackle this using a native-C implementation of the conversion tools?

@ccoulombe

Any work towards this?

@jreadey
Member Author

jreadey commented Nov 27, 2019

Sort of... In HSDS we use what is basically the hdf5-json schema for metadata, but chunk data is stored as blobs. See https://github.com/HDFGroup/hsds/blob/master/docs/design/obj_store_schema/obj_store_schema_v2.md for a description. This works pretty well - we've used it for "files" as large as 50 TB. "Files" is in quotes since what you get at the end is a large collection of files in a tree structure.

This was done to support the HDF service, but the same approach could be used outside the server.
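
As a rough illustration of that split outside the server, here is a hypothetical sketch (the file layout, names, and row-block chunking are made up for this example and are not the HSDS object-store schema; it assumes a dataset with at least one dimension): dataset metadata goes into a small JSON document while the array data is streamed out to separate binary blobs.

```python
import json
import os

import h5py


def export_dataset(h5path, dset_name, out_dir, block_rows=1024):
    """Write dataset metadata as JSON and its raw data as separate blob files."""
    os.makedirs(out_dir, exist_ok=True)
    meta = {"name": dset_name, "chunks": []}
    with h5py.File(h5path, "r") as f:
        dset = f[dset_name]
        meta["shape"] = list(dset.shape)
        meta["dtype"] = str(dset.dtype)
        # Stream the dataset in row blocks so memory use stays bounded.
        for start in range(0, dset.shape[0], block_rows):
            stop = min(start + block_rows, dset.shape[0])
            blob = f"chunk_{start}_{stop}.bin"
            dset[start:stop].tofile(os.path.join(out_dir, blob))
            meta["chunks"].append({"rows": [start, stop], "blob": blob})
    # The JSON document stays small because it only points at the blobs.
    with open(os.path.join(out_dir, "dataset.json"), "w") as out:
        json.dump(meta, out, indent=2)
```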

What type of problem are you looking to solve?
