This feature adds support for the nvCOMP batch/low-level API, which allows processing multiple chunks in parallel.
The proposed implementation provides an easy way to use this API via the well-known numcodecs Codec API. Using numcodecs also enables seamless integration with libraries, such as zarr, that use numcodecs internally.
Additionally, using the nvCOMP batch API enables interoperability between existing codecs and the nvCOMP batch codec. For example, data can be compressed on the CPU using the default LZ4 codec and then decompressed on the GPU using the proposed nvCOMP batch codec, as sketched below.
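For illustration, here is a minimal sketch of that CPU-to-GPU round trip. It uses the `decode_batch` method introduced just below and assumes, per the interoperability claim above, that the `nvcomp_batch` codec accepts buffers produced by the stock numcodecs `LZ4` codec:

```python
import numcodecs
import numpy as np

# Compress a chunk on the CPU with the default numcodecs LZ4 codec.
cpu_lz4 = numcodecs.LZ4()
chunk = np.random.randn(4, 8).astype(np.float32)
compressed = cpu_lz4.encode(chunk)

# Decompress the same buffer on the GPU with the nvCOMP batch codec.
# Assumes the LZ4 stream produced above is accepted as-is, per the
# interoperability described in this proposal.
gpu_lz4 = numcodecs.registry.get_codec(dict(id="nvcomp_batch", algorithm="lz4"))
(decompressed,) = gpu_lz4.decode_batch([compressed])

# Round-trip check.
np.testing.assert_equal(decompressed.view(np.float32).reshape(4, 8), chunk)
```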
To support batch mode, the Codec interface was extended with two functions, encode_batch and decode_batch.
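A rough skeleton of what the extended interface looks like is shown below. Only the method names come from this proposal; the class name `BatchCodec`, the parameter names, and the optional `out` argument are illustrative assumptions:

```python
from typing import Any, List, Optional

from numcodecs.abc import Codec


class BatchCodec(Codec):
    """Illustrative skeleton only; a real codec also implements encode/decode."""

    codec_id = "nvcomp_batch"

    def encode(self, buf):
        raise NotImplementedError

    def decode(self, buf, out=None):
        raise NotImplementedError

    def encode_batch(self, bufs: List[Any]) -> List[Any]:
        """Compress a batch of chunks, returning one compressed buffer per chunk."""
        raise NotImplementedError

    def decode_batch(self, bufs: List[Any], out: Optional[List[Any]] = None) -> List[Any]:
        """Decompress a batch of chunks, returning one decompressed buffer per chunk."""
        raise NotImplementedError
```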
Note that the current version of zarr does not support chunk-parallel functionality, but there is a proposal for this feature.
Currently the following compression/decompression algorithms are supported:
LZ4
Gdeflate
zstd
Snappy
nvCOMP also supports other algorithms, which can be added to kvikio relatively easily.
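Each supported algorithm is selected through the same registry entry. In the sketch below, "lz4" is the value used in the examples that follow; the other identifier strings are assumed to mirror the nvCOMP algorithm names listed above:

```python
import numcodecs

# Select a batch codec per algorithm; identifiers other than "lz4" are
# assumed to follow the nvCOMP algorithm names.
for algorithm in ("lz4", "gdeflate", "zstd", "snappy"):
    codec = numcodecs.registry.get_codec(dict(id="nvcomp_batch", algorithm=algorithm))
    print(codec)
```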
Examples of usage:
Simple use of the Codec batch API:
```python
import numcodecs
import numpy as np

# Get the codec from the numcodecs registry.
codec = numcodecs.registry.get_codec(dict(id="nvcomp_batch", algorithm="lz4"))

# Create 2 chunks. The chunks do not have to be the same size.
shape = (4, 8)
chunk1, chunk2 = np.random.randn(2, *shape).astype(np.float32)

# Compress data.
data_comp = codec.encode_batch([chunk1, chunk2])

# Decompress.
data_decomp = codec.decode_batch(data_comp)

# Verify.
np.testing.assert_equal(data_decomp[0].view(np.float32).reshape(shape), chunk1)
np.testing.assert_equal(data_decomp[1].view(np.float32).reshape(shape), chunk2)
```
Using with zarr (no parallel chunking yet; see the note above):
```python
import numcodecs
import numpy as np
import zarr

# Get the codec from the numcodecs registry.
codec = numcodecs.registry.get_codec(dict(id="nvcomp_batch", algorithm="lz4"))

shape = (16, 16)
chunks = (8, 8)

# Create data and compress.
data = np.random.randn(*shape).astype(np.float32)
z1 = zarr.array(data, chunks=chunks, compressor=codec)

# Store in compressed format.
zarr_store = zarr.MemoryStore()
zarr.save_array(zarr_store, z1, compressor=codec)

# Read back/decompress.
z2 = zarr.open_array(zarr_store)

np.testing.assert_equal(z1[:], z2[:])
```
If desired, the API can also be used directly, without going through the numcodecs API.
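For instance, something along these lines; the module path and class name `NvCompBatchCodec` are assumptions and may differ in the actual implementation:

```python
import numpy as np

# Hypothetical direct import of the batch codec class, bypassing the
# numcodecs registry; module path and class name are assumptions.
from kvikio.nvcomp_codec import NvCompBatchCodec

codec = NvCompBatchCodec("lz4")
chunks = [np.random.randn(4, 8).astype(np.float32) for _ in range(2)]
compressed = codec.encode_batch(chunks)
decompressed = codec.decode_batch(compressed)
```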