pkg/nixpath/chunker: add #81
chunkerOpts.NormalSize = 64 * 2024
chunkerOpts.MinSize = chunkerOpts.NormalSize / 4
chunkerOpts.MaxSize = chunkerOpts.NormalSize * 4
I just used the same values here that I initially used in nix-casync. They should probably still be refined once we can ingest a bit of data.
@@ -6,6 +6,7 @@ require (
github.com/alecthomas/kong v0.5.0
github.com/dgraph-io/badger/v3 v3.2103.2
github.com/google/go-cmp v0.5.5
github.com/poolpOrg/go-fastcdc v0.0.0-20211130135149-aa8a1e8a10db
Will you be able to re-use that and stay casync-compatible, or is that a new system altogether?
It's a new chunking mechanism, with a much smaller interface. You pass in a reader to some data and can use an iterator interface to get chunks while reading through the data.
nix-casync used desync, which provided a lot of functionality that we didn't use (.caidx). Also, the way it was designed required us to first write the data to be chunked to a (temporary) file.
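To illustrate the "pass in a reader, iterate over chunks" shape described above, here is a minimal sketch. It is not the go-fastcdc API and not the code in this PR: the `Chunker` interface, `NewFixedChunker`, and `chunkSizes` are hypothetical names, and fixed-size splitting stands in for content-defined chunking to keep the example short. The point is only that chunks are produced while streaming from an `io.Reader`, with no temporary file involved.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// Chunker is a hypothetical iterator-style interface: each call to Next
// returns the next chunk, and io.EOF once the input is exhausted.
type Chunker interface {
	Next() ([]byte, error)
}

// fixedChunker cuts the stream into fixed-size pieces. This is a stand-in
// for a content-defined chunker, used only to show the interface.
type fixedChunker struct {
	r    io.Reader
	size int
}

// NewFixedChunker wraps a reader; nothing is read until Next is called.
func NewFixedChunker(r io.Reader, size int) (Chunker, error) {
	if size <= 0 {
		return nil, errors.New("chunk size must be positive")
	}
	return &fixedChunker{r: r, size: size}, nil
}

func (c *fixedChunker) Next() ([]byte, error) {
	buf := make([]byte, c.size)
	n, err := io.ReadFull(c.r, buf)
	// A short final read is still a valid (smaller) last chunk.
	if err == nil || errors.Is(err, io.ErrUnexpectedEOF) {
		return buf[:n], nil
	}
	return nil, err // io.EOF when no bytes are left, or a real read error
}

// chunkSizes drains a chunker over data and returns the size of each chunk.
func chunkSizes(data []byte, size int) ([]int, error) {
	c, err := NewFixedChunker(bytes.NewReader(data), size)
	if err != nil {
		return nil, err
	}
	var sizes []int
	for {
		chunk, err := c.Next()
		if errors.Is(err, io.EOF) {
			return sizes, nil
		}
		if err != nil {
			return nil, err
		}
		sizes = append(sizes, len(chunk))
	}
}

func main() {
	sizes, err := chunkSizes([]byte("abcdefghij"), 4)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sizes) // prints [4 4 2]
}
```

Because the chunker only holds a reader, the caller can feed it a network stream or a NAR being unpacked just as easily as an in-memory buffer.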
The chunking method used to chunk up data shouldn't matter when it comes to substitution. However, using the same chunking method with similar parameters should yield more block reuse.
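The effect of parameters on block reuse can be sketched as follows. This is a simplified gear-hash chunker written for illustration, not FastCDC proper and not the implementation in this PR; `Params`, `paramsFor`, `split`, `reuse`, and `demo` are hypothetical names, and the sizes are chosen to exaggerate the difference. The `Min`/`Max` derivation (a quarter and four times the normal size) follows the options quoted earlier in this thread.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/rand"
)

// Params is a hypothetical options type shaped like the chunker options
// discussed in this thread.
type Params struct{ Min, Normal, Max int }

// paramsFor derives Min and Max from the normal size. Normal must be a
// power of two here, because it is turned into a bit mask below.
func paramsFor(normal int) Params {
	return Params{Min: normal / 4, Normal: normal, Max: normal * 4}
}

// gear is a fixed pseudo-random table; any two parties that want matching
// chunks must use the same table.
var gear = func() (t [256]uint64) {
	rng := rand.New(rand.NewSource(1))
	for i := range t {
		t[i] = rng.Uint64()
	}
	return
}()

// split cuts data where the rolling hash matches the mask, never producing
// a chunk longer than Max, and none shorter than Min except the last one.
func split(data []byte, p Params) [][]byte {
	mask := uint64(p.Normal - 1)
	var chunks [][]byte
	var h uint64
	start := 0
	for i, b := range data {
		h = (h << 1) + gear[b]
		size := i - start + 1
		if (size >= p.Min && h&mask == 0) || size >= p.Max {
			chunks = append(chunks, data[start:i+1])
			start, h = i+1, 0
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:])
	}
	return chunks
}

// reuse counts how many chunks of b are already present (by SHA-256) in a.
func reuse(a, b [][]byte) int {
	seen := make(map[[32]byte]bool)
	for _, c := range a {
		seen[sha256.Sum256(c)] = true
	}
	n := 0
	for _, c := range b {
		if seen[sha256.Sum256(c)] {
			n++
		}
	}
	return n
}

// demo chunks the same data once as the "stored" copy and then again with
// identical and with very different parameters.
func demo() (total, same, different int) {
	data := make([]byte, 256*1024)
	rand.New(rand.NewSource(42)).Read(data)
	stored := split(data, paramsFor(1024))
	total = len(stored)
	same = reuse(stored, split(data, paramsFor(1024)))
	different = reuse(stored, split(data, paramsFor(32*1024)))
	return
}

func main() {
	total, same, different := demo()
	fmt.Printf("stored=%d reused(same params)=%d reused(different params)=%d\n",
		total, same, different)
}
```

With identical parameters every chunk is found again. With the much larger normal size, almost nothing can match, because every non-final chunk is at least 8 KiB while no stored chunk exceeds 4 KiB. Substitution still works either way; only deduplication suffers.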
This provides two different implementations to chunk data.
I marked this as a draft. I'm not entirely sure it belongs in [...] Also, right now, [...]
Closing in favor of #86.
This provides two different implementations to chunk data. I'm not entirely sure if this should go into pkg/nixpath/chunker, or in another place.