Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support decompressing fixed huffman blocks #124
base: master
Are you sure you want to change the base?
support decompressing fixed huffman blocks #124
Changes from all commits
6943db7
6983463
f5faa04
58f3d2e
c4416b7
b29aaf3
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
Check warning on line 38 in huffman/src/decode.hpp
Codecov / codecov/patch
huffman/src/decode.hpp#L27-L38
Check warning on line 74 in huffman/src/decode.hpp
Codecov / codecov/patch
huffman/src/decode.hpp#L74
Check warning on line 126 in src/decompress.cpp
Codecov / codecov/patch
src/decompress.cpp#L125-L126
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like you can just return instead of construction + assignment of
len
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check warning on line 153 in src/decompress.cpp
Codecov / codecov/patch
src/decompress.cpp#L152-L153
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if
src_bits
doesn't have enough bits to decode a symbol - that would be "success" and this function should exit?If
decode_one
is changed to returnexpected<>
, maybe something like this?If
src_bits
doesn't have enough bits, you do have have a partially constructed code and iterator in the code table. You could potentially keep those in order to skip some computation when more bits arrive - although it's certainly simpler to just restart.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you thinking of a chunked / streaming input?
If not, I don't understand why not having enough bits would be success?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I'm thinking of the chunked/streaming case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably worth adding a TODO to handle later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks better to me.
I think this could still benefit from breaking out smaller functions - although if you think that's too much work, feel free to ignore.
where
convert_error_status: ParseLitOrLenStatus -> optional<DecompressStatus>
return {in_place, e == EndOfBlock ? Success : InvalidLitOrLen};
and
decode_lit_or_len_into
uses the poor version of pattern matching available in C++:and
although that's a bit of a simplification - you'll also need to capture
dst_written
.The different cases are broken out into different functions, although there's some logic inversion with
variant::visit
.Check warning on line 161 in src/decompress.cpp
Codecov / codecov/patch
src/decompress.cpp#L161
Check warning on line 167 in src/decompress.cpp
Codecov / codecov/patch
src/decompress.cpp#L166-L167
Check warning on line 177 in src/decompress.cpp
Codecov / codecov/patch
src/decompress.cpp#L176-L177
Check warning on line 181 in src/decompress.cpp
Codecov / codecov/patch
src/decompress.cpp#L180-L181
Check warning on line 188 in src/decompress.cpp
Codecov / codecov/patch
src/decompress.cpp#L187-L188
Check warning on line 191 in src/decompress.cpp
Codecov / codecov/patch
src/decompress.cpp#L190-L191
Check warning on line 196 in src/decompress.cpp
Codecov / codecov/patch
src/decompress.cpp#L196
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC this comment is irrelevant unless we drpo the prefix of dst every time as suggested below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
views::drop
has a specialization forspan
which can be used to trim off the fronthttps://en.cppreference.com/w/cpp/ranges/drop_view
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we advance dst, my guess would be it's UB to do
dst.begin() - distance
. I need to go backwards. If you have a better suggestion LMK.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be UB if the pointer goes off the front of the original buffer.
Maybe it would make sense to create a
destination_buffer
type that handles this logic and catches potential UB.As for how to define the destination buffer, that could depend on how you want to handle growth.
Would it act like a
std::vector
where it allocates a new buffer and copies elements over? In this case, maybedestination_buffer
stores a pointer and 2 integers.Or would it maintain a list of chunks so growth doesn't require copying elements from the old buffer to a new buffer? I'm not sure which approach is better, but it would probably help to abstract this part so the rest of the decompression implementation is independent.
Check warning on line 258 in src/decompress.cpp
Codecov / codecov/patch
src/decompress.cpp#L257-L258