-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
construct huffman table from symbol spans #97
Conversation
c3f3a71
to
b8a3e0d
Compare
Codecov Report
@@ Coverage Diff @@
## master #97 +/- ##
==========================================
+ Coverage 58.98% 61.12% +2.14%
==========================================
Files 12 14 +2
Lines 885 939 +54
==========================================
+ Hits 522 574 +52
- Misses 363 365 +2
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
457b6d0
to
9440b84
Compare
3cb36c5
to
8bac466
Compare
69122c0
to
c70f795
Compare
3a66b30
to
fb781a1
Compare
6a09bb7
to
ec2ce26
Compare
fb781a1
to
3079c1d
Compare
ec2ce26
to
efcadda
Compare
I made a couple of very minor changes and pushed. We can discuss the other stuff next time we meet if you want. |
/// An inclusive span of symbols | ||
/// | ||
template <symbol S> | ||
class symbol_span : public std::ranges::view_interface<symbol_span<S>> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the difference between this and ranges::views::iota
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It pretty much is the same as ranges::iota_view
(ranges::views::iota
is a function that creates an iota_view
).
It just has a slightly different constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, the only difference being end is inclusive rather than exclusive?
If so seems like a lot of code for that, but OK.
|
||
namespace starflate::huffman::detail { | ||
|
||
template <std::ranges::view V> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we instead add a little complexity to the table constructor and avoid this class?
Could we have a new tag for a constructor to differentiate the constructor that takes symbol-bitsize pairs and the one that takes symbol-range-bitsize pairs?
I'm honestly a bit lost in what's going on here, which doesn't mean it's wrong, but I want to explore if we can make it a bit easier for a noob like me to follow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure let's look into that. I'll give it some thought tomorrow or this weekend.
At a high level, we get a range of ranges:
[
(a, b, c), // subrange 1
(d, e), // subrange 2
(f, g, h), // subrange 3
(i), // subrange 4
]
and we want to flatten to:
[a, b, c, d, e, f, g, h, i]
This view provides a lazy flatten so we don't need to actually construct the flattened range of all the subranges.
Note that the subranges (symbol_span
s) don't actually store all the elements but are instead are generators.
We're essentially doing this: https://docs.python.org/3/library/itertools.html#itertools.chain
for a collection of symbol_span
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had some thoughts about this and I'll try to write it out tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, table
constructors take a range and potentially a tag, and pass that range to the table_storage's constructor. The table_storage's constructor is responsible for taking that range and filling up the internal container with "valid enough" codes and symbols. Once the table_storage's constructor is complete, no elements can be added or removed, as members such as resize and emplace_back are not exposed upwards. After table_storage's constructor is complete, we get to the function body of table's constructor which is responsible for building the Huffman tree, setting skip fields, and then canonicalizing the codes. table_storage is responsible for adding elements, table is responsible for moving them around.
I was thinking about passing the symbol-span+count range down to the table_storage constructor and doing nested for loops. If we're able to reuse existing constructors, then we can focus on reviewing the correctness of the range adapter code, which admittedly requires understanding the range abstractions. But I think this will be pretty useful given that this library is focusing on handling streams of data and we'll likely run into other cases where a range abstraction is useful, even after switching to execution on the GPU. I am very much in the no raw loops camp.
I've added more documentation to the flattened_symbol_bitsize_view
class template. I've also removed the to_code_symbol
function and instead do the equivalent in flattened_symbol_bitsize_view::iterator::operator*()
. Hopefully things are a bit clearer but let me know if there's anything I can explain or we can also try another approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing that threw me off was the use of concepts to figure out whether to use this flattened_symbol_bitsize_view, and I was suggesting using a constructor tag intead. I was not objecting to the existence of this class.
But I guess from a caller's point of view this is nicer, and you've already figured it out so I guess we can keep it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I misunderstood. But yeah, I'll generally try to keep the interface simpler for the caller at the expense of more complexity in the implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is your understanding of the view class? I don't mind going over it if there are things you don't understand.
f596820
to
5360f66
Compare
efcadda
to
811d680
Compare
811d680
to
57ef4a6
Compare
Define class template `symbol_span` that represents an inclusive range from first to last. The `symbol_bitsize` constructor of `huffman::table` now takes a range of pair-likes of `symbol_span` and bitsize, allowin gthe same bitsize to be associated with a contiguous range of symbols. For example, the static Huffman tree corresponding to the literal/length alphabet in section 3.2.6 of the DEFLATE RFC can be defined as: constexpr auto table = huffman::table<std::uint16_t, 288>{ huffman::symbol_bitsize, {{{ 0, 143}, 8}, {{144, 255}, 9}, {{256, 279}, 7}, {{280, 287}, 8}}}; resolves #91 Change-Id: I50f7ea63561de0bb785c85467cdb943829b65edf Co-Authored-By: Gary Miguel <[email protected]>
57ef4a6
to
ac578ee
Compare
Define class template
symbol_span
that represents an inclusive rangefrom first to last. The
symbol_bitsize
constructor ofhuffman::table
now takes a range of pair-likes of
symbol_span
and bitsize, allowingthe same bitsize to be associated with a contiguous range of symbols.
For example, the static Huffman tree corresponding to the literal/length
alphabet in section 3.2.6 of the DEFLATE RFC can be defined as:
constexpr auto table =
huffman::table<std::uint16_t, 288>{
huffman::symbol_bitsize,
{{{ 0, 143}, 8},
{{144, 255}, 9},
{{256, 279}, 7},
{{280, 287}, 8}}};
resolves #91
Change-Id: I50f7ea63561de0bb785c85467cdb943829b65edf
Co-Authored-By: Gary Miguel [email protected]