Switch from `encoding` to `encoding_rs`. #9
Actually, I could use the
This proved to be much more complicated than I anticipated but it should work.
289a091 to d46eb0a
Good stuff, this looks good to me. I just had some minor suggestions but nothing worth holding up a merge over.
// If the output is full, we must reallocate.
(DecoderResult::OutputFull, bytes_read) => {
    total_bytes_read += bytes_read;
    output.reserve(input.len() / 10);
I'd lean towards removing the `/ 10` here since we're just reserving. It should lead to fewer reallocs overall. Another common strategy for really large buffers is to double the `reserve()` size each time we hit `OutputFull`.
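For illustration, the doubling strategy could look roughly like the sketch below. It uses only the standard library: `StepResult`, `decode_step` and `decode_with_doubling` are hypothetical names, and the toy UTF-16 decoder merely stands in for the real decoder in this PR. It is not the `encoding_rs` API.

```rust
/// Outcome of one decoding step, loosely mirroring the `OutputFull` arm in
/// the diff. Hypothetical type, not part of `encoding_rs`.
enum StepResult {
    InputEmpty,
    OutputFull,
}

/// Toy decoder (an assumption for this sketch): appends UTF-16 code units to
/// `out` as UTF-8 until the spare capacity runs out. Returns the result and
/// the number of code units consumed.
fn decode_step(input: &[u16], out: &mut String) -> (StepResult, usize) {
    let mut consumed = 0;
    for decoded in char::decode_utf16(input.iter().copied()) {
        // Unpaired surrogates become U+FFFD in this sketch.
        let ch = decoded.unwrap_or(char::REPLACEMENT_CHARACTER);
        if out.capacity() - out.len() < ch.len_utf8() {
            return (StepResult::OutputFull, consumed);
        }
        out.push(ch);
        consumed += ch.len_utf16();
    }
    (StepResult::InputEmpty, consumed)
}

/// Decodes `input`, doubling the requested reservation on every `OutputFull`.
/// Returns the decoded text and how many times the buffer had to grow.
fn decode_with_doubling(input: &[u16]) -> (String, usize) {
    let mut out = String::with_capacity(input.len());
    // Start with at least 4 bytes so the next character always fits.
    let mut grow_by = input.len().max(4);
    let mut total = 0;
    let mut grows = 0;
    loop {
        let (result, consumed) = decode_step(&input[total..], &mut out);
        total += consumed;
        match result {
            StepResult::InputEmpty => return (out, grows),
            StepResult::OutputFull => {
                out.reserve(grow_by);
                grow_by *= 2; // double the request for the next round
                grows += 1;
            }
        }
    }
}
```

Doubling keeps the number of `OutputFull` round trips logarithmic in the final output size, at the cost of possibly over-reserving on the last step.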
The output is already reserved to the size of the input. Would it be common to have instances of inputs which, when converted to utf8, are more than 1.1x their size?
This definitely needs a `max` though. If the input size is less than 10 bytes that's a reserve of 0.
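A minimal sketch of that guard, assuming a hypothetical floor `MIN_EXTRA` and helper name `extra_capacity` (neither appears in the PR, and the value 16 is purely illustrative):

```rust
/// Hypothetical floor for the extra reservation; the value is illustrative.
const MIN_EXTRA: usize = 16;

/// Extra capacity to reserve when the output buffer fills up.
fn extra_capacity(input_len: usize) -> usize {
    // Integer division truncates, so `input_len / 10` is 0 for inputs
    // shorter than 10 bytes; `max` keeps the reservation non-zero.
    (input_len / 10).max(MIN_EXTRA)
}
```

With this, `output.reserve(extra_capacity(input.len()))` always asks for at least `MIN_EXTRA` bytes, so a short input cannot end up reserving nothing.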
Ah, that's true. Most of the time we'd probably expect something encoded in utf32 to become smaller when converted to utf8, so the `/ 10` makes sense, especially with a short comment to explain the magic number. `/ 4` would give a safe buffer for some pathological cases, but in light of this is probably overkill. I wonder whether this code path is ever hit in the wild.
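As a rough check on the size question, the standard-library sketch below compares byte counts of the same text in UTF-8, UTF-16 and UTF-32 (byte order marks ignored). `encoded_sizes` is a hypothetical helper, not something from the PR. Since a code point takes at most 4 bytes in UTF-8, UTF-32 input never grows when converted; UTF-16 input can grow by up to 1.5x, because code points from U+0800 to U+FFFF take 2 bytes in UTF-16 but 3 in UTF-8.

```rust
/// Size in bytes of `text` in UTF-8, UTF-16 and UTF-32 (no BOM).
fn encoded_sizes(text: &str) -> (usize, usize, usize) {
    let utf8 = text.len(); // `str` is stored as UTF-8
    let utf16 = text.encode_utf16().count() * 2; // 2 bytes per code unit
    let utf32 = text.chars().count() * 4; // 4 bytes per code point
    (utf8, utf16, utf32)
}
```

For example, `encoded_sizes("日本語")` gives 9 bytes of UTF-8 against 6 bytes of UTF-16, so CJK-heavy UTF-16 input would probably be the likeliest way to reach the `OutputFull` path.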
Added a comment, thanks!
> I wonder whether this code path is ever hit in the wild.
I know nearly nothing about encodings, but I'm curious 🤔
src/yaml.rs
"Invalid character sequence at {byte_idx}: {malformed_sequence:?}",
))));
}
YAMLDecodingTrap::Call(f) => {
sug: rename `f` to `fun` or `func`.
See rustsec/advisory-db#1605.
With these changes, we seem to retain the `encoding` functionalities we had before, aside from the `Call` variant.
I am not well versed in that side of parsing. Would this require more thorough testing?
cc @mkmik
Fixes #8