Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API overhaul #26

Merged
merged 9 commits into from
Dec 22, 2021
Merged

API overhaul #26

merged 9 commits into from
Dec 22, 2021

Conversation

scottlamb
Copy link
Collaborator

@scottlamb scottlamb commented Jun 11, 2021

for #4

I think this is a lot more straightforward to use. It's also faster. Looking at the benchmarks, throughputs in GiB/s:

                                  |---commit---|
                                  orig api  rbsp

parse_annexb/onepush_null         8.8  8.5  9.2
parse_annexb/chunksize1440_null   8.0  7.8  8.2
parse_annexb/chunksize184_null    6.2  6.0  6.3

parse_annexb/onepush_rbsp         4.6  4.8  8.3
parse_annexb/chunksize1440_rbsp   4.4  4.4  4.2
parse_annexb/chunksize184_rbsp    4.0  4.0  6.6

parse_annexb/onepush_parse        3.6  3.6  6.2
parse_annexb/chunksize1440_parse  5.3  5.3  5.9
parse_annexb/chunksize184_parse   4.1  4.1  4.6

commits:
orig: 744582e
api:  6db563f
rbsp: cdc1da0

BitRead::read_unary1 is faster than count_zeros.
Add test for overflow.
* make the bit reader type take a BufRead rather than a slice so we
  don't have to keep a buffered copy of the RBSP.
* reduce "stuttering" by taking the module name out of the struct name.
* use a trait so there's less type bounds to deal with in callers.
* take a name in all BitReader operations. This will improve error
  messages and trace logs/println debugging.
My goal is to establish a good baseline for performance impact of
upcoming API changes.

* include a complete usage of NalSwitch, RbspDecoder, and NAL parsers.
  The slice header parse will stop decoding RBSP after it's gotten
  a full slice header.
* parse in one push, 184-byte pushes (like MPEG-TS), and 1440-byte
  pushes (~typical for RTP). RTP doesn't even use Annex B, but it
  needs RTSP decoding and NAL parsing.
I haven't removed RbspDecoder or adjusted decode_nal yet, and the code
is a little ugly as a result. Seems to work though.
It now more closely matches NalAccumulator's interface. Getting ready
to plug that in.
No more need to deal with a user context, Box<RefCell<>>, or separate
traits for the various handlers. In the simplest case, just a closure
will do.

This is essentially performance-neutral by itself. It allows RBSP
parsing to be lazy though, and after the next commit that will pretty
significantly speed up the case where slice NALs are processed in a
single push.
* remove RbspDecoder in favor of ByteReader

* return ErrorKind::InvalidData on illegal byte sequences.
  This interface is straightforward now with the std::io interface.
  I held off until removing the RbspDecoder interface because
  doing it there was ugly.

* re-implement decode_nal on top of ByteReader.
  behavior change: it now strips the NAL header byte, which I think
  makes it easier to use. Also replace the unit test with a doctest
  to better explain what it does.

* don't look as far ahead for the next zero byte. This speeds things
  up in general but particularly the case where a push has a full slice
  NAL and we're only interested in the header.
* return error rather than panic for unimplemented B frame
* set limits for things passed to Vec::with_capacity
@scottlamb scottlamb changed the title DRAFT: push parser helper to accumulate NALs API overhaul Jun 23, 2021
@scottlamb scottlamb marked this pull request as ready for review June 23, 2021 22:38
@dholroyd dholroyd merged commit 6778a7f into dholroyd:master Dec 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants