-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
avcc param sets with emulation prevention bytes don't parse #3
Comments
Seems I've also left some My preference is to have It's a shame that the |
This is an ugly fix to get things working. Look into RbspDecoder refactor to make these cases simpler in future. As reported by Scott Lamb in #3
Commit 7ef5c74 is a bit unsatisfactory, but gets this case working; I'll probably not have time in the short term to work out how to refactor RbspDecoder to support this better. Sorry, it's a bit hairy! I'd be happy to re-do these interfaces -- any suggestions would be welcome! |
Thank you!
Did you mean to fix that in 7ef5c74? It's still there.
fwiw, the "simple cases" are all my software sees right now. It currently gets all its H.264 data from ffmpeg, which always gives it a complete "extra data" (sps + pps) or frame. So my own, not-terribly-robust NAL splitting code is super easy to use—give it some data, it calls a closure once per NALU, no dealing with state between calls. So I haven't thought through the other cases nearly as much as you. I want to replace the usage of ffmpeg here with a (to-be-determined/written) Rust rtsp crate, which may behave differently. So I may end up caring about the complicated cases as well. But I haven't given it much thought yet. Here are some ideas off the top of my head: I don't think it's a problem RAM-wise to buffer a whole NALU at once. I assume the concern is CPU? How much does the naive implementation that always copies to accumulate a whole NALU in a (reused) If that's too expensive, how often does a single NALU get split into two push calls in practice? Maybe the mux/"outer" parsers (RTP / .mp4 / MPEG-TS / Annex-B) could have a single method that pushes some bytes with an end flag. If the whole thing is in one call, it can pass the buffer through as-is; otherwise it can copy to an internal buffer. The RBSP layer could always expect a complete NAL. Perhaps (if the outer parser's buffer stores only one NALU), it could take a mutable buffer and shift over all the non-"emulation prevention" bytes, sort of like Vec::retain does. I also wonder if when doing this with network protocols and tokio and such, it might be better to use the bytes crate rather than plain Speaking of the bytes crate, one of its selling points is to "facilitate zero-copy network programming by allowing multiple Bytes objects to point to the same underlying memory." So maybe my "if the outer parser's buffer stores only one NALU" caveat could be removed by using it as a convenient way of starting with some subset of the buffer, then shrinking it a bit as the emulation prevention bytes are removed. |
Now removed the moonfire-nvr looks really cool, I will be taking a look! My aim with the design (maybe not 100% realised yet!) was to avoid another portion of memory write+read workload at more-or-less the source video bitrate. It's true that, depending on the size of the 'lump' that need to be accumulated, the writes+reads might not get all the way through the cache hierarchy down to RAM, but wanting to avoid cpu 'back-end stalls' (I think that's the right term?) was the motivator. As you already know, those costs aren't a deal-breaker for lots of workloads. I confess I've not really measured the result yet for this crate, but this is mirroring the approach in mpeg2ts-reader, which avoids accumulating the full PES-packets for the caller, and instead hands on the little fragments of PES packet from each TS-packet payload. Benchmark results vs ffmpeg appear good (but you should never trust benchmarks!). It all pales into insignificance if the video actually has to be decoded of course! 😱 I sometimes use this code with higher-bitrate streams (40-50Mbps is common; 100Mbps is sometimes suggested). My hope is that when the app is just proxying / extracting metadata, the network will be the limiting factor, rather than CPU / cache / RAM. So in the end, the complexity arises from my desire to parse the header from It would be nice for this crate to support both a simple interface and a complex 'partial-input' one, with a common implementation. Not sure how yet. I have looked briefly at |
I've raised #4 to track the desire to support better APIs for some of these things. |
Existing SeqParameterSetNalHandler / PicParameterSetNalHandler types, when used together with RbspDecoder, already have the functionality of the decode() function, so we can kill that function and just use the existing types. Revision to the fix for #3
FWIW, I revised the fix for this problem in 2ce960b, since there was actually already a way mechanism to perform the required decode, but I'd forgotten about it! |
In the new AVCC module (thank you! super helpful), an SPS with an emulation prevention doesn't parse. e.g., this panics:
I'm not sure how you'd like this to work. Is
ParamSetIter
supposed to yield an encoded NAL unit or RBSP? If the former,Avcc::create_context
should pass it through the RBSP decoder before callingSeqParameterSet::from_bytes
. If the latter,ParamSetIter
should pass it through the RBSP (and I suppose needs to return aVec
orCow<[u8]>
or some such).The text was updated successfully, but these errors were encountered: