This repository has been archived by the owner on Apr 28, 2021. It is now read-only.

appendBuffer is not standard #8

Open
AxelDelmas opened this issue Feb 23, 2016 · 8 comments

@AxelDelmas

In the appendBuffer call, as of now, we still need to pass the segment's start and end timestamps, and whether it's an initialization segment. See here
This is the main thing that still requires some modifications in the JS player to work with this polyfill.

We need to parse this information from the MP4 segments themselves in order to get rid of this issue.

@AxelDelmas added the bug label on Feb 23, 2016
@kasrinat

We need to reliably parse the timestamp info from the right MP4 boxes. Based on the ISO/IEC specs and guidance from Axel D. and team, I have found the following ways to do this:

  • Detecting the initialization segment:

https://w3c.github.io/media-source/isobmff-byte-stream-format.html
The initialization segment shall have an 'ftyp' box followed by an 'moov' box. No media data is present, meaning there is no 'mdat'.
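The check described above can be sketched by walking the top-level boxes and looking at their four-character types. This is only a minimal sketch: `topLevelBoxTypes` and `isInitializationSegment` are hypothetical helper names (not part of the polyfill), and it assumes the whole segment is available as a `Uint8Array`.

```javascript
// Hypothetical helper: list the types of the top-level boxes in a segment.
function topLevelBoxTypes(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const types = [];
  let offset = 0;
  while (offset + 8 <= bytes.byteLength) {
    let size = view.getUint32(offset); // 32-bit big-endian box size
    const type = String.fromCharCode(...bytes.subarray(offset + 4, offset + 8));
    if (size === 1) {
      // A size of 1 means a 64-bit "largesize" follows the type field.
      if (offset + 16 > bytes.byteLength) break;
      size = Number(view.getBigUint64(offset + 8));
    } else if (size === 0) {
      // A size of 0 means the box extends to the end of the data.
      size = bytes.byteLength - offset;
    }
    if (size < 8) break; // malformed box, stop walking
    types.push(type);
    offset += size;
  }
  return types;
}

// Assumption for this sketch: an initialization segment starts with 'ftyp',
// contains a 'moov' box, and carries no 'mdat'.
function isInitializationSegment(bytes) {
  const types = topLevelBoxTypes(bytes);
  return types[0] === 'ftyp' && types.includes('moov') && !types.includes('mdat');
}
```

A segment whose top-level boxes are 'moof' followed by 'mdat' would return false here, so it would be treated as a media segment.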

  • Getting the start presentation timestamp and the end timestamps:

Here there seem to be multiple ways:
The segment index box ('sidx'), if present (not mandatory), will contain the timescale, starting timestamp, as well as sub-segment durations which will add up to give the end timestamp of the segment we are attempting to append to the MSE buffer.
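As a rough illustration of the 'sidx' route, the sketch below reads the timescale and earliest presentation time, then sums the sub-segment durations to get the end time. `parseSidxTimes` is a hypothetical name, the field layout follows the ISO BMFF definition of the box as I understand it, and the caller is assumed to already know the byte offset of the 'sidx' box.

```javascript
// Hypothetical helper: derive start/end times (in seconds) from a 'sidx' box
// that begins at `boxOffset` inside `bytes` (a Uint8Array).
function parseSidxTimes(bytes, boxOffset = 0) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const type = String.fromCharCode(...bytes.subarray(boxOffset + 4, boxOffset + 8));
  if (type !== 'sidx') throw new Error('not a sidx box');

  let p = boxOffset + 8;
  const version = view.getUint8(p);
  p += 4; // version (1 byte) + flags (3 bytes)
  p += 4; // reference_ID
  const timescale = view.getUint32(p);
  p += 4;

  let earliest;
  if (version === 0) {
    earliest = view.getUint32(p);
    p += 8; // earliest_presentation_time + first_offset, 32 bits each
  } else {
    // Number() loses precision above 2^53; acceptable for a sketch.
    earliest = Number(view.getBigUint64(p));
    p += 16; // both fields are 64 bits in version 1
  }

  p += 2; // reserved
  const referenceCount = view.getUint16(p);
  p += 2;

  let totalDuration = 0;
  for (let i = 0; i < referenceCount; i++) {
    // Each reference is 12 bytes: type/size, subsegment_duration, SAP info.
    totalDuration += view.getUint32(p + 4);
    p += 12;
  }

  return {
    timescale,
    start: earliest / timescale,
    end: (earliest + totalDuration) / timescale,
  };
}
```

The returned `start` and `end` are the two values that currently have to be passed alongside the segment.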

The track fragment decode time box ('tfdt'), if present (not mandatory), gives the sum of the decode durations of all earlier samples. It can be thought of as the base decode timestamp of the first sample in the fragment. It does not seem to be an indicator of the presentation time though.

The movie header box ('mvhd'), a mandatory box, gives the segment timescale and duration info. However, presentation start time is not available here. If the mp4 is of fragmented type, we would have access to the movie fragment header box ('mfhd') and the movie fragment random access box ('mfra'). The 'mfra' box seems to be a reliable way of getting the presentation time of the sync sample for fragmented mp4s.

What is the preferred way of getting this information out? Are there other ways to get these params not listed here? Getting clarity on this will help remove the need to pass in these parameters in the appendBuffer call. Thanks for any pointers in advance.

@AxelDelmas
Author

Got an update from @cconcolato:

  • tfdt is optional in theory, but is always present in DASH, CMF and Smooth. It gives the decode time, not the presentation time so it can cause surprises.
  • trun gives the offset between decode time and presentation time for all frames. It can be used to derive the presentation time from the decode time present in tfdt, by taking the minimum offset.
  • sidx, if present, is a shortcut to get the presentation time directly
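The tfdt plus trun approach from the list above could look roughly like the sketch below: read the base decode time from 'tfdt', take the minimum composition offset from 'trun', and add the two. All function names are hypothetical, only the first 'tfdt' and 'trun' found are used, 64-bit box sizes are not handled, and the result is in timescale units (the timescale itself has to come from elsewhere).

```javascript
// Return the offset of the first box of `type`, descending into moof/traf; -1 if absent.
function findBox(bytes, type, start = 0, end = bytes.byteLength) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  let offset = start;
  while (offset + 8 <= end) {
    const size = view.getUint32(offset);
    const name = String.fromCharCode(...bytes.subarray(offset + 4, offset + 8));
    if (size < 8 || offset + size > end) break; // unsupported or malformed in this sketch
    if (name === type) return offset;
    if (name === 'moof' || name === 'traf') {
      const inner = findBox(bytes, type, offset + 8, offset + size);
      if (inner !== -1) return inner;
    }
    offset += size;
  }
  return -1;
}

// baseMediaDecodeTime is 32 bits in version 0 and 64 bits in version 1.
function readBaseDecodeTime(bytes, offset) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const version = view.getUint8(offset + 8);
  return version === 1 ? Number(view.getBigUint64(offset + 12)) : view.getUint32(offset + 12);
}

// Smallest composition time offset in a trun; 0 when the offsets are not present.
function minCompositionOffset(bytes, offset) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const version = view.getUint8(offset + 8);
  const flags = view.getUint32(offset + 8) & 0xffffff;
  const sampleCount = view.getUint32(offset + 12);
  if (!(flags & 0x000800) || sampleCount === 0) return 0;
  let p = offset + 16;
  if (flags & 0x000001) p += 4; // data_offset
  if (flags & 0x000004) p += 4; // first_sample_flags
  let min = Infinity;
  for (let i = 0; i < sampleCount; i++) {
    if (flags & 0x000100) p += 4; // sample_duration
    if (flags & 0x000200) p += 4; // sample_size
    if (flags & 0x000400) p += 4; // sample_flags
    // The offset is unsigned in version 0 and signed in version 1.
    const cto = version === 0 ? view.getUint32(p) : view.getInt32(p);
    p += 4;
    min = Math.min(min, cto);
  }
  return min;
}

// Presentation start of a media segment, in timescale units; null if boxes are missing.
function presentationStart(bytes) {
  const tfdt = findBox(bytes, 'tfdt');
  const trun = findBox(bytes, 'trun');
  if (tfdt === -1 || trun === -1) return null;
  return readBaseDecodeTime(bytes, tfdt) + minCompositionOffset(bytes, trun);
}
```

When a 'sidx' box is present, reading its earliest presentation time is the simpler route and this computation can be skipped.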

@kasrinat

kasrinat commented Apr 1, 2016

Thanks a lot. I am proceeding in this direction and parsing accordingly. However, the composition time offset in the trun box seems to be optional. The sample-composition-time-offsets-present flag in tr_flags seems to indicate its presence.

I am looking at a segment where the tr_flags value (0x000601), ANDed with the CTS-offset-present flag (0x000800), gives zero, which indicates that the CTS offset is not present in the trun box.
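The flag test described here is a plain bitwise AND. The sketch below decodes a tr_flags value into the names of the fields it announces; `TRUN_FLAGS`, `describeTrunFlags` and `hasCompositionOffsets` are hypothetical names, and the bit values are the ones I understand ISO BMFF to define for trun.

```javascript
// Bit values of the trun tr_flags field (names are illustrative).
const TRUN_FLAGS = {
  dataOffsetPresent: 0x000001,
  firstSampleFlagsPresent: 0x000004,
  sampleDurationPresent: 0x000100,
  sampleSizePresent: 0x000200,
  sampleFlagsPresent: 0x000400,
  sampleCompositionTimeOffsetsPresent: 0x000800,
};

// List the names of all flags that are set in `trFlags`.
function describeTrunFlags(trFlags) {
  return Object.keys(TRUN_FLAGS).filter((name) => (trFlags & TRUN_FLAGS[name]) !== 0);
}

// True when each sample in the trun carries a composition time offset.
function hasCompositionOffsets(trFlags) {
  return (trFlags & TRUN_FLAGS.sampleCompositionTimeOffsetsPresent) !== 0;
}
```

For 0x000601 this reports the data offset, sample size and sample flags as present, and `hasCompositionOffsets(0x000601)` is false.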

When this happens, should we still rely on the decode time in the tfdt box as fallback though not safe? What would be ideal in this situation? Please suggest.

@cconcolato

CTS is indeed optional, if not there CTS is equal to DTS.

@kasrinat

kasrinat commented Apr 7, 2016

Very clear on the CTS offsets inside trun now. Thanks @cconcolato.

Another question is on the timescale used to compute the timestamps. This is typically present inside the sidx or the mvhd boxes. For fragmented MP4s, which have an mfhd instead, getting the timescale for the segment is seemingly an issue.

mdhd is another option to extract the timescale and seems to be listed as a mandatory box. But I have a segment here with no mdhd either. Is this segment valid? Can we assume at the least that there is a place to always retrieve the timescale?

Thanks for your thoughts.

@cconcolato

mdhd must be there otherwise the file is not conformant.
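For reference, mdhd sits under moov/trak/mdia, so it travels with the initialization segment (where moov lives) rather than with each media segment. A minimal sketch of reading the timescale from it follows; `findBoxIn` and `readMdhdTimescale` are hypothetical names, only the first track's mdhd is read, and 64-bit box sizes are not handled.

```javascript
// Return the offset of the first box of `type`, descending into moov/trak/mdia; -1 if absent.
function findBoxIn(bytes, type, start = 0, end = bytes.byteLength) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const containers = ['moov', 'trak', 'mdia'];
  let offset = start;
  while (offset + 8 <= end) {
    const size = view.getUint32(offset);
    const name = String.fromCharCode(...bytes.subarray(offset + 4, offset + 8));
    if (size < 8 || offset + size > end) break; // unsupported or malformed in this sketch
    if (name === type) return offset;
    if (containers.includes(name)) {
      const inner = findBoxIn(bytes, type, offset + 8, offset + size);
      if (inner !== -1) return inner;
    }
    offset += size;
  }
  return -1;
}

// Read the media timescale from the first mdhd box; null if there is none.
function readMdhdTimescale(bytes) {
  const offset = findBoxIn(bytes, 'mdhd');
  if (offset === -1) return null;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const version = view.getUint8(offset + 8);
  // Skip the 8-byte box header and 4 bytes of version/flags, then the creation
  // and modification times (4 bytes each in version 0, 8 bytes each in version 1).
  const timescaleOffset = offset + 12 + (version === 1 ? 16 : 8);
  return view.getUint32(timescaleOffset);
}
```

A player could cache this value when the initialization segment is appended and reuse it to convert the tfdt and trun values of later media segments into seconds.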

@kasrinat

Thank you for all the clarification.

@kasrinat

Another scenario is where byte ranges are accepted. Here there is no separate file per segment. Instead, the media segments are part of one long byte stream and are addressed by byte-range offsets, as seen in the response header.

We do not see an ftyp or an styp prefix in these segment headers. Instead, following a random 4-byte start, the moof atom is encountered. Here, I am assuming that such a segment is a media segment and proceeding to look for either the track boxes or the mdhd box in order to fetch the timestamps.
