Optional data after WebVTT file signature isn't respected #110

Open
nakkamarra opened this issue Aug 12, 2024 · 2 comments

Comments

@nakkamarra
Contributor

When I read a WebVTT file using the ReadFromWebVTT function, do some work, and then write the captions back out with WriteToWebVTT, the optional text after the WebVTT file signature is dropped.

Example:

    captions, err := astisub.ReadFromWebVTT(reader) // reader here represents file
    if err != nil {
        panic(err)
    }
    // ... do some work here to captions
    if err := captions.WriteToWebVTT(w); err != nil { // w represents output
        panic(err)
    }

file.vtt:

WEBVTT - Some optional comment here

1
00:00:00.500 --> 00:00:02.000
The Web is always changing

2
00:00:02.500 --> 00:00:04.300
and the way we access it is changing

output:

WEBVTT

1
00:00:00.500 --> 00:00:02.000
The Web is always changing

2
00:00:02.500 --> 00:00:04.300
and the way we access it is changing

This isn't a huge deal, since it doesn't seem to cause issues with parsing. But I would expect the text to be preserved, as it's technically valid according to the spec:

A WebVTT file body consists of the following components, in the following order:

  1. An optional U+FEFF BYTE ORDER MARK (BOM) character.
  2. The string "WEBVTT".
  3. Optionally, either a U+0020 SPACE character or a U+0009 CHARACTER TABULATION (tab) character followed by any number of characters that are not U+000A LINE FEED (LF) or U+000D CARRIAGE RETURN (CR) characters.
  4. Exactly one WebVTT line terminator to terminate the line with the file magic and separate it from the rest of the body.
  5. Zero or more WebVTT metadata headers.
  6. One or more WebVTT line terminators to terminate the header block and separate the cues from the file header.
  7. Zero or more WebVTT cues and WebVTT comments separated from each other by one or more WebVTT line terminators.
  8. Zero or more WebVTT line terminators.
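
To make the quoted grammar concrete, here is a minimal sketch in Go of how the first line could be split into the "WEBVTT" magic and the optional trailing text. The function name parseSignature is hypothetical and is not part of astisub; the sketch only covers the BOM, the magic string, and the optional space-or-tab-prefixed text on that line.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSignature is a hypothetical helper (not an astisub function).
// It takes the first line of a WebVTT file, without its line terminator,
// and returns the optional text that follows the "WEBVTT" magic.
// ok is false when the line is not a valid signature line.
func parseSignature(line string) (text string, ok bool) {
	// An optional byte order mark may precede the magic string.
	line = strings.TrimPrefix(line, "\uFEFF")

	if !strings.HasPrefix(line, "WEBVTT") {
		return "", false
	}
	rest := line[len("WEBVTT"):]

	// Nothing after the magic: a bare signature line.
	if rest == "" {
		return "", true
	}

	// Anything after the magic must start with a space or a tab.
	if rest[0] != ' ' && rest[0] != '\t' {
		return "", false
	}
	return rest[1:], true
}

func main() {
	text, ok := parseSignature("WEBVTT - Some optional comment here")
	fmt.Printf("%q %v\n", text, ok)
}
```

With the file.vtt example from this issue, the returned text would be "- Some optional comment here", which is the part currently lost on output.
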
@asticode
Owner

You're right, for this to work we would need to add a Comments []string attribute to the Subtitles struct, set it properly when reading the webvtt and write it properly when writing the webvtt.

I'm not planning on adding this anytime soon, but I'm welcoming PRs (for which I can obviously offer guidance) 👍

@nakkamarra
Contributor Author

Hey, thanks for the response @asticode. Sure, I'll give it a shot and put up a PR.
