Delay speech-synthesis functions #127

Merged: 11 commits merged into WICG:main on Feb 1, 2022

Conversation

@noamr (Collaborator) commented Jan 30, 2022

See https://wicg.github.io/speech-api/#speechrecognition and https://wicg.github.io/speech-api/#tts-section.
Note that that API doesn't currently handle anything to do with document focus, which should be fixed separately.

@domenic (Collaborator) left a comment

Do we think this delay-everything model is the right one? I could see a few possibilities:

  • Get no-op behavior for free based on document focus or user activation. (The speech synthesis spec doesn't seem to require these right now, but maybe implementations do?)
  • Start in the paused state, and auto-resume upon activation.
  • Be a bit smarter than just delaying. E.g., Allow cancel() while prerendering; allow pausing and resuming; allow using speak() to enqueue things; just avoid actually speaking.

For SpeechRecognition it seems more likely that delay-everything is correct, or maybe it should just auto-fail based on user activation/document focus.

Any idea what our implementation does?
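
For concreteness, here is a rough, non-normative sketch of what the delay-everything model amounts to from a page's point of view, written as the manual equivalent a page could do today. It assumes the `document.prerendering` flag and `prerenderingchange` event from the prerendering spec; none of this is proposed normative text.

```js
// Illustrative only: the page-side equivalent of "delay everything".
// While the document is prerendering, the call is deferred and runs once upon activation.
function speakWhenActivated(text) {
  const utterance = new SpeechSynthesisUtterance(text);
  if (document.prerendering) {
    document.addEventListener('prerenderingchange', () => {
      speechSynthesis.speak(utterance);
    }, { once: true });
  } else {
    speechSynthesis.speak(utterance);
  }
}
```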

@noamr (Collaborator, Author) commented Jan 31, 2022

Do we think this delay-everything model is the right one? I could see a few possibilities:

  • Get no-op behavior for free based on document focus or user activation. (The speech synthesis spec doesn't seem to require these right now, but maybe implementations do?)

  • Start in the paused state, and auto-resume upon activation.

  • Be a bit smarter than just delaying. E.g., Allow cancel() while prerendering; allow pausing and resuming; allow using speak() to enqueue things; just avoid actually speaking.

Yeah, I see your point... The problem with all of these suggestions, and the reason I went with something much more basic, is that the speech synthesis spec doesn't say anything about multiple clients. I feel that's a conversation that should start on that spec's GitHub regardless of prerendering; I wasn't sure whether to create a dependency, but maybe that's the right thing to do.

For SpeechRecognition it seems more likely that delay-everything is correct, or maybe it should just auto-fail based on user activation/document focus.

Any idea what our implementation does?

@domenic (Collaborator) commented Jan 31, 2022

Yeah I agree we don't want to take on too large of a dependency here; we're not responsible for solving all the spec tech debt in everything we touch. IMO the right tradeoff here is:

  • Investigate if we have any easy-outs, e.g. user activation or focus requirements that just aren't specced currently. If so, adding those to the speech spec seems like a reasonable amount of tech debt for us to fix while we're here.

  • If we don't have any easy-outs, then just make sure what we spec here either matches the Chromium implementation, or is reasonably easy to implement and we have some agreement to do so. We shouldn't knowingly spec something we don't plan to implement.

@noamr (Collaborator, Author) commented Feb 1, 2022

Yeah I agree we don't want to take on too large of a dependency here; we're not responsible for solving all the spec tech debt in everything we touch. IMO the right tradeoff here is:

  • Investigate if we have any easy-outs, e.g. user activation or focus requirements that just aren't specced currently. If so, adding those to the speech spec seems like a reasonable amount of tech debt for us to fix while we're here.

The activation/focus gate is currently [in discussion](https://github.com/WebAudio/web-speech-api/issues/35), and doesn't work the same across browsers. Firefox requires focus; Chrome uses the autoplay rules, which should result in a not-allowed error when trying to speak before the page is activated; and WebKit requires a user gesture, but only on iOS. When opening a new unfocused tab that uses speech synthesis on desktop, Firefox delays the speech but Chrome/Safari don't. I believe the Firefox implementation is the closest to how I would expect this to behave when prerendering (the page appears normal, but some things only happen once you activate it).
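
As a concrete illustration of the Chrome behaviour described above: as I understand it, the gate surfaces as an error event with code "not-allowed" on the utterance rather than as a thrown exception, so a page might see something like the sketch below. The retry-on-click part is just a hypothetical recovery strategy, not anything either spec requires.

```js
// Sketch: observing the activation/autoplay-style gate on speak() before activation.
const utterance = new SpeechSynthesisUtterance('Hello');
utterance.onerror = (event) => {
  if (event.error === 'not-allowed') {
    // Speaking was blocked; retry after a user gesture (illustrative recovery only).
    document.addEventListener('click', () => speechSynthesis.speak(utterance), { once: true });
  }
};
speechSynthesis.speak(utterance);
```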

  • If we don't have any easy-outs, then just make sure what we spec here either matches the Chromium implementation, or is reasonably easy to implement and we have some agreement to do so. We shouldn't knowingly spec something we don't plan to implement.

Currently the implementation would produce a not-allowed error, I believe. @nyaxt, can you confirm? This is problematic, as prerendering would cause pages with speech synthesis to reach an error branch that they wouldn't reach with regular rendering.

I'm becoming more convinced that the most straightforward solution is a simple DelayWhilePrerendering. If we do that for most less-common web features, and forgo some possible subtleties at least in the first phase, there's a better chance developers will understand the prerendering mechanism and how to work with it; if every feature behaves slightly differently, it would be confusing, especially for a new feature.
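
To make the intended behaviour concrete, a rough sketch of the observable timeline under a blanket delay-while-prerendering rule (non-normative; assumes document.prerendering and prerenderingchange from the prerendering spec):

```js
// Hypothetical timeline with "delay while prerendering":
// speak() in a prerendered document neither speaks nor fires a "not-allowed" error;
// the delayed method steps run only once the document is activated.
const u = new SpeechSynthesisUtterance('Welcome');
u.onerror = (e) => console.log('error:', e.error); // should not fire merely because we are prerendering
u.onstart = () => console.log('started speaking'); // fires only after activation
speechSynthesis.speak(u);

document.addEventListener('prerenderingchange', () => {
  console.log('activated; document.prerendering is now', document.prerendering); // false
}, { once: true });
```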

@domenic (Collaborator) commented Feb 1, 2022

I think I agree with that reasoning. I am slightly concerned [DelayWhilePrerendering] will be harder to implement than it is to spec, but we're already all-in on using it everywhere else, so I hope @nyaxt can agree to it as a general strategy :).

Given the underspecification of all the methods you've decorated, it's hard to review this thoroughly and be sure that delaying them will work as expected. (I.e., it seems like it would require some sort of queue of actions, which is implicit in the speech spec but not explicit.) But this is probably good enough.
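
For what it's worth, a minimal sketch of the kind of action queue this implies, purely illustrative and not part of either spec's text: each delayed call is recorded while prerendering and replayed in order on activation, so speak()/pause()/cancel() keep their relative ordering.

```js
// Minimal sketch of an explicit "queue of actions" for delayed speech calls.
const deferredCalls = [];

function delayWhilePrerendering(fn) {
  if (document.prerendering) {
    deferredCalls.push(fn); // record the call while prerendering
  } else {
    fn();                   // otherwise run it immediately
  }
}

document.addEventListener('prerenderingchange', () => {
  // Flush in FIFO order so the calls keep their original relative ordering.
  while (deferredCalls.length) deferredCalls.shift()();
}, { once: true });

// Usage:
// delayWhilePrerendering(() => speechSynthesis.speak(new SpeechSynthesisUtterance('hi')));
```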

@domenic merged commit 3bc55b3 into WICG:main on Feb 1, 2022