Playback positions and audio device timing information. #279

Closed
ishitatsuyuki opened this issue Jun 7, 2019 · 16 comments

Comments

@ishitatsuyuki
Collaborator

We should expose timing information because it's useful for A/V sync, and simply because most APIs provide it.

(It's also useful because audio clocks are not precisely the same as CPU clocks, and because underruns can alter the timing in ways applications should be able to detect.)

@Wojtechnology

Is there any way to count the samples being played on an output stream, which I guess would map directly to the timing information? I'd be interested in working on this feature, since I need it for a music visualizer I'm building (I need A/V sync for the visualizations). What kind of API did you have in mind?

@ishitatsuyuki
Collaborator Author

I don’t have a concrete idea yet, but I think GStreamer’s timer APIs can be used as a good reference.

@Wojtechnology

I'll take a look at it and try implementing something similar here. I'm pretty new to Rust, so a review once I'm done would be greatly appreciated.

@ishitatsuyuki
Collaborator Author

Just a heads up that #301 is a large change to the interface; you may need to do a big rebase later.

If you have any questions about the code, don't hesitate to ask here!

@HadrienG2

In general, it would be very useful to have a way to correlate samples with time, and to be able to tell (or at least estimate) when input samples were recorded by the ADCs, when the audio callback was called, and when output samples will be played by the DACs, against a common clock.

This allows syncing input streams (and output streams) with each other, and playing latency compensation tricks in interactive applications such as MIDI-controlled virtual instruments.

@Wojtechnology

After taking a look at #301, I think I will wait for that work to be complete before I start working on this. I'm not sure if this would work on all platforms, but it seems to me that whenever a particular API produces/requests some additional samples, we increment a new Clock object that we'll create. We can return this clock from the cpal API, and it can be used to check the number of samples that have been produced/requested.

One question I have is how accurate this clock would be. My assumption is that the best we can do is increment the clock every time the underlying API produces/requests samples, which depends on how many samples it produces/requests at once (e.g. at 44100 Hz with a buffer size of 64 samples, we would update the clock roughly every 1.5 ms). Depending on the use case, this could be fast enough. I have three questions regarding this:

  1. Roughly how many samples do most APIs provide/request at a time (16/64/256)?
  2. Do they provide some other way of getting more granular signals of when samples are produced/played?
  3. Does cpal produce/request samples at the same rate as the underlying API, or does it wait longer (i.e. until the underlying API has requested/provided more samples)?

@HadrienG2 Do you have an idea of how we would make this work? Input and output streams (in the new non-blocking API) seem quite separate. The GStreamer API does this through pipelines, but I'm not sure cpal in its current form will support that. Perhaps you could build something on top of whatever interface we provide for this ticket?

@ishitatsuyuki In general, does my approach make sense?
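A rough sketch of the sample-counter clock described above (all names are hypothetical, not part of cpal; the resolution is one callback's worth of frames):

use std::sync::{
    atomic::{AtomicU64, Ordering},
    Arc,
};

/// Counts how many frames the backend has produced/requested so far.
struct SampleClock {
    frames: AtomicU64,
    sample_rate: u32,
}

impl SampleClock {
    fn new(sample_rate: u32) -> Arc<Self> {
        Arc::new(Self { frames: AtomicU64::new(0), sample_rate })
    }

    /// Called from the audio callback with the number of frames just handled.
    fn advance(&self, frames: u64) {
        self.frames.fetch_add(frames, Ordering::Relaxed);
    }

    /// Stream position in seconds, accurate to one buffer of frames.
    fn seconds(&self) -> f64 {
        self.frames.load(Ordering::Relaxed) as f64 / self.sample_rate as f64
    }
}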

@ishitatsuyuki
Collaborator Author

First, I want to note that sample buffering and playback positions should be handled separately - callbacks suffer from scheduler drift, for example, and obviously not all APIs provide very small buffer sizes.

The strategy would basically be to ask the underlying API for timestamps directly (snd_pcm_delay in ALSA), or, if it doesn't provide them, to calculate them from the system clock and sync with the audio clock periodically. Additional measures might be needed to reduce synchronization overhead.
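A minimal sketch of the arithmetic behind that strategy (the helper name and the delay_frames parameter are illustrative; delay_frames stands in for whatever the backend reports, e.g. the value snd_pcm_delay returns on ALSA):

use std::time::{Duration, Instant};

/// Estimate when audio written right now will reach the DAC, given the
/// backend-reported delay in frames and the stream's sample rate.
fn estimate_playback_time(now: Instant, delay_frames: u64, sample_rate: u32) -> Instant {
    now + Duration::from_secs_f64(delay_frames as f64 / sample_rate as f64)
}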

@HadrienG2

HadrienG2 commented Sep 9, 2019

@Wojtechnology One API which I think does audio timing pretty well, and can be used as inspiration, is JACK. It provides two facilities that are both very useful: clock and transport.

Regarding clock:

  • JACK provides a monotonic clock, shared by all audio streams on the audio interface(s) that it's currently managing. It combines the system clock, the audio device clock, and a sample-based clock to provide both frame- and microsecond-based timings of the audio stream.
  • JACK tries its best to get a complete view of latency by (1) querying ALSA's latency metadata and (2) asking client applications which "pass through" audio to tell it how much latency they add to their input audio.
  • During the processing callback, audio threads can query what the monotonic clock was when the beginning and end of an input buffer were recorded (or what it will be when an output buffer will start and stop playing back), along with an estimate of the current monotonic clock (computed from the above, based on known delays and how much system time has elapsed since the start of audio processing).

Regarding transport:

  • JACK provides a mechanism to transport and synchronize A/V playback information across applications, so that e.g. applications can start/stop playing back media at the same time, handle rewind/fast-forward together, and stay in sync with each other.
  • To do this, it propagates both playback controls and current playback position.
  • Playback position can be communicated in various ways. By default, JACK uses its usual microseconds + audio frames approach. For MIDI applications, it can also propagate beats-based playback position, and for A/V applications, it can also propagate SMPTE-ish frame-based video timecodes.

As @ishitatsuyuki pointed out, the two concepts shouldn't necessarily be associated with the same API/callback. In JACK's case, they aren't.


When trying to map these concepts into CPAL, two issues may appear:

  • Not every supported audio API may be as obsessed about getting audio timings right and provide as much timing information as JACK.
  • JACK's design (one device, common audio callback for all input and outputs) can be difficult to reconcile with CPAL's design (multiple devices, no guarantee about when the audio callback will be invoked).

Personally, I think a minimal integration could be:

  • The audio processing callback allows querying JACK-like buf-start/buf-stop/current clock information for each audio stream (or this information is directly provided as a parameter to the callback). If the audio API does not provide this information, CPAL can either feed estimates (based on e.g. the number of received frames) or return an empty optional (a rough sketch follows below).
  • There is a new callback in the event loop, which is in charge of propagating transport controls and playback position information. It is optionally registered during device setup, and the registration only succeeds if the audio backend supports it.
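A rough sketch of what the first point could look like (all names here are hypothetical, not cpal's actual API):

use std::time::Duration;

/// Per-stream timing information that could be handed to the audio callback.
struct StreamTiming {
    /// When the first frame of this buffer was captured (or will be played), if known.
    buffer_start: Option<Duration>,
    /// When the last frame of this buffer was captured (or will be played), if known.
    buffer_end: Option<Duration>,
    /// Best estimate of the shared monotonic clock at the time the callback runs.
    now: Duration,
}

// The data callback would then receive it alongside the samples, e.g.:
// fn callback(buffer: &mut [f32], timing: &StreamTiming) { /* ... */ }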

@mitchmindtree changed the title from "Playback positions" to "Playback positions and audio device timing information." on Sep 11, 2019
@mitchmindtree
Member

I'm going to close this now that #397 has landed and #408 is open.

There was some discussion related to transport APIs in this issue, but I think discussion of the scope of such an API is best done in another issue (my intuition is that it could be built downstream on top of the new timestamp API anyway).

@TonalidadeHidrica

I'm new to this library but very interested in its features. In particular, I really need to be able to retrieve the playback position. I haven't read all of the related issues yet, but let me ask a question: can library users already get playback position information, or is that yet to be implemented?

@TonalidadeHidrica

Okay, I've read all the related issues and understood. The playback time is provided in the second argument of the callback functions, as Input/OutputCallbackInfo::timestamp(). The "coordinate system" of the provided timestamp differs depending on the platform and backend. To sync accurately with another system, it's essential to be able to obtain the current time in that same "coordinate system", but such a function is not provided yet.
Sorry for disturbing you all.
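For reference, reading these timestamps inside an output callback looks roughly like this with the API from #397 (exact types and signatures may differ between cpal versions):

fn on_output<T: cpal::Sample>(data: &mut [T], info: &cpal::OutputCallbackInfo) {
    let ts = info.timestamp();
    // `ts.callback` is when the callback fired; `ts.playback` is when the first
    // frame of `data` is expected to reach the output.
    if let Some(latency) = ts.playback.duration_since(&ts.callback) {
        // Estimated output latency for this buffer (printing is for illustration
        // only; avoid I/O in a real audio callback).
        eprintln!("estimated output latency: {:?}", latency);
    }
    let _ = data;
}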

@Hasenn

Hasenn commented Mar 15, 2021

I'm trying to use cpal to make a simple synth, and getting the time associated with sample n in the buffer my output callback receives is very difficult. The only way I could find is to use an unsafe mutable static, write the sample rate to it before anything happens, and then read it in the callback, which doesn't seem nice.
It would be great if OutputCallbackInfo had a method to get the sample rate; with that and the timestamp I can get pretty good estimates of the time for each sample.

@TonalidadeHidrica

@Hasenn It is actually possible to obtain and report the playback position, based on how many samples have been pulled from the Decoder (sample iterator), without unsafe Rust, by using a Mutex instead. See my code for how I handled that. If you have further questions, feel free to ask me.
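A minimal, self-contained sketch of that approach (the callback is simulated here; none of these names come from cpal or the linked code):

use std::sync::{Arc, Mutex};

fn main() {
    let sample_rate = 44_100u32;
    // Frames pulled from the decoder so far, shared with the audio callback.
    let position = Arc::new(Mutex::new(0u64));

    // The closure handed to cpal's build_output_stream would capture this clone
    // and bump it by the number of frames written on each callback.
    let position_in_callback = Arc::clone(&position);
    let fake_callback = move |frames_written: u64| {
        *position_in_callback.lock().unwrap() += frames_written;
    };
    fake_callback(512);

    // Any other thread can read the position and convert it to seconds.
    let seconds = *position.lock().unwrap() as f64 / sample_rate as f64;
    println!("playback position: {seconds:.4} s");
}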

@Hasenn

Hasenn commented Mar 15, 2021

@TonalidadeHidrica I resorted to just putting it in a const, but even then the timestamps I get are StreamInstants, and everything I try to do with them is private: I can't create one, can't get it as nanoseconds, can't do anything apart from adding and subtracting.

use std::time::Duration;
use cpal::Sample;

const SAMPLE_RATE: u64 = 44_100;
const NANOSECONDS_IN_SECOND: u64 = 1_000_000_000;

fn run<T: Sample>(data: &mut [T], info: &cpal::OutputCallbackInfo) {
    let buf_start = info.timestamp().playback;
    for (i, _sample) in data.iter_mut().enumerate() {
        // Multiply before dividing so the integer division doesn't truncate to zero.
        let offset = Duration::from_nanos(i as u64 * NANOSECONDS_IN_SECOND / SAMPLE_RATE);
        // This gets me the time as a StreamInstant... which is completely unusable
        // without black magic to cast it into a duration.
        let _t = buf_start.add(offset).unwrap();
    }
}

Granted, I could keep state and use duration_since relative to the first one I get, but I don't really care about the exact time origin since I'm just writing a sine wave.

@Ralith
Contributor

Ralith commented Mar 15, 2021

use an unsafe mutable static, write the sample rate before anything happens, and then read it in the callback

Why wouldn't you just close over the sample rate when constructing the callback?
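Something like the following (a sketch against the 0.13-era API; newer cpal versions add a timeout argument to build_output_stream; this assumes an f32 output format and elides proper error handling):

use cpal::traits::{DeviceTrait, HostTrait, StreamTrait};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let host = cpal::default_host();
    let device = host.default_output_device().ok_or("no output device")?;
    let config = device.default_output_config()?;
    let sample_rate = config.sample_rate().0 as f32;

    // The closure moves `sample_rate` (and any other state) in, so no mutable
    // static is needed.
    let mut phase = 0.0f32;
    let stream = device.build_output_stream(
        &config.into(),
        move |data: &mut [f32], _info: &cpal::OutputCallbackInfo| {
            for sample in data.iter_mut() {
                phase = (phase + 440.0 / sample_rate) % 1.0;
                *sample = (phase * std::f32::consts::TAU).sin() * 0.1;
            }
        },
        |err| eprintln!("stream error: {err}"),
    )?;
    stream.play()?;
    std::thread::sleep(std::time::Duration::from_secs(1));
    Ok(())
}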

@Hasenn

Hasenn commented Mar 19, 2021

Thanks for your answers, I ended up closing over it.
