Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ext/draft/scratch-memory: Specify num_concurrent_buffers #436

Open
wants to merge 1 commit into
base: next
Choose a base branch
from

Conversation

p--b
Copy link

@p--b p--b commented Dec 3, 2024

This commit adds a new parameter to the reserve() method of the draft scratch-memory extension which allows a plugin to indicate to the host whether its thread-pool tasks need to access scratch memory buffers or not, and if so, how many of them the host must accommodate.

This commit adds a new parameter to the `reserve()` method of the
draft `scratch-memory` extension which allows a plugin to indicate to
the host whether its `thread-pool` tasks need to access scratch memory
buffers or not, and if so, how many of them the host must accommodate.
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Peter Bridgman seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@p--b p--b mentioned this pull request Dec 3, 2024
@p--b
Copy link
Author

p--b commented Dec 3, 2024

I have signed the CLA but the bot doesn't seem to be happy - not sure if there's a step I'm missing?

@abique
Copy link
Contributor

abique commented Dec 4, 2024

Thank you for the PR.
I'm traveling at the moment. I'll try to review soon.
Regarding the CLA, I've no idea what's happening.

@abique
Copy link
Contributor

abique commented Dec 4, 2024

Maybe you can try to visit this link again?
https://cla-assistant.io/free-audio/clap?pullRequest=436

@p--b
Copy link
Author

p--b commented Dec 4, 2024

image

@abique
Copy link
Contributor

abique commented Dec 5, 2024

I find that this additional parameter makes everything more complicated and harder to understand.
I'm worried that it creates variable behavior from one host to another and makes the feature less reliable for the plugin developer.

If a plugin wants concurrent buffers, but doesn't know what its maximum concurrency is, what should it pass for max_concurrent_buffers? INT_MAX?

Here's a proposal to simplify the change:

  • rename this last argument to concurrency_hint
  • if concurrency_hint == 0 then no hint is provided
    • note: even if the plugin doesn't use the thread pool, using the scratch from the process call counts as one concurrent usage
  • concurrency_hint is an optional info that the plugin can provide to the host, in order to let the host optimize its memory allocation.
  • the host can ignore concurrency_hint
  • if the plugin crosses the provided hint, then the behavior is undefined (the host can ignore the hint, so we can't force it to honor a special behavior)

Actually that list above isn't short either... 🤔

@abique
Copy link
Contributor

abique commented Dec 5, 2024

To me forcing every hosts to modify their behavior depending on the concurrency hint is a no.
But then we have undefined behavior when the hint is crossed, so a given plugin will work fine with a host that ignores the hint, and not work with a host that uses it. Which I'm not very happy about.

@abique
Copy link
Contributor

abique commented Dec 5, 2024

@jatinchowdhury18 what's your thought?

@p--b
Copy link
Author

p--b commented Dec 5, 2024

I think it's important to say that as currently specified, this change does not force any host to modify their behaviour. The host is already free to ignore the parameter: the host could still return a thread-local buffer for every access() call on every worker thread and be compliant with this interface. All that the documentation says is that some hosts may return NULL in certain circumstances, not that all hosts must.

I would be happy to change the name of the parameter to concurrency_hint.

If a plugin wants concurrent buffers, but doesn't know what its maximum concurrency is, what should it pass for max_concurrent_buffers? INT_MAX?

I'm not sure that this case makes much sense to me: it's basically saying, "what if a plugin wants to allocate memory, but doesn't know how much memory it wants to allocate?" In the same way as you have to pass a number of bytes to malloc, I don't think it's unreasonable to ask a plugin to specify how much memory the host needs to allocate for it.

This design is as was agreed in #423 and is only intended to make this extension possible to implement in environments where resources are constrained - if we can't know how many scratch buffers we have to allocate ahead of time to satisfy the thread pool requirements, we simply can't implement this extension.

@jatinchowdhury18
Copy link
Contributor

I like concurrency_hint over num_concurrent_buffer, but I think I prefer max_concurrency_hint even more.

Disclaimer: I've never worked on a host that implements thread-pool, so there may be some things here that I don't fully understand. Imagining the host supports thread pool with host_thread_pool_count number of threads, then I can see three possibilities:

  1. max_concurrency_hint == 0: Host allocates one scratch buffer for each thread-pool thread.
  2. max_concurrency_hint >= host_thread_pool_count: Host allocates one scratch buffer for each thread-pool thread.
  3. max_concurrency_hint < host_thread_pool_count: Host allocates scratch buffers for some of the thread-pool threads.

So the concurrency hint would allow the host to save memory, but only in case (3). I guess the ultimate question is: does that potential memory saving justify the added complexity for the plugin and host devs implementing/using this extension?

I'm trying to imagine a scenario where this memory saving would have the most impact:

  • The plugin and host both implement thread-pool
  • The host is memory constrained, but also has a high thread pool count
  • The plugin asks for a large amount of scratch but with a low concurrency hint

To me that scenario seems unlikely. In particular, I can't think of a scenario where the host would be memory constrained but also able to support a high thread count. But I would be very curious to hear if such systems do exist (or could exist in the future)!


On the plugin side, I don't think the added complexity is too bad. The thought process as a plugin dev would be:

  • If I'm not using thread-pool then max_concurrency_hint should always be zero.
  • If I am using thread-pool I should know at activation time the maximum number of thread pool tasks I can request (I understand that thread-pool doesn't require the plugin to know this until processing time, but I can't think of a situation in which it would not be knowable at activation time).
  • If I know the maximum number of thread-pool tasks I will request, then I can set that as the max_concurrency_hint, and it's my responsibility to make sure that I'm doing that correctly (it can't be the host's responsibility, since the host is allowed to ignore the hint).

The only potentially confusing thing is making sure the plugin knows whether the process() thread counts as one of the concurrent tasks. For example, if I'm going to call request_exec(host, 16), and I also need a scratch buffer in the process callback, should max_concurreny_hint be 16 or 17? Based on the current documentation I would guess 17, but we should make sure that's obvious in the documentation.

@abique
Copy link
Contributor

abique commented Dec 10, 2024

Hi,

So here's my aggregated thoughts:

  • OK for max_concurrency_hint.
  • max_concurrency_hint == 0 means no hint.
  • max_concurrency_hint is the maximum number of concurrent threads using the scratch, so the process()'s thread must be counted in, valid values starts at 1, meaning that only one thread at a time will be using the scratch memory.
  • I think that max_concurrency_hint is more theoretical than practical, though I may be wrong so let's try it. @p--b please implement it in your host during the draft period and tell us if it was worth it. If we don't get any positive feedback for this added argument during the draft period we may consider removing it.

I'll take the PR and do the adjustments so I'm happy with the doc/spec.

Does that work for both of you @p--b, @jatinchowdhury18 ?

@jatinchowdhury18
Copy link
Contributor

That sounds good to me! I'm happy to help out with testing/debugging from the plugin side if needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants