
Add thread manager crate to agave #3890

Merged: 20 commits from thread_manager into anza-xyz:master on Jan 22, 2025

Conversation


@alexpyattaev commented on Dec 3, 2024

Problem

A variety of perf issues stem from unorganized thread pools that spawn far more threads than are useful on a given machine. This was confirmed by measurements (see the examples in the thread-manager/examples directory) and is relatively easy to fix in agave.

The idea is to eventually address the needs of

Summary of Changes

Added a new agave-thread-manager crate, which will be gradually hooked into agave itself. It will be used to centralize thread pool creation so that core allocations can be controlled, and the total thread count can gradually be reduced to match the core count as closely as possible. The crate comes with its own benchmarks and some tests. It has full functionality only on Linux, as this is the only platform where it is relevant.
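To make the centralization idea concrete, here is a minimal sketch of how a registry of named pools could be consulted by subsystems instead of each spawning its own threads. All names and types here (`ThreadManager`, `NativeThreadPool`, `get_native`) are hypothetical illustrations, not the crate's actual API.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Stand-in for a real managed pool (e.g. a rayon or tokio pool).
pub struct NativeThreadPool {
    pub max_threads: usize,
}

// Central registry: subsystems ask for a pool by purpose name
// rather than spawning their own threads.
pub struct ThreadManager {
    pub pools: HashMap<String, Arc<NativeThreadPool>>,
}

impl ThreadManager {
    // Look up a shared pool handle by its purpose name.
    pub fn get_native(&self, name: &str) -> Option<Arc<NativeThreadPool>> {
        self.pools.get(name).cloned()
    }
}

fn main() {
    let mut pools = HashMap::new();
    pools.insert(
        "solGossipConsume".to_string(),
        Arc::new(NativeThreadPool { max_threads: 8 }),
    );
    let manager = ThreadManager { pools };
    let pool = manager
        .get_native("solGossipConsume")
        .expect("pool registered");
    println!("solGossipConsume runs on {} threads", pool.max_threads);
}
```

Because handles are shared (`Arc`), several subsystems can hold the same pool, which is how the total thread count can be driven down toward the core count.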

@alexpyattaev force-pushed the thread_manager branch 3 times, most recently from da038b3 to 309350b on December 8, 2024
@alexpyattaev marked this pull request as ready for review on December 8, 2024
```diff
@@ -48,11 +48,12 @@ fn main() -> anyhow::Result<()> {
     println!("Running {exp}");
     let mut conffile = PathBuf::from(env!("CARGO_MANIFEST_DIR"));
     conffile.push(exp);
     let conffile = std::fs::File::open(conffile)?;
```


Typically, we don't specify configurations in files. We do it in code instead for the example/test cases, or with command-line flags for other cases. Sometimes we use yml (for accounts information, for example).


I got you, you want to read thread-manager/examples/core_contention_contending_set.toml with it.
Yeah, could you maybe have a method that reads this file and is used in both the example and the production environment in the future?
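A shared loader along the lines of this suggestion might look like the sketch below. The helper name and the file path are hypothetical; parsing is left out since the config format is still undecided in this thread.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical shared helper: both the example binaries and a future
// production path could load the config text through one function
// before parsing it (with toml or whatever format is chosen).
fn load_config_text(path: &Path) -> io::Result<String> {
    fs::read_to_string(path)
}

fn main() -> io::Result<()> {
    // For demonstration, write a tiny config and read it back; in the
    // crate this would instead point at a file under
    // thread-manager/examples/.
    fs::write("demo_config.toml", "num_workers = 4\n")?;
    let text = load_config_text(Path::new("demo_config.toml"))?;
    println!("{} bytes of config", text.len());
    Ok(())
}
```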

@alexpyattaev (author)


This file will never be read in prod; it's just for the examples, which are also benchmarks/tests. I'm not even sure we will use toml in prod for this yet, though it is likely.

@KirillLykov

I think you need to explain the proposed changes to a broader audience, in context. Maybe by setting up a meeting with the involved parties.

@alexpyattaev (author)

Yes, a meeting will be necessary eventually, but I also need the code to be reasonably good before a million people look at it =)

@alexpyattaev force-pushed the thread_manager branch 2 times, most recently from 2c6a9dd to 4e1d728 on December 17, 2024
@alexpyattaev requested a review from bw-solana on January 8, 2025
@alexpyattaev force-pushed the thread_manager branch 2 times, most recently from 895fd15 to e00029a on January 8, 2025
@bw-solana left a comment


I like the high-level idea. Left a few nits and some questions.
I need to better understand how you envision the string --> string --> runtime mapping working before I can weigh in on it.

## Scheduling policy and priority
If you want, you can set the thread scheduling policy and priority. Keep in mind that this will likely require
```bash
sudo setcap cap_sys_nice+ep
```


In the past we've had tuning scripts that could handle recommended configurations like this. If it makes sense, we could add this wherever relevant.

@alexpyattaev (author)


I'm not sure about the priority stuff at this point. It is super platform-specific and mostly a placeholder at the moment, as getting this past CI will be unfun (due to the extra privileges necessary).

```rust
#[derive(Default, Debug)]
pub struct ThreadManagerInner {
    pub tokio_runtimes: HashMap<ConstString, TokioRuntime>,
    pub tokio_runtime_mapping: HashMap<ConstString, ConstString>,
```


Is the idea here that we essentially have a mapping like so:
service --> tokio name --> tokio runtime?
And service --> tokio name could be many-to-1, but tokio name --> tokio runtime is 1-to-1?

@alexpyattaev (author)


Yes, exactly! I believe we will eventually have maybe 1-2 tokio runtimes + 2-3 rayons, and mappings to those.

@bw-solana

Cool! I think it makes sense. Maybe we can get rid of the intermediate mapping and go straight from service to runtime (I believe you mentioned this elsewhere).

@alexpyattaev (author)


So, do you want the intermediate mapping removed now, or can we keep it in?

@bw-solana requested a review from alessandrod on January 9, 2025
@bw-solana

Adding @alessandrod as I'm sure he'll have opinions on this 😄

@alexpyattaev (author)

> I like the high level idea. Left a few nits and some questions. I need to better understand how you envision the string-->string-->runtime mapping working before I can weigh in on it

The runtime mapping is supposed to work as follows:

  1. In code, we ask for a named pool, identified by its purpose (e.g. solGossipConsume).
  2. In config, we define that solGossipConsume should be handled by the RayonSigVerify pool.
  3. Further in config, we also define that RayonSigVerify should have 16 threads pinned to cores 32-48, and should have a certain priority w.r.t. other threads, etc.

From this idea we get a solGossipConsume -> RayonSigVerify -> ThreadPool sort of mapping. At least that was the idea. We can dispense with the intermediate layer, but then each pool would necessarily run on its own set of threads, which may not be ideal. We can, of course, axe this functionality and bring it back later as appropriate.
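The two-level lookup described above can be sketched with plain maps. This is an illustration of the scheme, not the crate's implementation; the pool values are stand-ins (thread counts) rather than real runtime objects.

```rust
use std::collections::HashMap;

// Resolve a service name through the intermediate pool name to the
// pool itself. Several services may map to one pool name (many-to-one),
// which is what lets pools be shared.
fn resolve<'a>(
    mapping: &HashMap<&'a str, &'a str>,
    pools: &HashMap<&'a str, usize>, // stand-in for real pools
    service: &str,
) -> Option<usize> {
    let pool_name = mapping.get(service)?;
    pools.get(*pool_name).copied()
}

fn main() {
    let mut mapping = HashMap::new();
    mapping.insert("solGossipConsume", "RayonSigVerify");
    mapping.insert("solSigVerify", "RayonSigVerify"); // many-to-one is fine

    let mut pools = HashMap::new();
    pools.insert("RayonSigVerify", 16); // e.g. 16 threads pinned to cores 32-48

    println!("{:?}", resolve(&mapping, &pools, "solGossipConsume"));
}
```

Dropping the intermediate map would collapse `resolve` to a single lookup, but then two services could no longer share one pool through configuration alone, which matches the trade-off described above.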

@bw-solana left a comment


LGTM

@KirillLykov left a comment


LGTM. I expect that when this crate is adopted there will be some other changes/improvements. I would also like to see better code documentation (in follow-ups):

  • on each module, describe what the module is for, etc.
  • on each trait/struct, the same
  • on each method, the same

Finally, check that the generated documentation looks decent.
I know that most of the crates don't follow this practice, but I think for new crates we can make it work.
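The documentation levels asked for above correspond to standard rustdoc conventions, roughly as in this sketch (the item names are illustrative, not from the crate):

```rust
//! Module-level docs (`//!`): manages named thread pools so that core
//! allocations can be controlled centrally. `cargo doc` renders this
//! as the module's front page.

/// Struct-level docs (`///`): a handle to one managed pool.
pub struct PoolHandle {
    /// Field docs: number of worker threads the pool was configured with.
    pub threads: usize,
}

impl PoolHandle {
    /// Method docs: returns the configured thread count.
    pub fn threads(&self) -> usize {
        self.threads
    }
}

fn main() {
    let handle = PoolHandle { threads: 4 };
    println!("pool has {} threads", handle.threads());
}
```

Running `cargo doc --open` on the crate is the quickest way to check that the generated documentation looks decent.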

@alexpyattaev merged commit 73b8cb7 into anza-xyz:master on Jan 22, 2025
58 checks passed
@alexpyattaev deleted the thread_manager branch on January 22, 2025
@t-nelson

> this was confirmed by measurements

What measurements? Why do we have a userspace scheduler?
