Allow restricting the number of parallel linker invocations #9157
Comments
I believe this would also be useful for people using
I have no idea if this would be practical, but could cargo automatically monitor memory usage to adjust how many concurrent threads to use?
I've managed to work around this by enabling swap. Linking time did not suffer visibly. On Ubuntu, I followed this guide.
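The guide linked in that comment is not preserved here; a minimal sketch of the standard Ubuntu swap-file setup looks roughly like the following (the 8 GiB size is an arbitrary example, not taken from the comment):

```sh
# Create and enable an 8 GiB swap file (size chosen arbitrarily for illustration).
sudo fallocate -l 8G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

# Optionally make it persistent across reboots.
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
```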
Proposed Solution: adding a way to restrict the number of parallel linker invocations. Here is what the help would look like:
@rustbot claim
@weihanglo see also #7480
FWIW, the Cabal community had a discussion a while back: haskell/cabal#1529
Potentially the unstable rustc flag
As this is focusing on the problem of OOM, I'm going to close in favor of #12912 so we keep the conversations in one place.
FWIW, setting
Something we might want to keep an eye on in rustc: rust-lang/rust#48762
I suppose, although this is a very specific problem and I'm doubtful that the generic mechanisms being discussed in that issue will really help address it.
FWIW, there is a
rust-lang/rust#117962 has made it into nightly. It could alleviate the pain of linker OOM to some extent.
I suspect this will make performance much worse in the average case, unfortunately.
I think this can be closed, as it is the linker's business not to run out of memory. E.g. mold does this, see rui314/mold#1319, via its
For something like that, jobserver support in a linker would be a big help so we can coordinate across rustc and the linker for how many threads are available to use. That also only focuses on the number of parallel threads and not actual memory consumption. Like with threads, any solution will likely need coordination between linkers and rustc / cargo.
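To make the coordination idea concrete, here is a hypothetical sketch assuming the `jobserver` crate as a dependency; the token limit, linker command, and arguments are placeholders, not part of any existing cargo or linker behavior. A wrapper acquires a token before running a memory-heavy link step, so the number of concurrent heavy jobs is capped:

```rust
use std::process::Command;

fn main() -> std::io::Result<()> {
    // Hypothetical: a pool of 2 tokens for "heavy" link jobs. In a real build
    // the client would more likely be inherited from the parent build tool via
    // Client::from_env() rather than created locally like this.
    let client = jobserver::Client::new(2)?;

    // Block until a token is free, then hold it for the duration of the link.
    let token = client.acquire()?;

    // Placeholder link command; a real wrapper would forward the actual
    // linker invocation it received.
    let status = Command::new("cc")
        .args(["main.o", "-o", "app"])
        .status()?;

    // The token returns to the pool when dropped.
    drop(token);

    assert!(status.success(), "link step failed");
    Ok(())
}
```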
What I am seeing with the move to Linode VMs is essentially this: rust-lang/cargo#9157, rust-lang/cargo#12912. Because these new VMs have more CPU cores (16 new vs. 4 old), compilation is faster, but this makes cargo overzealous: it spawns too many linker processes, which consume all of the available memory (on a 64 GB VM) and cause an OOM error, forcing the kernel to kill the linker process and cargo to fail.

Another alternative, which works, is using `--jobs 8`, however that is less optimal because it leaves CPU capacity unused and also affects the number of parallel threads when executing the test suite.

WARNING: using `--release` is not an option because it breaks tests. The polkadot-sdk code uses the `defensive!` macro, which is designed to panic when running in debug mode, and multiple test scenarios rely on this behavior via `#[should_panic]`.

WARNING: we still need the 64 GB of memory.
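For completeness, the `--jobs` workaround described above looks like this (the value 8 comes from the comment; the config-file form is the standard cargo equivalent):

```sh
# Cap overall build parallelism; this also throttles linker invocations,
# at the cost of leaving CPU cores idle during compilation.
cargo build --jobs 8

# Equivalent persistent setting in .cargo/config.toml:
#   [build]
#   jobs = 8
```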
In CI at my work, we ran into a situation where rustc would get OOM-killed while linking example binaries:
We were able to mitigate this by using a builder with more available memory, but it's unfortunate. We could dial down the parallelism of the whole build by explicitly passing `-jN`, but that would make the non-linking parts of the build slower by leaving CPU cores idle.

It would be ideal if we could explicitly ask cargo to lower the number of parallel linker invocations it will spawn. Compile steps are generally CPU-intensive, but linking is usually much more memory-intensive. In the extreme case, for large projects like Firefox and Chromium, where the vast majority of the code gets linked into a single binary, that link step far outweighs any other part of the build in terms of memory usage.
In terms of prior art, ninja has a concept of "pools" that allow expressing this sort of restriction in a more generic way:
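As a rough sketch (pool and rule names here are illustrative, not quoted from ninja's documentation), a pool that caps concurrent link jobs in a ninja build file looks like this:

```ninja
# At most 2 link jobs run concurrently, regardless of the global -j level.
pool link_pool
  depth = 2

rule link
  command = c++ $in -o $out
  pool = link_pool

build app: link main.o util.o
```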
The Ninja feature was originally motivated by Chromium builds switching to Ninja and wanting to support distributed builds, in which there might be capacity to spawn way more compile jobs in parallel since they can be run on distributed build nodes, but link jobs, needing to be run on the local machine, would want a lower limit.
If this were implemented, one could imagine a further step whereby cargo could estimate how heavy individual linker invocations are by the number of crates they link, and attempt to set a reasonable default value based on that and the amount of available system memory.