Slow linking with lld #38

sewerynplazuk · 2023-03-27T08:44:32Z

Two weeks ago we adapted the lld linker to our project, unfortunately after that we observed much longer build times on CI. After investigating, it turned out that the problem is caused by the new linker.

Running the linking of UI tests target locally, I observe that a process is created that takes 100% of the CPU time. The whole thing even takes more than 300s(!), where the default linker, finishes its work in ~3s.

We're using mix of Swift and ObjC code, as well as notable number of precompiled libraries. Project is heavily modularised, most of the modules are linked statically.

I attach edited contents of the response.txt file generated by the --reproduce flag. Hope it will be helpful. I tried to reproduce the error in dummy project but unfortunately without success.

The text was updated successfully, but these errors were encountered:

keith · 2023-04-03T23:03:24Z

Interesting, I think the best way to investigate further would be to run with instruments attached and see where the time is going. The response file isn't really enough to know where the hot path is

sewerynplazuk · 2023-04-07T08:49:37Z

I ran the time profiler on my repro case and that's the heaviest stack trace recorded:

4.04 min  100.0%	0 s	 	ld64.lld (47739)
4.04 min   99.9%	0 s	 	 start
4.04 min   99.9%	0 s	 	  main
4.04 min   99.9%	0 s	 	   lld_main(int, char**, llvm::ToolContext const&)
4.04 min   99.9%	0 s	 	    0x104819360
4.04 min   99.9%	0 s	 	     lld::macho::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool)
4.01 min   99.3%	0 s	 	      0x104ba4ce4
3.56 min   87.9%	0 s	 	       lld::macho::ArchiveFile::fetch(llvm::object::Archive::Child const&, llvm::StringRef)
3.56 min   87.9%	0 s	 	        0x104bc66c0
3.56 min   87.9%	0 s	 	         lld::macho::ObjFile::ObjFile(llvm::MemoryBufferRef, unsigned int, llvm::StringRef, bool, bool)
3.56 min   87.9%	2.00 ms	 	          void lld::macho::ObjFile::parse<lld::macho::LP64>()
3.51 min   86.8%	6.00 ms	 	           lld::macho::parseLCLinkerOption(lld::macho::InputFile*, unsigned int, llvm::StringRef)
3.42 min   84.5%	6.00 ms	 	            0x104b99ef0
3.30 min   81.5%	33.00 ms	 	             lld::macho::resolveDylibPath(llvm::StringRef)
3.24 min   80.1%	41.00 ms	 	              llvm::sys::fs::access(llvm::Twine const&, llvm::sys::fs::AccessMode)
3.06 min   75.6%	3.05 min	 	               access (libsystem_kernel.dylib)

See the full trace.

For the measurement I've used linker that I compiled myself from 1167d676100ffebcafaf130b3af9250d81c4f5c7 commit with your instructions.

EDIT:
So, I've added some logging to lld::macho::resolveDylibPath(llvm::StringRef) method and noticed that it looks for every statically linked module, under every path passed to the linker as -F flag.

The -F flags seems to be added by rules_ios.

Anyway, that's sums up to ~32250 checks of dylibs presence in our case.

keith · 2023-04-10T19:00:58Z

Those 32250 checks are mostly duplicates right? If so I think we can safely cache the results of that function and return that

keith · 2023-04-10T19:23:46Z

FWIW if they're static frameworks it would probably be ideal if they weren't discovered this way and instead were directly passed to the linker (which rules_apple and bazel are generally doing now with some recent changes). But also clearly lld should handle this.

Can you try off the same sha with this patch https://reviews.llvm.org/D147960

sewerynplazuk · 2023-04-11T08:44:44Z

Can you try off the same sha with this patch https://reviews.llvm.org/D147960

Here's the trace after the patch:

Technically there are not so many I/O actions anymore, still the linking time increased by a minute 🤔 I'm not sure how to debug it further.

FWIW if they're static frameworks it would probably be ideal if they weren't discovered this way and instead were directly passed to the linker (which rules_apple and bazel are generally doing now with some recent changes). But also clearly lld should handle this.

I guess that'd be ideal. We're planning to switch to rules_apple but that won't happen anytime soon I'm afraid.

Those 32250 checks are mostly duplicates right? If so I think we can safely cache the results of that function and return that

Quite a lot of them is something like that:

/some/path/to/framework.xyz
/some/other/path/to/framework.xyz
/some/different/path/to/framework.xyz
...

So perhaps the patch could be even more improved by looking into file names, if possible.

keith · 2023-04-11T13:18:34Z

Can you attach that trace? Oof with different paths I bet the cache isn't really even helping (you could add logs for hits / misses). In that trace what's under that 4 minute unsymbolicated thing? I don't think it's technically safe to cache by file name given it's technically possible to have frameworks of the same name at different paths that are actually different

keith · 2023-04-11T13:22:21Z

You might be able to ignore auto load commands all together, but that would require at least a bit of manually passing some linker flags for swift. But I think that would sidestep this entirely. If that did work that could be added in rules_iOS to disable auto linking for its frameworks and then you wouldn't have to do anything manually

sewerynplazuk · 2023-04-11T15:26:08Z

Here's the trace. It's a different one, though still with the patch applied - ld64.lld-repro-patched.trace.zip

You might be able to ignore auto load commands all together, but that would require at least a bit of manually passing some linker flags for swift.

Which flags are you referring to? 🤔

Oof with different paths I bet the cache isn't really even helping (you could add logs for hits / misses)

I will try that when I have a minute and get back to you. I think it actually helps a bit. If you look on the trace, you can see that fs::access takes up to 30s now.

sewerynplazuk · 2023-04-12T08:14:26Z

So, I added the logs for hits/misses and I can see misses mostly. Only system frameworks distributed with Xcode seems to be cached. Quite often the code falls into if (entry->second.empty()) condition.

keith · 2023-04-13T16:05:01Z

So, I added the logs for hits/misses and I can see misses mostly. Only system frameworks distributed with Xcode seems to be cached. Quite often the code falls into if (entry->second.empty()) condition.

Ah yes there is a bug in the logic because of how it's called, I updated my diff can you try again?

Which flags are you referring to? 🤔

pass -ignore_auto_link and then when you see failures because specific symbols can't be found you'll have to manually add the flags to correctly link those libraries (especially for swift) like -lswiftCore etc

keith · 2023-04-13T16:06:28Z

also if you could share an entire reproducer with me that would be super helpful for debugging. to do that you can pass --reproduce and it will produce a tar file with all the inputs used (so you might not want to share it). If you are open to that you can send it to me privately on the bazel slack

sewerynplazuk · 2023-04-14T13:42:20Z

Ah yes there is a bug in the logic because of how it's called, I updated my diff can you try again?

I did, nothing new. Mostly misses are logged and no visible improvement in the time profiler.

pass -ignore_auto_link

Will try that, thanks.

also if you could share an entire reproducer with me that would be super helpful for debugging

Sorry, can't share that. We can sync on the Slack about further debug steps I could take, let me know.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Slow linking with lld #38

Slow linking with lld #38

sewerynplazuk commented Mar 27, 2023 •

edited

Loading

keith commented Apr 3, 2023

sewerynplazuk commented Apr 7, 2023 •

edited

Loading

keith commented Apr 10, 2023

keith commented Apr 10, 2023

sewerynplazuk commented Apr 11, 2023 •

edited

Loading

keith commented Apr 11, 2023

keith commented Apr 11, 2023

sewerynplazuk commented Apr 11, 2023 •

edited

Loading

sewerynplazuk commented Apr 12, 2023

keith commented Apr 13, 2023

keith commented Apr 13, 2023

sewerynplazuk commented Apr 14, 2023 •

edited

Loading

Slow linking with lld #38

Slow linking with lld #38

Comments

sewerynplazuk commented Mar 27, 2023 • edited Loading

keith commented Apr 3, 2023

sewerynplazuk commented Apr 7, 2023 • edited Loading

keith commented Apr 10, 2023

keith commented Apr 10, 2023

sewerynplazuk commented Apr 11, 2023 • edited Loading

keith commented Apr 11, 2023

keith commented Apr 11, 2023

sewerynplazuk commented Apr 11, 2023 • edited Loading

sewerynplazuk commented Apr 12, 2023

keith commented Apr 13, 2023

keith commented Apr 13, 2023

sewerynplazuk commented Apr 14, 2023 • edited Loading

sewerynplazuk commented Mar 27, 2023 •

edited

Loading

sewerynplazuk commented Apr 7, 2023 •

edited

Loading

sewerynplazuk commented Apr 11, 2023 •

edited

Loading

sewerynplazuk commented Apr 11, 2023 •

edited

Loading

sewerynplazuk commented Apr 14, 2023 •

edited

Loading