Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow linking with lld #38

Open
sewerynplazuk opened this issue Mar 27, 2023 · 12 comments
Open

Slow linking with lld #38

sewerynplazuk opened this issue Mar 27, 2023 · 12 comments

Comments

@sewerynplazuk
Copy link

sewerynplazuk commented Mar 27, 2023

Two weeks ago we adapted the lld linker to our project, unfortunately after that we observed much longer build times on CI. After investigating, it turned out that the problem is caused by the new linker.

Running the linking of UI tests target locally, I observe that a process is created that takes 100% of the CPU time. The whole thing even takes more than 300s(!), where the default linker, finishes its work in ~3s.

We're using mix of Swift and ObjC code, as well as notable number of precompiled libraries. Project is heavily modularised, most of the modules are linked statically.

I attach edited contents of the response.txt file generated by the --reproduce flag. Hope it will be helpful. I tried to reproduce the error in dummy project but unfortunately without success.

@keith
Copy link
Owner

keith commented Apr 3, 2023

Interesting, I think the best way to investigate further would be to run with instruments attached and see where the time is going. The response file isn't really enough to know where the hot path is

@sewerynplazuk
Copy link
Author

sewerynplazuk commented Apr 7, 2023

I ran the time profiler on my repro case and that's the heaviest stack trace recorded:

4.04 min  100.0%	0 s	 	ld64.lld (47739)
4.04 min   99.9%	0 s	 	 start
4.04 min   99.9%	0 s	 	  main
4.04 min   99.9%	0 s	 	   lld_main(int, char**, llvm::ToolContext const&)
4.04 min   99.9%	0 s	 	    0x104819360
4.04 min   99.9%	0 s	 	     lld::macho::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool)
4.01 min   99.3%	0 s	 	      0x104ba4ce4
3.56 min   87.9%	0 s	 	       lld::macho::ArchiveFile::fetch(llvm::object::Archive::Child const&, llvm::StringRef)
3.56 min   87.9%	0 s	 	        0x104bc66c0
3.56 min   87.9%	0 s	 	         lld::macho::ObjFile::ObjFile(llvm::MemoryBufferRef, unsigned int, llvm::StringRef, bool, bool)
3.56 min   87.9%	2.00 ms	 	          void lld::macho::ObjFile::parse<lld::macho::LP64>()
3.51 min   86.8%	6.00 ms	 	           lld::macho::parseLCLinkerOption(lld::macho::InputFile*, unsigned int, llvm::StringRef)
3.42 min   84.5%	6.00 ms	 	            0x104b99ef0
3.30 min   81.5%	33.00 ms	 	             lld::macho::resolveDylibPath(llvm::StringRef)
3.24 min   80.1%	41.00 ms	 	              llvm::sys::fs::access(llvm::Twine const&, llvm::sys::fs::AccessMode)
3.06 min   75.6%	3.05 min	 	               access (libsystem_kernel.dylib)

See the full trace.

For the measurement I've used linker that I compiled myself from 1167d676100ffebcafaf130b3af9250d81c4f5c7 commit with your instructions.

EDIT:
So, I've added some logging to lld::macho::resolveDylibPath(llvm::StringRef) method and noticed that it looks for every statically linked module, under every path passed to the linker as -F flag.

The -F flags seems to be added by rules_ios.

Anyway, that's sums up to ~32250 checks of dylibs presence in our case.

@keith
Copy link
Owner

keith commented Apr 10, 2023

Those 32250 checks are mostly duplicates right? If so I think we can safely cache the results of that function and return that

@keith
Copy link
Owner

keith commented Apr 10, 2023

FWIW if they're static frameworks it would probably be ideal if they weren't discovered this way and instead were directly passed to the linker (which rules_apple and bazel are generally doing now with some recent changes). But also clearly lld should handle this.

Can you try off the same sha with this patch https://reviews.llvm.org/D147960

@sewerynplazuk
Copy link
Author

sewerynplazuk commented Apr 11, 2023

Can you try off the same sha with this patch https://reviews.llvm.org/D147960

Here's the trace after the patch:
Screenshot 2023-04-11 at 10 32 52

Technically there are not so many I/O actions anymore, still the linking time increased by a minute 🤔 I'm not sure how to debug it further.

FWIW if they're static frameworks it would probably be ideal if they weren't discovered this way and instead were directly passed to the linker (which rules_apple and bazel are generally doing now with some recent changes). But also clearly lld should handle this.

I guess that'd be ideal. We're planning to switch to rules_apple but that won't happen anytime soon I'm afraid.

Those 32250 checks are mostly duplicates right? If so I think we can safely cache the results of that function and return that

Quite a lot of them is something like that:

/some/path/to/framework.xyz
/some/other/path/to/framework.xyz
/some/different/path/to/framework.xyz
...

So perhaps the patch could be even more improved by looking into file names, if possible.

@keith
Copy link
Owner

keith commented Apr 11, 2023

Can you attach that trace? Oof with different paths I bet the cache isn't really even helping (you could add logs for hits / misses). In that trace what's under that 4 minute unsymbolicated thing? I don't think it's technically safe to cache by file name given it's technically possible to have frameworks of the same name at different paths that are actually different

@keith
Copy link
Owner

keith commented Apr 11, 2023

You might be able to ignore auto load commands all together, but that would require at least a bit of manually passing some linker flags for swift. But I think that would sidestep this entirely. If that did work that could be added in rules_iOS to disable auto linking for its frameworks and then you wouldn't have to do anything manually

@sewerynplazuk
Copy link
Author

sewerynplazuk commented Apr 11, 2023

Here's the trace. It's a different one, though still with the patch applied - ld64.lld-repro-patched.trace.zip

You might be able to ignore auto load commands all together, but that would require at least a bit of manually passing some linker flags for swift.

Which flags are you referring to? 🤔

Oof with different paths I bet the cache isn't really even helping (you could add logs for hits / misses)

I will try that when I have a minute and get back to you. I think it actually helps a bit. If you look on the trace, you can see that fs::access takes up to 30s now.

@sewerynplazuk
Copy link
Author

So, I added the logs for hits/misses and I can see misses mostly. Only system frameworks distributed with Xcode seems to be cached. Quite often the code falls into if (entry->second.empty()) condition.

@keith
Copy link
Owner

keith commented Apr 13, 2023

So, I added the logs for hits/misses and I can see misses mostly. Only system frameworks distributed with Xcode seems to be cached. Quite often the code falls into if (entry->second.empty()) condition.

Ah yes there is a bug in the logic because of how it's called, I updated my diff can you try again?

Which flags are you referring to? 🤔

pass -ignore_auto_link and then when you see failures because specific symbols can't be found you'll have to manually add the flags to correctly link those libraries (especially for swift) like -lswiftCore etc

@keith
Copy link
Owner

keith commented Apr 13, 2023

also if you could share an entire reproducer with me that would be super helpful for debugging. to do that you can pass --reproduce and it will produce a tar file with all the inputs used (so you might not want to share it). If you are open to that you can send it to me privately on the bazel slack

@sewerynplazuk
Copy link
Author

sewerynplazuk commented Apr 14, 2023

Ah yes there is a bug in the logic because of how it's called, I updated my diff can you try again?

I did, nothing new. Mostly misses are logged and no visible improvement in the time profiler.

pass -ignore_auto_link

Will try that, thanks.

also if you could share an entire reproducer with me that would be super helpful for debugging

Sorry, can't share that. We can sync on the Slack about further debug steps I could take, let me know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants