Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The __attribute__((constructor)) trick not working for the Rust esp-idf-* crates (IDFGH-12987) #13938

Closed
3 tasks done
ivmarkov opened this issue Jun 9, 2024 · 5 comments
Closed
3 tasks done
Assignees
Labels
Resolution: NA Issue resolution is unavailable Status: Done Issue is done internally

Comments

@ivmarkov
Copy link
Contributor

ivmarkov commented Jun 9, 2024

Answers checklist.

  • I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
  • I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

General issue report

CC:
@igrr
@MabezDev
@Vollbrecht

@igrr - sorry for the ping! Please CC whoever you deem necessary. Pinging you because the issue we are having below is quite crucial and leads to the Rust wrappers being next to unusable ATM as these are now being prone to random CONFLICT! crashes.

Overview

The TL;DR is:
What I'm observing in the .map file (pasted below) of this very simple test project is that the check_rmt_legacy_driver_conflict constructor of the RMT legacy driver remains and is active for a Rust project, even if the rust project only calls the NEW RMT driver APIs (and thus we are having the dreaded CONFLICT! program abort.)

Now, what is very important to realize, is that regular, simplistic whole-.o files elimination by the linker does not really work for Rust, because - unlike in C where the definition of a codegen unit (= .o file) is exactly one .c file - in Rust it is up to the compiler, optimization settings, number of cores on the build machine and what-not as to what ends up in one codegen unit (.o file).

So it might very well be, that two completely separate Rust modules still end up in a single .o file. Like the wrapper for the legacy RMT driver, as well as the wrapper for the new RMT driver for example.
You get the idea what happens next (I do realize I'm not talking about --gc-sections, but we'll cover that below, read on):

  • Since both the legacy and the new RMT driver are referenced from a single (Rust) .o codegen unit, both are pulled in, and we get the CONFLICT! thing (that is, assuming that Rust codegen unit is itself pulled in).

--gc-sections

Now obviously the above would've been a disaster w.r.t. Rust code size, hence by default and unconditionally Rust code is always compiled with the ffunction-sections and fdata-sections, thus all functions and so on ending up in separate sections. Rust also emits to the linker --gc-sections (Not that the link args from ESP IDF that we grab don't have it either).

So in the above scernario where we can see the code referencing the legacy and new RMT driver ending up in a single Rust .o file, I STILL see all symbols from the legacy RMT driver eliminated (check the "Discarded input sections" section in the .map file)

... except for the check_rmt_legacy_driver_conflict legacy RMT driver constructor which is still happily in there and obviously called at startup!

So my question is: is this behavior expected? I'm browsing the internet since a couple of hours, but I can't find an answer to the following question:

  • Could it be, that constructors remain even if all other symbols in their .o files are eliminated by the linker --gc-sections garbage collector?
  • It almost looks as if whether the constructors should go or stay is decided during some sort of a linker "first pass" logic, where the linker applies the more primitive "whole .o file elimination" approach,
    ... and then on the second pass, when --gc-sections do run, even if all functions from the .o file are garbage collected, constructors still stay there?

By the way, doing random changes to the source code layout of the sample Rust crate referenced above ^^^ (as in splitting the code referencing the legacy RMT driver in a separate file from the code referencing the new RMT driver) leads to this problem being solved. But then again, in the absence of any guarantee from Rust that codegen units do obey source file boundaries, this is a pure luck and completely unreliable.

Next steps

  • I would be very happy if we at least confirm that this behavior (constructors stay even if everything else from the .o was garbage collected) is an expected and defined behavior.

  • The problem remains though? What to do on the Rust side? Is it an option to somehow put a CONF option or at least something so that these constructors can be disabled? This is not as dangerous in Rust as it is in C, because (assuming folks use the Rust wrappers), we can of course shift/re-implement the check done in the constructor on the Rust level.

@espressif-bot espressif-bot added the Status: Opened Issue is new label Jun 9, 2024
@github-actions github-actions bot changed the title The __attribute((constructor))__ trick now working for the Rust esp-idf-* crates The __attribute((constructor))__ trick now working for the Rust esp-idf-* crates (IDFGH-12987) Jun 9, 2024
@ivmarkov
Copy link
Contributor Author

ivmarkov commented Jun 9, 2024

The .map file is available here.

@ivmarkov ivmarkov changed the title The __attribute((constructor))__ trick now working for the Rust esp-idf-* crates (IDFGH-12987) The __attribute((constructor))__ trick not working for the Rust esp-idf-* crates (IDFGH-12987) Jun 9, 2024
@ivmarkov
Copy link
Contributor Author

  • Could it be, that constructors remain even if all other symbols in their .o files are eliminated by the linker --gc-sections garbage collector?

Re-reading some of the internet material from yesterday, this is the closest to our problem I was able to dig up.

It is - however - discussing how --as-needed (i.e. which dynamic libs to remain or not as refs in the final elf - we don't use dynamic libs of course) and --gc-sections interact, NOT how __attribute__((constructor)) and --gc-sections do.

Still, it seems to imply that the linker does work in two passes:

  • On the first pass, --as-needed will only leave those dynamic libraries, which have at least one symbol from them referenced directly or indirectly from the root
  • On the second pass, --gcc-sections runs. After that, it might be that dynamic libraries which were marked with "needed" might actually no longer be needed, as all symbols which were referenced from them were garbage collected

... yet, the dynamic link library will remain as referenced in the final executable. And... if it has __attribute__((constructor)) constructors, those would remain and ran when the dyn lib is loaded...

@ivmarkov ivmarkov changed the title The __attribute((constructor))__ trick not working for the Rust esp-idf-* crates (IDFGH-12987) The __attribute__((constructor)) trick not working for the Rust esp-idf-* crates (IDFGH-12987) Jun 10, 2024
@igrr
Copy link
Member

igrr commented Jun 12, 2024

Hi @ivmarkov,

Could it be, that constructors remain even if all other symbols in their .o files are eliminated by the linker --gc-sections garbage collector?

Yes, that would be the expected behavior with GNU linker.

The linking process is well described here.

In your case, we go down the path described under "When the linker encounters a new library,..." > "If any of the symbols it exports are on the undefined list...".

Since the Rust code does reference legacy RMT driver symbols, the object file rmt_legacy.c.obj is included into the output file. Later, unused code is removed due to gc-sections, but the constructors remain.

When we added these warnings based on constructors, we did not consider it a valid situation that the project would reference both the old and the new driver.

I think we can add Kconfig option to disable the check, and you can re-implement the check on the Rust side.

(cc @suda-morris)

@ivmarkov
Copy link
Contributor Author

When we added these warnings based on constructors, we did not consider it a valid situation that the project would reference both the old and the new driver.

This is an unfortunate consequence of how Rust defines "codegen units" (.o files). I was surprised myself how little control we have over that, in Rust.

I think we can add Kconfig option to disable the check, and you can re-implement the check on the Rust side.

That would work for us!
I assume the CONF_ option will be back-ported to the current 5.X branches? At least to the upcoming 5.3, if not the earlier ones?

@espressif-bot espressif-bot added Status: In Progress Work is in progress Status: Done Issue is done internally Resolution: NA Issue resolution is unavailable and removed Status: Opened Issue is new Status: In Progress Work is in progress labels Dec 6, 2024
@Alvin1Zhang
Copy link
Collaborator

Thanks for reporting and sorry for slow turnaround, fix on release/5.3 is available at bfac3bf, feel free to reopen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: NA Issue resolution is unavailable Status: Done Issue is done internally
Projects
None yet
Development

No branches or pull requests

6 participants