Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make aro-based translate-c lazily built from source #19122

Merged
merged 2 commits into from
Feb 29, 2024
Merged

make aro-based translate-c lazily built from source #19122

merged 2 commits into from
Feb 29, 2024

Conversation

andrewrk
Copy link
Member

@andrewrk andrewrk commented Feb 28, 2024

Depends on #19114.

Part of #19063.

The Big Change

Primarily, this moves Aro from deps/ to lib/compiler/ so that it can be
lazily compiled from source. src/aro_translate_c.zig is moved to
lib/compiler/aro_translate_c.zig and some of Zig CLI logic moved to a
main() function there.

aro_translate_c.zig becomes the "common" import for clang-based
translate-c.

Not all of the compiler was able to be detangled from Aro, however, so
it still, for now, remains being compiled with the main compiler
sources due to the clang-based translate-c depending on it. Once
aro-based translate-c achieves feature parity with the clang-based
translate-c implementation, the clang-based one can be removed from Zig, making
Aro source only compiled in the case that C translation features are required.

Aro's .def files and multiple Zig module requirements make things cumbersome.
I looked at the .def files and made these observations:

  • The canonical source is llvm .def files.
  • Therefore there is an update process to sync with llvm that involves
    regenerating the .def files in Aro.
  • Therefore you might as well just regenerate the .zig files directly
    and check those into Aro.
  • Also with a small amount of tinkering, the file size on disk of these
    generated .zig files can be made many times smaller, without
    compromising type safety in the usage of the data.

This would make things much easier on Zig as downstream project,
particularly we could remove those pesky stubs when bootstrapping.

I have gone ahead with these changes since they unblock me. Opening this PR to ask @Vexu for feedback.

Results

I'm measuring 9s for ZIG_DEBUG_CMD=1 stage3/bin/zig translate-c -fno-clang simple.c after making changes to aro_translate_c.zig.

This also lets you directly do stage3/bin/zig run --dep aro -Mroot=../lib/compiler/aro_translate_c.zig -Maro=../lib/compiler/aro/aro.zig -- simple.c which means you can choose to additionally pass -fno-llvm -fno-lld which I'm clocking at 2.4s.

Follow-Up Work

Improve the test harness

The aro-enabled test-run-translated-c cases look broken. I don't think any of these are being run:

../test/cases/run_translated_c/explicit_cast_bool_from_float.c:// c_frontends=aro,clang
../test/cases/run_translated_c/compound_assignments_with_implicit_casts.c:// c_frontends=aro,clang

Move caching to the build system

Currently the build system unconditionally executes zig translate-c ... and this subcommand in zig does caching. Instead, I think the zig CLI subcommand should always run the translation, and std.Build.Step.TranslateC should be doing the caching. At least for the aro-based one.

Current status is that even after making changes to the translate-c sources, the translate-c binary gets rebuilt, but then it gets a cache hit on the input C code and produces the same result as before even though the translate-c binary might have different behavior.

To work around this, contributors can use the zig run ... trick I mentioned above.

@ehaas
Copy link
Contributor

ehaas commented Feb 28, 2024

One thing to note is that Builtins.def is generated from LLVM def files, but the other two def files are not (attribute names and diagnostic messages). So those probably shouldn't be deleted.

@andrewrk
Copy link
Member Author

Thanks, that detail escaped me. However... what if it weren't so? The build step requirement is onerous. I'm not asking Aro to change its ways - I would just like zig's aro update process to involve committing those generated files into the source repo so that zig can skip the build step, just as I have done so here.

Part of #19063.

Primarily, this moves Aro from deps/ to lib/compiler/ so that it can be
lazily compiled from source. src/aro_translate_c.zig is moved to
lib/compiler/aro_translate_c.zig and some of Zig CLI logic moved to a
main() function there.

aro_translate_c.zig becomes the "common" import for clang-based
translate-c.

Not all of the compiler was able to be detangled from Aro, however, so
it still, for now, remains being compiled with the main compiler
sources due to the clang-based translate-c depending on it. Once
aro-based translate-c achieves feature parity with the clang-based
translate-c implementation, the clang-based one can be removed from Zig.

Aro made it unnecessarily difficult to depend on with these .def files
and all these Zig module requirements. I looked at the .def files and
made these observations:

- The canonical source is llvm .def files.
- Therefore there is an update process to sync with llvm that involves
  regenerating the .def files in Aro.
- Therefore you might as well just regenerate the .zig files directly
  and check those into Aro.
- Also with a small amount of tinkering, the file size on disk of these
  generated .zig files can be made many times smaller, without
  compromising type safety in the usage of the data.

This would make things much easier on Zig as downstream project,
particularly we could remove those pesky stubs when bootstrapping.

I have gone ahead with these changes since they unblock me and I will
have a chat with Vexu to see what he thinks.
@andrewrk andrewrk merged commit f5aad47 into master Feb 29, 2024
10 checks passed
@andrewrk andrewrk deleted the lazy-aro branch February 29, 2024 02:21
@andrewrk
Copy link
Member Author

Noting that an x86_64 -OReleaseSmall build of zig went from 11M to 10M with this change.

@Vexu
Copy link
Member

Vexu commented Mar 1, 2024

The changes seem fine, committing the generated files has the downside that small changes to the source result in half of the generated file changing but that downside can be minimized by not updating them too frequently.

  • The canonical source is llvm .def files.

Only for Builtin.def and that is not the end goal Vexu/arocc#526

The aro-enabled test-run-translated-c cases look broken. I don't think any of these are being run:

They are being run but translating statements hasn't been implemented yet #18578 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants