LLVM and SPIRV-LLVM-Translator pulldown (WW09 2024) #12868
This test shows the bug where LR is used as a general-purpose register on a code path where it is not spilled to the stack.
PR #75527 fixed ARMFrameLowering to set the IsRestored flag for LR based on all of the return instructions in the function, not just one. However, there is also code in ARMLoadStoreOptimizer which changes return instructions, but it set IsRestored based on the one instruction it changed, not the whole function. The fix is to factor out the code added in #75527, and also call it from ARMLoadStoreOptimizer if it made a change to return instructions. Fixes #80287.
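The whole-function recomputation described above can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyFunction`, `ReturnKind`, `computeLRIsRestored` and `rewriteReturnAndUpdate` are hypothetical names, not LLVM APIs, and the model assumes LR only counts as "not restored" when every return loads the saved value straight into PC.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins for return instructions; not LLVM types.
enum class ReturnKind {
  PopIntoPC,  // return that loads the saved LR value directly into PC
  BranchToLR  // return that branches through LR, so LR must be restored first
};

struct ToyFunction {
  std::vector<ReturnKind> Returns; // one entry per return instruction
};

// Assumption of this sketch: LR is treated as restored unless *every*
// return in the function pops straight into PC.
inline bool computeLRIsRestored(const ToyFunction &F) {
  bool AllPopIntoPC =
      std::all_of(F.Returns.begin(), F.Returns.end(),
                  [](ReturnKind K) { return K == ReturnKind::PopIntoPC; });
  return !AllPopIntoPC;
}

// A pass that rewrites one return recomputes the flag from the whole
// function afterwards, instead of deriving it from the changed instruction.
inline bool rewriteReturnAndUpdate(ToyFunction &F, std::size_t Index,
                                   ReturnKind NewKind) {
  F.Returns[Index] = NewKind;
  return computeLRIsRestored(F);
}
```

With two `BranchToLR` returns, rewriting only the first one to `PopIntoPC` still leaves the flag set, because the second return is untouched. Deciding from the single rewritten instruction would have cleared it.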
Part of removing debug-intrinsics from LLVM requires using iterators whenever we insert an instruction into a block. That means all instruction constructors and factory functions need an iterator-taking option, which this patch adds. The whole of this patch should be NFC: it adds new flavours of existing constructors and plumbs them through to the Instruction constructor that takes iterators. It's almost entirely boilerplate copy-and-paste too.
Additional test for llvm/llvm-project#82922.
…(#78431) When initializing MachineSSAUpdater, save all attributes of the current virtual register and create new virtual registers with the same attributes. New virtual registers now have the same register class or bank and the same LLT. Previously, new virtual registers had the same register class but the LLT was not set (it was left as the default/empty LLT). Required by GlobalISel for AMDGPU: new 'lane mask' virtual registers created by MachineSSAUpdater need to have both a register class and an LLT. Patch 4 from: llvm/llvm-project#73337
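A minimal sketch of the "save attributes at initialization, reuse them for new registers" idea. All types here (`ToyVRegAttrs`, `ToyRegInfo`, `ToySSAUpdater`) are hypothetical stand-ins written for illustration, and the strings used for the register class and LLT are arbitrary labels; the real MachineSSAUpdater and MachineRegisterInfo interfaces are not reproduced.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical bundle of per-register attributes.
struct ToyVRegAttrs {
  std::string RegClassOrBank;
  std::string LowLevelType; // empty string models "LLT not set"
};

// Hypothetical register table; virtual registers are indices into it.
class ToyRegInfo {
  std::vector<ToyVRegAttrs> VRegs;

public:
  unsigned createVirtualRegister(const ToyVRegAttrs &Attrs) {
    VRegs.push_back(Attrs);
    return static_cast<unsigned>(VRegs.size() - 1);
  }
  const ToyVRegAttrs &getAttrs(unsigned Reg) const { return VRegs.at(Reg); }
};

// Hypothetical updater: remembers the attributes of the register it was
// initialized with and creates every new register with the same attributes.
class ToySSAUpdater {
  ToyRegInfo &RegInfo;
  ToyVRegAttrs SavedAttrs;

public:
  explicit ToySSAUpdater(ToyRegInfo &RI) : RegInfo(RI) {}
  void initialize(unsigned Reg) { SavedAttrs = RegInfo.getAttrs(Reg); }
  unsigned createNewRegister() {
    return RegInfo.createVirtualRegister(SavedAttrs);
  }
};
```

Copying the whole attribute bundle rather than the register class alone is what keeps the type information from being dropped on the new register.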
Implementation looks similar to the one in the current interpreter. Except for three static assertions, test/Sema/atomic-ops.c works.
…992) Speed up disassembly by only calling tryDecodeInst for DecoderTables that make sense for the current subtarget. This gives a 1.3x speed-up on check-llvm-mc-disassembler-amdgpu in my Release+Asserts build.
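The idea of skipping decoder tables that cannot apply to the current subtarget can be sketched as follows. This is a toy model, not the AMDGPU disassembler: `ToyDecoderTable`, `DecodeResult`, `decodeForSubtarget` and the feature bitmask are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical decoder table: a name, the subtarget features it requires,
// and a callback that either decodes an encoding or reports failure.
struct ToyDecoderTable {
  std::string Name;
  std::uint32_t RequiredFeatures;
  std::function<std::optional<std::string>(std::uint32_t)> TryDecode;
};

struct DecodeResult {
  std::optional<std::string> Mnemonic; // empty if no table matched
  int TablesTried = 0;                 // how many tables were actually consulted
};

// Only consult tables whose required features are all enabled for the
// subtarget; tables that cannot match are skipped without a decode attempt.
inline DecodeResult decodeForSubtarget(const std::vector<ToyDecoderTable> &Tables,
                                       std::uint32_t SubtargetFeatures,
                                       std::uint32_t Encoding) {
  DecodeResult R;
  for (const ToyDecoderTable &T : Tables) {
    if ((T.RequiredFeatures & SubtargetFeatures) != T.RequiredFeatures)
      continue;
    ++R.TablesTried;
    R.Mnemonic = T.TryDecode(Encoding);
    if (R.Mnemonic)
      break;
  }
  return R;
}
```

The saving comes from the `continue`: each skipped table is one decode attempt that never runs, which adds up when many tables belong to other subtargets.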
…pe." (#82856) Reverts llvm/llvm-project#82348, which caused crashes when analyzing empty InitListExprs for unions, e.g.
```cc
union U {
  double double_value;
  int int_value;
};

void target() {
  U value;
  value = {};
}
```
Co-authored-by: Samira Bazuzi <[email protected]>
The name hasGDS better reflects what it is used for.
This reverts commit 1069823. This has caused second stage timeouts when building Flang on AArch64: https://lab.llvm.org/buildbot/#/builders/179/builds/9442
… (#82878) To align colons inside TableGen !cond operators.
This upstreams more of the Clang API Notes functionality that is currently implemented in the Apple fork: https://github.com/apple/llvm-project/tree/next/clang/lib/APINotes This was extracted from a larger PR: llvm/llvm-project#73017
This actually passes on Windows but I don't know how to convey that with an xfail without clashing with the xfail for all platforms. At least this avoids a UPASS.
For whatever reason on Windows, it is not at this point. The copy of unit test we used to use would ignore failures during teardown but Python's does not.
Like with 'break'/'continue', returning out of a compute construct is ill-formed, so this implements the diagnostic. However, unlike the OpenMP implementation of this same diagnostic, OpenACC doesn't have a concept of 'capture region', so this is implemented as just checking the 'scope'.
This test was broken on MacOS, see the discussion in llvm/llvm-project@a35599b
…(#83007) For renamed instructions, there is no need to mention the new name twice on every line defining a Real.
…ter partial ordering when determining primary template (#82417) Consider the following:
```
struct A {
    static constexpr bool x = true;
};

template<typename T, typename U>
void f(T, U) noexcept(T::y); // #1, error: no member named 'y' in 'A'

template<typename T, typename U>
void f(T, U*) noexcept(T::x); // #2

template<>
void f(A, int*) noexcept; // explicit specialization of #2
```
We currently instantiate the exception specification of all candidate function template specializations when deducing template arguments for an explicit specialization, which results in an error despite `#1` not being selected by partial ordering as the most specialized template. According to [except.spec] p13:
> An exception specification is considered to be needed when:
> - [...]
> - the exception specification is compared to that of another declaration (e.g., an explicit specialization or an overriding virtual function);

Assuming that "comparing declarations" means "determining whether the declarations correspond and declare the same entity" (per [basic.scope.scope] p4 and [basic.link] p11.1, respectively), the exception specification does _not_ need to be instantiated until _after_ partial ordering, at which point we determine whether the implicitly instantiated specialization and the explicit specialization declare the same entity (the determination of whether two functions/function templates correspond does not consider the exception specifications). This patch defers the instantiation of the exception specification until a single function template specialization is selected via partial ordering, matching the behavior of GCC, EDG, and MSVC: see https://godbolt.org/z/Ebb6GTcWE.
…003) With no debug intrinsics, correctly identifying the start of a block with iterators becomes important. We need to use the iterator-returning methods here in loop-unroll-and-jam where we're shifting PHIs around. Otherwise they can be inserted after debug-info records, leading to debug-info attached to PHIs, which is ill formed. Fixes #83000
We need to leave a pointer on the stack for them, even if their type is primitive.
…670) The `unittest2` package is unused since 5b38615. The `progress` package was only used internally by `unittest2`, so it can be deleted as well.
Instead of having retInt/retLong/retSizeT/etc., just add retInteger, which takes an APSInt and returns it in form of the given QualType. This makes the code a little neater, but is also necessary since some builtins have a different return type with -fms-extensions.
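A rough sketch of what "return an integer in the form of a given type" means, using plain 64-bit integers instead of APSInt and a hypothetical `ToyIntType` instead of QualType. The function name is reused only for readability; this is not the interpreter's helper and its real signature is not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of an integer result type; Bits must be in 1..64.
struct ToyIntType {
  unsigned Bits;
  bool IsSigned;
};

// Convert Value to the given width and signedness: truncate to Bits bits,
// then sign-extend if the target type is signed. One helper like this can
// replace a family of per-type helpers (int, long, size_t, ...).
inline std::int64_t retInteger(std::int64_t Value, ToyIntType Ty) {
  if (Ty.Bits >= 64)
    return Value; // nothing to truncate in this 64-bit toy model
  const std::uint64_t Mask = (std::uint64_t{1} << Ty.Bits) - 1;
  std::uint64_t Truncated = static_cast<std::uint64_t>(Value) & Mask;
  const bool SignBitSet = ((Truncated >> (Ty.Bits - 1)) & 1) != 0;
  if (Ty.IsSigned && SignBitSet)
    Truncated |= ~Mask; // sign-extend into the upper bits
  return static_cast<std::int64_t>(Truncated);
}
```

Because the result type is a parameter, a builtin whose return type changes under a language option only needs to pass a different type description, not call a different helper.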
… and local/private `nullptr` value for AMDGPU. (#78759)
- Address space cast of nullptr in local_space into a generic_space for the CUDA backend. The reason for this cast was having invalid local memory base address for the associated variable.
- In the context of AMD GPU, assigns a NULL value as ~0 for the address spaces of sycl_local and sycl_private to match the ones for opencl_local and opencl_private.
e2e failures are common to others, also seen in https://github.com/intel/llvm/actions/runs/8104206945/job/22150442787.
This is ready for review.
The revert is lgtm
@LU-JOHN Would you please help to add comments or approve explicitly so that this can be merged. Thanks.
Reversion is good. Community commit has a bug when unbundling an archive of objects. Community issue tracked in:
@bader @intel/llvm-gatekeepers Can we get this merged. Thanks.
/merge
Tue 05 Mar 2024 07:00:21 PM UTC --- Start to merge the commit into sycl branch. It will take several minutes.
Tue 05 Mar 2024 07:07:47 PM UTC --- Merge the branch in this PR to base automatically. Will close the PR later.
LLVM: llvm/llvm-project@9f99eda
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@2cd8b78