LLVM and SPIRV-LLVM-Translator pulldown (WW09 2024) #12868

sys-ce-bb · 2024-02-29T13:30:47Z

LLVM: llvm/llvm-project@9f99eda
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@2cd8b78

This test shows the bug where LR is used as a general-purpose register on a code path where it is not spilled to the stack.

PR #75527 fixed ARMFrameLowering to set the IsRestored flag for LR based on all of the return instructions in the function, not just one. However, there is also code in ARMLoadStoreOptimizer which changes return instructions, but it set IsRestored based on the one instruction it changed, not the whole function. The fix is to factor out the code added in #75527, and also call it from ARMLoadStoreOptimizer if it made a change to return instructions. Fixes #80287.
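A toy sketch of why the flag has to be derived from every return in the function. This is not the ARMFrameLowering code: `ReturnKind`, `lrIsRestored` and `lrIsRestoredFromSingleReturn` are hypothetical names, and the rule that LR counts as "not restored" only when all returns pop the saved value straight into PC is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified model of the ways an ARM function can return.
enum class ReturnKind {
  PopIntoPC,   // e.g. "pop {r4, pc}": the saved LR value is loaded into PC
  BranchViaLR  // e.g. "bx lr": LR itself must hold the return address
};

// Assumption: LR counts as "not restored" only when every return pops the
// saved value directly into PC. A single "bx lr" anywhere in the function
// means LR must be reloaded, so the flag has to stay set.
bool lrIsRestored(const std::vector<ReturnKind>& returns) {
  const bool allPopIntoPC =
      std::all_of(returns.begin(), returns.end(),
                  [](ReturnKind k) { return k == ReturnKind::PopIntoPC; });
  return !allPopIntoPC;
}

// The buggy pattern: decide from the one return that was just rewritten.
bool lrIsRestoredFromSingleReturn(ReturnKind changed) {
  return changed != ReturnKind::PopIntoPC;
}

void lrExample() {
  // Mixed function: one path pops into PC, another returns with "bx lr".
  assert(lrIsRestored({ReturnKind::PopIntoPC, ReturnKind::BranchViaLR}));
  assert(!lrIsRestored({ReturnKind::PopIntoPC, ReturnKind::PopIntoPC}));
  // Looking only at the rewritten return wrongly clears the flag.
  assert(!lrIsRestoredFromSingleReturn(ReturnKind::PopIntoPC));
}
```

In the mixed case the per-instruction check and the whole-function check disagree, which is the situation the shared helper is meant to rule out.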

Part of removing debug-intrinsics from LLVM requires using iterators whenever we insert an instruction into a block. That means we need all instruction constructors and factory functions to have an iterator taking option, which this patch adds. The whole of this patch should be NFC: it's adding new flavours of existing constructors, and plumbing those through to the Instruction constructor that takes iterators. It's almost entirely boilerplate copy-and-paste too.

Additional test for llvm/llvm-project#82922.

…(#78431) When initializing MachineSSAUpdater, save all attributes of the current virtual register and create new virtual registers with the same attributes. New virtual registers now have both the same register class or bank and the same LLT. Previously, new virtual registers had the same register class, but the LLT was not set (it was left as the default/empty LLT). This is required by GlobalISel for AMDGPU: new 'lane mask' virtual registers created by MachineSSAUpdater need both a register class and an LLT. Patch 4 from: llvm/llvm-project#73337
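A minimal sketch of the idea, using toy types rather than the real MachineRegisterInfo/MachineSSAUpdater API. `VRegAttrs`, `ToyRegisterInfo` and `ToySSAUpdater` are hypothetical, and the strings used for the class and LLT are only illustrative.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in: a virtual register carries a register class (or
// bank) and a low-level type (LLT). An empty string models "LLT not set".
struct VRegAttrs {
  std::string regClassOrBank;
  std::string llt;
};

class ToyRegisterInfo {
public:
  // Creates a new virtual register and returns its index.
  unsigned createVReg(const VRegAttrs& attrs) {
    regs_.push_back(attrs);
    return static_cast<unsigned>(regs_.size() - 1);
  }
  const VRegAttrs& attrs(unsigned reg) const { return regs_.at(reg); }

private:
  std::vector<VRegAttrs> regs_;
};

class ToySSAUpdater {
public:
  explicit ToySSAUpdater(ToyRegisterInfo& mri) : mri_(mri) {}

  // Save (by copy) all attributes of the register being rewritten,
  // not just its register class.
  void initialize(unsigned reg) { saved_ = mri_.attrs(reg); }

  // New registers are created from the saved attributes, so both the
  // class/bank and the LLT match the original register.
  unsigned createNewVReg() { return mri_.createVReg(saved_); }

private:
  ToyRegisterInfo& mri_;
  VRegAttrs saved_;
};

void ssaUpdaterExample() {
  ToyRegisterInfo mri;
  const unsigned laneMask = mri.createVReg({"SReg_32", "s1"});
  ToySSAUpdater updater(mri);
  updater.initialize(laneMask);
  const unsigned fresh = updater.createNewVReg();
  assert(mri.attrs(fresh).regClassOrBank == "SReg_32");
  assert(mri.attrs(fresh).llt == "s1");  // previously this would be empty
}
```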

Implementation looks similar to the one in the current interpreter. Except for three static assertions, test/Sema/atomic-ops.c works.

…992) Speed up disassembly by only calling tryDecodeInst for DecoderTables that make sense for the current subtarget. This gives a 1.3x speed-up on check-llvm-mc-disassembler-amdgpu in my Release+Asserts build.
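The speed-up comes from not attempting tables that can never match. A rough sketch with toy data structures follows; `Feature`, `DecoderTable`, `DecodeResult` and `decode` are hypothetical names, not the AMDGPU disassembler's API, and each toy table matches only a single encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical feature bits; a real subtarget has many more.
enum Feature : unsigned { GFX9 = 1u << 0, GFX10 = 1u << 1, GFX11 = 1u << 2 };

struct DecoderTable {
  std::string name;
  unsigned requiredFeature;  // table only makes sense with this feature
  std::uint32_t opcode;      // toy table: matches exactly one encoding
  std::string mnemonic;
};

struct DecodeResult {
  bool decoded;
  std::string mnemonic;
  int tablesTried;  // how many tables were actually consulted
};

DecodeResult decode(const std::vector<DecoderTable>& tables,
                    unsigned subtargetFeatures, std::uint32_t word) {
  DecodeResult result{false, "", 0};
  for (const DecoderTable& table : tables) {
    // Skip tables that do not apply to the current subtarget.
    if ((subtargetFeatures & table.requiredFeature) == 0)
      continue;
    ++result.tablesTried;
    if (table.opcode == word) {
      result.decoded = true;
      result.mnemonic = table.mnemonic;
      break;
    }
  }
  return result;
}

void decodeExample() {
  const std::vector<DecoderTable> tables = {
      {"gfx9", GFX9, 0x10, "op_a"},
      {"gfx10", GFX10, 0x10, "op_b"},
      {"gfx11", GFX11, 0x20, "op_c"},
  };
  const DecodeResult r = decode(tables, GFX11, 0x20);
  assert(r.decoded && r.mnemonic == "op_c");
  assert(r.tablesTried == 1);  // the gfx9 and gfx10 tables were never tried
}
```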

…pe." (#82856) Reverts llvm/llvm-project#82348, which caused crashes when analyzing empty InitListExprs for unions, e.g.

```cc
union U {
  double double_value;
  int int_value;
};

void target() {
  U value;
  value = {};
}
```

Co-authored-by: Samira Bazuzi

The name hasGDS better reflects what it is used for.

This reverts commit 1069823. This has caused second stage timeouts when building Flang on AArch64: https://lab.llvm.org/buildbot/#/builders/179/builds/9442

… (#82878) To align colons inside TableGen !cond operators.

This upstreams more of the Clang API Notes functionality that is currently implemented in the Apple fork: https://github.com/apple/llvm-project/tree/next/clang/lib/APINotes This was extracted from a larger PR: llvm/llvm-project#73017

This actually passes on Windows but I don't know how to convey that with an xfail without clashing with the xfail for all platforms. At least this avoids a UPASS.

For whatever reason on Windows, it is not at this point. The copy of unit test we used to use would ignore failures during teardown but Python's does not.

Like with 'break'/'continue', returning out of a compute construct is ill-formed, so this implements the diagnostic. However, unlike the OpenMP implementation of this same diagnostic, OpenACC doesn't have a concept of 'capture region', so this is implemented as just checking the 'scope'.

This test was broken on macOS, see the discussion in llvm/llvm-project@a35599b

Identified in llvm/llvm-project#77741 (review)

…(#83007) For renamed instructions, there is no need to mention the new name twice on every line defining a Real.

…ter partial ordering when determining primary template (#82417)

Consider the following:

```
struct A {
  static constexpr bool x = true;
};

template<typename T, typename U>
void f(T, U) noexcept(T::y); // #1, error: no member named 'y' in 'A'

template<typename T, typename U>
void f(T, U*) noexcept(T::x); // #2

template<>
void f(A, int*) noexcept; // explicit specialization of #2
```

We currently instantiate the exception specification of all candidate function template specializations when deducing template arguments for an explicit specialization, which results in an error despite `#1` not being selected by partial ordering as the most specialized template. According to [except.spec] p13:

> An exception specification is considered to be needed when:
> - [...]
> - the exception specification is compared to that of another declaration (e.g., an explicit specialization or an overriding virtual function);

Assuming that "comparing declarations" means "determining whether the declarations correspond and declare the same entity" (per [basic.scope.scope] p4 and [basic.link] p11.1, respectively), the exception specification does _not_ need to be instantiated until _after_ partial ordering, at which point we determine whether the implicitly instantiated specialization and the explicit specialization declare the same entity (the determination of whether two functions/function templates correspond does not consider the exception specifications).

This patch defers the instantiation of the exception specification until a single function template specialization is selected via partial ordering, matching the behavior of GCC, EDG, and MSVC: see https://godbolt.org/z/Ebb6GTcWE.

…003) With no debug intrinsics, correctly identifying the start of a block with iterators becomes important. We need to use the iterator-returning methods here in loop-unroll-and-jam where we're shifting PHIs around. Otherwise they can be inserted after debug-info records, leading to debug-info attached to PHIs, which is ill-formed. Fixes #83000

We need to leave a pointer on the stack for them, even if their type is primitive.

…670) The `unittest2` package is unused since 5b38615. The `progress` package was only used internally by `unittest2`, so it can be deleted as well.

Instead of having retInt/retLong/retSizeT/etc., just add retInteger, which takes an APSInt and returns it in the form of the given QualType. This makes the code a little neater, but is also necessary since some builtins have a different return type with -fms-extensions.
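A toy model of the design choice: one helper parameterised by the result type instead of one helper per type. This is not the interpreter's code. `IntType` is a hypothetical stand-in for a QualType, a plain 64-bit integer stands in for APSInt, and wrapping the value to the target width is an assumption about what "in the form of the given type" means here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of an integer result type.
struct IntType {
  unsigned bits;  // width in bits, expected to be in the range 1..64
  bool isSigned;
};

// Returns `value` wrapped to the requested width and signedness. Results of
// unsigned 64-bit types above INT64_MAX are not representable in this toy.
std::int64_t retInteger(std::int64_t value, IntType type) {
  std::uint64_t raw = static_cast<std::uint64_t>(value);
  if (type.bits < 64) {
    const std::uint64_t mask = (std::uint64_t{1} << type.bits) - 1;
    raw &= mask;  // keep only the low `bits` bits
    const std::uint64_t signBit = std::uint64_t{1} << (type.bits - 1);
    if (type.isSigned && (raw & signBit) != 0)
      raw |= ~mask;  // sign-extend negative values
  }
  return static_cast<std::int64_t>(raw);
}

void retIntegerExample() {
  // The caller passes the result type instead of picking a helper by name,
  // so a builtin whose return type varies needs no special casing.
  assert(retInteger(300, IntType{8, false}) == 44);
  assert(retInteger(255, IntType{8, true}) == -1);
  assert(retInteger(-1, IntType{16, false}) == 65535);
}
```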

… and local/private `nullptr` value for AMDGPU. (#78759)

- Address space cast of nullptr in local_space into a generic_space for the CUDA backend. The reason for this cast was an invalid local memory base address for the associated variable.
- In the context of AMD GPU, assigns a NULL value of ~0 for the sycl_local and sycl_private address spaces, to match the values used for opencl_local and opencl_private.
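The second point can be sketched as a mapping from address space to null-pointer bit pattern. The enum below is a hypothetical subset and `nullPointerValue` is an illustrative name. Only the ~0 value for the four local/private address spaces comes from the description above; returning 0 for everything else is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of language address spaces.
enum class LangAS {
  Default,
  OpenCLGlobal,
  OpenCLLocal,
  OpenCLPrivate,
  SYCLGlobal,
  SYCLLocal,
  SYCLPrivate
};

// Null-pointer bit pattern per address space on an AMDGPU-like target:
// all ones for local/private, zero otherwise (assumed).
constexpr std::uint64_t nullPointerValue(LangAS as) {
  switch (as) {
    case LangAS::OpenCLLocal:
    case LangAS::OpenCLPrivate:
    case LangAS::SYCLLocal:
    case LangAS::SYCLPrivate:
      return ~std::uint64_t{0};
    default:
      return 0;
  }
}

// The SYCL address spaces agree with their OpenCL counterparts.
static_assert(nullPointerValue(LangAS::SYCLLocal) ==
                  nullPointerValue(LangAS::OpenCLLocal),
              "sycl_local and opencl_local must share a null value");
static_assert(nullPointerValue(LangAS::SYCLPrivate) ==
                  nullPointerValue(LangAS::OpenCLPrivate),
              "sycl_private and opencl_private must share a null value");

void nullPointerExample() {
  assert(nullPointerValue(LangAS::SYCLLocal) == ~std::uint64_t{0});
  assert(nullPointerValue(LangAS::SYCLGlobal) == 0);
}
```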

This reverts commit 33a6ce1. There is a bug in the implementation; John Lu suggested reverting it first.

This reverts commit 1d53bf4.

jsji · 2024-03-01T04:33:58Z

The e2e failures are not specific to this PR; they are also seen in https://github.com/intel/llvm/actions/runs/8104206945/job/22150442787.

jsji · 2024-03-01T04:37:03Z

This is ready for review.

Revert "[HIP] Allow partial linking for -fgpu-rdc (#81700)": conflicts with bc support in sycl. @LU-JOHN is working on a fix; reverted first, per his suggestion.
Revert "Fix OpGroupNonUniformBroadcast version requirement (#2378)" (bbf3800): @MrSidims helped take a look. Unfortunately, the id passed to OpGroupNonUniformBroadcast is always a runtime value for SYCL headers, so we need to revert the patch. Also recorded in "Track all customizations that are made in llvm-spirv in intel/llvm" #7592.

jsji · 2024-03-01T04:39:34Z

@LU-JOHN @MrSidims Would you please add comments or approve explicitly so that this can be merged? Thanks.

MrSidims

The revert is lgtm

jsji · 2024-03-05T03:00:10Z

@LU-JOHN Would you please add comments or approve explicitly so that this can be merged? Thanks.

LU-JOHN · 2024-03-05T17:33:22Z

Reversion is good. Community commit has a bug when unbundling an archive of objects. Community issue tracked in:

llvm/llvm-project#83509

jsji · 2024-03-05T17:36:02Z

@bader @intel/llvm-gatekeepers Can we get this merged? Thanks.

bader · 2024-03-05T18:59:58Z

/merge

bb-sycl · 2024-03-05T19:00:21Z

Tue 05 Mar 2024 07:00:21 PM UTC --- Start to merge the commit into sycl branch. It will take several minutes.

bb-sycl · 2024-03-05T19:07:48Z

Tue 05 Mar 2024 07:07:47 PM UTC --- Merge the branch in this PR to base automatically. Will close the PR later.

ostannard and others added 30 commits February 26, 2024 12:21

Pre-commit test showing bug #80287

8779cf6

This test shows the bug where LR is used as a general-purpose register on a code path where it is not spilled to the stack.

[TBAA] Add additional bitfield tests.

f290c00

Additional test for llvm/llvm-project#82922.

[clang][Interp][NFC] Fix comment typo

58aa995

[clang][Interp] Implement a few _is_lock_free builtins

a35599b

Implementation looks similar to the one in the current interpreter. Except for three static assertions, test/Sema/atomic-ops.c works.

[AMDGPU] Only try DecoderTables for the current subtarget. NFCI. (#82…

60e7ae3

…992) Speed up disassembly by only calling tryDecodeInst for DecoderTables that make sense for the current subtarget. This gives a 1.3x speed-up on check-llvm-mc-disassembler-amdgpu in my Release+Asserts build.

[clang][NFC] Prefer usings over typedefs (#82920)

96e536e

[AMDGPU] Rename a DS class template argument. NFC.

d41615e

The name hasGDS better reflects what it is used for.

Revert "Enable JumpTableToSwitch pass by default (#82546)"

9c5ca6b

This reverts commit 1069823. This has caused second stage timeouts when building Flang on AArch64: https://lab.llvm.org/buildbot/#/builders/179/builds/9442

[clang-format] Add AlignConsecutiveTableGenCondOperatorColons option.…

046682e

… (#82878) To align colons inside TableGen !cond operators.

[lldb][test][Windows] Skip thread state test on Windows

285bff3

This actually passes on Windows but I don't know how to convey that with an xfail without clashing with the xfail for all platforms. At least this avoids a UPASS.

[lldb][test][Windows] Don't assert that module cache is empty

8ce81e5

For whatever reason on Windows, it is not at this point. The copy of unit test we used to use would ignore failures during teardown but Python's does not.

[clang][Interp] Try to atomic.c on Mac

7c52d0c

This test was broken on macOS, see the discussion in llvm/llvm-project@a35599b

[gn build] Port 2823340

ce78dfa

[gn build] Port 440b174

62e88bc

[gn build] Port 8c5e9cf

f887fad

[libc][NFC] Delete unused file (#82980)

ac86a76

Identified in llvm/llvm-project#77741 (review)

[AMDGPU] Reduce duplication in DS Real instruction definitions. NFC. …

83feb84

…(#83007) For renamed instructions, there is no need to mention the new name twice on every line defining a Real.

[llvm][CodeGen] Add ValueType v3i1. [NFCI] (#82338)

969d7ec

[clang][Interp] Fix lvalue CompoundLiteralExprs

b504870

We need to leave a pointer on the stack for them, even if their type is primitive.

[lldb][test] Remove vendored packages unittest2 and progress (#82…

252f1cd

…670) The `unittest2` package is unused since 5b38615. The `progress` package was only used internally by `unittest2`, so it can be deleted as well.

sys-ce-bb added the disable-lint label (Skip linter check step and proceed with build jobs) Feb 29, 2024

sys-ce-bb had a problem deploying to WindowsCILock February 29, 2024 14:13 — with GitHub Actions Failure

sys-ce-bb had a problem deploying to WindowsCILock February 29, 2024 15:52 — with GitHub Actions Failure

jsji had a problem deploying to WindowsCILock February 29, 2024 20:46 — with GitHub Actions Failure

jsji had a problem deploying to WindowsCILock February 29, 2024 21:23 — with GitHub Actions Failure

jsji and others added 2 commits February 29, 2024 16:23

Revert "[HIP] Allow partial linking for -fgpu-rdc (#81700)"

5c14376

This reverts commit 33a6ce1. There is a bug in the implementation; John Lu suggested reverting it first.

Revert "Fix OpGroupNonUniformBroadcast version requirement (#2378)"

bbf3800

This reverts commit 1d53bf4.

jsji force-pushed the llvmspirv_pulldown branch from 6e5cb8b to bbf3800 Compare March 1, 2024 00:42

jsji temporarily deployed to WindowsCILock March 1, 2024 00:43 — with GitHub Actions Inactive

jsji mentioned this pull request Mar 1, 2024

Track all customizations that are made in llvm-spirv in intel/llvm #7592

Open

9 tasks

jsji had a problem deploying to WindowsCILock March 1, 2024 02:23 — with GitHub Actions Failure

jsji marked this pull request as ready for review March 1, 2024 04:33

jsji requested review from a team and bader as code owners March 1, 2024 04:33

MrSidims approved these changes Mar 5, 2024

bb-sycl approved these changes Mar 5, 2024

bb-sycl merged commit 70cca70 into sycl Mar 5, 2024
11 of 14 checks passed
