-
Notifications
You must be signed in to change notification settings - Fork 166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proclaim pair and tuple trivially relocatable #2010
Conversation
/ok to test |
Just merged main in to kick CI. There was a GHA outage last night, it looks like this run may have been a casualty. |
I thought we'd already taken care of this here: #1249 Or does this PR do more than that? |
This is |
@jrhemstad the C++ standard gives us the trait "trivially_copyable" to detect whether copying a type can be done by copying its byte-wise representation. We are not allowed to manually mark types as such by specializating Both traits should be recursive, so a Also, a |
@@ -39,6 +39,8 @@ | |||
# pragma system_header | |||
#endif // no system header | |||
|
|||
#include <thrust/type_traits/is_trivially_relocatable.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am of the opinion that we should be moving this trait into libcu++ and derive it from is_trivially_copyable
as a default and then users to specialize it.
But that may be something for another day. Note that the definition from the paper is much more complex and would require compiler support
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I also check for this trait inside CUB. I don't mind how it's called and whether it is really about relocatability
. It should just be is_trivially_copyable
with opt-in.
static_assert(thrust::is_trivially_relocatable<int2>::value, ""); | ||
static_assert(thrust::is_trivially_relocatable<int3>::value, ""); | ||
static_assert(thrust::is_trivially_relocatable<int4>::value, ""); | ||
static_assert(thrust::is_trivially_relocatable<__int128>::value, ""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be guarded by availability of __int128_t
We have slightly different definitions for those in cub and libcu++. Choose wisely ;)
Those are _LIBCUDACXX_HAS_NO_INT128
and CUB_IS_INT128_ENABLED
If you want to go above and beyond we might want a _CCCL_HAS_INT128
that is globally available
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I take libcu++.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would I want to have a _CCCL_HAS_INT128
, if everything depends on libcu++ anyway? I would just accumulate all configuration there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reasone is
a) libcu++ has the double negation from libc++ which is bad
b) we want to move global macro definitions into cccl because they have nothing to do with libcu++. compilers and standard modes are example of such macros. Availability of int128 definitely is one too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we want to move global macro definitions into cccl because they have nothing to do with libcu++.
This is news to me, I thought libcu++ was at the root of food chain.
Anyhow, I don't intend to fix this within this PR. Here is an issue: #2067
So, now GCC rightfully complains:
This is true from the standard's point of view, but we want to |
in #2127 I am making tuple The same is already true for pair, so do we actually need this at all? Or does thrust not default to |
I think we still need it to support the case of Edit: I had another look and I believe we still need this PR for exactly the above reason. |
c026a84
to
f8badf2
Compare
I added tests for the propagation of trivial relocatablity through |
🟨 CI finished in 2h 40m: Pass: 98%/250 | Total: 5d 02h | Avg: 29m 25s | Max: 57m 03s | Hits: 51%/246264
|
Project | |
---|---|
CCCL Infrastructure | |
libcu++ | |
CUB | |
+/- | Thrust |
CUDA Experimental | |
pycuda |
Modifications in project or dependencies?
Project | |
---|---|
CCCL Infrastructure | |
libcu++ | |
+/- | CUB |
+/- | Thrust |
CUDA Experimental | |
+/- | pycuda |
🏃 Runner counts (total jobs: 250)
# | Runner |
---|---|
178 | linux-amd64-cpu16 |
41 | linux-amd64-gpu-v100-latest-1 |
16 | linux-arm64-cpu16 |
15 | windows-amd64-cpu16 |
🟨 CI finished in 3h 51m: Pass: 99%/250 | Total: 5d 03h | Avg: 29m 36s | Max: 57m 03s | Hits: 51%/247998
|
Project | |
---|---|
CCCL Infrastructure | |
libcu++ | |
CUB | |
+/- | Thrust |
CUDA Experimental | |
pycuda |
Modifications in project or dependencies?
Project | |
---|---|
CCCL Infrastructure | |
libcu++ | |
+/- | CUB |
+/- | Thrust |
CUDA Experimental | |
+/- | pycuda |
🏃 Runner counts (total jobs: 250)
# | Runner |
---|---|
178 | linux-amd64-cpu16 |
41 | linux-amd64-gpu-v100-latest-1 |
16 | linux-arm64-cpu16 |
15 | windows-amd64-cpu16 |
🟨 CI finished in 6h 11m: Pass: 99%/250 | Total: 5d 03h | Avg: 29m 43s | Max: 57m 03s | Hits: 52%/248865
|
Project | |
---|---|
CCCL Infrastructure | |
libcu++ | |
CUB | |
+/- | Thrust |
CUDA Experimental | |
pycuda |
Modifications in project or dependencies?
Project | |
---|---|
CCCL Infrastructure | |
libcu++ | |
+/- | CUB |
+/- | Thrust |
CUDA Experimental | |
+/- | pycuda |
🏃 Runner counts (total jobs: 250)
# | Runner |
---|---|
178 | linux-amd64-cpu16 |
41 | linux-amd64-gpu-v100-latest-1 |
16 | linux-arm64-cpu16 |
15 | windows-amd64-cpu16 |
GCC 6 rightfully complains about:
|
That warning should already be disabled, wondering why it is popping up now |
I also see |
🟨 CI finished in 9h 15m: Pass: 99%/250 | Total: 1d 10h | Avg: 8m 21s | Max: 1h 07m | Hits: 99%/247998
|
Project | |
---|---|
CCCL Infrastructure | |
libcu++ | |
CUB | |
+/- | Thrust |
CUDA Experimental | |
pycuda |
Modifications in project or dependencies?
Project | |
---|---|
CCCL Infrastructure | |
libcu++ | |
+/- | CUB |
+/- | Thrust |
CUDA Experimental | |
+/- | pycuda |
🏃 Runner counts (total jobs: 250)
# | Runner |
---|---|
178 | linux-amd64-cpu16 |
41 | linux-amd64-gpu-v100-latest-1 |
16 | linux-arm64-cpu16 |
15 | windows-amd64-cpu16 |
🟨 CI finished in 1d 21h: Pass: 99%/250 | Total: 6d 00h | Avg: 34m 43s | Max: 1h 13m | Hits: 51%/14811
|
Project | |
---|---|
CCCL Infrastructure | |
libcu++ | |
CUB | |
+/- | Thrust |
CUDA Experimental | |
pycuda |
Modifications in project or dependencies?
Project | |
---|---|
CCCL Infrastructure | |
libcu++ | |
+/- | CUB |
+/- | Thrust |
CUDA Experimental | |
+/- | pycuda |
🏃 Runner counts (total jobs: 250)
# | Runner |
---|---|
178 | linux-amd64-cpu16 |
41 | linux-amd64-gpu-v100-latest-1 |
16 | linux-arm64-cpu16 |
15 | windows-amd64-cpu16 |
It turns out the literal operator suffix warning can only be suppressed in GCC >= 7, but was added in earlier GCC versions :D I disabled the test now for GCC 6. |
🟩 CI finished in 1d 13h: Pass: 100%/417 | Total: 6d 23h | Avg: 24m 01s | Max: 1h 06m | Hits: 92%/34050
|
Project | |
---|---|
CCCL Infrastructure | |
+/- | libcu++ |
CUB | |
+/- | Thrust |
CUDA Experimental | |
pycuda |
Modifications in project or dependencies?
Project | |
---|---|
CCCL Infrastructure | |
+/- | libcu++ |
+/- | CUB |
+/- | Thrust |
+/- | CUDA Experimental |
+/- | pycuda |
🏃 Runner counts (total jobs: 417)
# | Runner |
---|---|
305 | linux-amd64-cpu16 |
61 | linux-amd64-gpu-v100-latest-1 |
28 | linux-arm64-cpu16 |
23 | windows-amd64-cpu16 |
We check whether a type is trivially relocatable in several places in Thrust to decide whether we can copy a type by means of
memcpy
orcudaMemcpy
. This PR addspair
andtuple
to those types, when their respective elements are trivially relocatable.New unit test also covers
complex
.