-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not require merge to have identical value types for both inputs #2054
Conversation
Our beloved `tuple_of_iterator_references` does nto like ternary operators because that does not understand the underlying value txypes of the proxies.
static_assert(::cuda::std::is_same<cub::detail::value_t<KeyIt2>, key_t>::value, ""); | ||
static_assert(::cuda::std::is_same<cub::detail::value_t<ValueIt2>, value_t>::value, ""); | ||
static_assert(::cuda::std::__invokable<CompareOp, key_t, key_t>::value, | ||
static_assert(::cuda::std::__invokable<CompareOp, cub::detail::value_t<KeyIt1>, cub::detail::value_t<KeyIt1>>::value, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bernhardmgruber as discussed in #1817 (comment), this is a breaking change and should go into device code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You seem to have added the static assert in the device code but haven't removed it from the dispatch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this static assertion about the return type:
::cuda::std::is_convertible<typename ::cuda::std::__invoke_of<CompareOp, key_t, key_t>::type, bool>::value
which I have removed. The offending static assert here was not commented at in PR #1817.
I thought querying whether a comparison operator is invokable with a given set of arguments should be fine in host code. Only asking for the return type is problematic. My reasoning was that it's much more user friendly to have a compile error here if the user passed a wrong operator, than many stacks deeper in the kernel code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only asking for the return type is problematic.
Any form of introspection of the signature is problematic, including:
Introspecting the parameter type of operator() is only supported in device code.
Let's move the check to device code.
static_assert(::cuda::std::is_same<cub::detail::value_t<KeyIt2>, key_t>::value, ""); | ||
static_assert(::cuda::std::is_same<cub::detail::value_t<ValueIt2>, value_t>::value, ""); | ||
static_assert(::cuda::std::__invokable<CompareOp, key_t, key_t>::value, | ||
static_assert(::cuda::std::__invokable<CompareOp, cub::detail::value_t<KeyIt1>, cub::detail::value_t<KeyIt1>>::value, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
critical: see #1817 (comment)
static_assert(::cuda::std::__invokable<CompareOp, cub::detail::value_t<KeyIt1>, cub::detail::value_t<KeyIt1>>::value, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, you just marked the second static_assert
on the review, but your textual comment applied to both. I understand now.
// Cannot check output iterators, since they could be discard iterators, which do not have the right value_type | ||
static_assert(::cuda::std::is_same<cub::detail::value_t<KeyIt2>, key_t>::value, ""); | ||
static_assert(::cuda::std::is_same<cub::detail::value_t<ValueIt2>, value_t>::value, ""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: @miscco we require this in our docs:
InputIterator1 and InputIterator2 have the same value_type
If we don't want to enforce that, we should relax requirements in the docs (both Thrust and CUB) and add tests. I think it's fine to merge this PR without doing that, but we should at least file an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So people are already doing it as we have seen with cuDF.
I believe we should think about something in the direction of is_assignable<value_t<OutIt>&, value_t<InIt1>>
with the obvious caveat that it needs to work for discard iterators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would argue it should be is_assignable<reference_t<OutIt>, value_t<InIt1>>
. That would handle proxy references.
I will add a test. |
🟨 CI finished in 10h 54m: Pass: 99%/250 | Total: 5d 08h | Avg: 30m 44s | Max: 1h 40m | Hits: 81%/248859
|
Project | |
---|---|
CCCL Infrastructure | |
libcu++ | |
+/- | CUB |
Thrust | |
CUDA Experimental | |
pycuda |
Modifications in project or dependencies?
Project | |
---|---|
CCCL Infrastructure | |
libcu++ | |
+/- | CUB |
+/- | Thrust |
CUDA Experimental | |
+/- | pycuda |
🏃 Runner counts (total jobs: 250)
# | Runner |
---|---|
178 | linux-amd64-cpu16 |
41 | linux-amd64-gpu-v100-latest-1 |
16 | linux-arm64-cpu16 |
15 | windows-amd64-cpu16 |
Ok, this opened a huge can of worms. I tried with key1_type = key_type keys_loc[items_per_thread];
gmem_to_reg<threads_per_block, IsFullTile>(
keys_loc, keys1_in + keys1_beg, keys2_in + keys2_beg, num_keys1, num_keys2);
reg_to_shared<threads_per_block>(&storage.keys_shared[0], keys_loc); Here, This all started to sound a bit too scary, and I think we may only allow a subset of these combinations. For completness, it seems |
I'm late to the party but I totally share that concern and started working on a reproducer yesterday: I think ultimately, the right thing would be to have a specialization for the case where the For now, I would like to avoid opening up more than we used to before the refactor, to avoid users establishing the wrong expectations. But I'm afraid we were already quite permissive with regards to different types, right? |
I have opened #2075 which mostly supersedes this PR. I tried to fix as little as possible to make the cuDF use case work. |
closing in favor of #2075 |
Fixes some minor issues that break cuDF
I am happy to discuss what to do with the static asserts.
I did not catch it during review but I believe the asserts are too strict.
Most likely we actually want something the likes of
is_assignable<>