Allow (somewhat) different input value types for merge #2075

Merged (3 commits) on Jul 26, 2024


bernhardmgruber
Contributor

This PR fixes some issues in cuDF that were introduced in #1817. The problem is that cub::DeviceMerge does not support input sequences with different key types, as documented. However, the old Thrust API allowed this (although perhaps not correctly in all cases). This PR restores the old Thrust behavior to fix cuDF.

CUB should eventually allow properly mixing input types, which is tracked by #2074.
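
To make the relaxed requirement concrete, here is a minimal host-only sketch in standard C++. It is not the Thrust or CUB implementation: `merge_mixed` is a hypothetical helper, and converting the second range's values before comparing is an assumption of this sketch, not something the PR text states.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical helper (illustration only, not a Thrust/CUB API): merges two
// sorted ranges whose value types may differ, provided the second range's
// values are convertible to the value type of the first range.
template <typename It1, typename It2, typename Compare>
std::vector<typename std::iterator_traits<It1>::value_type>
merge_mixed(It1 first1, It1 last1, It2 first2, It2 last2, Compare comp)
{
  using key_t  = typename std::iterator_traits<It1>::value_type;
  using key2_t = typename std::iterator_traits<It2>::value_type;

  // Relaxed requirement: convertibility instead of identical value types.
  static_assert(std::is_convertible<key2_t, key_t>::value,
                "second range must be convertible to the first range's value type");

  // Debug-only precondition: the first range must already be sorted.
  assert(std::is_sorted(first1, last1, comp));

  std::vector<key_t> out;
  out.reserve(static_cast<std::size_t>(std::distance(first1, last1) + std::distance(first2, last2)));

  while (first1 != last1 && first2 != last2)
  {
    // Assumption of this sketch: convert first, so comp only ever sees key_t.
    const key_t rhs = static_cast<key_t>(*first2);
    if (comp(rhs, *first1))
    {
      out.push_back(rhs);
      ++first2;
    }
    else
    {
      // Ties take the element from the first range, which keeps the merge stable.
      out.push_back(*first1);
      ++first1;
    }
  }
  for (; first1 != last1; ++first1) out.push_back(*first1);
  for (; first2 != last2; ++first2) out.push_back(static_cast<key_t>(*first2));
  return out;
}
```

Calling it with a `std::vector<long long>` and a `std::vector<int>` compiles because `int` converts to `long long`. A strict `is_same` check on the two value types would reject the same call.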

Supersedes: #2054

@bernhardmgruber bernhardmgruber added bug Something isn't working right. cub For all items related to CUB labels Jul 25, 2024
@bernhardmgruber bernhardmgruber marked this pull request as ready for review July 25, 2024 13:12
@bernhardmgruber bernhardmgruber requested review from a team as code owners July 25, 2024 13:12
Allow different input value types for merge, as long as they are convertible to the value type of the first iterator. This weakens the publicly documented guarantees of equal value types to restore the old behavior of the thrust implementation replaced in NVIDIA#1817.
Comment on lines -60 to -62
using key_t = typename ::cuda::std::iterator_traits<KeyIt1>::value_type;
static_assert(::cuda::std::is_same<key_t, typename ::cuda::std::iterator_traits<KeyIt2>::value_type>::value, "");

Contributor Author

This assertion was not present in the old Thrust implementation.

Comment on lines -67 to -71
static_assert(
::cuda::std::is_convertible<typename ::cuda::std::__invoke_of<CompareOp, value_t<KeyIt1>, value_t<KeyIt1>>::type,
bool>::value,
"Comparison operator must be convertible to bool");

Contributor Author

I dropped this check entirely, since a bad comparison operator would be diagnosed one level deeper inside cub::MergePath.
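
For reference, the kind of up-front check that was dropped can be sketched in portable C++11 as a small trait. This is only an approximation: the removed code used the libcu++ internal `__invoke_of`, whereas the hypothetical `returns_bool_like` trait and the `BadCompare` type below are made up for illustration and rely on `decltype` and `std::declval`.

```cpp
#include <functional>
#include <type_traits>
#include <utility>

// Hypothetical trait (not part of CUB or libcu++): true when calling
// Compare with two values of type T yields something convertible to bool.
template <typename Compare, typename T>
struct returns_bool_like
    : std::is_convertible<
        decltype(std::declval<Compare&>()(std::declval<const T&>(), std::declval<const T&>())),
        bool>
{};

// A deliberately bad comparison operator: it is callable but returns void.
struct BadCompare
{
  void operator()(int, int) const {}
};

// An early check like this rejects BadCompare with a readable message.
static_assert(returns_bool_like<std::less<int>, int>::value,
              "Comparison operator must be convertible to bool");
static_assert(!returns_bool_like<BadCompare, int>::value,
              "BadCompare returns void, which is not convertible to bool");
```

Without such a trait, a comparator like `BadCompare` still fails to compile, only later, at the point where its result is first used as a condition.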

@bernhardmgruber bernhardmgruber enabled auto-merge (squash) July 25, 2024 19:19
Contributor

🟨 CI finished in 10h 50m: Pass: 96%/250 | Total: 6d 02h | Avg: 35m 15s | Max: 1h 39m | Hits: 77%/240435
  • 🟨 cub: Pass: 94%/131 | Total: 3d 23h | Avg: 43m 43s | Max: 1h 39m | Hits: 86%/105055

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  94%/123 | Total:  3d 17h | Avg: 43m 30s | Max:  1h 39m | Hits:  86%/98119 
      🟩 arm64              Pass: 100%/8   | Total:  6h 15m | Avg: 46m 53s | Max: 48m 15s | Hits:  84%/6936  
    🔍 ctk: 12.5 🔍
      🟩 11.1               Pass: 100%/15  | Total:  9h 04m | Avg: 36m 17s | Max: 44m 49s | Hits:  87%/11792 
      🟩 11.8               Pass: 100%/3   | Total:  2h 40m | Avg: 53m 29s | Max: 54m 49s | Hits:  84%/2601  
      🔍 12.5               Pass:  93%/113 | Total:  3d 11h | Avg: 44m 26s | Max:  1h 39m | Hits:  86%/90662 
    🔍 cudacxx: nvcc12.5 🔍
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 29m 16s | Avg: 14m 38s | Max: 14m 44s | Hits:  87%/1436  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  9h 04m | Avg: 36m 17s | Max: 44m 49s | Hits:  87%/11792 
      🟩 nvcc11.8           Pass: 100%/3   | Total:  2h 40m | Avg: 53m 29s | Max: 54m 49s | Hits:  84%/2601  
      🔍 nvcc12.5           Pass:  93%/111 | Total:  3d 11h | Avg: 44m 59s | Max:  1h 39m | Hits:  86%/89226 
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 29m 16s | Avg: 14m 38s | Max: 14m 44s | Hits:  87%/1436  
      🔍 nvcc               Pass:  94%/129 | Total:  3d 22h | Avg: 44m 10s | Max:  1h 39m | Hits:  86%/103619
    🟨 cxx
      🟩 Clang9             Pass: 100%/6   | Total:  3h 53m | Avg: 38m 55s | Max: 45m 02s | Hits:  86%/4980  
      🟩 Clang10            Pass: 100%/3   | Total:  2h 04m | Avg: 41m 32s | Max: 42m 13s | Hits:  85%/2607  
      🟩 Clang11            Pass: 100%/4   | Total:  2h 43m | Avg: 40m 46s | Max: 42m 04s | Hits:  85%/3476  
      🟩 Clang12            Pass: 100%/4   | Total:  2h 46m | Avg: 41m 37s | Max: 46m 26s | Hits:  85%/3476  
      🟩 Clang13            Pass: 100%/4   | Total:  2h 46m | Avg: 41m 44s | Max: 44m 49s | Hits:  85%/3476  
      🟩 Clang14            Pass: 100%/4   | Total:  2h 42m | Avg: 40m 31s | Max: 41m 04s | Hits:  85%/3476  
      🟩 Clang15            Pass: 100%/4   | Total:  2h 43m | Avg: 40m 52s | Max: 44m 33s | Hits:  85%/3468  
      🟩 Clang16            Pass: 100%/4   | Total:  2h 42m | Avg: 40m 33s | Max: 41m 19s | Hits:  85%/3468  
      🟨 Clang17            Pass:  80%/26  | Total: 12h 04m | Avg: 27m 52s | Max: 48m 15s | Hits:  93%/17909 
      🟩 GCC6               Pass: 100%/2   | Total:  1h 10m | Avg: 35m 24s | Max: 35m 48s | Hits:  87%/1582  
      🟩 GCC7               Pass: 100%/6   | Total:  3h 50m | Avg: 38m 24s | Max: 42m 38s | Hits:  85%/4983  
      🟩 GCC8               Pass: 100%/6   | Total:  3h 49m | Avg: 38m 12s | Max: 40m 46s | Hits:  85%/4983  
      🟩 GCC9               Pass: 100%/6   | Total:  3h 48m | Avg: 38m 07s | Max: 41m 07s | Hits:  85%/4983  
      🟩 GCC10              Pass: 100%/4   | Total:  2h 53m | Avg: 43m 28s | Max: 45m 50s | Hits:  84%/3476  
      🟩 GCC11              Pass: 100%/7   | Total:  5h 34m | Avg: 47m 49s | Max: 54m 49s | Hits:  84%/6069  
      🟩 GCC12              Pass: 100%/4   | Total:  2h 52m | Avg: 43m 11s | Max: 46m 20s | Hits:  84%/3468  
      🟨 GCC13              Pass:  92%/28  | Total:  1d 05h | Avg:  1h 03m | Max:  1h 39m | Hits:  82%/22542 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  2h 08m | Avg: 42m 47s | Max: 43m 16s | Hits:  87%/2379  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 44m 49s | Avg: 44m 49s | Max: 44m 49s | Hits:  86%/709   
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 50m | Avg: 55m 09s | Max: 55m 45s | Hits:  86%/1418  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 39m | Avg: 53m 17s | Max: 57m 15s | Hits:  86%/2127  
    🟨 cxx_family
      🟨 Clang              Pass:  91%/59  | Total:  1d 10h | Avg: 35m 02s | Max: 48m 15s | Hits:  88%/46336 
      🟨 GCC                Pass:  96%/63  | Total:  2d 05h | Avg: 51m 03s | Max:  1h 39m | Hits:  84%/52086 
      🟩 Intel              Pass: 100%/3   | Total:  2h 08m | Avg: 42m 47s | Max: 43m 16s | Hits:  87%/2379  
      🟩 MSVC               Pass: 100%/6   | Total:  5h 15m | Avg: 52m 30s | Max: 57m 15s | Hits:  86%/4254  
    🟨 jobs
      🟩 Build              Pass: 100%/99  | Total:  2d 19h | Avg: 40m 47s | Max: 57m 15s | Hits:  85%/83380 
      🟨 DeviceLaunch       Pass:  87%/8   | Total:  7h 00m | Avg: 52m 34s | Max:  1h 29m | Hits:  86%/6069  
      🟨 GraphCapture       Pass:  75%/8   | Total:  6h 25m | Avg: 48m 13s | Max:  1h 39m | Hits:  92%/5202  
      🟨 HostLaunch         Pass:  62%/8   | Total:  6h 19m | Avg: 47m 28s | Max:  1h 38m | Hits:  87%/4335  
      🟨 TestGPU            Pass:  87%/8   | Total:  8h 23m | Avg:  1h 02m | Max:  1h 37m | Hits:  91%/6069  
    🟨 gpu
      🟨 v100               Pass:  94%/131 | Total:  3d 23h | Avg: 43m 43s | Max:  1h 39m | Hits:  86%/105055
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total:  2h 40m | Avg: 53m 29s | Max: 54m 49s | Hits:  84%/2601  
      🟩 90a                Pass: 100%/4   | Total:  1h 13m | Avg: 18m 17s | Max: 19m 00s | Hits:  84%/3468  
    🟨 std
      🟨 11                 Pass:  91%/34  | Total:  1d 00h | Avg: 43m 10s | Max:  1h 39m | Hits:  84%/26446 
      🟨 14                 Pass:  97%/37  | Total:  1d 03h | Avg: 44m 27s | Max:  1h 37m | Hits:  87%/30307 
      🟨 17                 Pass:  97%/36  | Total:  1d 02h | Avg: 43m 43s | Max:  1h 34m | Hits:  85%/29525 
      🟨 20                 Pass:  91%/24  | Total: 17h 20m | Avg: 43m 20s | Max:  1h 28m | Hits:  88%/18777 
    
  • 🟨 thrust: Pass: 97%/118 | Total: 2d 03h | Avg: 26m 03s | Max: 1h 01m | Hits: 70%/135380

    🔍 cpu: amd64 🔍
      🔍 amd64              Pass:  97%/110 | Total:  1d 23h | Avg: 26m 04s | Max:  1h 01m | Hits:  70%/125960
      🟩 arm64              Pass: 100%/8   | Total:  3h 27m | Avg: 25m 55s | Max: 31m 24s | Hits:  65%/9420  
    🔍 ctk: 12.5 🔍
      🟩 11.1               Pass: 100%/15  | Total:  6h 23m | Avg: 25m 35s | Max: 54m 47s | Hits:  66%/17660 
      🟩 11.8               Pass: 100%/3   | Total:  1h 42m | Avg: 34m 06s | Max: 38m 55s | Hits:  72%/3534  
      🔍 12.5               Pass:  97%/100 | Total:  1d 19h | Avg: 25m 53s | Max:  1h 01m | Hits:  71%/114186
    🔍 cudacxx: nvcc12.5 🔍
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 47m 34s | Avg: 23m 47s | Max: 24m 01s | Hits:  71%/2354  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  6h 23m | Avg: 25m 35s | Max: 54m 47s | Hits:  66%/17660 
      🟩 nvcc11.8           Pass: 100%/3   | Total:  1h 42m | Avg: 34m 06s | Max: 38m 55s | Hits:  72%/3534  
      🔍 nvcc12.5           Pass:  96%/98  | Total:  1d 18h | Avg: 25m 55s | Max:  1h 01m | Hits:  71%/111832
    🔍 cudacxx_family: nvcc 🔍
      🟩 ClangCUDA          Pass: 100%/2   | Total: 47m 34s | Avg: 23m 47s | Max: 24m 01s | Hits:  71%/2354  
      🔍 nvcc               Pass:  97%/116 | Total:  2d 02h | Avg: 26m 05s | Max:  1h 01m | Hits:  70%/133026
    🔍 jobs: TestGPU 🔍
      🟩 Build              Pass: 100%/99  | Total:  1d 22h | Avg: 27m 59s | Max:  1h 01m | Hits:  67%/116553
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 46m | Avg:  9m 43s | Max: 19m 14s | Hits:  99%/12939 
      🔍 TestGPU            Pass:  62%/8   | Total:  3h 16m | Avg: 24m 37s | Max: 49m 07s | Hits:  67%/5888  
    🟨 cxx
      🟩 Clang9             Pass: 100%/6   | Total:  2h 32m | Avg: 25m 22s | Max: 30m 22s | Hits:  72%/7062  
      🟩 Clang10            Pass: 100%/3   | Total:  1h 16m | Avg: 25m 32s | Max: 27m 30s | Hits:  72%/3531  
      🟩 Clang11            Pass: 100%/4   | Total:  1h 43m | Avg: 25m 47s | Max: 28m 36s | Hits:  71%/4708  
      🟩 Clang12            Pass: 100%/4   | Total:  1h 47m | Avg: 26m 46s | Max: 29m 06s | Hits:  71%/4708  
      🟩 Clang13            Pass: 100%/4   | Total:  1h 38m | Avg: 24m 43s | Max: 27m 03s | Hits:  71%/4708  
      🟩 Clang14            Pass: 100%/4   | Total:  1h 43m | Avg: 25m 54s | Max: 27m 51s | Hits:  71%/4708  
      🟩 Clang15            Pass: 100%/4   | Total:  1h 45m | Avg: 26m 17s | Max: 29m 24s | Hits:  71%/4708  
      🟩 Clang16            Pass: 100%/4   | Total:  1h 44m | Avg: 26m 05s | Max: 29m 01s | Hits:  71%/4708  
      🟨 Clang17            Pass:  88%/18  | Total:  5h 34m | Avg: 18m 34s | Max: 30m 21s | Hits:  82%/18832 
      🟩 GCC6               Pass: 100%/2   | Total: 42m 15s | Avg: 21m 07s | Max: 23m 02s | Hits:  72%/2354  
      🟩 GCC7               Pass: 100%/6   | Total:  2h 29m | Avg: 24m 58s | Max: 29m 16s | Hits:  72%/7068  
      🟩 GCC8               Pass: 100%/6   | Total:  2h 27m | Avg: 24m 33s | Max: 27m 59s | Hits:  72%/7068  
      🟩 GCC9               Pass: 100%/6   | Total:  2h 36m | Avg: 26m 04s | Max: 30m 22s | Hits:  65%/7068  
      🟩 GCC10              Pass: 100%/4   | Total:  1h 48m | Avg: 27m 07s | Max: 31m 56s | Hits:  71%/4712  
      🟩 GCC11              Pass: 100%/7   | Total:  3h 31m | Avg: 30m 09s | Max: 38m 55s | Hits:  71%/8246  
      🟩 GCC12              Pass: 100%/4   | Total:  1h 48m | Avg: 27m 11s | Max: 29m 44s | Hits:  71%/4712  
      🟨 GCC13              Pass:  95%/20  | Total:  7h 21m | Avg: 22m 05s | Max: 49m 07s | Hits:  70%/22382 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  2h 00m | Avg: 40m 13s | Max: 45m 28s | Hits:  28%/3540  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 54m 47s | Avg: 54m 47s | Max: 54m 47s | Hits:  35%/1173  
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 53m | Avg: 56m 30s | Max: 57m 39s | Hits:  35%/2346  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  3h 55m | Avg: 39m 11s | Max:  1h 01m | Hits:  67%/7038  
    🟨 cxx_family
      🟨 Clang              Pass:  96%/51  | Total: 19h 45m | Avg: 23m 14s | Max: 30m 22s | Hits:  75%/57673 
      🟨 GCC                Pass:  98%/55  | Total: 22h 45m | Avg: 24m 50s | Max: 49m 07s | Hits:  70%/63610 
      🟩 Intel              Pass: 100%/3   | Total:  2h 00m | Avg: 40m 13s | Max: 45m 28s | Hits:  28%/3540  
      🟩 MSVC               Pass: 100%/9   | Total:  6h 42m | Avg: 44m 46s | Max:  1h 01m | Hits:  56%/10557 
    🟨 std
      🟨 11                 Pass:  96%/30  | Total: 10h 09m | Avg: 20m 19s | Max: 35m 26s | Hits:  75%/34151 
      🟨 14                 Pass:  97%/34  | Total: 15h 51m | Avg: 27m 58s | Max: 57m 39s | Hits:  66%/38843 
      🟩 17                 Pass: 100%/33  | Total: 15h 58m | Avg: 29m 03s | Max:  1h 00m | Hits:  68%/38847 
      🟨 20                 Pass:  95%/21  | Total:  9h 15m | Avg: 26m 26s | Max:  1h 01m | Hits:  75%/23539 
    🟨 gpu
      🟨 v100               Pass:  97%/118 | Total:  2d 03h | Avg: 26m 03s | Max:  1h 01m | Hits:  70%/135380
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total:  1h 42m | Avg: 34m 06s | Max: 38m 55s | Hits:  72%/3534  
      🟩 90a                Pass: 100%/4   | Total: 59m 49s | Avg: 14m 57s | Max: 15m 55s | Hits:  71%/4712  
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 250)

# Runner
178 linux-amd64-cpu16
41 linux-amd64-gpu-v100-latest-1
16 linux-arm64-cpu16
15 windows-amd64-cpu16

Contributor

🟩 CI finished in 23h 47m: Pass: 100%/250 | Total: 6d 02h | Avg: 35m 02s | Max: 1h 39m | Hits: 78%/250036
  • 🟩 cub: Pass: 100%/131 | Total: 3d 22h | Avg: 43m 22s | Max: 1h 39m | Hits: 87%/111124

    🟩 cpu
      🟩 amd64              Pass: 100%/123 | Total:  3d 16h | Avg: 43m 08s | Max:  1h 39m | Hits:  87%/104188
      🟩 arm64              Pass: 100%/8   | Total:  6h 15m | Avg: 46m 53s | Max: 48m 15s | Hits:  84%/6936  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  9h 04m | Avg: 36m 17s | Max: 44m 49s | Hits:  87%/11792 
      🟩 11.8               Pass: 100%/3   | Total:  2h 40m | Avg: 53m 29s | Max: 54m 49s | Hits:  84%/2601  
      🟩 12.5               Pass: 100%/113 | Total:  3d 10h | Avg: 44m 02s | Max:  1h 39m | Hits:  87%/96731 
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 29m 16s | Avg: 14m 38s | Max: 14m 44s | Hits:  87%/1436  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  9h 04m | Avg: 36m 17s | Max: 44m 49s | Hits:  87%/11792 
      🟩 nvcc11.8           Pass: 100%/3   | Total:  2h 40m | Avg: 53m 29s | Max: 54m 49s | Hits:  84%/2601  
      🟩 nvcc12.5           Pass: 100%/111 | Total:  3d 10h | Avg: 44m 34s | Max:  1h 39m | Hits:  87%/95295 
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 29m 16s | Avg: 14m 38s | Max: 14m 44s | Hits:  87%/1436  
      🟩 nvcc               Pass: 100%/129 | Total:  3d 22h | Avg: 43m 48s | Max:  1h 39m | Hits:  87%/109688
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total:  3h 53m | Avg: 38m 55s | Max: 45m 02s | Hits:  86%/4980  
      🟩 Clang10            Pass: 100%/3   | Total:  2h 04m | Avg: 41m 32s | Max: 42m 13s | Hits:  85%/2607  
      🟩 Clang11            Pass: 100%/4   | Total:  2h 43m | Avg: 40m 46s | Max: 42m 04s | Hits:  85%/3476  
      🟩 Clang12            Pass: 100%/4   | Total:  2h 46m | Avg: 41m 37s | Max: 46m 26s | Hits:  85%/3476  
      🟩 Clang13            Pass: 100%/4   | Total:  2h 46m | Avg: 41m 44s | Max: 44m 49s | Hits:  85%/3476  
      🟩 Clang14            Pass: 100%/4   | Total:  2h 42m | Avg: 40m 31s | Max: 41m 04s | Hits:  85%/3476  
      🟩 Clang15            Pass: 100%/4   | Total:  2h 43m | Avg: 40m 52s | Max: 44m 33s | Hits:  85%/3468  
      🟩 Clang16            Pass: 100%/4   | Total:  2h 42m | Avg: 40m 33s | Max: 41m 19s | Hits:  85%/3468  
      🟩 Clang17            Pass: 100%/26  | Total: 12h 43m | Avg: 29m 21s | Max: 48m 15s | Hits:  94%/22244 
      🟩 GCC6               Pass: 100%/2   | Total:  1h 10m | Avg: 35m 24s | Max: 35m 48s | Hits:  87%/1582  
      🟩 GCC7               Pass: 100%/6   | Total:  3h 50m | Avg: 38m 24s | Max: 42m 38s | Hits:  85%/4983  
      🟩 GCC8               Pass: 100%/6   | Total:  3h 49m | Avg: 38m 12s | Max: 40m 46s | Hits:  85%/4983  
      🟩 GCC9               Pass: 100%/6   | Total:  3h 48m | Avg: 38m 07s | Max: 41m 07s | Hits:  85%/4983  
      🟩 GCC10              Pass: 100%/4   | Total:  2h 53m | Avg: 43m 28s | Max: 45m 50s | Hits:  84%/3476  
      🟩 GCC11              Pass: 100%/7   | Total:  5h 34m | Avg: 47m 49s | Max: 54m 49s | Hits:  84%/6069  
      🟩 GCC12              Pass: 100%/4   | Total:  2h 52m | Avg: 43m 11s | Max: 46m 20s | Hits:  84%/3468  
      🟩 GCC13              Pass: 100%/28  | Total:  1d 04h | Avg:  1h 00m | Max:  1h 39m | Hits:  84%/24276 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  2h 08m | Avg: 42m 47s | Max: 43m 16s | Hits:  87%/2379  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 44m 49s | Avg: 44m 49s | Max: 44m 49s | Hits:  86%/709   
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 50m | Avg: 55m 09s | Max: 55m 45s | Hits:  86%/1418  
      🟩 MSVC14.39          Pass: 100%/3   | Total:  2h 39m | Avg: 53m 17s | Max: 57m 15s | Hits:  86%/2127  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/59  | Total:  1d 11h | Avg: 35m 41s | Max: 48m 15s | Hits:  89%/50671 
      🟩 GCC                Pass: 100%/63  | Total:  2d 04h | Avg: 49m 43s | Max:  1h 39m | Hits:  84%/53820 
      🟩 Intel              Pass: 100%/3   | Total:  2h 08m | Avg: 42m 47s | Max: 43m 16s | Hits:  87%/2379  
      🟩 MSVC               Pass: 100%/6   | Total:  5h 15m | Avg: 52m 30s | Max: 57m 15s | Hits:  86%/4254  
    🟩 gpu
      🟩 v100               Pass: 100%/131 | Total:  3d 22h | Avg: 43m 22s | Max:  1h 39m | Hits:  87%/111124
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  2d 19h | Avg: 40m 47s | Max: 57m 15s | Hits:  85%/83380 
      🟩 DeviceLaunch       Pass: 100%/8   | Total:  7h 14m | Avg: 54m 16s | Max:  1h 29m | Hits:  88%/6936  
      🟩 GraphCapture       Pass: 100%/8   | Total:  5h 43m | Avg: 42m 56s | Max:  1h 39m | Hits:  94%/6936  
      🟩 HostLaunch         Pass: 100%/8   | Total:  6h 06m | Avg: 45m 48s | Max:  1h 38m | Hits:  92%/6936  
      🟩 TestGPU            Pass: 100%/8   | Total:  8h 19m | Avg:  1h 02m | Max:  1h 37m | Hits:  92%/6936  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total:  2h 40m | Avg: 53m 29s | Max: 54m 49s | Hits:  84%/2601  
      🟩 90a                Pass: 100%/4   | Total:  1h 13m | Avg: 18m 17s | Max: 19m 00s | Hits:  84%/3468  
    🟩 std
      🟩 11                 Pass: 100%/34  | Total:  1d 00h | Avg: 43m 36s | Max:  1h 39m | Hits:  86%/29047 
      🟩 14                 Pass: 100%/37  | Total:  1d 02h | Avg: 43m 10s | Max:  1h 37m | Hits:  87%/31174 
      🟩 17                 Pass: 100%/36  | Total:  1d 02h | Avg: 44m 06s | Max:  1h 34m | Hits:  85%/30392 
      🟩 20                 Pass: 100%/24  | Total: 16h 53m | Avg: 42m 12s | Max:  1h 28m | Hits:  89%/20511 
    
  • 🟩 thrust: Pass: 100%/118 | Total: 2d 03h | Avg: 25m 59s | Max: 1h 01m | Hits: 71%/138912

    🟩 cpu
      🟩 amd64              Pass: 100%/110 | Total:  1d 23h | Avg: 26m 00s | Max:  1h 01m | Hits:  71%/129492
      🟩 arm64              Pass: 100%/8   | Total:  3h 27m | Avg: 25m 55s | Max: 31m 24s | Hits:  65%/9420  
    🟩 ctk
      🟩 11.1               Pass: 100%/15  | Total:  6h 23m | Avg: 25m 35s | Max: 54m 47s | Hits:  66%/17660 
      🟩 11.8               Pass: 100%/3   | Total:  1h 42m | Avg: 34m 06s | Max: 38m 55s | Hits:  72%/3534  
      🟩 12.5               Pass: 100%/100 | Total:  1d 19h | Avg: 25m 48s | Max:  1h 01m | Hits:  71%/117718
    🟩 cudacxx
      🟩 ClangCUDA17        Pass: 100%/2   | Total: 47m 34s | Avg: 23m 47s | Max: 24m 01s | Hits:  71%/2354  
      🟩 nvcc11.1           Pass: 100%/15  | Total:  6h 23m | Avg: 25m 35s | Max: 54m 47s | Hits:  66%/17660 
      🟩 nvcc11.8           Pass: 100%/3   | Total:  1h 42m | Avg: 34m 06s | Max: 38m 55s | Hits:  72%/3534  
      🟩 nvcc12.5           Pass: 100%/98  | Total:  1d 18h | Avg: 25m 51s | Max:  1h 01m | Hits:  71%/115364
    🟩 cudacxx_family
      🟩 ClangCUDA          Pass: 100%/2   | Total: 47m 34s | Avg: 23m 47s | Max: 24m 01s | Hits:  71%/2354  
      🟩 nvcc               Pass: 100%/116 | Total:  2d 02h | Avg: 26m 02s | Max:  1h 01m | Hits:  71%/136558
    🟩 cxx
      🟩 Clang9             Pass: 100%/6   | Total:  2h 32m | Avg: 25m 22s | Max: 30m 22s | Hits:  72%/7062  
      🟩 Clang10            Pass: 100%/3   | Total:  1h 16m | Avg: 25m 32s | Max: 27m 30s | Hits:  72%/3531  
      🟩 Clang11            Pass: 100%/4   | Total:  1h 43m | Avg: 25m 47s | Max: 28m 36s | Hits:  71%/4708  
      🟩 Clang12            Pass: 100%/4   | Total:  1h 47m | Avg: 26m 46s | Max: 29m 06s | Hits:  71%/4708  
      🟩 Clang13            Pass: 100%/4   | Total:  1h 38m | Avg: 24m 43s | Max: 27m 03s | Hits:  71%/4708  
      🟩 Clang14            Pass: 100%/4   | Total:  1h 43m | Avg: 25m 54s | Max: 27m 51s | Hits:  71%/4708  
      🟩 Clang15            Pass: 100%/4   | Total:  1h 45m | Avg: 26m 17s | Max: 29m 24s | Hits:  71%/4708  
      🟩 Clang16            Pass: 100%/4   | Total:  1h 44m | Avg: 26m 05s | Max: 29m 01s | Hits:  71%/4708  
      🟩 Clang17            Pass: 100%/18  | Total:  5h 49m | Avg: 19m 23s | Max: 30m 21s | Hits:  84%/21186 
      🟩 GCC6               Pass: 100%/2   | Total: 42m 15s | Avg: 21m 07s | Max: 23m 02s | Hits:  72%/2354  
      🟩 GCC7               Pass: 100%/6   | Total:  2h 29m | Avg: 24m 58s | Max: 29m 16s | Hits:  72%/7068  
      🟩 GCC8               Pass: 100%/6   | Total:  2h 27m | Avg: 24m 33s | Max: 27m 59s | Hits:  72%/7068  
      🟩 GCC9               Pass: 100%/6   | Total:  2h 36m | Avg: 26m 04s | Max: 30m 22s | Hits:  65%/7068  
      🟩 GCC10              Pass: 100%/4   | Total:  1h 48m | Avg: 27m 07s | Max: 31m 56s | Hits:  71%/4712  
      🟩 GCC11              Pass: 100%/7   | Total:  3h 31m | Avg: 30m 09s | Max: 38m 55s | Hits:  71%/8246  
      🟩 GCC12              Pass: 100%/4   | Total:  1h 48m | Avg: 27m 11s | Max: 29m 44s | Hits:  71%/4712  
      🟩 GCC13              Pass: 100%/20  | Total:  6h 59m | Avg: 20m 58s | Max: 49m 07s | Hits:  72%/23560 
      🟩 Intel2023.2.0      Pass: 100%/3   | Total:  2h 00m | Avg: 40m 13s | Max: 45m 28s | Hits:  28%/3540  
      🟩 MSVC14.16          Pass: 100%/1   | Total: 54m 47s | Avg: 54m 47s | Max: 54m 47s | Hits:  35%/1173  
      🟩 MSVC14.29          Pass: 100%/2   | Total:  1h 53m | Avg: 56m 30s | Max: 57m 39s | Hits:  35%/2346  
      🟩 MSVC14.39          Pass: 100%/6   | Total:  3h 55m | Avg: 39m 11s | Max:  1h 01m | Hits:  67%/7038  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/51  | Total: 20h 00m | Avg: 23m 32s | Max: 30m 22s | Hits:  76%/60027 
      🟩 GCC                Pass: 100%/55  | Total: 22h 23m | Avg: 24m 26s | Max: 49m 07s | Hits:  71%/64788 
      🟩 Intel              Pass: 100%/3   | Total:  2h 00m | Avg: 40m 13s | Max: 45m 28s | Hits:  28%/3540  
      🟩 MSVC               Pass: 100%/9   | Total:  6h 42m | Avg: 44m 46s | Max:  1h 01m | Hits:  56%/10557 
    🟩 gpu
      🟩 v100               Pass: 100%/118 | Total:  2d 03h | Avg: 25m 59s | Max:  1h 01m | Hits:  71%/138912
    🟩 jobs
      🟩 Build              Pass: 100%/99  | Total:  1d 22h | Avg: 27m 59s | Max:  1h 01m | Hits:  67%/116553
      🟩 TestCPU            Pass: 100%/11  | Total:  1h 46m | Avg:  9m 43s | Max: 19m 14s | Hits:  99%/12939 
      🟩 TestGPU            Pass: 100%/8   | Total:  3h 09m | Avg: 23m 42s | Max: 49m 07s | Hits:  79%/9420  
    🟩 sm
      🟩 60;70;80;90        Pass: 100%/3   | Total:  1h 42m | Avg: 34m 06s | Max: 38m 55s | Hits:  72%/3534  
      🟩 90a                Pass: 100%/4   | Total: 59m 49s | Avg: 14m 57s | Max: 15m 55s | Hits:  71%/4712  
    🟩 std
      🟩 11                 Pass: 100%/30  | Total: 10h 17m | Avg: 20m 35s | Max: 35m 26s | Hits:  75%/35328 
      🟩 14                 Pass: 100%/34  | Total: 15h 58m | Avg: 28m 11s | Max: 57m 39s | Hits:  67%/40020 
      🟩 17                 Pass: 100%/33  | Total: 15h 58m | Avg: 29m 03s | Max:  1h 00m | Hits:  68%/38847 
      🟩 20                 Pass: 100%/21  | Total:  8h 52m | Avg: 25m 22s | Max:  1h 01m | Hits:  76%/24717 
    
  • 🟩 pycuda: Pass: 100%/1 | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s

    🟩 cpu
      🟩 amd64              Pass: 100%/1   | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s
    🟩 ctk
      🟩 12.5               Pass: 100%/1   | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s
    🟩 cudacxx
      🟩 nvcc12.5           Pass: 100%/1   | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/1   | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s
    🟩 cxx
      🟩 GCC13              Pass: 100%/1   | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s
    🟩 cxx_family
      🟩 GCC                Pass: 100%/1   | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s
    🟩 gpu
      🟩 v100               Pass: 100%/1   | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s
    🟩 jobs
      🟩 Test               Pass: 100%/1   | Total: 11m 57s | Avg: 11m 57s | Max: 11m 57s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
pycuda

Modifications in project or dependencies?

Project
CCCL Infrastructure
libcu++
+/- CUB
+/- Thrust
CUDA Experimental
+/- pycuda

🏃‍ Runner counts (total jobs: 250)

# Runner
178 linux-amd64-cpu16
41 linux-amd64-gpu-v100-latest-1
16 linux-arm64-cpu16
15 windows-amd64-cpu16

@bernhardmgruber bernhardmgruber merged commit 4b9de3b into NVIDIA:main Jul 26, 2024
264 of 265 checks passed
@bernhardmgruber bernhardmgruber deleted the fix_merge branch July 26, 2024 14:57
pciolkosz pushed a commit to pciolkosz/cccl that referenced this pull request Aug 4, 2024
* Add a cuDF inspired test for merge_by_key
* Allow CUB MergePath to support iterators with different value types
* Allow different input value types for merge, as long as they are convertible to the value type of the first iterator. This weakens the publicly documented guarantees of equal value types to restore the old behavior of the thrust implementation replaced in NVIDIA#1817.
Labels
bug Something isn't working right. cub For all items related to CUB
3 participants