Cudastf #1

Closed
wants to merge 462 commits into from
462 commits
cc316db
Fix BWUtil report on early exit (#1994)
gonidelis Jul 17, 2024
72f77c1
Use libcu++ void_t everywhere (#1977)
bernhardmgruber Jul 17, 2024
be91914
Drop zipped_binary_op (#1988)
bernhardmgruber Jul 17, 2024
64e7a06
Clarify PtxVersion and SmVersion (#2004)
bernhardmgruber Jul 18, 2024
87d0849
Refactor CUB util_device (#1948)
bernhardmgruber Jul 18, 2024
92b4b0b
fix some typos in `<cuda/stream_ref>` (#2003)
ericniebler Jul 18, 2024
56d99db
Add CI slack notifications. (#1961)
alliepiper Jul 18, 2024
fc457b4
Allow nightly workflow to be manually invoked. (#2007)
alliepiper Jul 18, 2024
eb62dc6
Need to use a different approach to reuse secrets in reusable workflo…
alliepiper Jul 18, 2024
97e699f
Enable RAPIDS builds for manually dispatched workflows. (#2009)
alliepiper Jul 19, 2024
2ff83a2
Clean up complex.inl (#1655)
ZelboK Jul 19, 2024
8a5e56a
Add github token to nightly workflow-results action. (#2012)
alliepiper Jul 19, 2024
e5fcebe
Remove obsolete build system glue from the Thrust/CUB submodule struc…
alliepiper Jul 20, 2024
496d88d
Benchmark thrust::copy with non-trivially relocatable type (#1989)
bernhardmgruber Jul 22, 2024
e61bafe
Make bool_constant available in C++11 (#1997)
bernhardmgruber Jul 22, 2024
b8116c3
Spell value initialization where used by thrust vectors (#1990)
bernhardmgruber Jul 22, 2024
1b16af7
Do no redefine `__ELF__` macro (#2018)
miscco Jul 22, 2024
8635429
Port `thrust::merge[_by_key]` to CUB (#1817)
bernhardmgruber Jul 23, 2024
53fe08f
Simplify some pointer traits (#2020)
bernhardmgruber Jul 23, 2024
18cd90f
Simplify test data setup (#2023)
bernhardmgruber Jul 24, 2024
f6d3d0b
Add tests to ensure that we properly propagate common_type for comple…
miscco Jul 24, 2024
a69c8ac
Update Thrust CMake README to use CCCL repo. (#2026)
alliepiper Jul 24, 2024
82a3ed0
Include container toolkit in manual prereqs (#2064)
bryevdv Jul 24, 2024
46759c5
Avoid ADL issues with `thrust::distance` (#2053)
miscco Jul 25, 2024
e25344c
Simplify thrust::detail::wrapped_function (#2019)
bernhardmgruber Jul 25, 2024
5ba23b6
Add a test for Thrust scan with non-commutative op (#2024)
bernhardmgruber Jul 25, 2024
30eaa9c
Update memory_resource docs (#1883)
miscco Jul 25, 2024
04db77a
Temporarily switch nightly H100 CI to build-only. (#2060)
alliepiper Jul 25, 2024
1797742
Do not rely on conversions between float and extended floating point …
miscco Jul 25, 2024
a4cd52e
experimental wrapper types for `cudaEvent_t` that provide a modern C+…
ericniebler Jul 26, 2024
c60d687
[CUDAX] Add a dummy device struct for now (#2066)
pciolkosz Jul 26, 2024
4b9de3b
Allow (somewhat) different input value types for merge (#2075)
bernhardmgruber Jul 26, 2024
b761538
Avoid `::result_type` for partial sums in TBB reduce_by_key (#1998)
bernhardmgruber Jul 26, 2024
a8db0a9
Fix formatting (#2090)
bernhardmgruber Jul 27, 2024
94c86b6
Rename and refactor transform_iterator_base (#1987)
bernhardmgruber Jul 27, 2024
ad57b1e
Bench analysis: Print all top rows when asked for (#2089)
bernhardmgruber Jul 27, 2024
59d7a4b
makes examples use explicit exec space specifiers in user-provided fu…
elstehle Jul 27, 2024
8a185fe
Separate `cuda/experimental` when sorting includes (#2094)
bernhardmgruber Jul 29, 2024
731c84c
add support to `cudax::device` for querying a device's attributes (#2…
ericniebler Jul 30, 2024
15e2ce0
[CUDAX] Add experimental owning abstraction for cudaStream_t (#2093)
pciolkosz Jul 30, 2024
1e67aa7
Do not query NVRTC for cuda runtime header (#2102)
miscco Jul 30, 2024
6dfc8dd
Cleanup CUB block/thread load and exchange (#1946)
bernhardmgruber Jul 30, 2024
4188fb0
Improve binary function objects and replace thrust implementation (#1…
srinivasyadav18 Jul 30, 2024
d92ef23
Replace `_LIBCUDACXX_CPO_ACCESSIBILITY` with `_CCCL_GLOBAL_CONSTANT `…
miscco Jul 30, 2024
d4f928e
Add script to update RAPIDS version. (#2082)
bdice Jul 30, 2024
ce95739
Update bad links (#2080)
bryevdv Jul 30, 2024
c0cfbd0
Fix line break issues that break doxygen code examples (#2103)
miscco Jul 30, 2024
7a3dae7
Add internal wrapper for cuda driver APIs (#2070)
pciolkosz Jul 31, 2024
694e963
Use `common_type` for complex `pow` (#1800)
miscco Jul 31, 2024
a2a3824
rename `device` to `device_ref`, add immovable `device` as a place to…
ericniebler Jul 31, 2024
bddcd20
Use the float flavors of the cmath functions in the extended floating…
miscco Jul 31, 2024
27253d7
[PoC]: Implement `cuda::experimental::uninitialized_buffer` (#1831)
miscco Jul 31, 2024
2600135
Ensure that we avoid ABI Version conflics (#2137)
miscco Jul 31, 2024
39b926a
Ensure that `cuda_memory_resource` allocates memory on the proper dev…
miscco Aug 1, 2024
ce4b904
Clarify compatibility wrt. template specializations (#2138)
bernhardmgruber Aug 1, 2024
fadb135
Implement a `cudax::get_stream` CPO (#2135)
miscco Aug 1, 2024
4634d81
Make `cuda::std::tuple` trivially copyable (#2127)
miscco Aug 1, 2024
cc0b3d1
Fix missing copy of docs artifacts (#2162)
miscco Aug 1, 2024
cbe01b0
Update CODEOWNERS
jrhemstad Aug 1, 2024
02378eb
Fix g++-14 warning on uninitialized copying (#2157)
bernhardmgruber Aug 1, 2024
cba0345
Fix flakey heterogeneous tests (#2085)
wmaxey Aug 2, 2024
24ed47d
Fix multiple definition of InclusiveScanKernel (#2169)
bernhardmgruber Aug 2, 2024
a8ca75c
[CUDAX] Add a global constexpr `cudax::devices` range for all devices…
ericniebler Aug 3, 2024
d0254e4
fix use of `cudaStream_t` as if it were a stream wrapper (#2190)
ericniebler Aug 3, 2024
a903dc6
Fix uninitialized_buffer self assignment (#2170)
miscco Aug 5, 2024
9459e4a
Fix trivial_copy_device_to_device execution space (#2164)
gevtushenko Aug 5, 2024
c65a965
Clarify libcu++ use by non-CUDA compilers (#1969)
bernhardmgruber Aug 5, 2024
e519f25
Warn when using C++14 in CUB and Thrust (#2166)
bernhardmgruber Aug 5, 2024
fe27d99
Fix the `clang-format` path in the devcotnainers (#2194)
miscco Aug 5, 2024
d1e7c1c
Mount a build directory for CCCL projects if WSL is detected (#2035)
wmaxey Aug 5, 2024
75929cb
2118 [CUDAX] Change the RAII device swapper to use driver API and add…
pciolkosz Aug 6, 2024
1b6dbd4
Fix singular vs plural typo in thread scope documentation. (#2198)
brycelelbach Aug 6, 2024
2db4fa7
[CUDAX] fixing some minor issues with device attribute queries (#2183)
ericniebler Aug 6, 2024
b0e09d0
Integrate Python docs (#2196)
bryevdv Aug 7, 2024
62336ad
[FEA] Atomics codegen refactor (#1993)
wmaxey Aug 7, 2024
47b8f5c
[CUDAX] add `__launch_transform` to transform arguments to `cudax::la…
ericniebler Aug 7, 2024
39fd05e
Cleanup common testing headers and correct asserts in launch testing …
pciolkosz Aug 8, 2024
c9a7b6a
[CUDAX] Add an API to get device_ref from stream and add comparison o…
pciolkosz Aug 8, 2024
3ebf8cc
Update devcontainer docs for WSL (#2200)
jrhemstad Aug 8, 2024
f95f211
add `cudax::distribute<threadsPrBlock>(numElements)` as a way to even…
ericniebler Aug 9, 2024
8e20c9a
Rework mdspan concept emulation (#2213)
miscco Aug 9, 2024
7473934
Un-doc functions taking debug_synchronous (#2209)
bryevdv Aug 9, 2024
a3a5f9c
CUDA `vector_add` sample project (#2160)
ericniebler Aug 9, 2024
6ee3415
avoid constraint recursion in the `resource` concept (#2215)
ericniebler Aug 12, 2024
aaf1340
fix `cuda_memory_resource` test for properly aligned memory (#2227)
ericniebler Aug 13, 2024
098fb29
Fix including `<complex>` when bad CUDA bfloat/half macros are used. …
wmaxey Aug 13, 2024
d7c83fe
add license & fix long_description (#2211)
leofang Aug 13, 2024
64d28d1
Extract reduction kernels into NVRTC-compilable header (#2231)
gevtushenko Aug 14, 2024
6213a5e
Implement `<cuda/std/bitset>` (#1496)
griwes Aug 14, 2024
2e44b2c
Refactor placeholder operators (#2233)
bernhardmgruber Aug 14, 2024
352638b
Add missing annotations for deprecated debug_sync APIs (#2212)
bernhardmgruber Aug 14, 2024
dded5f1
Test thrust headers for disabled half/bf16 support (#2219)
bernhardmgruber Aug 14, 2024
1981c49
Make cuda::std::max constexpr in C++11 (#2107)
bernhardmgruber Aug 14, 2024
73df2b0
Fix ForEachCopyN for non-contiguous iterators (#2220)
bernhardmgruber Aug 14, 2024
cbce14b
Configure CUB/Thrust for C++17 by default (#2217)
bernhardmgruber Aug 14, 2024
e423412
Allow installing components when downstream (#2096)
stephenswat Aug 15, 2024
532ff47
Rename the memory resources to drop the superfluous prefix `cuda_` (#…
miscco Aug 15, 2024
16d4fd3
Fix and simplify <bit> (#2197)
wmaxey Aug 16, 2024
fed3ec1
Proclaim pair and tuple trivially relocatable (#2010)
bernhardmgruber Aug 16, 2024
4a5dcc4
Make `cuda::std::min` constexpr in C++11 (#2249)
miscco Aug 16, 2024
ba9e9bb
Add `CCCL_DISABLE_NVTX` macro (#2173)
bernhardmgruber Aug 16, 2024
51c1b22
Workaround GCC 13 issue with empty histogram decoder op (#2252)
bernhardmgruber Aug 19, 2024
da9b7dd
Refactor Thrust's logical meta functions (#2260)
bernhardmgruber Aug 20, 2024
f871aeb
Fix use of doxygen \file command (#2259)
bernhardmgruber Aug 20, 2024
38d5787
Add tests for transform_iterator's reference type (#2221)
bernhardmgruber Aug 20, 2024
c92e8d4
Small tuning script output improvements (#2262)
bernhardmgruber Aug 20, 2024
7bec0ce
Fix Thrust::vector ctor selection for int,int (#2261)
bernhardmgruber Aug 20, 2024
06e334f
Adds support for large number of items to `DeviceScan` (#2171)
elstehle Aug 21, 2024
1e1af8d
Use/Test radix sort for int128, half, bfloat16 in Thrust (#2168)
bernhardmgruber Aug 21, 2024
5a4881b
Implement C API for device reduction (#2256)
gevtushenko Aug 21, 2024
2c1080d
Move cooperative module (#2269)
gevtushenko Aug 21, 2024
529f910
Move compiler version macros into libcu++ (#2250)
bernhardmgruber Aug 21, 2024
d62e979
Introduce cuda.parallel module (#2276)
gevtushenko Aug 23, 2024
0d0d2d3
Adds `thrust::tabulate_output_iterator` (#2282)
elstehle Aug 25, 2024
a15adf3
Drop macos string that lit cannot parse properly (#2283)
miscco Aug 25, 2024
c1c1d96
Flatten forwarding headers (#2284)
miscco Aug 26, 2024
03247ab
2270 static compute capabilities queries (#2271)
pciolkosz Aug 26, 2024
9d4c3a8
Fix read of dangling reference (#2290)
bernhardmgruber Aug 26, 2024
f53e725
Implement `any_resource`, an owning wrapper around a memory resource …
ericniebler Aug 27, 2024
e8939e9
fixes formatting of tabulate iterator (#2298)
elstehle Aug 27, 2024
92e006b
use `NV_IF_TARGET` to conditionally compile CUDAX tests (#2297)
ericniebler Aug 27, 2024
f80972b
Make for_each compatible with NVRTC (#2288)
wmaxey Aug 27, 2024
a5b0a23
refactor cmake so more cudax samples can be easily added (#2296)
ericniebler Aug 27, 2024
dd90bed
Use the `in`, `out`, and `inout` parameter decorators from `cudax::la…
ericniebler Aug 27, 2024
0a1cddb
Implement `std::bit_cast` (#2258)
miscco Aug 27, 2024
490a20f
Cleanup the `<cuda/std/bit>` header (#2299)
miscco Aug 28, 2024
198208a
change `cudax::uninitialized_buffer` to own its memory resource with …
ericniebler Aug 28, 2024
ec5bd08
Documentation typos (#2302)
fbusato Aug 28, 2024
e311e89
Add thrust::inclusive_scan with init_value support (#1940)
gonidelis Aug 28, 2024
942f59f
Assure placeholder expressions are semi-regular (#2305)
bernhardmgruber Aug 28, 2024
7d4be26
Add documentation for `any_resource` (#2309)
miscco Aug 28, 2024
eb87e56
Implement P0843 `inplace_vector` (#1936)
miscco Aug 29, 2024
10b0d2b
Cleanup `__config` and unify most visibility macros (#2285)
miscco Aug 29, 2024
11fc50b
Add a fast, low memory "limited" mode to CUB testing. (#2317)
alliepiper Aug 29, 2024
d862315
[CUDAX] Add event_ref::is_done() and update event tests (#2304)
pciolkosz Aug 29, 2024
e42d7b7
Minor cleanup to memory resources (#2308)
miscco Aug 29, 2024
a7837d3
Drop ICC from the cudax support matrix (#2330)
miscco Aug 29, 2024
16096d4
Do not hardcode Thrust's host system to cpp. (#2332)
alliepiper Aug 30, 2024
a9fa9a1
[CUDAX] Add compute_capability device attribute and handle arch_trait…
pciolkosz Aug 30, 2024
95c6ba9
Disable exec checks on ranges CPOs (#2331)
miscco Aug 30, 2024
206e745
Enable exceptions by default (#2329)
miscco Aug 30, 2024
89702de
Make the thrust dispatch mechanisms configurable (#2310)
miscco Aug 30, 2024
a7996f0
[CUDAX] give all the cudax headers the `.cuh` extension (#2340)
ericniebler Aug 30, 2024
bb6c7b1
Compiler version improvements (#2316)
fbusato Aug 31, 2024
0a40182
Fix hardcoding __THRUST_HOST_SYSTEM_NAMESPACE to cpp (#2341)
bernhardmgruber Aug 31, 2024
709ddec
Improvements to the Cuda Core C library infrastructure (#2336)
miscco Sep 2, 2024
498251c
Fix bug remaining on thrust::inclusive_scan with init value with CDP …
gonidelis Sep 3, 2024
c6b777b
[CUDAX] make `uninitialized_buffer` usable with `launch` (#2342)
ericniebler Sep 3, 2024
5c6b6df
Try to reenable nightly tests (#1847)
miscco Sep 3, 2024
4297b07
Update Memory Model docs for HMM (#2272)
gonzalobg Sep 3, 2024
457e4d7
Update CONTRIBUTING.md
jrhemstad Sep 3, 2024
6b76188
Harden thrust algorithms against evil iterators that overload `operat…
miscco Sep 3, 2024
707ee73
Avoid circular concept definition with memory resources (#2351)
miscco Sep 3, 2024
a154e7b
add IWYU `export` pragma on config headers (#2352)
ericniebler Sep 3, 2024
1e9125e
Add cuda_parallel to CI. (#2338)
alliepiper Sep 4, 2024
0251ae4
[CUDAX] Branch out an experimental version of stream_ref (#2343)
pciolkosz Sep 4, 2024
dae826b
Improve visibility macros for libcu++ (#2337)
miscco Sep 4, 2024
dcb7d51
Add missing cuKernelGetFunction call to reduce (#2355)
pciolkosz Sep 4, 2024
046a761
Move `invalid_stream` to the proper file (#2360)
miscco Sep 4, 2024
3876dcc
fix the cudax `vector_add` sample (#2372)
ericniebler Sep 5, 2024
af695d0
Add -Wmissing-field-initializers to cudax (#2373)
pciolkosz Sep 5, 2024
05e019a
Update CCCL version to 2.7.0 (#2364)
wmaxey Sep 5, 2024
e0dad56
Adds benchmarks for `DeviceSelect::Unique` (#2359)
elstehle Sep 6, 2024
3adc92a
CUB - Enable DPX Reduction (#2286)
fbusato Sep 6, 2024
4a32b1c
[CUDAX] add a small c++17 implementation of `std::execution` (aka P23…
ericniebler Sep 6, 2024
5647255
Add thurst::transform_inclusive_scan with init value (#2326)
gonidelis Sep 6, 2024
fcf7c91
Widen histogram agent constructor to more types (#2380)
bernhardmgruber Sep 6, 2024
07fef97
Use a constant for the amount of static SMEM (#2374)
bernhardmgruber Sep 6, 2024
71b9f98
Add `cub::DeviceTransform` (#2086)
bernhardmgruber Sep 8, 2024
371a434
Update toolkit to CTK 12.6 (#2348)
miscco Sep 9, 2024
ee9b856
implement `make_integer_sequence` in terms of intrinsics whenever pos…
ericniebler Sep 9, 2024
d5492d5
Implement `cuda::mr::cuda_async_memory_resource` (#1637)
miscco Sep 10, 2024
e7ade77
Drop implementation of `thrust::pair` and `thrust::tuple` (#2395)
miscco Sep 11, 2024
1c422f2
Pull out `_LIBCUDACXX_UNREACHABLE` into its own file (#2399)
miscco Sep 11, 2024
1fe25ed
Share common compiler flags in new CCCL-level targets. (#2386)
alliepiper Sep 12, 2024
cf21a40
include `<crt/host_defines.h>` if possible from `execution_space.h` h…
ericniebler Sep 12, 2024
684cf8e
add some simple utilities for manipulating lists of types (#2370)
ericniebler Sep 16, 2024
4088134
Drop thrusts diagnostic suppression warnings (#2392)
miscco Sep 16, 2024
e3c2e2b
[PoC]: Implement `cuda::experimental::uninitialized_async_buffer` (#1…
miscco Sep 17, 2024
8ced877
Fix thrust package to work with newer FindOpenMP.cmake. (#2421)
alliepiper Sep 18, 2024
8f27fba
Introduce `cccl_configure_target` cmake function. (#2388)
alliepiper Sep 18, 2024
2496571
Fix sccache errors in RAPIDS builds (#2417)
trxcllnt Sep 18, 2024
52a967f
Replace `CUDA C++ Core Libraries` with `CUDA Core Compute Libraries` …
rwgk Sep 19, 2024
d191102
Move the cuda atomic.h file (#2418)
miscco Sep 19, 2024
445fd71
`uninitialized_buffer::get_resource` returns a ref to an `any_resourc…
ericniebler Sep 19, 2024
b07f036
Refactor `cuda::ceil_div` to take two different types (#2376)
miscco Sep 19, 2024
ee94bb9
Reduce PR testing matrix. (#2436)
alliepiper Sep 19, 2024
7bd04ad
Implement `cudax::shared_resource` (#2398)
miscco Sep 19, 2024
5e14128
Increase the libcu++ timeout (#2435)
miscco Sep 19, 2024
2fe09c8
Move c/include/cccl/*.h files to c/include/cccl/c/*.h (#2428)
rwgk Sep 19, 2024
8b2bf13
Make `any_resource` emplacable (#2425)
miscco Sep 19, 2024
28888eb
Fix issues with `__host__` and `__device__` definitions (#2413)
miscco Sep 19, 2024
31c3eb9
Make `bit_cast` play nice with extended floating point types (#2434)
miscco Sep 20, 2024
92bc4ac
Do not include our own string.h file (#2444)
miscco Sep 20, 2024
9641b7e
Move nightly time [skip ci] (#2437)
bdice Sep 20, 2024
aa1458d
Remove a ton of lines in thrust tests (#2356)
gonidelis Sep 24, 2024
6fd1e5c
[CUDAX] Add placeholder green context type and logical device that ca…
pciolkosz Sep 24, 2024
0f0fdc2
Fix typo in CCCLBuildCompilerTargets.cmake (#2453)
alliepiper Sep 24, 2024
17e0c83
This drops the duplicated definition of `_CCCL_NO_SYSTEM_HEADER` from…
miscco Sep 25, 2024
2cbf40b
Consolidate packages and install rules (#2456)
alliepiper Sep 25, 2024
bda69fd
Prune CUB's ChainedPolicy by __CUDA_ARCH_LIST__ (#2154)
bernhardmgruber Sep 26, 2024
cc01ce7
fixes merge conflict for policz pruning (#2466)
elstehle Sep 26, 2024
99fb4b4
Add CCCL_ENABLE_WERROR flag. (#2463)
alliepiper Sep 26, 2024
5d45850
Add CUB tests for segmented sort/radix sort with 64-bit num. items an…
fbusato Sep 26, 2024
0e09815
Propagate compiler flags down to libcu++ LIT tests (#2420)
Artem-B Sep 27, 2024
467a44d
Drop remaining uses of `_LIBCUDACXX_COMPILER_*` (#2467)
miscco Sep 28, 2024
7c668e8
Avoid C++17 extension in c++11 tests (#2469)
miscco Sep 28, 2024
e3800d7
Add span to example and templated block size (#2470)
Kh4ster Sep 28, 2024
94e4e75
Drop Objective C++ support (#2468)
miscco Sep 28, 2024
242bcce
removes superfluous template keyword that striggers warnings/errors w…
andrewcorrigan Sep 30, 2024
653e546
Improve build times in several heavyweight libcudacxx tests. (#2478)
wmaxey Sep 30, 2024
0521015
Drop `__availability` header (#2484)
miscco Sep 30, 2024
725954c
Replace a few more instances of `CUDA C++ Core Libraries` with CUDA C…
rwgk Sep 30, 2024
81d05bb
Fix `common_type` specialization for extended floating point types (#…
miscco Oct 1, 2024
808f9c2
Implement some CUDA API calls for `async_memory_pool` (#2455)
miscco Oct 1, 2024
57b9899
Move cudax example project to CCCL project examples. (#2462)
alliepiper Oct 1, 2024
190099c
Disable system header for narrowing conversion check (#2465)
miscco Oct 1, 2024
59ad103
Require resources to always provide at least one execution space prop…
miscco Oct 2, 2024
e4f48cf
Rework builtin handling (#2461)
miscco Oct 2, 2024
ee3bd53
Disable execution checks for `std::equal` (#2491)
miscco Oct 2, 2024
0589775
replace `_CCCL_ALWAYS_INLINE` with `_CCCL_FORCEINLINE` (#2439)
ericniebler Oct 2, 2024
25c57f8
Drop 2 relative includes that snuck in (#2492)
miscco Oct 2, 2024
10769b4
re-express the `__tupl::__apply` member to make nvc++ happy (#2493)
ericniebler Oct 2, 2024
5e139af
Drop badly named `_One_of` concept (#2490)
miscco Oct 2, 2024
3eee9b2
Unify assert handling in cccl (#2382)
miscco Oct 3, 2024
bb001b7
Reduce scope of Thrust linkage in cudax. (#2496)
alliepiper Oct 3, 2024
a0ec74c
Centralize CPM logic. (#2495)
alliepiper Oct 3, 2024
c15546a
Fix typo in presets. (#2497)
alliepiper Oct 3, 2024
1cfe171
Refactor away per-project TOPLEVEL flags. (#2498)
alliepiper Oct 3, 2024
e8d57c3
[FEA]: Validate cuda.parallel type matching in build and execution (#…
rwgk Oct 4, 2024
583567b
avoid gcc optimizer bug by not force inlining part of `thrust::transf…
ericniebler Oct 4, 2024
c86caca
Cleanup and modularize `<cuda/std/barrier>` (#2443)
miscco Oct 5, 2024
8aaeb29
Consolidate header testing infra. (#2460)
alliepiper Oct 7, 2024
ee5dd3e
Add ForEachN from CUB to cccl/c. (#2378)
wmaxey Oct 8, 2024
16f9a1a
Adds support for large number of items in `DeviceSelect` and `DeviceP…
elstehle Oct 8, 2024
951c822
Adds support for large number of items to `DeviceScan::*ByKey` family…
elstehle Oct 8, 2024
e149e86
Integrate c/parallel with CCCL build system and CI. (#2514)
alliepiper Oct 9, 2024
cbb0edd
Initial import and rename of STF headers.
alliepiper Oct 9, 2024
afa153d
Refactor include paths to match cudax conventions.
alliepiper Oct 9, 2024
4b2cf18
Apply CCCL clang-format to STF files.
alliepiper Oct 9, 2024
450136e
Split STF headers into a separate headertest unit.
alliepiper Oct 9, 2024
09213f6
Fix -Wreorder warnings.
alliepiper Oct 9, 2024
c587b36
Fix -Wsign-compare warnings.
alliepiper Oct 9, 2024
2030832
s/I/Idx/g (Identifier I conflicts with complex.h system headers).
alliepiper Oct 9, 2024
7a2a842
Add missing includes.
alliepiper Oct 9, 2024
71196f1
Add missing execution space annotations.
alliepiper Oct 9, 2024
ebc205a
Fix standalone compilation of logical_data.cuh.
alliepiper Oct 9, 2024
750db80
Limit `no_device_stack` pragma to NVHPC.
alliepiper Oct 9, 2024
5c55fef
Temporarily exclude some failing headers from header testing.
alliepiper Oct 9, 2024
03d0a33
Mark a variable as potentially unused (due to some constexpr condition)
caugonnet Oct 10, 2024
The diff you're trying to view is too large. We only load the first 3000 changed files.
71 changes: 32 additions & 39 deletions .clang-format
@@ -23,54 +23,35 @@ AllowShortLoopsOnASingleLine: false
AlwaysBreakAfterReturnType: None
AlwaysBreakTemplateDeclarations: Yes
AttributeMacros: [
'_CCCL_ALIGNAS_TYPE',
'_CCCL_ALIGNAS',
'_CCCL_CONSTEXPR_CXX14',
'_CCCL_CONSTEXPR_CXX17',
'_CCCL_CONSTEXPR_CXX20',
'_CCCL_CONSTEXPR_CXX23',
'_CCCL_DEVICE',
'_CCCL_FALLTHROUGH',
'_CCCL_FORCEINLINE',
'_CCCL_HIDE_FROM_ABI',
'_CCCL_HOST_DEVICE',
'_CCCL_HOST',
'_CCCL_NO_UNIQUE_ADDRESS',
'_CCCL_NODISCARD_FRIEND',
'_CCCL_NODISCARD',
'_CCCL_NORETURN',
'_CCCL_TYPE_VISIBILITY_DEFAULT',
'_CCCL_VISIBILITY_HIDDEN',
'CUB_RUNTIME_FUNCTION',
'CUB_DETAIL_KERNEL_ATTRIBUTES',
'THRUST_RUNTIME_FUNCTION',
'THRUST_DETAIL_KERNEL_ATTRIBUTES',
'_ALIGNAS_TYPE',
'_ALIGNAS',
'_LIBCUDACXX_ALIGNOF',
'_LIBCUDACXX_ALWAYS_INLINE',
'_LIBCUDACXX_AVAILABILITY_THROW_BAD_VARIANT_ACCESS',
'_CCCL_CONSTEXPR_CXX14',
'_CCCL_CONSTEXPR_CXX17',
'_CCCL_CONSTEXPR_CXX20',
'_CCCL_CONSTEXPR_CXX23',
'_LIBCUDACXX_CONSTINIT',
'_LIBCUDACXX_DEPRECATED_IN_CXX11',
'_LIBCUDACXX_DEPRECATED_IN_CXX14',
'_LIBCUDACXX_DEPRECATED_IN_CXX17',
'_LIBCUDACXX_DEPRECATED_IN_CXX20',
'_LIBCUDACXX_DEPRECATED',
'_LIBCUDACXX_DISABLE_EXTENTSION_WARNING',
'_LIBCUDACXX_EXCLUDE_FROM_EXPLICIT_INSTANTIATION',
'_LIBCUDACXX_EXPORTED_FROM_ABI',
'_LIBCUDACXX_EXTERN_TEMPLATE_TYPE_VIS',
'_LIBCUDACXX_FALLTHROUGH',
'_LIBCUDACXX_HIDDEN',
'_LIBCUDACXX_HIDE_FROM_ABI_AFTER_V1',
'_LIBCUDACXX_HIDE_FROM_ABI',
'_LIBCUDACXX_INLINE_VISIBILITY',
'_LIBCUDACXX_INTERNAL_LINKAGE',
'_LIBCUDACXX_METHOD_TEMPLATE_IMPLICIT_INSTANTIATION_VIS',
'_LIBCUDACXX_NO_DESTROY',
'_LIBCUDACXX_NO_SANITIZE',
'_LIBCUDACXX_NO_UNIQUE_ADDRESS',
'_LIBCUDACXX_NOALIAS',
'_LIBCUDACXX_NODISCARD_EXT',
'_LIBCUDACXX_NODISCARD',
'_LIBCUDACXX_NORETURN',
'_LIBCUDACXX_OVERRIDABLE_FUNC_VIS',
'_LIBCUDACXX_STANDALONE_DEBUG',
'_LIBCUDACXX_TEMPLATE_DATA_VIS',
'_LIBCUDACXX_TEMPLATE_VIS',
'_LIBCUDACXX_THREAD_SAFETY_ANNOTATION',
'_LIBCUDACXX_USING_IF_EXISTS',
'_LIBCUDACXX_WEAK',
]
BinPackArguments: false
BinPackParameters: false
@@ -107,18 +88,30 @@ IfMacros: [
IndentWrappedFunctionNames: false
IncludeBlocks: Regroup
IncludeCategories:
- Regex: '^<cub'
Priority: 1
- Regex: '^<cuda/experimental/__async/prologue.cuh>'
Priority: 0x7FFFFFFF
SortPriority: 0x7FFFFFFF
- Regex: '^<(cuda/std/detail/__config|cub/config.cuh|thrust/detail/config.h|thrust/system/cuda/config.h)'
Priority: 0
SortPriority: 0
- Regex: '^<thrust'
- Regex: '^<cub/'
Priority: 2
SortPriority: 1
- Regex: '^<cuda'
- Regex: '^<thrust/'
Priority: 3
SortPriority: 2
- Regex: '^<[a-z_]*>$'
- Regex: '^<cuda/experimental'
Priority: 5
SortPriority: 4
- Regex: '^<cuda/'
Priority: 4
SortPriority: 3
- Regex: '^<[a-z_]*>$'
Priority: 6
SortPriority: 5
- Regex: '^<cuda'
Priority: 0
SortPriority: 0
InsertBraces: true
IndentCaseLabels: true
InsertNewlineAtEOF: true
@@ -157,7 +150,7 @@ SpaceBeforeParens: ControlStatements
SpaceBeforeRangeBasedForLoopColon: true
SpaceInEmptyParentheses: false
SpacesBeforeTrailingComments: 1
SpacesInAngles: Leave
SpacesInAngles: Never
SpacesInCStyleCastParentheses: false
SpacesInParentheses: false
SpacesInSquareBrackets: false
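The `IncludeCategories` hunk above depends on clang-format's documented rule that each `#include` takes the `Priority` of the first regex that matches it, which is why `'^<cuda/experimental'` has to be listed before the broader `'^<cuda/'`. The following is a minimal sketch of that rule in bash. It assumes the regex/priority pairs as they read in the flattened hunk (the `+`/`-` markers were lost in extraction, so the pairing is an assumption), `include_priority` is a hypothetical helper, and the header names are only illustrative.

```bash
#!/usr/bin/env bash
# Sketch of clang-format's "first matching regex wins" rule for IncludeCategories.
# The regex/priority pairs below are read off the hunk above and are an
# assumption about the resulting file, not a verified copy of it.
regexes=('^<cub/' '^<thrust/' '^<cuda/experimental' '^<cuda/' '^<[a-z_]*>$')
priorities=(2 3 5 4 6)

# include_priority is a hypothetical helper, not part of clang-format.
# It prints the priority of the first regex that matches the given include.
include_priority() {
  local header="$1" i
  for i in "${!regexes[@]}"; do
    if [[ "$header" =~ ${regexes[$i]} ]]; then
      echo "${priorities[$i]}"
      return 0
    fi
  done
  echo "unmatched"
}

# Illustrative header names; the experimental one is a made-up path.
for h in '<cub/cub.cuh>' '<thrust/sort.h>' '<cuda/experimental/stream.cuh>' '<cuda/std/atomic>' '<vector>'; do
  printf '%s -> %s\n' "$h" "$(include_priority "$h")"
done
```

If the two `cuda` entries were swapped, every `<cuda/experimental/...>` header would match `'^<cuda/'` first and never reach its own category.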
210 changes: 122 additions & 88 deletions .devcontainer/README.md
@@ -1,26 +1,32 @@
> **Note**
> The instructions in this README are specific to Linux development environments. Instructions for Windows are coming soon!
> The instructions in this README are specific to Linux development environments (including WSL on Windows). Instructions for native Windows development (e.g., `msvc`) are coming soon!

[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/NVIDIA/cccl?quickstart=1&devcontainer_path=.devcontainer%2Fdevcontainer.json)

# CCCL Dev Containers

CCCL uses [Development Containers](https://containers.dev/) to provide consistent and convenient development environments for both local development and for CI. This guide covers setup in [Visual Studio Code](#quickstart-vscode-recommended) and [Docker](#quickstart-docker-manual-approach). The guide also provides additional instructions in case you want use WSL.
CCCL uses [Dev Containers](https://containers.dev/) to provide consistent and convenient development environments for both local development and for CI.

## Table of Contents
1. [Quickstart: VSCode (Recommended)](#vscode)
2. [Quickstart: Docker (Manual Approach)](#docker)
3. [Quickstart: Using WSL](#wsl)
VSCode offers the most convenient experience with Dev Containers due to its tight native integration; however, our containers are also fully usable without VSCode by leveraging Docker directly.

## Quickstart: VSCode (Recommended) <a name="vscode"></a>
## Table of Contents
1. [Quickstart: VSCode on Linux (Recommended)](#vscode)
2. [Quickstart: VSCode on WSL (Recommended for Windows)](#wsl)
3. [Quickstart: Docker on Linux (Manual Approach)](#docker)

## Quickstart: VSCode on Linux (Recommended) <a name="vscode"></a>
### Prerequisites
- [Visual Studio Code](https://code.visualstudio.com/)
- [Remote - Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers)
- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
- [Docker](https://docs.docker.com/engine/install/) - This is only for completeness because it should already be implicitly installed by the Dev Containers extension

### Steps
#### GPU Prerequisites (only needed for executing tests that require a GPU)
- Supported NVIDIA GPU
- [NVIDIA Driver](https://www.nvidia.com/Download/index.aspx?lang=en-us)
- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
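
Before running GPU tests, it can help to confirm that the tools behind these prerequisites are actually on the `PATH`. The sketch below is not part of the CCCL repository; `check_tool` is a hypothetical helper, and the choice of `docker`, `nvidia-smi`, and `nvidia-ctk` (the Container Toolkit CLI) as the commands to look for is an assumption about a typical installation.

```bash
#!/usr/bin/env bash
# check_tool is a hypothetical helper: report whether a command is on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Report each tool without aborting on the first missing one.
for tool in docker nvidia-smi nvidia-ctk; do
  check_tool "$tool" || true
done

# If all three are present, a manual end-to-end check is to run nvidia-smi
# inside a throwaway container (this pulls the ubuntu image on first use):
#   docker run --rm --gpus all ubuntu nvidia-smi
```

A missing tool here only matters for executing GPU tests; building inside the Dev Container does not need the GPU prerequisites.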

### Steps <a name="vscode-devcontainer-steps"></a>

1. Clone the Repository
```bash
@@ -32,7 +38,7 @@ CCCL uses [Development Containers](https://containers.dev/) to provide consisten

![Shows "Reopen in Container" prompt when opening the cccl directory in VScode.](./img/reopen_in_container.png)

- Alternatively, use the Command Palette to start a Dev Container. Press `Ctrl+Shift+P` to open the Command Palette. Type "Remote-Containers: Reopen in Container" and select it.
- Alternatively, use `ctrl+shift+p` to open the Command Palette and type "Remote-Containers: Reopen in Container" and select it.

![Shows "Reopen in Container" in the Command Palette.](./img/open_in_container_manual.png)

@@ -42,11 +48,14 @@ CCCL uses [Development Containers](https://containers.dev/) to provide consisten

5. VSCode will initialize the selected Dev Container. This can take a few minutes the first time.

6. Once initialized, the local `cccl/` directory is mirrored into the container to ensure any changes are persistent.
6. (Optional) Authenticate with GitHub
- After container startup, you will be asked if you would like to authenticate with GitHub. This is for access to CCCL's distributed `sccache` storage. If you are not an NVIDIA employee, you can safely ignore this step. For more information, see the [`sccache`](#sccache) section below.

7. Done! See the [contributing guide](../CONTRIBUTING.md#building-and-testing) for instructions on how to build and run tests.
7. Once initialized, the local `cccl/` directory is mirrored into the container to ensure any changes are persistent.

### (Optional) Authenticate with GitHub for `sccache`
8. Done! See the [contributing guide](../CONTRIBUTING.md#building-and-testing) for instructions on how to build and run tests.

### (Optional) Authenticate with GitHub for `sccache` <a name="sccache"></a>

After starting the container, there will be a prompt to authenticate with GitHub. This grants access to a [`sccache`](https://github.com/mozilla/sccache) server shared with CI and greatly accelerates local build times. This is currently limited to NVIDIA employees belonging to the `NVIDIA` or `rapidsai` GitHub organizations.

@@ -60,10 +69,110 @@ To manually trigger this authentication, execute the `devcontainer-utils-vault-s

For more information about the sccache configuration and authentication, see the documentation at [`rapidsai/devcontainers`](https://github.com/rapidsai/devcontainers/blob/branch-23.10/USAGE.md#build-caching-with-sccache).

## Quickstart: VSCode on WSL (Recommended for Windows) <a name="wsl"></a>

Windows Subsystem for Linux (WSL) enables you to run a Linux environment directly in Windows.
This isn't for native Windows development (e.g., compiling with `msvc`), but it is effectively a more convenient option than setting up a dual-boot Linux/Windows machine.
Apart from the initial setup of WSL, the process for using CCCL's Dev Containers in WSL is effectively the same as the instructions for Linux, because WSL _is_ Linux.
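
Because WSL presents itself as ordinary Linux, a script sometimes needs a heuristic to tell the two apart. The sketch below uses a common convention rather than anything taken from the CCCL repository: WSL kernels usually carry "microsoft" in their release string, and WSL sessions usually set `WSL_DISTRO_NAME`. `looks_like_wsl` is a hypothetical helper, and both signals are assumptions that may not hold on every setup.

```bash
#!/usr/bin/env bash
# looks_like_wsl is a hypothetical helper: it returns success when the given
# kernel release string contains "microsoft" (case-insensitive), which is the
# convention this heuristic assumes WSL kernels follow.
looks_like_wsl() {
  local lowered
  lowered="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$lowered" in
    *microsoft*) return 0 ;;
    *) return 1 ;;
  esac
}

# Combine the kernel-string check with the WSL_DISTRO_NAME environment variable.
if looks_like_wsl "$(uname -r)" || [ -n "${WSL_DISTRO_NAME:-}" ]; then
  echo "This looks like WSL"
else
  echo "This does not look like WSL"
fi
```

Taking the kernel string as an argument keeps the helper easy to try against sample strings without needing a WSL machine.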

### Prerequisites
- Windows OS that supports WSL 2 (Windows 11 or newer)
- [Windows Subsystem for Linux v2 (WSL 2)](https://learn.microsoft.com/en-us/windows/wsl/install)
- [Visual Studio Code](https://code.visualstudio.com/) (installed on Windows host)
- [VSCode Remote Development Extension Pack](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.vscode-remote-extensionpack) (installed on Windows host)
- Includes [Dev Containers](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) and [WSL](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-wsl) extensions
- [Docker](https://docs.docker.com/engine/install/) - (Will be installed automatically by the Remote Development extension)

#### GPU Prerequisites (only needed for executing tests that require a GPU)
- Supported NVIDIA GPU
- [NVIDIA Driver](https://www.nvidia.com/Download/index.aspx?lang=en-us) (installed on Windows host)
- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) (**installed inside WSL**)

For more details see the official NVIDIA [Getting Started with CUDA on WSL guide](https://docs.nvidia.com/cuda/wsl-user-guide/index.html#getting-started-with-cuda-on-wsl-2).

### Install WSL on your Windows host
Refer to [Microsoft's documentation](https://learn.microsoft.com/en-us/windows/wsl/install) for the full instructions to install WSL2.

<details>
<summary>Click here for the TL;DR version</summary>

1. Run `PowerShell` as an administrator
![image](https://github.com/user-attachments/assets/2c985887-ca6c-46bc-9e1b-f235ccfd8513)

2. Install WSL 2 by running:
```bash
> wsl --install
```
3. Restart your computer
4. If this is your first time installing WSL, upon restarting, it will prompt you to create a username/password to use inside WSL.
5. Verify that `wsl` was successfully installed by opening PowerShell again and running:
```bash
> wsl -l -v
NAME STATE VERSION
* Ubuntu Running 2
```
5. Launch `wsl` and verify your Linux environment
```
# In Powershell, start WSL, which will drop you into a terminal session running in Linux
> wsl

# In the new terminal session, verify your Linux environment by changing to your home directory
# and displaying the current directory. This should show `/home/*YOUR USER NAME*`
> cd ~
> pwd
/home/jhemstad
```

Congratulations! You now have WSL installed and can use it as you would a normal Ubuntu/Linux installation.
This is sufficient for *building* CCCL's tests. If you have a GPU on your system and would like to use it to run the tests, continue below:

6. (Optional) Install `nvidia-container-toolkit`
See [here](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installing-with-apt) for full instructions.

**Important:** `nvidia-container-toolkit` needs to be installed inside WSL (not on the Windows host). The following commands should be run within the Linux environment.

```bash
$ curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
$ sudo apt-get update
$ sudo apt-get install -y nvidia-container-toolkit
```

Then configure Docker to use the `nvidia-container-toolkit`:
```bash
$ sudo nvidia-ctk runtime configure --runtime=docker
$ sudo systemctl restart docker
```

7. (Optional) Verify your GPU is available inside WSL
Use `nvidia-smi` inside of WSL to verify that your GPU is correctly configured and available from inside the container.
If not, verify that the NVIDIA GPU driver is correctly installed on your Windows host and `nvidia-container-toolkit` was successfully installed inside of WSL.
```bash
$ nvidia-smi
```
</details>
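
Before moving on, it can help to confirm that the tools installed in the steps above are actually on your `PATH` inside WSL. The following is a minimal sketch, not part of CCCL: the helper names `have_cmd` and `preflight` and the tool list are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Preflight sketch: report which tools used in this guide are on PATH inside WSL.
# `have_cmd` and `preflight` are hypothetical helper names, not CCCL utilities.

# Succeeds if the named command can be found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Prints one status line per tool and returns the number of missing tools.
preflight() {
  local tool missing=0
  for tool in "$@"; do
    if have_cmd "$tool"; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=$((missing + 1))
    fi
  done
  return "$missing"
}

preflight docker nvidia-smi nvidia-ctk || echo "Some tools are missing; revisit the steps above."
```

If you only intend to build (not run) the tests, `nvidia-smi` and `nvidia-ctk` being reported as missing is expected.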

### Connect VSCode to WSL
1. Launch VSCode on your Windows host

2. Connect VSCode to your WSL instance
- Press `Ctrl + Shift + P` to open the Command Palette, type "WSL", and select "WSL: Connect to WSL"
- If you don't see this option, you need to install the [WSL VSCode Extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-wsl) (comes with the [Remote Development pack](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.vscode-remote-extensionpack))
![image](https://github.com/user-attachments/assets/3e0e6af7-4251-4ce9-9204-589ad7daa12a)
- To verify VSCode is connected to WSL, you should see the following in the bottom left corner: ![Shows the WSL: Ubuntu status for a successful connection to WSL.](https://github.com/user-attachments/assets/26dbba61-cc96-4ac3-8200-fdb26a8e4a4b)

3. VSCode is now attached to WSL and it is equivalent to running in a native Linux environment. You can now proceed as described in the [section above](#vscode-devcontainer-steps).

## Quickstart: Docker (Manual Approach) <a name="docker"></a>

### Prerequisites
- [Docker](https://docs.docker.com/desktop/install/linux-install/)
- [Docker](https://docs.docker.com/engine/install/)

#### GPU Prerequisites (only needed for executing tests that require a GPU)
- Supported NVIDIA GPU
- [NVIDIA Driver](https://www.nvidia.com/Download/index.aspx?lang=en-us)
- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)

### Steps
1. Clone the repository and use the [`launch.sh`](./launch.sh) script to launch the default container environment
@@ -121,78 +230,3 @@ Click the badge above or [click here](https://codespaces.new/NVIDIA/cccl?quickst
For more information, see the `.devcontainer/make_devcontainers.sh --help` message.

**Note**: When adding or updating supported environments, modify `matrix.yaml` and then rerun this script to synchronize the `devcontainer` configurations.

## Quickstart: Using WSL <a name="wsl"></a>

> [!NOTE]
> _Make sure you have the NVIDIA driver installed on your Windows host before moving further_. Run `nvidia-smi` to verify.

### Install WSL on your Windows host

> [!WARNING]
> Disclaimer: This guide was developed for WSL 2 on Windows 11.

1. Launch a Windows terminal (_e.g. Powershell_) as an administrator.

2. Install WSL 2 by running:
```bash
wsl --install
```
By default, this installs the Ubuntu distribution.

3. Restart your computer and run `wsl -l -v` in a Windows terminal to verify the installation.

<h3 id="prereqs"> Install prerequisites and VS Code extensions</h3>

4. Launch your WSL/Ubuntu terminal by running `wsl` in Powershell.

5. Install the [WSL extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-wsl) in VS Code.

- Press `Ctrl + Shift + P` and select `WSL: Connect to WSL` (it will prompt you to install the WSL extension).

- Make sure VS Code is connected to WSL by checking the bottom left corner of the VS Code window (it should indicate "WSL: Ubuntu" in our case).

6. Install the [Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) in VS Code.

- On a vanilla system you should be prompted to install `Docker` at this point; accept the prompt. If the installation hangs, you might have to restart VS Code afterwards.

7. Install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html). **Make sure you install the WSL 2 version and not the native Linux one**. This builds on top of Docker so make sure you have Docker properly installed (run `docker --version`).

8. Open `/etc/docker/daemon.json` from within your WSL system (if the file does not exist, create it) and add the following:

```json
{
"runtimes": {
"nvidia": {
"path": "nvidia-container-runtime",
"runtimeArgs": []
}
}
}
```

then run `sudo systemctl restart docker.service`.
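
To double-check the edit, a crude text search can confirm that the config file mentions the runtime. This is only a sketch under stated assumptions: `has_nvidia_runtime` is a hypothetical helper, it does not parse JSON, and it defaults to the `/etc/docker/daemon.json` path used in the step above.

```bash
#!/usr/bin/env bash
# Sketch: crude text check that a Docker daemon config mentions the NVIDIA runtime.
# `has_nvidia_runtime` is a hypothetical helper; it greps the file and does not parse JSON.

has_nvidia_runtime() {
  local config="${1:-/etc/docker/daemon.json}"
  # Succeeds only if the file is readable and contains the runtime path string.
  [ -r "$config" ] && grep -q '"nvidia-container-runtime"' "$config"
}

if has_nvidia_runtime; then
  echo "nvidia runtime entry found in daemon.json"
else
  echo "nvidia runtime entry not found; re-check the daemon.json step"
fi
```

A match only shows that the entry is present in the file; Docker still needs to be restarted for the change to take effect.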

---
### Build CCCL in WSL using Dev Containers

9. Still in your WSL terminal, run `git clone https://github.com/NVIDIA/cccl.git`


10. Open the cloned CCCL repo in VS Code (press `Ctrl + Shift + P`, select `File: Open Folder...`, and choose the path where your CCCL clone is located).

11. If prompted, choose `Reopen in Container`.

- If you are not prompted, press `Ctrl + Shift + P` and select `Dev Containers: Open Folder in Container ...`.

12. Verify that Dev Container was configured properly by running `nvidia-smi` in your Dev Container terminal. For a proper configuration it is important for the steps in [Install prerequisites and VS Code extensions](#prereqs) to be followed in a precise order.

From that point on, the guide aligns with our [existing Dev Containers native Linux guide](https://github.com/NVIDIA/cccl/blob/main/.devcontainer/README.md) with just one minor potential alteration:

13. If WSL was launched without the X-server enabled and you answer **Yes** when asked to "authenticate Git with your Github credentials", the browser might not open automatically. In that case you will see the following error message:

> Failed opening a web browser at https://github.com/login/device
> exec: "xdg-open,x-www-browser,www-browser,wslview": executable file not found in $PATH
> Please try entering the URL in your browser manually

If this happens, open https://github.com/login/device manually in your web browser and enter the one-time code.
17 changes: 17 additions & 0 deletions .devcontainer/cccl-entrypoint.sh
@@ -0,0 +1,17 @@
#!/usr/bin/env bash

# shellcheck disable=SC1091

set -e;

devcontainer-utils-post-create-command;
devcontainer-utils-init-git;
devcontainer-utils-post-attach-command;

cd /home/coder/cccl/

if test $# -gt 0; then
exec "$@";
else
exec /bin/bash -li;
fi