[RELEASE] cudf v24.12 #17406
Commits on Oct 5, 2024
Add string.convert.convert_ipv4 APIs to pylibcudf (#16994)
Contributes to #15162 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #16994
Commit: 33b8dfa
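A hedged usage sketch of the new bindings; the module path follows the PR title, and the function names are assumed to mirror libcudf's `ipv4_to_integers`/`integers_to_ipv4`:

```python
import pyarrow as pa
import pylibcudf as plc

ips = plc.interop.from_arrow(pa.array(["192.168.0.1", "10.0.0.1"]))
as_ints = plc.strings.convert.convert_ipv4.ipv4_to_integers(ips)    # assumed name
round_trip = plc.strings.convert.convert_ipv4.integers_to_ipv4(as_ints)
```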
Fix write_json to handle empty string column (#16995)
Adds an empty-string-column condition to write_json that bypasses make_strings_children for empty columns, because a zero grid size throws a CUDA error. Authors: - Karthikeyan (https://github.com/karthikeyann) - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - David Wendt (https://github.com/davidwendt) - Muhammad Haseeb (https://github.com/mhaseeb123) URL: #16995
Commit: fcff2b6
Commits on Oct 7, 2024
This PR removes an unused import in cudf which was causing errors in doc builds. Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Bradley Dice (https://github.com/bdice) URL: #17005
Commit: bfd568b
Add release tracking to project automation scripts (#17001)
This PR adds two new jobs to the project automations. One to extract the version number from the branch name, and one to set the project `Release` field to the version found. Authors: - Ben Jarmak (https://github.com/jarmak-nv) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #17001
Commit: f926a61
Address all remaining clang-tidy errors (#16956)
With this set of changes I get a clean run of clang-tidy (with one caveat that I'll explain in the follow-up PR to add clang-tidy to pre-commit/CI). Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Nghia Truong (https://github.com/ttnghia) - MithunR (https://github.com/mythrocks) - David Wendt (https://github.com/davidwendt) - Kyle Edwards (https://github.com/KyleFromNVIDIA) URL: #16956
Commit: 7e1e475
Implement extract_datetime_component in libcudf/pylibcudf (#16776)
Closes #16735. Authors: - https://github.com/brandon-b-miller - Lawrence Mitchell (https://github.com/wence-) Approvers: - Matthew Murray (https://github.com/Matt711) - Lawrence Mitchell (https://github.com/wence-) - Bradley Dice (https://github.com/bdice) URL: #16776
Commit: 2d02bdc
Commits on Oct 8, 2024
Migrate nvtext generate_ngrams APIs to pylibcudf (#17006)
Part of #15162. Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Bradley Dice (https://github.com/bdice) URL: #17006
Commit: 09ed210
Everything in the expression evaluation now operates on columns without names. DataFrame construction takes either a mapping from string-valued names to columns, or a sequence of pairs of names and columns. This removes some duplicate code in the NamedColumn class (by removing it) where we had to fight the inheritance hierarchy. - Closes #16272 Authors: - Lawrence Mitchell (https://github.com/wence-) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Matthew Murray (https://github.com/Matt711) URL: #16962
Commit: 219ec0e
Compute whole column variance using numerically stable approach (#16448)
We use the pairwise approach of Chan, Golub, and LeVeque (1983). - Closes #16444 Authors: - Lawrence Mitchell (https://github.com/wence-) Approvers: - Bradley Dice (https://github.com/bdice) - Robert (Bobby) Evans (https://github.com/revans2) URL: #16448
Commit: bcf9425
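For reference, the numerically stable pairwise rule from Chan, Golub & LeVeque combines two partitions A and B (counts $n_A$, $n_B$, means $\bar{x}_A$, $\bar{x}_B$, sums of squared deviations $M_A$, $M_B$) as:

$$
\delta = \bar{x}_B - \bar{x}_A,\qquad
M_{AB} = M_A + M_B + \delta^2\,\frac{n_A n_B}{n_A + n_B},\qquad
\bar{x}_{AB} = \bar{x}_A + \delta\,\frac{n_B}{n_A + n_B},
$$

with the sample variance of the combined range given by $M_{AB}/(n_A + n_B - 1)$. Applying this rule in a reduction tree avoids the catastrophic cancellation of the naive $E[x^2] - E[x]^2$ formula.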
Turn on xfail_strict = true for all python packages (#16977)
The cudf tests already treat tests that are expected to fail but pass as errors, but at the time we introduced that change, we didn't do the same for the other packages. Do that now; it turns out there are only a few xpassing tests. While here, it turns out that having multiple different pytest configuration files does not work: `pytest.ini` takes precedence over other options, and it's "first file wins". Consequently, the merge of #16851 turned off `xfail_strict = true` (and other options) for many of the subpackages. To fix this, migrate all pytest configuration into the appropriate section of the `pyproject.toml` files, so that all tool configuration lives in the same place. We also add a section in the developer guide to document this choice. - Closes #12391 - Closes #16974 Authors: - Lawrence Mitchell (https://github.com/wence-) Approvers: - James Lamb (https://github.com/jameslamb) - Matthew Roeschke (https://github.com/mroeschke) - GALI PREM SAGAR (https://github.com/galipremsagar) URL: #16977
Commit: cc23474
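For context, with strict xfail semantics an unexpectedly passing test is reported as a failure rather than silently XPASS-ing. A minimal illustration (plain pytest, not cudf-specific):

```python
import pytest

@pytest.mark.xfail(strict=True, reason="known bug, tracked upstream")
def test_known_bug():
    # If this assertion ever starts passing, strict mode fails the run,
    # forcing the stale xfail marker to be removed.
    assert 1 + 1 == 3
```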
Performance optimization of JSON validation (#16996)
As part of JSON validation, field, value and string tokens are validated. Right now the code has a single transform_inclusive_scan. Since this transform functor is a heavy operation, it slows down the entire scan drastically. This PR splits transform and scan in validation. The runtime of validation went from 200ms to 20ms. Also, a few hardcoded string comparisons are moved to a trie. Authors: - Karthikeyan (https://github.com/karthikeyann) Approvers: - Nghia Truong (https://github.com/ttnghia) - Vukasin Milovanovic (https://github.com/vuule) - Robert (Bobby) Evans (https://github.com/revans2) URL: #16996
Commit: 553d8ec
Migrate nvtext jaccard API to pylibcudf (#17007)
Part of #15162. Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Matthew Roeschke (https://github.com/mroeschke) URL: #17007
Commit: 618a93f
make conda installs in CI stricter (#17013)
Contributes to rapidsai/build-planning#106 Proposes specifying the RAPIDS version in `conda install` calls in CI that install CI artifacts, to reduce the risk of CI jobs picking up artifacts from other releases. Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Bradley Dice (https://github.com/bdice) URL: #17013
Commit: 349ba5d
Commits on Oct 9, 2024
Add string.convert.convert_urls APIs to pylibcudf (#17003)
Contributes to #15162 Also I believe the cpp docstrings were incorrect, but could use a second look. Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - https://github.com/brandon-b-miller - Nghia Truong (https://github.com/ttnghia) - David Wendt (https://github.com/davidwendt) URL: #17003
Commit: 5b931ac
Add pinning for pyarrow in wheels (#17018)
We have recently observed a number of seg faults in our Python tests. From some investigation, the error comes from the import of pyarrow loading the bundled libarrow.so, and in particular when that library runs a jemalloc function `background_thread_entry`. We have observed similar (but not identical) errors in the past that have to do with as-yet unsolved problems in the way that arrow handles multi-threaded environments. The error is currently only observed on arm runners and with pyarrow 17.0.0. In my tests the error is highly sensitive to everything from import order to unrelated code segments, suggesting a race condition, some form of memory corruption, or perhaps symbol resolution errors at runtime. As a result, I have had limited success in drilling down further into specific causes, especially since attempts to rebuild libarrow.so also squash the error and I therefore cannot use debug symbols. From some offline discussion we decided that avoiding the problematic version is a sufficient fix for now. Due to the sensitivity, I am simply skipping 17.0.0 in this PR. I suspect that future builds of pyarrow will also usually not exhibit this bug (although it may recur occasionally on specific versions of pyarrow). Therefore, rather than lowering the upper bound I would prefer to allow us to float and see if and when this problem reappears. Since our DFG+RBB combination for wheel builds does not yet support any matrix entry other than `cuda`, I'm using environment markers to specify the constraint rather than a matrix entry in dependencies.yaml. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) URL: #17018
Commit: ded4dd2
Refactor histogram reduction using cuco::static_set::insert_and_find (#16485)
Refactors `histogram` reduce and groupby aggregations using `cuco::static_set::insert_and_find`. Speed improvement results [here](#16485 (comment)) and [here](#16485 (comment)). Authors: - Srinivas Yadav (https://github.com/srinivasyadav18) - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - Yunsong Wang (https://github.com/PointKernel) - Nghia Truong (https://github.com/ttnghia) URL: #16485
Commit: a6853f4
Disable kvikio remote I/O to avoid openssl dependencies in JNI build (#17026)
The same issue as NVIDIA/spark-rapids-jni#2475, due to rapidsai/kvikio#464. Ports the fix from NVIDIA/spark-rapids-jni#2476, verified locally. Authors: - Peixin (https://github.com/pxLi) Approvers: - Nghia Truong (https://github.com/ttnghia) URL: #17026
Commit: bfac5e5
Merge pull request #17027 from rapidsai/branch-24.10
Forward-merge branch-24.10 into branch-24.12
Commit: 9c37e1e
Use std::optional for host types (#17015)
cuda::std::optional shouldn't be used for host types such as `std::vector` as it requires the constructors of the `T` types to be host+device. Authors: - Robert Maynard (https://github.com/robertmaynard) Approvers: - Bradley Dice (https://github.com/bdice) - MithunR (https://github.com/mythrocks) - Nghia Truong (https://github.com/ttnghia) URL: #17015
Commit: dfdae59
[DOC] Document limitation using cudf.pandas proxy arrays (#16955)
When instantiating a `cudf.pandas` proxy array, a DtoH transfer occurs so that the data buffer is set correctly. We do this because functions which utilize NumPy's C API can utilize the data buffer directly instead of going through `__array__`. This PR documents this limitation. Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Matthew Roeschke (https://github.com/mroeschke) - Vyas Ramasubramani (https://github.com/vyasr) URL: #16955
Commit: bd51a25
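A sketch of the documented limitation, assuming the usual programmatic activation via `cudf.pandas.install()`:

```python
import cudf.pandas
cudf.pandas.install()   # activate the pandas accelerator before importing pandas

import numpy as np
import pandas as pd     # now proxy-backed; data may live on the GPU

s = pd.Series([1.0, 2.0, 3.0])
arr = np.asarray(s)     # instantiating the proxy array forces a DtoH transfer
```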
Fix host_span constructor to correctly copy is_device_accessible (#17020)
One of the `host_span` constructors was not updated when we added `is_device_accessible`, so the value was not assigned. This PR fixes this simple error and adds tests that check that this property is correctly set when creating `host_span`s. Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Muhammad Haseeb (https://github.com/mhaseeb123) URL: #17020
Commit: c7b5119
Add string.convert_floats APIs to pylibcudf (#16990)
Contributes to #15162 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - https://github.com/brandon-b-miller URL: #16990
Commit: 3791c8a
Commits on Oct 10, 2024
Update all rmm imports to use pylibrmm/librmm (#16913)
This PR updates all the RMM imports to use pylibrmm/librmm now that `rmm._lib` is deprecated. It should be merged after [rmm/1676](rapidsai/rmm#1676). Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Lawrence Mitchell (https://github.com/wence-) - Charles Blackmon-Luca (https://github.com/charlesbluca) URL: #16913
Commit: 31423d0
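An illustrative sketch of the import change, assuming the post-migration module layout (the top-level `rmm.DeviceBuffer` re-export is unaffected):

```python
# Before (deprecated):
#   from rmm._lib.device_buffer import DeviceBuffer
from rmm.pylibrmm.device_buffer import DeviceBuffer  # assumed new home

buf = DeviceBuffer(size=64)  # 64-byte device allocation
```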
Fix regex parsing logic handling of nested quantifiers (#16798)
Fixes the libcudf regex parsing logic when handling nested fixed quantifiers. The logic handles fixed quantifiers by simply repeating the previous instruction; if the previous item is a group (capture or non-capture), that group may also contain an internal fixed quantifier. Found while working on #16730. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Bradley Dice (https://github.com/bdice) - Vyas Ramasubramani (https://github.com/vyasr) URL: #16798
Commit: 7173b52
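An illustrative (hypothetical) pattern of the kind this fix addresses: a fixed quantifier applied to a group whose body itself contains a fixed quantifier:

```python
import cudf

s = cudf.Series(["aabaab", "aab"])
# The outer {2} repeats a group containing the inner fixed quantifier {2};
# both must expand correctly during parsing.
print(s.str.contains(r"(?:a{2}b){2}"))  # expected: [True, False]
```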
Add string.convert.convert_lists APIs to pylibcudf (#16997)
Contributes to #15162 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #16997
Commit: 69b0f66
Add json APIs to pylibcudf (#17025)
Contributes to #15162 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - James Lamb (https://github.com/jameslamb) - Vyas Ramasubramani (https://github.com/vyasr) URL: #17025
Commit: 7d49df7
Commits on Oct 11, 2024
Move pylibcudf/libcudf/wrappers/decimals to pylibcudf/libcudf/fixed_point (#17048)
Contributes to #15162. I don't think there are any types in this file that need to be exposed on the Python side; they're just used internally in pylibcudf. Also moves this to `libcudf/fixed_point`, matching the libcudf location more closely. Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #17048
Commit: 097778e
Remove unneeded pylibcudf.libcudf.wrappers.duration usage in cudf (#17010)
Contributes to #15162. See #17010 (comment) for discussion. Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Matthew Murray (https://github.com/Matt711) URL: #17010
Commit: 1436cac
make conda installs in CI stricter (part 2) (#17042)
Follow-up to #17013 Changes relative to that PR: * switches to pinning CI conda installs to the output of `rapids-version` (`{major}.{minor}.{patch}`) instead of `rapids-version-major-minor` (`{major}.{minor}`), to get a bit more protection in the presence of hotfix releases * restores some exporting of variables needed for docs builds I made some mistakes in #17013 (comment). Missed that this project's Doxygen setup is expecting to find `RAPIDS_VERSION` and `RAPIDS_VERSION_MAJOR_MINOR` defined in the environment. https://github.com/rapidsai/cudf/blob/7173b52fce25937bb69e22a083a5de4655078fa1/cpp/doxygen/Doxyfile#L41 https://github.com/rapidsai/cudf/blob/7173b52fce25937bb69e22a083a5de4655078fa1/cpp/doxygen/Doxyfile#L2229 Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Kyle Edwards (https://github.com/KyleFromNVIDIA) URL: #17042
Commit: 89a6fe5
Pylibcudf: pack and unpack (#17012)
Adding python bindings to [`cudf::pack()`](https://docs.rapids.ai/api/libcudf/legacy/group__copy__split#ga86716e7ec841541deb6edc7e91fcb9e4), [`cudf::unpack()`](https://docs.rapids.ai/api/libcudf/legacy/group__copy__split#ga1d62a18c2e6f087a92289c63693762cc), and [`cudf::packed_columns`](https://docs.rapids.ai/api/libcudf/legacy/structcudf_1_1packed__columns). This is the first step to support serialization of cudf.polars' IR. cc. @wence- @rjzamora Authors: - Mads R. B. Kristensen (https://github.com/madsbk) - Matthew Murray (https://github.com/Matt711) - Lawrence Mitchell (https://github.com/wence-) Approvers: - Matthew Roeschke (https://github.com/mroeschke) - Lawrence Mitchell (https://github.com/wence-) URL: #17012
Commit: 7cf0a1b
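A hedged usage sketch of the new bindings; the module path is assumed to mirror libcudf's `contiguous_split.hpp`:

```python
import pyarrow as pa
import pylibcudf as plc

tbl = plc.interop.from_arrow(pa.table({"a": [1, 2, 3]}))
packed = plc.contiguous_split.pack(tbl)         # contiguous, serializable form
restored = plc.contiguous_split.unpack(packed)  # table view over the packed buffers
```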
Replace deprecated cuco APIs with updated versions (#17052)
This PR replaces the deprecated cuco APIs with the new ones, ensuring the code is up to date with the latest API changes. Authors: - Yunsong Wang (https://github.com/PointKernel) Approvers: - Nghia Truong (https://github.com/ttnghia) - Muhammad Haseeb (https://github.com/mhaseeb123) URL: #17052
Commit: 66a94c3
Remove unused hash helper functions (#17056)
This PR removes unused hash detail implementations. Authors: - Yunsong Wang (https://github.com/PointKernel) Approvers: - David Wendt (https://github.com/davidwendt) - Vyas Ramasubramani (https://github.com/vyasr) URL: #17056
Commit: 349010e
Organize parquet reader mukernel non-nullable code, introduce manual block scans (#16830)
This is a collection of a few small optimizations and tweaks for the parquet reader fixed-width mukernels (flat & nested; lists not implemented yet). The benchmark changes are negligible; this is mainly cleanup and code in preparation for the upcoming list mukernel.
1) If not reading the whole page (chunked reads), exit sooner.
2) By having each thread keep track of the current valid_count (and not saving-to or reading-from the nesting_info until the end), we don't need to synchronize the block threads as frequently, so these extra syncs are removed.
3) For (non-list) nested columns that aren't nullable, we don't need to loop over the whole nesting depth; only the last level of nesting is used. After removing this loop, the non-nullable code for nested and flat hierarchies is identical, so it's extracted and consolidated into a new function.
4) When doing block scans in the parquet reader we also need to know the per-warp results of the scan. Because cub doesn't return those, we then do an additional warp-wide ballot that is unnecessary. This introduces code that does a block scan manually, saving the intermediate results. However, using this code in the flat & nested kernels uses 8 more registers, so it isn't used yet.
5) By doing an exclusive scan instead of an inclusive scan, we don't need the extra "- 1"s that were everywhere.
Authors: - Paul Mattione (https://github.com/pmattione-nvidia) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - https://github.com/nvdbaranec URL: #16830
Commit: 891e5aa
docs: change 'CSV' to 'csv' in python/custreamz/README.md to match kafka.py (#17041)
This PR corrects a typo in the `python/custreamz/README.md` file by changing the uppercase `'CSV'` to lowercase `'csv'`. This change aligns the documentation with the `message_format` options defined in `python/custreamz/custreamz/kafka.py`, ensuring consistency across the codebase. Authors: - Hirota Akio (https://github.com/a-hirota) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Matthew Murray (https://github.com/Matt711) URL: #17041
Commit: 0b840bb
Reorganize cudf_polars expression code (#17014)
This PR seeks to break up `expr.py` into a less unwieldy monolith. Authors: - https://github.com/brandon-b-miller Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Matthew Murray (https://github.com/Matt711) URL: #17014
Commit: b8f3e21
Move flatten_single_pass_aggs to its own TU (#17053)
Part of splitting the original bulk shared memory groupby PR #16619. This PR separates `flatten_single_pass_aggs` into its own translation unit without making any code modifications. Authors: - Yunsong Wang (https://github.com/PointKernel) Approvers: - Kyle Edwards (https://github.com/KyleFromNVIDIA) - Shruti Shivakumar (https://github.com/shrshi) - Karthikeyan (https://github.com/karthikeyann) - David Wendt (https://github.com/davidwendt) URL: #17053
Commit: fea87cb
Migrate Min Hashing APIs to pylibcudf (#17021)
Part of #15162. Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Matthew Roeschke (https://github.com/mroeschke) URL: #17021
Commit: c8a56a5
Add an example to demonstrate multithreaded read_parquet pipelines (#16828)
Closes #16717. This PR adds a new example to read multiple parquet files using multiple threads. Authors: - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - Bradley Dice (https://github.com/bdice) - David Wendt (https://github.com/davidwendt) - Basit Ayantunde (https://github.com/lamarrr) URL: #16828
Commit: be1dd32
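The merged example is a libcudf (C++) pipeline; as a rough Python analogue of the same shape (file paths here are hypothetical), reads can be overlapped across a thread pool:

```python
from concurrent.futures import ThreadPoolExecutor

import cudf

paths = [f"data/part-{i}.parquet" for i in range(8)]  # hypothetical inputs

# Overlap file reads across threads, to the extent the underlying
# calls release the GIL during I/O-heavy work.
with ThreadPoolExecutor(max_workers=4) as pool:
    frames = list(pool.map(cudf.read_parquet, paths))

result = cudf.concat(frames, ignore_index=True)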
Commits on Oct 12, 2024
Refactor ORC dictionary encoding to migrate to the new cuco::static_map (#17049)
Part of #12261. This PR refactors ORC writer's dictionary encoding to migrate from `cuco::legacy::static_map` to the new `cuco::static_map`. No performance impact measured. Results [here](#17049 (comment)). Authors: - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - Yunsong Wang (https://github.com/PointKernel) - Vukasin Milovanovic (https://github.com/vuule) URL: #17049
Commit: 4dbb8a3
Commits on Oct 14, 2024
Made cudftestutil header-only and removed GTest dependency (#16839)
This merge request follows up on #16658. It removes cudftestutil's dependency on GTest. It satisfies the requirement that we only need API compatibility with the GTest API, and we don't expose the GTest symbols to our consumers nor ship any binary artifact. The source files defining the symbols are late-bound to the resulting executable (via library INTERFACE sources). The user has to manually link the GTest and GMock libraries to the final executable, as illustrated below. Closes #16658
Usage (CMakeLists.txt):
```cmake
add_executable(test1 test1.cpp)
target_link_libraries(test1 PRIVATE GTest::gtest GTest::gmock GTest::gtest_main
                      cudf::cudftestutil cudf::cudftestutil_impl)
```
Authors: - Basit Ayantunde (https://github.com/lamarrr) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Robert Maynard (https://github.com/robertmaynard) - David Wendt (https://github.com/davidwendt) - Mike Sarahan (https://github.com/msarahan) URL: #16839
Commit: 3bee678
Add profilers to CUDA 12 conda devcontainers (#17066)
This will make sure that profilers are available by default for everyone using our devcontainers. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - James Lamb (https://github.com/jameslamb) URL: #17066
Commit: e41dea9
Fix ORC reader when using device_read_async while the destination device buffers are not ready (#17074)
This fixes a bug in the ORC reader when `device_read_async` is called while the destination device buffers are not ready to be written to. In detail, this bug occurs because `device_read_async` does not use the user-provided stream but its own generated stream for data copying. As such, the copying ops could happen before the destination device buffers are allocated, causing data corruption. This bug only shows up in certain conditions and is also hard to reproduce. It occurs when copying buffers with small sizes (below `gds_threshold`) and is most likely to show up with `rmm_mode=async`. Authors: - Nghia Truong (https://github.com/ttnghia) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - David Wendt (https://github.com/davidwendt) URL: #17074
Commit: 768fbaa
This PR adds clang-tidy checks to our CI. clang-tidy will be run in nightly CI via CMake. For now, only the parts of the code base that were already made compliant in the PRs leading up to this have been enabled, namely cudf source and test cpp files. Over time we can add more files like benchmarks and examples, add or subtract more rules, or enable linting of cu files (see https://gitlab.kitware.com/cmake/cmake/-/issues/25399). This PR is intended to be the starting point enabling systematic linting, at which point everything else should be significantly easier. Resolves #584 Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Kyle Edwards (https://github.com/KyleFromNVIDIA) - Bradley Dice (https://github.com/bdice) URL: #16958
Commit: 44afc51
Clean up hash-groupby var_hash_functor (#17034)
This work is part of splitting the original bulk shared memory groupby PR #16619. This PR renames the file originally titled `multi_pass_kernels.cuh`, which contains the `var_hash_functor`, to `var_hash_functor.cuh`. It also includes cleanups such as utilizing `cuda::std::` utilities in device code and removing redundant template parameters. Authors: - Yunsong Wang (https://github.com/PointKernel) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - David Wendt (https://github.com/davidwendt) URL: #17034
Commit: 86db980
Adding assertion to check for regular JSON inputs of size greater than `INT_MAX` bytes (#17057)
Addresses #17017. Libcudf does not support parsing regular JSON inputs of size greater than `INT_MAX` bytes. Note that the batched reader can only be used for JSON lines inputs. Authors: - Shruti Shivakumar (https://github.com/shrshi) Approvers: - Muhammad Haseeb (https://github.com/mhaseeb123) - Vukasin Milovanovic (https://github.com/vuule) - Karthikeyan (https://github.com/karthikeyann) URL: #17057
Commit: 319ec3b
Commits on Oct 15, 2024
Add string.convert.convert_integers APIs to pylibcudf (#16991)
Contributes to #15162 Authors: - Matthew Roeschke (https://github.com/mroeschke) - https://github.com/brandon-b-miller Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - https://github.com/brandon-b-miller - Matthew Murray (https://github.com/Matt711) URL: #16991
Commit: c141ca5
Fix regex handling of fixed quantifier with 0 range (#17067)
Fixes regex logic handling of a pattern with a fixed quantifier that includes a zero-range. Added new gtests for this specific case. Bug was introduced in #16798 Closes #17065 Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Robert (Bobby) Evans (https://github.com/revans2) - Vyas Ramasubramani (https://github.com/vyasr) - MithunR (https://github.com/mythrocks) - Basit Ayantunde (https://github.com/lamarrr) URL: #17067
Commit: 7bcfc87
Commits on Oct 16, 2024
Migrate remaining nvtext NGrams APIs to pylibcudf (#17070)
Part of #15162. Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - https://github.com/brandon-b-miller URL: #17070
Commit: 3420c71
Remove unnecessary std::move's in pylibcudf (#16983)
This PR removes a lot of unnecessary `std::move`'s from pylibcudf. These were necessary with older versions of Cython, but newer versions appear to generate the correct C++ without needing the extra hints. Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #16983
Commit: 95df62a
Reenable huge pages for arrow host copying (#17097)
It is unclear whether the performance gains here are entirely from huge pages themselves or whether invoking madvise with huge pages is primarily serving to trigger an eager population of the pages (huge or not). We attempted to provide alternate flags to `madvise` like `MADV_WILLNEED` and that was not sufficient to recover performance, so either huge pages themselves are doing something special or specifying huge pages is causing `madvise` to trigger a page migration that no other flag does. In any case, this change returns us to the performance before the switch to the C data interface, and this code is lifted straight out of our old implementation so I am comfortable making use of it and knowing that it is not problematic. We should explore further optimizations in this direction, though. Resolves #17075. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) - Mark Harris (https://github.com/harrism) URL: #17097
Commit: f1cbbcc
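The change itself lives in libcudf's C++ host-copy path; as a minimal illustration of the `madvise` hint involved (Linux-only, via Python's `mmap` wrapper):

```python
import mmap

buf = mmap.mmap(-1, 1 << 22)      # anonymous 4 MiB mapping
buf.madvise(mmap.MADV_HUGEPAGE)   # hint the kernel to back it with huge pages
```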
Include timezone file path in error message (#17102)
Resolves #8795. Also needed for #16998. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - David Wendt (https://github.com/davidwendt) - Vyas Ramasubramani (https://github.com/vyasr) URL: #17102
Commit: b513df8
bug fix: use self.ck_consumer in poll method of kafka.py to align with `__init__` (#17044)
Updated the `poll` method in `kafka.py` to use `self.ck_consumer.poll(timeout)` instead of `self.ck.poll(timeout)`. This change ensures consistency with the `__init__` method, where `self.ck_consumer` is initialized. Authors: - Hirota Akio (https://github.com/a-hirota) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #17044
Commit: c9202a0
Commits on Oct 17, 2024
Implement batch construction for strings columns (#17035)
This implements batch construction of strings columns, allowing a large number of strings columns to be created at once with minimal kernel-launch and stream-synchronization overhead. There should be only one stream sync in the entire column construction process. Benchmark: #17035 (comment) Closes #16486. Authors: - Nghia Truong (https://github.com/ttnghia) Approvers: - David Wendt (https://github.com/davidwendt) - Yunsong Wang (https://github.com/PointKernel) URL: #17035
Commit: 5f863a5
Add strings.combine APIs to pylibcudf (#16790)
Contributes to #15162 Authors: - Matthew Roeschke (https://github.com/mroeschke) - Matthew Murray (https://github.com/Matt711) Approvers: - Lawrence Mitchell (https://github.com/wence-) - Matthew Murray (https://github.com/Matt711) URL: #16790
Commit: 3683e46
Make tests more deterministic (#17008)
Fixes #17045. This PR removes randomness in our pytests and switches from using `np.random.seed` to `np.random.default_rng` throughout the codebase. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Jake Awe (https://github.com/AyodeAwe) - Lawrence Mitchell (https://github.com/wence-) - Benjamin Zaitlen (https://github.com/quasiben) URL: #17008
Commit: e493340
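For reference, the NumPy-recommended pattern the codebase moved to — a seeded local `Generator` instead of mutating global state:

```python
import numpy as np

# Old, global-state style (what the PR removes):
#   np.random.seed(42); np.random.randint(0, 10, size=5)

rng = np.random.default_rng(seed=42)   # local, reproducible generator
values = rng.integers(0, 10, size=5)   # deterministic given the seed
```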
Add conda recipe for cudf-polars (#17037)
This PR adds conda recipes for `cudf-polars`. This is needed to get `cudf-polars` into RAPIDS Docker containers and the `rapids` metapackage. Closes #16816. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Matthew Murray (https://github.com/Matt711) - James Lamb (https://github.com/jameslamb) - Lawrence Mitchell (https://github.com/wence-) URL: #17037
Commit: 9980997
Fix DataFrame._from_arrays and introduce validations (#17112)
Fixes: #17111. This PR fixes `DataFrame._from_arrays` to properly access the `ndim` attribute and also corrects two validations in the `Series` & `DataFrame` constructors. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Matthew Murray (https://github.com/Matt711) URL: #17112
Commit: 6eeb7d6
Correctly set is_device_accessible when creating host_spans from other container/span types (#17079)
Discovered that the way `host_span`s are created from `hostdevice_vector`, `hostdevice_span`, `hostdevice_2dvector` and `host_2dspan` (yes, these are all real types!) does not propagate the `is_device_accessible` flag. In most of these cases the spans use pinned memory, so we're incorrect most of the time. This PR fixes the way these conversions work. Adjusted some APIs to make it a bit harder to avoid passing the `is_device_accessible` flag. Removed a few unused functions in `span.hpp` to keep the file as light as possible (it's included EVERYWHERE). Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - Nghia Truong (https://github.com/ttnghia) - Shruti Shivakumar (https://github.com/shrshi) URL: #17079
Commit: 14209c1
Remove the additional host register calls initially intended for performance improvement on Grace Hopper (#17092)
On Grace Hopper, file I/O takes a special path that calls `cudaHostRegister` to circumvent a performance issue. Recent benchmarks show that this workaround is no longer necessary. This PR cleans that up. Authors: - Tianyu Liu (https://github.com/kingcrimsontianyu) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - David Wendt (https://github.com/davidwendt) URL: #17092
Commit: 920a5f6
Limit the number of keys to calculate column sizes and page starts in PQ reader to 1B (#17059)
This PR limits the number of keys used at a time to calculate column `sizes` and `page_start_values` to 1B, averting possible OOM and UB from implicit typecasting of `size_t` iterators to `size_type` iterators in `thrust::reduce_by_key`. Closes #16985. Closes #17086.
Resolved:
- Add tests
- Debug with fingerprinting structs table for a possible bug in PQ writer (nothing seems wrong with the writer, as pyarrow is able to read the written parquet files)
Authors: - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - Bradley Dice (https://github.com/bdice) - Vukasin Milovanovic (https://github.com/vuule) - Yunsong Wang (https://github.com/PointKernel) URL: #17059
Commit: 00feb82
Migrate NVText Normalizing APIs to Pylibcudf (#17072)
Part of #15162. Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Bradley Dice (https://github.com/bdice) URL: #17072
Commit: ce93c36
Commits on Oct 18, 2024
Add device aggregators used by shared memory groupby (#17031)
This work is part of splitting the original bulk shared memory groupby PR #16619. It introduces two device-side element aggregators:
- `shmem_element_aggregator`: aggregates data from global memory sources to shared memory targets,
- `gmem_element_aggregator`: aggregates from shared memory sources to global memory targets.
These two aggregators are similar to the `elementwise_aggregator` functionality. Follow-up work is tracked via #17032. Authors: - Yunsong Wang (https://github.com/PointKernel) Approvers: - Muhammad Haseeb (https://github.com/mhaseeb123) - David Wendt (https://github.com/davidwendt) URL: #17031
Commit: 8ebf0d4
Control whether a file data source memory-maps the file with an environment variable (#17004)
Adds an environment variable, `LIBCUDF_MMAP_ENABLED`, to control whether we memory map the input file in the data source. Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - Nghia Truong (https://github.com/ttnghia) - Tianyu Liu (https://github.com/kingcrimsontianyu) URL: #17004
Commit: b891722
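A usage sketch; the accepted values ("ON"/"OFF") are an assumption here — check the merged docs for the exact semantics:

```python
import os

# Must be set before libcudf opens the file (value names assumed):
os.environ["LIBCUDF_MMAP_ENABLED"] = "OFF"

import cudf
df = cudf.read_csv("data.csv")  # file data source now skips memory mapping
```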
Fix the GDS read/write segfault/bus error when the cuFile policy is set to GDS or ALWAYS (#17122)
When `LIBCUDF_CUFILE_POLICY` is set to `GDS` or `ALWAYS`, cuDF uses an internal implementation to call the cuFile API and harness the GDS feature. Recent tests with these two settings were unsuccessful due to program crashes. Specifically, for the `PARQUET_READER_NVBENCH`'s `parquet_read_io_compression` benchmark:
- GDS write randomly crashed with segmentation fault (SIGSEGV).
- GDS read randomly crashed with bus error (SIGBUS).
- At the time of crash, the stack frame is randomly corrupted.
The root cause is the use of a dangling reference, which occurs when a variable is captured by reference by nested lambdas. This PR performs a hotfix that turns out to be a 1-char change. Authors: - Tianyu Liu (https://github.com/kingcrimsontianyu) Approvers: - David Wendt (https://github.com/davidwendt) - Nghia Truong (https://github.com/ttnghia) - Bradley Dice (https://github.com/bdice) URL: #17122
Commit: 6ca721c
Fix clang-tidy violations for span.hpp and hostdevice_vector.hpp (#17124)
Errors reported here: https://github.com/rapidsai/cudf/actions/runs/11398977412/job/31716929242 Just adding `[[nodiscard]]` to a few member functions. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Nghia Truong (https://github.com/ttnghia) - Shruti Shivakumar (https://github.com/shrshi) URL: #17124
Commit: e1c9a5a
Disable the Parquet reader's wide lists tables GTest by default (#17120)
This PR disables Parquet reader's wide lists table gtest by default as it takes several minutes to complete with memcheck. See the discussion on PR #17059 (this [comment](#17059 (comment))) for more context. Authors: - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - David Wendt (https://github.com/davidwendt) - Vyas Ramasubramani (https://github.com/vyasr) URL: #17120
Commit: e242dce
Add custom "fused" groupby aggregation to Dask cuDF (#17009)
The legacy Dask cuDF implementation uses a custom code path for GroupBy aggregations. However, when query-planning is enabled (the default), we use the same algorithm as the pandas backend. This PR ports the custom "fused aggregation" code path over to the dask-expr version of Dask cuDF. Authors: - Richard (Rick) Zamora (https://github.com/rjzamora) Approvers: - Lawrence Mitchell (https://github.com/wence-) URL: #17009
Commit: 6ad9074
Extend device_scalar to optionally use pinned bounce buffer (#16947)
Depends on #16945. Added `cudf::detail::device_scalar`, derived from `rmm::device_scalar`. The new class overrides member functions that perform copies between host and device. The new implementation uses a `cudf::detail::host_vector` as a bounce buffer to avoid performing a pageable copy. Replaced `rmm::device_scalar` with `cudf::detail::device_scalar` across libcudf. Authors: - Vukasin Milovanovic (https://github.com/vuule) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Basit Ayantunde (https://github.com/lamarrr) - Vyas Ramasubramani (https://github.com/vyasr) - David Wendt (https://github.com/davidwendt) URL: #16947
Commit: 98eef67
Commits on Oct 19, 2024
Changing developer guide int_64_t to int64_t (#17130)
Fixes #17129 Authors: - Mike Wilson (https://github.com/hyperbolic2346) Approvers: - Nghia Truong (https://github.com/ttnghia) - David Wendt (https://github.com/davidwendt) - Bradley Dice (https://github.com/bdice) - Alessandro Bellina (https://github.com/abellina) - Muhammad Haseeb (https://github.com/mhaseeb123) URL: #17130
Commit: fdd2b26
Replace old host tree algorithm with new algorithm in JSON reader (#17019)
This PR replaces the old tree algorithm in the JSON reader with the experimental algorithm and removes the experimental namespace. Changes include removal of the old tree algorithm code, removal of the experimental namespace, moving the `scatter_offsets` code, and always calling the new tree algorithm. No functional change is made in this PR; all unit tests should pass with this change. Authors: - Karthikeyan (https://github.com/karthikeyann) Approvers: - Shruti Shivakumar (https://github.com/shrshi) - Vukasin Milovanovic (https://github.com/vuule) URL: #17019
Commit: 1ce2526
Split hash-based groupby into multiple smaller files to reduce build time (#17089)
This work is part of splitting the original bulk shared memory groupby PR #16619. This PR splits the hash-based groupby file into multiple translation units and uses explicit template instantiations to help reduce build time. It also includes some minor cleanups without significant functional changes. Authors: - Yunsong Wang (https://github.com/PointKernel) Approvers: - Kyle Edwards (https://github.com/KyleFromNVIDIA) - Nghia Truong (https://github.com/ttnghia) - David Wendt (https://github.com/davidwendt) - Muhammad Haseeb (https://github.com/mhaseeb123) URL: #17089
Commit: 074ab74
Commits on Oct 21, 2024
Ignore loud dask warnings about legacy dataframe implementation (#17137)
This PR ignores loud dask warnings that the legacy dask dataframe implementation is soon going to be removed: dask/dask#11437 Note: we only see this error for `DASK_DATAFRAME__QUERY_PLANNING=False` cases; `DASK_DATAFRAME__QUERY_PLANNING=True` cases are passing fine. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Bradley Dice (https://github.com/bdice) - Peter Andreas Entschev (https://github.com/pentschev) - Richard (Rick) Zamora (https://github.com/rjzamora) URL: #17137
Commit: 69ca387
Commits on Oct 22, 2024
Add compile time check to ensure the counting_iterator type in `counting_transform_iterator` fits in `size_type` (#17118)
This PR adds a compile time check to enforce that the `start` argument to `cudf::detail::counting_transform_iterator`, which is used to determine the type of `counting_iterator`, is of a type that fits in `int32_t` (aka `size_type`). The PR also modifies the instances of `counting_transform_iterator` that need to work with `counting_iterator`s of type > `int32_t`, replacing them with manually created counting transform iterators using thrust. More context in this [comment](https://github.com/rapidsai/cudf/pull/17059/files#r1803925659). Authors: - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - David Wendt (https://github.com/davidwendt) - Yunsong Wang (https://github.com/PointKernel) - Vukasin Milovanovic (https://github.com/vuule) - Tianyu Liu (https://github.com/kingcrimsontianyu) URL: #17118
Commit: 13de3c1
Unify treatment of Expr and IR nodes in cudf-polars DSL (#17016)
As part of in-progress multi-GPU work, we will likely want to:
1. Introduce additional nodes into the `IR` namespace;
2. Implement rewrite rules for `IR` trees to express needed communication patterns;
3. Write visitors that translate expressions into an appropriate description for whichever multi-GPU approach we end up taking.
It was already straightforward to write generic visitors for `Expr` nodes, since those uniformly have a `.children` property for their dependents. In contrast, the `IR` nodes were more ad-hoc. To solve this, pull out the generic implementation from `Expr` into an abstract `Node` class. Now `Expr` nodes just inherit from this, and `IR` nodes do so similarly. Redoing the `IR` nodes is a little painful because we want to make them hashable, so we have to provide a bunch of custom `get_hashable` implementations (the schema dict, for example, is not hashable). With these generic facilities in place, we can now implement traversal and visitor infrastructure. Specifically, we provide:
- a mechanism for pre-order traversal of an expression DAG, yielding each unique node exactly once. This is useful if one wants to know if an expression contains some particular node;
- a mechanism for writing recursive visitors and then wrapping a caching scheme around the outside. This is useful for rewrites.
Some example usages are shown in tests. Authors: - Lawrence Mitchell (https://github.com/wence-) - Richard (Rick) Zamora (https://github.com/rjzamora) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Matthew Murray (https://github.com/Matt711) - Richard (Rick) Zamora (https://github.com/rjzamora) - Vyas Ramasubramani (https://github.com/vyasr) URL: #17016
Commit: 637e320
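A minimal sketch (not the cudf-polars implementation) of the traversal idea described above: nodes expose a uniform `children` tuple, and a pre-order walk yields each unique node exactly once.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    name: str
    children: tuple["Node", ...] = ()

def traverse(root: Node):
    """Pre-order traversal yielding each unique node exactly once."""
    seen: set[int] = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        yield node
        stack.extend(reversed(node.children))

# Usage: detect whether a DAG contains a node of interest.
shared = Node("literal")
expr = Node("add", (Node("mul", (shared,)), shared))
print(any(n.name == "literal" for n in traverse(expr)))  # True; visited once
```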
Add string.replace_re APIs to pylibcudf (#17023)
Contributes to #15162 Authors: - Matthew Roeschke (https://github.com/mroeschke) - Matthew Murray (https://github.com/Matt711) Approvers: - Matthew Murray (https://github.com/Matt711) - GALI PREM SAGAR (https://github.com/galipremsagar) URL: #17023
Commit: 4fe338c
Migrate NVText Replacing APIs to pylibcudf (#17084)
Part of #15162. Authors: - Matthew Murray (https://github.com/Matt711) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Mark Harris (https://github.com/harrism) URL: #17084
Commit: 14cdf53
Set the default number of threads in KvikIO thread pool to 8 (#17126)
Recent benchmarks have shown that setting the environment variable `KVIKIO_NTHREADS=8` in cuDF usually leads to optimal I/O performance. This PR internally sets the default KvikIO thread pool size to 8. The env `KVIKIO_NTHREADS` will still be honored if users explicitly set it. Fixes #16718 Authors: - Tianyu Liu (https://github.com/kingcrimsontianyu) Approvers: - Mads R. B. Kristensen (https://github.com/madsbk) - Vukasin Milovanovic (https://github.com/vuule) URL: #17126
Commit: 27c0c9d
Commits on Oct 23, 2024
JSON tokenizer memory optimizations (#16978)
The full push-down automaton that tokenizes the input JSON string, as well as the bracket-brace FST, over-estimates the total buffer size required for the translated output and indices. This PR splits the `transduce` calls for both FSTs into two invocations. The first invocation estimates the size of the translated buffer and the translated indices, and the second call performs the DFA run. Authors: - Shruti Shivakumar (https://github.com/shrshi) - Karthikeyan (https://github.com/karthikeyann) Approvers: - Karthikeyan (https://github.com/karthikeyann) - Basit Ayantunde (https://github.com/lamarrr) URL: #16978
Commit: cff1296
[Bug] Fix Arrow-FS parquet reader for larger files (#17099)
Follow-up to #16684. There is currently a bug in `dask_cudf.read_parquet(..., filesystem="arrow")` when the files are larger than the `"dataframe.parquet.minimum-partition-size"` config. More specifically, when the files are not aggregated together, the output will be `pd.DataFrame` instead of `cudf.DataFrame`. Authors: - Richard (Rick) Zamora (https://github.com/rjzamora) Approvers: - Mads R. B. Kristensen (https://github.com/madsbk) URL: #17099
Commit: 3126f77
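A usage sketch of the code path being fixed (the glob path and size threshold below are illustrative):

```python
import dask
import dask_cudf

# Files larger than this threshold are not aggregated together; previously
# such partitions came back as pandas objects instead of cuDF ones.
with dask.config.set({"dataframe.parquet.minimum-partition-size": 128 * 1024**2}):
    ddf = dask_cudf.read_parquet("data/*.parquet", filesystem="arrow")

print(type(ddf.head()))  # expected: a cudf-backed DataFrame
```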
Add JNI Support for Multi-line Delimiters and Include Test (#17139)
This PR introduces the necessary changes to the cuDF jni to support the issue described in [NVIDIA/spark-rapids#11554](NVIDIA/spark-rapids#11554). For further information, refer to the details in the [comment](NVIDIA/spark-rapids#11554 (comment)). Issue #15961 adds support for handling multiple line delimiters. This PR extends that functionality to JNI, which was previously missing, and also includes a test to validate the changes. Authors: - Suraj Aralihalli (https://github.com/SurajAralihalli) Approvers: - MithunR (https://github.com/mythrocks) - Robert (Bobby) Evans (https://github.com/revans2) URL: #17139
Commit: f0c6a04
Use async execution policy for true_if (#17146)
Closes #17117 Related to #12086 This PR replaces the synchronous execution policy with an asynchronous one to eliminate unnecessary synchronization. Authors: - Yunsong Wang (https://github.com/PointKernel) Approvers: - David Wendt (https://github.com/davidwendt) - Shruti Shivakumar (https://github.com/shrshi) - Jason Lowe (https://github.com/jlowe) - Nghia Truong (https://github.com/ttnghia) URL: #17146
Commit: 02ee819
Replace direct cudaMemcpyAsync calls with utility functions (limited to `cudf::io`) (#17132)
Issue #15620. Replaced the calls to `cudaMemcpyAsync` with the new `cuda_memcpy`/`cuda_memcpy_async` utility, which optionally avoids using the copy engine. Changes are limited to cuIO to make the PR easier to review (repetitive enough as-is!). Also took the opportunity to use `cudf::detail::host_vector` and its factories to enable wider pinned memory use. Skipped a few instances of `cudaMemcpyAsync`; a few are under `io::comp`, which we don't want to invest in further (if possible). The other `cudaMemcpyAsync` instances are D2D copies, which `cuda_memcpy`/`cuda_memcpy_async` don't support. Perhaps they should, just to make the use ubiquitous. Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - Paul Mattione (https://github.com/pmattione-nvidia) - Nghia Truong (https://github.com/ttnghia) URL: #17132
Commit: deb9af4
Use managed memory for NDSH benchmarks (#17039)
Fixes #16987. Uses managed memory to generate the parquet data, and writes the parquet data to a host buffer. Replaces use of `parquet_device_buffer` with `cuio_source_sink_pair`. Authors: - Karthikeyan (https://github.com/karthikeyann) Approvers: - David Wendt (https://github.com/davidwendt) - Tianyu Liu (https://github.com/kingcrimsontianyu) - Muhammad Haseeb (https://github.com/mhaseeb123) URL: #17039
Commit: e7653a7
Use the full ref name of `rmm.DeviceBuffer` in the sphinx config file (#17150)
This is an improvement PR that uses the full name of `rmm.DeviceBuffer` in the sphinx config file. It's a follow-up to this [comment](#16913 (comment)). Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Bradley Dice (https://github.com/bdice) URL: #17150
Commit: 0287972
Commits on Oct 24, 2024
Migrate NVText Stemming APIs to pylibcudf (#17085)
Part of #15162 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Bradley Dice (https://github.com/bdice) URL: #17085
Commit: d7cdf44
Upgrade to polars 1.11 in cudf-polars (#17154)
Polars 1.11 is out, with slight updates to the IR, so we can correctly raise for dynamic groupbys and see inequality joins. These changes adapt to that and do a first pass at supporting inequality joins (by translating to cross + filter). A followup (#17000) will use libcudf's conditional joins. Authors: - Lawrence Mitchell (https://github.com/wence-) Approvers: - Bradley Dice (https://github.com/bdice) - Mike Sarahan (https://github.com/msarahan) URL: #17154
Commit: 3a62314
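As background, cudf-polars plugs into polars' lazy engine, so the upgrade is exercised through an ordinary `collect` call; a minimal sketch, assuming cudf-polars is installed alongside polars:

```python
import polars as pl

q = pl.LazyFrame({"a": [1, 2, 3], "b": [4, 5, 6]}).filter(pl.col("a") > 1)

# Execute the query on the GPU; unsupported plans fall back to the
# default CPU engine.
print(q.collect(engine="gpu"))
```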
Remove unused variable in internal merge_tdigests utility (#17151)
Removes an unused variable that contains a host copy of the group_offsets data. This host variable appears to have been made obsolete by a combination of #16897 and #16780. Found while working on #17149. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Muhammad Haseeb (https://github.com/mhaseeb123) - Nghia Truong (https://github.com/ttnghia) URL: #17151
Commit: b75036b
Move `segmented_gather` function from the copying module to the lists module (#17148)
This PR moves `segmented_gather` out of the copying module and into the lists module. It also uses the pylibcudf `segmented_gather` implementation in cudf Python. Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Lawrence Mitchell (https://github.com/wence-) URL: #17148
Commit: 7115f20
Commits on Oct 25, 2024
Fix host-to-device copy missing sync in strings/duration convert (#17149)
Fixes a missing stream sync when copying a temporary host vector to device. The host vector could be destroyed before the copy is completed. Updates the code to use the vector factory function `make_device_uvector_sync()` instead of `cudaMemcpyAsync`. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Bradley Dice (https://github.com/bdice) - Yunsong Wang (https://github.com/PointKernel) URL: #17149
Commit: 03777f6
Deprecate current libcudf nvtext minhash functions (#17152)
Deprecates the current nvtext minhash functions, some of which will be replaced in #16756 with a different signature. The others will no longer be used and will be removed in a future release. The existing gtests and benchmarks will be retained for rework in a future release as well. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Nghia Truong (https://github.com/ttnghia) - Bradley Dice (https://github.com/bdice) URL: #17152
Commit: e98e6b9
Move nvtext ngrams benchmarks to nvbench (#17173)
Moves the `nvtext::generate_ngrams` and `nvtext::generate_character_ngrams` benchmarks from google-bench to nvbench. Target parameters are exposed to help with profiling. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Bradley Dice (https://github.com/bdice) - Yunsong Wang (https://github.com/PointKernel) URL: #17173
Commit: 0bb699e
devcontainer: replace `VAULT_HOST` with `AWS_ROLE_ARN` (#17134)
This PR replaces the `VAULT_HOST` variable with `AWS_ROLE_ARN`. This is required to use the new token service to get AWS credentials. Authors: - Jordan Jacobelli (https://github.com/jjacobelli) Approvers: - Bradley Dice (https://github.com/bdice) - Paul Taylor (https://github.com/trxcllnt) URL: #17134
Commit: 2113bd6
lint: replace `isort` with Ruff's rule I (#16685)
Since #15312 moved formatting from Black to Ruff, it makes sense to also unify import formatting under Ruff, using the built-in `I` rule instead of the additional `isort`. Authors: - Jirka Borovec (https://github.com/Borda) - Kyle Edwards (https://github.com/KyleFromNVIDIA) - Bradley Dice (https://github.com/bdice) Approvers: - Bradley Dice (https://github.com/bdice) - Vyas Ramasubramani (https://github.com/vyasr) - https://github.com/jakirkham URL: #16685
Commit: 5cba4fb
Add to_dlpack/from_dlpack APIs to pylibcudf (#17055)
Contributes to #15162 Could use some advice on how to type the input of `from_dlpack` and the output of `to_dlpack`, which are PyCapsule objects. EDIT: I notice Cython just types them as object https://github.com/cython/cython/blob/master/Cython/Includes/cpython/pycapsule.pxd. Stylistically, do we want to add `object var_name` or just leave them untyped? Authors: - Matthew Roeschke (https://github.com/mroeschke) - Matthew Murray (https://github.com/Matt711) Approvers: - Matthew Murray (https://github.com/Matt711) - Vyas Ramasubramani (https://github.com/vyasr) URL: #17055
Commit: 8bc9f19
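These bindings back cudf's existing public DLPack interchange; a small sketch of that round trip, using CuPy as the capsule producer:

```python
import cupy as cp
import cudf

# Produce a DLPack capsule from a CuPy array...
capsule = cp.arange(5).toDlpack()

# ...and construct a cudf object from it without copying through host.
ser = cudf.from_dlpack(capsule)
print(ser)
```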
Commits on Oct 28, 2024
Use make_device_uvector instead of cudaMemcpyAsync in inplace_bitmask_binop (#17181)
Changes `cudf::detail::inplace_bitmask_binop()` to use `make_device_uvector()` instead of `cudaMemcpyAsync()`. Found while working on #17149. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Bradley Dice (https://github.com/bdice) - Nghia Truong (https://github.com/ttnghia) URL: #17181
Commit: 8c4d1f2
Add compute_mapping_indices used by shared memory groupby (#17147)
This work is part of splitting the original bulk shared memory groupby PR #16619. This PR introduces the `compute_mapping_indices` API, which is used by the shared memory groupby. libcudf will opt for the shared memory code path when the aggregation request is compatible with shared memory, i.e. there is enough shared memory space and no dictionary aggregation requests. Aggregating with shared memory involves two steps. The first step, introduced in this PR, calculates the offset for each input key within the shared memory aggregation storage, as well as the offset when merging the shared memory results into global memory. Authors: - Yunsong Wang (https://github.com/PointKernel) Approvers: - Mark Harris (https://github.com/harrism) - Nghia Truong (https://github.com/ttnghia) URL: #17147
Commit: ef28cdd
Add 2-cpp approvers text to contributing guide [no ci] (#17182)
Adds text to the contributing guide mentioning that 2 cpp-codeowner approvals are required for any C++ changes. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Bradley Dice (https://github.com/bdice) - Nghia Truong (https://github.com/ttnghia) URL: #17182
Commit: a83e1a3
Remove java reservation (#17189)
This removes a file for a feature that we intended to use but never did. The other parts of that feature were already removed, but this was missed. Authors: - Robert (Bobby) Evans (https://github.com/revans2) Approvers: - Nghia Truong (https://github.com/ttnghia) - Bradley Dice (https://github.com/bdice) URL: #17189
Commit: 7b17fbe
build wheels without build isolation (#17088)
Contributes to rapidsai/build-planning#108 Contributes to rapidsai/build-planning#111 Proposes some small packaging/CI changes, matching similar changes being made across RAPIDS. * building `libcudf` wheels with `--no-build-isolation` (for better `sccache` hit rate) * printing `sccache` stats to CI logs * updating to the latest `rapids-dependency-file-generator` (v1.16.0) * always explicitly specifying `cpp` / `python` in calls to `rapids-upload-wheels-to-s3` Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Bradley Dice (https://github.com/bdice) URL: #17088
Commit: abecd0b
Added strings AST vs BINARY_OP benchmarks (#17128)
This merge request implements benchmarks to compare the strings AST and BINARY_OPs. It also moves the common string input generator function out to a common benchmarks header, as it is repeated across other benchmarks. Authors: - Basit Ayantunde (https://github.com/lamarrr) Approvers: - David Wendt (https://github.com/davidwendt) - Yunsong Wang (https://github.com/PointKernel) URL: #17128
Commit: 4c04b7c
Remove includes suggested by include-what-you-use (#17170)
This PR cherry-picks out the suggestions from IWYU generated in #17078. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) - David Wendt (https://github.com/davidwendt) URL: #17170
Commit: 1ad9fc1
Commits on Oct 29, 2024
Check `num_children() == 0` in `Column.from_column_view` (#17193)
This fixes a bug where `Column.from_column_view` is not verifying the existence of a string column's offsets child column prior to accessing it, resulting in a segmentation fault when passing a `column_view` from `Column.view()` to `Column.from_column_view(...)`. The issue can be reproduced with:
```
import cudf
from cudf.core.column.column import as_column

df = cudf.DataFrame({'a': cudf.Series([[]], dtype=cudf.core.dtypes.ListDtype('string'))})
s = df['a']
col = as_column(s)
col2 = cudf._lib.column.Column.back_and_forth(col)
print(col)
print(col2)
```
where `back_and_forth` is defined as:
```
@staticmethod
def back_and_forth(Column input_column):
    cdef column_view input_column_view = input_column.view()
    return Column.from_column_view(input_column_view, input_column)
```
I don't have the expertise to write the appropriate tests for this without introducing the `back_and_forth` function as an API, which seems undesirable. Authors: - Christopher Harris (https://github.com/cwharris) Approvers: - Bradley Dice (https://github.com/bdice) - Vyas Ramasubramani (https://github.com/vyasr) URL: #17193
Commit: bf5b778
Auto assign PR to author (#16969)
I think most PRs remain unassigned, so this PR auto assigns the PR to the PR author. I think this will help keep our project boards up-to-date. Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #16969
Commit: 4b0a634
Fixed unused attribute compilation error for GCC 13 (#17188)
With `decltype(&pclose)` as the destructor type of the `unique_ptr`, gcc makes the signature inherit the attributes of `pclose`. The compiler then ignores this attribute (with a warning) since it doesn't apply in that context, and because we have `-Werror` enabled for ignored attributes, the build fails. This happens on gcc 13.2.0. Authors: - Basit Ayantunde (https://github.com/lamarrr) Approvers: - David Wendt (https://github.com/davidwendt) - Paul Mattione (https://github.com/pmattione-nvidia) - Shruti Shivakumar (https://github.com/shrshi) URL: #17188
Commit: 3775f7b
Support storing `precision` of decimal types in `Schema` class (#17176)
In Spark, the `DecimalType` has a specific number of digits to represent the numbers. However, when creating a data Schema, only the type and name of the column are stored, so we lose that precision information. As such, it would be difficult to reconstruct the original decimal types from cudf's `Schema` instance. This PR adds a `precision` member variable to the `Schema` class in cudf Java, allowing it to store the precision of the original decimal column. Partially contributes to NVIDIA/spark-rapids#11560. Authors: - Nghia Truong (https://github.com/ttnghia) Approvers: - Robert (Bobby) Evans (https://github.com/revans2) URL: #17176
Commit: ddfb284
Add in new java API for raw host memory allocation (#17197)
This is the first patch in a series of patches that should make it so that all Java host memory allocations go through the DefaultHostMemoryAllocator unless another allocator is explicitly provided. This is to make it simpler to track/control host memory usage. Authors: - Robert (Bobby) Evans (https://github.com/revans2) Approvers: - Jason Lowe (https://github.com/jlowe) - Alessandro Bellina (https://github.com/abellina) URL: #17197
Commit: 63b773e
Unified binary_ops and ast benchmarks parameter names (#17200)
This merge request unifies the parameter names of the AST and BINARYOP benchmark suites and makes it easier to perform parameter sweeps and compare the outputs of both benchmarks. Authors: - Basit Ayantunde (https://github.com/lamarrr) Approvers: - Nghia Truong (https://github.com/ttnghia) - David Wendt (https://github.com/davidwendt) URL: #17200
Commit: 52d7e63
[BUG] Replace `repo_token` with `github_token` in Auto Assign PR GHA (#17203)
The Auto Assign GHA workflow fails with this [error](https://github.com/rapidsai/cudf/actions/runs/11580081781). This PR fixes this error. xref #16969 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #17203
Commit: 8d7b0d8
Parquet reader list microkernel (#16538)
This PR refactors fixed-width parquet list reader decoding into its own set of micro-kernels, templatizing the existing fixed-width microkernels. When skipping rows for lists, this will skip ahead the decoding of the definition, repetition, and dictionary rle_streams as well. The list kernel uses 128 threads per block and 71 registers per thread, so I've changed the launch_bounds to enforce a minimum of 8 blocks per SM. This causes a small register spill but the benchmarks are still faster, as seen below.
DEVICE_BUFFER list benchmarks (decompress + decode, not bound by IO):
- run_length 1, cardinality 0, no byte_limit: 24.7% faster
- run_length 32, cardinality 1000, no byte_limit: 18.3% faster
- run_length 1, cardinality 0, 500kb byte_limit: 57% faster
- run_length 32, cardinality 1000, 500kb byte_limit: 53% faster
Compressed list of ints on hard drive: 5.5% faster. Sample real data on hard drive (many columns not lists): 0.5% faster. Authors: - Paul Mattione (https://github.com/pmattione-nvidia) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - https://github.com/nvdbaranec - Nghia Truong (https://github.com/ttnghia) URL: #16538
Commit: eeb4d27
Commits on Oct 30, 2024
Make ai.rapids.cudf.HostMemoryBuffer#copyFromStream public. (#17179)
This is the first PR of [a larger one](NVIDIA/spark-rapids-jni#2532) to introduce a new serialization format. It makes `ai.rapids.cudf.HostMemoryBuffer#copyFromStream` public. For more background, see NVIDIA/spark-rapids-jni#2496 Authors: - Renjie Liu (https://github.com/liurenjie1024) - Jason Lowe (https://github.com/jlowe) Approvers: - Jason Lowe (https://github.com/jlowe) - Alessandro Bellina (https://github.com/abellina) URL: #17179
Commit: 6328ad6
[no ci] Add empty-columns section to the libcudf developer guide (#17183)
Adds a section on `Empty Columns` to the libcudf DEVELOPER_GUIDE. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Basit Ayantunde (https://github.com/lamarrr) - Vukasin Milovanovic (https://github.com/vuule) URL: #17183
Commit: 5ee7d7c
Upgrade nvcomp to 4.1.0.6 (#17201)
This updates cudf to use nvcomp 4.1.0.6. The version is updated in rapids-cmake in rapidsai/rapids-cmake#709. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - James Lamb (https://github.com/jameslamb) - Jake Awe (https://github.com/AyodeAwe) URL: #17201
Commit: 6c2eb4e
Fix bug in recovering invalid lines in JSONL inputs (#17098)
Addresses #16999 Authors: - Shruti Shivakumar (https://github.com/shrshi) - Karthikeyan (https://github.com/karthikeyann) - Nghia Truong (https://github.com/ttnghia) Approvers: - Basit Ayantunde (https://github.com/lamarrr) - Nghia Truong (https://github.com/ttnghia) URL: #17098
Commit: 0b9277b
Add conversion from cudf-polars expressions to libcudf ast for parquet filters (#17141)
Previously, we always applied parquet filters by post-filtering. This negates much of the potential gain from having filters available at read time, namely discarding row groups. To fix this, implement, with the new visitor system of #17016, conversion to pylibcudf expressions. We must distinguish two types of expressions: ones that we can evaluate via `cudf::compute_column`, and the more restricted set of expressions that the parquet reader understands; this is handled by having a state that tracks the usage. The former style will be useful when we implement inequality joins. While here, extend the support in pylibcudf expressions to handle all supported literal types and expose `compute_column` so we can test the correctness of the broader (non-parquet) implementation. Authors: - Lawrence Mitchell (https://github.com/wence-) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #17141
Commit: 7157de7
Fix `to_parquet` append behavior with global metadata file (#17198)
Closes #17177 When appending to a parquet dataset with Dask cuDF, the original metadata must be converted from `pq.FileMetaData` to `bytes` before it can be passed down to `cudf.io.merge_parquet_filemetadata`. Authors: - Richard (Rick) Zamora (https://github.com/rjzamora) Approvers: - Mads R. B. Kristensen (https://github.com/madsbk) URL: #17198
Commit: 5a6d177
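The failing path is the second write below; a minimal sketch, assuming a hypothetical `out/` output directory:

```python
import cudf
import dask_cudf

ddf = dask_cudf.from_cudf(cudf.DataFrame({"a": [1, 2]}), npartitions=1)

# The first write creates the global _metadata file...
ddf.to_parquet("out/", write_metadata_file=True)

# ...and appending must merge the existing pq.FileMetaData into it,
# which is the code path this fix repairs.
ddf.to_parquet("out/", append=True, write_metadata_file=True)
```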
Commits on Oct 31, 2024
Add remaining datetime APIs to pylibcudf (#17143)
Part of #15162 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - https://github.com/brandon-b-miller URL: #17143
Commit: 3cf186c
Add compute_shared_memory_aggs used by shared memory groupby (#17162)
This work is part of splitting the original bulk shared memory groupby PR #16619. This PR introduces the `compute_shared_memory_aggs` API, which is utilized by the shared memory groupby. The shared memory groupby process consists of two main steps. The first step was introduced in #17147, and this PR implements the second step, where the actual aggregations are performed based on the offsets from the first step. Each thread block is designed to handle up to 128 unique keys. If this limit is exceeded, there won't be enough space to store temporary aggregation results in shared memory, so a flag is set to indicate that follow-up global memory aggregations are needed to complete the remaining aggregation requests. Authors: - Yunsong Wang (https://github.com/PointKernel) Approvers: - David Wendt (https://github.com/davidwendt) - Nghia Truong (https://github.com/ttnghia) - Bradley Dice (https://github.com/bdice) URL: #17162
Commit: 0e294b1
Migrate NVText Tokenizing APIs to pylibcudf (#17100)
Part of #15162 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Yunsong Wang (https://github.com/PointKernel) - Muhammad Haseeb (https://github.com/mhaseeb123) - Vyas Ramasubramani (https://github.com/vyasr) URL: #17100
Commit: 893d0fd
Fix some documentation rendering for pylibcudf (#17217)
* Fixed/modified some title headers * Fixed/added pylibcudf section docstrings Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Matthew Murray (https://github.com/Matt711) URL: #17217
Commit: 3f66087
Migrate NVText Byte Pair Encoding APIs to pylibcudf (#17101)
Part of #15162 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #17101
Commit: 0db2463
Migrate hashing operations to `pylibcudf` (#15418)
This PR creates `pylibcudf` hashing APIs and modifies the cuDF Cython to leverage them. cc @vyasr Authors: - https://github.com/brandon-b-miller Approvers: - Yunsong Wang (https://github.com/PointKernel) - Bradley Dice (https://github.com/bdice) - Lawrence Mitchell (https://github.com/wence-) - Vyas Ramasubramani (https://github.com/vyasr) URL: #15418
Commit: a69de57
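One user-facing path that now routes through these bindings is row hashing; a minimal sketch:

```python
import cudf

df = cudf.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})

# Row-wise hash values; murmur3 is the default method.
print(df.hash_values(method="murmur3"))
```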
Migrate NVtext subword tokenizing APIs to pylibcudf (#17096)
Part of #15162 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #17096
Commit: a0711d0
Remove unsanitized nulls from input strings columns in reduction gtests (#17202)
Input strings columns containing unsanitized nulls may result in undefined behavior. This PR fixes the input data to not include string characters in null rows in gtests for `REDUCTION_TESTS`. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Bradley Dice (https://github.com/bdice) - Karthikeyan (https://github.com/karthikeyann) URL: #17202
Commit: 01cfcff
Add jaccard_index to generated cuDF docs (#17199)
Adds the `jaccard_index` API to the generated docs. Also noticed `minhash` is not present, so it is added here as well. Also removed the duplicate `rsplit` entry from the `.rst` file. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Bradley Dice (https://github.com/bdice) - Vyas Ramasubramani (https://github.com/vyasr) URL: #17199
Commit: cafcf6a
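For readers looking the API up from these docs, a short sketch of the strings accessor; the `width` parameter is assumed here to denote the character-shingle length:

```python
import cudf

a = cudf.Series(["the quick brown fox"])
b = cudf.Series(["the slow brown fox"])

# Row-wise Jaccard similarity over 5-character shingles.
print(a.str.jaccard_index(b, width=5))
```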
Move strings::concatenate benchmark to nvbench (#17211)
Moves the `cudf::strings::concatenate` benchmark source from google-bench to nvbench. This also removes the restrictions on the parameters to allow specifying an arbitrary number of rows and string width. Reference #16948 Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Yunsong Wang (https://github.com/PointKernel) - Mark Harris (https://github.com/harrism) - Nghia Truong (https://github.com/ttnghia) URL: #17211
Commit: e512258
Fix `Schema.Builder` does not propagate precision value to `Builder` instance (#17214)
When calling `Schema.Builder.build()`, the value `topLevelPrecision` should be passed into the constructor of the `Schema` class. However, it was forgotten. Authors: - Nghia Truong (https://github.com/ttnghia) Approvers: - Robert (Bobby) Evans (https://github.com/revans2) URL: #17214
Commit: 9657c9a
Add TokenizeVocabulary to api docs (#17208)
Adds the `TokenizeVocabulary` class to the cuDF API guide. Also removes the `SubwordTokenizer` which is to be deprecated in the future. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #17208
Commit: 3db6a0e
Move detail header floating_conversion.hpp to detail subdirectory (#17209)
Moves 'cudf/fixed_point/floating_conversion.hpp' to the `cudf/fixed_point/detail/` subdirectory since it only contains declarations and definitions in the `detail` namespace. It had previously been its own module: https://docs.rapids.ai/api/libcudf/stable/modules.html Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Shruti Shivakumar (https://github.com/shrshi) - Vyas Ramasubramani (https://github.com/vyasr) - Nghia Truong (https://github.com/ttnghia) URL: #17209
Commit: f99ef41
Expose stream-ordering in partitioning API (#17213)
Add stream parameter to public APIs:
```
cudf::partition
cudf::round_robin_partition
```
Added stream gtests for the above two functions and for `hash_partition`. Reference: #13744 Authors: - Shruti Shivakumar (https://github.com/shrshi) Approvers: - Nghia Truong (https://github.com/ttnghia) - Vukasin Milovanovic (https://github.com/vuule) URL: #17213
Commit: f7020f1
Remove `nvtext::load_vocabulary` from pylibcudf (#17220)
This PR follows up #17100 to address the last review here: #17100 (review) Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Matthew Roeschke (https://github.com/mroeschke) - Vyas Ramasubramani (https://github.com/vyasr) URL: #17220
Commit: 02a50e8
Fix groupby.get_group with length-1 tuple with list-like grouper (#17216)
Closes #17187. Adds similar logic as implemented in pandas: https://github.com/pandas-dev/pandas/blob/main/pandas/core/groupby/groupby.py#L751-L758 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #17216
Commit: a83debb
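A minimal sketch of the fixed behavior, mirroring pandas semantics:

```python
import cudf

df = cudf.DataFrame({"a": [1, 1, 2], "b": [10, 20, 30]})
gb = df.groupby(["a"])  # list-like grouper of length 1

# The key is a length-1 tuple; previously this path misbehaved.
print(gb.get_group((1,)))
```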
Fix binop with LHS numpy datetimelike scalar (#17226)
Closes #17087. For binops, cudf tries to convert a 0D numpy array to a numpy scalar via `.dtype.type(value)`, but `.dtype.type` requires other parameters if it's a `numpy.datetime64` or `numpy.timedelta64`. Indexing via `[()]` will perform this conversion correctly. Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #17226
Commit: 6055393
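The conversion trick is plain numpy; a minimal sketch:

```python
import numpy as np

arr0d = np.array("2024-01-01", dtype="datetime64[ns]")  # 0-d array

# dtype.type(value) needs extra parameters for datetime64/timedelta64;
# empty-tuple indexing performs the 0-d -> scalar conversion correctly.
scalar = arr0d[()]
assert isinstance(scalar, np.datetime64)
```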
Support for polars 1.12 in cudf-polars (#17227)
No new updates are required; we simply must no longer xfail a test when running with 1.12. Authors: - Lawrence Mitchell (https://github.com/wence-) - Matthew Murray (https://github.com/Matt711) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #17227
Commit: 0929115
use rapids-generate-pip-constraints to pin to oldest dependencies in CI (#17131)
Follow-up to #16570 (comment) Proposes using the new `rapids-generate-pip-constraints` tool from `gha-tools` to generate a list of pip constraints pinning to the oldest supported versions of dependencies here. ## Notes for Reviewers ### How I tested this rapidsai/gha-tools#114 (comment) You can also see one of the most recent `wheel-tests-cudf` builds here: * oldest-deps: numpy 1.x ([build link](https://github.com/rapidsai/cudf/actions/runs/11615430314/job/32347576688?pr=17131)) * latest-deps: numpy 2.x ([build link](https://github.com/rapidsai/cudf/actions/runs/11615430314/job/32347577095?pr=17131)) Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Bradley Dice (https://github.com/bdice) URL: #17131
Commit: b5b47fe
Commits on Nov 1, 2024
Expose streams in public round APIs (#16925)
Contributes to #13744 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Nghia Truong (https://github.com/ttnghia) - Bradley Dice (https://github.com/bdice) URL: #16925
Commit: 0a87284
Minor I/O code quality improvements (#17105)
This PR makes small improvements to the I/O code. Specifically:
- Place a type constraint on a template class to allow only rvalue arguments. In addition, replace `std::move` with `std::forward` to make the code more *apparently* consistent with the convention, i.e. use `std::move()` on rvalue references and `std::forward` on forwarding references (Effective Modern C++, Item 25).
- Alleviate (but not completely resolve) an existing cuFile driver close issue by removing the explicit driver close call. See #17121
- Minor typo fix (`struct` → `class`).
Authors: - Tianyu Liu (https://github.com/kingcrimsontianyu) Approvers: - Nghia Truong (https://github.com/ttnghia) - Vukasin Milovanovic (https://github.com/vuule) URL: #17105
Commit: 8219d28
Change default KvikIO parameters in cuDF: set the thread pool size to 4, and compatibility mode to ON (#17185)
This PR adjusts the default KvikIO parameters in light of recent discussions.
- Set KvikIO compatibility mode to ON (previously unspecified). This avoids the overhead of KvikIO validating the cuFile library when most of the time clients are not using cuFile/GDS.
- Set the KvikIO thread pool size to 4 (previously 8); see the reason below.
In addition, this PR updates the documentation on `LIBCUDF_CUFILE_POLICY`.
It is reported that Dask-cuDF on an 8-GPU node with a Lustre file system has a major performance regression when the KvikIO thread pool size is 8.

| KVIKIO_NTHREADS | 8 | 4 | 2 | 1 |
|---|---|---|---|---|
| Dask-cuDF time [s] | 16 | 3.9 | 4.0 | 4.3 |
| cuDF time [s] | 3.4 | 3.4 | 3.8 | 4.9 |

Additional benchmarks on Grace Hopper ([Parquet](https://docs.google.com/spreadsheets/d/1ZxuFTcu67kMVpESHwT0Cr-CAeAP7YmLDrcHxNTt22aU), [CSV](https://docs.google.com/spreadsheets/d/1yFLO-cdxG6jjPwHMtoUbPGMXilRaglush2U6KdrEAvA)) indicate no performance regression from switching the thread pool size from 8 to 4. For the time being, we choose 4 as an empirical sweet spot. Closes #16512 Authors: - Tianyu Liu (https://github.com/kingcrimsontianyu) Approvers: - Bradley Dice (https://github.com/bdice) - Nghia Truong (https://github.com/ttnghia) - Vukasin Milovanovic (https://github.com/vuule) URL: #17185
Commit: 6ce9ea4
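These defaults can still be overridden per process through KvikIO's environment variables; a sketch, assuming they are set before cudf is imported:

```python
import os

os.environ["KVIKIO_NTHREADS"] = "8"      # thread pool size
os.environ["KVIKIO_COMPAT_MODE"] = "ON"  # POSIX-only compatibility mode

import cudf  # noqa: E402  # picks up the settings above
```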
Commits on Nov 2, 2024
Add `num_iterations` axis to the multi-threaded Parquet benchmarks (#17231)
Added an axis that controls the number of times each thread reads its input. Running with a higher number of iterations should better show how work from different threads pipelines. The new axis, "num_iterations", is added to all multi-threaded Parquet reader benchmarks. Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - Muhammad Haseeb (https://github.com/mhaseeb123) - Paul Mattione (https://github.com/pmattione-nvidia) URL: #17231
Commit: 3d07509
Commits on Nov 4, 2024
Expose stream-ordering in subword tokenizer API (#17206)
Add stream parameter to public APIs:
```
nvtext::subword_tokenize
nvtext::load_vocabulary_file
```
Added stream gtest. Reference: #13744 Authors: - Shruti Shivakumar (https://github.com/shrshi) - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - Nghia Truong (https://github.com/ttnghia) - David Wendt (https://github.com/davidwendt) - Muhammad Haseeb (https://github.com/mhaseeb123) URL: #17206
Commit: 0d37506
Make HostMemoryBuffer call into the DefaultHostMemoryAllocator (#17204)
This is step 3 in a process of making all Java host memory allocations pluggable under a single allocation API. This is really only used for large memory allocations, which is what matters. This changes the most common Java host memory allocation API to call into the pluggable host memory allocation API. The reason this had to be done in multiple steps is that the Spark plugin code was calling into the common memory allocation API, and memory allocation would end up calling itself recursively.
- Step 1: Create a new API that will not be called recursively (#17197)
- Step 2: Have the Java plugin use that new API instead of the old one to avoid any recursive invocations (NVIDIA/spark-rapids#11671)
- Step 3: Update the common API to use the new backend (this PR)
There are likely to be more steps after this that involve cleaning up and removing APIs that are no longer needed. This is marked as breaking even though it does not break any APIs; it changes the semantics enough that it feels like a breaking change. This is blocked and should not be merged until Step 2 is merged, to avoid breaking the Spark plugin. Authors: - Robert (Bobby) Evans (https://github.com/revans2) Approvers: - Nghia Truong (https://github.com/ttnghia) - Alessandro Bellina (https://github.com/abellina) URL: #17204
Commit: e6f5c0e
Expose mixed and conditional joins in pylibcudf (#17235)
Expose these join types in pylibcudf; they will be useful for implementing inequality joins in cudf-polars. Authors: - Lawrence Mitchell (https://github.com/wence-) Approvers: - Bradley Dice (https://github.com/bdice) - Yunsong Wang (https://github.com/PointKernel) URL: #17235
Commit: 076ad58
Use more pylibcudf.io.types enums in cudf._libs (#17237)
If we consider the `pylibcudf.libcudf` namespace to eventually be more "private", this PR replaces that usage, specifically when accessing enums, with their public counterparts Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Lawrence Mitchell (https://github.com/wence-) URL: #17237
Commit: a2001dd
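A short sketch of the public-enum style this change standardizes on; the specific enum members shown are assumptions based on the corresponding libcudf types:

```python
import pylibcudf as plc

# Access IO enums from the public namespace instead of pylibcudf.libcudf.
compression = plc.io.types.CompressionType.SNAPPY
quoting = plc.io.types.QuoteStyle.MINIMAL
```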
Fix discoverability of submodules inside `pd.util` (#17215)
Fixes: #17166 This PR fixes the discoverability of the submodules of attributes and modules inside `pd.util`. Somehow `importlib.import_module("pandas.util").__dict__` doesn't display submodules, only root-level attributes. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Matthew Murray (https://github.com/Matt711) URL: #17215
Commit: 1d25d14
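The underlying Python behavior is easy to demonstrate: a package's `__dict__` only contains submodules that have already been imported. A minimal sketch:

```python
import importlib
import pkgutil

pkg = importlib.import_module("pandas.util")

# Root-level attributes show up in __dict__...
print("hash_pandas_object" in pkg.__dict__)

# ...but submodules must be discovered separately, e.g. via pkgutil.
print([m.name for m in pkgutil.iter_modules(pkg.__path__)])
```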
Refactor Dask cuDF legacy code (#17205)
The "legacy" DataFrame API is now deprecated (dask/dask#11437). The main purpose of this PR is to start isolating legacy code in Dask cuDF. **Old layout**: ``` dask_cudf/ ├── expr/ │ ├── _collection.py │ ├── _expr.py │ ├── _groupby.py ├── io/ │ ├── tests/ │ ├── ... │ ├── parquet.py │ ├── ... ├── tests/ ├── accessors.py ├── backends.py ├── core.py ├── groupby.py ├── sorting.py ``` **New layout**: ``` dask_cudf/ ├── _expr/ │ ├── accessors.py │ ├── collection.py │ ├── expr.py │ ├── groupby.py ├── _legacy/ │ ├── io/ │ ├── core.py │ ├── groupby.py │ ├── sorting.py ├── io/ │ ├── tests/ │ ├── ... │ ├── parquet.py │ ├── ... ├── tests/ ├── backends.py ├── core.py ``` **Notes** - This PR adds some backward compatibility to the expr-based API that was previously missing: The user can now import collection classes from `dask_cudf.core` (previously led to a "silent" bug when query-planning was enabled). - The user can also import various IO functions from `dask_cudf.io` (and sub-modules like `dask_cudf.io.parquet`), but they will typically get a deprecation warning. - This PR is still technically "breaking" in the sense that the user can no longer import *some* functions/classes from `dask_cudf.io.*`. Also, the `groupby`, `sorting`, and `accessors` modules have simply moved. It *should* be uncommon for down-stream code to import from these modules. It's also worth noting that query-planning was already causing problems for these users if they *were* doing this. Authors: - Richard (Rick) Zamora (https://github.com/rjzamora) Approvers: - Mads R. B. Kristensen (https://github.com/madsbk) URL: #17205
Commit: 45563b3
Commits on Nov 5, 2024
Separate evaluation logic from `IR` objects in cudf-polars (#17175)
Closes #17127.
- This PR implements the proposal in #17127.
- This change technically "breaks" with the existing `IR.evaluate` convention.
Authors: - Richard (Rick) Zamora (https://github.com/rjzamora) - Lawrence Mitchell (https://github.com/wence-) Approvers: - Bradley Dice (https://github.com/bdice) URL: #17175
Commit: 9d5041c
Deprecate single component extraction methods in libcudf (#17221)
This PR deprecates the single component extraction methods (e.g. `cudf::datetime::extract_year`) that are already covered by `cudf::datetime::extract_datetime_component`. xref #17143 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - David Wendt (https://github.com/davidwendt) - Karthikeyan (https://github.com/karthikeyann) URL: #17221
Commit: ac5b3ed
Commits on Nov 6, 2024
Search for kvikio with lowercase (#17243)
The case-sensitive name KvikIO will throw off `find_package` searches, particularly after rapidsai/devcontainers#414 makes the usage consistent in devcontainers.
Commit: adf3269
Disallow cuda-python 12.6.1 and 11.8.4 (#17253)
Due to a bug in cuda-python we must disallow cuda-python 12.6.1 and 11.8.4. This PR disallows those versions. It also silences new cuda-python deprecation warnings so that our test suite passes. See rapidsai/build-planning#116 for more information. --------- Co-authored-by: James Lamb <[email protected]>
Commit: 06b3f83
Commits on Nov 7, 2024
KvikIO shared library (#17239)
Update cudf to use the new KvikIO shared library: rapidsai/kvikio#527
#### Tasks
- [x] Wait for the [KvikIO shared library PR](rapidsai/kvikio#527) to be merged.
- [x] Revert the use of the [KvikIO shared library](rapidsai/kvikio#527) in CI: 2d8eeaf.
Authors: - Mads R. B. Kristensen (https://github.com/madsbk) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - James Lamb (https://github.com/jameslamb) URL: #17239
Commit: 57900de
Put a ceiling on cuda-python (#17264)
Follow-up to #17253 Contributes to rapidsai/build-planning#116 That PR used `!=` requirements to skip a particular version of `cuda-python` that `cudf` and `pylibcudf` were incompatible with. A newer version of `cuda-python` (12.6.2 for CUDA 12, 11.8.5 for CUDA 11) was just released, and it also causes some build issues for RAPIDS libraries: rapidsai/cuvs#445 (comment) To unblock CI across RAPIDS, this proposes **temporarily** switching to ceilings on the `cuda-python` dependency here. Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Bradley Dice (https://github.com/bdice) URL: #17264
Commit: 29484cb
Fix the example in documentation for `get_dremel_data()` (#17242)
Closes #11396. Fixes the example in the documentation of `get_dremel_data()`. Authors: - Muhammad Haseeb (https://github.com/mhaseeb123) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - David Wendt (https://github.com/davidwendt) - Vukasin Milovanovic (https://github.com/vuule) - Mike Wilson (https://github.com/hyperbolic2346) - MithunR (https://github.com/mythrocks) URL: #17242
Commit: bbd3b43
Move strings/numeric convert benchmarks to nvbench (#17255)
Moves the `cpp/benchmarks/string/convert_numerics.cpp` and `cpp/benchmarks/string/convert_fixed_point.cpp` benchmark implementations from google-bench to nvbench. Authors: - David Wendt (https://github.com/davidwendt) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Shruti Shivakumar (https://github.com/shrshi) - Vyas Ramasubramani (https://github.com/vyasr) URL: #17255
Commit: e29e0ab
Added ast tree to simplify expression lifetime management (#17156)
This merge request follows up on #10744. It attempts to simplify managing expressions by adding a class called an ast tree. The ast tree manages and holds related expressions together. When the tree is destroyed, all the expressions are also destroyed. Ideally we would use a bump allocator for allocating the expressions instead of `std::vector<std::unique_ptr<expression>>`. We'd also ideally use a `cuda::std::inplace_vector` for storing the operands of the `operation` class, but that's in a newer version of CCCL. Authors: - Basit Ayantunde (https://github.com/lamarrr) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Lawrence Mitchell (https://github.com/wence-) - Bradley Dice (https://github.com/bdice) - Karthikeyan (https://github.com/karthikeyann) URL: #17156
Commit: 4cbc15a
cudf-polars string/numeric casting (#17076)
Depends on #16991 Part of #17060 Implements cross-casting from string <-> numeric types in `cudf-polars`. Authors: - https://github.com/brandon-b-miller - Matthew Murray (https://github.com/Matt711) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Lawrence Mitchell (https://github.com/wence-) - Muhammad Haseeb (https://github.com/mhaseeb123) - Matthew Murray (https://github.com/Matt711) URL: #17076
Commit: e4c52dd
Fix extract-datetime deprecation warning in ndsh benchmark (#17254)
Fixes deprecation warning introduced by #17221:
```
[165+3+59=226] Building CXX object benchmarks/CMakeFiles/NDSH_Q09_NVBENCH.dir/ndsh/q09.cpp.o
/cudf/cpp/benchmarks/ndsh/q09.cpp: In function 'void run_ndsh_q9(nvbench::state&, std::unordered_map<std::__cxx11::basic_string<char>, cuio_source_sink_pair>&)':
/cudf/cpp/benchmarks/ndsh/q09.cpp:148:33: warning: 'std::unique_ptr<cudf::column> cudf::datetime::extract_year(const cudf::column_view&, rmm::cuda_stream_view, rmm::device_async_resource_ref)' is deprecated [-Wdeprecated-declarations]
  148 |   auto o_year = cudf::datetime::extract_year(joined_table->column("o_orderdate"));
      |                                 ^~~~~~~~~~~~
In file included from /cudf/cpp/benchmarks/ndsh/q09.cpp:21:
/cudf/cpp/include/cudf/datetime.hpp:70:46: note: declared here
   70 | [[deprecated]] std::unique_ptr<cudf::column> extract_year(
      |                                              ^~~~~~~~~~~~
/cudf/cpp/benchmarks/ndsh/q09.cpp:148:45: warning: 'std::unique_ptr<cudf::column> cudf::datetime::extract_year(const cudf::column_view&, rmm::cuda_stream_view, rmm::device_async_resource_ref)' is deprecated [-Wdeprecated-declarations]
  148 |   auto o_year = cudf::datetime::extract_year(joined_table->column("o_orderdate"));
      |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /cudf/cpp/benchmarks/ndsh/q09.cpp:21:
/cudf/cpp/include/cudf/datetime.hpp:70:46: note: declared here
   70 | [[deprecated]] std::unique_ptr<cudf::column> extract_year(
      |                                              ^~~~~~~~~~~~
```
Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Karthikeyan (https://github.com/karthikeyann) - Shruti Shivakumar (https://github.com/shrshi) URL: #17254
Commit: 1981445
Refactor gather/scatter benchmarks for strings (#17223)
Combines the `benchmarks/string/copy.cu` and `benchmarks/string/gather.cpp` source files which both had separate gather benchmarks for strings. The result is a new `copy.cpp` that has both gather and scatter benchmarks. Also changes the default parameters to remove the need to restrict the values. Authors: - David Wendt (https://github.com/davidwendt) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) - Yunsong Wang (https://github.com/PointKernel) - Basit Ayantunde (https://github.com/lamarrr) URL: #17223
Commit: 67c71e2
AWS S3 IO through KvikIO (#16499)
Implement remote IO read using KvikIO's S3 backend. For now, this is an experimental feature for parquet read only. Enable by defining `CUDF_KVIKIO_REMOTE_IO=ON`. Authors: - Mads R. B. Kristensen (https://github.com/madsbk) - Richard (Rick) Zamora (https://github.com/rjzamora) Approvers: - Paul Mattione (https://github.com/pmattione-nvidia) - Vukasin Milovanovic (https://github.com/vuule) - Shruti Shivakumar (https://github.com/shrshi) - Richard (Rick) Zamora (https://github.com/rjzamora) - GALI PREM SAGAR (https://github.com/galipremsagar) URL: #16499
Commit: 08e4853
Add io.text APIs to pylibcudf (#17232)
Contributes to #15162 Authors: - Matthew Roeschke (https://github.com/mroeschke) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Lawrence Mitchell (https://github.com/wence-) URL: #17232
Commit: c209dae
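These bindings sit underneath cudf's public `read_text`; a minimal sketch, where the file name is hypothetical:

```python
import cudf

# Split "records.txt" into one row per newline-delimited record, using
# the libcudf multibyte_split machinery exposed by these bindings.
series = cudf.read_text("records.txt", delimiter="\n")
```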
Add support for `pyarrow-18` (#17256)
This PR raises the maximum allowed `pyarrow` version to 18. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #17256
Commit: 2db58d5
Process parquet bools with microkernels (#17157)
This adds support for the bool type to reading parquet microkernels. Both plain (bit-packed) and RLE-encoded bool decode is supported, using separate code paths. This PR also massively reduces boilerplate code, as most of the template info needed is already encoded in the kernel mask. Also the superfluous level_t template parameter on rle_run has been removed. And bools have been added to the parquet benchmarks. Performance: register count drops from 62 -> 56, both plain and RLE-encoded bool decoding are now 46% faster (uncompressed). Reading sample customer data shows no change. NDS tests show no change. Authors: - Paul Mattione (https://github.com/pmattione-nvidia) Approvers: - Yunsong Wang (https://github.com/PointKernel) - https://github.com/nvdbaranec - Vukasin Milovanovic (https://github.com/vuule) URL: #17157
Commit: 5147882
Move strings to date/time types benchmarks to nvbench (#17229)
Moves the `cpp/benchmarks/string/convert_datetime.cpp` and `cpp/benchmarks/string/convert_duration.cpp` benchmark implementations from google-bench to nvbench. Authors: - David Wendt (https://github.com/davidwendt) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - Karthikeyan (https://github.com/karthikeyann) URL: #17229
Commit: 64c72fc
Use `pylibcudf.strings.convert.convert_integers.is_integer` in cudf python (#17270)
Part of #15162 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Matthew Roeschke (https://github.com/mroeschke) URL: #17270
Commit: 773aefc
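The user-visible entry point is the strings accessor; a minimal sketch:

```python
import cudf

s = cudf.Series(["123", "-45", "1.5", "abc"])

# Now backed by the pylibcudf is_integer binding.
print(s.str.isinteger())  # [True, True, False, False]
```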
Use pylibcudf.search APIs in cudf python (#17271)
Part of #15162 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Matthew Roeschke (https://github.com/mroeschke) URL: #17271
Commit: c73defd
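A small sketch of the pylibcudf usage, assuming `contains` takes haystack and needles columns, as the underlying libcudf API does:

```python
import pyarrow as pa
import pylibcudf as plc

haystack = plc.interop.from_arrow(pa.array([1, 2, 3, 4]))
needles = plc.interop.from_arrow(pa.array([2, 5]))

# Boolean column: True where each needle occurs in the haystack.
result = plc.search.contains(haystack, needles)
```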
Mark column chunks in a PQ reader `pass` as large strings when the cumulative `offsets` exceeds the large strings threshold (#17207)
This PR implements a method to correctly set the large-string property for column chunks in the Chunked Parquet Reader subpass if the cumulative string offsets have exceeded the large strings threshold. Fixes #17158 Authors: - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - Nghia Truong (https://github.com/ttnghia) - Vukasin Milovanovic (https://github.com/vuule) - David Wendt (https://github.com/davidwendt) URL: #17207
Commit: e52ce85
Commits on Nov 8, 2024
Add optional column_order in JSON reader (#17029)
This PR adds an optional column order to enforce column order in the output. This feature is required by Spark `from_json`. An optional `column_order` is added to `schema_element`, and it is validated during reader-option creation. The column order can be specified at the root level and for any struct at any level.
- For the root level, the dtypes should be a `schema_element` with type STRUCT (`schema_element` is added to variant dtypes).
- For nested levels, `column_order` can be specified for any STRUCT type (could be a map of `schema_element`, or a `schema_element`).
If the column order is not specified, the order of columns is the same as the order in which they appear in the JSON file. Closes #17240 (metadata updated) Closes #17091 (will return all-nulls column if not present in input JSON) Closes #17090 (fixed with new schema_element as dtype) Closes #16799 (output columns are created from column_order if present) Authors: - Karthikeyan (https://github.com/karthikeyann) - Nghia Truong (https://github.com/ttnghia) Approvers: - Yunsong Wang (https://github.com/PointKernel) - Nghia Truong (https://github.com/ttnghia) - Vukasin Milovanovic (https://github.com/vuule) URL: #17029
Commit: b3b5ce9
Allow generating large strings in benchmarks (#17224)
Updates the benchmark utility `create_random_utf8_string_column` to support large strings. Replaces the hardcoded `size_type` offsets with the offsetalator and related utilities. Reference #16948 Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Muhammad Haseeb (https://github.com/mhaseeb123) - MithunR (https://github.com/mythrocks) URL: #17224
Commit: 1777c29
-
Fix data_type ctor call in JSON_TEST (#17273)
Fixes call to `data_type{}` ctor in `json_test.cpp`. The 2-parameter ctor is for fixed-point-types only and will assert in a debug build if used incorrectly: https://github.com/rapidsai/cudf/blob/2db58d58b4a986c2c6fad457f291afb1609fd458/cpp/include/cudf/types.hpp#L277-L280 Partial stack trace from a gdb run ``` #5 0x000077b1530bc71b in __assert_fail_base (fmt=0x77b153271130 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x58c3e4baaa98 "id == type_id::DECIMAL32 || id == type_id::DECIMAL64 || id == type_id::DECIMAL128", file=0x58c3e4baaa70 "/cudf/cpp/include/cudf/types.hpp", line=279, function=<optimized out>) at ./assert/assert.c:92 #6 0x000077b1530cde96 in __GI___assert_fail ( assertion=0x58c3e4baaa98 "id == type_id::DECIMAL32 || id == type_id::DECIMAL64 || id == type_id::DECIMAL128", file=0x58c3e4baaa70 "/cudf/cpp/include/cudf/types.hpp", line=279, function=0x58c3e4baaa38 "cudf::data_type::data_type(cudf::type_id, int32_t)") at ./assert/assert.c:101 #7 0x000058c3e48ba594 in cudf::data_type::data_type (this=0x7fffdd3f7530, id=cudf::type_id::STRING, scale=0) at /cudf/cpp/include/cudf/types.hpp:279 #8 0x000058c3e49215d9 in JsonReaderTest_MixedTypesWithSchema_Test::TestBody (this=0x58c3e5ea13a0) at /cudf/cpp/tests/io/json/json_test.cpp:2887 ``` Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Karthikeyan (https://github.com/karthikeyann) - Muhammad Haseeb (https://github.com/mhaseeb123) URL: #17273
Commit: 3c5f787
-
Plumb pylibcudf datetime APIs through cudf python (#17275)
Part of #15162 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Matthew Roeschke (https://github.com/mroeschke) URL: #17275
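A small sketch of the kind of functionality now routed through pylibcudf's datetime module (public cudf API shown; the component extraction underneath is pylibcudf's `extract_datetime_component`):

```python
import cudf

s = cudf.Series(cudf.date_range("2024-01-31", periods=3, freq="D"))
# Each accessor extracts one datetime component on the GPU
print(s.dt.year)       # 2024, 2024, 2024
print(s.dt.day)        # 31, 1, 2
print(s.dt.dayofweek)  # Wed=2, Thu=3, Fri=4
```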
Commit: 18041b5
-
This PR adds [`include-what-you-use`](https://github.com/include-what-you-use/include-what-you-use/) to the CI job running clang-tidy. Like clang-tidy, IWYU runs via CMake integration and only runs on cpp files, not cu files. This should help us shrink binaries and reduce compilation times in cases where headers are being included unnecessarily, and it helps keep our include lists clean. The IWYU suggestions for additions are quite noisy and the team determined this to be unnecessary, so this PR instead post-filters the outputs to only show the removals. The final suggestions are uploaded to a file that is uploaded to the GHA page so that it can be downloaded, inspected, and easily applied locally. Resolves #581. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Mark Harris (https://github.com/harrism) - David Wendt (https://github.com/davidwendt) - Yunsong Wang (https://github.com/PointKernel) - James Lamb (https://github.com/jameslamb) - Karthikeyan (https://github.com/karthikeyann) URL: #17078
Commit: 7b80a44
-
Rewrite Java API `Table.readJSON` to return the output from libcudf `read_json` directly (#17180)
With this PR, `Table.readJSON` will return the output from libcudf `read_json` directly, without needing to reorder the columns to match the input schema or to generate all-nulls columns for the ones in the input schema that do not exist in the JSON data. This is because libcudf `read_json` already does both, so we no longer have to. Depends on: * #17029 Partially contributes to NVIDIA/spark-rapids#11560. Closes #17002 Authors: - Nghia Truong (https://github.com/ttnghia) Approvers: - Robert (Bobby) Evans (https://github.com/revans2) URL: #17180
Commit: e8935b9
-
Implement inequality joins by translation to conditional joins (#17000)
Implement inequality joins by using the newly-exposed conditional join from pylibcudf. - Closes #16926 Authors: - Lawrence Mitchell (https://github.com/wence-) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Matthew Roeschke (https://github.com/mroeschke) URL: #17000
Commit: 150d8d8
-
Wrap custom iterator result (#17251)
Fixes: #17165 Fixes: #14481 This PR properly wraps the result of the custom iterator. ```python In [2]: import pandas as pd In [3]: s = pd.Series([10, 1, 2, 3, 4, 5]*1000000) # Without custom_iter: In [4]: %timeit for i in s: True 6.34 s ± 25.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) # This PR: In [4]: %timeit for i in s: True 6.16 s ± 17.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) # On `branch-24.12`: 1.53 s ± 6.27 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) ``` I think `custom_iter` has to exist. Here is why: invoking any sort of iteration on GPU objects will raise errors, and thus in the end we fall back to CPU. Instead of trying to move the objects from host to device memory (if the object is on host memory only), we will avoid a CPU-to-GPU transfer. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Matthew Murray (https://github.com/Matt711) URL: #17251
Commit: 0f1ae26
-
Make constructor of DeviceMemoryBufferView public (#17265)
Make constructor of DeviceMemoryBufferView and ContiguousTable public. Authors: - Renjie Liu (https://github.com/liurenjie1024) Approvers: - Jason Lowe (https://github.com/jlowe) URL: #17265
Commit: 263a7ff
-
remove WheelHelpers.cmake (#17276)
Related to rapidsai/build-planning#33 and rapidsai/build-planning#74 The last use of CMake function `install_aliased_imported_targets()` here was removed in #16946. This proposes removing the file holding its definition. Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Kyle Edwards (https://github.com/KyleFromNVIDIA) URL: #17276
Commit: c46cf76
-
Switch to using `TaskSpec` (#17285)
dask/dask-expr#1159 made upstream changes in `dask-expr` to use `TaskSpec`; this PR updates `dask-cudf` to be compatible with those changes. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Richard (Rick) Zamora (https://github.com/rjzamora) URL: #17285
Commit: 990734f
-
Improve the performance of low cardinality groupby (#16619)
This PR enhances groupby performance for low-cardinality input cases. When applicable, it leverages shared memory for initial aggregation, followed by global memory aggregation to reduce atomic contention and improve performance. Authors: - Yunsong Wang (https://github.com/PointKernel) - Mike Wilson (https://github.com/hyperbolic2346) Approvers: - David Wendt (https://github.com/davidwendt) - Mike Wilson (https://github.com/hyperbolic2346) - Vyas Ramasubramani (https://github.com/vyasr) URL: #16619
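The win applies to workloads whose key column has few distinct values. A sketch of such a workload (the shared-memory staging is chosen internally by libcudf; the Python API is unchanged):

```python
import numpy as np
import cudf

n = 10_000_000
df = cudf.DataFrame({
    "key": np.random.randint(0, 64, n),  # low cardinality: 64 groups
    "val": np.random.rand(n),
})
# Aggregations are first accumulated per thread block in shared memory,
# then combined in global memory, cutting atomic contention.
out = df.groupby("key").agg({"val": ["sum", "count"]})
```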
Commit: 2e0d2d6
-
Add `cudf::calendrical_month_sequence` to pylibcudf (#17277)
Part of #15162. Also adds tests for `pylibcudf.filling`. Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Matthew Roeschke (https://github.com/mroeschke) URL: #17277
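For orientation, month-stepping sequences are reachable from the public cudf API as well (a sketch; whether `date_range` routes through this exact libcudf primitive is an implementation detail):

```python
import cudf

# Four timestamps stepped by one calendar month each; month stepping
# respects varying month lengths, unlike a fixed timedelta.
print(cudf.date_range(start="2024-01-31", periods=4,
                      freq=cudf.DateOffset(months=1)))
```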
Commit: d295f17
-
Add read_parquet_metadata to pylibcudf (#17245)
Contributes to #15162 Authors: - Matthew Roeschke (https://github.com/mroeschke) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Lawrence Mitchell (https://github.com/wence-) URL: #17245
Commit: fea46cd
-
Follow up making Python tests more deterministic (#17272)
Addressing comments in https://github.com/rapidsai/cudf/pull/17008/files#r1823318321 and https://github.com/rapidsai/cudf/pull/17008/files#r1823318898 Didn't touch the `_fuzz_testing` directory because maybe we don't want that to be deterministic? Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Matthew Murray (https://github.com/Matt711) - GALI PREM SAGAR (https://github.com/galipremsagar) - James Lamb (https://github.com/jameslamb) - Vyas Ramasubramani (https://github.com/vyasr) URL: #17272
Commit: db69c52
Commits on Nov 9, 2024
-
Use numba-cuda<0.0.18 (#17280)
Numba-cuda 0.0.18 (not yet released) contains some changes that might break pynvjitlink patching. To avoid breaking RAPIDS CI while working through that after numba-cuda 0.0.18 is released but before the next pynvjitlink, this PR makes numba-cuda 0.0.17 or earlier a requirement. Authors: - Graham Markall (https://github.com/gmarkall) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - https://github.com/brandon-b-miller - Vyas Ramasubramani (https://github.com/vyasr) URL: #17280
Commit: 0fc5fab
-
Use pylibcudf enums in cudf Python quantile (#17287)
Shouldn't need to use the "private" `pylibcudf.libcudf` types anymore now that the Python side enums are exposed Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Matthew Murray (https://github.com/Matt711) URL: #17287
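A minimal sketch of the difference (enum and type names below are the public pylibcudf spellings; treat the exact attribute paths as assumptions):

```python
import pylibcudf as plc

# Public Python-side enums, instead of reaching into pylibcudf.libcudf
dtype = plc.DataType(plc.types.TypeId.FLOAT64)
interp = plc.types.Interpolation.LINEAR
print(dtype.id() == plc.types.TypeId.FLOAT64)  # True
```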
Commit: e399e95
-
Use more pylibcudf Python enums in cudf._lib (#17288)
Similar to #17287. Also remove a `plc` naming shadowing Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Matthew Murray (https://github.com/Matt711) URL: #17288
Commit: 7a499f6
-
Expose delimiter character in JSON reader options to JSON reader APIs (#17266)
Fixes #17261 Removes delimiter symbol group from whitespace normalization FST since it is run post-tokenization. Authors: - Shruti Shivakumar (https://github.com/shrshi) - Nghia Truong (https://github.com/ttnghia) - Karthikeyan (https://github.com/karthikeyann) Approvers: - Nghia Truong (https://github.com/ttnghia) - David Wendt (https://github.com/davidwendt) - Karthikeyan (https://github.com/karthikeyann) URL: #17266
Commit: 5cbdcd0
Commits on Nov 12, 2024
-
Fix `Dataframe.__setitem__` slow-downs (#17222)
Fixes: #17140 This PR fixes slow-downs in `DataFrame.__setitem__` by properly passing in CPU objects where needed, instead of passing a GPU object and then failing and performing a GPU -> CPU transfer. The first argument of `DataFrame.__setitem__` can be a column (pd.Index); in our fast path this is converted to `cudf.Index`, so the cudf side fails and the transfer to CPU + slow path executes, which is the primary reason for the slowdown. This PR maintains a dict mapping of such special functions where we shouldn't convert the objects to the fast path. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Matthew Murray (https://github.com/Matt711) URL: #17222
Commit: 84743c3
-
Expose streams in public quantile APIs (#17257)
Adds stream parameter to ``` cudf::quantile cudf::quantiles cudf::percentile_approx ``` Added stream gtests to verify correct stream forwarding. Reference: #13744 Authors: - Shruti Shivakumar (https://github.com/shrshi) Approvers: - Paul Mattione (https://github.com/pmattione-nvidia) - David Wendt (https://github.com/davidwendt) URL: #17257
Commit: 61031cc
-
cmake option: `CUDF_KVIKIO_REMOTE_IO` (#17291)
Compile flag to enable/disable remote IO through KvikIO: `CUDF_KVIKIO_REMOTE_IO` Authors: - Mads R. B. Kristensen (https://github.com/madsbk) Approvers: - Tianyu Liu (https://github.com/kingcrimsontianyu) - Bradley Dice (https://github.com/bdice) URL: #17291
Commit: bdddab3
-
Replace workaround of JNI build with CUDF_KVIKIO_REMOTE_IO=OFF (#17293)
The JNI build does not require KvikIO; to unblock the build, use `CUDF_KVIKIO_REMOTE_IO=OFF` in the C++ build phase. This should be merged after #17291. Authors: - Peixin (https://github.com/pxLi) Approvers: - Nghia Truong (https://github.com/ttnghia) URL: #17293
Commit: 202c231
-
[FEA] Report all unsupported operations for a query in cudf.polars (#16960)
Closes #16690. The purpose of this PR is to list all of the unique operations that are unsupported by `cudf.polars` when running a query. 1. Question: How to traverse the tree to report the error nodes? Should this be done upstream in Polars? 2. Instead of traversing the query afterwards, we should probably catch each unsupported feature as we translate the IR. Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Lawrence Mitchell (https://github.com/wence-) URL: #16960
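A hedged sketch of how the collected report surfaces to a user (`raise_on_fail=True` turns silent CPU fallback into an error; the specific unsupported expression and the concrete error type are assumptions for illustration):

```python
import polars as pl

q = pl.LazyFrame({"a": [1.0, 2.0, 3.0]}).select(
    pl.col("a").rolling_mean(2)  # assumed unsupported, for illustration
)
try:
    q.collect(engine=pl.GPUEngine(raise_on_fail=True))
except Exception as err:
    # With this PR the error lists every unsupported operation found
    # during IR translation, not just the first one encountered.
    print(err)
```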
Commit: 043bcbd
-
Add new nvtext minhash_permuted API (#16756)
Introduce new nvtext minhash API that takes a single seed for hashing and 2 parameter vectors to calculate the minhash results from the seed hash: ``` std::unique_ptr<cudf::column> minhash_permuted( cudf::strings_column_view const& input, uint32_t seed, cudf::device_span<uint32_t const> parameter_a, cudf::device_span<uint32_t const> parameter_b, cudf::size_type width, rmm::cuda_stream_view stream, rmm::device_async_resource_ref mr); ``` The `seed` is used to hash the `input` using rolling set of substrings `width` characters wide. The hashes are then combined with the values in `parameter_a` and `parameter_b` to calculate a set of 32-bit (or 64-bit) values for each row. Only the minimum value is returned per element of `a` and `b` when combined with all the hashes for a row. Each output row is a set of M values where `M = parameter_a.size() = parameter_b.size()` This implementation is significantly faster than the current minhash which computes hashes for multiple seeds. Included in this PR is also the `minhash64_permuted()` API that is identical but uses 64-bit values for the seed and the parameter values. Also included are new tests and a benchmark as well as the pylibcudf and cudf interfaces. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Matthew Murray (https://github.com/Matt711) - Lawrence Mitchell (https://github.com/wence-) - Karthikeyan (https://github.com/karthikeyann) - Yunsong Wang (https://github.com/PointKernel) URL: #16756
Commit: ccfc95a
-
Add type stubs for pylibcudf (#17258)
Having looked at a bunch of the automation options, I just did it by hand. A followup will add some automation to add docstrings (so we can see those via LSP integration in editors) and do some simple validation. - Closes #15190 Authors: - Lawrence Mitchell (https://github.com/wence-) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Matthew Murray (https://github.com/Matt711) URL: #17258
Commit: 7682edb
-
Add cudf::strings::contains_multiple (#16900)
Add new `cudf::strings::contains_multiple` API to search for multiple targets within a strings column. The output is a table where the number of columns equals the number of targets, and each row holds booleans indicating whether the corresponding target was found in that row. This PR is to help in collaboration with #16641 Authors: - David Wendt (https://github.com/davidwendt) - GALI PREM SAGAR (https://github.com/galipremsagar) - Chong Gao (https://github.com/res-life) - Bradley Dice (https://github.com/bdice) Approvers: - Chong Gao (https://github.com/res-life) - Yunsong Wang (https://github.com/PointKernel) - MithunR (https://github.com/mythrocks) - Tianyu Liu (https://github.com/kingcrimsontianyu) - Bradley Dice (https://github.com/bdice) URL: #16900
Commit: 796de4b
-
enforce wheel size limits, README formatting in CI (#17284)
Contributes to rapidsai/build-planning#110 Proposes adding 2 types of validation on wheels in CI, to ensure we continue to produce wheels that are suitable for PyPI. * checks on wheel size (compressed), - *to be sure they're under PyPI limits* - *and to prompt discussion on PRs that significantly increase wheel sizes* * checks on README formatting - *to ensure they'll render properly as the PyPI project homepages* - *e.g. like how https://github.com/scikit-learn/scikit-learn/blob/main/README.rst becomes https://pypi.org/project/scikit-learn/* ## Notes for Reviewers ### How I tested this Initially set the size threshold for `libcudf` to a value that I knew it'd violate (75MB compressed, when the wheels are 400+ MB compressed). Saw CI fail as expected, and print a summary with the expected contents. ```text checking 'final_dist/libcudf_cu11-24.12.0a333-py3-none-manylinux_2_28_aarch64.whl' ----- package inspection summary ----- file size * compressed size: 0.4G * uncompressed size: 0.6G * compression space saving: 34.6% contents * directories: 164 * files: 1974 (2 compiled) size by extension * .so - 0.6G (97.0%) * .h - 6.7M (1.0%) * no-extension - 4.8M (0.7%) * .cuh - 3.8M (0.6%) * .hpp - 2.2M (0.3%) * .a - 1.1M (0.2%) * .inl - 0.8M (0.1%) * .cmake - 0.1M (0.0%) * .md - 8.3K (0.0%) * .py - 4.0K (0.0%) * .pc - 0.2K (0.0%) * .txt - 34.0B (0.0%) largest files * (0.6G) libcudf/lib64/libcudf.so * (3.3M) libcudf/bin/flatc * (1.0M) libcudf/lib64/libflatbuffers.a * (0.5M) libcudf/include/libcudf/rapids/libcudacxx/cuda/std/__atomic/functions/cuda_ptx_generated.h * (0.2M) libcudf_cu11-24.12.0a333.dist-info/RECORD ------------ check results ----------- 1. [distro-too-large-compressed] Compressed size 0.4G is larger than the allowed size (75.0M). errors found while checking: 1 ``` ([build link](https://github.com/rapidsai/cudf/actions/runs/11748370606/job/32732391718?pr=17284#step:13:3062)) Updated that threshold in `python/libcudf/pyproject.toml`, and saw the build succeed (but the summary still printed). # Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Bradley Dice (https://github.com/bdice) URL: #17284
Commit: 1f9ad2f
-
Polars 1.13 is out, so add support for that. I needed to change some of the logic in the callback raising after @Matt711's changes; I am not sure why tests were passing previously. Authors: - Lawrence Mitchell (https://github.com/wence-) Approvers: - Matthew Murray (https://github.com/Matt711) - GALI PREM SAGAR (https://github.com/galipremsagar) - Vyas Ramasubramani (https://github.com/vyasr) URL: #17299
Commit: bbaa1ab
-
Always prefer `device_read`s and `device_write`s when kvikIO is enabled (#17260)
Issue #17259 Avoid checking the `_gds_read_preferred_threshold` threshold when deciding whether `device_read`/`device_write` is preferred to host IO + copy. The reasons are twofold: 1. KvikIO already has an internal threshold for GDS use, so we don't need to check on our end as well. 2. Without actual GDS use, KvikIO uses a pinned bounce buffer to efficiently copy to/from the device. Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - Tianyu Liu (https://github.com/kingcrimsontianyu) - Basit Ayantunde (https://github.com/lamarrr) URL: #17260
Commit: 487f97c
Commits on Nov 13, 2024
-
Raise errors on specific types of fallback in `cudf.pandas` (#17268)
Closes #14975 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #17268
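A sketch of opting in (the environment variable name `CUDF_PANDAS_FAIL_ON_FALLBACK` is my reading of this change and should be treated as an assumption):

```python
import os
# Must be set before cudf.pandas is loaded; assumed knob from this PR
os.environ["CUDF_PANDAS_FAIL_ON_FALLBACK"] = "True"

import cudf.pandas
cudf.pandas.install()
import pandas as pd

s = pd.Series([1, 2, 3])
# Operations that would silently fall back to CPU pandas now raise,
# making fallback visible in performance-sensitive code paths.
```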
Commit: 76a5e32
-
Expose stream-ordering in public transpose API (#17294)
Adds stream parameter to `cudf::transpose`. Verifies correct stream forwarding with stream gtests. Reference: #13744 Authors: - Shruti Shivakumar (https://github.com/shrshi) Approvers: - Nghia Truong (https://github.com/ttnghia) - David Wendt (https://github.com/davidwendt) URL: #17294
Commit: f5c0e5c
-
Exclude nanoarrow and flatbuffers from installation (#17308)
This change helps shrink RAPIDS wheels. It should not affect Spark builds since those use the build directory of cudf and statically link in those components to its final binary. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Mark Harris (https://github.com/harrism) - Bradley Dice (https://github.com/bdice) URL: #17308
Commit: 918266a
-
Add `catboost` to the third-party integration tests (#17267)
Closes #15397 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) - Matthew Roeschke (https://github.com/mroeschke) URL: #17267
Commit: 1b045dd
-
Fixed lifetime issue in ast transform tests (#17292)
Authors: - Basit Ayantunde (https://github.com/lamarrr) Approvers: - Bradley Dice (https://github.com/bdice) - Yunsong Wang (https://github.com/PointKernel) - David Wendt (https://github.com/davidwendt) URL: #17292
Commit: c4a4a91
-
Replace FindcuFile with upstream FindCUDAToolkit support (#17298)
CMake's `FindCUDAToolkit` has supported cuFile since 3.25. Use this support and remove the custom `FindcuFile` module. Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - Bradley Dice (https://github.com/bdice) - Karthikeyan (https://github.com/karthikeyann) - Yunsong Wang (https://github.com/PointKernel) - Muhammad Haseeb (https://github.com/mhaseeb123) URL: #17298
Commit: 6acd33d
-
Fix synchronization bug in bool parquet mukernels (#17302)
This fixes a synchronization bug in the parquet microkernels for plain-decoding bools. This closes [several](NVIDIA/spark-rapids#11715) timing [issues](NVIDIA/spark-rapids#11716) found during testing of spark-rapids. Authors: - Paul Mattione (https://github.com/pmattione-nvidia) Approvers: - Bradley Dice (https://github.com/bdice) - Vukasin Milovanovic (https://github.com/vuule) URL: #17302
Commit: 5e40691
-
Update CI jobs to include Polars in nightlies and improve IWYU (#17306)
This PR adds Polars tests to our nightly runs now that [we no longer only fail conditional on certain files changing in PRs](#17299). This PR also updates the IWYU jobs to use [the version released three days ago, which supports clang 19 like we need](https://github.com/include-what-you-use/include-what-you-use/releases/tag/0.23). It also fixes a couple of errors in the CMake for how we were setting compile flags for IWYU. Closes #16383 Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) URL: #17306
Commit: 8294953
-
Move strings filter benchmarks to nvbench (#17269)
Move `cpp/benchmarks/string/filter.cpp` from google-bench to nvbench Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Bradley Dice (https://github.com/bdice) - Muhammad Haseeb (https://github.com/mhaseeb123) URL: #17269
Commit: 13c7115
-
Clean up misc, unneeded pylibcudf.libcudf in cudf._lib (#17309)
* Removed `ctypedef const scalar constscalar` usage * Use `dtype_to_pylibcudf_type` where appropriate * Use pylibcudf enums instead of `pylibcudf.libcudf` types Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Matthew Murray (https://github.com/Matt711) URL: #17309
Commit: 353d2de
Commits on Nov 14, 2024
-
Add documentation for low memory readers (#17314)
Closes #16443 Authors: - Brian Tepera (https://github.com/btepera) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #17314
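A sketch of turning the documented low-memory paths on from Python (the option names are assumptions based on cudf's option naming; verify against the new docs):

```python
import cudf

# Chunked, lower-peak-memory parquet read
with cudf.option_context("io.parquet.low_memory", True):
    df = cudf.read_parquet("data.parquet")  # hypothetical file

# Same idea for JSON Lines
with cudf.option_context("io.json.low_memory", True):
    jdf = cudf.read_json("data.jsonl", lines=True)  # hypothetical file
```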
Commit: 9da8eb2
-
Polars: DataFrame Serialization (#17062)
Use pylibcudf’s pack and unpack to implement Dask compatible serialization. Authors: - Mads R. B. Kristensen (https://github.com/madsbk) - Lawrence Mitchell (https://github.com/wence-) - Richard (Rick) Zamora (https://github.com/rjzamora) - Vyas Ramasubramani (https://github.com/vyasr) - Peter Andreas Entschev (https://github.com/pentschev) Approvers: - Matthew Murray (https://github.com/Matt711) - Lawrence Mitchell (https://github.com/wence-) - GALI PREM SAGAR (https://github.com/galipremsagar) - Bradley Dice (https://github.com/bdice) URL: #17062
Commit: 5d5b35d
-
Java JNI for Multiple contains (#17281)
This is the Java JNI interface for the [multiple contains PR](#16900) Authors: - Chong Gao (https://github.com/res-life) Approvers: - Alessandro Bellina (https://github.com/abellina) - Robert (Bobby) Evans (https://github.com/revans2) URL: #17281
Commit: 4cd40ee
-
Resolves #3155. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) - Robert Maynard (https://github.com/robertmaynard) URL: #17312
Commit: d93c3fc
-
Fix reading of single-row unterminated CSV files (#17305)
Fixed the logic in the CSV reader that led to empty output instead of producing a table with a single column and one row. Added tests to make sure the new logic does not cause regressions. Also did some small clean up around the fix. Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - Bradley Dice (https://github.com/bdice) - David Wendt (https://github.com/davidwendt) URL: #17305
Commit: a7194f6
-
prefer wheel-provided libcudf.so in load_library(), use RTLD_LOCAL (#17316)
Contributes to rapidsai/build-planning#118 Modifies `libcudf.load_library()` in the following ways: * prefer wheel-provided `libcudf.so` to system installation * expose environment variable `RAPIDS_LIBCUDF_PREFER_SYSTEM_LIBRARY` for switching that preference * load `libcudf.so` with `RTLD_LOCAL`, to prevent adding symbols to the global namespace ([dlopen docs](https://linux.die.net/man/3/dlopen)) ## Notes for Reviewers ### How I tested this Locally (x86_64, CUDA 12, Python 3.12), built `libcudf`, `pylibcudf`, and `cudf` wheels from this branch, then `libcuspatial` and `cuspatial` from the corresponding cuspatial branch. Ran `cuspatial`'s unit tests, and tried setting the environment variable and inspecting `ld`'s logs to confirm that the environment variable changed the loading and search behavior. e.g. ```shell # clear ld cache to avoid cheating rm -f /etc/ld.so.cache ldconfig # try using an env variable to say "prefer the system-installed version" LD_DEBUG=libs \ LD_DEBUG_OUTPUT=/tmp/out.txt \ RAPIDS_LIBCUDF_PREFER_SYSTEM_LIBRARY=true \ python -c "import cuspatial; print(cuspatial.__version__)" cat /tmp/out.txt.* > prefer-system.txt # (then manually looked through those logs to confirm it searched places like /usr/lib64 and /lib64) ``` Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Bradley Dice (https://github.com/bdice) URL: #17316
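A minimal sketch of the resulting behavior (both names below come straight from the description above):

```python
# RAPIDS_LIBCUDF_PREFER_SYSTEM_LIBRARY=true -> prefer a system libcudf.so;
# otherwise the wheel-provided library wins.
import libcudf

# Loads libcudf.so with RTLD_LOCAL, so its symbols are not added to the
# process-global namespace.
libcudf.load_library()
```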
Commit: 66c5a2d
Commits on Nov 15, 2024
-
Do not exclude nanoarrow and flatbuffers from installation if statically linked (#17322)
Had an issue crop up in spark-rapids-jni where we statically link arrow and the build started to fail due to change #17308. Authors: - Mike Wilson (https://github.com/hyperbolic2346) Approvers: - Gera Shegalov (https://github.com/gerashegalov) - Vyas Ramasubramani (https://github.com/vyasr) - Bradley Dice (https://github.com/bdice) - Kyle Edwards (https://github.com/KyleFromNVIDIA) URL: #17322
Commit: 927ae9c
-
Update java datetime APIs to match CUDF. (#17329)
This updates the java APIs related to datetime processing so that they match the CUDF APIs. Authors: - Robert (Bobby) Evans (https://github.com/revans2) Approvers: - MithunR (https://github.com/mythrocks) - Jason Lowe (https://github.com/jlowe) - Gera Shegalov (https://github.com/gerashegalov) URL: #17329
Commit: 8a9131a
-
Remove cudf._lib.avro in favor of inlining pylibcudf (#17319)
Contributes to #17317 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #17319
Commit: d67d017
-
Fix various issues with `replace` API and add support in `datetime` and `timedelta` columns (#17331)
This PR: - [x] Adds support for `find_and_replace` in `DateTimeColumn` and `TimeDeltaColumn`, such that when `.replace` is called on a series or dataframe with these columns, we don't error and replace the values correctly. - [x] Fixed various type combination edge cases that were previously incorrectly handled and updated stale tests associated with them. - [x] Added a small parquet file in pytests that has multiple rows that uncovered these bugs. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #17331
Commit: d475dca
-
Implement `cudf-polars` chunked parquet reading (#16944)
This PR provides access to the libcudf chunked parquet reader through the `cudf-polars` gpu engine, inspired by the cuDF python implementation. Closes #16818 Authors: - https://github.com/brandon-b-miller - GALI PREM SAGAR (https://github.com/galipremsagar) - Lawrence Mitchell (https://github.com/wence-) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Lawrence Mitchell (https://github.com/wence-) URL: #16944
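A sketch of exercising this from polars (the chunked reader is internal to the GPU scan; no new user-facing knob is involved):

```python
import polars as pl

q = (
    pl.scan_parquet("large_dataset.parquet")  # hypothetical file
    .filter(pl.col("x") > 0)
    .select(pl.col("x").sum())
)
# The GPU engine can now consume the parquet scan in chunks rather than
# materializing the entire file at once.
result = q.collect(engine="gpu")
```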
Commit: aa8c0c4
-
Remove another reference to `FindcuFile` (#17315)
The reference in JNI was missed in #17298. Replace it with `FindCUDAToolkit`. Also backport `FindCUDAToolkit` from CMake 3.31 to get https://gitlab.kitware.com/cmake/cmake/-/commit/b38a8e77cb3c8401b3022a68f07a4fd77b290524. Also add an option to statically link `cuFile`. Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - Bradley Dice (https://github.com/bdice) - Nghia Truong (https://github.com/ttnghia) - Vyas Ramasubramani (https://github.com/vyasr) - Gera Shegalov (https://github.com/gerashegalov) URL: #17315
Commit: 81cd4a0
-
add telemetry setup to test (#16924)
This is a prototype implementation of rapidsai/build-infra#139 The work that this builds on: * rapidsai/gha-tools#118, which adds a shell wrapper that automatically creates spans for the commands that it wraps. It also uses the `opentelemetry-instrument` command to set up monkeypatching for supported Python libraries, if the command is python-based * https://github.com/rapidsai/shared-workflows/tree/add-telemetry, which installs the gha-tools work from above and sets necessary environment variables. This is only done for the conda-cpp-build.yaml shared workflow at the time of submitting this PR. The goal of this PR is to observe telemetry data sent from a GitHub Actions build triggered by this PR as a proof of concept. Once it all works, the remaining work is: * merge rapidsai/gha-tools#118 * Move the opentelemetry-related install stuff in https://github.com/rapidsai/shared-workflows/compare/add-telemetry?expand=1#diff-ca6188672785b5d214aaac2bf77ce0528a48481b2a16b35aeb78ea877b2567bcR118-R125 into https://github.com/rapidsai/ci-imgs, and rebuild ci-imgs * expand coverage to other shared workflows * Incorporate the changes from this PR to other jobs and to other repos Authors: - Mike Sarahan (https://github.com/msarahan) Approvers: - Bradley Dice (https://github.com/bdice) URL: #16924
Commit: 8664fad
-
Update cmake to 3.28.6 in JNI Dockerfile (#17342)
Updates cmake to 3.28.6 in the JNI Dockerfile used to build the cudf jar. This helps avoid a bug in older cmake where FindCUDAToolkit can fail to find cufile libraries. Authors: - Jason Lowe (https://github.com/jlowe) Approvers: - Nghia Truong (https://github.com/ttnghia) - Gera Shegalov (https://github.com/gerashegalov) URL: #17342
Commit: e683647
Commits on Nov 16, 2024
-
Use pylibcudf contiguous split APIs in cudf python (#17246)
Part of #15162 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Lawrence Mitchell (https://github.com/wence-) URL: #17246
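A rough sketch of the APIs being inlined (module and function names follow libcudf's contiguous-split vocabulary; treat the exact paths as assumptions):

```python
import pyarrow as pa
import pylibcudf as plc

tbl = plc.interop.from_arrow(pa.table({"a": [1, 2, 3]}))
# pack() serializes the table into contiguous metadata + data buffers
packed = plc.contiguous_split.pack(tbl)
# unpack() reconstructs a table view over the packed buffers
roundtrip = plc.contiguous_split.unpack(packed)
```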
Commit: 9cc9071
Commits on Nov 18, 2024
-
Move strings translate benchmarks to nvbench (#17325)
Moves `cpp/benchmarks/string/translate.cpp` implementation from google-bench to nvbench. This is benchmark for the `cudf::strings::translate` API. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - Nghia Truong (https://github.com/ttnghia) URL: #17325
Commit: e4de8e4
-
Move cudf._lib.unary to cudf.core._internals (#17318)
Contributes to #17317 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) URL: #17318
Commit: aeb6a30
-
Reading multi-source compressed JSONL files (#17161)
Fixes #17068 Fixes #12299 This PR introduces a new datasource for compressed inputs which enables batching and byte range reading of multi-source JSONL files using the reallocate-and-retry policy. Moreover, instead of using a 4:1 compression ratio heuristic, the device buffer size is estimated accurately for GZIP, ZIP, and SNAPPY compression types. For remaining types, the files are first decompressed then batched. ~~TODO: Reuse existing JSON tests but with an additional compression parameter to verify correctness.~~ ~~Handled by #17219, which implements the compressed JSON writer required for the above test.~~ Multi-source compressed input tests added! Authors: - Shruti Shivakumar (https://github.com/shrshi) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - Kyle Edwards (https://github.com/KyleFromNVIDIA) - Karthikeyan (https://github.com/karthikeyann) URL: #17161
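The user-visible capability, sketched (file names are hypothetical; the point is several compressed JSONL sources in a single read):

```python
import cudf

# Multiple gzip-compressed JSON Lines files read as one table; batching
# and decompression buffer sizing happen inside the reader.
df = cudf.read_json(
    ["part-000.jsonl.gz", "part-001.jsonl.gz"],  # hypothetical files
    lines=True,
    compression="gzip",
)
```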
Commit: 03ac845
-
Test the full matrix for polars and dask wheels on nightlies (#17320)
This PR ensures that we have nightly coverage of more of the CUDA/Python/arch versions that we claim to support for dask-cudf and cudf-polars wheels. In addition, this PR ensures that we do not attempt to run the dbgen executable in the Polars repository on systems with too old of a glibc to support running them. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) URL: #17320
Commit: d514517
-
Fix reading Parquet string cols when `nrows` and `input_pass_limit` > 0 (#17321)
This PR fixes reading string columns in Parquet using chunked parquet reader when `nrows` and `input_pass_limit` are > 0. Closes #17311 Authors: - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - Ed Seidl (https://github.com/etseidl) - Lawrence Mitchell (https://github.com/wence-) - Bradley Dice (https://github.com/bdice) - https://github.com/nvdbaranec - GALI PREM SAGAR (https://github.com/galipremsagar) URL: #17321
Commit: 43f2f68
-
Remove cudf._lib.hash in favor of inlining pylibcudf (#17345)
Contributes to #17317 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Lawrence Mitchell (https://github.com/wence-) - GALI PREM SAGAR (https://github.com/galipremsagar) URL: #17345
Commit: 18b40dc
-
Remove cudf._lib.concat in favor of inlining pylibcudf (#17344)
Contributes to #17317 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) URL: #17344
Commit: ba21673
-
Remove cudf._lib.quantiles in favor of inlining pylibcudf (#17347)
Contributes to #17317 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) URL: #17347
Commit: 02c35bf
-
Remove cudf._lib.labeling in favor of inlining pylibcudf (#17346)
Contributes to #17317 Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) URL: #17346
Commit: 302e625
Commits on Nov 19, 2024
-
1.13 was yanked for some reason, but 1.14 doesn't bring anything new and difficult. Authors: - Lawrence Mitchell (https://github.com/wence-) - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - https://github.com/brandon-b-miller - GALI PREM SAGAR (https://github.com/galipremsagar) URL: #17355
Commit: 5f9a97f
-
Writing compressed output using JSON writer (#17323)
Depends on #17161 for implementations of compression and decompression functions (`io/comp/comp.cu`, `io/comp/comp.hpp`, `io/comp/io_uncomp.hpp` and `io/comp/uncomp.cpp`) Adds support for writing GZIP- and SNAPPY-compressed JSON to the JSON writer. Verifies correctness using a parameterized test in `tests/io/json/json_writer.cpp` Authors: - Shruti Shivakumar (https://github.com/shrshi) - Vukasin Milovanovic (https://github.com/vuule) Approvers: - Kyle Edwards (https://github.com/KyleFromNVIDIA) - Karthikeyan (https://github.com/karthikeyann) - Vukasin Milovanovic (https://github.com/vuule) URL: #17323
Commit: 384abae
-
fix library-loading issues in editable installs (#17338)
Contributes to rapidsai/build-planning#118 The pattern introduced in #17316 breaks editable installs in devcontainers. In that type of build, `libcudf.so` is built outside of the wheel but **not installed**, so it can't be found by `ld`. Extension modules in `cudf` and `pylibcudf` are able to find it via RPATHs instead. This proposes: * try-catching the entire library-loading attempt, to silently do nothing in cases like that * ~adding imports of the `cudf` and `pylibcudf` libraries in the `devcontainers` CI job, as a smoke test to catch issues like this in the future~ *(edit: removed those, [`devcontainer` builds run on CPU nodes](https://github.com/rapidsai/shared-workflows/blob/4e84062f333ce5649bc65029d3979569e2d0a045/.github/workflows/build-in-devcontainer.yaml#L19))* ## Notes for Reviewers ### How I tested this Tested this approach on rapidsai/kvikio#553 # Authors: - James Lamb (https://github.com/jameslamb) - Matthew Murray (https://github.com/Matt711) Approvers: - Bradley Dice (https://github.com/bdice) - Matthew Murray (https://github.com/Matt711) URL: #17338
Commit: 9c5cd81
-
Fix integer overflow in compiled binaryop (#17354)
For large columns, the computed stride might end up overflowing size_type. To fix this, use the grid_1d helper. See also #10368. - Closes #17353 Authors: - Lawrence Mitchell (https://github.com/wence-) Approvers: - Bradley Dice (https://github.com/bdice) - David Wendt (https://github.com/davidwendt) - Tianyu Liu (https://github.com/kingcrimsontianyu) - Muhammad Haseeb (https://github.com/mhaseeb123) - Nghia Truong (https://github.com/ttnghia) URL: #17354
Commit: c7bfa77
-
Move strings replace benchmarks to nvbench (#17301)
Move `cpp/benchmarks/string/replace.cpp` implementation from google-bench to nvbench This covers strings replace APIs: - `cudf::strings::replace` scalar version - `cudf::strings::replace_multiple` column version - `cudf::strings::replace_slice` Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Yunsong Wang (https://github.com/PointKernel) - Shruti Shivakumar (https://github.com/shrshi) URL: #17301
Commit: 03c055f
-
Optimize distinct inner join to use set `find` instead of `retrieve` (#17278)
This PR introduces a minor optimization for distinct inner joins by using the `find` results to selectively copy matches to the output. This approach eliminates the need for the costly `retrieve` operation, which relies on expensive atomic operations. Authors: - Yunsong Wang (https://github.com/PointKernel) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - Karthikeyan (https://github.com/karthikeyann) URL: #17278
Commit: 56061bd
Commits on Nov 20, 2024
-
Add compute_column_expression to pylibcudf for transform.compute_column (#17279)
Follow up to #16760 `transform.compute_column` (backing `.eval`) requires an `Expression` object created by a private routine in cudf Python. Since this routine will be needed for any user of the public `transform.compute_column`, moving it to pylibcudf. Authors: - Matthew Roeschke (https://github.com/mroeschke) - Lawrence Mitchell (https://github.com/wence-) Approvers: - Lawrence Mitchell (https://github.com/wence-) - Vyas Ramasubramani (https://github.com/vyasr) URL: #17279
Commit: 7158ee0
-
Bug fix: restrict lines=True to JSON format in Kafka read_gdf method (#17333)
This pull request modifies the read_gdf method in kafka.py to pass the lines parameter only when the message_format is "json". This prevents lines from being passed to other formats (e.g., CSV, Avro, ORC, Parquet), which do not support this parameter. Authors: - Hirota Akio (https://github.com/a-hirota) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #17333
Commit: 05365af
-
Adapt to KvikIO API change in the compatibility mode (#17377)
This PR adapts cuDF to a breaking API change in KvikIO (rapidsai/kvikio#547) introduced recently, which adds the `AUTO` compatibility mode to file I/O. This PR causes no behavioral changes in cuDF: If the environment variable `KVIKIO_COMPAT_MODE` is left unset, cuDF by default still enables the compatibility mode in KvikIO. This is the same with the previous behavior (#17185). Authors: - Tianyu Liu (https://github.com/kingcrimsontianyu) Approvers: - Vukasin Milovanovic (https://github.com/vuule) URL: #17377
Commit: 6f83b58
-
Benchmarking JSON reader for compressed inputs (#17219)
Depends on #17161 for implementations of compression and decompression functions (`io/comp/comp.cu`, `io/comp/comp.hpp`, `io/comp/io_uncomp.hpp` and `io/comp/uncomp.cpp`). Depends on #17323 for compressed JSON writer implementation. Adds benchmark to measure performance of the JSON reader for compressed inputs. Authors: - Shruti Shivakumar (https://github.com/shrshi) - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - MithunR (https://github.com/mythrocks) - Vukasin Milovanovic (https://github.com/vuule) - Karthikeyan (https://github.com/karthikeyann) - Muhammad Haseeb (https://github.com/mhaseeb123) URL: #17219
Commit: fc08fe8
-
Deselect failing polars tests (#17362)
Deselect `test_join_4_columns_with_validity` which is failing in nightly CI tests and is reproducible in some systems (xref pola-rs/polars#19870), but apparently not all. Deselect `test_read_web_file` as well that fails on rockylinux8 due to SSL CA issues. Authors: - Peter Andreas Entschev (https://github.com/pentschev) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Kyle Edwards (https://github.com/KyleFromNVIDIA) URL: #17362
Commit: a2a62a1
-
Add new `dask_cudf.read_parquet` API (#17250)
It's time to clean up the `dask_cudf.read_parquet` API and prioritize GPU-specific optimizations. To this end, it makes sense to expose our own `read_parquet` API within Dask cuDF. **Notes**: - The "new" `dask_cudf.read_parquet` API is only relevant when query-planning is enabled (the default). - Using `filesystem="arrow"` now uses `cudf.read_parquet` when reading from local storage (rather than PyArrow). - (specific to Dask cuDF): The default `blocksize` argument is now specific to the "smallest" NVIDIA device detected within the active dask cluster (or the first device visible to the client). More specifically, we use `pynvml` to find this representative device size, and we set `blocksize` to be 1/32 this size. - The user may also pass in something like `blocksize=0.125` to use `1/8` the minimum device size (or `blocksize='1GiB'` to bypass the default logic altogether). - (specific to Dask cuDF): When `blocksize` is `None`, we disable partition fusion at optimization time. - (specific to Dask cuDF): When `blocksize` is **not** `None`, we use the parquet metadata from the first few files to inform partition fusion at optimization time (instead of a rough column-count ratio). Authors: - Richard (Rick) Zamora (https://github.com/rjzamora) - Vyas Ramasubramani (https://github.com/vyasr) - Mads R. B. Kristensen (https://github.com/madsbk) Approvers: - Mads R. B. Kristensen (https://github.com/madsbk) - Lawrence Mitchell (https://github.com/wence-) - GALI PREM SAGAR (https://github.com/galipremsagar) URL: #17250
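A sketch of the new entry point under the defaults described above (paths are hypothetical):

```python
import dask_cudf

# Default: blocksize is derived from the smallest visible GPU (1/32 of it)
ddf = dask_cudf.read_parquet("dataset/")  # hypothetical path

# Explicit fraction of device memory per partition, arrow filesystem
ddf2 = dask_cudf.read_parquet(
    "dataset/",
    blocksize=0.125,     # 1/8 of the minimum device size
    filesystem="arrow",  # local reads still go through cudf.read_parquet
)
```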
Commit: 3111aa4
-
Added Arrow Interop Benchmarks (#17194)
This merge request adds benchmarks for the Arrow Interop APIs: - `from_arrow_host` - `to_arrow_host` - `from_arrow_device` - `to_arrow_device` Closes #17104 Authors: - Basit Ayantunde (https://github.com/lamarrr) Approvers: - David Wendt (https://github.com/davidwendt) URL: #17194
Commit: be9ba6c
-
Use `libcudf_exception_handler` throughout `pylibcudf.libcudf` (#17109)
Closes #17036 (WIP, generated by a quick `sed` script) Authors: - https://github.com/brandon-b-miller - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #17109
brandon-b-miller authored Nov 20, 2024
Commit: 2e88835
-
Extract `GPUEngine` config options at translation time (#17339)
Follow up to #16944 That PR added `config: GPUEngine` to the arguments of every `IR.do_evaluate` function. In order to simplify future multi-GPU development, this PR extracts the necessary configuration argument at `IR` translation time instead. Authors: - Richard (Rick) Zamora (https://github.com/rjzamora) - Lawrence Mitchell (https://github.com/wence-) Approvers: - https://github.com/brandon-b-miller - Lawrence Mitchell (https://github.com/wence-) URL: #17339
rjzamora authored Nov 20, 2024
Commit: f550ccc
-
Move strings url_decode benchmarks to nvbench (#17328)
Move `cpp/benchmarks/string/url_decode.cu` implementation from google-bench to nvbench. This benchmark is for the `cudf::strings::url_decode` API. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Muhammad Haseeb (https://github.com/mhaseeb123) - Nghia Truong (https://github.com/ttnghia) URL: #17328
davidwendt authored Nov 20, 2024
Commit: 04502c8
-
Support pivot with index or column arguments as lists (#17373)
closes #17360 Technically I suppose this was more of an enhancement since the documentation suggested only a single label was supported, but I'll mark as a bug since the error message was not informative. Authors: - Matthew Roeschke (https://github.com/mroeschke) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) URL: #17373
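A sketch of what now works (list-valued index/columns, mirroring pandas):

```python
import cudf

df = cudf.DataFrame({
    "a": [1, 1, 2, 2],
    "b": ["x", "y", "x", "y"],
    "c": ["p", "p", "q", "q"],
    "v": [10, 20, 30, 40],
})
# index and columns may now be lists of labels instead of single labels
out = df.pivot(index=["a", "c"], columns=["b"], values="v")
```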
mroeschke authored Nov 20, 2024
Commit: 332cc06
-
Move strings repeat benchmarks to nvbench (#17304)
Moves the `cpp/benchmarks/string/repeat_strings.cpp` implementation from google-bench to nvbench. This covers the overloads of the `cudf::strings::repeat_strings` API. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Nghia Truong (https://github.com/ttnghia) - Yunsong Wang (https://github.com/PointKernel) URL: #17304
davidwendt authored Nov 20, 2024
Commit: d927992
Commits on Nov 21, 2024
-
Add `pynvml` as a dependency for `dask-cudf` (#17386)
#17250 started using `pynvml` but did not add the proper dependency; this change fixes the missing dependency. Authors: - Peter Andreas Entschev (https://github.com/pentschev) - https://github.com/jakirkham Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) - https://github.com/jakirkham URL: #17386
pentschev authored Nov 21, 2024
Commit: 68c4285
-
Ignore errors when testing glibc versions (#17389)
This is likely the easiest fix for avoiding CI errors from this part of the code. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) - Bradley Dice (https://github.com/bdice) URL: #17389
vyasr authored Nov 21, 2024
Commit: 0d9e577
-
Migrate CSV writer to pylibcudf (#17163)
Apart of #15162 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - David Wendt (https://github.com/davidwendt) - Matthew Roeschke (https://github.com/mroeschke) - Vyas Ramasubramani (https://github.com/vyasr) - Lawrence Mitchell (https://github.com/wence-) URL: #17163
Matt711 authored Nov 21, 2024
Commit: f54c1a5
Commits on Nov 22, 2024
-
Enable unified memory by default in `cudf_polars` (#17375)
This PR enables unified memory as the default memory resource for `cudf_polars`. Co-authored-by: Vyas Ramasubramani <[email protected]> Co-authored-by: Matthew Murray <[email protected]> Co-authored-by: Lawrence Mitchell <[email protected]>
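A sketch of overriding the new default when needed (assuming `GPUEngine` accepts a `memory_resource` keyword, as the cudf-polars configuration suggests; the RMM pool shown is one reasonable alternative):

```python
import polars as pl
import rmm

q = pl.LazyFrame({"a": [1, 2, 3]}).select(pl.col("a") * 2)

# Managed (unified) memory is now the default; pass an explicit RMM
# resource to opt into, e.g., a plain device-memory pool instead.
mr = rmm.mr.PoolMemoryResource(rmm.mr.CudaMemoryResource())
result = q.collect(engine=pl.GPUEngine(memory_resource=mr))
```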
6 people authored Nov 22, 2024
Commit: 305182e