Skip to content

Commit

Permalink
Merge branch 'branch-25.02' into account-for-raft-update
Browse files Browse the repository at this point in the history
cjnolet authored Jan 28, 2025
2 parents db12ab9 + 5527cdf commit d5f1500
Showing 181 changed files with 2,462 additions and 622 deletions.
2 changes: 1 addition & 1 deletion .devcontainer/cuda11.8-pip/devcontainer.json
Original file line number Diff line number Diff line change
@@ -5,7 +5,7 @@
"args": {
"CUDA": "11.8",
"PYTHON_PACKAGE_MANAGER": "pip",
"BASE": "rapidsai/devcontainers:25.02-cpp-cuda11.8-ucx1.17.0-openmpi-ubuntu22.04"
"BASE": "rapidsai/devcontainers:25.02-cpp-cuda11.8-ucx1.18.0-openmpi-ubuntu22.04"
}
},
"runArgs": [
2 changes: 1 addition & 1 deletion .devcontainer/cuda12.5-pip/devcontainer.json
Original file line number Diff line number Diff line change
@@ -5,7 +5,7 @@
"args": {
"CUDA": "12.5",
"PYTHON_PACKAGE_MANAGER": "pip",
"BASE": "rapidsai/devcontainers:25.02-cpp-cuda12.5-ucx1.17.0-openmpi-ubuntu22.04"
"BASE": "rapidsai/devcontainers:25.02-cpp-cuda12.5-ucx1.18.0-openmpi-ubuntu22.04"
}
},
"runArgs": [
24 changes: 24 additions & 0 deletions .github/workflows/build.yaml
Original file line number Diff line number Diff line change
@@ -80,7 +80,30 @@ jobs:
node_type: "gpu-v100-latest-1"
run_script: "ci/build_docs.sh"
sha: ${{ inputs.sha }}
wheel-build-libcuvs:
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-25.02
with:
build_type: ${{ inputs.build_type || 'branch' }}
branch: ${{ inputs.branch }}
sha: ${{ inputs.sha }}
date: ${{ inputs.date }}
script: ci/build_wheel_libcuvs.sh
# build for every combination of arch and CUDA version, but only for the latest Python
matrix_filter: group_by([.ARCH, (.CUDA_VER|split(".")|map(tonumber)|.[0])]) | map(max_by(.PY_VER|split(".")|map(tonumber)))
wheel-publish-libcuvs:
needs: wheel-build-libcuvs
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/wheels-publish.yaml@branch-25.02
with:
build_type: ${{ inputs.build_type || 'branch' }}
branch: ${{ inputs.branch }}
sha: ${{ inputs.sha }}
date: ${{ inputs.date }}
package-name: libcuvs
package-type: cpp
wheel-build-cuvs:
needs: wheel-build-libcuvs
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-25.02
with:
@@ -99,3 +122,4 @@ jobs:
sha: ${{ inputs.sha }}
date: ${{ inputs.date }}
package-name: cuvs
package-type: python
12 changes: 11 additions & 1 deletion .github/workflows/pr.yaml
Original file line number Diff line number Diff line change
@@ -22,6 +22,7 @@ jobs:
- conda-python-tests
- docs-build
- rust-build
- wheel-build-libcuvs
- wheel-build-cuvs
- wheel-tests-cuvs
- devcontainer
@@ -135,10 +136,19 @@ jobs:
arch: "amd64"
container_image: "rapidsai/ci-conda:latest"
run_script: "ci/build_rust.sh"
wheel-build-cuvs:
wheel-build-libcuvs:
needs: checks
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-25.02
with:
build_type: pull-request
script: ci/build_wheel_libcuvs.sh
# build for every combination of arch and CUDA version, but only for the latest Python
matrix_filter: group_by([.ARCH, (.CUDA_VER|split(".")|map(tonumber)|.[0])]) | map(max_by(.PY_VER|split(".")|map(tonumber)))
wheel-build-cuvs:
needs: wheel-build-libcuvs
secrets: inherit
uses: rapidsai/shared-workflows/.github/workflows/wheels-build.yaml@branch-25.02
with:
build_type: pull-request
script: ci/build_wheel_cuvs.sh
9 changes: 7 additions & 2 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -1,6 +1,11 @@
# Copyright (c) 2022-2023, NVIDIA CORPORATION.
# Copyright (c) 2022-2025, NVIDIA CORPORATION.

repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
- repo: https://github.com/PyCQA/isort
rev: 5.12.0
hooks:
@@ -115,7 +120,7 @@ repos:
cpp/cmake/modules/FindAVX\.cmake|
- id: verify-alpha-spec
- repo: https://github.com/rapidsai/dependency-file-generator
rev: v1.16.0
rev: v1.17.0
hooks:
- id: rapids-dependency-file-generator
args: ["--clean"]
16 changes: 8 additions & 8 deletions README.md
Original file line number Diff line number Diff line change
@@ -29,9 +29,9 @@ cuVS contains state-of-the-art implementations of several algorithms for running

Vector search is an information retrieval method that has been growing in popularity over the past few years, partly because of the rising importance of multimedia embeddings created from unstructured data and the need to perform semantic search on the embeddings to find items which are semantically similar to each other.

Vector search is also used in _data mining and machine learning_ tasks and comprises an important step in many _clustering_ and _visualization_ algorithms like [UMAP](https://arxiv.org/abs/2008.00325), [t-SNE](https://lvdmaaten.github.io/tsne/), K-means, and [HDBSCAN](https://hdbscan.readthedocs.io/en/latest/how_hdbscan_works.html).
Vector search is also used in _data mining and machine learning_ tasks and comprises an important step in many _clustering_ and _visualization_ algorithms like [UMAP](https://arxiv.org/abs/2008.00325), [t-SNE](https://lvdmaaten.github.io/tsne/), K-means, and [HDBSCAN](https://hdbscan.readthedocs.io/en/latest/how_hdbscan_works.html).

Finally, faster vector search enables interactions between dense vectors and graphs. Converting a pile of dense vectors into nearest neighbors graphs unlocks the entire world of graph analysis algorithms, such as those found in [GraphBLAS](https://graphblas.org/) and [cuGraph](https://github.com/rapidsai/cugraph).
Finally, faster vector search enables interactions between dense vectors and graphs. Converting a pile of dense vectors into nearest neighbors graphs unlocks the entire world of graph analysis algorithms, such as those found in [GraphBLAS](https://graphblas.org/) and [cuGraph](https://github.com/rapidsai/cugraph).

Below are some common use-cases for vector search

@@ -45,7 +45,7 @@ Below are some common use-cases for vector search
- Audio search
- Molecular search
- Model training


- ### Data mining
- Clustering algorithms
@@ -71,7 +71,7 @@ In addition to the items above, cuVS takes on the burden of keeping non-trivial

## cuVS Technology Stack

cuVS is built on top of the RAPIDS RAFT library of high performance machine learning primitives and provides all the necessary routines for vector search and clustering on the GPU.
cuVS is built on top of the RAPIDS RAFT library of high performance machine learning primitives and provides all the necessary routines for vector search and clustering on the GPU.

![cuVS is built on top of low-level CUDA libraries and provides many important routines that enable vector search and clustering on the GPU](img/tech_stack.png "cuVS Technology Stack")

@@ -103,7 +103,7 @@ pip install cuvs-cu11 --extra-index-url=https://pypi.nvidia.com
And CUDA 12 packages:
```bash
pip install cuvs-cu12 --extra-index-url=https://pypi.nvidia.com
```
```

### Nightlies
If installing a version that has not yet been released, the `rapidsai` channel can be replaced with `rapidsai-nightly`:
@@ -169,7 +169,7 @@ cuvsCagraIndexParamsDestroy(index_params);
cuvsResourcesDestroy(res);
```

For more code examples of the C APIs, including drop-in Cmake project templates, please refer to the [C examples](https://github.com/rapidsai/cuvs/tree/branch-24.10/examples/c)
For more code examples of the C APIs, including drop-in Cmake project templates, please refer to the [C examples](https://github.com/rapidsai/cuvs/tree/branch-25.02/examples/c)

### Rust API

@@ -232,15 +232,15 @@ fn cagra_example() -> Result<()> {
}
```

For more code examples of the Rust APIs, including a drop-in project templates, please refer to the [Rust examples](https://github.com/rapidsai/cuvs/tree/branch-24.10/examples/rust).
For more code examples of the Rust APIs, including a drop-in project templates, please refer to the [Rust examples](https://github.com/rapidsai/cuvs/tree/branch-25.02/examples/rust).

## Contributing

If you are interested in contributing to the cuVS library, please read our [Contributing guidelines](docs/source/contributing.md). Refer to the [Developer Guide](docs/source/developer_guide.md) for details on the developer guidelines, workflows, and principles.

## References

For the interested reader, many of the accelerated implementations in cuVS are also based on research papers which can provide a lot more background. We also ask you to please cite the corresponding algorithms by referencing them in your own research.
For the interested reader, many of the accelerated implementations in cuVS are also based on research papers which can provide a lot more background. We also ask you to please cite the corresponding algorithms by referencing them in your own research.
- [CAGRA: Highly Parallel Graph Construction and Approximate Nearest Neighbor Search](https://arxiv.org/abs/2308.15136)
- [Top-K Algorithms on GPU: A Comprehensive Study and New Methods](https://dl.acm.org/doi/10.1145/3581784.3607062)
- [Fast K-NN Graph Construction by GPU Based NN-Descent](https://dl.acm.org/doi/abs/10.1145/3459637.3482344?casa_token=O_nan1B1F5cAAAAA:QHWDEhh0wmd6UUTLY9_Gv6c3XI-5DXM9mXVaUXOYeStlpxTPmV3nKvABRfoivZAaQ3n8FWyrkWw>)
8 changes: 1 addition & 7 deletions build.sh
Original file line number Diff line number Diff line change
@@ -313,12 +313,6 @@ if [[ ${CMAKE_TARGET} == "" ]]; then
CMAKE_TARGET="all"
fi


SKBUILD_EXTRA_CMAKE_ARGS="${EXTRA_CMAKE_ARGS}"
if [[ "${EXTRA_CMAKE_ARGS}" != *"DFIND_CUVS_CPP"* ]]; then
SKBUILD_EXTRA_CMAKE_ARGS="${SKBUILD_EXTRA_CMAKE_ARGS};-DFIND_CUVS_CPP=ON"
fi

# If clean given, run it prior to any other steps
if (( ${CLEAN} == 1 )); then
# If the dirs to clean are mounted dirs in a container, the
@@ -434,7 +428,7 @@ fi

# Build and (optionally) install the cuvs Python package
if (( ${NUMARGS} == 0 )) || hasArg python; then
SKBUILD_CMAKE_ARGS="${SKBUILD_EXTRA_CMAKE_ARGS}" \
SKBUILD_CMAKE_ARGS="${EXTRA_CMAKE_ARGS}" \
SKBUILD_BUILD_OPTIONS="-j${PARALLEL_LEVEL}" \
python -m pip install --no-build-isolation --no-deps --config-settings rapidsai.disable-cuda=true ${REPODIR}/python/cuvs
fi
34 changes: 18 additions & 16 deletions ci/build_wheel.sh
Original file line number Diff line number Diff line change
@@ -1,10 +1,11 @@
#!/bin/bash
# Copyright (c) 2023-2024, NVIDIA CORPORATION.
# Copyright (c) 2023-2025, NVIDIA CORPORATION.

set -euo pipefail

package_name=$1
package_dir=$2
package_type=$3
underscore_package_name=$(echo "${package_name}" | tr "-" "_")

source rapids-configure-sccache
@@ -16,21 +17,22 @@ rapids-generate-version > ./VERSION

cd "${package_dir}"

case "${RAPIDS_CUDA_VERSION}" in
12.*)
EXCLUDE_ARGS=(
--exclude "libcublas.so.12"
--exclude "libcublasLt.so.12"
--exclude "libcurand.so.10"
--exclude "libcusolver.so.11"
--exclude "libcusparse.so.12"
--exclude "libnvJitLink.so.12"
EXCLUDE_ARGS=(
--exclude "libraft.so"
--exclude "libcublas.so.*"
--exclude "libcublasLt.so.*"
--exclude "libcurand.so.*"
--exclude "libcusolver.so.*"
--exclude "libcusparse.so.*"
--exclude "libnvJitLink.so.*"
)

if [[ "${package_dir}" != "python/libcuvs" ]]; then
EXCLUDE_ARGS+=(
--exclude "libcuvs_c.so"
--exclude "libcuvs.so"
)
;;
11.*)
EXCLUDE_ARGS=()
;;
esac
fi

rapids-logger "Building '${package_name}' wheel"

@@ -48,4 +50,4 @@ sccache --show-adv-stats
mkdir -p final_dist
python -m auditwheel repair -w final_dist "${EXCLUDE_ARGS[@]}" dist/*

RAPIDS_PY_WHEEL_NAME="${underscore_package_name}_${RAPIDS_PY_CUDA_SUFFIX}" rapids-upload-wheels-to-s3 python final_dist
RAPIDS_PY_WHEEL_NAME="${underscore_package_name}_${RAPIDS_PY_CUDA_SUFFIX}" rapids-upload-wheels-to-s3 ${package_type} final_dist
23 changes: 11 additions & 12 deletions ci/build_wheel_cuvs.sh
Original file line number Diff line number Diff line change
@@ -1,21 +1,20 @@
#!/bin/bash
# Copyright (c) 2023-2024, NVIDIA CORPORATION.
# Copyright (c) 2023-2025, NVIDIA CORPORATION.

set -euo pipefail

package_dir="python/cuvs"

case "${RAPIDS_CUDA_VERSION}" in
12.*)
EXTRA_CMAKE_ARGS=";-DUSE_CUDA_MATH_WHEELS=ON"
;;
11.*)
EXTRA_CMAKE_ARGS=";-DUSE_CUDA_MATH_WHEELS=OFF"
;;
esac
RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"

# Set up skbuild options. Enable sccache in skbuild config options
export SKBUILD_CMAKE_ARGS="-DDETECT_CONDA_ENV=OFF;-DFIND_CUVS_CPP=OFF${EXTRA_CMAKE_ARGS}"
# Downloads libcuvs wheels from this current build,
# then ensures 'cuvs' wheel builds always use the 'libcuvs' just built in the same CI run.
#
# Using env variable PIP_CONSTRAINT is necessary to ensure the constraints
# are used when creating the isolated build environment.
RAPIDS_PY_WHEEL_NAME="libcuvs_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 cpp /tmp/libcuvs_dist
echo "libcuvs-${RAPIDS_PY_CUDA_SUFFIX} @ file://$(echo /tmp/libcuvs_dist/libcuvs_*.whl)" > /tmp/constraints.txt
export PIP_CONSTRAINT="/tmp/constraints.txt"

ci/build_wheel.sh cuvs ${package_dir}
ci/build_wheel.sh cuvs ${package_dir} python
ci/validate_wheel.sh ${package_dir} final_dist
32 changes: 32 additions & 0 deletions ci/build_wheel_libcuvs.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
#!/bin/bash
# Copyright (c) 2025, NVIDIA CORPORATION.

set -euo pipefail

package_name="libcuvs"
package_dir="python/libcuvs"

rapids-logger "Generating build requirements"
matrix_selectors="cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION};cuda_suffixed=true"

rapids-dependency-file-generator \
--output requirements \
--file-key "py_build_${package_name}" \
--file-key "py_rapids_build_${package_name}" \
--matrix "${matrix_selectors}" \
| tee /tmp/requirements-build.txt

rapids-logger "Installing build requirements"
python -m pip install \
-v \
--prefer-binary \
-r /tmp/requirements-build.txt

# build with '--no-build-isolation', for better sccache hit rate
# 0 really means "add --no-build-isolation" (ref: https://github.com/pypa/pip/issues/5735)
export PIP_NO_BUILD_ISOLATION=0

RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"

ci/build_wheel.sh libcuvs ${package_dir} cpp
ci/validate_wheel.sh ${package_dir} final_dist libcuvs
7 changes: 7 additions & 0 deletions ci/check_style.sh
Original file line number Diff line number Diff line change
@@ -14,5 +14,12 @@ rapids-dependency-file-generator \
rapids-mamba-retry env create --yes -f env.yaml -n checks
conda activate checks

# get config for cmake-format checks
RAPIDS_VERSION_MAJOR_MINOR="$(rapids-version-major-minor)"
FORMAT_FILE_URL="https://raw.githubusercontent.com/rapidsai/rapids-cmake/branch-${RAPIDS_VERSION_MAJOR_MINOR}/cmake-format-rapids-cmake.json"
export RAPIDS_CMAKE_FORMAT_FILE=/tmp/rapids_cmake_ci/cmake-formats-rapids-cmake.json
mkdir -p $(dirname ${RAPIDS_CMAKE_FORMAT_FILE})
wget -O ${RAPIDS_CMAKE_FORMAT_FILE} ${FORMAT_FILE_URL}

# Run pre-commit checks
pre-commit run --all-files --show-diff-on-failure
8 changes: 7 additions & 1 deletion ci/release/update-version.sh
Original file line number Diff line number Diff line change
@@ -44,8 +44,10 @@ echo "${NEXT_FULL_TAG}" > VERSION
DEPENDENCIES=(
dask-cuda
cuvs
pylibraft
libcuvs
libraft
librmm
pylibraft
rmm
rapids-dask-dependency
)
@@ -76,6 +78,10 @@ done
sed_runner "/rapidsai\/raft/ s|branch-[0-9][0-9].[0-9][0-9]|branch-${NEXT_SHORT_TAG}|g" docs/source/developer_guide.md

sed_runner "s|=[0-9][0-9].[0-9][0-9]|=${NEXT_SHORT_TAG}|g" README.md
sed_runner "s|branch-[0-9][0-9].[0-9][0-9]|branch-${NEXT_SHORT_TAG}|g" README.md

# references to license files
sed_runner "s|branch-[0-9][0-9].[0-9][0-9]|branch-${NEXT_SHORT_TAG}|g" python/cuvs_bench/cuvs_bench/plot/__main__.py

# rust can't handle leading 0's in the major/minor/patch version - remove
NEXT_FULL_RUST_TAG=$(printf "%d.%d.%d" $((10#$NEXT_MAJOR)) $((10#$NEXT_MINOR)) $((10#$NEXT_PATCH)))
9 changes: 6 additions & 3 deletions ci/test_wheel_cuvs.sh
Original file line number Diff line number Diff line change
@@ -1,13 +1,16 @@
#!/bin/bash
# Copyright (c) 2023-2024, NVIDIA CORPORATION.
# Copyright (c) 2023-2025, NVIDIA CORPORATION.

set -euo pipefail

mkdir -p ./dist
RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"
RAPIDS_PY_WHEEL_NAME="cuvs_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 ./dist
RAPIDS_PY_WHEEL_NAME="libcuvs_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 cpp ./local-libcuvs-dep
RAPIDS_PY_WHEEL_NAME="cuvs_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 python ./dist

# echo to expand wildcard before adding `[extra]` requires for pip
python -m pip install $(echo ./dist/cuvs*.whl)[test]
python -m pip install \
./local-libcuvs-dep/libcuvs*.whl \
"$(echo ./dist/cuvs*.whl)[test]"

python -m pytest ./python/cuvs/cuvs/test
12 changes: 0 additions & 12 deletions ci/validate_wheel.sh
Original file line number Diff line number Diff line change
@@ -8,24 +8,12 @@ wheel_dir_relative_path=$2

RAPIDS_CUDA_MAJOR="${RAPIDS_CUDA_VERSION%%.*}"

# some packages are much larger on CUDA 11 than on CUDA 12
if [[ "${RAPIDS_CUDA_MAJOR}" == "11" ]]; then
PYDISTCHECK_ARGS=(
--max-allowed-size-compressed '1.4G'
)
else
PYDISTCHECK_ARGS=(
--max-allowed-size-compressed '950M'
)
fi

cd "${package_dir}"

rapids-logger "validate packages with 'pydistcheck'"

pydistcheck \
--inspect \
"${PYDISTCHECK_ARGS[@]}" \
"$(echo ${wheel_dir_relative_path}/*.whl)"

rapids-logger "validate packages with 'twine'"
Loading

0 comments on commit d5f1500

Please sign in to comment.