Standardize CI System and Reduce Redundancy #85
Please update to CUDA 12.8.0 in CI.
- { CUDA_VER: '11.4.3', ARCH: 'amd64', PY_VER: '3.9', LINUX_VER: 'ubuntu20.04', gpu: 'v100', driver: 'latest' }
- { CUDA_VER: '11.8.0', ARCH: 'amd64', PY_VER: '3.10', LINUX_VER: 'ubuntu20.04', gpu: 'v100', driver: 'latest' }
- { CUDA_VER: '12.0.1', ARCH: 'amd64', PY_VER: '3.11', LINUX_VER: 'ubuntu20.04', gpu: 'v100', driver: 'latest' }
- { CUDA_VER: '12.5.1', ARCH: 'amd64', PY_VER: '3.12', LINUX_VER: 'ubuntu20.04', gpu: 'v100', driver: 'latest' }
Suggested change:
- { CUDA_VER: '12.5.1', ARCH: 'amd64', PY_VER: '3.12', LINUX_VER: 'ubuntu20.04', gpu: 'v100', driver: 'latest' }
- { CUDA_VER: '12.8.0', ARCH: 'amd64', PY_VER: '3.12', LINUX_VER: 'ubuntu20.04', gpu: 'v100', driver: 'latest' }
- { CUDA_VER: '11.4.3', ARCH: 'arm64', PY_VER: '3.9', LINUX_VER: 'ubuntu20.04', gpu: 'a100', driver: 'latest' }
- { CUDA_VER: '11.8.0', ARCH: 'arm64', PY_VER: '3.10', LINUX_VER: 'ubuntu20.04', gpu: 'a100', driver: 'latest' }
- { CUDA_VER: '12.0.1', ARCH: 'arm64', PY_VER: '3.11', LINUX_VER: 'ubuntu20.04', gpu: 'a100', driver: 'latest' }
- { CUDA_VER: '12.5.1', ARCH: 'arm64', PY_VER: '3.12', LINUX_VER: 'ubuntu20.04', gpu: 'a100', driver: 'latest' }
Suggested change:
- { CUDA_VER: '12.5.1', ARCH: 'arm64', PY_VER: '3.12', LINUX_VER: 'ubuntu20.04', gpu: 'a100', driver: 'latest' }
- { CUDA_VER: '12.8.0', ARCH: 'arm64', PY_VER: '3.12', LINUX_VER: 'ubuntu20.04', gpu: 'a100', driver: 'latest' }
- { CUDA_VER: '12.0.1', ARCH: 'amd64', PY_VER: '3.12', LINUX_VER: 'rockylinux8' }
"

export TEST_MATRIX="
Do we really need 8 jobs? Can we do a reduced matrix here of 3 jobs, and then do the rest with a separate set of nightly jobs?
> Do we really need 8 jobs?
No. If you look at pr.yaml, there's a matrix filter that issues only 2 jobs for each of the tests: one for amd64 and one for arm64.
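To illustrate the effect of such a filter, here is a minimal shell sketch that reduces an 8-row matrix to one row per architecture. It is not the filter from pr.yaml (which is not shown in this thread); the row format and the "keep the last row per ARCH" rule are assumptions made for illustration.

```shell
# Hypothetical matrix rows: CUDA_VER ARCH PY_VER (values from the matrix discussed in this PR).
matrix='11.4.3 amd64 3.9
11.8.0 amd64 3.10
12.0.1 amd64 3.11
12.8.0 amd64 3.12
11.4.3 arm64 3.9
11.8.0 arm64 3.10
12.0.1 arm64 3.11
12.8.0 arm64 3.12'

# Keep the last row seen for each ARCH (assumed rule), then sort by ARCH for stable output.
selected="$(printf '%s\n' "$matrix" \
  | awk '{ last[$2] = $0 } END { for (arch in last) print last[arch] }' \
  | sort -k2,2)"

# Count the non-empty rows that survived the filter.
job_count="$(printf '%s\n' "$selected" | grep -c .)"

printf '%s\n' "$selected"
echo "jobs: ${job_count}"
```

With this rule, the 8 matrix entries collapse to 2 jobs, one for amd64 and one for arm64, which matches the behaviour described above.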
ci/test_conda.sh
Outdated
pytest \
clangdev >=18 \
cuda-nvcc >=12.5 \
cuda-version >=12.5 \
We probably want something like this:
Suggested change:
cuda-version >=12.5 \
cuda-version =${RAPIDS_CUDA_VERSION%.*} \
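For context, `${RAPIDS_CUDA_VERSION%.*}` is standard shell suffix stripping: it removes the shortest trailing match of `.*`, which drops the patch component of the version. A minimal sketch, assuming `RAPIDS_CUDA_VERSION` holds a full three-part version (the value below is set by hand for illustration; in CI it is assumed to come from the workflow):

```shell
# Illustrative value only; the CI workflow is assumed to export this variable.
RAPIDS_CUDA_VERSION="12.8.0"

# %.* removes the shortest suffix matching ".*", i.e. the patch component.
cuda_major_minor="${RAPIDS_CUDA_VERSION%.*}"

# Resulting conda match spec, pinned to the CUDA major.minor of the CI job.
cuda_spec="cuda-version =${cuda_major_minor}"
echo "${cuda_spec}"
```

Pinning with `=` rather than `>=12.5` means the test environment follows the CUDA version of the matrix entry instead of whatever newer version the solver prefers.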
ci/test_conda.sh
Outdated

conda index $RAPIDS_CONDA_BLD_OUTPUT_DIR/
conda config --add channels $RAPIDS_CONDA_BLD_OUTPUT_DIR

rapids-print-env

rapids-mamba-retry install \
Try to fuse this into the rapids-mamba-retry create -n test command. If you can do a single conda solve, your CI will be faster and more correct.
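A rough sketch of what a fused, single-solve environment creation could look like. The package name `my-local-package`, the `TEST_ENV_RUNNER` override, and the function name are hypothetical and exist only for illustration; the `-c` channel flag and the other specs are taken from the snippets in this review. The sketch runs in dry-run mode by substituting `echo` for the real command.

```shell
# Create the test environment with every spec in one command, so the solver
# sees all constraints (including the locally built package) at once.
# $1: local channel directory, $2: full CUDA version such as 12.8.0
create_test_env() {
  # TEST_ENV_RUNNER is a hypothetical override used here for dry runs.
  ${TEST_ENV_RUNNER:-rapids-mamba-retry} create -n test --yes \
    -c "$1" \
    pytest \
    "clangdev >=18" \
    "cuda-nvcc >=12.5" \
    "cuda-version =${2%.*}" \
    my-local-package  # hypothetical name for the package built by this PR
}

# Dry run: print the arguments instead of invoking the solver.
TEST_ENV_RUNNER=echo create_test_env "/tmp/conda-bld" "12.8.0"
```

In CI the call would presumably be `create_test_env "$RAPIDS_CONDA_BLD_OUTPUT_DIR" "$RAPIDS_CUDA_VERSION"` without the override, replacing the separate install step.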
I had thought that if we installed all dependencies for the libs first and then force-installed from the pwd/conda-repo channel, conda would install from the local channel. Apparently that's not the case.
Still seeing that conda is pulling from the wrong package channel:
This PR aims to completely separate conda building and package testing into two GitHub jobs. Because building requires only a CPU node while testing requires a GPU node, this reduces GPU usage in CI.