
add runtimeclass nvidia as a default option for nimcache #177

Open
jxdn opened this issue Oct 5, 2024 · 6 comments
Labels
bug Something isn't working

Comments

jxdn commented Oct 5, 2024

Hi,

Can you help add a runtimeClass option to NIMCache and all the other CRDs?

I got this error:

```
Traceback (most recent call last):
  File "/usr/local/bin/download-to-cache", line 5, in <module>
    from vllm_nvext.hub.pre_download import download_to_cache
  File "/usr/local/lib/python3.10/dist-packages/vllm_nvext/hub/pre_download.py", line 20, in <module>
    from vllm_nvext.hub.ngc_injector import get_optimal_manifest_config
  File "/usr/local/lib/python3.10/dist-packages/vllm_nvext/hub/ngc_injector.py", line 22, in <module>
    from vllm.engine.arg_utils import AsyncEngineArgs
  File "/usr/local/lib/python3.10/dist-packages/vllm/__init__.py", line 3, in <module>
    from vllm.engine.arg_utils import AsyncEngineArgs, EngineArgs
  File "/usr/local/lib/python3.10/dist-packages/vllm/engine/arg_utils.py", line 6, in <module>
    from vllm.config import (CacheConfig, DecodingConfig, DeviceConfig,
  File "/usr/local/lib/python3.10/dist-packages/vllm/config.py", line 12, in <module>
    from vllm.model_executor.layers.quantization import QUANTIZATION_METHODS
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/quantization/__init__.py", line 3, in <module>
    from vllm.model_executor.layers.quantization.aqlm import AQLMConfig
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/quantization/aqlm.py", line 11, in <module>
    from vllm._C import ops
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
```
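The failing import bottoms out in `libcuda.so.1`, which is provided by the NVIDIA driver and is only mounted into the container when the NVIDIA container runtime is used. A quick way to check from inside a container whether that library is visible (a minimal sketch, not part of the NIM tooling; the helper name is illustrative):

```python
import ctypes


def libcuda_available() -> bool:
    """Return True if libcuda.so.1 can be dlopen'ed, i.e. the NVIDIA
    driver library was mounted into this container (as the nvidia
    runtime class would do); False otherwise."""
    try:
        ctypes.CDLL("libcuda.so.1")
        return True
    except OSError:
        return False


print(libcuda_available())
```

On a pod running without `runtimeClassName: nvidia` (and without a default NVIDIA runtime on the node), this returns False, which matches the ImportError above.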

jxdn commented Oct 6, 2024

(screenshot: the added line in the nim-operator source)

I added this line, rebuilt the nim-operator, and it works.

@jxdn jxdn changed the title add runtimeclass option for nimcache add runtimeclass nvidia as a default option for nimcache Oct 6, 2024
jxdn commented Oct 6, 2024

This also happens with NIMService; I had to patch it with:

```
kubectl patch deployment meta-llama3-8b-instruct --type='merge' -p='{"spec": {"template": {"spec": {"runtimeClassName": "nvidia"}}}}' -n nim
```
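For reference, the patch above is equivalent to setting `runtimeClassName` at the pod-template level of the Deployment. A minimal sketch of the resulting spec (container name and image are illustrative placeholders, not from the operator):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: meta-llama3-8b-instruct   # deployment name from the patch above
  namespace: nim
spec:
  template:
    spec:
      runtimeClassName: nvidia    # selects the NVIDIA container runtime for these pods
      containers:
        - name: nim               # illustrative; actual container is created by the operator
          image: <NIM image>
```

Note that the operator may reconcile this field away if it is not part of the CRD spec, which is why the issue asks for it as a first-class option.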

mkhaas (Collaborator) commented Oct 6, 2024

Thanks for the suggestion. We'll add it to our backlog. In the meantime, we recommend using a mutating webhook to add the runtime class.
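One way to implement the webhook workaround is a mutating admission policy. An illustrative sketch using Kyverno (this assumes Kyverno is installed in the cluster; the policy name and namespace are placeholders, not something the nim-operator ships):

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-nvidia-runtimeclass
spec:
  rules:
    - name: set-runtime-class
      match:
        any:
          - resources:
              kinds:
                - Pod
              namespaces:
                - nim             # only mutate pods in the NIM namespace
      mutate:
        patchStrategicMerge:
          spec:
            runtimeClassName: nvidia   # injected before the pod is scheduled
```

Because the mutation happens at admission time, it also covers pods created by the operator's Deployments and Jobs without patching each one.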

kirson-git commented

Can I use the patch command for NIMCache?

shivamerla (Collaborator) commented
@jxdn thanks for catching this. We didn't hit this error because nvidia was configured as the default runtime by the gpu-operator. We'll include this in the next patch.

@shivamerla shivamerla self-assigned this Oct 14, 2024
@shivamerla shivamerla added the bug Something isn't working label Oct 14, 2024
shivamerla (Collaborator) commented

This PR should fix NIM Service deployments. For caching, we should not need to specify the "nvidia" runtime class, since that Job can run on a non-GPU node. For the reported issue, the fix should be in the download-to-cache NIM tool, which loads CUDA libraries unnecessarily. I'm going to request a fix there instead.
