Having some problem while testing #13
Comments
Hi, please make sure that the following settings are completed:
After completing the above work, you can install KubeShare
BTW, I don't think it is caused by CUDA version 11.2, because I tried building on CUDA 11.2 and it works. If you still have questions, please provide logs.
Sorry, I've checked the things you mentioned above, but I still get no output.
nvidia-smi:
My sharepod1 & 2
Test command:
My docker version: 20.10.6 And I want to ask if I can add |
Hi @benesse1899
Also we provide you with our settings
When applying sharepod, please provide some logs about KubeShare pod
Hi @StarCoral
I had already done that, but I forgot to show it in my previous reply, sorry about that. Because I only have one GPU in each node (I have two nodes, and my master node doesn't have a GPU), I use this pod to check
Then I go into the pod and write
And the log of
Hi @benesse1899, The annotation |
Hi @StarCoral, system still can get sharepod
and the yaml of pod1
Should I show the info of my three nodes? |
According to the log of I tried the yaml file you provided and it can work normally in our environment. |
Hi @StarCoral, I ran into almost the same problem when I tried to deploy a sharepod with the YAML files in the doc/yaml/ folder. The log of
However, I found error messages in the Kubelet log after I deployed Kubeshare
But
Maybe this is the reason, but I can't solve it.
Hi @y-ykcir, the error Can you see the pod running normally in the default namespace? |
Hi @justin0u0, I can't see the pod running normally in the default namespace. However, it seemed to work after I re-installed Kubernetes and Kubeshare.
I've followed the steps in https://asciinema.org/a/302094,
but after I created the sharePod and ran
kubectl get sharepod pod1 -o yaml | egrep -m2 'GPUID|nodeName'
it didn't show anything. Why?
I don't know whether it is because my CUDA version is 11.2 or something else.
BTW, do I need to run the makefile? I'm not sure if that is necessary.
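For readers hitting the same empty output: the filter in the command above only prints lines that contain GPUID or nodeName, so an empty result means neither string appears in the SharePod's YAML, not that kubectl itself failed. The sketch below reproduces the filter on two made-up YAML fragments (both fragments, their values, and the helper name filter_fields are hypothetical, used only for illustration) to show the two cases without needing a cluster.

```shell
#!/bin/sh
# Hypothetical fragment of a SharePod that has been placed on a GPU and a node.
# The field values are invented for illustration only.
scheduled_yaml='metadata:
  annotations:
    kubeshare/GPUID: abcde
spec:
  nodeName: node-1'

# Hypothetical fragment of a SharePod that carries neither field.
unscheduled_yaml='metadata:
  name: pod1
spec:
  containers: []'

# Same filter as in the issue: print at most 2 lines matching either name.
# grep -E is the portable spelling of egrep.
filter_fields() {
  grep -E -m2 'GPUID|nodeName'
}

# Case 1: both fields exist, so two lines are printed.
printf '%s\n' "$scheduled_yaml" | filter_fields

# Case 2: nothing matches, so grep prints nothing and exits with status 1.
if ! printf '%s\n' "$unscheduled_yaml" | filter_fields; then
  echo "no GPUID or nodeName found in the SharePod YAML"
fi
```

If the second case is what you see on a real cluster, it would suggest that the SharePod was created but never assigned a GPU or node, which is why the maintainers ask for the KubeShare pod logs.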