[k8s] Exec auth support on k8s #4544
base: master
kubernetes:
Syncing (to 1 node): /var/folders/d1/c810pqs51p58jyq4g4d8czbh0000gn/T/tmp5egq4o1h -> ~/.sky/managed_jobs/sky-ddb6-hong-9032.config_yaml
✓ Files synced. View logs at: ~/sky_logs/sky-2025-01-08-19-02-48-871939/file_mounts.log
Auto-stop is not supported for Kubernetes and RunPod clusters. Skipping.
⚙︎ Job submitted, ID: 3
E0108 11:06:08.217324 282258 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
E0108 11:06:08.218067 282258 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
E0108 11:06:08.220005 282258 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
E0108 11:06:08.220567 282258 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
The connection to the server localhost:8080 was refused - did you specify the right host or port?
The job cluster is preempted or failed.
✓ Managed job finished: 3 (status: SUCCEEDED).
The terminal output is caused by executing sudo kubectl get nodes, whereas kubectl get nodes works successfully without sudo.
After checking our codebase, this seems to be related to https://github.com/weih1121/skypilot/blob/885e5279daa3c52b933a796d82c3438b66772f6a/sky/provision/kubernetes/utils.py#L449. @romilbhardwaj @Michaelvll any suggestions?
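One possible explanation for the difference, offered as a guess rather than a confirmed diagnosis: kubectl looks for its config via the KUBECONFIG environment variable and otherwise falls back to $HOME/.kube/config, and when no config is found it tries localhost:8080. Under sudo the environment may be reset (this depends on the sudoers configuration), so the lookup can land on a different file. The sketch below only illustrates that lookup order; `candidate_kubeconfig_paths` is a hypothetical helper, not a function from SkyPilot or kubectl, and the example paths are made up.

```python
import os
from typing import Dict, List


def candidate_kubeconfig_paths(env: Dict[str, str]) -> List[str]:
    """Return the kubeconfig paths that would be consulted for `env`.

    Hypothetical helper for illustration. It mirrors the documented lookup
    order: KUBECONFIG (a path list) first, then $HOME/.kube/config.
    """
    kubeconfig = env.get("KUBECONFIG", "")
    if kubeconfig:
        # KUBECONFIG may hold several paths joined by the OS path separator.
        return [p for p in kubeconfig.split(os.pathsep) if p]
    home = env.get("HOME", "")
    if home:
        return [os.path.join(home, ".kube", "config")]
    # Nothing to look up: kubectl would fall back to its built-in default.
    return []


# Assumed environments: a normal user shell vs. a reset sudo environment.
user_env = {"HOME": "/home/alice", "KUBECONFIG": "/home/alice/cluster.yaml"}
sudo_env = {"HOME": "/root"}

print(candidate_kubeconfig_paths(user_env))
print(candidate_kubeconfig_paths(sudo_env))
```

If the two lists differ and the second file does not exist, that would be consistent with the connection-refused errors on localhost:8080 seen in the log.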
Looks like you're testing a on |
linked ticket: https://linear.app/skypilot/issue/SKY-959/[k8s]-support-exec-based-auth-kubeconfigs-on-controllers
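For context on what "exec-based auth" means in a kubeconfig: a user entry can carry an `exec` section that names an external command to run for credentials, instead of an inline token or certificate. The sketch below shows one way to spot such entries in an already-parsed kubeconfig. It is illustrative only: `users_with_exec_auth` is a hypothetical helper, not code from this PR, and the sample config is invented.

```python
from typing import Any, Dict, List


def users_with_exec_auth(kubeconfig: Dict[str, Any]) -> List[str]:
    """Return the names of kubeconfig users that rely on an exec plugin.

    Hypothetical helper for illustration. `kubeconfig` is the dict obtained
    from parsing a kubeconfig YAML file.
    """
    names = []
    for entry in kubeconfig.get("users") or []:
        user = entry.get("user") or {}
        if "exec" in user:
            names.append(entry.get("name", ""))
    return names


# Invented sample: one exec-based user and one token-based user.
sample = {
    "users": [
        {
            "name": "gke-user",
            "user": {
                "exec": {
                    "apiVersion": "client.authentication.k8s.io/v1beta1",
                    "command": "gke-gcloud-auth-plugin",
                }
            },
        },
        {"name": "token-user", "user": {"token": "REDACTED"}},
    ]
}

print(users_with_exec_auth(sample))
```

Entries found this way need the named command to be installed wherever the kubeconfig is used, which is presumably why controllers need explicit support.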
Tested (run the relevant ones):
bash format.sh
pytest tests/test_smoke.py
pytest tests/test_smoke.py::test_fill_in_the_name
conda deactivate; bash -i tests/backward_compatibility_tests.sh