Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[k8s] Add validation for pod_config #4206 #4466
[k8s] Add validation for pod_config #4206 #4466
Changes from 1 commit
12f1208
64bb66a
a0f29e5
47e724d
699961d
164f436
0471b4c
98a1d84
7e501f3
ef562da
2b5dba1
81029d7
d349c37
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this approach work even if the pod_config is partially specified? E.g.,
My hunch is k8s will reject this pod spec since it's not a complete pod spec, but it's a valid pod_config in our case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, the k8s will reject this pod spec.
if this pod_config is valid in this project. is there any definition about this config? for example: some filed is required or optional? or all the filed is optional here, but it must follow the k8s pod require only if it has been set ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here is my solution about this, we can check the pod config by using k8s api after
combine_pod_config_fields
andcombine_metadata_fields
during launch (that is the early stage of launching.).it's really hard and complex to follow and maintain the k8s pod json/yaml schema in this project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is the definition of a valid pod_spec.
Yes, that sounds reasonable as long as we can surface to the user where the error comes in the user's pod config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we considered having a simple local schema check, with the json schema fetched and flattened from something like https://github.com/instrumenta/kubernetes-json-schema/tree/master?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I took a look at this before. The main problem with this setup is that it needs to grab JSON schema files from other repo eg: https://github.com/yannh/kubernetes-json-schema, depending on which version of k8s user using. I'm not sure if it's a good idea for sky to download dependencies to the local machine while it's running. Plus, if we want to check pod_config locally using JSON schema, we might need to let users choose their k8s version so we can get the right schema file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's try the approach you proposed above (
check the pod config by using k8s api after combine_pod_config_fields and combine_metadata_fields
) if it can surface the exact errors to the users.If that does not work, we may need to do schema validation locally. Pod API has been relatively stable, so might not be too bad to have a fixed version schema for validation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM,
BTW i found a error case when i test the approach with json schema in kubernetes-json-schema.
here is my part of test yaml
note, the name here
local_test
with_
inside, it's invalid when we creating a pod, but will pass the check by json schema.and if we use this config to create sky cluster, it will fail later because the invalid name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI. Here is the output after pod_config check failed during launch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can kubernetes.api_exception() be caused by reasons other than those not related to invalid config (e.g., insufficient permissions)? In that case, the error message is misleading. For example, I ran into this:
Can we filter the exception further and return valid = False only if the failure is due to invalid pod schema?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's hard to filter, here is an alternative implementation (not sure if it works, needs testing):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks ! If
sanitize_for_serialization
works, i think this approach is much better than create with dryrun.