use replicas when initializing pg minResources #4000
/assign @hzxuzhonghu
Same situation as mentioned in this issue:
The proportion plugin controls the queue-related capabilities; if you disable it, the capacity check will not take effect.
Just as @hwdef said, please use an annotation to control the minMember that is set. Other workload types such as StatefulSet should also have this capability :)
Hi @hwdef @Monokaix, thanks for all your suggestions. I have reworked the feature to use a new annotation, so it won't break current behavior. Please have a look.
for _, reference := range pod.OwnerReferences {
    if reference.Kind != "" && reference.Name != "" {
        tmp := pg.getAnnotationsFromUpperRes(reference.Kind, reference.Name, pod.Namespace)
Usually we write the annotation on the Deployment to specify minMember, but here the Pod's ownerReferences can only be traced back to the ReplicaSet.
According to the k8s source code at https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/deployment/util/deployment_util.go#L235, the ReplicaSet inherits annotations from its Deployment, so I think it's enough to get annotations from the RS.
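The lookup discussed in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions: the owner objects are assumed to be available in a local cache keyed by kind and name, and `ownerRef`, `workload`, `annotationFromOwners`, and any annotation key passed in are hypothetical stand-ins, not names taken from this PR or from Kubernetes.

```go
package main

// ownerRef is a simplified stand-in for metav1.OwnerReference (assumption).
type ownerRef struct {
	Kind string
	Name string
}

// workload is a simplified stand-in for a cached ReplicaSet, StatefulSet, etc.
type workload struct {
	Annotations map[string]string
}

// annotationFromOwners returns the value of key from the first owner that
// carries it. Owners with an empty Kind or Name, or owners missing from the
// cache, are skipped.
func annotationFromOwners(owners []ownerRef, cache map[string]workload, key string) (string, bool) {
	for _, ref := range owners {
		if ref.Kind == "" || ref.Name == "" {
			continue
		}
		w, found := cache[ref.Kind+"/"+ref.Name]
		if !found {
			continue
		}
		if value, ok := w.Annotations[key]; ok {
			return value, true
		}
	}
	return "", false
}
```

Because the ReplicaSet is the direct owner of the Pod, a single pass over the owner references is enough in this sketch; no second hop to the Deployment is modelled.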
for _, reference := range pod.OwnerReferences {
    if reference.Kind != "" && reference.Name != "" {
        tmp := pg.getAnnotationsFromUpperRes(reference.Kind, reference.Name, pod.Namespace)
        if err := mergo.Merge(&annotations, &tmp); err != nil {
Do we have scenarios where a Pod has multiple OwnerReferences? Even if there truly are multiple sets of annotations, the minMember annotation won't be specified twice, so do we need to use mergo.Merge?
Normally there won't be multiple OwnerReferences with this annotation. Updated, thanks.
@@ -177,6 +181,37 @@ func (pg *pgcontroller) getAnnotationsFromUpperRes(kind string, name string, nam
    }
}

func (pg *pgcontroller) getMinMemberFromUpperRes(pod *v1.Pod) *int32 {
    defaultMinMember := int32(1)
This variable name is very strange. You can define 1 as a constant named defaultMinMember and initialize the variable here from it. Naming the variable minMember here is ok.
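A minimal sketch of the naming suggested here, assuming the annotation value arrives as a string. The helper name `minMemberOrDefault` is hypothetical and is not part of the PR.

```go
package main

import "strconv"

// defaultMinMember is the fallback used when no valid value is supplied.
const defaultMinMember int32 = 1

// minMemberOrDefault parses raw as a positive 32-bit integer and falls back
// to defaultMinMember when raw is empty, malformed, or not positive.
func minMemberOrDefault(raw string) int32 {
	minMember := defaultMinMember
	if n, err := strconv.ParseInt(raw, 10, 32); err == nil && n > 0 {
		minMember = int32(n)
	}
	return minMember
}
```

Keeping the literal in one constant means callers never need to repeat the default themselves.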
Updated
@@ -193,7 +228,7 @@ func (pg *pgcontroller) inheritUpperAnnotations(pod *v1.Pod, obj *scheduling.Pod
    }
}

-func (pg *pgcontroller) createNormalPodPGIfNotExist(pod *v1.Pod) error {
+func (pg *pgcontroller) createNormalPodPGIfNotExist(pod *v1.Pod, minAvailable *int32) error {
Why are the variable names here not unified as minMember?
Updated
@@ -203,6 +238,11 @@ func (pg *pgcontroller) createNormalPodPGIfNotExist(pod *v1.Pod) error {
        return err
    }

    minMember := int32(1)
There is no need to initialize again here; you already have the initialization value in getMinMemberFromUpperRes.
updated
@@ -327,10 +327,10 @@ func (p TasksPriority) CalcFirstCountResources(count int32) v1.ResourceList {

    for _, task := range p {
        if count <= task.Replicas {
-            minReq = quotav1.Add(minReq, calTaskRequests(&v1.Pod{Spec: task.Template.Spec}, count))
+            minReq = quotav1.Add(minReq, *util.CalTaskRequests(&v1.Pod{Spec: task.Template.Spec}, count))
The function returns a pointer, and then you use * directly to dereference it. Written this way it looks a bit strange and doesn't fit Go's clean-code style. The return value of the func CalTaskRequests does not need to be refactored; keeping the original is better. In fact, when the API was designed, minResources should not have been a pointer, since ResourceList is a map.
Hi @JesseStutler, thanks for your review! I have updated the code, please have a look.
The API-related PR is merged, you can move on :)
Updated go.mod, please have a look @Monokaix @hwdef @JesseStutler
/lgtm
/ok-to-test
"k8s.io/kubernetes/pkg/apis/core/v1/helper" | ||
quotacore "k8s.io/kubernetes/pkg/quota/v1/evaluator/core"
"k8s.io/utils/clock"
)

-func GetPodQuotaUsage(pod *v1.Pod) *v1.ResourceList {
+func GetPodQuotaUsage(pod *v1.Pod) v1.ResourceList {
Why was this changed to a non-pointer?
Just following this comment from @JesseStutler for better style: #4000 (comment)
@@ -172,7 +172,8 @@ func (pg *pgcontroller) processNextReq() bool {

    // normal pod use volcano
    klog.V(4).Infof("Try to create podgroup for pod %s/%s", pod.Namespace, pod.Name)
-    if err := pg.createNormalPodPGIfNotExist(pod); err != nil {
+    minMember := pg.getMinMemberFromUpperRes(pod)
This logic should be put in createNormalPodPGIfNotExist, because when the podgroup already exists there is no need to calculate the min member; doing so requests the APIServer, which is a heavy operation.
There is also a question mentioned here: #3970 (comment)
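The ordering suggested here (check for an existing podgroup first, resolve the min member only on the create path) could look roughly like the sketch below. It uses an in-memory map instead of a real client, and `podGroup`, `podGroupStore`, `createIfNotExist`, and `resolveMinMember` are hypothetical names, not the controller's actual API.

```go
package main

// podGroup is a simplified stand-in for the PodGroup object (assumption).
type podGroup struct {
	Name      string
	MinMember int32
}

// podGroupStore is an in-memory stand-in for the lister and client.
type podGroupStore struct {
	groups map[string]podGroup
}

// newPodGroupStore returns an empty store ready for use.
func newPodGroupStore() *podGroupStore {
	return &podGroupStore{groups: make(map[string]podGroup)}
}

// createIfNotExist creates the pod group only when it is missing and reports
// whether it created one. resolveMinMember stands for the expensive lookup
// and is invoked only on the create path.
func (s *podGroupStore) createIfNotExist(name string, resolveMinMember func() int32) bool {
	if _, exists := s.groups[name]; exists {
		return false
	}
	s.groups[name] = podGroup{Name: name, MinMember: resolveMinMember()}
	return true
}
```

Passing the lookup as a function keeps the cost off the common path where the podgroup already exists.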
very good point 👍 so it's better to keep rs/sts etc informers in pgcontroller and read cache from these informers, right?
Yeah, DaemonSet and Job should also be considered.
Also this should also be considered #3970 (comment)
Also this should also be considered #3970 (comment)
I think it's not necessary to check pod annotation? The annotation is designed for workload, it will be strange to see this in pod
New changes are detected. LGTM label has been removed.
[APPROVALNOTIFIER] This PR is NOT APPROVED
Signed-off-by: sceneryback <[email protected]>
Refactored the code to use informers, please have a look again @hwdef @Monokaix @JesseStutler
Hi, please make the CI happy
When pods are created by a ReplicaSet, the minMember of the created podgroup is always 1, which is not reasonable. Consider this case:
As can be seen from this picture:
![image](https://private-user-images.githubusercontent.com/11677736/410854094-a3e3041d-44d2-4f3d-b948-005ab0e227ca.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk1MTczMTYsIm5iZiI6MTczOTUxNzAxNiwicGF0aCI6Ii8xMTY3NzczNi80MTA4NTQwOTQtYTNlMzA0MWQtNDRkMi00ZjNkLWI5NDgtMDA1YWIwZTIyN2NhLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTQlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjE0VDA3MTAxNlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTczNzQ4YWM2ZTQ1YWM0MzU1NjlmZDg5M2YxMzEyNDIyZmUxMjdmNDkxZTk5MTM2NTA0MzhiMWNiNjEzZGUxNzMmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.T38qe-NC6LJSesTEX_AUg78UkK5Iizq7B8yebfNOVu4)
When a pod's owner specifies replicas, the podgroup should take those replicas into account, just as training-operator does: it creates the pod group based on the resources of all replicas.
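As a rough illustration of the arithmetic behind this proposal, the sketch below multiplies one pod's requests by the number of members. `resourceList` is a simplified stand-in for v1.ResourceList with plain integer quantities, and `minResourcesFor` is a hypothetical helper, not the function used in the PR.

```go
package main

// resourceList is a simplified stand-in for v1.ResourceList (assumption:
// integer quantities, e.g. CPU in millicores).
type resourceList map[string]int64

// minResourcesFor returns the per-pod requests multiplied by minMember.
// The input map is left unchanged.
func minResourcesFor(perPod resourceList, minMember int32) resourceList {
	total := make(resourceList, len(perPod))
	for name, qty := range perPod {
		total[name] = qty * int64(minMember)
	}
	return total
}
```

With a per-pod request of 500 millicores and three replicas, this sketch yields 1500 millicores, whereas a fixed minMember of 1 would reserve only 500.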