Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default upgradeJobResources values are too low and get the UpgradeCRD job OOMKilled #515

Closed
BaudouinH opened this issue Nov 2, 2023 · 5 comments · Fixed by #524
Closed
Assignees
Labels
enhancement New feature or request velero

Comments

@BaudouinH
Copy link

What steps did you take and what happened:

Upgrading the Velero Helm chart from velero-5.1.2 to velero-5.1.3 fails:
the Upgrade CRD job cannot complete due to the pods of the job being OOMKilled when using the chart default resource values.

What did you expect to happen:

The upgrade CRD job using the Chart default resource values should not be OOMKilled.

The output of the following commands will help us better understand what's going on:
(Pasting long output into a GitHub gist or other pastebin is fine.)

Anything else you would like to add:
The job resumed working as intended when I bumped it with the following resource requirements:

upgradeJobResources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 1000m
    memory: 512Mi

I am not sure if the OOMKill is due to my environment or because the default values are too low.

I would gladly do the PR if you want, but I am not sure that the values I provided are the best values.

Also, I often hear that setting CPU limits are an anti-pattern and should be avoided, is there a particular reason you set default CPU limit values in this chart instead of making them optional?

Environment:

  • helm version (use helm version): 3.11.2 (packaged with ArgoCD)
  • helm chart version and app version (use helm list -n <YOUR NAMESPACE>):
    Chart version: velero-5.1.3
    App version: v1.12.1
  • Kubernetes version (use kubectl version):
Client Version: v1.28.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.27.4-gke.900
  • Kubernetes installer & version:
GKE 1.27.4
  • Cloud provider or hardware configuration:
    Google Cloud Platform
  • OS (e.g. from /etc/os-release):
Ubuntu 22.04.3 LTS
@reefland
Copy link

reefland commented Nov 3, 2023

I only had to bump memory a little....

upgradeJobResources:
  requests:
    cpu: 50m
    memory: 128Mi
  limits:
    cpu: 100m
    memory: 256Mi

I have a small 6 node home lab environment and the default values couldn't handle that.

@jenting jenting added enhancement New feature or request velero labels Nov 8, 2023
@jenting
Copy link
Collaborator

jenting commented Nov 10, 2023

How about remove the request/limit default value within values.yaml?
Leave the configuring request/limit value to the user.

@makarov-roman
Copy link

It was added here: #514
I personally do not see any benefits having it.

@reefland
Copy link

It should be commented out and left for the user to enable. If the default is known to be invalid, then bump up the default. No reason to set the default to something known to OOM kill. That's just mean.

@jenting
Copy link
Collaborator

jenting commented Nov 12, 2023

@reefland Would you mind file a PR to address it?

j2L4e added a commit to YAKEcloud/yake that referenced this issue Nov 20, 2023
…something generous that actually works. Revert this once vmware-tanzu/helm-charts#515 is resolved

Signed-off-by: Jan Lohage <[email protected]>
j2L4e added a commit to YAKEcloud/yake that referenced this issue Nov 20, 2023
…limits

.upgradeJobResources defaults are too low (128Mi limit). set them to something generous that actually works. Revert this once vmware-tanzu/helm-charts#515 is resolved (#1168)

Signed-off-by: Jan Lohage <[email protected]>
j2L4e added a commit to YAKEcloud/yake that referenced this issue Nov 20, 2023
…limits

.upgradeJobResources defaults are too low (128Mi limit). set them to something generous that actually works. Revert this once vmware-tanzu/helm-charts#515 is resolved (#1168)

Signed-off-by: Jan Lohage <[email protected]>
j2L4e added a commit to YAKEcloud/yake that referenced this issue Nov 20, 2023
…limits

.upgradeJobResources defaults are too low (128Mi limit). set them to something generous that actually works. Revert this once vmware-tanzu/helm-charts#515 is resolved (#1168)

Signed-off-by: Jan Lohage <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request velero
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants