Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: remove non-transient logs on missing artifact-repositories configmap #13516

Conversation

thor
Copy link
Contributor

@thor thor commented Aug 27, 2024

Fixes #9085.

Motivation

Remove high levels of error messages that aren't errors. Reduce additional observability costs associated with running Argo Workflows, although negligble.

Clarify the difference between side-effect and non-side-effect functions in utility module.

Modifications

Renames IsTransientErr to LogIsTransientErr, which is easily undoable if it's more noise than signal.

Introduces a LogOnlyTransientErr, which only logs errors if they're transient. Suitable for a use-case such as trying to fetch an object that doesn't exist when it's optional.

Verification

No verification performed due to the (deceptive) simplicity of the change.

@thor thor force-pushed the fix/reduce-non-transient-error-logging-for-artifact-repositories branch 3 times, most recently from 358d18d to d8fcfb3 Compare August 28, 2024 10:39
@thor thor marked this pull request as ready for review August 28, 2024 10:52
Copy link
Member

@Joibel Joibel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two new function names LogIsTransientError and LogOnlyIsTransient error imply things about their purpose which aren't really their primary use (logging, not a test).

They also don't immediately make sense - two functions called LogDogName and LogOnlyDogName I wouldn't know which to call to log the dogs name. The function names you've created make sense if you understand the history. The Is part of their name helps, but...

I'd suggest either calling them IsTransientErr and IsTransientErrVerbose, or my slight preference would be IsTransientErrQuiet and IsTransientErr.

Naming things is hard.

util/errors/errors.go Outdated Show resolved Hide resolved
util/errors/errors.go Show resolved Hide resolved
@agilgur5 agilgur5 changed the title fix: remove non-transient missing artifact-repositories configmap errors fix: remove non-transient logs on missing artifact-repositories configmap Aug 28, 2024
@thor
Copy link
Contributor Author

thor commented Aug 30, 2024

The two new function names LogIsTransientError and LogOnlyIsTransient error imply things about their purpose which aren't really their primary use (logging, not a test).

They also don't immediately make sense - two functions called LogDogName and LogOnlyDogName I wouldn't know which to call to log the dogs name. The function names you've created make sense if you understand the history. The Is part of their name helps, but...
[...]
Naming things is hard.

Thanks for the review, @Joibel, much appreciated. I solidly agree, naming things is hard.

I'd suggest either calling them IsTransientErr and IsTransientErrVerbose, or my slight preference would be IsTransientErrQuiet and IsTransientErr.

Considering that the specific scenario wants to ignore non-transient errors only, do I understand you correctly that IsTransientErrQuiet seems appropriate although it won't be quiet, just less verbose?

EDIT: I've assumed that you indeed meant IsTransientErrQuiet was appropriate although it's not fully quiet, re: naming things, and simplified the PR to only switch the lookup of the configuration map artifact-repositories to IsTransientErrQuiet.

@thor thor force-pushed the fix/reduce-non-transient-error-logging-for-artifact-repositories branch from d8fcfb3 to 75edb7a Compare August 30, 2024 11:40
@thor thor requested a review from Joibel August 30, 2024 11:48
@thor thor force-pushed the fix/reduce-non-transient-error-logging-for-artifact-repositories branch from 75edb7a to f9e73b8 Compare September 2, 2024 07:53
@Joibel
Copy link
Member

Joibel commented Sep 2, 2024

EDIT: I've assumed that you indeed meant IsTransientErrQuiet was appropriate although it's not fully quiet, re: naming things, and simplified the PR to only switch the lookup of the configuration map artifact-repositories to IsTransientErrQuiet.

It's Quieter, but in the interests of sane shorter names it felt better to just call it Quiet.

Thanks, for doing this.

Fixes argoproj#9085 by eliminating log-messages for override ConfigMaps which do
not have to exist.

Signed-off-by: Thor K. Høgås <[email protected]>
@thor thor force-pushed the fix/reduce-non-transient-error-logging-for-artifact-repositories branch from f9e73b8 to 2a00d82 Compare September 13, 2024 11:56
@thor
Copy link
Contributor Author

thor commented Sep 13, 2024

@Joibel I don't think the minimal E2E test that's failing here is relevant to the code change. Is that something you could confirm, and if so, merge the PR?

@Joibel Joibel enabled auto-merge (squash) September 13, 2024 13:55
@Joibel
Copy link
Member

Joibel commented Sep 13, 2024

I'll re-run them and see if they pass, they look like a flakey test but I'd like to be sure.

@Joibel Joibel merged commit 8065edb into argoproj:main Sep 13, 2024
27 checks passed
@agilgur5 agilgur5 added the area/artifacts S3/GCP/OSS/Git/HDFS etc label Sep 13, 2024
Joibel pushed a commit to pipekit/argo-workflows that referenced this pull request Sep 19, 2024
Joibel pushed a commit that referenced this pull request Sep 20, 2024
@agilgur5 agilgur5 added this to the v3.5.x patches milestone Sep 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/artifacts S3/GCP/OSS/Git/HDFS etc
Projects
None yet
Development

Successfully merging this pull request may close these issues.

reduce log spamming about artifact-repositories configmap
4 participants