ms-output - do not break out producer and consumer loops #12194
base: master
9a702b2
Should we explicitly set an expiration time for this alert? Without it, what is the default value?
The configuration is here: https://gitlab.cern.ch/cmsmonitoring/cmsmon-configs/-/blob/master/alertmanager/alertmanager.yaml
and we use this function: WMCore/src/python/WMCore/Services/AlertManager/AlertManagerAPI.py (line 32 in 44d2a8a)
so: wmcore, repeat_interval: 2h, then alertmanager will not send the same notification again.
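As a rough illustration of the expiration question above, here is a minimal sketch of an alert payload with an explicit end time, in the shape accepted by Alertmanager's POST /api/v2/alerts endpoint. The helper name build_alert, the 10-minute default, and the tag label key are assumptions made for this example; this is not the WMCore AlertManagerAPI code. When endsAt is left out, Alertmanager falls back to its global resolve_timeout setting.

```python
from datetime import datetime, timedelta, timezone


def build_alert(name, summary, tag="wmcore", ttl=timedelta(minutes=10), now=None):
    """Build one alert dict with an explicit expiration (hypothetical helper).

    `ttl` is an assumed default for illustration only; the real value used
    by WMCore is not shown in this thread.
    """
    now = now or datetime.now(timezone.utc)
    return {
        # Labels identify the alert and are what routes match on.
        "labels": {"alertname": name, "tag": tag},
        "annotations": {"summary": summary},
        # RFC 3339 timestamps; endsAt is when the alert is considered resolved
        # unless it is sent again before then.
        "startsAt": now.isoformat(),
        "endsAt": (now + ttl).isoformat(),
    }
```

The payload would be posted as a JSON list, for example [build_alert("ms-output failure", "producer loop error")]; sending it is left out so the sketch stays free of network calls.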
Thank you, Dario. Where did you get the repeat_interval configuration from?
I am undecided whether we should make the time for re-raising an alert longer or not, for fear of spam.
Would 12h be a better option?
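To put rough numbers on the spam concern, the sketch below counts how many notifications a continuously firing alert would produce in a day for a given repeat interval. It is a simplified model that assumes one notification at the start and one per repeat_interval afterwards, ignoring Alertmanager's group_wait and group_interval; the function name is hypothetical.

```python
from datetime import timedelta


def notifications_in_window(repeat_interval, window=timedelta(hours=24)):
    """Count notifications sent at t = 0, r, 2r, ... strictly before `window`.

    Simplified model: assumes the alert keeps firing for the whole window
    and that nothing else (grouping, silences, inhibition) changes the timing.
    """
    if repeat_interval <= timedelta(0):
        raise ValueError("repeat_interval must be positive")
    # Ceiling division on timedeltas: -(-a // b) rounds up to a whole count.
    return -(-window // repeat_interval)


two_hours = notifications_in_window(timedelta(hours=2))
twelve_hours = notifications_in_window(timedelta(hours=12))
```

Under these assumptions, the 2h setting gives 12 notifications per day and 12h gives 2.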
It's always the same file, here.
We can change to 12h if:
- [...] (dmwm-admins-12h), and we override the default values (for example setting 12h for repeat interval)
- [...] tag, let's say wmcore12h
- [...] wmcore12h and send the alert to dmwm-admins-12h
At this point it would be beneficial to have a broader discussion on our alerts, because we could take this opportunity to improve the situation across the board.
Thank you for these details.
Given that we would have to either change the default repeat_interval value or fork it for dmwm, I would suggest leaving it for one of the monitoring-related issues that we are discussing and considering for Q1. I fully agree that a discussion on that is important, so I suggest keeping it out of these developments.