Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RAID state in trigger #1

Open
stettler opened this issue Jan 8, 2014 · 4 comments
Open

RAID state in trigger #1

stettler opened this issue Jan 8, 2014 · 4 comments

Comments

@stettler
Copy link

stettler commented Jan 8, 2014

Hello,
If the purpose is to fire the trigger when there is a problem with the RAID, you can't use the "clean" or "active" state : An array can be in degraded mode (have a faulty disk) and in "clean" or "active" state at the same time.

If the goal is the fire the trigger when a disk fail, you should use something like :
{#MD_DEVICE} RAID status degraded
{Template MD RAID:mdraid[{#MD_DEVICE},e,"State"].regexp("(degraded|failed)")}=1&{Template MD RAID:mdraid[{#MD_DEVICE},e,"State"].regexp("(resyncing|recovering)")}=0

I don't think that there is a "failed" state but I put it just in case...
I would also add another trigger with "information" severity to know when an array is rebuilding :
{#MD_DEVICE} RAID status building
{Template MD RAID:mdraid[{#MD_DEVICE},e,"State"].regexp("(resyncing|recovering)")}=1

Cheers!

@MichaKersloot
Copy link

I altered the templace to:
{Template MD RAID:mdraid[{#MD_DEVICE},e,"State"].regexp("(^clean$|^active$)")}=0

@linuxsquad
Copy link
Owner

Sorry, I failed to update this module... here is my latest attempt to handle it the failed state and it works:

{Template MD RAID:mdraid[{#MD_DEVICE},e,"State"].regexp("(degraded)")}=1 | {Template MD RAID:mdraid[{#MD_DEVICE},e,"State"].strlen(0)}<5

@stettler
Copy link
Author

stettler commented Jun 3, 2014

Why the "strlen(0)<5" part ?
I think this is enough : {Template MD RAID:mdraid[{#MD_DEVICE},e,"State"].regexp("(degraded)")}=1

Personally, I prefer the expression I first wrote (with the resyncing/recovering part) : Once the problem was acted upon and the array is rebuilding, I prefer not to have the trigger anymore. But fire-up another trigger with a lower severity to inform that the array is currently being rebuilt.

@linuxsquad
Copy link
Owner

Why the "strlen(0)<5" part ?
Error on safe side.. just in case my script spew some random characters

Agreed, your script does provide more status granularity, however I was more concern with catching failing mdraid when I wrote it initially. Since we had very small md partitions on high-perf SSD, (resyncing|recovering) phase flew really fast... almost unnoticeable. Otherwise, it is a welcome addition.

@alperkar alperkar mentioned this issue Dec 29, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants