EBS fault-tolerant filter doesn't filter as expected #7378
-
The fault-tolerant filter just uses Trusted Advisor (checkId H7IgTzjTYb); as far as I can see, the gap is that we don't expose the color-to-snapshot-age mapping that the advisor applies to that check in the console.
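For reference, here is a minimal boto3 sketch (an editorial illustration, not the actual c7n source; it assumes a Business or Enterprise support plan, since the Support API requires one) that dumps the per-volume status and metadata color that check reports:

    import boto3

    # The AWS Support API is only served out of us-east-1.
    support = boto3.client("support", region_name="us-east-1")

    check = support.describe_trusted_advisor_check_result(
        checkId="H7IgTzjTYb", language="en"
    )["result"]

    for res in check.get("flaggedResources", []):
        # Each flagged resource carries a status ("ok", "warning", "error")
        # and a metadata row that includes the Green/Yellow/Red color.
        print(res["status"], res.get("metadata"))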
-
Thanks for the quick response! That checkId helps me better understand what's going on here. It seems to me this filter isn't filtering out fault-tolerant EBS volumes as defined. Can you confirm I'm reading this correctly?

When I run 'aws support describe-trusted-advisor-check-result --check-id H7IgTzjTYb', I see all 7 of my EBS volumes listed in "flaggedResources". Those with snapshots in the last 7 days have "status": "ok" and a metadata entry of "Green". Those without a snapshot have "status": "error" and a metadata entry of "Red". I suspect a volume with a snapshot 7-30 days old would have a different status and a metadata entry of "Yellow", but I don't currently have a snapshot of that age to verify.

The Cloud Custodian class FaultTolerantSnapshots includes every volume in "flaggedResources" in its list of flagged resources, so I get either all of my EBS volumes returned or none of them, depending on the True/False setting of 'tolerant' in the policy. I think there should be a check in there that looks at either the status or the metadata color to properly filter for fault tolerance.
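A minimal sketch of the kind of check being suggested, assuming the mapping above (status "ok"/Green means a recent snapshot): split the flagged resources by status instead of treating them all the same. This is illustrative boto3 code, not a patch against the FaultTolerantSnapshots class itself:

    import boto3

    def split_by_tolerance(check_id="H7IgTzjTYb"):
        # The Support API is only reachable through us-east-1.
        support = boto3.client("support", region_name="us-east-1")
        flagged = support.describe_trusted_advisor_check_result(
            checkId=check_id, language="en"
        )["result"].get("flaggedResources", [])

        tolerant, not_tolerant = [], []
        for res in flagged:
            # "ok" corresponds to the Green entries (snapshot in the last 7 days);
            # "warning"/"error" correspond to Yellow/Red (stale or missing snapshot).
            (tolerant if res["status"] == "ok" else not_tolerant).append(res)
        return tolerant, not_tolerant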
-
I'm attempting to use a basic EBS fault-tolerant policy (policy included below) to identify EBS volumes that do not have a snapshot from the last 7 days. I have tried this with 'custodian run -s output --cache-period 0 getebsfaulttolerant.yml' as well as in a CloudTamer compliance check with the same results.
In my AWS account, I have 7 EBS volumes; 2 of them have snapshots taken in the last 3 days. AWS Trusted Advisor reports "5 of 7 volumes do not have a recent snapshot." However, the custodian policy below returns all 7 EBS volumes. When I ran the same policy with 'tolerant: True', I got 0 volumes returned, so custodian is responding consistently but not as I expected.
Do I have my policy written correctly? Is there something specific that must be done with the snapshots to have custodian recognize the volumes as fault-tolerant?
Please include any sample policy [sanitized ~ no sensitive info or account ids] to indicate what you've tried so far.
policies:
  - name: get-ebs-fault-tolerant   # name assumed; the extracted post does not show it
    resource: aws.ebs
    filters:
      - type: fault-tolerant
        tolerant: False