EC2 describe-instances filter within instance_for_port() function returning multiple instance IDs rather than one #43

Open
alyoamz opened this issue Oct 31, 2024 · 1 comment

alyoamz commented Oct 31, 2024

I have observed the function instance_for_port() returning multiple instance IDs, although my understanding is that it is expected to return a single one. The condition described below returns both nodes instead of one because of the filters used here:

instance=`aws ec2 describe-instances $options --filters "Name=tag-value,Values=${port}" "Name=tag-key,Values=${ec2_tag}" --query 'Reservations[*].Instances[*].InstanceId'`

Imagine a two-node cluster, each with the following set of tags respectively:

node1:
tag-key=pacemaker / tag-value=node1

node2:
tag-key=pacemaker / tag-value=node2
tag-key=whatever / tag-value=node1

I believe the intent of the filter above is to collect the single instance ID for which the tag ${ec2_tag} has the value ${port}. That is not what happens here: the two filters are evaluated independently, so the command returns every instance that has a tag with key ${ec2_tag}, regardless of that tag's value, AND a tag with value ${port}, regardless of that tag's key. Hence, in the scenario above, port=node1 returns both EC2 instances.

This behavior is described in the AWS CLI documentation for the describe-instances operation [1]:

[1] https://awscli.amazonaws.com/v2/documentation/api/2.9.6/reference/ec2/describe-instances.html

tag-key - The key of a tag assigned to the resource. Use this filter to find all resources that have a tag with a specific key, regardless of the tag value.

Example 6: To filter for instances with the specified my-team tag value
The following describe-instances example uses tag filters to scope the results to instances that have a tag with the specified tag value (my-team), regardless of the tag key.
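
The difference between the two filter forms can be simulated locally without calling AWS. The following is only a sketch: the tag table mirrors the two-node example above, and the helper names match_separate and match_pair are made up for illustration and are not part of the plugin or the AWS CLI.

```shell
#!/bin/sh
# Local simulation only: no AWS calls are made.
# Each line is "<instance> <tag-key> <tag-value>", copied from the example above.
tags='node1 pacemaker node1
node2 pacemaker node2
node2 whatever node1'

# Current filter semantics (tag-key + tag-value as separate filters):
# the instance has SOME tag with key $1 and SOME tag, possibly a different
# one, with value $2.
match_separate() {
    printf '%s\n' "$tags" | awk -v k="$1" -v v="$2" '
        { seen[$1] = 1 }
        $2 == k { has_key[$1] = 1 }
        $3 == v { has_val[$1] = 1 }
        END { for (i in seen) if (has_key[i] && has_val[i]) print i }' | sort
}

# Proposed filter semantics (tag:<key>=<value>):
# one single tag must carry both key $1 and value $2.
match_pair() {
    printf '%s\n' "$tags" |
        awk -v k="$1" -v v="$2" '$2 == k && $3 == v { print $1 }' | sort -u
}

echo "separate filters: $(match_separate pacemaker node1 | tr '\n' ' ')"
echo "tag:key filter:   $(match_pair pacemaker node1 | tr '\n' ' ')"
```

With tag=pacemaker and port=node1, the first form matches both node1 and node2, while the second matches only node1.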

If the intent described above is indeed the expected condition, I suggest the following filter instead:

instance=`aws ec2 describe-instances $options --filters "Name=tag:${ec2_tag},Values=${port}" --query 'Reservations[*].Instances[*].InstanceId'`
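
Independently of which filter is used, the caller could also refuse to continue when the lookup does not yield exactly one ID. The helper below is a minimal sketch of such a guard; the name require_single_instance is hypothetical and does not exist in the current plugin.

```shell
#!/bin/sh
# Hypothetical helper, not part of the existing external/ec2 plugin.
# Prints the ID and succeeds only when its argument holds exactly one
# whitespace-separated word; otherwise reports the problem and fails.
require_single_instance() {
    # Intentionally unquoted so the shell splits the value into words
    # (spaces, tabs and newlines all act as separators).
    set -- $1
    if [ "$#" -ne 1 ]; then
        echo "expected exactly one instance ID, got $#: $*" >&2
        return 1
    fi
    printf '%s\n' "$1"
}
```

It could be called right after the describe-instances lookup, for example as instance=$(require_single_instance "$instance") followed by an error exit on failure, so that an ambiguous match aborts the fencing operation instead of stopping more than one instance.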

Looking forward to seeing your comments.

Pull request:

master...alyoamz:cluster-glue:patch-1

alyoamz commented Nov 2, 2024

I performed additional tests and collected the results to demonstrate the behavior (current filter vs proposed filter):

# Scenario

Two EC2 instances with the following tags:

node1:
tag-key=pacemaker / tag-value=node1
tag-key=whatever / tag-value=node2

node2:
tag-key=pacemaker / tag-value=node2

From node1, trigger EC2 STONITH to fence node2:

# Results using existing filter of "Name=tag-value,Values=${port}" "Name=tag-key,Values=${ec2_tag}"

node1:~ # export PATH=$PATH:/usr/share/cluster-glue
node1:~ # time stonith -t external/ec2 profile=cluster tag=pacemaker port=node2 -T off node2
/usr/lib64/stonith/plugins/external/ec2: line 217: [: i-11111111111111111: binary operator expected
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    56  100    56    0     0  56000      0 --:--:-- --:--:-- --:--:-- 56000
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    19  100    19    0     0  19000      0 --:--:-- --:--:-- --:--:-- 19000
external/ec2[5780]: info: status check for i-11111111111111111is running
info: external_run_cmd: '/usr/lib64/stonith/plugins/external/ec2 off node2' output: STOPPINGINSTANCES	i-11111111111111111

info: external_run_cmd: '/usr/lib64/stonith/plugins/external/ec2 off node2' output: CURRENTSTATE	64	stopping

info: external_run_cmd: '/usr/lib64/stonith/plugins/external/ec2 off node2' output: PREVIOUSSTATE	16	running

info: external_run_cmd: '/usr/lib64/stonith/plugins/external/ec2 off node2' output: STOPPINGINSTANCES	i-22222222222222222

info: external_run_cmd: '/usr/lib64/stonith/plugins/external/ec2 off node2' output: CURRENTSTATE	64	stopping

info: external_run_cmd: '/usr/lib64/stonith/plugins/external/ec2 off node2' output: PREVIOUSSTATE	16	running

Connection to aa.bbb.cc.ddd closed by remote host.
Connection to aa.bbb.cc.ddd closed.

^ Observe that both instances have been fenced.
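
The "binary operator expected" message in the output above is consistent with a variable holding two instance IDs being expanded unquoted inside a [ ... ] test. This is an assumption: the plugin source at line 217 is not shown here, so the sketch below only reproduces the general shell behavior with the redacted IDs from the log.

```shell
#!/bin/bash
# Assumption for illustration: the variable holds two IDs, as in the log above.
instance='i-11111111111111111 i-22222222222222222'

# Unquoted: word splitting turns this into a three-argument test
# ([ -z <id1> <id2> ]), which is malformed and fails with an error status.
unquoted_status=0
[ -z $instance ] 2>/dev/null || unquoted_status=$?

# Quoted: the test receives one argument, so it is well-formed and
# simply evaluates to false because the string is non-empty.
quoted_status=0
[ -z "$instance" ] || quoted_status=$?

echo "unquoted: $unquoted_status, quoted: $quoted_status"
```

Quoting the expansion avoids the malformed test, but it does not address the underlying problem that more than one ID was returned.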

# Results using proposed filter of "Name=tag:${ec2_tag},Values=${port}"

node1:~ # export PATH=$PATH:/usr/share/cluster-glue
node1:~ # time stonith -t external/ec2 profile=cluster tag=pacemaker port=node2 -T off node2
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    56  100    56    0     0  56000      0 --:--:-- --:--:-- --:--:-- 56000
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    19  100    19    0     0  19000      0 --:--:-- --:--:-- --:--:-- 19000
external/ec2[3268]: info: status check for i-11111111111111111 is running
info: external_run_cmd: '/usr/lib64/stonith/plugins/external/ec2 off node2' output: STOPPINGINSTANCES	i-22222222222222222

info: external_run_cmd: '/usr/lib64/stonith/plugins/external/ec2 off node2' output: CURRENTSTATE	64	stopping

info: external_run_cmd: '/usr/lib64/stonith/plugins/external/ec2 off node2' output: PREVIOUSSTATE	16	running

external/ec2[3282]: info: status check for i-22222222222222222 is stopping
external/ec2[3568]: info: status check for i-22222222222222222 is stopping
external/ec2[3579]: info: status check for i-22222222222222222 is stopping
external/ec2[3590]: info: status check for i-22222222222222222 is stopping
external/ec2[3601]: info: status check for i-22222222222222222 is stopping
external/ec2[3612]: info: status check for i-22222222222222222 is stopping
external/ec2[3623]: info: status check for i-22222222222222222 is stopping
external/ec2[3634]: info: status check for i-22222222222222222 is stopping
external/ec2[3645]: info: status check for i-22222222222222222 is stopped
external/ec2[3255]: info: Operation off passed

real	0m22.602s
user	0m10.619s
sys	0m1.261s

^ Expected behavior: only one instance is fenced.

NOTE: Instance IDs have been redacted of course.

# Additional information

  • OS release used for testing:
node1:~ # cat /etc/os-release
NAME="SLES"
VERSION="15-SP3"
VERSION_ID="15.3"
PRETTY_NAME="SUSE Linux Enterprise Server 15 SP3"
ID="sles"
ID_LIKE="suse"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:15:sp3"
VARIANT_ID="sles-sap"
  • Involved packages along with version:
node1:~ # rpm -qa | egrep "(aws|glue)"
aws-cli-1.27.89-150200.30.11.1.noarch
libglue2-1.0.12+v1.git.1587474580.a5fda2bc-150000.3.17.1.x86_64
cluster-glue-1.0.12+v1.git.1587474580.a5fda2bc-150000.3.17.1.x86_64
