Auto scaler (ASG) kills builders in artifacts pipelines #9493

Open
1 of 2 tasks
fruch opened this issue Dec 8, 2024 · 1 comment
@fruch
Contributor

fruch commented Dec 8, 2024

Packages

Issue description

  • This issue is a regression.
  • It is unknown if this issue is a regression.

4 artifact runs in the same job failed when the builder was killed while running 4 artifacts jobs at the same time.

Impact

Describe the impact this issue causes to the user.

How frequently does it reproduce?

Describe the frequency with how this issue can be reproduced.

Installation details

Cluster size: 1 nodes (im4gn.8xlarge)

Scylla Nodes used in this run:

  • artifacts-ami-jenkins-db-node-5d2b1b1f-1 (18.205.1.166 | 10.12.0.197) (shards: -1)

OS / Image: ami-085bfd76efe4fdd62 (aws: undefined_region)

Test: artifacts-ami-arm-test
Test id: 5d2b1b1f-6091-45d7-ae16-e1c8634522c5
Test name: enterprise-2024.2/artifacts/artifacts-ami-arm-test
Test method: artifacts_test
Test config file(s):

Logs and commands
  • Restore Monitor Stack command: $ hydra investigate show-monitor 5d2b1b1f-6091-45d7-ae16-e1c8634522c5
  • Restore monitor on AWS instance using Jenkins job
  • Show all stored logs command: $ hydra investigate show-logs 5d2b1b1f-6091-45d7-ae16-e1c8634522c5

Logs:

No logs captured during this run.

Jenkins job URL
Argus

@fruch
Contributor Author

fruch commented Dec 8, 2024

It seems we have a lot of cases like:

instance was taken out of service in response to an EC2 health check indicating it has been terminated or stopped.
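
To count how often this cause shows up, the scaling activity records could be filtered on that phrase. This is only a rough sketch: it assumes the records are dicts with `Cause` and `Description` keys (the shape boto3's `describe_scaling_activities` returns under `Activities`), and the sample records, timestamps, and instance ids below are made up for illustration.

```python
# Phrase copied from the scaling activity message quoted above.
HEALTH_CHECK_PHRASE = "taken out of service in response to an EC2 health check"


def health_check_terminations(activities):
    """Return the activities whose Cause mentions an EC2 health check."""
    return [a for a in activities if HEALTH_CHECK_PHRASE in a.get("Cause", "")]


# Hypothetical sample records, for illustration only.
sample_activities = [
    {
        "Description": "Terminating EC2 instance: i-0aaa",
        "Cause": "At 2024-12-08T10:00:00Z an instance was taken out of service "
                 "in response to an EC2 health check indicating it has been "
                 "terminated or stopped.",
    },
    {
        "Description": "Launching a new EC2 instance: i-0bbb",
        "Cause": "At 2024-12-08T10:01:00Z an instance was launched in response "
                 "to an unhealthy instance needing to be replaced.",
    },
]

for activity in health_check_terminations(sample_activities):
    print(activity["Description"])
```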

Trying to set the health check grace period to zero; let's see if it happens again.
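
For reference, one way the grace period change might be scripted is with boto3's `update_auto_scaling_group`, which takes a `HealthCheckGracePeriod` in seconds. This is a sketch under assumptions, not how the change was actually made: the ASG name `sct-builders-asg` and the `us-east-1` default region are hypothetical placeholders, and `apply_update` needs boto3 plus AWS credentials, so it is defined but not called here.

```python
def grace_period_update(group_name, seconds=0):
    """Build keyword arguments for boto3's update_auto_scaling_group call."""
    if seconds < 0:
        raise ValueError("grace period must be non-negative")
    return {"AutoScalingGroupName": group_name, "HealthCheckGracePeriod": seconds}


def apply_update(params, region_name="us-east-1"):
    """Send the update to AWS; requires boto3 and valid credentials."""
    import boto3  # imported lazily so the builder above works without it

    client = boto3.client("autoscaling", region_name=region_name)
    client.update_auto_scaling_group(**params)


# Hypothetical ASG name; replace with the real builders group.
params = grace_period_update("sct-builders-asg")
print(params)
```

Keeping the parameter building separate from the AWS call makes it possible to review the exact request before anything is changed on the group.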
