Add a resilience test suite to cover the expected behavior of the data plane for both existing and new proxy instances during specific edge conditions #4861

Closed
alexwo opened this issue Dec 6, 2024 · 3 comments

Comments

@alexwo
Contributor

alexwo commented Dec 6, 2024

Description:
Consider adding a resilience test suite that covers the expected behavior of the data plane, for both existing and new proxy instances, during specific edge conditions such as API server unavailability, Envoy Gateway (EG) downtime, or EG leader failures.

  1. Ensuring Continuous Operation:
    By testing scenarios where the control plane components are unavailable, we can verify that existing and new proxy instances maintain their functionality.

  2. Validating Last Known Good (LKG) State Persistence:
    Assessing how long the LKG state persists in active EG/Envoy instances during disruptions.

  3. Preventing Premature Readiness of Unconfigured Instances:
    Ensuring that unconfigured Envoys/EGs do not reach a ready state prematurely prevents potential misrouting or service failures, thereby maintaining the integrity of the data plane.

  4. Reflecting Real-World Scenarios:
    Running EG in a production-like setting with leader election enabled and multiple instances ensures that the tests accurately represent real-world operations, providing confidence in the system’s resilience when things go wrong.
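
The goals above can be read as assertions on proxy state before, during, and after a control plane outage. Below is a minimal sketch in Python that uses a toy in-memory model rather than a real cluster. `ControlPlane`, `Proxy`, and `run_outage_scenario` are hypothetical names chosen for illustration; they are not part of Envoy Gateway or its test framework, and a real suite would drive an actual deployment instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ControlPlane:
    """Hypothetical stand-in for the EG control plane (not a real API)."""
    available: bool = True
    config_version: int = 1

    def fetch_config(self) -> Optional[int]:
        # Return the current config version, or None while unavailable.
        return self.config_version if self.available else None


@dataclass
class Proxy:
    """Hypothetical stand-in for an Envoy proxy instance."""
    last_known_good: Optional[int] = None

    def sync(self, cp: ControlPlane) -> None:
        # Keep the last known good (LKG) config if the control plane is down.
        cfg = cp.fetch_config()
        if cfg is not None:
            self.last_known_good = cfg

    @property
    def ready(self) -> bool:
        # A proxy that never received config must not report ready.
        return self.last_known_good is not None


def run_outage_scenario() -> dict:
    cp = ControlPlane()
    existing = Proxy()
    existing.sync(cp)        # configured before the outage

    cp.available = False     # simulate EG downtime or API server unavailability
    existing.sync(cp)        # sync attempt fails; LKG should survive
    new = Proxy()
    new.sync(cp)             # started during the outage; never configured
    during = {
        "existing_ready": existing.ready,
        "existing_lkg": existing.last_known_good,
        "new_ready": new.ready,
    }

    cp.available = True      # control plane recovers with a newer config
    cp.config_version = 2
    existing.sync(cp)
    new.sync(cp)
    after = {
        "existing_lkg": existing.last_known_good,
        "new_ready": new.ready,
        "new_lkg": new.last_known_good,
    }
    return {"during": during, "after": after}
```

In this model, the `during` snapshot corresponds to items 1 to 3 (the existing proxy keeps serving from its LKG state, while the new proxy stays unready), and the `after` snapshot checks that both proxies converge once the control plane returns. Item 4, leader election with multiple EG instances, is not modeled here.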

@alexwo alexwo added the triage label Dec 6, 2024
@alexwo alexwo changed the title from "Add ש resilience test suite to cover the expected behavior of the data plane for both existing and new proxy instances during specific edge conditions" to "Add a resilience test suite to cover the expected behavior of the data plane for both existing and new proxy instances during specific edge conditions" Dec 6, 2024
@alexwo
Contributor Author

alexwo commented Dec 7, 2024

We can also add a test case to cover #4845 (comment) once it is fixed.


github-actions bot commented Jan 6, 2025

This issue has been automatically marked as stale because it has not had activity in the last 30 days.

@github-actions github-actions bot added the stale label Jan 6, 2025
@arkodg
Copy link
Contributor

arkodg commented Jan 6, 2025

can this be closed @alexwo ?

@alexwo alexwo closed this as completed Jan 6, 2025