Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support LB priority for endpoints #3055

Closed
guydc opened this issue Mar 29, 2024 · 23 comments
Closed

Support LB priority for endpoints #3055

guydc opened this issue Mar 29, 2024 · 23 comments
Assignees
Labels
area/api API-related issues
Milestone

Comments

@guydc
Copy link
Contributor

guydc commented Mar 29, 2024

Description:
Envoy supports routing priorities:

  • When load balancing traffic, envoy uses healthy endpoints (as determined by configured health checks).
  • Users can assign priorities to endpoints, determining which healthy endpoints are preferred for load balancing.

This allows users to configure a failover mechanism from primary to secondary endpoints.

Envoy Gateway currently configures a default priority for all endpoints:

LB Priority can be implemented by extending existing Gateway-API resource:

  • BackendRefs may support a priority field, similar to the existing weight field.
  • The BackendLBPolicy resource may be extended to support additional LB considerations, such as priorty.

Other options:

  • EG can support this as an additional attribute of the extended BackendRef proposed in Support cluster settings for non-xRoute BackendRefs #3069
  • EG can support a custom BackendRef target, which will include a priority field, such as the proposed Backend API: API: Backend #3063
  • EG BackendTrafficPolicy can be used to assign priorities. For example, the policy will hold a map of BackendRefs to priorities. Or, the policy can attach directly to Service resources.
@guydc guydc added the area/api API-related issues label Mar 29, 2024
@arkodg
Copy link
Contributor

arkodg commented Apr 1, 2024

ideally prefer if this was in upstream
relates to kubernetes-sigs/gateway-api#2304 & kubernetes-sigs/gateway-api#1802

@arkodg
Copy link
Contributor

arkodg commented Apr 3, 2024

if we add into field into non xRoute backendRefs, we may need to pluraize existing instantiations to backendRefs

@guydc
Copy link
Contributor Author

guydc commented Apr 3, 2024

if we add into field into non xRoute backendRefs, we may need to pluralize existing instantiations to backendRefs

Yes, also ext-proc. I don't object, and this is in-line with gateway-api. Do you think that we should consider a different intermediate approach?

@arkodg
Copy link
Contributor

arkodg commented Apr 4, 2024

I'd vote to pluraize the backendRefs for non xRoute asap while we are in alpha, which can be a good place to test the priority / standby use case

Copy link

github-actions bot commented May 4, 2024

This issue has been automatically marked as stale because it has not had activity in the last 30 days.

@github-actions github-actions bot added the stale label May 4, 2024
@modatwork
Copy link

This could be used to solve #1909, though it may require further refinement. Specifically, the Envoy Gateway should allocate distinct priority levels to the same upstream endpoint, varying by the Envoy Pod’s location in different availability zones. For instance, for an identical endpoint in zone A, the Envoy Pod situated in zone A should be assigned priority 0, whereas Envoy Pods in zones B and C should be assigned priority 1.

Do you think Envoy Gateway will handle this? Or leave it to the user?

@github-actions github-actions bot removed the stale label May 9, 2024
@guydc
Copy link
Contributor Author

guydc commented May 15, 2024

Do you think Envoy Gateway will handle this? Or leave it to the user?

We briefly discussed this scenario in the community meeting. It seems like this will require pushing different config to different envoy proxies based on zone information. For the time being, we'll focus on supporting the simpler use case described above.

@aoledk
Copy link
Contributor

aoledk commented May 24, 2024

As mentioned in #1909 (comment), Envoy in different zones may use same backend endpoint as different priority to reduce cross-zone cost.

It's a required feature for production adoption. Does community have roadmap on this feature ?

@aoledk
Copy link
Contributor

aoledk commented May 24, 2024

#1909 (comment) says:

thanks for outlining the steps @aoledk ! we currently have #3055 open to get explicit priority per backendRef and program that into the xds cluster resource.

In the future, we can use this issue to make sure we track the auto priority work, the field in k8s preferClose could be the knob for users to say they want to opt in to this feature

@arkodg One backendRef (mainly Service) will refer to endpoints from multiple zones, priority should not be applied on backendRef, unless we use selector to collect endpoints in same zone into single backendRef.

Besides, same endpoint will have different priority in Envoy.

So maybe we can't assign a fixed priority for backendRef, codes will be needed to assign right priority for specific Envoy.

@arkodg
Copy link
Contributor

arkodg commented May 24, 2024

there are two use cases for priority

  • the one you've highlighted where zones represent physical locations (done for HA purposes at the node level) so location aware routing reduces overall latency as well as cost (cross zone or region can be billed by cloud provider)
  • creating a primary and backup backendRef for HA cases (at the app/backend level)

@aoledk
Copy link
Contributor

aoledk commented May 30, 2024

@arkodg Can we add a new xDS Hook API, like PostEndpointModify(ClusterLoadAssignment, Node) which allow extension server to assign endpoint priority for individual envoy. Leave these tasks to EG adopters themselves.

@aoledk
Copy link
Contributor

aoledk commented Jun 7, 2024

hey @aoledk thanks for driving this work, before we implement this, we as a community need to align on what the feature is, is it

  • Auto Zone Aware Routing where a user sets a global field to enable this, and envoy proxy prioritizes endpoints in the same zone based on its zone ?
    Open Questions:

    • is this granular enough ? there are only 2 levels here - local or not local
  • Priorites for BackendRefs - Prioritizes one backendRef over another backendRef within the same HTTP Rule

@arkodg I've quoted your #3538 (comment) here to continue discussion about priority. I'll freeze #3542 until we reach agreement.


Assuming two cases demonstrated in image below.

Case A: priority isn't in gateway-api, EG will generate #1, #2 LocalityLbEndpoints.

Case B: priority is in gateway-api, EG will generate #3, #4, #5, #6 LocalityLbEndpoints.

Priority

Code in xds/cache will be used to modify priority for individual envoy based on envoy's running topology.

for _, node := range s.getNodeIDs(irKey) {
s.log.Debugf("Generating a snapshot with Node %s", node)
xdsSnapshotUpdateTotal.With(nodeIDLabel.Value(node)).Increment()
if err = s.SetSnapshot(context.TODO(), node, snapshot); err != nil {

We can have 2 options to enable topology aware routing ( NOT envoy zone aware routing ) based on priority.

Option 1

Add knob into BackendTrafficPolicy to enable topology aware routing for affected BackendRefs.

If we enable in Case A, #1, #2 LocalityLbEndpoints for envoy running in zone a will be modified by xds/cache like this:

  • #1 P=0
  • #2 P=1

If we enable in Case B, #3, #4, #5, #6 LocalityLbEndpoints for envoy running in zone a will be modified by xds/cache like this:

  • #3 P=0
  • #4 P=1
  • #5 P=2
  • #6 P=3

But this option has several challenges.

Challenge 1

The new knob is accessible in xds/translator, but xds/translator only pass xds to xds/server, then to xds/cache, a new communication channel need to be setup to pass this particular knob to xds/cache. It will add additional dependency between those modules.

Challenge 2

Should we allow user to enable topology aware routing for BackendRefs specified with priority ?

If yes, that means xds/cache will modify LocalityLbEndpoints's priority specified by user.

Option 2

Add knob into EnvoyGateway to enable topology aware routing for all BackendRefs.

If enabled, #1, #2, #3, #4, #5, #6 LocalityLbEndpoints for envoy running in zone a will be modified by xds/cache like this:

  • #1 P=0
  • #2 P=1
  • #3 P=0
  • #4 P=1
  • #5 P=2
  • #6 P=3

Challenge 1

Same to Option 1's Challenge 2.


is this granular enough ? there are only 2 levels here - local or not local

LocalityLbEndpoints can be set with region / zone / sub_zone:

  • zone can be retrieved from EndpointSlice directly.
  • region can be retrieved like this: use nodeName in EndpointSlice to query Node, retrieve region from Node topology.kubernetes.io/region label.
  • sub_zone` can't be retrieved.

Envoy can be set with region / zone / sub_zone.

  • Since Pod downward API doesn't expose Pod zone, region and zone can be retrieved like this: use nodeName via Pod downward API to query Node, retrieve region and zone from Node topology.kubernetes.io/region and topology.kubernetes.io/zone labels.
  • sub_zone can't be retrieved.

So region can be used as level bigger than zone technically, we can discuss whether we need to use region on topology aware routing.

@arkodg
Copy link
Contributor

arkodg commented Jun 7, 2024

thanks for the detailed analysis @aoledk !

  1. looks like the "priority is in gateway api backendRef" case is simpler and straightforward and doesnt need zone based endpoint aggregation

Auto Zone Aware Routing
2) if we okay with 2 level priorities then envoy's inherent zone aware routing feature is the way to go
Pros

  • easier to build / less complex to maintain and debug
    Cons
  • only 2 levels of priority - local to envoy or not
  1. if we want multi level priority based on region / zone / sub zone, we'll have to key EDS o/p in the control plane based on envoy locality and start maintaining this extra dimension
    Pros
  • more levels of priority
    Cons
  • need to maintain per locality endpoint cache

As next steps, here is what I propose

  1. Decide on which feature we want to prioritize first or both ? This is a great roadmap item for v1.2 (Oct 2024)
  2. For Zone Aware Routing gather more feedback on how many levels of priority / locality we want to support

Copy link

github-actions bot commented Jul 8, 2024

This issue has been automatically marked as stale because it has not had activity in the last 30 days.

@github-actions github-actions bot added the stale label Jul 8, 2024
@guydc guydc added this to the v1.2.0 milestone Jul 23, 2024
@github-actions github-actions bot removed the stale label Jul 23, 2024
@alexwo
Copy link
Contributor

alexwo commented Aug 11, 2024

thanks for the detailed analysis @aoledk !

  1. looks like the "priority is in gateway api backendRef" case is simpler and straightforward and doesnt need zone based endpoint aggregation

Auto Zone Aware Routing 2) if we okay with 2 level priorities then envoy's inherent zone aware routing feature is the way to go Pros

  • easier to build / less complex to maintain and debug
    Cons
  • only 2 levels of priority - local to envoy or not
  1. if we want multi level priority based on region / zone / sub zone, we'll have to key EDS o/p in the control plane based on envoy locality and start maintaining this extra dimension
    Pros
  • more levels of priority
    Cons
  • need to maintain per locality endpoint cache

As next steps, here is what I propose

  1. Decide on which feature we want to prioritize first or both ? This is a great roadmap item for v1.2 (Oct 2024)
  2. For Zone Aware Routing gather more feedback on how many levels of priority / locality we want to support

1 sounds straight forward and can bring value, we can extend this struct with an endpoint priority field:

type BackendRef struct {

Similar to how gwapiv1.BackendObjectReference has "weight". Then, translate this value to an envoy endpoint priority:
https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/endpoint/v3/endpoint_components.proto#envoy-v3-api-msg-config-endpoint-v3-localitylbendpoints

We can also have an E2E test for this, to demonstrate that when a high-prio backend is down, traffic is steered to the low prio backend.

This will only work for "technical" backends at the moment, like ext-auth/proc/observability sinks ...

If the direction sounds good, I can get started with adding it?

@arkodg
Copy link
Contributor

arkodg commented Aug 12, 2024

hey @alexwo thanks for lookiing into this !
im hoping we can weigh in pros/cons of an explicit priority field vs adding another field like backupBackendRef

note for @aoledk who is looking into Zone Aware Routing, priority routing is mutually exclusive to it, and EG should be able to do zone aware routing for P=0 Hosts

@arkodg
Copy link
Contributor

arkodg commented Aug 12, 2024

GCP expresses failover using a bool failover field inside backend definition
https://cloud.google.com/compute/docs/reference/rest/v1/backendServices

@alexwo
Copy link
Contributor

alexwo commented Aug 13, 2024

GCP expresses failover using a bool failover field inside backend definition https://cloud.google.com/compute/docs/reference/rest/v1/backendServices

Sure, we can simplify and offer the same level of flexibility as done in GCP. So that via the API layer we expose just a boolean and manage as priority at the translation layer, allowing for a priority of: primary / backup (0,1).

@arkodg
Copy link
Contributor

arkodg commented Aug 13, 2024

nginx has a backup keyword It uses https://nginx.org/en/docs/http/ngx_http_upstream_module.html

I'm a +1 for the boolean option

also @alexwo another design decision is around when does an endpoint get removed from the load balancing pool and when does it get re added ? is the user responsible for setting up healthCheck (active or passive) which results in endpoints being removed/readded from the pool ?

@guydc
Copy link
Contributor Author

guydc commented Aug 13, 2024

+1 for a boolean as well.

when does an endpoint get removed from the load balancing pool and when does it get re added

@alexwo
Copy link
Contributor

alexwo commented Aug 14, 2024

Sounds good @arkodg / @guydc,

I have updated: #4033 to reflect the “designated failover backend” boolean approach.

arkodg added a commit to arkodg/gateway that referenced this issue Aug 22, 2024
Adds a `failover` field to Backend API so we can support Active/Passive
Failove backends within xRoutes similar to envoyproxy#4033

Relates to envoyproxy#3055

Signed-off-by: Arko Dasgupta <[email protected]>
arkodg added a commit that referenced this issue Aug 27, 2024
* [api] Add Failover field to Backend

Adds a `failover` field to Backend API so we can support Active/Passive
Failove backends within xRoutes similar to #4033

Relates to #3055

Signed-off-by: Arko Dasgupta <[email protected]>

* fix doc string

Signed-off-by: Arko Dasgupta <[email protected]>

* notImplementedHide

Signed-off-by: Arko Dasgupta <[email protected]>

---------

Signed-off-by: Arko Dasgupta <[email protected]>
Copy link

This issue has been automatically marked as stale because it has not had activity in the last 30 days.

@github-actions github-actions bot added the stale label Sep 13, 2024
arkodg added a commit to arkodg/gateway that referenced this issue Oct 16, 2024
zirain pushed a commit that referenced this issue Oct 18, 2024
* feat: implement fallback for the Backend API

Relates to #3055

Signed-off-by: Arko Dasgupta <[email protected]>

* fix lint

Signed-off-by: Arko Dasgupta <[email protected]>

---------

Signed-off-by: Arko Dasgupta <[email protected]>
@arkodg
Copy link
Contributor

arkodg commented Oct 18, 2024

docs and e2e is left, moving this to the v1.2.0 milestone

@arkodg arkodg modified the milestones: v1.2.0-rc1, v1.2.0 Oct 18, 2024
@github-actions github-actions bot removed the stale label Oct 18, 2024
@arkodg arkodg closed this as completed Oct 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/api API-related issues
Projects
None yet
Development

No branches or pull requests

5 participants