Support active/passive failover between backend services #13643

## Description
Add support for active/passive (primary/fallback) failover between backend services. Users should be able to designate one backend as primary and another as a fallback that only receives traffic when the primary's health degrades, with automatic recovery back to the primary when it becomes healthy again.
This has been requested in #13507.
## Motivation
Currently, when an HTTPRoute references multiple `backendRefs`, they are translated as weighted clusters — traffic is split proportionally. There is no way to express "only use backend B if backend A is unhealthy."
## Proposed Approach
Use Envoy's [priority-based load balancing](https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/load_balancing/priority). Endpoints from both the primary and fallback backends would be placed in the same Envoy cluster but in different `LocalityLbEndpoints` groups with different priority levels:
- **Primary backend endpoints** → `priority: 0`
- **Fallback backend endpoints** → `priority: 1`
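
To make the grouping concrete, here is a minimal sketch of the shape such a load assignment could take. The field names (`cluster_name`, `endpoints`, `priority`, `lb_endpoints`, `socket_address`, `port_value`) follow Envoy's `ClusterLoadAssignment` message. The helper `build_load_assignment`, the cluster name, and the addresses are hypothetical and do not reflect how kgateway's translator is actually structured.

```python
import json


def build_load_assignment(cluster_name, primary, fallback):
    """Build a ClusterLoadAssignment-shaped dict with one endpoint group per priority.

    `primary` and `fallback` are lists of (host, port) tuples. This only
    illustrates the output shape; it is not kgateway's translation code.
    """

    def group(endpoints, priority):
        # One LocalityLbEndpoints entry holding every endpoint of one backend.
        return {
            "priority": priority,
            "lb_endpoints": [
                {
                    "endpoint": {
                        "address": {
                            "socket_address": {"address": host, "port_value": port}
                        }
                    }
                }
                for host, port in endpoints
            ],
        }

    return {
        "cluster_name": cluster_name,
        # Primary endpoints get priority 0, fallback endpoints get priority 1.
        "endpoints": [group(primary, 0), group(fallback, 1)],
    }


# Hypothetical addresses, for illustration only.
assignment = build_load_assignment(
    "example-failover-cluster",
    primary=[("10.0.0.1", 8080), ("10.0.0.2", 8080)],
    fallback=[("10.0.1.1", 8080)],
)
print(json.dumps(assignment, indent=2))
```

Both groups live under a single `cluster_name`, which is why the two backends end up sharing cluster-level settings.
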
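
Envoy's built-in priority load balancing handles the rest: all traffic goes to priority-0 endpoints while enough of them are healthy, and spills over to priority-1 only when the healthy fraction of priority-0 drops below a threshold. The threshold is controlled by the overprovisioning factor: at the default of 1.4, spillover begins once fewer than roughly 71% (1/1.4) of priority-0 endpoints are healthy, and the share sent to priority-1 grows as priority-0 health declines further.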
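
The spillover arithmetic can be sketched as follows. This is a simplified model of the formula in Envoy's priority-level documentation, not Envoy's implementation: it ignores degraded hosts, panic mode, endpoint weights, and the redistribution of rounding remainders. The function name `priority_loads` is made up for this example. The overprovisioning factor is given as an integer percentage (140 for 1.4), matching how Envoy represents it.

```python
def priority_loads(healthy, total, overprovisioning_factor=140):
    """Return the percentage of traffic each priority level would receive.

    healthy[i] and total[i] are the healthy and total endpoint counts for
    priority i. Simplified model: no degraded hosts, no panic mode.
    """
    # Per-priority health, boosted by the overprovisioning factor, capped at 100.
    health = [
        min(100, overprovisioning_factor * h // t) if t else 0
        for h, t in zip(healthy, total)
    ]
    normalized_total = min(100, sum(health))
    if normalized_total == 0:
        # Nothing is healthy; Envoy's panic handling is not modeled here.
        return [0] * len(health)

    loads, remaining = [], 100
    for h in health:
        # Each priority takes what its health allows, limited by what is left.
        load = min(remaining, h * 100 // normalized_total)
        loads.append(load)
        remaining -= load
    return loads


# Primary (priority 0) has 10 endpoints; fallback (priority 1) is fully healthy.
for healthy_primary in (10, 8, 7, 5, 0):
    print(healthy_primary, priority_loads([healthy_primary, 10], [10, 10]))
# With 5 of 10 primary endpoints healthy this gives [70, 30].
```

In this model the primary keeps all traffic at 8 of 10 healthy endpoints, and the fallback starts receiving a small share at 7 of 10.
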
Health checks (active or passive via outlier detection) are required for failover detection and automatic recovery.
### API Options
A few options for how to expose this in the kgateway API:
1. **Annotation or field on `backendRefs`**: e.g., a `fallback: true` annotation or a new field on the backendRef that marks it as a fallback destination. This works for both Backends and Kubernetes resources.
2. **New policy CRD field**: e.g., a field on `TrafficPolicy` or `BackendConfigPolicy` that designates failover relationships between services. A policy can apply to both Backends and Kubernetes resources.
3. **Extension to the `Backend` CRD**: a `fallback` field on the Backend resource itself. The drawback is that we might need to add a Kubernetes type to the Backend for this to work, which would change the UX for users who already reference Kubernetes resources directly in routes.
### Considerations
- Weighted routing and failover are mutually exclusive semantics for the same `backendRefs` list — the API should make this clear.
- Both primary and fallback backends share the same Envoy cluster, so they share cluster-level settings (timeouts, circuit breakers, LB algorithm). This is a limitation of the priority-based approach, and the API we choose should reflect it.
- Users should be strongly encouraged to configure health checks or outlier detection alongside failover; otherwise there is no mechanism to detect backend failure.
- The Gateway API spec does not currently have a standard for this pattern, so this would be a kgateway extension. It might be a good topic for a GEP (Gateway Enhancement Proposal).
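
As an illustration of the first consideration, the check below sketches how a translator might reject a `backendRefs` list that mixes weights with failover. It assumes API option 1 with a hypothetical boolean `fallback` key on each ref. Neither that key nor the function `classify_backend_refs` exists in kgateway today, and refs are modeled as plain dicts for brevity.

```python
def classify_backend_refs(backend_refs):
    """Split refs into (primaries, fallbacks), rejecting ambiguous mixes.

    Each ref is a dict with a "name", an optional "weight", and an optional
    hypothetical "fallback" flag (assumed for this sketch only).
    """
    fallbacks = [r for r in backend_refs if r.get("fallback", False)]
    primaries = [r for r in backend_refs if not r.get("fallback", False)]

    if not fallbacks:
        # No failover requested: existing weighted routing applies unchanged.
        return primaries, []
    if not primaries:
        raise ValueError("failover needs at least one non-fallback backendRef")
    if any("weight" in r for r in backend_refs):
        raise ValueError("weight and fallback cannot be combined in one backendRefs list")
    return primaries, fallbacks


primaries, fallbacks = classify_backend_refs(
    [{"name": "svc-a"}, {"name": "svc-b", "fallback": True}]
)
print([r["name"] for r in primaries], [r["name"] for r in fallbacks])
```

Failing loudly on a mixed list is one possible design. Silently ignoring weights would be another, but it would hide the conflict from users.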
