Skip to content

reverse_tunnel: observe peer GOAWAY in drain-aware HCM and re-dial the tunnel#2

Open
ivpr wants to merge 4 commits into
reverse-tunnel-acceptor-drain-goawayfrom
reverse-tunnel-drain-aware-hcm-goaway-observer
Open

reverse_tunnel: observe peer GOAWAY in drain-aware HCM and re-dial the tunnel#2
ivpr wants to merge 4 commits into
reverse-tunnel-acceptor-drain-goawayfrom
reverse-tunnel-drain-aware-hcm-goaway-observer

Conversation

@ivpr

@ivpr ivpr commented May 27, 2026

Copy link
Copy Markdown
Owner

Commit Message: When a reverse-tunnel accepted by the drain-aware HCM filter receives a peer-initiated HTTP/2 GOAWAY, drop the tunnel from the ReverseConnectionIOHandle's tracking and kick the maintenance loop so a replacement tunnel is dialed before the original TCP socket closes. Gated behind the same opt-in runtime guard as the upstream-side GOAWAY emission (envoy.reloadable_features.reverse_tunnel_drain_with_goaway).

Additional Description: HCM::onGoAway is a no-op by default. For reverse tunnels that means a peer GOAWAY from the upstream side is observed at the H2 server codec on the downstream side and then discarded; the tunnel sits in the initiator's tracking map as if nothing happened until the TCP socket eventually closes.

This change introduces three pieces, all behind the same runtime guard introduced by the upstream-side admin endpoint:

  • DrainAwareConnectionCallbacksWrapper (new header drain_aware_hcm/drain_aware_connection_callbacks_wrapper.h) decorates Http::ServerConnectionCallbacks. It forwards every callback unchanged and additionally fires a user-supplied hook on onGoAway. HCM's onGoAway forward stays a no-op, so behavior toward HCM is unchanged; the hook is what lets the reverse-tunnel layer react. Mirrors the ConnectionCallbacksWrapper pattern used elsewhere in source.

  • In drain_aware_hcm::DrainAwareHttpConnectionManagerConfig::createCodec, when the runtime guard is enabled, recover the ReverseConnectionIOHandle that owns this tunnel by typed cast on the accepted socket's IoHandle (DownstreamReverseConnectionIOHandle carries its parent + connection_key as members) and wrap the HCM callbacks with DrainAwareConnectionCallbacksWrapper. The hook calls owner->markTunnelDrainingAndDialReplacement(connection_key).

    Note: we cannot key off connection.localAddress() because the reverse_conn_listener exposes its bound rc:// address rather than the outbound TCP socket's local port that the IOHandle tracks. The typed cast avoids that alignment problem.

    DrainAwareServerConnection now owns the wrapping callbacks so its lifetime matches the codec's (the codec stores a bare reference).

  • ReverseConnectionIOHandle::markTunnelDrainingAndDialReplacement (new public method) erases the connection_key from host_to_conn_info_map_[host].connection_keys, drops the gauge state, and kicks rev_conn_retry_timer_->enableTimer(0ms) so maintenance runs on the next dispatcher pass and dials a replacement. The underlying TCP socket is left alone so in-flight H2 streams complete naturally; onDownstreamConnectionClosed is made idempotent so the eventual TCP close is a benign debug-log no-op when tracking is already gone.

  • DownstreamReverseConnectionIOHandle gains two public accessors: connectionKey() and parent(). The cast above relies on these.

When the runtime guard is disabled (the default), createCodec skips the wrapping callbacks creation; the codec uses the unwrapped HCM callbacks, peer GOAWAY remains a no-op in HCM::onGoAway, and markTunnelDrainingAndDialReplacement has no callers. Existing behavior is preserved bit-for-bit.

Risk Level: Low. All new behavior is gated by an opt-in runtime guard, default off, that must be flipped on each downstream Envoy process. markTunnelDrainingAndDialReplacement has no other callers, and onDownstreamConnectionClosed's idempotency change is a strict log-level downgrade ("warn" -> "debug") for an already-benign no-op that can also now arise from manual races against the (still-not-called) mark-draining method.

Testing: Unit tests added:

  • drain_aware_connection_callbacks_wrapper_test.cc (5 tests) exercising the wrapper's forwarding and the GOAWAY hook.
  • 4 new tests in reverse_connection_io_handle_test.cc for markTunnelDrainingAndDialReplacement: happy path, unknown key, idempotent-with-close, idempotent on second call.

Manual end-to-end validation:

  • 2-Envoy upstream/downstream pair: with the runtime guard enabled, POST /reverse_tunnel/drain_cluster on the upstream causes all three components (upstream admin handler, peer-GOAWAY wrapper on DS, IOHandle mark-draining) to fire in the expected order, and the downstream re-dials.
  • 3-Envoy DNS failover: a slow in-flight request on R1 completes with HTTP 200 despite a drain triggered mid-stream; a DNS change to R2 + the maintenance loop establishes a fresh tunnel to R2; requests to R1 fail only after its drain-time elapses; requests to R2 keep working.
  • Same scenarios run with the guard disabled confirm no observable change vs. main.

Docs Changes: n/a.

Release Notes: When the envoy.reloadable_features.reverse_tunnel_drain_with_goaway runtime guard is enabled, the reverse-tunnel drain-aware HCM filter now observes peer-initiated GOAWAY on reverse-tunnel sockets and asks the initiator to drop the tunnel from tracking and dial a replacement. In-flight streams on the original tunnel complete naturally.

@ivpr ivpr force-pushed the reverse-tunnel-drain-aware-hcm-goaway-observer branch from 917ec9a to 788c2ae Compare May 27, 2026 21:58
@ivpr ivpr force-pushed the reverse-tunnel-acceptor-drain-goaway branch from 66c2311 to 7702d30 Compare May 28, 2026 19:32
@ivpr ivpr force-pushed the reverse-tunnel-drain-aware-hcm-goaway-observer branch from 788c2ae to 3116743 Compare May 28, 2026 19:32
@ivpr ivpr force-pushed the reverse-tunnel-acceptor-drain-goaway branch from 7702d30 to 4b1b5d9 Compare May 29, 2026 22:36
@ivpr ivpr force-pushed the reverse-tunnel-drain-aware-hcm-goaway-observer branch from 3116743 to a5da885 Compare May 29, 2026 22:53
@ivpr ivpr changed the title reverse_tunnel: observe peer GOAWAY in drain-aware HCM and re-dial th… reverse_tunnel: observe peer GOAWAY in drain-aware HCM and re-dial the tunnel May 29, 2026
Prasad I V added 3 commits May 30, 2026 00:04
…DrainAndDelete

Commit Message: Make the reverse-tunnel upstream socket interface bootstrap
extension fan a server-wide drain out to every envoy.clusters.reverse_connection
cluster via DrainBehavior::GoAwayAndDrainAndDelete, so the cluster's outbound
HTTP/2 client codecs send GOAWAY to their peers before tear-down. Gated behind
a new opt-in runtime feature flag.

Additional Description: A reverse_connection cluster's HTTP/2 client codecs are
the cluster's outbound link to a peer Envoy (the cluster sits on the sending
side of a long-lived multiplexed tunnel). Without a drain-time hook, those
peers learn about the drain only when the TCP socket eventually closes, which
defeats the point of a graceful drain.

This change adds `onDrainCheckTimer()` to `ReverseTunnelAcceptorExtension`:

* `onServerInitialized()` arms a 100ms repeating timer.
* The timer polls `context_.drainManager().draining(DrainDirection::All)`,
  which is what `/drain_listeners?graceful` (and other server-wide drain
  triggers) flip.
* The first time it observes draining AND the runtime guard
  `envoy.reloadable_features.reverse_tunnel_drain_with_goaway` is enabled,
  the extension iterates `clusterManager().clusters().active_clusters_`,
  finds clusters of type `envoy.clusters.reverse_connection`, and calls
  `clusterManager().drainConnections(name, /*predicate=*/nullptr,
  ConnectionPool::DrainBehavior::GoAwayAndDrainAndDelete)` on each. The
  ClusterManager fans the per-cluster drain out to every worker thread that
  holds a thread-local entry for the cluster. The timer is then one-shot,
  not re-armed.

When the runtime guard is disabled (the default), the timer keeps re-arming
until the server fully exits drain; no GOAWAY is emitted, preserving existing
behavior bit-for-bit.

The drain hook only runs on the upstream socket interface (the side that
accepts reverse tunnels). The downstream side's reaction to a received GOAWAY
is added by a follow-up change, also gated by the same runtime guard.

Risk Level: Low. The new code path is bounded to clusters of type
`envoy.clusters.reverse_connection`, only fires on a server-wide drain, and is
default-off behind a runtime feature flag. The flag must be flipped on every
process that should participate; existing deployments see no change.

Testing: Manual end-to-end validation with two real Envoy processes (upstream
+ downstream reverse-tunnel pair) and a 3-Envoy DNS failover variant, with
the runtime guard enabled. With the guard enabled,
`/drain_listeners?graceful` on the upstream causes
`cluster.reverse_connection_cluster.http2.goaway_sent` to increment and the
downstream's drain-aware HCM filter to observe the peer GOAWAY. With the
guard disabled, no GOAWAY is emitted and the existing behavior is preserved.
Integration test for the full end-to-end flow is in the follow-up PR that
introduces the downstream reaction.

Docs Changes: n/a.

Release Notes: Added an opt-in runtime guard
`envoy.reloadable_features.reverse_tunnel_drain_with_goaway`. When enabled,
the reverse-tunnel upstream socket interface bootstrap extension sends
GOAWAY on every reverse_connection cluster pool when the server enters
listener drain, so peer Envoys can fail over to a sibling upstream before
in-flight streams complete. Default off.

Signed-off-by: Prasad I V <prasad.iv@databricks.com>
…AndDrainExisting

Follow-up to the conn-pool rename: switch the upstream acceptor extension's
drain trigger from DrainBehavior::GoAwayAndDrainAndDelete to
DrainBehavior::NotifyPeerAndDrainExisting, drop the pod-prestop wording in
the header comment to keep the documentation generic, and tighten the comment
to focus on the why (peers receive GOAWAY and dial replacement tunnels)
rather than restating the mechanism.

No behavior change beyond the new enum name and its updated semantics
(the pool stays usable for new streams via fresh clients while existing
clients drain in the background).

Signed-off-by: Prasad I V <prasad.iv@databricks.com>
…_tunnel/drain_cluster admin endpoint

Coupling cluster-level drain to /drain_listeners was novel in Envoy. Listener
drain (HCM GOAWAY downstream, listener-manager stop) and cluster drain (upstream
pool fan-out) are independent concerns elsewhere in the codebase. Tying them
together also creates a race window where in-flight egress requests on the
draining server can fail with 503 because the listener and the upstream pool
are draining in parallel.

Switching to an explicit admin verb lets the operator sequence:
  POST /drain_listeners?graceful&inboundonly   # quiesce ingress
  (wait for egress traffic to migrate to a sibling US)
  POST /reverse_tunnel/drain_cluster           # GOAWAY all reverse tunnels
  POST /drain_listeners?graceful               # full server drain + SIGTERM

This removes:
  - the drain_check_timer_ field, periodic poll, and one-shot drain_goaway_dispatched_ guard
  - the onDrainCheckTimer() method
  - the dependency on envoy/server/drain_manager.h and envoy/network/drain_decision.h

and adds:
  - admin handler registration for POST /reverse_tunnel/drain_cluster in
    onServerInitialized(), gated on the existing
    envoy.reloadable_features.reverse_tunnel_drain_with_goaway runtime flag
  - handlerDrainCluster() + private fanOutDrainNotifyPeer() helper
  - three unit tests covering: runtime-flag-off short circuit, fan-out limited to
    envoy.clusters.reverse_connection cluster types, and second-call idempotency

The runtime guard stays in source/common/runtime/runtime_features.cc so a single
opt-in flag still controls the end-to-end behavior.

Signed-off-by: Prasad I V <prasad.iv@databricks.com>
@ivpr ivpr force-pushed the reverse-tunnel-acceptor-drain-goaway branch from 4b1b5d9 to 66b488b Compare May 30, 2026 00:12
…e tunnel

Commit Message: When a reverse-tunnel accepted by the drain-aware HCM filter
receives a peer-initiated HTTP/2 GOAWAY, drop the tunnel from the
ReverseConnectionIOHandle's tracking and kick the maintenance loop so a
replacement tunnel is dialed before the original TCP socket closes. Gated
behind the same opt-in runtime guard as the upstream-side GOAWAY emission
(envoy.reloadable_features.reverse_tunnel_drain_with_goaway).

Additional Description: HCM::onGoAway is a no-op by default. For reverse
tunnels that means a peer GOAWAY from the upstream side is observed at the
H2 server codec on the downstream side and then discarded; the tunnel sits
in the initiator's tracking map as if nothing happened until the TCP socket
eventually closes.

This change introduces three pieces, all behind the same runtime guard
introduced by the upstream-side drain hook:

* `DrainAwareConnectionCallbacksWrapper`
  (new header `drain_aware_hcm/drain_aware_connection_callbacks_wrapper.h`)
  decorates `Http::ServerConnectionCallbacks`. It forwards every callback
  unchanged and additionally fires a user-supplied hook on onGoAway. HCM's
  onGoAway forward stays a no-op, so behavior toward HCM is unchanged; the
  hook is what lets the reverse-tunnel layer react. Mirrors the
  `ConnectionCallbacksWrapper` pattern used elsewhere in source.

* In `drain_aware_hcm::DrainAwareHttpConnectionManagerConfig::createCodec`,
  when the runtime guard is enabled, recover the
  `ReverseConnectionIOHandle` that owns this tunnel by typed cast on the
  accepted socket's IoHandle (DownstreamReverseConnectionIOHandle carries
  its parent + connection_key as members) and wrap the HCM callbacks with
  `DrainAwareConnectionCallbacksWrapper`. The hook calls
  `owner->markTunnelDrainingAndDialReplacement(connection_key)`.

  Note: we cannot key off `connection.localAddress()` because the
  reverse_conn_listener exposes its bound rc:// address rather than the
  outbound TCP socket's local port that the IOHandle tracks. The typed
  cast avoids that alignment problem.

  `DrainAwareServerConnection` now owns the wrapping_callbacks so its
  lifetime matches the codec's (the codec stores a bare reference).

* `ReverseConnectionIOHandle::markTunnelDrainingAndDialReplacement` (new
  public method) erases the connection_key from
  `host_to_conn_info_map_[host].connection_keys`, drops the gauge state,
  and kicks `rev_conn_retry_timer_->enableTimer(0ms)` so maintenance runs
  on the next dispatcher pass and dials a replacement. The underlying
  TCP socket is left alone so in-flight H2 streams complete naturally;
  `onDownstreamConnectionClosed` is made idempotent so the eventual TCP
  close is a benign debug-log no-op when tracking is already gone.

* `DownstreamReverseConnectionIOHandle` gains two public accessors:
  `connectionKey()` and `parent()`. The cast above relies on these.

When the runtime guard is disabled (the default), `createCodec` skips
the wrapping_callbacks creation; the codec uses the unwrapped HCM
callbacks, peer GOAWAY remains a no-op in `HCM::onGoAway`, and
`markTunnelDrainingAndDialReplacement` has no callers. Existing behavior
is preserved bit-for-bit.

Risk Level: Low. All new behavior is gated by an opt-in runtime guard,
default off, that must be flipped on each downstream Envoy process.
`markTunnelDrainingAndDialReplacement` has no other callers, and
`onDownstreamConnectionClosed`'s idempotency change is a strict log-level
downgrade ("warn" -> "debug") for an already-benign no-op that can also
now arise from manual races against the (still-not-called) mark-draining
method.

Testing: Unit tests added:
* `drain_aware_connection_callbacks_wrapper_test.cc` (5 tests) exercising
  the wrapper's forwarding and the GOAWAY hook.
* 4 new tests in `reverse_connection_io_handle_test.cc` for
  `markTunnelDrainingAndDialReplacement`: happy path, unknown key,
  idempotent-with-close, idempotent on second call.

Manual end-to-end validation:
* 2-Envoy upstream/downstream pair: with the runtime guard enabled,
  `/drain_listeners?graceful` on the upstream causes all three components
  (acceptor drain hook, peer-GOAWAY wrapper on DS, IOHandle
  mark-draining) to fire in the expected order, and the downstream
  re-dials.
* 3-Envoy DNS failover: a slow in-flight request on R1 completes with
  HTTP 200 despite a drain triggered mid-stream; a DNS change to R2 +
  the maintenance loop establishes a fresh tunnel to R2; requests to R1
  fail only after its drain-time elapses; requests to R2 keep working.
* Same scenarios run with the guard disabled confirm no observable
  change vs. main.

Docs Changes: n/a.

Release Notes: When the
`envoy.reloadable_features.reverse_tunnel_drain_with_goaway` runtime
guard is enabled, the reverse-tunnel drain-aware HCM filter now
observes peer-initiated GOAWAY on reverse-tunnel sockets and asks the
initiator to drop the tunnel from tracking and dial a replacement.
In-flight streams on the original tunnel complete naturally.

Signed-off-by: Prasad I V <prasad.iv@databricks.com>
@ivpr ivpr force-pushed the reverse-tunnel-drain-aware-hcm-goaway-observer branch from a5da885 to 041c74e Compare May 30, 2026 00:17
@ivpr ivpr force-pushed the reverse-tunnel-acceptor-drain-goaway branch from 66b488b to c3713b9 Compare June 1, 2026 04:51
ivpr pushed a commit that referenced this pull request Jun 15, 2026
…envoyproxy#45324)

### Description
This PR resolves a use-after-free (UAF) / pure virtual method call crash
that occurs during `ConnectivityGrid` teardown when connection pools
(HTTP/3, HTTP/2, or HTTP/1) are destroyed while active connection
attempts are still ongoing in the background.

---

### 💥 The Bug & Crash Callstack

When the `ConnectivityGrid` is deleted, its destructor destroys its
underlying connection pools in reverse order. If a pool has active
connection attempts, deleting the pool synchronously cancels those
attempts. In C++, once a derived class destructor finishes, the object's
vtable pointer transitions to point to the base class vtable.

Because `destructAllConnections()` is invoked in
`~HttpConnPoolImplBase()` (the base class destructor), the connection
cancellation callback wrappers execute `onConnectionAttemptFailed()`
after the derived pool object has already been destructed.

When `onConnectionAttemptFailed()` attempts to write a log trace calling
`describePool(attempt->pool())` (which executes
`attempt->pool().protocolDescription()`), virtual dispatch invokes a
**pure virtual function call** on the base class `HttpConnPoolImplBase`,
crashing Envoy instantly:

```log
Program received signal SIGSEGV, Segmentation fault.
0x000056453663656e in Envoy::Http::ConnectivityGrid::WrapperCallbacks::onConnectionAttemptFailed()
Backtrace:
  #0  0x000056453663656e in Envoy::Http::ConnectivityGrid::WrapperCallbacks::onConnectionAttemptFailed()
  #1  0x000056453663606d in Envoy::Http::ConnectivityGrid::WrapperCallbacks::ConnectionAttemptCallbacks::onPoolFailure()
  #2  0x000056453664e7f1 in Envoy::Http::HttpConnPoolImplBase::onPoolFailure()
  #3  0x00005645366cb447 in Envoy::ConnectionPool::ConnPoolImplBase::purgePendingStreams()
  #4  0x00005645366c9d6e in Envoy::ConnectionPool::ConnPoolImplBase::onConnectionEvent()
  #5  0x000056453666394d in Envoy::Tcp::ActiveTcpClient::onEvent()
  envoyproxy#6  0x000056453709b0e9 in Envoy::Network::MockConnectionBase::raiseEvent()
  ...
  envoyproxy#24 0x00005645366be287 in Envoy::ConnectionPool::ConnPoolImplBase::destructAllConnections()
  envoyproxy#25 0x00005645366766b8 in Envoy::Http::HttpConnPoolImplBase::~HttpConnPoolImplBase()
  envoyproxy#26 0x000056453664e70f in Envoy::Http::HttpConnPoolImplMixed::~HttpConnPoolImplMixed()
  envoyproxy#27 0x000056453664e739 in Envoy::Http::HttpConnPoolImplMixed::~HttpConnPoolImplMixed()
  envoyproxy#28 0x000056453647099c in std::__1::default_delete<>::operator()()
  envoyproxy#29 0x000056453647091c in std::__1::unique_ptr<>::reset()
  envoyproxy#30 0x000056453663a3ad in Envoy::Http::ConnectivityGrid::~ConnectivityGrid()
  envoyproxy#31 0x000056453663a4f9 in Envoy::Http::ConnectivityGrid::~ConnectivityGrid()
  envoyproxy#32 0x00005645365de65c in std::__1::default_delete<>::operator()()
  envoyproxy#33 0x000056453656b55c in std::__1::unique_ptr<>::reset()
  envoyproxy#34 0x00005645363174fa in ConnectivityGridTest_DestroyGridWithActiveH2Attempts_Test::TestBody()
```

---

### 🛠️ The Fix

Introduced a reloadable feature flag guard
`envoy.reloadable_features.conn_pool_grid_early_return_on_teardown`
(enabled by default) to shield both H3 and H2/H1 pools during
destruction:
* **Early Return**: Inside
`WrapperCallbacks::onConnectionAttemptFailed()`, we check if
`delete_started_` is `true`. If the flag is enabled and the wrapper is
already being deleted/torn down, we return immediately, completely
skipping the unsafe virtual method lookups on the half-destructed pool
object.
* **ALPN & Protocol Safety**: This safely protects the system regardless
of which pool is being torn down (H3 or H2/H1 mixed pools).

---

### 🧪 Testing & Verification

* Added the `DestroyGridWithActiveH2Attempts` unit test to verify that
tearing down the `ConnectivityGrid` while an active H2 connection
attempt is running does crash without the fix and not with the fix.

Risk Level: low, only exposed with log level trace
Testing: new unit test
Docs Changes: N/A 
Release Notes: Y
Platform Specific Features: N/A
Runtime guard:
envoy.reloadable_features.conn_pool_grid_early_return_on_teardown

Signed-off-by: Dan Zhang <danzh@google.com>
Co-authored-by: Dan Zhang <danzh@google.com>
bool drain_goaway_sent_{false};
// Held to keep the wrapper alive for the codec's full lifetime; the codec stores a
// bare reference. nullptr when peer-GOAWAY observation is not wired up.
std::unique_ptr<DrainAwareConnectionCallbacksWrapper> wrapping_callbacks_;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be declared above inner so that while destroying the codec is destroyed first and the wrapper outlives the codec

"reverse_tunnel: peer GOAWAY for connection {}; draining tunnel and dialing replacement",
connection_key);

// Find the host that owns this connection key.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This duplicates a lot of the cleanup logic onDownstreamConnectionClosed already does, let's move this to a helper and call it from both the functions:

  • dropTunnelFromTracking(key): add a helper which finds the host, erases key etc
  • call it from onDownstreamConnectionClosed and markTunnelDrainingAndDialReplacement
  • looks like onDownstreamConnectionClosed waits for the next maintainence timer (10s) and markTunnelDrainingAndDialReplacement keeps it for the next dispatcher tick, maybe we can use the same re-initiation interval for both cases (set both to the next dispatcher tick)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use MAKE_ADMIN_HANDLER(x) here

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can add a constant for "envoy.clusters.reverse_connection" and use it here and in reverse_connection.cc

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants