reverse_tunnel: add /reverse_tunnel/drain_cluster admin endpoint for NotifyPeerAndDrainExisting#1
Open
ivpr wants to merge 3 commits into
Conversation
6636144 to
2e69798
Compare
7702d30 to
4b1b5d9
Compare
4b1b5d9 to
66b488b
Compare
added 3 commits
June 1, 2026 04:51
…DrainAndDelete Commit Message: Make the reverse-tunnel upstream socket interface bootstrap extension fan a server-wide drain out to every envoy.clusters.reverse_connection cluster via DrainBehavior::GoAwayAndDrainAndDelete, so the cluster's outbound HTTP/2 client codecs send GOAWAY to their peers before tear-down. Gated behind a new opt-in runtime feature flag. Additional Description: A reverse_connection cluster's HTTP/2 client codecs are the cluster's outbound link to a peer Envoy (the cluster sits on the sending side of a long-lived multiplexed tunnel). Without a drain-time hook, those peers learn about the drain only when the TCP socket eventually closes, which defeats the point of a graceful drain. This change adds `onDrainCheckTimer()` to `ReverseTunnelAcceptorExtension`: * `onServerInitialized()` arms a 100ms repeating timer. * The timer polls `context_.drainManager().draining(DrainDirection::All)`, which is what `/drain_listeners?graceful` (and other server-wide drain triggers) flip. * The first time it observes draining AND the runtime guard `envoy.reloadable_features.reverse_tunnel_drain_with_goaway` is enabled, the extension iterates `clusterManager().clusters().active_clusters_`, finds clusters of type `envoy.clusters.reverse_connection`, and calls `clusterManager().drainConnections(name, /*predicate=*/nullptr, ConnectionPool::DrainBehavior::GoAwayAndDrainAndDelete)` on each. The ClusterManager fans the per-cluster drain out to every worker thread that holds a thread-local entry for the cluster. The timer is then one-shot, not re-armed. When the runtime guard is disabled (the default), the timer keeps re-arming until the server fully exits drain; no GOAWAY is emitted, preserving existing behavior bit-for-bit. The drain hook only runs on the upstream socket interface (the side that accepts reverse tunnels). The downstream side's reaction to a received GOAWAY is added by a follow-up change, also gated by the same runtime guard. Risk Level: Low. The new code path is bounded to clusters of type `envoy.clusters.reverse_connection`, only fires on a server-wide drain, and is default-off behind a runtime feature flag. The flag must be flipped on every process that should participate; existing deployments see no change. Testing: Manual end-to-end validation with two real Envoy processes (upstream + downstream reverse-tunnel pair) and a 3-Envoy DNS failover variant, with the runtime guard enabled. With the guard enabled, `/drain_listeners?graceful` on the upstream causes `cluster.reverse_connection_cluster.http2.goaway_sent` to increment and the downstream's drain-aware HCM filter to observe the peer GOAWAY. With the guard disabled, no GOAWAY is emitted and the existing behavior is preserved. Integration test for the full end-to-end flow is in the follow-up PR that introduces the downstream reaction. Docs Changes: n/a. Release Notes: Added an opt-in runtime guard `envoy.reloadable_features.reverse_tunnel_drain_with_goaway`. When enabled, the reverse-tunnel upstream socket interface bootstrap extension sends GOAWAY on every reverse_connection cluster pool when the server enters listener drain, so peer Envoys can fail over to a sibling upstream before in-flight streams complete. Default off. Signed-off-by: Prasad I V <prasad.iv@databricks.com>
…AndDrainExisting Follow-up to the conn-pool rename: switch the upstream acceptor extension's drain trigger from DrainBehavior::GoAwayAndDrainAndDelete to DrainBehavior::NotifyPeerAndDrainExisting, drop the pod-prestop wording in the header comment to keep the documentation generic, and tighten the comment to focus on the why (peers receive GOAWAY and dial replacement tunnels) rather than restating the mechanism. No behavior change beyond the new enum name and its updated semantics (the pool stays usable for new streams via fresh clients while existing clients drain in the background). Signed-off-by: Prasad I V <prasad.iv@databricks.com>
…_tunnel/drain_cluster admin endpoint
Coupling cluster-level drain to /drain_listeners was novel in Envoy. Listener
drain (HCM GOAWAY downstream, listener-manager stop) and cluster drain (upstream
pool fan-out) are independent concerns elsewhere in the codebase. Tying them
together also creates a race window where in-flight egress requests on the
draining server can fail with 503 because the listener and the upstream pool
are draining in parallel.
Switching to an explicit admin verb lets the operator sequence:
POST /drain_listeners?graceful&inboundonly # quiesce ingress
(wait for egress traffic to migrate to a sibling US)
POST /reverse_tunnel/drain_cluster # GOAWAY all reverse tunnels
POST /drain_listeners?graceful # full server drain + SIGTERM
This removes:
- the drain_check_timer_ field, periodic poll, and one-shot drain_goaway_dispatched_ guard
- the onDrainCheckTimer() method
- the dependency on envoy/server/drain_manager.h and envoy/network/drain_decision.h
and adds:
- admin handler registration for POST /reverse_tunnel/drain_cluster in
onServerInitialized(), gated on the existing
envoy.reloadable_features.reverse_tunnel_drain_with_goaway runtime flag
- handlerDrainCluster() + private fanOutDrainNotifyPeer() helper
- three unit tests covering: runtime-flag-off short circuit, fan-out limited to
envoy.clusters.reverse_connection cluster types, and second-call idempotency
The runtime guard stays in source/common/runtime/runtime_features.cc so a single
opt-in flag still controls the end-to-end behavior.
Signed-off-by: Prasad I V <prasad.iv@databricks.com>
66b488b to
c3713b9
Compare
basundhara-c
reviewed
Jun 1, 2026
| // egress-listener drain and reverse-tunnel-cluster drain separately. | ||
| auto admin = context_.admin(); | ||
| if (admin.has_value()) { | ||
| const bool rc = admin->addHandler( |
There was a problem hiding this comment.
Do we need to flip this at runtime? Wondering if we could use a flag here instead of a runtime guard?
| } | ||
|
|
||
| void ReverseTunnelAcceptorExtension::fanOutDrainNotifyPeer() { | ||
| drain_goaway_dispatched_ = true; |
There was a problem hiding this comment.
Can the drain call fail? Should be setting this to true at the end of the function?
ivpr
pushed a commit
that referenced
this pull request
Jun 15, 2026
…envoyproxy#45324) ### Description This PR resolves a use-after-free (UAF) / pure virtual method call crash that occurs during `ConnectivityGrid` teardown when connection pools (HTTP/3, HTTP/2, or HTTP/1) are destroyed while active connection attempts are still ongoing in the background. --- ### 💥 The Bug & Crash Callstack When the `ConnectivityGrid` is deleted, its destructor destroys its underlying connection pools in reverse order. If a pool has active connection attempts, deleting the pool synchronously cancels those attempts. In C++, once a derived class destructor finishes, the object's vtable pointer transitions to point to the base class vtable. Because `destructAllConnections()` is invoked in `~HttpConnPoolImplBase()` (the base class destructor), the connection cancellation callback wrappers execute `onConnectionAttemptFailed()` after the derived pool object has already been destructed. When `onConnectionAttemptFailed()` attempts to write a log trace calling `describePool(attempt->pool())` (which executes `attempt->pool().protocolDescription()`), virtual dispatch invokes a **pure virtual function call** on the base class `HttpConnPoolImplBase`, crashing Envoy instantly: ```log Program received signal SIGSEGV, Segmentation fault. 0x000056453663656e in Envoy::Http::ConnectivityGrid::WrapperCallbacks::onConnectionAttemptFailed() Backtrace: #0 0x000056453663656e in Envoy::Http::ConnectivityGrid::WrapperCallbacks::onConnectionAttemptFailed() #1 0x000056453663606d in Envoy::Http::ConnectivityGrid::WrapperCallbacks::ConnectionAttemptCallbacks::onPoolFailure() #2 0x000056453664e7f1 in Envoy::Http::HttpConnPoolImplBase::onPoolFailure() #3 0x00005645366cb447 in Envoy::ConnectionPool::ConnPoolImplBase::purgePendingStreams() #4 0x00005645366c9d6e in Envoy::ConnectionPool::ConnPoolImplBase::onConnectionEvent() #5 0x000056453666394d in Envoy::Tcp::ActiveTcpClient::onEvent() envoyproxy#6 0x000056453709b0e9 in Envoy::Network::MockConnectionBase::raiseEvent() ... envoyproxy#24 0x00005645366be287 in Envoy::ConnectionPool::ConnPoolImplBase::destructAllConnections() envoyproxy#25 0x00005645366766b8 in Envoy::Http::HttpConnPoolImplBase::~HttpConnPoolImplBase() envoyproxy#26 0x000056453664e70f in Envoy::Http::HttpConnPoolImplMixed::~HttpConnPoolImplMixed() envoyproxy#27 0x000056453664e739 in Envoy::Http::HttpConnPoolImplMixed::~HttpConnPoolImplMixed() envoyproxy#28 0x000056453647099c in std::__1::default_delete<>::operator()() envoyproxy#29 0x000056453647091c in std::__1::unique_ptr<>::reset() envoyproxy#30 0x000056453663a3ad in Envoy::Http::ConnectivityGrid::~ConnectivityGrid() envoyproxy#31 0x000056453663a4f9 in Envoy::Http::ConnectivityGrid::~ConnectivityGrid() envoyproxy#32 0x00005645365de65c in std::__1::default_delete<>::operator()() envoyproxy#33 0x000056453656b55c in std::__1::unique_ptr<>::reset() envoyproxy#34 0x00005645363174fa in ConnectivityGridTest_DestroyGridWithActiveH2Attempts_Test::TestBody() ``` --- ### 🛠️ The Fix Introduced a reloadable feature flag guard `envoy.reloadable_features.conn_pool_grid_early_return_on_teardown` (enabled by default) to shield both H3 and H2/H1 pools during destruction: * **Early Return**: Inside `WrapperCallbacks::onConnectionAttemptFailed()`, we check if `delete_started_` is `true`. If the flag is enabled and the wrapper is already being deleted/torn down, we return immediately, completely skipping the unsafe virtual method lookups on the half-destructed pool object. * **ALPN & Protocol Safety**: This safely protects the system regardless of which pool is being torn down (H3 or H2/H1 mixed pools). --- ### 🧪 Testing & Verification * Added the `DestroyGridWithActiveH2Attempts` unit test to verify that tearing down the `ConnectivityGrid` while an active H2 connection attempt is running does crash without the fix and not with the fix. Risk Level: low, only exposed with log level trace Testing: new unit test Docs Changes: N/A Release Notes: Y Platform Specific Features: N/A Runtime guard: envoy.reloadable_features.conn_pool_grid_early_return_on_teardown Signed-off-by: Dan Zhang <danzh@google.com> Co-authored-by: Dan Zhang <danzh@google.com>
| if (!Runtime::runtimeFeatureEnabled( | ||
| "envoy.reloadable_features.reverse_tunnel_drain_with_goaway")) { | ||
| response.add("runtime envoy.reloadable_features.reverse_tunnel_drain_with_goaway is off\n"); | ||
| return Http::Code::ServiceUnavailable; |
There was a problem hiding this comment.
ServiceUnavailable is misleading and might cause the client to retry, can we return a NotImplemented here?
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Commit Message: Add a
POST /reverse_tunnel/drain_clusteradmin endpoint on the reverse-tunnel upstream socket interface bootstrap extension. The handler fans outConnectionPool::DrainBehavior::NotifyPeerAndDrainExistingto everyenvoy.clusters.reverse_connectioncluster, so the cluster's outbound HTTP/2 client codecs send GOAWAY to their peers. Gated behind an opt-in runtime guardenvoy.reloadable_features.reverse_tunnel_drain_with_goaway.Additional Description: A
reverse_connectioncluster's HTTP/2 client codecs are the cluster's outbound link to a peer Envoy (the cluster sits on the sending side of a long-lived multiplexed tunnel). Without an explicit drain trigger those peers learn about the drain only when the TCP socket eventually closes, which defeats the point of a graceful drain./drain_listeners?gracefulis intentionally NOT used as the trigger for two reasons:ClusterManager::drainConnections()calls.An explicit admin verb lets the operator sequence drains:
What this change introduces
handlerDrainCluster(...)admin handler onReverseTunnelAcceptorExtension. IteratesclusterManager().clusters().active_clusters_, finds clusters of typeenvoy.clusters.reverse_connection, and callsclusterManager().drainConnections(name, /*predicate=*/nullptr, ConnectionPool::DrainBehavior::NotifyPeerAndDrainExisting)on each. The ClusterManager fans the per-cluster drain out to every worker thread that holds a thread-local entry for the cluster.fanOutDrainNotifyPeer()private helper containing the iterate-and-drain body.drain_goaway_dispatched_: a repeat call to the admin endpoint returns"already drained"and is a no-op.onServerInitialized()registers the handler viacontext_.admin()->addHandler("/reverse_tunnel/drain_cluster", ...)withmutates_server_state=true. Registration is conditional onadmin.has_value()so the extension still builds in admin-disabled binaries.Risk Level
Low. The new code path is contained to clusters of type
envoy.clusters.reverse_connection, only fires on an explicit admin call (no implicit triggers), and is default-off behind a runtime feature flag. The flag must be flipped on every process that should participate; existing deployments see no change.Testing
Unit tests (3 new) in
reverse_tunnel_acceptor_extension_test.cc:AdminDrainClusterRespectsRuntimeFlag— flag off → 503 with explanatory body; no fan-out.AdminDrainClusterFiresOnReverseConnectionClustersOnly— flag on; assertsdrainConnections(NotifyPeerAndDrainExisting)is called on everyenvoy.clusters.reverse_connectioncluster and is NOT called on a same-mock-test cluster of a different type.AdminDrainClusterIsIdempotent— second call returns"already drained"and does NOT re-firedrainConnections.Manual end-to-end validation with two real Envoy processes (upstream + downstream reverse-tunnel pair), with the runtime guard enabled.
POST /reverse_tunnel/drain_clusteron the upstream causescluster.reverse_connection_cluster.http2.goaway_sentto increment for every tunnel and the downstream's drain-aware HCM filter (PR3) to observe the peer GOAWAY. With the guard disabled, the handler returns 503 and no drain is dispatched.The crash-fix regression on the underlying PR (PR1) is also exercised here: post-drain requests to the egress listener are served by fresh clients without crashing US.
Docs Changes
n/a.
Release Notes
Added an opt-in runtime guard
envoy.reloadable_features.reverse_tunnel_drain_with_goaway. When enabled, the reverse-tunnel upstream socket interface bootstrap extension exposes aPOST /reverse_tunnel/drain_clusteradmin endpoint that fans outNotifyPeerAndDrainExistingto everyenvoy.clusters.reverse_connectioncluster. Peer initiators receive GOAWAY and can dial replacement tunnels to sibling upstream replicas. Default off.