http: add per-cluster upstream client codec factory seam#45632
Conversation
22b605f to
dc1fddb
Compare
Introduce Http::ClientCodecFactory, an optional per-cluster extension point for building the upstream (client) HTTP codec. ClusterInfo exposes it via upstreamHttpClientCodecFactory(); ClusterInfoImpl recovers it from typed_extension_protocol_options by sidecast (an options object that also implements ClientCodecFactory is the factory). CodecClientProd routes codec construction through the factory when present, passing a thunk that builds the stock codec so the factory can either decorate it or return a fresh codec. This is a generic, opt-in seam: with no factory configured, behavior is unchanged and the stock codec is used. Signed-off-by: Prasad I V <prasad.iv@databricks.com>
dc1fddb to
eb48580
Compare
|
@agrawroh / @basundhara-c Can you please review and assign to folks who can review this |
|
/gemini-review |
|
/retest |
Add RELEASE_ASSERT checks to guard against calling create_default multiple times and returning a null codec from custom upstream client codec factories. Signed-off-by: Prasad I V <prasad.iv@databricks.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a per-cluster upstream HTTP client codec factory (ClientCodecFactory) that allows decorating or replacing the stock codec via typed extension protocol options. The reviewer provided valuable feedback on several areas: first, the stock codec is currently constructed eagerly, which can cause immediate, unwanted side effects on the connection if the factory decides to replace it; constructing it lazily would prevent this. Second, finding the factory by iterating over absl::flat_hash_map introduces non-deterministic behavior if multiple factories are configured, so the code should detect and throw an exception in such cases. Finally, envoy/common/pure.h should be explicitly included in client_codec_factory.h for the PURE macro.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
Avoid eager stock codec construction when an upstream codec factory is configured, make factory selection deterministic by sorting extension option names, and add explicit pure.h include for PURE usage. Signed-off-by: Prasad I V <prasad.iv@databricks.com>
|
/retest |
Replace the create_default callback with a nullptr-return contract so CodecClientProd builds the stock codec via a single switch only when no custom codec is used (avoids eager stock construction side effects). Discover the per-cluster factory via an explicit ProtocolOptionsConfig::upstreamHttpClientCodecFactory() hook instead of a dynamic_cast probe, and enforce at-most-one factory per cluster. Signed-off-by: Prasad I V <prasad.iv@databricks.com>
Avoid the non-dictionary word flagged by the pedantic spell check. Signed-off-by: Prasad I V <prasad.iv@databricks.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a per-cluster upstream HTTP client codec factory interface (ClientCodecFactory) allowing custom or decorated client codecs to be built via cluster protocol options. The review feedback recommends passing the transport socket options (TransportSocketOptions) into the factory's Context struct. This addition is necessary for custom codecs to replicate or decorate stock HTTP/1 behavior, which depends on whether the connection is proxied. The feedback includes actionable suggestions to update the interface, the codec client instantiation, and the associated unit tests.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
Add the connection's transport socket options to ClientCodecFactory::Context so a custom codec factory can replicate or decorate the stock HTTP/1 codec, which inspects http11ProxyInfo() to detect a proxied connection. Signed-off-by: Prasad I V <prasad.iv@databricks.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a per-cluster factory interface (ClientCodecFactory) for upstream HTTP codecs, allowing custom or decorated client codecs to be built via typed_extension_protocol_options. CodecClientProd is updated to consult this factory and fall back to stock codecs if none is provided. The review feedback suggests explicitly documenting the lifetime contract of the returned factory in ProtocolOptionsConfig to prevent potential lifetime bugs.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
Note that upstreamHttpClientCodecFactory() must return a factory owned by the options object, since ClusterInfoImpl pins it via a shared_ptr aliasing the options object. Signed-off-by: Prasad I V <prasad.iv@databricks.com>
| // constructing it has immediate side effects on the connection (e.g. the HTTP/2 codec writes | ||
| // SETTINGS), so building one we would discard could corrupt the connection. | ||
| if (auto factory = host->cluster().upstreamHttpClientCodecFactory(); factory.has_value()) { | ||
| codec_ = factory->createClientCodec(Http::ClientCodecFactory::Context{ |
There was a problem hiding this comment.
We are skipping quic_session.Initialize() for the custom factory path. We should evaluate whether this is okay.
This should be fine for reverse tunnels, but it's an invariant we should note; if a custom codec factory is used, QUIC will not work?
There was a problem hiding this comment.
Good catch. The contract is that a factory returning a codec owns full construction, including QUIC init for HTTP/3 (it can dynamic_cast context.connection to EnvoyQuicClientSession& and call Initialize(), like core does). Documented this on the createClientCodec docstring and also added an HTTP/3 test covering the factory path.
| * Defaults to none. An options extension attached via typed_extension_protocol_options | ||
| * can override this to make CodecClientProd build a custom or decorated client codec. | ||
| */ | ||
| virtual OptRef<const Http::ClientCodecFactory> upstreamHttpClientCodecFactory() const { |
There was a problem hiding this comment.
Why cannot this be PURE?
There was a problem hiding this comment.
This is on generic ProtocolOptionsConfig, the base for all typed_extension_protocol_options objects (TCP options, HTTP options, etc.), almost none of which provide a codec. PURE would force all of them to add a boilerplate return {};. Codec providers override it to return *this; everyone else uses the default.
(createClientCodec() on ClientCodecFactory itself is PURE, since a real codec factory must implement it.)
| TEST(CodecClientProdTest, NoUpstreamClientCodecFactoryUsesStockCodec) { | ||
| NiceMock<Event::MockDispatcher> dispatcher; | ||
| NiceMock<Random::MockRandomGenerator> random; | ||
| auto cluster = std::make_shared<NiceMock<Upstream::MockClusterInfo>>(); |
There was a problem hiding this comment.
We need to add more tests covering protocols other than HTTP/1, like:
- HTTP/2 factory-used + factory-returns-nullptr
- HTTP/3 + whatever we decide for QUIC: do we reject the config if it's used with a custom factory?
There was a problem hiding this comment.
Done. Added tests for HTTP/2 and HTTP/3 as well
| } | ||
|
|
||
| // A factory that returns nullptr defers to the stock codec, which CodecClientProd then builds. | ||
| TEST(CodecClientProdTest, UpstreamClientCodecFactoryNullptrUsesStockCodec) { |
There was a problem hiding this comment.
We also need a test for the case where multiple factories are registered, and we throw a "multiple upstream HTTP client codec factories configured" error
| .WillOnce(Invoke([&](const ClientCodecFactory::Context& context) -> ClientConnectionPtr { | ||
| EXPECT_EQ(CodecType::HTTP1, context.type); | ||
| EXPECT_EQ(cluster.get(), &context.cluster); | ||
| EXPECT_EQ(nullptr, context.options); |
There was a problem hiding this comment.
Add one test that passes a real options and asserts the factory receives that same value
| // SETTINGS), so building one we would discard could corrupt the connection. | ||
| if (auto factory = host->cluster().upstreamHttpClientCodecFactory(); factory.has_value()) { | ||
| codec_ = factory->createClientCodec(Http::ClientCodecFactory::Context{ | ||
| type, *connection_, *this, host->cluster(), random_generator, options}); |
There was a problem hiding this comment.
we are also not passing the response header limits that are passed for stock HTTP/1 and HTTP/2:
host->cluster().maxResponseHeadersKb().value_or(...), // ← response-header size cap
host->cluster().maxResponseHeadersCount(), ...); // ← response-header count cap
There was a problem hiding this comment.
Since we are passing host->cluster(), factory can read them from context.cluster (maxResponseHeadersKb().value_or(...), maxResponseHeadersCount()), which the reverse_tunnel codec already does(code ref). Unlike the transport-socket options (not reachable from the cluster, hence added to Context), the header limits are available via Context::cluster.
Add HTTP/2 factory-used and factory-returns-nullptr cases, assert the transport socket options are forwarded to the factory, and verify that configuring multiple upstream client codec factories on one cluster is rejected. Signed-off-by: Prasad I V <prasad.iv@databricks.com>
Note that a factory returning a non-null codec owns the full construction the stock path would do, including QUIC session initialization for HTTP/3, and that returning nullptr defers to the stock codec for unsupported types. Signed-off-by: Prasad I V <prasad.iv@databricks.com>
Verify the factory is consulted for HTTP/3 and that returning a codec skips the stock QUIC construction path (running against a mock connection without crashing proves the dynamic_cast/Initialize path is not taken). Signed-off-by: Prasad I V <prasad.iv@databricks.com>
|
/assign @yanavlasov |
|
@yanavlasov Can you please review this |
yanavlasov
left a comment
There was a problem hiding this comment.
LGTM, just some style changes, please.
/wait
| namespace Envoy { | ||
|
|
||
| namespace Network { | ||
| class Connection; |
There was a problem hiding this comment.
Per the style guide, we try to avoid forward declarations and include the headers with the types definitions. If it does not cause a problem, please include headers with the types instead of forward declarations.
There was a problem hiding this comment.
Done. I've now change this to use include files
Per the style guide, include the headers defining the types used in ClientCodecFactory::Context rather than forward-declaring them. No dependency cycle: upstream_interface only forward-declares Http::ClientCodecFactory. Signed-off-by: Prasad I V <prasad.iv@databricks.com>
|
/retest |
Commit Message: http: add per-cluster upstream client codec factory seam
Additional Description:
Adds
Http::ClientCodecFactory, an optional per-cluster extension point for building the upstream(client) HTTP codec. It is discovered from
typed_extension_protocol_optionsand surfaced via a newnon-pure
ClusterInfo::upstreamHttpClientCodecFactory()accessor that defaults to empty.CodecClientProdinvokes the factory when present, handing it a thunk that builds the stock codec,so an implementation can either decorate the stock codec or return a fresh one. With no factory
configured the stock codec is built exactly as before and behavior is unchanged.
Motivating use case: reverse tunnels invert the HTTP roles -- the connection is dialed from the
downstream to the upstream, but HTTP requests then flow from the upstream to the downstream over
that tunnel, so the upstream is the HTTP/2 client and the downstream is the HTTP/2 server. Graceful
drain of such a tunnel needs cooperation in both directions and both land on the upstream (client)
codec: when the upstream is draining it must signal the downstream to dial a replacement connection
while in-flight requests finish; when the downstream is draining it must tell the upstream to stop
sending new requests and let the downstream re-establish, again finishing in-flight requests.
Neither is expressible with the stock upstream codec, and Envoy currently has no per-cluster hook to
substitute the upstream (client) codec (
CodecClientProdhard-codes it by protocol). This changeadds that hook generically; the reverse-tunnel codec that uses it is a follow-up change.
Reverse-tunnel implementation that uses this seam (in the ivpr/envoy fork): the upstream drain-aware client codec (ivpr#4) and the downstream drain-aware HCM that dials a replacement tunnel (ivpr#5).
Reverse-tunnel graceful drain (sequence diagram)
sequenceDiagram autonumber participant D as Downstream participant U as Upstream Note over D,U: Reverse tunnel setup - unlike a normal hop, the downstream dials OUT to the upstream D->>U: open TCP connection and initiate reverse tunnel U-->>D: tunnel established Note over D,U: On the established tunnel the upstream is the HTTP/2 client and the downstream is the HTTP/2 server, so requests flow from upstream to downstream U->>D: request over tunnel D-->>U: 200 Note over D,U: Upstream graceful drain U->>D: GOAWAY - drain signal toward downstream D->>D: dial a replacement tunnel to a healthy upstream U->>D: in-flight request continues on the old tunnel D-->>U: 200 Note over D,U: old tunnel closes once idle, new requests use the replacement Note over D,U: Downstream graceful drain D->>U: GOAWAY - tell upstream to stop sending new requests U->>U: drain pool, route new requests to the replacement tunnel U->>D: in-flight request continues on the old tunnel D-->>U: 200 Note over D,U: downstream re-establishes, old tunnel closes once idleRisk Level: Low -- new, opt-in per-cluster extension point. When no factory is configured the stock
codec is used and there is no behavior change.
Testing:
//test/common/http:codec_client_testcovers the unchanged no-factory path. A dedicatedunit test for the factory-injection path is a planned follow-up.
Docs Changes: None (the new interface is documented inline in
envoy/http/client_codec_factory.h).Release Notes: None (extension point only; no user-facing behavior change on its own).
Platform Specific Features: None.
Runtime guard: None (this PR adds no runtime-guarded behavior).
API Considerations: No xDS/proto API changes. Adds one new C++ extension interface
(
envoy/http/client_codec_factory.h) and a non-pureClusterInfo::upstreamHttpClientCodecFactory()accessor that defaults to empty, so existing
ClusterInfoimplementations are unaffected.