feat(grpc): Add resolver wrapper to handle HTTP CONNECT proxy configuration#2679
Conversation
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request refactors the gRPC client to support arbitrary byte-based addresses (via ByteStr) instead of requiring valid UTF-8 strings, enabling support for non-UTF-8 unix socket paths. It also introduces an HTTP CONNECT proxy resolver that intercepts DNS resolution and applies proxy configurations based on environment variables. The review feedback suggests bypassing proxy resolution entirely for non-DNS schemes (like unix or inmemory), safely handling empty NO_PROXY environment variables, and updating tests to verify that unix paths correctly bypass the proxy.
| options: ResolverOptions, | ||
| matcher: Option<&Matcher>, | ||
| ) -> Result<Box<dyn Resolver>, (String, ResolverOptions)> { | ||
| // Skip proxy lookup for known non-TCP schemes. |
There was a problem hiding this comment.
I'd expect to skip it for anything that isn't dns. I think there had been a little discussion around that, but I don't know where it ended.
The host will be passed to the proxy and the proxy then has to do the resolution, but we've discarded the scheme. It seems we are assuming dns formatting of the target string as well. We shouldn't be parsing the target string without knowledge of the specific scheme in use. Things would be better if we delegated to the name resolver to parse the target string into an authority.
There was a problem hiding this comment.
Answering inline:
I'd expect to skip it for anything that isn't dns. I think there had been a little discussion around that, but I don't know where it ended.
The initial intention behind implementing proxy resolution as a separate resolver builder was to allow all custom resolvers to get HTTP CONNECT support for free. For context, the C++ implementation skips proxying for unix URLs, while in Go, target host resolution can only be skipped for dns URLs. Java simply implements the proxy logic inside the DNS resolver itself.
I've now updated the Rust implementation to skip proxying for all non-dns URLs.
Things would be better if we delegated to the name resolver to parse the target string into an authority.
I've updated the code to use the target resolver builder for authority parsing. Based on the previous point, this will inherently be a DNS resolver.
| use super::ResolverOptions; | ||
| use super::Target; |
There was a problem hiding this comment.
Nit: please try to avoid use super except in test mods.
| fn build(&self, target: &Target, options: ResolverOptions) -> Box<dyn Resolver> { | ||
| match self.new_resolver(target, options, MATCHER.as_ref()) { | ||
| Ok(resolver) => resolver, | ||
| Err((err, options)) => NopResolver::new_with_err(err, options), |
There was a problem hiding this comment.
Optional: consider making new_resolver infallible and return the NopResolver itself instead. It avoids needing to pass the options back and forth.
| } | ||
|
|
||
| struct HttpsProxyResolver { | ||
| child: Box<dyn Resolver>, |
There was a problem hiding this comment.
Interestingly this is always a dns::DnsResolver. Should we put that type in here directly instead of a dyn?
There was a problem hiding this comment.
The DNS resolver builder returns a NopResolver if the target is an IP address here:
grpc-rust/grpc/src/client/name_resolution/dns/mod.rs
Lines 212 to 217 in 829ce18
To return a single concrete type from dns::Builder, we could introduce a dns::Resolver enum with NopResolver and DnsResolver variants and expose it via a pub(crate) method.
I'm wondering if this refactoring provides any real benefit. Since HttpsProxyResolver only requires the methods already defined on the Resolver trait, staying with the trait might be cleaner. What do you think?
There was a problem hiding this comment.
I'm wondering if this refactoring provides any real benefit.
No, I think this is fine. If this was always one type I think it would make sense to just name it and have it there, but since it's more, then this way is better.
| /// Returns the address of the proxy server to connect to (host:port). | ||
| /// This is Punycode-encoded, i.e., it's a valid URL host:port. | ||
| pub(crate) fn target_authority(&self) -> &str { | ||
| &self.connect_addr |
There was a problem hiding this comment.
Would it be better if the names matches?
There was a problem hiding this comment.
Renamed both, the field and the getter to connect_authority.
| if let Ok(endpoints) = &mut update.endpoints { | ||
| for endpoint in endpoints { | ||
| for address in &mut endpoint.addresses { | ||
| address.attributes = address.attributes.add(self.proxy_options.clone()); |
There was a problem hiding this comment.
I think the attribute should be somewhere more central, like in grpc/client/transport maybe?
There was a problem hiding this comment.
Moved the ProxyOptions and its helpers into the transport module.
5395e05 to
b2f47f5
Compare
| pub(crate) handshake_info: ClientHandshakeInfo, | ||
| } | ||
|
|
||
| /// Options for establishing an HTTP CONNECT proxy tunnel. |
There was a problem hiding this comment.
Maybe mention that this can be added to addresses' attributes, and transports that support it will read it from there and use the address as a CONNECT proxy instead of a backend.
What about transports that don't support it, though? Should those error? (It seems like they should...) We can deal with this later, but something to think about. Maybe we should have a separate address type for a proxy address or something.
There was a problem hiding this comment.
Based on the latest revision of the design, the subchannel will wrap the channel credentials in an HTTP CONNECT handshaker, making it transparent to transports. I've added this information in the rustdoc.
| /// Must be a valid hostname. | ||
| /// * `proxy_authorization_header` - The value of the `Proxy-Authorization` header, if present. | ||
| pub(crate) fn new( | ||
| target_authority: String, |
There was a problem hiding this comment.
connect_authority?
I'm not sure what the best name to use for this is (consider also target_address or backend_address maybe?), but it should match between field, constructor, and accessor.
There was a problem hiding this comment.
Changed to target_authority throughout.
This PR implements a resolver wrapper that determines proxy routing by checking the
HTTPS_PROXYandNO_PROXYenvironment variables. By default,hyper-utilmatches cURL's behavior by falling back toALL_PROXYwhenHTTPS_PROXYis unset. However, because Go'sFromEnvironmentand gRPC C++ ignoreALL_PROXY, gRPC Rust manually configures theMatcherto bypass it, ensuring cross-language consistency.Resolution Flow
When
HttpsProxyResolver::Builderbuilds a resolver for a target URI, it useshyper_util::client::proxy::Matcherto evaluate the environment variables:HttpsProxyResolver, which uses the child DNS resolver to resolve the proxy server's hostname instead of the target.HttpsProxyResolverintercepts resolution updates from the child resolver. It attaches the necessary proxy configuration (the original target authority and basic auth credentials) as attributes to each resolved proxy address.In follow-up PRs, the subchannel will read these address attributes to wrap the channel credentials and carry out the HTTP
CONNECThandshake prior to the standard credential handshake.Internal design doc: go/grpc-rust-http-connect
Transport changes: #2698