Skip to content

Track and protect WebClient outbound requests, fix private-IP SSRF regression#312

Open
Mishenevd wants to merge 1 commit into
mainfrom
fix/webclient-outbound-tracking
Open

Track and protect WebClient outbound requests, fix private-IP SSRF regression#312
Mishenevd wants to merge 1 commit into
mainfrom
fix/webclient-outbound-tracking

Conversation

@Mishenevd

@Mishenevd Mishenevd commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator

Context

Follow-up to #308/#310 (reverted as #311). Customer flood was InetAddress.getAllByName() picking up Reactor Netty's own DNS-resolver bootstrap noise (0.0.0.0, ::, /etc/resolv.conf nameservers) as "new outbound connections" on port 0, and blocking them in lockdown mode. #310 fixed the flood but did it with an early return in DNSRecordCollector.report() that also skipped the SSRF check below it — verified with a live regression test that this let an attacker-supplied private-IP literal (e.g. a webhook field pointing straight at 169.254.169.254) through undetected.

While investigating a better fix, found the actual root cause is bigger: Spring's WebClient was never instrumented at all, and Reactor Netty's default HTTP client bypasses InetAddress.getAllByName() entirely (it uses its own async DNS resolver). So even after wrapping WebClient to register pending ports, DNSRecordCollector was never invoked for real WebClient targets — confirmed empirically via trace logs against a live running app. WebClient traffic had zero outbound-domain visibility and zero SSRF protection, independent of this bug.

Review feedback addressed

  • Diff size / comments — trimmed: reverted the recordHitAndEnforceBlocklist extraction back inline, cut comments down to what's non-obvious.
  • isInfrastructureNoise/ports.isEmpty() reliability — was a real gap, not just a theoretical one: it's exactly what let redirect targets slip through undetected (see SpringWebClientRedirectWrapper below). Fixed for the redirect case; the more general "any explicit scheduler hop breaks ThreadLocal propagation" class of problem is noted under Known limitations, out of scope for this PR.
  • e2e test using WebClient in a Spring app — added both a JUnit-level test (WebClientTest, real network calls, same style as HttpURLConnectionTest) and a full e2e payload in spring_webflux_postgres.py. See Tests below.

Architecture note: this fits the agent's existing two-layer hook pattern, it doesn't add a new one

Every HTTP client the agent tracks is covered by two kinds of hook working together:

  1. A per-client "intent" hook — one per library, because each has its own API for making a request: OkHttpWrapper (OkHttpClient.newCall), ApacheHttpClientWrapper (HttpClient.execute), HttpClientSendWrapper (JDK HttpClient.send/sendAsync), and now SpringWebClientWrapper (ExchangeFunction.exchange, the Spring-level interface every WebClient call goes through, independent of whatever transport it's configured with). All of these just extract host+port and register it via URLCollectorPendingHostnamesStore.
  2. A shared, client-agnostic "connection" hook — this is where recording, outbound blocking, and SSRF checks actually happen (DNSRecordCollector). Before this PR there was exactly one: InetAddressWrapper, hooking the JDK's InetAddress.getAllByName(). It's generic — it doesn't know or care which library called it.

OkHttp, Apache HttpClient, and the JDK's own HttpClient all resolve DNS through that same standard JDK method, so layer 2 already covered them with no extra work. WebClient's underlying transport (Reactor Netty) is the one exception: it deliberately uses its own async DNS resolver instead of InetAddress.getAllByName(), so it was invisible to layer 2 no matter how well layer 1 was instrumented.

SocketChannelWrapper is not Netty-specific instrumentation — it hooks java.nio.channels.SocketChannel.connect(SocketAddress), a standard JDK class, with zero references to any io.netty.* type. It catches Reactor Netty as a side effect of Netty's NioSocketChannel internally calling this same standard method to open its sockets — the same hook would also catch any other NIO-based client, not just Netty. This is layer 2 gaining a second, complementary entry point, not a new Netty-specific layer.

What changed

  • DNSRecordCollector.java — private-IP-literal gate narrowed to ports.isEmpty() && IsPrivateIP.isPrivateIp(hostname), only skipping recording + outbound blocking (SSRF checks are unconditional again). Deliberately not Fully ignore private IP literals as outbound connections (early return) #310's "just ignore private IPs" — the ports.isEmpty() half is what makes it safe: a genuine request through any instrumented client always registers a pending port before resolution/connection happens, so this can only ever match traffic with no app-level request behind it (DNS bootstrap, direct-by-IP connects), never a real request, attacker-driven or not.

  • SpringWebClientWrapper.java — new wrapper, hooks ExchangeFunction.exchange() (the interface every WebClient request goes through) to register pending host+port via URLCollector, same pattern as the existing OkHttp/Apache/JDK HttpClient wrappers (layer 1 above).

  • SocketChannelWrapper.java — new wrapper, hooks java.nio.channels.SocketChannel.connect(SocketAddress) (layer 2 above). This is what actually closes the gap: it's the JDK-level call every NIO-based client (including Reactor Netty) makes once it has a resolved address, regardless of which DNS resolver produced it, and it also catches literal-IP targets that never go through any resolver at all.

  • DNSRecordCollector.reportConnect() — new entry point used by SocketChannelWrapper. Peeks the pending port (PendingHostnamesStore.getPorts) instead of consuming it (report()'s getAndRemove), for a reason specific to this new hook: report() (fed by InetAddress.getAllByName()) always sees every resolved address for a hostname in one call, so it's safe to consume the pending port immediately — the full picture is already known before any connection is attempted. reportConnect() (fed by SocketChannel.connect()) instead learns about addresses one at a time, exactly when each individual connection attempt happens.

    This matters because of Happy Eyeballs: a dual-stack hostname like localhost resolves to both 127.0.0.1 and ::1, and Netty tries them as separate, sequential connect() calls, not one combined resolution step. A hostname can resolve to more than 2 addresses in general (multiple A/AAAA records are common for load-balanced services) — Netty will try each one in turn.

    Found live: with a naive getAndRemove(), the first connect() attempt (IPv4) correctly detected and blocked SSRF — but that also consumed the pending port. The second attempt (IPv6, same hostname) found nothing pending, so the SSRF check loop didn't run at all, and the connection succeeded — the earlier "block" was logged but had no real effect, because the client just silently retried on the other address family and got through. Fixed by peeking instead of consuming; both attempts are now checked. Covered by testSsrfDetectedOnEveryConnectAttemptForDualStackHostname.

  • PendingHostnamesStore.java — peeking instead of consuming (above) means entries now rely on WebRequestCollector's per-incoming-request clear() for cleanup, which never fires for WebClient calls made outside any incoming-request context (e.g. a @Scheduled background task) — those entries would otherwise sit in that thread's ThreadLocal map forever. Capped the store at 1000 entries per thread, evicting the least-recently-used one once exceeded — the same bounded-LRU pattern (LinkedHashMap with accessOrder=true + removeEldestEntry()) already used by Hostnames.java for the same class of problem. Deliberately not a time-based TTL: eviction is purely by usage count, not wall-clock time, so it can't interact with the dual-stack fix above (an entry touched by a second connect attempt is safe unless ~1000 unrelated hostnames get registered on the same thread in between, which doesn't happen within one request's connect sequence) — no timing-dependent race, and add()/getAndRemove()/getPorts()/clear() stay in their original, simple form. Covered by PendingHostnamesStoreTest.

  • SpringWebClientRedirectWrapper.java — new wrapper closing a gap flagged in review: ports.isEmpty() isn't a fully reliable signal for "not a real request". Concretely: WebClient configured with .followRedirect(true) never re-invokes Spring's request-adaptation layer for redirect hops — Reactor Netty resends bodiless (e.g. GET) requests internally without going back through ExchangeFunction/ReactorClientHttpRequest — so a redirect to a private IP had no pending port and was invisible to both tracking and SSRF, indistinguishable from genuine infra noise even though it originated from a real, tainted request. Hooks the package-private HttpClientConnect$HttpClientHandler.redirect() (the only place this is visible, verified by tracing Spring's send(BiFunction) callback — it does not re-fire per redirect for bodiless requests, contrary to its own javadoc's implication) and feeds the chain into the existing RedirectCollector/PrivateIPRedirectFinder machinery, the same one already used for JDK HttpURLConnection redirects. Mirrors HttpConnectionRedirectWrapper's existing tradeoff of hooking an internal, undocumented method (there: followRedirect0) for this exact reason.

  • RequestController.java (SpringWebfluxSampleApp) — /api/request endpoint (GET with url query param, POST with JSON body) plus a .followRedirect(true) variant at /api/request/follow-redirects, used to validate all of the above against a real running app, and now wired into end2end/spring_webflux_postgres.py as an automated e2e payload.

Also fixed along the way: SpringWebClientWrapper's first version referenced ExchangeFunction.class directly in its ByteBuddy matchers, which requires the class to resolve on the agent's own classloader at premain — but spring-webflux is compileOnly, only present on the target app's classloader. This crashed the JVM with NoClassDefFoundError at startup for any app using the agent. Fixed with string-based matchers (hasSuperType(named(...))), the same pattern OkHttpWrapper/ApacheHttpClientWrapper already use for third-party types.

e2e scenarios tested

Built and ran the real agent against a real Spring Boot WebFlux app end to end (not just unit tests):

# Build agent jars with the changes
./gradlew agent:shadowJar agent_api:shadowJar
cp agent/build/libs/agent*-all.jar dist/agent.jar
cp agent_api/build/libs/agent_api*-all.jar dist/agent_api.jar

# Mock core (captures heartbeat/attack events)
cd end2end/server && docker build -t mock_core . && docker run --name mock_core -d -p 5000:5000 mock_core

# Sample app with the agent attached
cd sample-apps/SpringWebfluxSampleApp && ./gradlew build -x test
AIKIDO_LOG_LEVEL=trace AIKIDO_TOKEN=token AIKIDO_BLOCK=1 \
  AIKIDO_ENDPOINT=http://localhost:5000 AIKIDO_REALTIME_ENDPOINT=http://localhost:5000/realtime \
  java -javaagent:../../dist/agent.jar -jar build/libs/SpringWebfluxSampleApp-0.0.1-SNAPSHOT.jar --server.port=8090
  1. App startup didn't crash — caught the NoClassDefFoundError bug this way; unit tests can't catch agent-premain class loading issues.

  2. Real domain via WebClient gets correct port tracking (not port 0):

    curl -G "http://localhost:8090/api/request" --data-urlencode "url=https://example.com"

    Log: DNSRecordCollector called with example.com & inet addresses: [example.com/104.20.23.154]Hostname: example.com, Port: 443 (previously untracked entirely).

  3. SSRF via WebClient, literal private IP, now blocked:

    curl -G "http://localhost:8090/api/request" --data-urlencode "url=http://169.254.169.254/latest/meta-data/"

    → HTTP 500, "Aikido Zen has blocked a server-side request forgery". Stack trace confirms interception point: sun.nio.ch.SocketChannelImpl.connectNioSocketChannel.doConnectreactor.netty.transport.TransportConnector.

  4. SSRF via WebClient, dual-stack hostname (localhost127.0.0.1 + ::1), both connect attempts blocked:

    curl -G "http://localhost:8090/api/request" --data-urlencode "url=http://localhost:5000"

    → HTTP 500. Before the reportConnect() fix, this returned 200 with the mock server's real response — the first (IPv4) attempt was blocked and logged, but the second (IPv6) attempt silently bypassed SSRF because the first had already consumed the pending port.

  5. Private-IP noise from Netty's resolver bootstrap not recorded and not blocked in lockdown — confirmed via mock server heartbeat: after triggering the bootstrap noise (0.0.0.0, ::, local resolver IP 10.2.0.1) and a real request, only localhost:5000 (the agent's own reporting) showed up in reported hostnames.

  6. Safe request to a real public domain still works normally, HTTP 200, both via WebClient and confirmed no regression for already-instrumented clients.

  7. Redirect-based SSRF via WebClient, now blocked:

    curl -G "http://localhost:8090/api/request/follow-redirects" --data-urlencode "url=http://localhost:5000/mock/redirect-to-metadata"

    (mock core's /mock/redirect-to-metadata returns a 302 to http://169.254.169.254/latest/meta-data/.) → HTTP 500. With block temporarily set to detect-only, confirmed the attack event fires specifically for 169.254.169.254:80 with source=query, i.e. traced back through the redirect chain to the tainted origin — not just the origin itself being blocked. Before this wrapper, 169.254.169.254 never appeared in any DNSRecordCollector/SSRF log at all despite SocketChannelWrapper seeing the actual connect attempt — no pending port meant no taint check.

Tests

Added to DNSRecordCollectorTest (22 tests total in the file now, all passing):

  • testPrivateIpLiteralWithNoPendingPortNotRecorded / ...NotBlockedInLockdown — infra noise filtering still works.
  • testPrivateIpLiteralWithPendingPortStillRecordedAndBlockedInLockdown — real requests to private IPs via instrumented clients are unaffected by the noise filter.
  • testSsrfStillDetectedForPrivateIpLiteralWithPendingPort — regression test proving the earlier early-return SSRF bypass is fixed.
  • testReportConnectRecordsHostnameWithPendingPort, testReportConnectDoesNotConsumePendingPort, testReportConnectPrivateIpLiteralWithNoPendingPortNotRecorded, testReportConnectStoredSsrfStillRunsUnconditionallyreportConnect() behavior.
  • testSsrfDetectedOnEveryConnectAttemptForDualStackHostname — regression test for the dual-stack SSRF bypass found during e2e testing.

New PendingHostnamesStoreTest:

  • testUnboundedHostnamesDoNotGrowThreadLocalMapForever — adds 2000 distinct hostnames, confirms the oldest untouched ones are evicted and the store stays bounded.
  • testReadingAnEntryProtectsItFromEvictionWhileStillInUse — confirms an entry touched by a second connect attempt survives a realistic number of unrelated hostnames being added in between.

New testSsrfDetectedForRedirectToPrivateIp in DNSRecordCollectorTest — regression test for the redirect gap, exercising RedirectCollector + DNSRecordCollector.reportConnect() directly.

New agent_api/src/test/java/wrappers/WebClientTest.java, mirroring the existing HttpURLConnectionTest pattern (real network calls, run with the actual built agent attached via the agent_api test task's -javaagent) — split into two tests instead of one end-to-end test, because SpringWebClientWrapper and SocketChannelWrapper fire on different threads for a real WebClient call (the former on the subscribing thread, the latter on Reactor Netty's own event-loop thread), and PendingHostnamesStore/Context are ThreadLocal, so they can't be observed together the way a same-thread blocking client's wrapper can:

  • testSpringWebClientWrapperRegistersPendingPort — confirms SpringWebClientWrapper fires and registers the right port.
  • testSocketChannelWrapperBlocksSsrf — confirms SocketChannelWrapper + DNSRecordCollector.reportConnect()'s SSRF logic, via a raw SocketChannel.connect() call (single-threaded, deterministic).

New end2end/spring_webflux_postgres.py payload: ssrf, using the new /api/request endpoint, following the same pattern as javalin_mysql_kotlin.py's existing SSRF payload. Ran the full e2e suite (all existing SQL/command-injection/path-traversal/bot/IP-blocking/rate-limiting payloads plus this new one) — all passing, no regressions.

Known limitations (not fixed in this PR)

  • Spring WebFlux has no request-body taint tracking at all (SpringWebfluxContextObject never populates ContextObject.body), so SSRF/etc. via JSON body can't be detected for WebFlux apps regardless of this change — that's why the e2e scenarios above use query params instead of body. Doesn't regress anything (it was already broken before this PR), but worth a dedicated follow-up.
  • Context/PendingHostnamesStore are ThreadLocal, which doesn't survive an explicit scheduler hop in a reactive chain (e.g. .publishOn(Schedulers.boundedElastic()) before a WebClient call — a common pattern for mixing blocking and non-blocking code). This is the same underlying class of gap the redirect fix closes (Reactor Netty doing work on a thread other than the one that registered the request's context), just via a different trigger. It only breaks under an app's own explicit scheduler switch, not the default WebFlux request-handling flow (which stays on one reactor-http-nio thread throughout). Not fixed here — a proper fix would mean propagating via Reactor's own Context instead of ThreadLocal, a much larger change affecting the whole detection pipeline, not just WebClient.

🤖 Generated with Claude Code

Summary by Aikido

⚠️ Security Issues: 7 🔍 Quality Issues: 10 Resolved Issues: 0

🚀 New Features

  • Instrumented Spring WebClient by adding ExchangeFunction-based wrapper
  • Added SocketChannel.connect hook to observe resolved addresses on connect

⚡ Enhancements

  • Capped PendingHostnamesStore with LRU eviction and peek semantics

🐛 Bugfixes

  • Fixed SSRF/private-IP regression and unified DNS handling via reportConnect

More info

Comment thread agent/build.gradle

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

7 Open source vulnerabilities detected - high severity
Aikido detected 7 vulnerabilities across 1 package, it includes 2 high, 2 medium and 3 low vulnerabilities.

Details

Remediation Aikido suggests bumping the vulnerable packages to a safe version.

Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

return;
}

try {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SocketChannelWrapper dynamically loads a JAR and reflectively invokes reportConnect, hiding execution via URLClassLoader/reflection and selective exception handling; this runtime indirection obscures intent and should be clarified or documented.

Details

✨ AI Reasoning
​The new SocketChannelWrapper uses runtime dynamic class loading and reflection to locate and invoke DNSRecordCollector.reportConnect from an external JAR path supplied via a system property. It constructs a URLClassLoader from a string path, loads a class by name, searches methods by string name, and invokes one reflectively. It selectively rethrows InvocationTargetException causes only when the cause's toString() starts with the project's vulnerability package prefix, and otherwise swallows or prints exceptions. These behaviors introduce non-transparent control flow and runtime execution paths that can hide behavior from static review. Such patterns are commonly used to obscure intent and can be indicative of first-party malware or backdoor techniques.

🔧 How do I fix it?
Ensure code is transparent and not intentionally obfuscated. Avoid hiding functionality from code review. Focus on intent and deception, not specific patterns.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

}
// Ignore non-aikido throwables.
} catch(Throwable e) {
System.out.println("AIKIDO: " + e.getMessage());

@aikido-pr-checks aikido-pr-checks Bot Jul 1, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

System.out.println used for error reporting in SocketChannelWrapper. Use the project's Logger or remove this debug print.

Suggested change
System.out.println("AIKIDO: " + e.getMessage());
Details

✨ AI Reasoning
​A new ad-hoc console print was added inside an exception handler that swallows errors. Using System.out.println in library/agent code is a debug artifact: it produces unstructured output and may leak internal errors to stdout. This harms maintainability and can confuse production logs. Replacing it with the project's Logger or removing it will restore production-grade logging practices.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

try {
URL[] urls = { new URL(jarFilePath) };
classLoader = new URLClassLoader(urls);
} catch (MalformedURLException ignored) {}

@aikido-pr-checks aikido-pr-checks Bot Jul 1, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty catch swallowing MalformedURLException; at minimum log the exception or document why it can be safely ignored to avoid hiding classloader/URL setup failures.

Show fix
Suggested change
} catch (MalformedURLException ignored) {}
} catch (MalformedURLException e) {
// Log malformed JAR path to aid debugging of classloader setup failures
System.out.println("AIKIDO: Failed to load agent API JAR from path: " + jarFilePath + " - " + e.getMessage());
}
Details

✨ AI Reasoning
​The new wrapper code performs URL/JAR classloader setup inside a try block and then silently swallows MalformedURLException with an empty catch body. Silently ignoring classloading/URL errors can hide configuration or instrumentation failures and makes debugging harder; the catch should at least log the error or document why it is intentionally ignored. The surrounding try also involves loading an external JAR and reflective invocation, which are sensitive operations where failures should not be silently dropped.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

}
String hostname = inetSocketAddress.getHostString();

String jarFilePath = System.getProperty("AIK_agent_api_jar");

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating a new URLClassLoader and doing reflective class/method lookups on every SocketChannel.connect() is expensive. Cache the loader/reflection results or use a preloaded bridge to avoid per-connection classloader and reflection overhead.

Details

✨ AI Reasoning
​This code attempts to report the resolved address of a SocketChannel connection by dynamically loading the agent_api JAR and invoking DNSRecordCollector.reportConnect via reflection. It performs the following expensive operations on each connect: reads a system property, constructs a URL and new URLClassLoader, loads a class by name, iterates methods to find reportConnect, invokes reflectively, and closes the class loader. Doing this on the hot path (connect) will cause significant allocation and classloading overhead that scales with connection frequency. Caching the URLClassLoader (or, better, obtaining a static bridge or a lightweight agent-to-api interface) would avoid repeated expensive work. The issue was introduced by the new SocketChannelWrapper in this PR which added this per-invocation classloader/reflection flow.

🔧 How do I fix it?
Move constant work outside loops. Use StringBuilder instead of string concatenation in loops. Cache compiled regex patterns. Use hash-based lookups instead of nested loops. Batch database operations instead of N+1 queries.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

process(hostname, new InetAddress[]{resolvedAddress}, PendingHostnamesStore.getPorts(hostname), SOCKET_CHANNEL_OPERATION_NAME);
}

private static void process(String hostname, InetAddress[] inetAddresses, Set<Integer> ports, String operationName) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method 'process' is too vague; rename to reflect its responsibilities (e.g., handleDnsLookupRecordingAndSsrfChecks) and add Javadoc describing side effects and when it throws.

Details

✨ AI Reasoning
​A newly added private static method named 'process' was introduced to handle statistics registration, hostname/port recording, outbound-block checks, and SSRF detection for different invocation paths. The implementation mixes multiple responsibilities and performs important side effects (mutating stores, throwing blocking exceptions). The name 'process' doesn't convey the multi-step purpose (recording, blocking checks, SSRF evaluation). Although file comments give context, the function name is still vague and makes the code harder to understand at call sites and during maintenance.

🔧 How do I fix it?
Use descriptive verb-noun function names, add docstrings explaining the function's purpose, or provide meaningful return type hints.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

@Mishenevd Mishenevd force-pushed the fix/webclient-outbound-tracking branch from cea92b6 to 2230760 Compare July 1, 2026 14:40
process(hostname, new InetAddress[]{resolvedAddress}, PendingHostnamesStore.getPorts(hostname), SOCKET_CHANNEL_OPERATION_NAME);
}

private static void process(String hostname, InetAddress[] inetAddresses, Set<Integer> ports, String operationName) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DNSRecordCollector.process() is large and mixes multiple unrelated responsibilities (stats, pending-port semantics, outbound blocking, SSRF detection and stored-SSRF checks); consider splitting into smaller focused methods.

Details

✨ AI Reasoning
​1. What is this code trying to accomplish? It converts resolved InetAddress objects and pending ports into outbound-hit statistics and runs outbound-blocking and SSRF detectors. 2. Does this harm maintainability? The method combines several distinct responsibilities (stats, store mutation vs peeking semantics, blocking policy, SSRF detection, stored-SSRF detection and exception handling) making it cognitively heavy. 3. Is it appropriate? The method grew in this change to handle both InetAddress.getAllByName and SocketChannel.connect callers plus different port-consumption semantics - this increases branching and conceptual load. 4. Would fixing it meaningfully improve codebase? Yes — splitting concerns (port handling vs SSRF checks vs storage/stats) would make logic easier to reason about and reduce risk of regression. 5. Common pattern? Not particularly; existing code previously separated some behavior but this change merged additional responsibilities into one process() method. 6. Fixability within PR? Yes: extraction of smaller helper methods is feasible here.

🔧 How do I fix it?
Break down long functions into smaller helper functions. Aim for functions under 60 lines with fewer than 10 local variables.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

@Mishenevd Mishenevd force-pushed the fix/webclient-outbound-tracking branch from 2230760 to 1eab8be Compare July 1, 2026 14:50
}
String hostname = inetSocketAddress.getHostString();

String jarFilePath = System.getProperty("AIK_agent_api_jar");

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loading a JAR path from System property then runtime-classloading and reflective method invocation hides executed code; avoid dynamic loading/reflection or document and validate the JAR path and expected class/method.

Details

✨ AI Reasoning
​The new wrapper loads a JAR path read from a system property, constructs a URLClassLoader at runtime, loads a class by name, iterates over its methods and invokes one by matching its name, all via reflection. This sequence (system-property-controlled JAR path + runtime classloading + reflective invocation) obscures the control flow and makes it hard to audit what code will run, which fits obfuscation/hiding patterns. It also depends on external input (System property) to locate executable code, increasing the risk that different code executes in different environments and hiding behavior from static analysis and code reviewers. These are deliberate runtime indirections introduced by this PR and therefore should be flagged.

🔧 How do I fix it?
Ensure code is transparent and not intentionally obfuscated. Avoid hiding functionality from code review. Focus on intent and deception, not specific patterns.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

Comment on lines +77 to +79
} catch (InvocationTargetException invocationTargetException) {
if(invocationTargetException.getCause().toString().startsWith("dev.aikido.agent_api.vulnerabilities")) {
throw invocationTargetException.getCause();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relying on exception.getCause().toString().startsWith(...) to decide to rethrow hides error semantics; use explicit exception type checks instead of string-matching the cause.

Show fix
Suggested change
} catch (InvocationTargetException invocationTargetException) {
if(invocationTargetException.getCause().toString().startsWith("dev.aikido.agent_api.vulnerabilities")) {
throw invocationTargetException.getCause();
} catch (InvocationTargetException invocationTargetException) {
Throwable cause = invocationTargetException.getCause();
if(cause != null && cause.getClass().getName().startsWith("dev.aikido.agent_api.vulnerabilities")) {
throw cause;
Details

✨ AI Reasoning
​The added code catches InvocationTargetException and inspects invocationTargetException.getCause().toString().startsWith(...) to decide whether to throw the cause. Using string matching on the cause's toString() to classify exceptions is brittle and can be used to selectively propagate certain errors while swallowing others, masking failures or creating hidden control paths. This conditional propagation was introduced here, increasing opacity of error handling and potentially hiding malicious or unexpected behavior.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

@Mishenevd Mishenevd force-pushed the fix/webclient-outbound-tracking branch from 1eab8be to 470324d Compare July 1, 2026 14:59
Comment on lines +70 to +75
for (Method method2: clazz.getMethods()) {
if(method2.getName().equals("reportConnect")) {
method2.invoke(null, hostname, resolvedAddress);
break;
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reflectively scanning/ invoking reportConnect on a runtime-loaded class obscures the actual operation performed; prefer a clearly referenced API boundary or documented explicit delegation.

Show fix
Suggested change
for (Method method2: clazz.getMethods()) {
if(method2.getName().equals("reportConnect")) {
method2.invoke(null, hostname, resolvedAddress);
break;
}
}
Method reportConnectMethod = clazz.getMethod("reportConnect", String.class, InetAddress.class);
reportConnectMethod.invoke(null, hostname, resolvedAddress);
Details

✨ AI Reasoning
​The change locates and invokes a method named reportConnect via reflection on a class loaded from the external classloader. Using reflection in this way (scanning methods by name and invoking them) obscures the actual call target and arguments from code readers and static tools, complicating understanding of what is executed when a socket connects. Reflection here is used to execute security-relevant logic (reporting/SSRF checks) and is performed conditionally at runtime, which can hide behavior and make auditing more difficult.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

…gression

Follow-up to the reverted #308/#310. Customer flood was InetAddress.getAllByName()
picking up Reactor Netty's own DNS-resolver bootstrap noise (0.0.0.0, ::,
/etc/resolv.conf nameservers) as "new outbound connections" on port 0, and
blocking them in lockdown mode. #310 fixed the flood with an early return in
DNSRecordCollector.report() that also skipped the SSRF check below it -
verified with a regression test that this let an attacker-supplied private-IP
literal (e.g. a webhook field pointing straight at 169.254.169.254) through
undetected.

Investigating further found the actual root cause is bigger: Spring's
WebClient was never instrumented at all, and Reactor Netty's default HTTP
client bypasses InetAddress.getAllByName() entirely (it uses its own async
DNS resolver). So even after wrapping WebClient to register pending ports,
DNSRecordCollector was never invoked for real WebClient targets - confirmed
empirically via trace logs against a live running app, with distinct markers
proving InetAddressWrapper never fires for WebClient/Reactor Netty traffic in
this configuration. WebClient had zero outbound-domain visibility and zero
SSRF protection, independent of the original bug.

- DNSRecordCollector: narrow the private-IP-literal gate to only skip
  recording + outbound blocking when there's no pending port (genuine infra
  noise). SSRF checks are unconditional again, fixing the bypass above.
- SpringWebClientWrapper: register pending host+port for every WebClient
  request by hooking ExchangeFunction.exchange(), the interface every
  WebClient call goes through, same pattern as the existing OkHttp/Apache/JDK
  HttpClient wrappers. Uses string-based ByteBuddy matchers
  (hasSuperType(named(...))) instead of .class literals, since spring-webflux
  is compileOnly and only present on the target app's classloader - a .class
  reference in the matcher crashes the agent at premain with
  NoClassDefFoundError.
- SocketChannelWrapper: hook java.nio.channels.SocketChannel.connect(), the
  JDK-level call every NIO-based client (including Reactor Netty) makes once
  it has a resolved address, regardless of which DNS resolver produced it.
  This is what actually closes the gap for WebClient, and it also catches
  literal IP targets that never go through any resolver at all. Not
  Netty-specific instrumentation - it's a generic JDK hook with no references
  to io.netty.* types.
- DNSRecordCollector.reportConnect(): entry point for the new hook. Peeks the
  pending port instead of consuming it (report()'s getAndRemove), because a
  single request can trigger multiple connect() calls to the same hostname
  (e.g. the IPv4 then IPv6 address of a dual-stack host like localhost).
  Consuming on the first attempt let a blocked SSRF target succeed on the
  second attempt via the other address family - found live, fixed, covered by
  a regression test.
- PendingHostnamesStore: peeking instead of consuming means entries rely on
  WebRequestCollector's per-incoming-request clear() for cleanup, which never
  fires for WebClient calls made outside any incoming-request context (e.g. a
  @scheduled background task). Capped the store at 1000 entries per thread,
  evicting the least-recently-used one once exceeded - the same bounded-LRU
  pattern (LinkedHashMap with accessOrder=true + removeEldestEntry())
  already used by Hostnames.java for the same class of problem. Deliberately
  not a time-based TTL, to avoid a timing-dependent race reopening the
  dual-stack gap under load.
- SpringWebClientRedirectWrapper: WebClient calls with followRedirect(true)
  never re-invoke Spring's request-adaptation layer for redirect hops (Reactor
  Netty resends bodiless requests internally), so a redirect to a private IP
  was invisible to both tracking and SSRF - same failure mode as the DNS gap
  above, just one layer up. Hooks HttpClientConnect$HttpClientHandler.redirect()
  (package-private, mirroring the same tradeoff HttpConnectionRedirectWrapper
  already makes for the JDK's equally-internal followRedirect0) and feeds the
  chain into the existing RedirectCollector/PrivateIPRedirectFinder mechanism,
  the same one already used for JDK HttpURLConnection redirects.
- RequestController (SpringWebfluxSampleApp): /api/request endpoint (plus a
  followRedirect(true) variant) used to validate all of the above against a
  real running app end to end, and now wired into end2end/spring_webflux_postgres.py
  as an automated "ssrf" e2e payload.

Known limitation, not fixed here: Spring WebFlux has no request-body taint
tracking at all (SpringWebfluxContextObject never populates
ContextObject.body), so SSRF via JSON body can't be detected for WebFlux apps
regardless of this change - flagged separately, doesn't regress anything.
@Mishenevd Mishenevd force-pushed the fix/webclient-outbound-tracking branch from 470324d to b33ebce Compare July 1, 2026 16:39
Class<?> clazz = classLoader.loadClass("dev.aikido.agent_api.collectors.DNSRecordCollector");

// Run reportConnect with "argument"
for (Method method2: clazz.getMethods()) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iterating clazz.getMethods() to find and invoke reportConnect is verbose; use clazz.getMethod("reportConnect", String.class, InetAddress.class) and invoke it directly.

Details

✨ AI Reasoning
​The code loads a class via reflection and then iterates over clazz.getMethods() to find a method named "reportConnect", invoking it when found. This is more verbose than directly calling clazz.getMethod("reportConnect", parameterTypes...). Iterating all methods is unnecessary and reduces clarity; a direct reflective lookup is standard, clearer, and concise. This change was introduced in the new SocketChannelWrapper and is a straightforward simplification opportunity that keeps identical behavior.

🔧 How do I fix it?
Rewrite the snippet in the simpler, behavior-equivalent form: return a boolean expression directly instead of if cond return true else return false, avoid using lists when they are guaranteed to contain one element, etc.

Reply @AikidoSec feedback: [FEEDBACK] to get better review comments in the future.
Reply @AikidoSec ignore: [REASON] to ignore this issue.
More info

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant