
Tunnel WebSockets end to end #27

Merged
mmkal merged 14 commits into main from exciting-evening on Jun 12, 2026

jonastemplestein commented on Jun 11, 2026


Captun tunnels now forward WebSockets end to end. The headline use case: a tunnel target is just a fetch function, and that fetch function can serve a Cap'n Web RPC capability over a WebSocket — so any process, anywhere, can expose a live object the whole internet can call:

```typescript
import { newWebSocketRpcSession, newWorkersWebSocketRpcResponse, RpcTarget } from "capnweb";
import { createCaptunTunnel } from "captun";

// The capability to expose: a plain class whose methods become remote calls.
class Calculator extends RpcTarget {
  add(a: number, b: number) {
    return a + b;
  }
}

// The tunnel target is just a fetch function that answers with an RPC session.
using tunnel = await createCaptunTunnel({
  fetch: (request) => newWorkersWebSocketRpcResponse(request, new Calculator()),
});

// ...anywhere else on the internet:
const rpc = newWebSocketRpcSession<Calculator>(tunnel.url.replace(/^http/, "ws"));
await rpc.add(2, 3); // 5
```

No WebSocket plumbing in sight: a fetch handler that returns a Workers-style response carrying a webSocket is bridged automatically. This exact flow is the first e2e test.

API

CLI: captun 3000 now forwards WebSockets to the local server the same way it forwards HTTP, including handshake headers (Cookie, Authorization, Origin), subprotocol negotiation, binary frames, and close codes. Verified against socket.io, ws, Bun.serve, Deno, and crossws servers (see below).

Workers-style on any runtime — Node has no WebSocketPair and its Response rejects status 101, so the library exports the two missing primitives; the same handler code runs on workerd, Node, or Bun:

```typescript
import { createCaptunTunnel, createWebSocketResponse, isWebSocketUpgradeRequest, WebSocketPair } from "captun";

using tunnel = await createCaptunTunnel({
  fetch(request) {
    // Plain HTTP requests get a normal response.
    if (!isWebSocketUpgradeRequest(request)) return new Response("hello");
    // pair[0] goes back to the caller; pair[1] is the server side of the socket.
    const pair = new WebSocketPair();
    pair[1].accept();
    pair[1].addEventListener("message", (event) => pair[1].send(`echo:${event.data}`));
    return createWebSocketResponse(pair[0]);
  },
});
```

Low-level hook: connectWebSocket(request, remote) for dial-out proxying to a separate local WebSocket server (what the CLI uses internally). Cap'n Web stub lifetime (dup()/dispose) is handled inside pipeWebSocketToHandle, so implementors never see it.

How it works

There is no WebSocket wire format in this PR. The tunnel already runs on Cap'n Web, which passes capabilities both directions, delivers calls in order, and serializes Uint8Array natively — so a tunneled WebSocket is a pair of handles (send/close), and every message is one ordered RPC call:

public client ⇄ gateway DO (WebSocketPair) ⇄ Cap'n Web RPC ⇄ tunnel client ⇄ local WebSocket server
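
A toy model of the "pair of handles" idea, for readers who want to see the shape. This is a sketch only, not captun's code: `HandleSketch`, `FakeSocket`, and `handleFor` are hypothetical names, and ordinary method calls stand in for the ordered Cap'n Web RPC calls.

```typescript
type Payload = string | Uint8Array;

// Hypothetical model: each side of a tunneled socket is reduced to two methods.
interface HandleSketch {
  send(data: Payload): void;
  close(code?: number, reason?: string): void;
}

// Minimal stand-in for a WebSocket endpoint (not a real WebSocket).
class FakeSocket {
  received: Payload[] = [];
  closedWith: number | undefined;
  onmessage: ((data: Payload) => void) | undefined;

  deliver(data: Payload): void {
    this.received.push(data);
    this.onmessage?.(data);
  }
}

// Wrap a socket as a handle: calling the handle writes to that socket.
function handleFor(socket: FakeSocket): HandleSketch {
  return {
    send: (data) => socket.deliver(data),
    close: (code) => {
      socket.closedWith = code ?? 1000;
    },
  };
}

// Each side ends up holding a handle for the other side's socket.
const publicSocket = new FakeSocket();
const localSocket = new FakeSocket();
const toPublic = handleFor(publicSocket);
const toLocal = handleFor(localSocket);

// Local echo server: whatever arrives is sent back through the public handle.
localSocket.onmessage = (data) => toPublic.send(`echo:${String(data)}`);

toLocal.send("hi"); // public client -> local server
toLocal.close(1000); // public client closes
```

In the real tunnel each `send`/`close` would be a remote call, and the ordering guarantee comes from Cap'n Web rather than from synchronous execution as it does here.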

Decisions worth knowing:

  • The 101 must come from DO fetch, not the forward RPC — an upgrade response with an attached webSocket does not survive a Durable Object RPC method return (verified empirically), so WebSocket requests route through shard.fetch().
  • Tunnel death closes its public sockets with 1001 (going away), so browser reconnect logic fires instead of hanging on a zombie socket.
  • Messages are capped at 16MiB (close 4009 on just that socket): benchmarking showed a ~24MiB message overflows the Cap'n Web leg's ~32MiB frame limit (base64 inflates 4/3) and would otherwise kill every connection on the tunnel.
  • Message-level relay: ping/pong and permessage-deflate are per-hop; close codes outside 1000/3000–4999 degrade to a bare close (the close() API refuses them).
  • An earlier draft shipped a hand-rolled frame codec (newline JSON + base64 in streamed HTTP bodies); replacing it with capability passing deleted ~200 lines and structurally removed reordering/drop/crash races.
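
The close-code rule in the list above can be sketched as follows. This is an illustration under the rule as stated (only 1000 and 3000-4999 can be passed to close()); `isSendableCloseCode` and `relayClose` are hypothetical names, not captun exports.

```typescript
// Codes that the rule above allows to be passed to close().
function isSendableCloseCode(code: number): boolean {
  return code === 1000 || (code >= 3000 && code <= 4999);
}

interface Closable {
  close(code?: number, reason?: string): void;
}

// Relay a close to the other hop, degrading to a bare close() when the
// code cannot be sent explicitly (for example 1006).
function relayClose(target: Closable, code: number, reason = ""): void {
  if (isSendableCloseCode(code)) {
    target.close(code, reason);
  } else {
    target.close();
  }
}
```

Application codes such as 4009 pass through unchanged, while reserved codes lose both the code and the reason on the next hop.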

Tested and proven

E2e suite (17 tests): Cap'n Web RPC through a tunneled WebSocket, CLI forwarding (text, binary, subprotocols, close codes, handshake headers), Workers-style handler in plain Node, hosted-gateway forwarding, concurrent sockets without cross-talk, fast failure when the local server rejects the upgrade, disconnect cleanup, the oversized-message guard, and Captun through Captun (an inner gateway tunneled through an outer tunnel — traffic crosses two gateways and two tunnel clients).

Robustness campaign against a real deployed gateway (Cloudflare, gateway secret):

  • Server matrix, all passing the full check set (connect, push, text, byte-verified binary, close codes both directions, concurrent clients): socket.io (own framing + acks, unmodified), Node ws, Bun.serve, Deno, crossws, in-process WebSocketPair, and a Cloudflare Durable Object running createCaptunTunnel against its own fetch (tunnel survived 67+ minutes inside the DO).
  • Benchmarks: p50 RTT 51ms through the real edge vs 0.32ms local (captun's own stack adds <1ms); ~1,900 msgs/s; byte-exact payloads to ~24MiB (hence the 16MiB guard); 50 concurrent connections, 100-cycle churn, 90s soak, 200 pipelined mixed text/binary — zero losses, reorders, or cross-talk.

What was difficult

  • Cap'n Web disposes stubs received as call arguments when the call returns — handles kept for the socket's lifetime must be dup()ed (verified with a standalone repro, encapsulated in the pipe).
  • Runtimes disagree about binary: Node delivers Blob (async to read), workerd ArrayBuffer/Blob, ws gives Buffer — sometimes cross-realm where instanceof lies. Conversions are duck-typed and chained through a promise queue so a slow Blob read can never reorder frames or race the close event.
  • miniflare landmines (test infra only): getWorker().fetch() corrupts binary frames to "[object Blob]", and workerd stringifies send(Blob) — noted in the tests where they bite; binary coverage runs over real sockets.
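
The promise-queue idea from the binary bullet above, as a minimal sketch. `OrderedPump` is a hypothetical name and the real pipe certainly differs; the point is only that chaining every conversion onto one tail promise keeps delivery in arrival order, even when an earlier conversion is slower than a later one.

```typescript
// Serializes async conversions so outputs keep arrival order.
class OrderedPump {
  private tail: Promise<void> = Promise.resolve();
  private sink: (event: string) => void;

  constructor(sink: (event: string) => void) {
    this.sink = sink;
  }

  // `convert` may resolve late (e.g. reading a Blob); the next message
  // still waits for it because it is chained behind the same tail.
  push(convert: () => Promise<string>): void {
    this.tail = this.tail.then(async () => {
      this.sink(`message:${await convert()}`);
    });
  }

  // The close event is queued behind every pending message.
  // Note: this sketch has no error handling; a rejected conversion
  // would stall the chain, which real code must deal with.
  close(code: number): Promise<void> {
    this.tail = this.tail.then(() => this.sink(`close:${code}`));
    return this.tail;
  }
}
```

Pushing a slow conversion followed by a fast one, then closing, yields the slow message, the fast message, and the close, in that order.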

Validation

  • pnpm run check, pnpm test (core suite, 134 passed — example tests moved to pnpm test:examples, which CI runs with its pinned Bun/Deno versions)
  • Full server/client matrix + benchmarks against a deployed gateway (results above)

🤖 Generated with Claude Code


Note

High Risk
Changes core gateway routing, Durable Object lifecycle, and tunnel RPC surface; bugs could break upgrades, leak cookies on non-101 paths, or affect all connections on a tunnel if limits fail.

Overview
Adds end-to-end WebSocket forwarding through Captun tunnels, not just HTTP.

The gateway routes upgrades through the Durable Object’s fetch path (not forward RPC) so 101 + webSocket survives workerd. Public sockets are bridged to the tunnel client via Cap’n Web connectWebSocket and pipeWebSocketToHandle, with a 16MiB per-message cap, ordered relay, and 1001 cleanup when the tunnel client drops.

Library: new WebSocketFetcher / WebSocketHandle, optional connectWebSocket on createCaptunTunnel, runtime helpers (WebSocketPair, createWebSocketResponse, isWebSocketUpgradeRequest), and automatic bridging when fetch returns a Workers-style upgrade response.

CLI: dials wss: to the local target, forwards handshake headers, and logs 101 upgrades.

Tests/docs: large e2e coverage (RPC, CLI, hosted gateway, nested tunnels), pnpm test scoped to test/, pnpm test:examples in CI, engines.node >=22.4, and README WebSocket section.

Reviewed by Cursor Bugbot for commit 89d2dd5. Bugbot is set up for automated code reviews on this repo. Configure here.

@jonastemplestein jonastemplestein marked this pull request as ready for review June 11, 2026 13:25
Comment thread src/index.ts Outdated
Comment thread src/server/worker.ts Outdated
The tunnel already runs over Cap'n Web, which passes capabilities in both
directions and serializes Uint8Array natively, so tunneled WebSockets don't
need a bespoke wire format. connectWebSocket now receives a WebSocketHandle
for the public socket and returns one for the local socket; each message is
an ordered RPC call. This deletes the newline-JSON/base64 frame codec, the
streamed-body POST bridge, and the listener-vs-close races that could drop
or reorder frames and crash the CLI on an unhandled rejection.

Also:
- Pump conversions through a promise chain so Blob messages (Node delivers
  binary as Blob) cannot reorder or race the close event.
- Forward only the negotiated subprotocol onto the public 101 instead of
  every local response header.
- Skip set-cookie stripping for 101 responses, which cannot be reconstructed.
- Compare the Upgrade header case-insensitively.
- Drop the unused string overload of createTunnelForwardRequest and the
  unreachable 501 guard.
- Replace the spawned Bun fixture with a miniflare echo worker and cover
  binary frames and close-code propagation end to end.
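
For the case-insensitive Upgrade comparison, a minimal sketch of what such a check can look like. `isWebSocketUpgradeValue` is a hypothetical helper, not the library's isWebSocketUpgradeRequest; accepting a comma-separated list is an extra assumption of this sketch.

```typescript
// True when the Upgrade header value names "websocket" in any letter case.
function isWebSocketUpgradeValue(upgrade: string | null | undefined): boolean {
  if (!upgrade) return false;
  return upgrade
    .split(",")
    .some((token) => token.trim().toLowerCase() === "websocket");
}
```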

pkg-pr-new Bot commented Jun 11, 2026


npx https://pkg.pr.new/captun@27

commit: 89d2dd5

A tunnel client disconnect now closes that tunnel's forwarded public
WebSockets with 1001 instead of leaving idle sockets open until their next
send fails. The shard tracks forwarded sockets per active tunnel and closes
them on disconnect and on same-name reconnect.

New e2e coverage for the ways a tunneled WebSocket server should just work:
- Captun through Captun: the CLI exposes an inner gateway through an outer
  tunnel; RPC and echo traffic cross two gateways and two tunnel clients.
- The hosted gateway forwards WebSockets (was only covered on the core worker).
- Concurrent WebSockets over one tunnel do not cross talk.
- Subprotocol negotiation reaches the public 101 on both the CLI path and
  the local-fetch fallback path.
- A local server that rejects the upgrade fails the public handshake instead
  of hanging.
- Disconnecting the tunnel client closes public sockets with 1001.

Also dispose the piped handle stub even when the final close call throws.
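
The per-tunnel tracking described in this commit could look roughly like the sketch below. `ForwardedSocketRegistry` and its method names are hypothetical; only the behavior (track forwarded sockets per tunnel, close them with 1001 on disconnect or same-name reconnect) comes from the text.

```typescript
interface ClosableSocket {
  close(code?: number, reason?: string): void;
}

class ForwardedSocketRegistry {
  private byTunnel = new Map<string, Set<ClosableSocket>>();

  // Remember a public socket that was forwarded through `tunnel`.
  track(tunnel: string, socket: ClosableSocket): void {
    let sockets = this.byTunnel.get(tunnel);
    if (!sockets) {
      sockets = new Set<ClosableSocket>();
      this.byTunnel.set(tunnel, sockets);
    }
    sockets.add(socket);
  }

  // Forget a socket that closed on its own.
  untrack(tunnel: string, socket: ClosableSocket): void {
    this.byTunnel.get(tunnel)?.delete(socket);
  }

  // Called on tunnel disconnect and before a same-name reconnect.
  // Returns how many sockets were told to close.
  closeAll(tunnel: string): number {
    const sockets = this.byTunnel.get(tunnel);
    if (!sockets) return 0;
    this.byTunnel.delete(tunnel);
    for (const socket of sockets) {
      try {
        socket.close(1001, "tunnel disconnected");
      } catch {
        // The socket may already be closed; nothing more to do.
      }
    }
    return sockets.size;
  }
}
```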
Comment thread src/cli/bin.ts Outdated
jonastemplestein changed the title from "[codex] add websocket tunnel proof" to "Forward WebSockets through Captun tunnels" on Jun 11, 2026
The CLI's WebSocket path dropped Cookie, Authorization, Origin, and other
handshake headers that tunneled HTTP requests forward, so local servers
doing handshake auth worked over HTTP but not over WebSockets. Node's
WebSocket (undici) accepts a headers option; forward everything except
hop-by-hop headers and the Sec-WebSocket-* family, which the local client
negotiates itself. The e2e CLI test now asserts the local server sees the
public client's cookie and authorization headers.

Found by Cursor Bugbot on the PR.
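
A sketch of the header filtering this commit describes. The exact hop-by-hop set used by the CLI is not shown in the PR, so the list below is an assumption based on the commonly cited HTTP hop-by-hop headers, and `forwardableHandshakeHeaders` is a hypothetical name.

```typescript
// Assumed hop-by-hop set; the CLI's actual list may differ.
const HOP_BY_HOP = new Set([
  "connection",
  "keep-alive",
  "proxy-authenticate",
  "proxy-authorization",
  "te",
  "trailer",
  "transfer-encoding",
  "upgrade",
]);

// Keep every handshake header except hop-by-hop ones and the
// Sec-WebSocket-* family, which the local client negotiates itself.
function forwardableHandshakeHeaders(
  incoming: Iterable<[string, string]>,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [name, value] of incoming) {
    const lower = name.toLowerCase();
    if (HOP_BY_HOP.has(lower) || lower.startsWith("sec-websocket-")) continue;
    out[lower] = value;
  }
  return out;
}
```

The result is a plain object, which is the shape a `headers` option typically accepts.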
A plain fetch handler can now answer tunneled WebSockets the Workers way on
any runtime. Two primitives were missing outside workerd and are now
exported: WebSocketPair (the native pair where the runtime has one,
otherwise an in-memory pair with Workers accept()/queuing semantics) and
createWebSocketResponse (a real 101 upgrade response in Workers; elsewhere a
Response that carries the socket to the tunnel bridge, since other runtimes
reject status 101). connectWebSocket remains the low-level hook for dialing
out to a separate local WebSocket server, which is what the CLI does.

The new e2e test runs a Workers-style handler in plain Node — welcome
message sent before the handler returns, echo, subprotocol selection, and
close codes in both directions — with no workerd on the target side.
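
One way to picture an in-memory pair with accept()/queuing semantics is the sketch below. It is a guess at the general shape, not the exported WebSocketPair: `MemorySocket` and `createMemoryPair` are hypothetical, delivery is synchronous, and close handling is left out entirely.

```typescript
type Payload = string | Uint8Array;
type Listener = (data: Payload) => void;

class MemorySocket {
  peer!: MemorySocket;
  private accepted = false;
  private queue: Payload[] = [];
  private listeners: Listener[] = [];

  onMessage(listener: Listener): void {
    this.listeners.push(listener);
  }

  // Start delivering; anything that arrived earlier is flushed in order.
  accept(): void {
    this.accepted = true;
    for (const data of this.queue.splice(0)) this.dispatch(data);
  }

  // Sending hands the payload to the other end of the pair.
  send(data: Payload): void {
    this.peer.receive(data);
  }

  private receive(data: Payload): void {
    if (this.accepted) this.dispatch(data);
    else this.queue.push(data);
  }

  private dispatch(data: Payload): void {
    for (const listener of this.listeners) listener(data);
  }
}

function createMemoryPair(): [MemorySocket, MemorySocket] {
  const a = new MemorySocket();
  const b = new MemorySocket();
  a.peer = b;
  b.peer = a;
  return [a, b];
}
```

Queuing until accept() is what lets a handler send a welcome message before it returns, as the e2e test above does, without the message being lost.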
Comment thread src/cli/bin.ts
Robustness testing against a deployed gateway showed a single ~24MiB message
kills the entire tunnel: base64 over the Cap'n Web leg overflows its ~32MiB
frame limit and every connection on the tunnel gets closed with 1001. The
pipe now closes just the offending socket with 4009 before forwarding;
covered by an e2e test that checks the tunnel survives.
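
The arithmetic behind the cap, as a small sketch. The 16MiB cap, the ~32MiB frame limit, and close code 4009 come from the text; the helper names and the choice to treat exactly 16MiB as allowed are assumptions.

```typescript
const MIB = 1024 * 1024;
const MESSAGE_CAP_BYTES = 16 * MIB; // per-message cap from the text
const FRAME_LIMIT_BYTES = 32 * MIB; // approximate Cap'n Web frame limit from the text

// Padded base64 emits 4 output characters for every 3 input bytes.
function base64Length(byteLength: number): number {
  return Math.ceil(byteLength / 3) * 4;
}

// Returns the close code for an oversized binary message, or undefined
// when the message may be forwarded. (Assumes the cap is inclusive.)
function closeCodeForBinary(byteLength: number): number | undefined {
  return byteLength > MESSAGE_CAP_BYTES ? 4009 : undefined;
}
```

A 24MiB payload encodes to exactly 32MiB of base64, which is why messages of roughly that size hit the frame limit, while a 16MiB payload encodes to about 21.3MiB and leaves headroom.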

Also: a short WebSockets section in the README, and pnpm test now runs the
core test/ suite only — example tests depend on the Bun/Deno versions CI
pins, so they move to pnpm test:examples, which CI runs as its own step.
jonastemplestein changed the title from "Forward WebSockets through Captun tunnels" to "Tunnel WebSockets: serve Cap'n Web capabilities from a fetch function, expose any local WebSocket server" on Jun 11, 2026
If the local server closes immediately after the WebSocket handshake, the
CLI now returns the 502 rejection instead of accepting an already-dead
socket. A close that lands after acceptance was already propagated by the
pipe; this narrows the remaining race where readyState flips before the
close event dispatches.

Found by Cursor Bugbot on the PR.

cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.


Reviewed by Cursor Bugbot for commit 665c18d.

Comment thread src/index.ts
Comment thread src/index.ts Outdated
mmkal changed the title from "Tunnel WebSockets: serve Cap'n Web capabilities from a fetch function, expose any local WebSocket server" to "Tunnel WebSockets end to end" on Jun 11, 2026
mmkal and others added 6 commits June 11, 2026 18:01
The in-memory WebSocketPair accepts any value in send(), and the tunnel
pipe's message normalization falls back to String(data) for types it
doesn't recognize — so a handler bug like send({object}) reaches the
public client as the text frame "[object Object]" instead of failing.

This test asserts the sane behavior (the socket closes instead of
delivering corrupted text) and fails against the current fallback; the
fix follows in the next commit.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
webSocketMessage now throws on payload types it doesn't recognize
instead of falling back to String(data). The pipe's existing error
handling turns the throw into a 1011 close of just that socket, so a
handler bug surfaces as a failed connection rather than the public
client silently receiving "[object Object]".

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
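
The strict normalization described in this commit, sketched under assumptions. `normalizePayload` is a hypothetical stand-in for webSocketMessage, and Blob is deliberately left out because reading it is asynchronous.

```typescript
type Payload = string | Uint8Array;

// Object tag lookup works across realms, where instanceof can give wrong answers.
function tagOf(value: unknown): string {
  return Object.prototype.toString.call(value);
}

// Accept only payload types that are recognized; anything else throws
// instead of being stringified to something like "[object Object]".
function normalizePayload(data: unknown): Payload {
  if (typeof data === "string") return data;
  if (tagOf(data) === "[object ArrayBuffer]") {
    return new Uint8Array(data as ArrayBuffer);
  }
  if (ArrayBuffer.isView(data)) {
    // Covers typed arrays, DataView, and Node's Buffer.
    return new Uint8Array(data.buffer, data.byteOffset, data.byteLength);
  }
  throw new TypeError(`unsupported WebSocket payload: ${tagOf(data)}`);
}
```

The caller is then free to turn the thrown error into a close of just that socket, as the commit describes.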
A string's .length counts UTF-16 code units; multi-byte characters
serialize to up to 3x that in UTF-8 and JSON escapes up to 6x, so a
string under 16M units could still overflow the ~32MiB Cap'n Web frame
limit and kill every connection on the tunnel — the exact failure the
cap exists to prevent. Strings now measure their JSON-escaped UTF-8
size, with a fast path that skips encoding when even the worst case
fits.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
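
A sketch of measuring a string's JSON-escaped UTF-8 size with a worst-case fast path. The names are hypothetical, and counting the two surrounding quotes is an assumption about how the real code measures.

```typescript
const STRING_CAP_BYTES = 16 * 1024 * 1024;

// Bytes the string occupies once JSON-escaped and UTF-8 encoded.
function escapedUtf8Size(text: string): number {
  return new TextEncoder().encode(JSON.stringify(text)).length;
}

function stringExceedsCap(text: string, cap = STRING_CAP_BYTES): boolean {
  // Fast path: no UTF-16 code unit can cost more than 6 bytes (a \uXXXX
  // escape), plus 2 bytes for the quotes. If even that fits, skip encoding.
  if (text.length * 6 + 2 <= cap) return false;
  return escapedUtf8Size(text) > cap;
}
```

Most messages are far below one sixth of the cap, so they take the fast path and never pay for the encode.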
The CLI's WebSocket forwarding relies on undici's non-standard
WebSocket options ({ protocols, headers }); older Nodes either lack the
global WebSocket or silently drop the forwarded handshake headers
rather than erroring. Fail loudly at install time instead.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
WorkerWebSocket, WorkerWebSocketPairConstructor, and
WebSocketResponseInit were duplicated across src/index.ts,
src/server/worker.ts, and the e2e fixture. The public WebSocketPair
export was already typed with the private constructor type, so
exporting it also fixes a leaked-private-type hole.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ting

newWebSocketRpcSession's type parameter describes the remote interface,
and capnweb's stub provides onRpcBroken itself — so the session can be
declared as RemoteFetcherCapability directly. The as-unknown-as cast
was papering over a wrong type argument.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@mmkal mmkal merged commit e147d2c into main Jun 12, 2026
5 checks passed
mmkal added a commit that referenced this pull request Jun 12, 2026
PR #27 changed src/index.ts without re-running
scripts/build-hosted-browser-module.ts, so the committed module was
missing the WebSocket forwarding code. Deploys were unaffected (build
and deploy scripts regenerate first), but wrangler dev and direct src
consumers served the stale module.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
mmkal added a commit that referenced this pull request Jun 12, 2026
PR #27 shipped with a stale browser-module.generated.ts because nothing
checks it. Now autofix.ci regenerates it on PRs (same pattern as
artifact.ci's generate step), and CI fails on any remaining drift via
git diff --exit-code after pnpm build, which already runs the
generator.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>