Status: draft / sketch. Open for refinement before scoping into work items. Not committing to the exact API surface below.
Background
Two halves of "agent-side x402 spending" already exist, alongside the seller-side plumbing:
- Stateless single-shot — buy.py pay <url> (skill buy-x402, merged in 6eb9a8b) signs one EIP-3009 / Permit2 auth, attaches X-PAYMENT, sends any HTTP method, and returns the response. Works against any type:http x402 endpoint. Max loss = 1 request.
- Persistent inference budgets — buy.py buy <name> pre-signs N auths, stores them in a PurchaseRequest CR, and the x402-buyer sidecar in the litellm pod spends them transparently as the agent hits paid/<remote-model> through LiteLLM. Max loss = N × price; runtime path holds zero signer access.
- Seller side — obol sell http and obol sell demo {hello,blocks,oracle} let anyone gate any in-cluster HTTP service behind x402 ForwardAuth.
Gap
The persistent / sidecar-mediated path is inference-shaped, OpenAI-Chat-only. There's no equivalent for non-LLM HTTP services, and the only LLM wire format covered is OpenAI Chat. Specifically:
A. No pre-paid budgets for HTTP services
An agent that wants to call demo-blocks 200 times to monitor a wallet, or hit a paid OCR / RAG / indexer / market-data endpoint repeatedly, has to re-sign on every call via pay. That burns local-signer round-trips and adds latency to every request. The signing/retry/settlement state machine in internal/x402/buyer/ already handles the persistent case for inference — it just isn't exposed for arbitrary HTTP.
B. No in-cluster proxy endpoint for non-Python consumers
buy.py is a Python skill. Anything else in the cluster — Node.js MCP servers, Go indexers, Rust services in another pod, browser webapps behind the tunnel, other agent runtimes — cannot easily shell out to a Python script. They need an HTTP endpoint they can call to get paid responses back.
C. No transparent tool-call shape for agents
Even from a Python skill, pay is an explicit subprocess call. For an LLM tool call, "pay this URL" should look like a normal HTTP request the model issues, with the sidecar transparently handling 402 retries — same UX as paid/<model> for inference.
D. No Anthropic-native wire format
Claude Code, Cline, Aider, and most "Claude-class" agent tools speak the Anthropic Messages API (POST /v1/messages), not OpenAI Chat. They can't be pointed at the existing paid/* route without an adapter. Today the only path for those tools is to hold an ANTHROPIC_API_KEY and pay Anthropic directly — bypassing the cluster, the marketplace, and any seller's monetized models.
Proposal sketch
Reframe the sidecar as a single payment + retry + settle engine fronted by multiple wire-format adapters, not a single LLM-shaped proxy:
┌──────────────────────────┐
/v1/chat/completions │ OpenAI Chat adapter │ ← shipped (paid/* via LiteLLM)
───────────────────► │ │
│ │
/v1/messages │ Anthropic Messages │ ← new — Claude Code / Cline / Aider
───────────────────► │ adapter │
│ │
/proxy/<host>/<path> │ Generic HTTP adapter │ ← new — arbitrary x402 services
───────────────────► │ │
└────────────┬─────────────┘
│
┌────────────▼─────────────┐
│ Auth pool lookup │
│ (host- or model-keyed) │
│ X-PAYMENT attach │
│ 402 retry │
│ Settle on <400 success │
└──────────────────────────┘
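The box at the bottom is the shared engine every adapter funnels into. A minimal Python sketch of that loop, with hypothetical names (pool, send) standing in for the real internal/x402/buyer/ types:

```python
def pay_and_forward(pool, send, request, max_retries=3):
    """Attach one pre-signed auth, retry on 402, settle on success.

    pool - per-key auth pool (host- or model-keyed); next_auth()/settle()
           are illustrative method names, not the real buyer API
    send - callable(request, headers) -> (status, body); stands in for
           the upstream HTTP call
    """
    for _ in range(max_retries):
        auth = pool.next_auth()          # take one pre-signed auth
        if auth is None:
            return 402, "auth pool exhausted"
        status, body = send(request, {"X-PAYMENT": auth})
        if status == 402:
            continue                     # auth rejected: retry with the next one
        if status < 400:
            pool.settle(auth)            # settle only on <400 success
        return status, body
    return 402, "retries exhausted"
```

The adapters differ only in how they pick the pool key and reshape the request; the attach / retry / settle cycle is identical for all three.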
New endpoints on the existing x402-buyer sidecar:
POST /proxy/<host>/<path...>
- per-host auth pool lookup
- one auth attached as X-PAYMENT
- upstream call forwarded verbatim
- 402 retry with next auth
- settle on <400 success
- upstream response returned to caller
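From the consumer's side this is plain URL construction; a hedged sketch (the sidecar name and port x402-buyer:8402 follow the dashboard example later in this doc, not a settled API):

```python
def proxy_url(sidecar: str, host: str, path: str) -> str:
    """Build the generic-HTTP-adapter URL for an upstream host + path.

    Hypothetical shape: the caller needs no wallet, no signing, and no
    X-PAYMENT header; the sidecar attaches an auth from the host-keyed
    pool and retries on 402 before the response comes back.
    """
    return f"http://{sidecar}/proxy/{host}/{path.lstrip('/')}"

# e.g. urllib.request.urlopen(proxy_url(
#     "x402-buyer:8402", "obol.stack:8080", "services/demo-blocks/latest"))
```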
POST /v1/messages
- Anthropic Messages API surface
- translate request → OpenAI Chat → upstream (or pass-through if upstream is anthropic-native)
- reuse the same paid/<model> auth pool the sidecar already drives
- translate response back to Anthropic Messages format
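A sketch of the request half of that translation (the supported field list is illustrative; matching the risk notes in this doc, untranslatable fields fail loudly instead of being dropped):

```python
def anthropic_to_openai(req: dict) -> dict:
    """Translate a happy-path Anthropic Messages request into OpenAI
    Chat shape. Raises ValueError (-> HTTP 400) on fields this sketch
    cannot translate, rather than silently degrading a paid response."""
    SUPPORTED = {"model", "max_tokens", "system", "messages", "temperature"}
    unknown = set(req) - SUPPORTED
    if unknown:
        raise ValueError(f"untranslatable fields: {sorted(unknown)}")
    messages = []
    if "system" in req:
        # Anthropic carries the system prompt out-of-band; OpenAI Chat
        # expects it as the first message.
        messages.append({"role": "system", "content": req["system"]})
    messages += req["messages"]
    out = {"model": req["model"], "messages": messages,
           "max_tokens": req["max_tokens"]}
    if "temperature" in req:
        out["temperature"] = req["temperature"]
    return out
```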
POST /admin/budget
- extension of the buy.py / PurchaseRequest path for type:http
- pre-sign N auths against a (host, path-glob) tuple
- store in a host-keyed pool alongside today's model-keyed pools
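A host-keyed pool with a path-glob could be as simple as the following (illustrative shapes, not the real buyer-config schema):

```python
import fnmatch

# Hypothetical buyer-config: pools keyed by (host, path-glob) instead of
# (upstream, model); values are the remaining pre-signed auths.
pools = {
    ("obol.stack:8080", "/services/demo-blocks/*"): ["auth1", "auth2"],
}

def lookup_pool(host: str, path: str):
    """Return the first pool whose (host, glob) matches the request,
    or None when no budget covers it (caller then gets a plain 402)."""
    for (h, glob), auths in pools.items():
        if h == host and fnmatch.fnmatch(path, glob):
            return auths
    return None
```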
Plus controller / skill plumbing:
- Relax the PurchaseRequest inference-shape assumption: a type:http purchase should NOT publish a paid/<model> LiteLLM route. It should write a buyer-config entry keyed by (host, path-glob) instead of (upstream, model).
- Extend buy.py buy to accept --type http --path-glob <glob> and produce the right CR shape.
- Pricing-aware Anthropic adapter: respect the Anthropic-style cache_control markers we now inject by default for Anthropic models, so prompt caching cost savings carry through the paid path.
What this unlocks (concrete examples)
Demo services already in tree
- demo-blocks polling — 200-call wallet monitor: pre-buy budget once, hit /proxy/obol.stack:8080/services/demo-blocks/... from any pod. Today: 200 separate sign/send round-trips.
- demo-oracle from a Node dashboard — frontend (or any non-Python service in the cluster) does fetch("http://x402-buyer:8402/proxy/...") and gets paid responses, zero crypto code on the dashboard.
Real monetizable services
- Self-hosted OCR / transcription / embedding endpoints behind obol sell http, consumed as transparent agent tools.
- Private RAG corpus charged per-query, used inside a research session with bounded total cost.
- Custom indexers / Dune-style analytical queries monetized per call.
- Validator / DV monitoring feeds subscribed to with a daily auth budget.
- Image / video / voice generation endpoints — bulk batch with one budget instead of N signs.
- Third-party market data / weather / search — same agent, same code, regardless of upstream.
Cross-runtime
- MCP servers in the cluster expose tools whose backends are paid x402 endpoints. The MCP server just speaks HTTP to the sidecar — no key handling, no signing, no budget bookkeeping. Every MCP server in the ecosystem becomes spend-aware without code changes.
Adoption: Anthropic-native tools (new with the /v1/messages adapter)
The Anthropic Messages adapter turns every existing Claude-class tool into a potential obol buyer with one env var:
export ANTHROPIC_BASE_URL=http://x402-buyer.llm.svc.cluster.local:8402/anthropic
# or, from the host:
export ANTHROPIC_BASE_URL=http://127.0.0.1:18402/anthropic # via kubectl port-forward
unset ANTHROPIC_API_KEY
claude # or cline / aider / any anthropic-native client
From that point on:
- Claude Code's autonomous loop, tool calls, and subagents work unchanged.
- Every model call is paid per request from the agent's wallet against whatever upstream the cluster routes to (could be a marketplace seller's hosted Claude-compatible endpoint, a self-hosted GPU box, or a remote Anthropic-fork).
- No ANTHROPIC_API_KEY required.
- Live model switching reuses the existing LiteLLM model-routing layer.
- Spending caps, auto-refill, and audit trail come for free from the existing budget machinery.
The same env-var pattern works for any tool that respects ANTHROPIC_BASE_URL (Claude Code, Cline, Aider, claude-dev, custom integrations). It also gives sellers a much larger addressable audience — anyone with claude installed becomes a potential customer of any model exposed via obol sell http or obol sell inference.
Risk surface
- Auth pools are already per-upstream-keyed today — a malicious host can only spend auths the user pre-signed for it. That property carries over to host-keyed pools.
- SSRF guard required: /proxy/<host> must refuse <host> resolving to internal cluster addresses unless explicitly allowlisted. Same discipline that keeps frontend and erpc HTTPRoutes hostname-restricted in traefik / obol-frontend / erpc namespaces today.
- The Anthropic adapter must not silently drop fields it can't translate (vision, parallel tool use, MCP-shaped messages). Errors should be 400 with a clear body, not a "successful" payment for a degraded response. Document the supported subset up front.
- The sidecar is distroless — the new mux routes add no new dependencies.
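The SSRF check above can be sketched with the standard library (the allowlist entry is hypothetical; the real guard would live in the Go sidecar):

```python
import ipaddress
import socket

ALLOWLIST = {"obol.stack"}  # hypothetical: explicitly permitted internal hosts

def host_allowed(host: str) -> bool:
    """Refuse /proxy/<host> targets that resolve to private, loopback,
    or link-local addresses unless explicitly allowlisted.

    Port split is naive; IPv6 literals would need bracket handling."""
    name = host.rsplit(":", 1)[0]
    if name in ALLOWLIST:
        return True
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return False          # unresolvable: refuse
    for *_, sockaddr in infos:
        ip = ipaddress.ip_address(sockaddr[0])
        if ip.is_private or ip.is_loopback or ip.is_link_local:
            return False
    return bool(infos)
```

A production guard should also pin the outbound connection to the address it validated, so a second DNS lookup can't be rebound to an internal IP between check and use.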
Suggested split
- POST /proxy/<host>/<path...> with per-host pool lookup, retry, settle (internal/x402/buyer/proxy.go).
- POST /v1/messages Anthropic Messages adapter — translate request to OpenAI Chat, reuse the existing paid pool, translate response back. Document the supported subset (parity with what LiteLLM's anthropic-passthrough already covers, minus what's intentionally out of scope).
- /admin/reload updates to pick up host-keyed pools alongside model-keyed pools.
- PurchaseRequest handling for type:http without publishing a paid/<model> LiteLLM route (internal/serviceoffercontroller/).
- buy.py buy --type http --path-glob flag in internal/embed/skills/buy-x402/scripts/buy.py.
- A non-Python consumer example calling the proxy via fetch.
- ANTHROPIC_BASE_URL env-var setup docs, including the port-forward path for host-side use.
- flow-08-buy.sh / monetize integration test to cover an HTTP budget against demo-blocks.
- End-to-end check: claude (or a curl shaped like Anthropic Messages) → sidecar /v1/messages → upstream → settled response.
Out of scope for this issue
- Changing the verifier path (x402-verifier stays verify-only on the cluster-routed paid flow).
- Adding new chains / tokens — orthogonal to this proxy work.
- A dedicated CLI surface (obol agent pay …) — can come later as ergonomics on top of the sidecar route.
- Vision / image input on the Anthropic adapter — explicitly deferred to a follow-up; not all sellers' upstreams support it and the translation layer adds risk.