Design a remote kernel mode: sidecar service with a thin client (big swing) #227

@dgenio

Summary

Design (and prototype) running the kernel as an out-of-process sidecar — a small
HTTP/JSON service exposing request_capabilities, grant, invoke, expand,
explain — with a thin Python client preserving the current API, so multiple agent
processes (or non-Python agents) can share one policy/audit/budget domain.
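To make the thin-client idea concrete, here is a minimal sketch, not the proposed design. It assumes each verb maps to a `POST /v1/<verb>` call with a JSON body; the `/v1/` path layout, the `RemoteKernel` and `http_transport` names, and the keyword-argument calling convention are all hypothetical, since the real verb signatures live in the kernel package.

```python
import json
from typing import Any, Callable, Dict
from urllib import request

# The five verbs named in this issue; everything else here is illustrative.
VERBS = ("request_capabilities", "grant", "invoke", "expand", "explain")

Transport = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def http_transport(base_url: str) -> Transport:
    """Build a transport that POSTs JSON to <base_url>/v1/<verb> (assumed layout)."""

    def send(verb: str, payload: Dict[str, Any]) -> Dict[str, Any]:
        req = request.Request(
            f"{base_url}/v1/{verb}",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with request.urlopen(req, timeout=5) as resp:
            return json.loads(resp.read().decode("utf-8"))

    return send


class RemoteKernel:
    """Thin client: forwards each verb to the sidecar, holds no policy state."""

    def __init__(self, send: Transport) -> None:
        self._send = send

    def __getattr__(self, name: str) -> Callable[..., Dict[str, Any]]:
        # Only the five wire verbs are exposed; anything else is an error.
        if name not in VERBS:
            raise AttributeError(name)

        def call(**kwargs: Any) -> Dict[str, Any]:
            return self._send(name, kwargs)

        return call
```

Injecting the transport keeps the client testable without a network and leaves room for a different transport later (for example, a Unix socket) without changing the caller-facing surface.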

Why this matters

The in-process design is the kernel's strength (latency, simplicity) and its
adoption ceiling: rate limits, revocation, budgets, and traces are per-process,
so a horizontally scaled agent fleet gets N independent security domains (see
ISSUE 62). A sidecar mode turns the kernel into shared infrastructure — one audit
trail, one limiter, one revocation list per deployment — and creates the seam for
non-Python SDKs (ISSUE 63). This is the natural "big swing" that the MCP-gateway
recipe (#131) gestures at; a first-class wire API is broader: #131 exposes the
kernel to MCP clients, this exposes the kernel to its own distributed hosts.

Current evidence

External context

The sidecar/policy-agent deployment model (policy engine as a local daemon with
thin clients) is well-established in the authorization ecosystem; agent-security
gateways converge on the same shape.

Proposed implementation

  1. Phase 0 (this issue's deliverable): a design doc — wire schema for the five
    verbs, authn between host and sidecar, error mapping (stable reason codes are
    already wire-friendly), streaming transport choice, what stays client-side
    (adapter rendering), versioning policy.
  2. Phase 1: prototype under examples/ or a separate package (keep core deps
    minimal — the server can live behind an extra), reusing pydantic models.
  3. Phase 2: conformance — remote mode must pass the invariant suite (ISSUE 34) and
    canary suite (ISSUE 18) verbatim through the client.
  4. Explicit non-goals: not a multi-tenant SaaS, not a persistence layer (#126,
    "Pluggable persistence for TraceStore, HandleStore, and token revocation",
    handles durability).
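For the error-mapping item in Phase 0, one possible shape is sketched below. The reason codes, the status-code choices, and the `WireError` envelope are placeholders invented for illustration; the real codes are whatever `policy_reasons.py` and `errors.py` define.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class WireError:
    """Hypothetical error envelope carried in the HTTP response body."""

    reason_code: str
    message: str
    wire_version: str = "1"


# Placeholder reason codes and status choices, not the kernel's actual taxonomy.
STATUS_BY_REASON = {
    "POLICY_DENIED": 403,
    "TOKEN_REVOKED": 401,
    "RATE_LIMITED": 429,
    "UNKNOWN_CAPABILITY": 404,
}


def to_http(err: WireError) -> Tuple[int, str]:
    """Server side: pick a status and serialize the envelope."""
    status = STATUS_BY_REASON.get(err.reason_code, 500)
    return status, json.dumps(asdict(err), sort_keys=True)


def from_http(body: str) -> WireError:
    """Client side: rebuild the error so callers can branch on reason_code."""
    return WireError(**json.loads(body))
```

The point of the sketch is that the client reconstructs errors from the stable reason code in the body, so the HTTP status can stay coarse and an unrecognized code degrades to a generic 500 rather than breaking older clients.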

AI-agent execution notes

  • Inspect first: kernel/__init__.py (verb signatures), models.py (serializability), errors.py (error taxonomy → HTTP mapping), policy_reasons.py.
  • Decide how handles behave in remote mode (handles already reference server-side data, so this is a natural fit).
  • Edge cases: client/server version skew; streaming over HTTP (SSE vs chunked); secret bootstrap between host and sidecar.
  • Keep the in-process mode the default and untouched.
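If the streaming edge case above is resolved in favor of SSE, the client-side parsing is small. The sketch below is a simplified reader for the `data:` field of the `text/event-stream` format only; it ignores `event:`, `id:`, and `retry:`, and `iter_sse_data` is a hypothetical helper name.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each event from an SSE line stream."""
    buf: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buf:
                yield "\n".join(buf)
                buf = []
            continue
        if line.startswith(":"):
            # Comment line, commonly used as a keep-alive.
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is stripped
        if field == "data":
            buf.append(value)
    # An unterminated trailing event is dropped.


stream = ['data: {"chunk": 1}\n', "\n", ": keep-alive\n", "data: a\n", "data: b\n", "\n"]
events = list(iter_sse_data(stream))
```

Here `events` holds two payloads: the JSON string from the first event and `"a\nb"` from the second, because consecutive `data:` lines in one event are joined with a newline.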

Acceptance criteria

  • A reviewed design doc covering wire schema, auth, errors, streaming, and versioning.
  • A working prototype demonstrating grant→invoke→expand→explain across processes.
  • Invariant and canary suites pass through the remote client in the prototype.

Test plan

Prototype integration tests (local loopback); reuse invariant/canary suites through
the client; latency comparison documented. Run make ci.
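A loopback integration test and the latency measurement could start from something like the sketch below. The stub server merely echoes the request and stands in for the sidecar prototype; `EchoHandler`, `start_stub`, `call`, and `median_latency` are hypothetical names, and the `/v1/<verb>` path is an assumption.

```python
import http.client
import json
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, Callable, Dict


class EchoHandler(BaseHTTPRequestHandler):
    """Stand-in for the sidecar: echoes the verb and payload back as JSON."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        verb = self.path.rsplit("/", 1)[-1]
        body = json.dumps({"verb": verb, "echo": payload}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args: Any) -> None:
        pass  # keep test output quiet


def start_stub() -> HTTPServer:
    """Bind to an ephemeral loopback port and serve in a daemon thread."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call(port: int, verb: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(
            "POST",
            f"/v1/{verb}",
            body=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()


def median_latency(fn: Callable[[], Any], n: int = 20) -> float:
    """Median wall-clock seconds over n calls of fn."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return sorted(samples)[len(samples) // 2]
```

Timing the same verb once through `call` and once through the in-process kernel with `median_latency` gives the two numbers the latency comparison needs; the stub would be swapped for the real prototype server.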

Documentation plan

Design doc under docs/; README deployment-modes section when shipped; CHANGELOG
entry under Added.

Migration and compatibility notes

Opt-in deployment mode; in-process API unchanged. Wire API versioned independently
from day one.
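One way independent wire versioning could handle client/server skew is a major.minor check at connect time. The rule below (same major, server minor at least the client's) is an assumed policy for illustration only; the actual policy is for the design doc to decide.

```python
from typing import Tuple


def parse_wire_version(text: str) -> Tuple[int, int]:
    """Parse a 'major.minor' wire version string."""
    major, minor = text.split(".")
    return int(major), int(minor)


def compatible(client: str, server: str) -> bool:
    """Assumed rule: majors must match; server may be newer within a major."""
    c_major, c_minor = parse_wire_version(client)
    s_major, s_minor = parse_wire_version(server)
    return c_major == s_major and c_minor <= s_minor
```

Under this rule a client speaking 1.2 works against a 1.3 sidecar, while a 1.4 client or a 2.0 client against the same sidecar is rejected up front instead of failing mid-request.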

Risks and tradeoffs

Significant scope: a networked security service implies authn, TLS guidance, and a DoS
surface. Phasing (design → prototype → harden) bounds the bet; the design doc alone
de-risks #131/#126/ISSUE 63 sequencing.

Suggested labels

ecosystem, architecture, product, integrations
