Sandbox is a TypeScript-first Node.js library for running AI-agent work inside libkrun-backed microVMs. It gives agent builders a small API for booting isolated Linux workers, mounting host-controlled filesystems, persisting rootfs mutations, and enforcing explicit network egress policy.
Use Sandbox when your agent needs to run tools, install packages, clone repos, execute untrusted code, or call external APIs without handing the guest broad host filesystem or network access.
Sandbox is designed for:
- agent runtimes that need disposable or durable Linux workspaces,
- coding agents that need real shells, compilers, package managers, and repo checkouts,
- browser or data agents that need tightly scoped outbound network access,
- hosted agent platforms that need per-task isolation with policy and audit points in TypeScript,
- systems that need to broker credentials from the host without putting long-lived secrets inside the guest.
This example creates a durable agent lane, mounts a host-controlled workspace, allows DNS through Cloudflare, allows only GitHub API HTTP traffic, and injects a short-lived GitHub token from the host before the request leaves the sandbox.
import {
  defineSandbox,
  fs,
  network,
  rootfs,
  type SandboxBlockStore,
} from "@torkbot/sandbox";

// In-memory workspace whose contents stay under host control.
const workspace = fs.memory({
  files: {
    "/task.txt": "Summarize the current GitHub user\n",
  },
});

// BlobBackedBlockStore is an application-provided SandboxBlockStore implementation.
const writableRootfs: SandboxBlockStore = new BlobBackedBlockStore({
  bucket: "agent-rootfs-overlays",
  keyPrefix: "lanes/github-worker",
});

// GitHubInstallationTokenService is an application-provided token broker.
const githubTokens = new GitHubInstallationTokenService({
  installationId: 123456,
});

const sandbox = defineSandbox({
  rootfs: rootfs.cow({
    base: rootfs.builtIn("alpine:3.23"),
    writable: writableRootfs,
  }),
  resources: {
    cpus: 4,
    memoryMiB: 4096,
  },
  network: network.policy(async (conn) => {
    // Let the guest resolve names, but answer DNS via an explicit resolver.
    if (conn.matchDns()?.accept({ resolvers: ["1.1.1.1"] })) return;
    // Only GitHub API HTTP(S) traffic gets HTTP middleware.
    const github = conn.matchHttp("api.github.com");
    if (!github) return;
    // Keep the credential decision in host-controlled TypeScript.
    if (!(await githubTokens.canServe(github))) return;
    github.accept(async (request) => {
      request.headers.set(
        "authorization",
        `Bearer ${await githubTokens.tokenForRequest(request)}`,
      );
    });
  }),
});

await using lane = await sandbox.boot({
  mounts: {
    "/workspace": fs.virtual(workspace),
  },
  cwd: "/workspace",
});

const result = await lane.exec("sh", [
  "-lc",
  "cat task.txt && curl -fsSL https://api.github.com/user",
]);
if (result.exitCode !== 0) {
  throw new Error(result.stderr);
}
The guest gets a normal Linux environment. The host keeps control over the workspace contents, rootfs persistence, network decisions, and credential injection.
Use a built-in read-only rootfs and a memory-backed workspace:
const workspace = fs.memory({
  files: { "/hello.txt": "hello from the host\n" },
});
const sandbox = defineSandbox({
  rootfs: rootfs.builtIn("alpine:3.23"),
});
await using lane = await sandbox.boot({
  mounts: { "/workspace": fs.virtual(workspace) },
  cwd: "/workspace",
});
const result = await lane.exec("cat", ["hello.txt"]);
Use rootfs.cow(...) when package installs and rootfs mutations should survive
across boots. Sandbox owns the block-device protocol; your application owns the
storage backend.
const source = rootfs.compose({
  base: rootfs.builtIn("alpine:3.23"),
  overlay: blockStore,
});
const sandbox = defineSandbox({
  rootfs: rootfs.cow({ source }),
});
Attach a writable COW block store to at most one running sandbox instance at a time. Create one store per lane, or enforce exclusivity in your storage layer.
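One way to enforce that exclusivity inside a single host process is a small lease registry, sketched below. `BlockStoreLeases`, `storeKey`, and `ownerId` are hypothetical names for illustration and are not part of Sandbox. The registry only protects one Node.js process; across hosts, the same idea would need a conditional write or lock in the storage backend.

```typescript
// Hypothetical in-process lease registry. Not part of the Sandbox API.
class BlockStoreLeases {
  // storeKey -> ownerId of the lane currently holding the store
  private readonly held = new Map<string, string>();

  // Claims storeKey for ownerId and returns a release function.
  // Throws if any owner (including the same one) already holds the key.
  acquire(storeKey: string, ownerId: string): () => void {
    const current = this.held.get(storeKey);
    if (current !== undefined) {
      throw new Error(
        `block store ${storeKey} is already attached to ${current}`,
      );
    }
    this.held.set(storeKey, ownerId);

    let released = false;
    return () => {
      // Releasing twice is a no-op, so a stale release cannot free
      // a lease that a later owner has acquired.
      if (released) return;
      released = true;
      this.held.delete(storeKey);
    };
  }
}

const leases = new BlockStoreLeases();
const release = leases.acquire("lanes/github-worker", "lane-1");
try {
  // Boot the sandbox with the store attached and run work here.
} finally {
  release();
}
```

Taking the lease before boot and releasing it in a finally block keeps the store claimed for exactly as long as the lane could still be writing to it.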
Mounts are per boot. They are guest-visible paths backed by TypeScript filesystem implementations, not host path passthrough.
await using lane = await sandbox.boot({
  hostname: "agent-42",
  mounts: {
    "/workspace": fs.virtual(workspaceFs),
    "/mnt/shared": fs.virtual(sharedFs),
  },
  cwd: "/workspace",
});
Sandbox configures the kernel hostname during boot. Omit hostname to use the
built-in default, "sandbox".
Sandbox init creates missing mount target directories immediately before attaching each virtual filesystem, matching container runtime behavior when the target parent is on init-owned tmpfs or comes from an earlier virtual mount. On durable rootfs paths that cannot be created without changing rootfs or COW semantics, boot fails with the guest init error surfaced in the host exception.
Networking is default-deny. Policy callbacks grant only the flows they accept:
const sandbox = defineSandbox({
  rootfs: rootfs.builtIn("alpine:3.23"),
  network: network.policy((conn) => {
    if (conn.matchDns()?.accept({ resolvers: ["1.1.1.1"] })) return;
    conn.matchHttp("api.example.com")?.accept((request) => {
      request.headers.set("authorization", `Bearer ${apiToken}`);
    });
  }),
});
Use conn.accept() for raw transport, and protocol match helpers when you want
Sandbox to handle protocol-specific semantics. conn.matchHttp(...) does not
trust the HTTP Host header; it uses trusted destination metadata and then
routes accepted traffic through HTTP-family enforcement.
Credential injection belongs in HTTP middleware, not in the guest filesystem or environment:
const github = conn.matchHttp("api.github.com");
if (!github) return;
if (!(await policyManager.allow(github))) return;
github.accept(async (request) => {
  request.headers.set(
    "authorization",
    `Bearer ${await policyManager.tokenFor(request)}`,
  );
});
This lets the guest run ordinary tools such as curl, git, package managers,
or language CLIs while the host decides which outbound requests receive
credentials.
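The host-side broker behind calls such as policyManager.tokenFor(...) is left to the application. As a rough sketch of one piece of it, the cache below reuses a short-lived token until it nears expiry and shares a single mint call between concurrent requests. `ShortLivedTokenCache`, `MintedToken`, and the `mint` callback are hypothetical; the source does not show how tokens are actually minted.

```typescript
// Hypothetical token shape; the real broker's types are not shown in this README.
interface MintedToken {
  value: string;
  expiresAtMs: number;
}

class ShortLivedTokenCache {
  private readonly mint: () => Promise<MintedToken>;
  private readonly refreshSkewMs: number;
  private readonly now: () => number;
  private current: MintedToken | undefined;
  private inflight: Promise<MintedToken> | undefined;

  constructor(
    mint: () => Promise<MintedToken>,
    refreshSkewMs: number = 60_000,
    now: () => number = () => Date.now(),
  ) {
    this.mint = mint;
    this.refreshSkewMs = refreshSkewMs;
    this.now = now;
  }

  // Returns a token that stays valid for at least refreshSkewMs.
  async token(): Promise<string> {
    const cached = this.current;
    if (cached && cached.expiresAtMs - this.refreshSkewMs > this.now()) {
      return cached.value;
    }
    // Share one mint call between concurrent callers.
    if (!this.inflight) {
      this.inflight = this.mint().finally(() => {
        this.inflight = undefined;
      });
    }
    const fresh = await this.inflight;
    this.current = fresh;
    return fresh.value;
  }
}
```

An HTTP middleware could then await cache.token() when setting the authorization header, so the guest never sees the credential and the host never mints more often than needed.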
defineSandbox(...) creates reusable machine configuration.
const sandbox = defineSandbox({
  rootfs: rootfs.builtIn("alpine:3.23"),
  resources: {
    cpus: 4,
    memoryMiB: 4096,
  },
  network: network.policy((conn) => {
    conn.matchDns()?.accept({ resolvers: ["1.1.1.1"] });
  }),
});
defineSandbox(...) does not boot a VM. It describes rootfs, resource, and
network policy defaults that can be reused across many boots.
rootfs.builtIn("alpine:3.23");
Selects a read-only built-in rootfs artifact. Built-in rootfs artifacts are
prepared at build or install time; Sandbox does not pull container images or
build root filesystems during boot().
rootfs.ephemeral({
  base: rootfs.builtIn("alpine:3.23"),
  maxDirtyBytes: 64 * 1024 * 1024,
});
Mounts a built-in base rootfs through a writable in-process copy-on-write block store. Clean base-image blocks are served from the built-in artifact, dirty blocks are kept by the native runtime, and all rootfs mutations are discarded when the sandbox instance exits. Use this when a command needs to mutate the machine rootfs but those mutations must not persist across boots.
rootfs.cow({
  base: rootfs.builtIn("alpine:3.23"),
  writable: blockStore,
  maxDirtyBytes: 64 * 1024 * 1024,
});
Mounts a built-in base rootfs through a writable copy-on-write block store. For offline image work, describe the same merged view without booting a VM:
const source = rootfs.compose({
  base: rootfs.builtIn("alpine:3.23"),
  overlay: blockStore,
});
const image = await rootfs.flatten({
  format: "qcow2",
  source,
  dest: imageStore,
  clusterSize: 65536,
});
for await (const chunk of rootfs.bytes(image)) {
  await upload(chunk);
}
rootfs.ephemeral(...) and rootfs.cow(...) both present a writable block
rootfs without modifying the built-in artifact. rootfs.cow(...) normalizes
through the same composed source used by
rootfs.flatten(...), so boot and image export share one base-plus-overlay
contract. Clean base-image blocks are served from the built-in artifact. Dirty
blocks are read lazily and flushed through your SandboxBlockStore.
maxDirtyBytes limits how many dirty COW block bytes the native runtime buffers
before forcing a write to the block store during a VM run. For
rootfs.ephemeral(...), the same value is the native in-memory dirty block
quota; guest writes beyond that budget fail instead of growing host memory
without bound. When omitted, Sandbox uses a 64 MiB block-aligned default.
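To get a feel for what a dirty-byte budget means in blocks, the arithmetic can be sketched as follows. This assumes "block-aligned" means a whole number of blocks and that a 4096-byte block size is in use; both are assumptions for illustration, and `alignDirtyBudget` is a hypothetical helper, not a Sandbox function.

```typescript
const MiB = 1024 * 1024;

// Rounds a byte budget down to a whole number of blocks.
// Rounding down is a choice made for this sketch; the runtime's rule is not documented here.
function alignDirtyBudget(
  maxDirtyBytes: number,
  blockSize: number,
): { blocks: number; bytes: number } {
  if (!Number.isInteger(blockSize) || blockSize <= 0) {
    throw new RangeError("blockSize must be a positive integer");
  }
  if (!Number.isInteger(maxDirtyBytes) || maxDirtyBytes < blockSize) {
    throw new RangeError("maxDirtyBytes must hold at least one block");
  }
  const blocks = Math.floor(maxDirtyBytes / blockSize);
  return { blocks, bytes: blocks * blockSize };
}

// With an assumed 4096-byte block, 64 MiB holds 16384 dirty blocks.
const budget = alignDirtyBudget(64 * MiB, 4096);
console.log(budget.blocks, budget.bytes); // 16384 67108864
```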
rootfs.flatten(...) writes a standalone QCOW2 image into dest, using
dest as random-access image-byte storage. rootfs.bytes(...) streams raw image
container bytes for a built-in image or a flattened image; it does not stream the
filesystem contents inside the image.
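Because rootfs.bytes(...) yields chunks of whatever size the stream produces, an uploader that needs fixed-size parts (for example a multipart object-store upload) has to re-batch them. The generator below is a minimal sketch of that step. `fixedSizeParts` is a hypothetical helper, and it assumes only that the source is an async iterable of Uint8Array chunks, as the for await loop above suggests.

```typescript
// Re-batches arbitrary-size chunks into parts of exactly partSize bytes.
// The final part may be shorter.
async function* fixedSizeParts(
  source: AsyncIterable<Uint8Array>,
  partSize: number,
): AsyncGenerator<Uint8Array> {
  if (!Number.isInteger(partSize) || partSize <= 0) {
    throw new RangeError("partSize must be a positive integer");
  }
  let part = new Uint8Array(partSize);
  let filled = 0;
  for await (const chunk of source) {
    let offset = 0;
    while (offset < chunk.length) {
      const take = Math.min(partSize - filled, chunk.length - offset);
      part.set(chunk.subarray(offset, offset + take), filled);
      filled += take;
      offset += take;
      if (filled === partSize) {
        yield part;
        // Allocate a fresh buffer so the consumer may keep the yielded one.
        part = new Uint8Array(partSize);
        filled = 0;
      }
    }
  }
  if (filled > 0) yield part.subarray(0, filled);
}
```

A caller could wrap the stream as fixedSizeParts(rootfs.bytes(image), partSize) and upload each yielded part in order.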
interface SandboxBlockStore {
  readonly blockSize: number;
  list(context: SandboxBlockStoreContext): Promise<readonly bigint[]>;
  read(
    range: SandboxBlockRange,
    context: SandboxBlockStoreContext,
  ): Promise<readonly SandboxBlockChunk[]>;
  write(
    chunks: readonly SandboxBlockChunk[],
    context: SandboxBlockStoreContext,
  ): Promise<void>;
  flush?(context: SandboxBlockStoreContext): Promise<void>;
}
The context.base value identifies the exact built-in base image for the boot,
so storage layers can namespace blocks, reject mismatched snapshots, or migrate
state.
write() receives block bytes owned by the block store. The sandbox runtime
will not mutate those Uint8Array values after passing them to write(), so a
store may retain them for delayed persistence. A store that returns bytes from
read() should treat the returned arrays as immutable after the promise
resolves.
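For tests or short-lived lanes, a store with this method set can be kept entirely in memory. The sketch below mirrors the method names of the interface above, but the shapes of the context, range, and chunk types are not shown in this README, so `BlockStoreContext`, `BlockRange`, and `BlockChunk` here are stand-ins with assumed field names (`base`, `start`, `count`, `index`, `bytes`). Check the package's type declarations before adapting it.

```typescript
// Stand-in shapes. Field names are assumptions made for this sketch.
interface BlockStoreContext { readonly base: string }
interface BlockRange { readonly start: bigint; readonly count: number }
interface BlockChunk { readonly index: bigint; readonly bytes: Uint8Array }

class MemoryBlockStore {
  readonly blockSize: number;
  // base identifier -> block index -> block bytes
  private readonly blocksByBase = new Map<string, Map<bigint, Uint8Array>>();

  constructor(blockSize: number = 4096) {
    this.blockSize = blockSize;
  }

  // Namespaces blocks by base image so mismatched bases never share state.
  private blocksFor(context: BlockStoreContext): Map<bigint, Uint8Array> {
    let blocks = this.blocksByBase.get(context.base);
    if (!blocks) {
      blocks = new Map();
      this.blocksByBase.set(context.base, blocks);
    }
    return blocks;
  }

  // Lists the indexes of stored blocks in ascending order.
  async list(context: BlockStoreContext): Promise<readonly bigint[]> {
    const indexes = [...this.blocksFor(context).keys()];
    return indexes.sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));
  }

  // Returns only the blocks in the range that have been written.
  async read(
    range: BlockRange,
    context: BlockStoreContext,
  ): Promise<readonly BlockChunk[]> {
    const blocks = this.blocksFor(context);
    const found: BlockChunk[] = [];
    for (let i = 0; i < range.count; i++) {
      const index = range.start + BigInt(i);
      const bytes = blocks.get(index);
      if (bytes) found.push({ index, bytes });
    }
    return found;
  }

  async write(
    chunks: readonly BlockChunk[],
    context: BlockStoreContext,
  ): Promise<void> {
    const blocks = this.blocksFor(context);
    for (const chunk of chunks) {
      if (chunk.bytes.length !== this.blockSize) {
        throw new RangeError(`block ${chunk.index} has the wrong size`);
      }
      // Retained without copying: the runtime will not mutate it after write().
      blocks.set(chunk.index, chunk.bytes);
    }
  }
}
```

Retaining the written arrays without a copy relies on the ownership rule stated above, and keying the outer map by context.base is one simple way to namespace blocks per base image.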
sandbox.boot(...) boots a sandbox instance.
await using lane = await sandbox.boot({
  mounts: {
    "/workspace": fs.virtual(workspaceFs),
  },
  cwd: "/workspace",
});
Boot options are per instance. The same sandbox definition can be booted with different mounts and working directories.
const workspaceFs = fs.memory({
  files: {
    "/README.md": "# Task\n",
  },
});
fs.memory(...) creates an in-memory POSIX filesystem.
const mount = fs.virtual(workspaceFs);
fs.virtual(...) adapts a compatible JavaScript filesystem for guest mounts.
Sandbox mounts are host-implemented filesystems, not direct host directory
mounts.
const result = await lane.exec("npm", ["test"], {
  cwd: "/workspace",
  env: { CI: "1" },
  timeoutMs: 120_000,
  signal: abortController.signal,
});
exec(...) is the buffered process API. It returns after the command exits with
exitCode, stdout, and stderr. When timeoutMs expires, Sandbox terminates
the guest process group and returns exit code 124. When signal aborts,
Sandbox terminates that guest process group, rejects the exec(...) promise with
an AbortError, and keeps the sandbox usable for subsequent commands.
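Callers usually want to tell these outcomes apart. The helper below is a minimal sketch that wraps any exec-like call and classifies the result using only the behavior described above: exit code 124 for a timeout and a rejection named AbortError for an abort. `ExecResult`, `ExecOutcome`, and `classifyExec` are hypothetical names, and the result shape is assumed to carry just the three documented fields.

```typescript
// Assumed result shape, mirroring the fields named in the text above.
interface ExecResult {
  exitCode: number;
  stdout: string;
  stderr: string;
}

type ExecOutcome =
  | { kind: "ok"; result: ExecResult }
  | { kind: "failed"; result: ExecResult }
  | { kind: "timed-out"; result: ExecResult }
  | { kind: "aborted" };

async function classifyExec(
  run: () => Promise<ExecResult>,
): Promise<ExecOutcome> {
  try {
    const result = await run();
    if (result.exitCode === 0) return { kind: "ok", result };
    // 124 is the timeout code described above. A command can also exit
    // with 124 on its own, so treat this as a hint rather than proof.
    if (result.exitCode === 124) return { kind: "timed-out", result };
    return { kind: "failed", result };
  } catch (error) {
    const name =
      typeof error === "object" && error !== null
        ? (error as { name?: unknown }).name
        : undefined;
    if (name === "AbortError") return { kind: "aborted" };
    throw error; // Anything else is unexpected; let it propagate.
  }
}
```

A call site could pass () => lane.exec("npm", ["test"], options) as run and switch on the returned kind.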
Use spawn(...) for non-interactive long-lived processes with streaming
stdin, stdout, and stderr. It returns a process handle immediately; ready
tracks guest-side startup.
import { Writable } from "node:stream";
const child = lane.spawn("npm", ["test"], { cwd: "/workspace" });
child.stdout.pipeTo(Writable.toWeb(process.stdout));
child.stderr.pipeTo(Writable.toWeb(process.stderr));
const { exitCode } = await child.exit;
Use pty(...) when the guest process should see a real terminal, such as an
interactive shell, REPL, pager, or terminal UI:
import { Readable, Writable } from "node:stream";
const shell = lane.pty("/bin/sh", ["-i"], {
  cwd: "/workspace",
  env: { TERM: "xterm-256color" },
  size: { rows: 40, cols: 120 },
});
Readable.toWeb(process.stdin).pipeTo(shell.input);
shell.output.pipeTo(Writable.toWeb(process.stdout));

const policy = network.policy(async (conn) => {
  if (conn.matchDns()?.accept({ resolvers: ["1.1.1.1"] })) return;
  const api = conn.matchHttp("api.example.com");
  if (!api) return;
  if (!(await policyManager.allow(api))) return;
  api.accept((request) => policyManager.handleHttp(request));
});
Policy callbacks are default-deny. If the callback does not create a grant, the connection is blocked.
Every policy event exposes IP-layer endpoints:
conn.src.ip;
conn.src.port;
conn.dst.ip;
conn.dst.port;
Endpoint helpers classify address ranges without relying on DNS, TLS, or HTTP:
conn.dst.isLoopback();
conn.dst.isPrivate();
conn.dst.isLinkLocal();
conn.dst.isMulticast();
conn.dst.isBroadcast();
conn.dst.isDocumentation();
conn.dst.isReserved();
conn.dst.isPublicInternet();
Transport and protocol helpers:
conn.accept(); // accept raw TCP or UDP transport
conn.matchDns(); // DNS over UDP or TCP
conn.matchTcp("203.0.113.10:5432");
conn.matchUdp("203.0.113.10:8125");
conn.matchHttp("api.example.com");
conn.matchDns() normalizes DNS over UDP and TCP. The guest is configured to
use Sandbox's internal resolver; policy code decides whether to accept that DNS
flow and can choose explicit upstream resolvers in accept(...). Accepted DNS
answers are cached as trusted, guest-scoped hostname metadata for later
connection policy decisions. Sandbox may retain this attribution metadata for a
bounded window after a short DNS TTL so delayed connections can still be tied
back to the accepted DNS answer that produced their destination IP.
const dns = conn.matchDns();
if (dns) {
  dns.accept({ resolvers: ["1.1.1.1", "8.8.8.8"] });
}
conn.matchHttp(...) acquires an HTTP capability from trusted destination
metadata, including hostnames observed from accepted DNS answers. It does not
inspect or trust the HTTP Host header. IP-addressed HTTP requests can still be
accepted through lower-level policy, but they do not advertise a trusted
hostname. Calling accept() without middleware authorizes the matched flow
without rewriting bytes; pass middleware only when Sandbox should inspect and
mutate HTTP request headers.
const http = conn.matchHttp((candidate) =>
  candidate.hostname.endsWith(".example.com")
);
if (http) {
  http.accept((request) => {
    request.headers.set("x-agent-policy", "allowed");
  });
}
http.accept(...) enters Sandbox's HTTP-family enforcement path. If the matched
flow is not actually HTTP or HTTPS, it fails closed.
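Suffix predicates like the one above are easy to get subtly wrong: dropping the leading dot would also match evilexample.com, and keeping it excludes the bare domain. A small helper can make the intent explicit. `hostnameWithin` is a hypothetical function, and whether Sandbox already lowercases hostnames or strips a trailing dot is not stated here, so the normalization is a defensive assumption.

```typescript
// Lowercases and strips one trailing dot so "API.Example.COM." compares cleanly.
function normalizeHostname(name: string): string {
  return name.toLowerCase().replace(/\.$/, "");
}

// True when hostname is the domain itself or any subdomain of it.
function hostnameWithin(hostname: string, domain: string): boolean {
  const host = normalizeHostname(hostname);
  const root = normalizeHostname(domain);
  if (root === "") return false;
  // Requiring the "." boundary keeps "evilexample.com" out of "example.com".
  return host === root || host.endsWith(`.${root}`);
}
```

Inside a policy callback this could be used as conn.matchHttp((candidate) => hostnameWithin(candidate.hostname, "example.com")). Unlike the endsWith(".example.com") predicate, it also accepts the bare domain, so drop the equality check if only subdomains should match.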
Sandbox hides the kernel, init, transport, and host helper behind a TypeScript API:
- The runtime boots a libkrun-backed microVM from a prebuilt rootfs artifact.
- The built-in rootfs is a compressed QCOW2 image containing an ext4 guest filesystem.
- A signed sandbox-host helper owns the Node/Rust/libkrun boundary.
- Guest control traffic uses an fd-backed transport between the host and the custom Sandbox init process.
- Host-implemented virtual filesystems are mounted into the guest.
- Durable rootfs mutation is modeled as block-level copy-on-write storage, not as a guest-visible POSIX filesystem.
- Network egress is default-deny and policy-controlled.
- HTTP request middleware is caller-provided JavaScript, while Sandbox owns interception, forwarding, and certificate plumbing.
When HTTP interception is enabled, the host generates CA material and passes
only the public CA certificate to Sandbox init. Init installs that CA using the
selected rootfs' native trust-store mechanism only when the rootfs is writable.
Read-only built-in rootfs launches keep the CA available under /run and do not
mutate the trust store, so HTTP interception does not change rootfs write
behavior. If a writable rootfs does not provide a supported trust-store
installer, init fails closed.
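The CA handling rules above amount to a small decision table. The function below restates them as a model for readers, not as init's actual code. `planCaHandling`, `CaInputs`, and the action names are hypothetical, and the "none" result for disabled interception is an assumption, since the text only describes the enabled case.

```typescript
interface CaInputs {
  interceptionEnabled: boolean;
  rootfsWritable: boolean;
  hasSupportedInstaller: boolean;
}

type CaPlan =
  | { action: "none" }
  | { action: "expose-under-run" }
  | { action: "install-into-trust-store" }
  | { action: "fail-closed"; reason: string };

function planCaHandling(inputs: CaInputs): CaPlan {
  // Assumption: with interception off there is no CA to handle.
  if (!inputs.interceptionEnabled) return { action: "none" };
  // Read-only rootfs: keep the CA under /run and leave the trust store alone.
  if (!inputs.rootfsWritable) return { action: "expose-under-run" };
  // Writable rootfs without a supported installer: refuse to continue.
  if (!inputs.hasSupportedInstaller) {
    return {
      action: "fail-closed",
      reason: "writable rootfs has no supported trust-store installer",
    };
  }
  return { action: "install-into-trust-store" };
}
```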
The intended boundary is:
- Sandbox owns launch, isolation, mounts, network interception, and enforcement.
- User-space owns artifact selection, filesystem durability, network policy state, confirmation flows, audit logs, and credential brokering.
See docs/architecture.md for the design background, docs/kernel-build.md for kernel artifact builds, and docs/testing-strategy.md for the integration and e2e testing plan.
The npm package is published as @torkbot/sandbox. It does not use
post-install scripts. The root package contains the TypeScript API and declares
platform artifacts as optional dependencies:
- @torkbot/sandbox-darwin-arm64
- @torkbot/sandbox-linux-x64-gnu
Each platform package contains the sandbox-host helper and built-in rootfs
artifacts for that target. Runtime artifact resolution only loads the installed
optional dependency for the current platform. Local development uses the same
layout by materializing the current platform package under node_modules.
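As an illustration of what per-platform resolution involves, the mapping from a Node.js platform and architecture pair to one of the two package names listed above can be sketched like this. `platformPackageFor` is a hypothetical helper, not the library's resolver, and it does not try to distinguish glibc from musl on Linux even though the package name carries a gnu suffix.

```typescript
// Package names as listed above, keyed by "<platform>-<arch>".
const PLATFORM_PACKAGES: Readonly<Record<string, string>> = {
  "darwin-arm64": "@torkbot/sandbox-darwin-arm64",
  "linux-x64": "@torkbot/sandbox-linux-x64-gnu",
};

// Returns the platform package name or throws for unsupported targets.
function platformPackageFor(platform: string, arch: string): string {
  const name = PLATFORM_PACKAGES[`${platform}-${arch}`];
  if (name === undefined) {
    throw new Error(`no Sandbox platform package for ${platform}-${arch}`);
  }
  return name;
}
```

In Node.js the inputs would typically come from process.platform and process.arch.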
For now, the macOS sandbox-host artifact is not Developer ID signed or
notarized. macOS users must sign the installed helper locally before launching a
VM:
npx @torkbot/sandbox setup-macos
This performs an ad-hoc local codesign with the
com.apple.security.hypervisor entitlement required by Hypervisor.framework. It
does not contact Apple and does not require an Apple Developer account. If a
macOS user tries to launch a VM before running setup, Sandbox throws a runtime
error that points back to this command.
Local release packaging sanity check:
npm run release:pack
After rebuilding local native artifacts, refresh the local optional package layout with:
npm run artifacts:link-current
Repository layout:
- src/: TypeScript API consumed by Node.js callers.
- crates/sandbox: Rust host implementation for libkrun, block storage, network, HTTP, and VFS services.
- crates/sandbox-host: signed VM-host helper used for macOS HVF launch.
- crates/sandbox-init: custom guest init used to configure the guest before supervising untrusted code.
- tests/e2e: TypeScript e2e scenarios run directly by Node.js 24+ type stripping.