Turn a GitHub repo's issue & PR backlog into a queue you burn down — on boxd cloud VMs, with coding agents, in parallel.
Point it at a 200-issue repo on Friday night. Wake up to a wall of green PRs.
Open-source repos and big orgs drown in open issues and PRs. Triaging is expensive, fixing is slow, and most of the work is embarrassingly parallel — but nobody has the compute or the isolation to run a hundred fix attempts at once.
forq does. It installs as a GitHub App, shows you every repo's open issues and PRs as a queue, and lets you fire four actions at any issue/PR — manually or in a batch:
| | Action | What happens |
|---|---|---|
| 🔬 | Preview | Fork a warm environment, sync it to the issue/PR ref, hand back a live *.boxd.sh URL. |
| 🤖 | Implement | Fork, run a coding agent inside it, push a branch and open a PR. |
| 🧹 | Triage | A cheap fast model classifies the issue before you spend on a heavy agent. |
| 🐙 | GitHub | Close / comment / label, straight from the queue. |
The headline mode — speculative search — live-forks one running machine into many candidate fixes at once and lets the test suite collapse the tree onto the green, minimal patch.
forq's whole premise rests on one boxd primitive. From the boxd fork docs:
A fork is an exact copy of an existing machine — same filesystem, same installed packages, same data — and it boots in ~160ms.
But it's not a disk snapshot. boxd fork is a live fork: a sub-second copy of disk + RAM + every running process of a running VM.
GOLDEN (long-running, warm) FORK (born warm, ~113ms)
┌──────────────────────────────────┐ ┌──────────────────────────────────┐
│ your app: installed & RUNNING │ fork │ same app, ALREADY RUNNING │
│ deps on disk, server on :3000 │ ──────▶ │ deps on disk, server on :3000 │
│ test runner primed, in-RAM state │ CoW │ test runner primed, in-RAM state │
└──────────────────────────────────┘ └──────────────────────────────────┘
no rebuild · no reinstall · no cold boot — the process tree is *copied*, not replayed
What makes it the unlock for forq:
- Copy-on-write, worker-local. The fork lands on the same worker as its source and shares an overlay, so the marginal cost of one more candidate environment is tiny. Deep exploration is cheap and local.
- Warm, not cold. Because RAM and processes carry over, a fork doesn't npm install or boot your server: it's already up. A per-issue preview is live in milliseconds.
- Real isolation. Each fork gets its own 100 GB CoW disk, IP, and HTTPS domain. A hundred fix attempts can't step on each other.
- Idle cost ≈ zero. Auto-suspend parks idle goldens and short-lived forks at zero billable CPU/RAM, so a wide search bursts and then collapses without a standing bill.
🔑 One running machine → N running machines in ~113ms each. That is the entire reason speculative search is affordable.
Expand a search node = literally fork the VM.
In the codebase this lives behind BoxdService.fork() / forkFrom() (apps/server/src/boxd/service.ts), which always calls waitUntilReady before returning, because a re-fork off a not-yet-ready box silently cold-boots and drops the in-RAM state that makes forks warm.
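As a minimal sketch of that readiness rule: the `Box` interface, `forkReady`, and `fakeBox` below are hypothetical names made up for illustration, not the actual `@boxd-sh/sdk` types or the real `BoxdService` code. The only point is that the wrapper resolves after the child reports ready, so any later re-fork starts from a warm box.

```typescript
// Hypothetical minimal VM handle; the real SDK types may look different.
interface Box {
  id: string;
  ready: boolean;
  fork(): Promise<Box>;
  waitUntilReady(): Promise<void>;
}

// Fork `source` and resolve only once the child is ready, so callers can
// never re-fork from a box that is still coming up.
async function forkReady(source: Box): Promise<Box> {
  const child = await source.fork();
  await child.waitUntilReady();
  return child;
}

// In-memory stand-in used only to exercise the wrapper; no real VM involved.
function fakeBox(id: string, ready = false): Box {
  const box: Box = {
    id,
    ready,
    async fork() {
      return fakeBox(`${id}.child`);
    },
    async waitUntilReady() {
      box.ready = true;
    },
  };
  return box;
}
```

Callers that always go through a wrapper like `forkReady` never hold a handle to a half-booted fork, which is the property the paragraph above depends on.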
forq is three planes: the control plane we build, the boxd compute plane we drive, and the user's repo plane that makes it self-serve.
flowchart LR
subgraph repo["🗂 REPO PLANE — user's GitHub"]
GH["Issues · PRs · push<br/>+ versioned forq.yml"]
end
subgraph control["🧠 CONTROL PLANE — forq (what we build)"]
direction TB
WEB["Web dashboard<br/>React 19 + Vite + Tailwind v4"]
API["Backend / Orchestrator<br/>Hono + zod-openapi"]
ENG["Runs engine<br/>state machine + SSE"]
TRI["Triage gate<br/>Nebius fast model"]
WEB <-->|"REST + SSE"| API
API --> ENG
ENG --> TRI
end
subgraph compute["⚡ COMPUTE PLANE — boxd (@boxd-sh/sdk)"]
direction TB
GOLD["Golden per repo<br/>(warm, synced to main)"]
F1["fork → preview"]
F2["fork → implement → PR"]
FN["fork ×N → speculative search"]
GOLD -->|"live-fork ~113ms"| F1
GOLD -->|"live-fork ~113ms"| F2
GOLD -->|"live-fork ~113ms"| FN
end
GH -->|"App webhooks"| API
ENG -->|"@boxd-sh/sdk"| GOLD
F2 -->|"open PR"| GH
FN -->|"green patch → PR"| GH
classDef accent fill:#1a1a1a,stroke:#fab283,color:#fab283
classDef dim fill:#1a1a1a,stroke:#444,color:#ddd
class WEB,API,ENG,TRI,GOLD,F1,F2,FN dim
class GH accent
A run walks a single lifecycle, surfaced live in the dashboard over SSE:
queued → triaging → provisioning → implementing → testing → ┬─ pr_opened ✅
├─ needs_human 🙋 (graceful degrade)
├─ aborted
└─ failed
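The lifecycle above can be read as a small state machine. The sketch below takes the status names from the diagram, but the transition table is an assumption: in particular, it allows any in-progress step to end in needs_human, aborted, or failed, which the diagram does not spell out.

```typescript
type RunStatus =
  | "queued" | "triaging" | "provisioning" | "implementing" | "testing"
  | "pr_opened" | "needs_human" | "aborted" | "failed";

const HAPPY_PATH: readonly RunStatus[] = [
  "queued", "triaging", "provisioning", "implementing", "testing",
];
const TERMINAL: readonly RunStatus[] = [
  "pr_opened", "needs_human", "aborted", "failed",
];

function canTransition(from: RunStatus, to: RunStatus): boolean {
  if (TERMINAL.includes(from)) return false;           // terminal states never move
  if (to === "pr_opened") return from === "testing";   // a PR only opens after tests
  if (TERMINAL.includes(to)) return true;              // assumed: any active step may stop early
  // Otherwise only the next step on the happy path is allowed.
  return HAPPY_PATH.indexOf(to) === HAPPY_PATH.indexOf(from) + 1;
}
```

A guard like this is one way to keep SSE status events monotonic: the engine would reject any transition for which `canTransition` returns false.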
The default strategy (single-shot) runs one agent pass and tests it. The speculative strategy turns "implement this issue" into a best-first tree search over live-forked machine states, with tests as the fitness function. In MCTS terms, expand = fork a VM and simulate = run the suite.
flowchart TD
G["🌱 genesis<br/>warm preview fork<br/>(deps up, server up, oracle test red)"]
G --> H1["guard the nil deref"]
G --> H2["normalize input first"]
G --> H3["await the async path"]
G --> H4["off-by-one in slice"]
H1 -.prune.-> X1["❌ red"]
H2 --> R2["♻ re-diagnose<br/>'real cause is upstream'"]
H3 -.prune.-> X3["❌ red"]
H4 -.prune.-> X4["❌ red"]
R2 --> W["✅ green<br/>target pass · 0 regressions<br/>best diff: 3 lines"]
W --> PR["📬 winning fork's disk diff → PR<br/>(VM stays live → open its preview URL)"]
classDef green fill:#0f2a14,stroke:#3fb950,color:#3fb950
classDef red fill:#2a0f0f,stroke:#f85149,color:#f85149
classDef amber fill:#2a220f,stroke:#fab283,color:#fab283
class G,R2 amber
class X1,X3,X4 red
class W,PR green
- Oracle. If the issue ships a repro, use it. Otherwise the agent writes a failing test that captures the bug (red), and that test ships in the PR as proof. For PRs, the PR's own CI is the oracle.
- Genesis. The warm per-issue fork is the search root — everything is already installed and running.
- Branch. A Brancher LLM proposes K diverse hypotheses (distinct fixes, not K rewordings). The orchestrator forks the genesis K ways: ~113ms each, CoW, same worker.
- Evaluate. Each child applies its candidate, runs the target test then the full suite, and reports { target_pass, suite, diff }.
- Search. The Conductor scores children, prunes red subtrees, and expands the frontier. Red branches can re-diagnose and re-branch, so it's a real tree, not a one-shot fan-out. Unlimited compute = expand everything; finite = best-first focuses it.
- Collapse. First fully-green node wins (or keep hunting for a smaller-diff green within budget). The winning fork's disk diff becomes the PR; its VM stays live so you can open the preview URL and watch the fix run.
- Visualize. A live tree: nodes are VMs colored by status, edges labeled with the hypothesis, header counter: 1 bug · 312 universes · 14 green · best 3 lines · 41s. The explode-then-collapse is the demo.
Today this is a faithful, watchable stub that drives the visualization (apps/server/src/runs/strategies/speculative.ts); the real fork-search seam is marked TODO(golden). The whole stack runs end-to-end with zero credentials in mock mode.
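The Branch / Evaluate / Search / Collapse steps can be sketched as a plain best-first loop. This is an illustration under stated assumptions, not the contents of speculative.ts: `speculativeSearch`, the scoring weights, and the toy evaluation numbers are all invented here, and `branch` / `evaluate` are simple callbacks standing in for the Brancher LLM and for "fork a VM, apply the patch, run the tests".

```typescript
// Mirrors the { target_pass, suite, diff } report described above.
interface Evaluation {
  targetPass: boolean;   // did the oracle test go green?
  suiteFailures: number; // regressions in the full suite
  diffLines: number;     // size of the candidate patch
}

interface SearchNode {
  hypothesis: string;
  depth: number;
  evaluation: Evaluation;
}

const isGreen = (e: Evaluation): boolean => e.targetPass && e.suiteFailures === 0;

// Lower is better. The weights are arbitrary illustration values.
const score = (e: Evaluation): number =>
  (e.targetPass ? 0 : 1000) + e.suiteFailures * 10 + e.diffLines;

function speculativeSearch(
  root: string,
  branch: (parent: SearchNode) => string[],     // stand-in for the Brancher
  evaluate: (hypothesis: string) => Evaluation, // stand-in for fork + apply + test
  budget: number,                               // max evaluations, i.e. forks
): SearchNode | undefined {
  const frontier: SearchNode[] = [{ hypothesis: root, depth: 0, evaluation: evaluate(root) }];
  let used = 1;
  while (frontier.length > 0) {
    // Best-first: always expand the most promising node next.
    frontier.sort((a, b) => score(a.evaluation) - score(b.evaluation));
    const best = frontier.shift()!;
    if (isGreen(best.evaluation)) return best; // collapse onto the first green node
    for (const hypothesis of branch(best)) {
      if (used >= budget) break;
      used += 1;
      frontier.push({ hypothesis, depth: best.depth + 1, evaluation: evaluate(hypothesis) });
    }
  }
  return undefined; // nothing green within the budget
}

// Toy tree shaped like the diagram above; all numbers are made up.
const children: Record<string, string[]> = {
  genesis: ["guard the nil deref", "normalize input first", "await the async path", "off-by-one in slice"],
  "normalize input first": ["fix the upstream cause"],
};
const results: Record<string, Evaluation> = {
  genesis: { targetPass: false, suiteFailures: 0, diffLines: 0 },
  "guard the nil deref": { targetPass: false, suiteFailures: 0, diffLines: 5 },
  "normalize input first": { targetPass: false, suiteFailures: 0, diffLines: 4 },
  "await the async path": { targetPass: false, suiteFailures: 2, diffLines: 6 },
  "off-by-one in slice": { targetPass: false, suiteFailures: 1, diffLines: 2 },
  "fix the upstream cause": { targetPass: true, suiteFailures: 0, diffLines: 3 },
};
const fallbackRed: Evaluation = { targetPass: false, suiteFailures: 1, diffLines: 9 };

const winner = speculativeSearch(
  "genesis",
  (node) => children[node.hypothesis] ?? [],
  (hypothesis) => results[hypothesis] ?? fallbackRed,
  10,
);
console.log(winner?.hypothesis, winner?.evaluation.diffLines); // fix the upstream cause 3
```

With a budget of 10 the loop evaluates six nodes and returns the re-diagnosed child at depth 2. With a budget of 3 it runs out of forks before reaching that child and returns `undefined`, which is where a real run would presumably degrade to needs_human.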
pnpm-workspaces monorepo · pnpm 10 · Node ≥24 · TypeScript 5 · ESM everywhere.
packages/shared @forq/shared domain types + zod schemas + API contracts + RunStatus + ForqConfig
└─ single source of truth: import shapes, never redefine them
packages/api-client @forq/api-client typed client (one method/endpoint) + SSE subscribe() + mock mode
apps/server @forq/server Hono + zod-openapi + boxd SDK + octokit + Nebius triage (REST/JSON + SSE)
apps/web @forq/web React 19 + Vite + Tailwind v4 SPA (dark-default, monospace-forward)
apps/cli @forq/cli commander tree mirroring the API
Two docs are authoritative: PLAN.md (product design, the four actions, speculative strategy, milestones) and CONTRACTS.md (the coordination contract — directory ownership, endpoint table, service interfaces, design tokens).
Every secret is optional at boot. env.ts derives a capability set from which credentials are present, and each service factory picks live-vs-mock off those flags — so the orchestrator and every route run end-to-end with an empty .env:
- no BOXD_API_KEY → boxd mock (fabricated VM handles, status().mock === true)
- no GITHUB_APP_* → GitHub stubbed (mock repos/issues, config resolves to defaults)
- no NEBIUS_API_KEY → triage returns a deterministic act:true so the pipeline still flows
The web app is the same: VITE_FORQ_MOCK unset/true renders the entire UI with no server, animating SSE from shared-shaped fixtures.
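One way to picture the capability derivation is the sketch below. It is not the real env.ts: the `Capabilities` shape, `deriveCapabilities`, `makeTriage`, and the rule that any non-empty GITHUB_APP_* variable counts as "present" are assumptions for illustration. Only the environment variable names and the deterministic act:true fallback come from the text above.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical flag names; the real env.ts may group capabilities differently.
interface Capabilities {
  boxd: boolean;
  github: boolean;
  triage: boolean;
}

function deriveCapabilities(env: Env): Capabilities {
  // Treat unset and whitespace-only values the same way: not configured.
  const has = (key: string): boolean => (env[key] ?? "").trim() !== "";
  return {
    boxd: has("BOXD_API_KEY"),
    // Assumption: any non-empty GITHUB_APP_* variable enables the live client.
    github: Object.keys(env).some((key) => key.startsWith("GITHUB_APP_") && has(key)),
    triage: has("NEBIUS_API_KEY"),
  };
}

interface TriageService {
  classify(issueTitle: string): { act: boolean; mock: boolean };
}

// The factory picks live vs. mock off the capability flag alone.
function makeTriage(caps: Capabilities, live: TriageService): TriageService {
  if (caps.triage) return live;
  // Mock path: always act, so the rest of the pipeline still flows.
  return { classify: () => ({ act: true, mock: true }) };
}
```

In a Node process this would be called once at boot as `deriveCapabilities(process.env)`, and each service factory would receive the resulting flags.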
pnpm install
cp .env.example .env # empty .env is fine — everything degrades to mock
pnpm dev # server (:8787) + web (:5173) concurrently
pnpm dev:server # backend only (tsx watch)
pnpm dev:web # dashboard only (mock data until VITE_FORQ_MOCK=false)
pnpm typecheck # all workspaces, strict
pnpm build # ORDER MATTERS: shared → api-client → (server, web, cli)

The web app defaults to mock mode: open http://localhost:5173 and the full dashboard renders with no backend. Set VITE_FORQ_MOCK=false to hit the server (Vite proxies /api + /openapi.json → :8787).
Drive it from the CLI, also against mock data:
node apps/cli/dist/index.js --mock --json repos list
node apps/cli/dist/index.js --mock runs tree run_demo1
⚠️ No test harness is wired yet: pnpm typecheck + pnpm build are the verification gate.
REST/JSON under /api/v1, OpenAPI at /openapi.json, live run telemetry via SSE at /runs/{id}/events. Every endpoint is typed in @forq/shared; the error envelope is exactly { error: { code, message, details? } } everywhere. SSE events are a RunEvent discriminated union (status | log | tree | result | error | heartbeat) with monotonic seq per run, so clients resume with Last-Event-ID. Full endpoint table in CONTRACTS.md.
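To make the resume behaviour concrete, here is a hedged sketch of the event union and a replay helper. The event kinds and the error envelope shape come from the paragraph above; the payload fields (`status`, `line`, `nodes`, `prUrl`) and the helpers `eventsAfter` and `toSseFrame` are assumptions, not the types exported by @forq/shared.

```typescript
// Event kinds are from the text above; payload fields are illustrative guesses.
type RunEvent =
  | { type: "status"; seq: number; status: string }
  | { type: "log"; seq: number; line: string }
  | { type: "tree"; seq: number; nodes: number }
  | { type: "result"; seq: number; prUrl?: string }
  | { type: "error"; seq: number; error: { code: string; message: string; details?: unknown } }
  | { type: "heartbeat"; seq: number };

// Server side: replay only the events the client has not seen yet.
function eventsAfter(log: RunEvent[], lastEventId: string | undefined): RunEvent[] {
  if (lastEventId === undefined) return log;
  const last = Number.parseInt(lastEventId, 10);
  if (Number.isNaN(last)) return log; // assumed policy: bad header means full replay
  return log.filter((event) => event.seq > last);
}

// One SSE frame. Writing seq into `id:` is what lets a reconnecting
// EventSource send it back in the Last-Event-ID header.
function toSseFrame(event: RunEvent): string {
  return `id: ${event.seq}\nevent: ${event.type}\ndata: ${JSON.stringify(event)}\n\n`;
}
```

Because `seq` is monotonic per run, filtering on `seq > last` is enough to resume without gaps or duplicates, as long as the server keeps the run's event log around.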
| Layer | Choice |
|---|---|
| Web | React 19 + Vite SPA, Tailwind v4, dark-default, monospace-forward, one accent color, hairline borders. Statically served, no SSR. |
| Backend | Standalone long-running TypeScript service (Hono + zod-openapi) — webhooks + background jobs + SSE don't fit serverless. |
| Compute | boxd via @boxd-sh/sdk — create, fork, exec, suspend, destroy, createProxy, token.create, waitUntilReady. |
| Inference | Anthropic Claude Code in-fork (implement) + Nebius fast model (triage), keyed per funding rules. |
| Config | forq.yml in the user's repo — zod-validated golden/test/triage/funding semantics. |
forq · fork the queue · built on boxd live-fork · live at forq.boxd.sh
