MichielMAnalytics/forq

🍴 forq

Fork the queue.

Turn a GitHub repo's issue & PR backlog into a queue you burn down — on boxd cloud VMs, with coding agents, in parallel.

Point it at a 200-issue repo on Friday night. Wake up to a wall of green PRs.

▶︎ Watch the 1-minute demo


The pitch (30 seconds)

Open-source repos and big orgs drown in open issues and PRs. Triaging is expensive, fixing is slow, and most of the work is embarrassingly parallel — but nobody has the compute or the isolation to run a hundred fix attempts at once.

forq does. It installs as a GitHub App, shows you every repo's open issues and PRs as a queue, and lets you fire four actions at any issue/PR — manually or in a batch:

| Action | What happens |
| --- | --- |
| 🔬 Preview | Fork a warm environment, sync it to the issue/PR ref, hand back a live *.boxd.sh URL. |
| 🤖 Implement | Fork, run a coding agent inside it, push a branch and open a PR. |
| 🧹 Triage | A cheap fast model classifies the issue before you spend on a heavy agent. |
| 🐙 GitHub | Close / comment / label, straight from the queue. |

The headline mode — speculative search — live-forks one running machine into many candidate fixes at once and lets the test suite collapse the tree onto the green, minimal patch.


Why this is even possible: the boxd fork principle

forq's whole premise rests on one boxd primitive. From the boxd fork docs:

A fork is an exact copy of an existing machine — same filesystem, same installed packages, same data — and it boots in ~160ms.

But it's not a disk snapshot. boxd fork is a live fork: a sub-second copy of disk + RAM + every running process of a running VM.

        GOLDEN  (long-running, warm)                 FORK  (born warm, ~113ms)
   ┌──────────────────────────────────┐         ┌──────────────────────────────────┐
   │  your app: installed & RUNNING    │  fork   │  same app, ALREADY RUNNING        │
   │  deps on disk, server on :3000    │ ──────▶ │  deps on disk, server on :3000    │
   │  test runner primed, in-RAM state │  CoW    │  test runner primed, in-RAM state │
   └──────────────────────────────────┘         └──────────────────────────────────┘
       no rebuild · no reinstall · no cold boot — the process tree is *copied*, not replayed

What makes it the unlock for forq:

  • Copy-on-write, worker-local. The fork lands on the same worker as its source and shares an overlay, so the marginal cost of one more candidate environment is tiny. Deep exploration is cheap and local.
  • Warm, not cold. Because RAM and processes carry over, a fork doesn't npm install or boot your server — it's already up. A per-issue preview is live in milliseconds.
  • Real isolation. Each fork gets its own 100 GB CoW disk, IP, and HTTPS domain. A hundred fix attempts can't step on each other.
  • Idle cost ≈ zero. Auto-suspend parks idle goldens and short-lived forks at zero billable CPU/RAM, so a wide search bursts and then collapses without a standing bill.

🔑 One running machine → N running machines in ~113ms each. That is the entire reason speculative search is affordable: expanding a search node literally means forking the VM.

In the codebase this lives behind BoxdService.fork() / forkFrom() (apps/server/src/boxd/service.ts), which always calls waitUntilReady before returning, because a re-fork off a not-yet-ready box silently cold-boots and drops the in-RAM state that makes forks warm.
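
The fork-then-wait rule can be sketched roughly as below. This is a minimal illustration, not the code in service.ts: `Box`, `BoxDriver`, `forkWarm`, and `FakeDriver` are hypothetical names, and the real @boxd-sh/sdk signatures may differ. The fake also throws on a not-ready source to make the failure visible, whereas the real behavior described above is a silent cold boot.

```typescript
// Hypothetical minimal shape of a box handle; the real SDK type may differ.
interface Box {
  id: string;
  ready: boolean;
}

// Hypothetical driver interface standing in for the SDK client.
interface BoxDriver {
  fork(sourceId: string): Promise<Box>;
  waitUntilReady(boxId: string): Promise<Box>;
}

// Fork a source box and resolve only once the child reports ready,
// so callers can safely re-fork from the returned handle.
async function forkWarm(driver: BoxDriver, sourceId: string): Promise<Box> {
  const child = await driver.fork(sourceId);
  return driver.waitUntilReady(child.id);
}

// In-memory fake used to exercise the sketch without credentials.
class FakeDriver implements BoxDriver {
  private boxes = new Map<string, Box>();
  private next = 1;

  constructor(goldenId: string) {
    this.boxes.set(goldenId, { id: goldenId, ready: true });
  }

  async fork(sourceId: string): Promise<Box> {
    const source = this.boxes.get(sourceId);
    if (!source) throw new Error(`unknown box: ${sourceId}`);
    // Assumption for the fake only: surface the not-ready case loudly.
    if (!source.ready) throw new Error(`source not ready: ${sourceId}`);
    const child: Box = { id: `${sourceId}-f${this.next++}`, ready: false };
    this.boxes.set(child.id, child);
    return child;
  }

  async waitUntilReady(boxId: string): Promise<Box> {
    const box = this.boxes.get(boxId);
    if (!box) throw new Error(`unknown box: ${boxId}`);
    box.ready = true;
    return box;
  }
}
```

Because `forkWarm` never hands back a handle that is still booting, a chain such as golden → fork → fork-of-fork stays warm at every level.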


Architecture

forq is three planes: the control plane we build, the boxd compute plane we drive, and the user's repo plane that makes it self-serve.

flowchart LR
    subgraph repo["🗂  REPO PLANE — user's GitHub"]
        GH["Issues · PRs · push<br/>+ versioned forq.yml"]
    end

    subgraph control["🧠  CONTROL PLANE — forq (what we build)"]
        direction TB
        WEB["Web dashboard<br/>React 19 + Vite + Tailwind v4"]
        API["Backend / Orchestrator<br/>Hono + zod-openapi"]
        ENG["Runs engine<br/>state machine + SSE"]
        TRI["Triage gate<br/>Nebius fast model"]
        WEB <-->|"REST + SSE"| API
        API --> ENG
        ENG --> TRI
    end

    subgraph compute["⚡  COMPUTE PLANE — boxd (@boxd-sh/sdk)"]
        direction TB
        GOLD["Golden per repo<br/>(warm, synced to main)"]
        F1["fork → preview"]
        F2["fork → implement → PR"]
        FN["fork ×N → speculative search"]
        GOLD -->|"live-fork ~113ms"| F1
        GOLD -->|"live-fork ~113ms"| F2
        GOLD -->|"live-fork ~113ms"| FN
    end

    GH -->|"App webhooks"| API
    ENG -->|"@boxd-sh/sdk"| GOLD
    F2 -->|"open PR"| GH
    FN -->|"green patch → PR"| GH

    classDef accent fill:#1a1a1a,stroke:#fab283,color:#fab283
    classDef dim fill:#1a1a1a,stroke:#444,color:#ddd
    class WEB,API,ENG,TRI,GOLD,F1,F2,FN dim
    class GH accent

A run walks a single lifecycle, surfaced live in the dashboard over SSE:

queued → triaging → provisioning → implementing → testing → ┬─ pr_opened   ✅
                                                            ├─ needs_human  🙋  (graceful degrade)
                                                            ├─ aborted
                                                            └─ failed
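
One way to encode this lifecycle is a status union plus a transition guard. The sketch below is illustrative only: the real RunStatus lives in @forq/shared and is not shown here, `canTransition` is a hypothetical helper, and letting needs_human, aborted, and failed be reached from any active state (not just testing) is an assumption.

```typescript
type RunStatus =
  | "queued"
  | "triaging"
  | "provisioning"
  | "implementing"
  | "testing"
  | "pr_opened"
  | "needs_human"
  | "aborted"
  | "failed";

// Active states in the order a successful run visits them.
const HAPPY_PATH: readonly RunStatus[] = [
  "queued",
  "triaging",
  "provisioning",
  "implementing",
  "testing",
];

// Terminal states: once reached, the run never moves again.
const TERMINAL: ReadonlySet<RunStatus> = new Set<RunStatus>([
  "pr_opened",
  "needs_human",
  "aborted",
  "failed",
]);

function canTransition(from: RunStatus, to: RunStatus): boolean {
  if (TERMINAL.has(from)) return false;
  // Advancing exactly one step along the happy path is always allowed.
  const i = HAPPY_PATH.indexOf(from);
  if (HAPPY_PATH[i + 1] === to) return true;
  // A PR can only be opened after tests have run.
  if (to === "pr_opened") return from === "testing";
  // Assumption: any active state may end in needs_human, aborted, or failed.
  return TERMINAL.has(to);
}
```

A guard like this keeps the SSE status stream honest: the engine can reject an out-of-order update before it is ever emitted to the dashboard.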

⭐ Speculative search — the showpiece

The default strategy (single-shot) runs one agent pass and tests it. The speculative strategy turns "implement this issue" into a best-first tree search over live-forked machine states, with tests as the fitness function. Think of it as MCTS-style search in which expanding a node means forking a VM and simulating means running the suite.

flowchart TD
    G["🌱 genesis<br/>warm preview fork<br/>(deps up, server up, oracle test red)"]
    G --> H1["guard the nil deref"]
    G --> H2["normalize input first"]
    G --> H3["await the async path"]
    G --> H4["off-by-one in slice"]

    H1 -.prune.-> X1["❌ red"]
    H2 --> R2["♻ re-diagnose<br/>'real cause is upstream'"]
    H3 -.prune.-> X3["❌ red"]
    H4 -.prune.-> X4["❌ red"]

    R2 --> W["✅ green<br/>target pass · 0 regressions<br/>best diff: 3 lines"]
    W --> PR["📬 winning fork's disk diff → PR<br/>(VM stays live → open its preview URL)"]

    classDef green fill:#0f2a14,stroke:#3fb950,color:#3fb950
    classDef red fill:#2a0f0f,stroke:#f85149,color:#f85149
    classDef amber fill:#2a220f,stroke:#fab283,color:#fab283
    class G,R2 amber
    class X1,X3,X4 red
    class W,PR green
  1. Oracle. If the issue ships a repro, use it. Otherwise the agent writes a failing test that captures the bug (red), and that test ships in the PR as proof. For PRs, the PR's own CI is the oracle.
  2. Genesis. The warm per-issue fork is the search root — everything is already installed and running.
  3. Branch. A Brancher LLM proposes K diverse hypotheses (distinct fixes, not K rewordings). The orchestrator forks the genesis K ways — ~113ms each, CoW, same worker.
  4. Evaluate. Each child applies its candidate, runs the target test then the full suite, reports { target_pass, suite, diff }.
  5. Search. The Conductor scores children, prunes red subtrees, and expands the frontier. Red branches can re-diagnose and re-branch, so it's a real tree, not a one-shot fan-out. With unlimited compute it would expand everything; with a finite budget, best-first ordering focuses the search.
  6. Collapse. First fully-green node wins (or keep hunting for a smaller-diff green within budget). The winning fork's disk diff becomes the PR; its VM stays live so you can open the preview URL and watch the fix run.
  7. Visualize. A live tree — nodes are VMs colored by status, edges labeled with the hypothesis, header counter: 1 bug · 312 universes · 14 green · best 3 lines · 41s. The explode-then-collapse is the demo.
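
The branch, evaluate, search, and collapse steps above can be sketched as a small best-first loop. This is a toy under stated assumptions, not the contents of speculative.ts: `propose` stands in for the Brancher LLM, `evaluate` stands in for "fork the parent VM, apply the candidate, run the tests", and the scoring weights, the fork budget, and the depth limit are all invented for illustration.

```typescript
// Mirrors the { target_pass, suite, diff } report from step 4.
// Treating `suite` as green/red and `diff` as a changed-line count is an assumption.
interface Report {
  target_pass: boolean;
  suite: "green" | "red";
  diff: number;
}

interface Candidate {
  id: string;
  hypothesis: string;
  depth: number;
  report: Report;
}

// Hypothetical seams for the Brancher and the fork-and-test step.
interface SearchDeps {
  propose(parent: Candidate, k: number): string[];
  evaluate(parent: Candidate, hypothesis: string): Report;
}

const isGreen = (r: Report): boolean => r.target_pass && r.suite === "green";

// Illustrative score: target test first, then suite health, then smaller diffs.
const score = (r: Report): number =>
  (r.target_pass ? 2 : 0) + (r.suite === "green" ? 1 : 0) - r.diff / 1000;

function speculativeSearch(
  root: Candidate,
  deps: SearchDeps,
  k: number,
  maxForks: number,
  maxDepth: number,
): Candidate | null {
  const frontier: Candidate[] = [root];
  let forks = 0;
  while (frontier.length > 0 && forks < maxForks) {
    // Best-first: always expand the most promising node seen so far.
    frontier.sort((a, b) => score(b.report) - score(a.report));
    const parent = frontier.shift();
    if (parent === undefined || parent.depth >= maxDepth) continue; // drop deep branches
    for (const hypothesis of deps.propose(parent, k).slice(0, k)) {
      if (forks >= maxForks) break;
      forks += 1; // one fork per evaluated hypothesis
      const child: Candidate = {
        id: `${parent.id}.${forks}`,
        hypothesis,
        depth: parent.depth + 1,
        report: deps.evaluate(parent, hypothesis),
      };
      if (isGreen(child.report)) return child; // collapse on the first fully green node
      frontier.push(child); // red nodes stay eligible for re-diagnosis
    }
  }
  return null; // budget exhausted without a green node
}

// Deterministic fakes that replay the shape of the diagram above.
const genesis: Candidate = {
  id: "genesis",
  hypothesis: "",
  depth: 0,
  report: { target_pass: false, suite: "green", diff: 0 },
};

const fakeDeps: SearchDeps = {
  propose: (parent) =>
    parent.depth === 0
      ? ["guard the nil deref", "normalize input first", "await the async path", "off-by-one in slice"]
      : parent.hypothesis === "normalize input first"
        ? ["fix the upstream cause"]
        : [],
  evaluate: (_parent, hypothesis) =>
    hypothesis === "fix the upstream cause"
      ? { target_pass: true, suite: "green", diff: 3 }
      : hypothesis === "normalize input first"
        ? { target_pass: false, suite: "green", diff: 5 }
        : { target_pass: false, suite: "red", diff: 8 },
};

const winner = speculativeSearch(genesis, fakeDeps, 4, 20, 3);
```

In this run the four first-level hypotheses all fail the target test, the least-broken one is expanded next, and its re-diagnosis produces the green node. Returning on the first green node corresponds to "first fully-green node wins"; hunting for a smaller diff within budget would mean continuing the loop and keeping the best green seen.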

Today this is a faithful, watchable stub that drives the visualization (apps/server/src/runs/strategies/speculative.ts); the real fork-search seam is marked TODO(golden). The whole stack runs end-to-end with zero credentials in mock mode.


Repository layout

pnpm-workspaces monorepo · pnpm 10 · Node ≥24 · TypeScript 5 · ESM everywhere.

packages/shared       @forq/shared      domain types + zod schemas + API contracts + RunStatus + ForqConfig
                                         └─ single source of truth: import shapes, never redefine them
packages/api-client   @forq/api-client  typed client (one method/endpoint) + SSE subscribe() + mock mode
apps/server           @forq/server      Hono + zod-openapi + boxd SDK + octokit + Nebius triage  (REST/JSON + SSE)
apps/web              @forq/web         React 19 + Vite + Tailwind v4 SPA  (dark-default, monospace-forward)
apps/cli              @forq/cli         commander tree mirroring the API

Two docs are authoritative: PLAN.md (product design, the four actions, speculative strategy, milestones) and CONTRACTS.md (the coordination contract — directory ownership, endpoint table, service interfaces, design tokens).

The pattern that makes it hack-friendly: capability-gated mock/live

Every secret is optional at boot. env.ts derives a capability set from which credentials are present, and each service factory picks live-vs-mock off those flags — so the orchestrator and every route run end-to-end with an empty .env:

  • no BOXD_API_KEY → boxd mock (fabricated VM handles, status().mock === true)
  • no GITHUB_APP_* → GitHub stubbed (mock repos/issues, config resolves to defaults)
  • no NEBIUS_API_KEY → triage returns a deterministic act:true so the pipeline still flows

The web app is the same: VITE_FORQ_MOCK unset/true renders the entire UI with no server, animating SSE from shared-shaped fixtures.
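
The capability-gating pattern can be sketched as follows. It is a simplified illustration rather than the actual env.ts: `Capabilities`, `deriveCapabilities`, `TriageService`, and `makeTriage` are hypothetical names, and treating "any non-empty GITHUB_APP_* variable" as enabling GitHub is a simplifying assumption, since the exact variable names are not listed here.

```typescript
interface Capabilities {
  boxd: boolean;
  github: boolean;
  triage: boolean;
}

type Env = Record<string, string | undefined>;

const present = (value: string | undefined): boolean =>
  value !== undefined && value.trim() !== "";

// Derive capability flags purely from which credentials are set.
function deriveCapabilities(env: Env): Capabilities {
  return {
    boxd: present(env["BOXD_API_KEY"]),
    // Assumption: any non-empty GITHUB_APP_* variable enables the live path.
    github: Object.keys(env).some(
      (key) => key.startsWith("GITHUB_APP_") && present(env[key]),
    ),
    triage: present(env["NEBIUS_API_KEY"]),
  };
}

// Hypothetical service shape; only the mock branch is implemented here.
interface TriageService {
  mock: boolean;
  classify(issueTitle: string): { act: boolean };
}

function makeTriage(caps: Capabilities): TriageService {
  if (!caps.triage) {
    // Mock path: deterministic act:true so the pipeline still flows.
    return { mock: true, classify: () => ({ act: true }) };
  }
  return {
    mock: false,
    classify: () => {
      throw new Error("live triage is not wired in this sketch");
    },
  };
}
```

Keeping the live-vs-mock decision inside each factory means routes and the orchestrator never branch on credentials themselves; they just call the service they were handed.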


Quickstart

pnpm install
cp .env.example .env          # empty .env is fine — everything degrades to mock

pnpm dev                      # server (:8787) + web (:5173) concurrently
pnpm dev:server               # backend only (tsx watch)
pnpm dev:web                  # dashboard only (mock data until VITE_FORQ_MOCK=false)

pnpm typecheck                # all workspaces, strict
pnpm build                    # ORDER MATTERS: shared → api-client → (server, web, cli)

The web app defaults to mock mode: open http://localhost:5173 and the full dashboard renders with no backend. Set VITE_FORQ_MOCK=false to hit the server (Vite proxies /api and /openapi.json to :8787).

Drive it from the CLI, also against mock data:

node apps/cli/dist/index.js --mock --json repos list
node apps/cli/dist/index.js --mock runs tree run_demo1

⚠️ No test harness is wired yet — pnpm typecheck + pnpm build are the verification gate.


API

REST/JSON under /api/v1, OpenAPI at /openapi.json, live run telemetry via SSE at /runs/{id}/events. Every endpoint is typed in @forq/shared; the error envelope is exactly { error: { code, message, details? } } everywhere. SSE events are a RunEvent discriminated union (status | log | tree | result | error | heartbeat) with monotonic seq per run, so clients resume with Last-Event-ID. Full endpoint table in CONTRACTS.md.
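
To make the resume behavior concrete, here is a rough sketch of a RunEvent union, SSE framing, and replay from Last-Event-ID. The authoritative types are in @forq/shared and are not reproduced here: the `type` discriminant name, every payload field other than `seq`, and the helpers `toSseFrame` and `replayAfter` are assumptions for illustration.

```typescript
// Assumed shape: a `type` discriminant plus a per-run monotonic `seq`.
type RunEvent =
  | { type: "status"; seq: number; status: string }
  | { type: "log"; seq: number; line: string }
  | { type: "tree"; seq: number; nodeCount: number }
  | { type: "result"; seq: number; prUrl?: string }
  | {
      type: "error";
      seq: number;
      error: { code: string; message: string; details?: unknown };
    }
  | { type: "heartbeat"; seq: number };

// Serialize one event as an SSE frame; `id` carries seq so the browser
// sends it back as Last-Event-ID when it reconnects.
function toSseFrame(event: RunEvent): string {
  return `id: ${event.seq}\nevent: ${event.type}\ndata: ${JSON.stringify(event)}\n\n`;
}

// Return only the events a reconnecting client has not seen yet.
function replayAfter(
  history: readonly RunEvent[],
  lastEventId: string | null,
): RunEvent[] {
  const parsed = lastEventId === null ? -1 : parseInt(lastEventId, 10);
  const cursor = Number.isNaN(parsed) ? -1 : parsed;
  return history.filter((event) => event.seq > cursor);
}

const history: RunEvent[] = [
  { type: "status", seq: 1, status: "queued" },
  { type: "log", seq: 2, line: "forking golden" },
  { type: "status", seq: 3, status: "testing" },
  { type: "heartbeat", seq: 4 },
];

// A client that last saw seq 2 should receive seq 3 and 4 on reconnect.
const missed = replayAfter(history, "2");
```

Because seq is monotonic per run, the server only needs a single comparison to decide what to replay, and a missing or malformed header simply falls back to sending the full history.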


Stack

| Layer | Choice |
| --- | --- |
| Web | React 19 + Vite SPA, Tailwind v4, dark-default, monospace-forward, one accent color, hairline borders. Statically served, no SSR. |
| Backend | Standalone long-running TypeScript service (Hono + zod-openapi); webhooks + background jobs + SSE don't fit serverless. |
| Compute | boxd via @boxd-sh/sdk: create, fork, exec, suspend, destroy, createProxy, token.create, waitUntilReady. |
| Inference | Anthropic Claude Code in-fork (implement) + Nebius fast model (triage), keyed per funding rules. |
| Config | forq.yml in the user's repo: zod-validated golden/test/triage/funding semantics. |

forq · fork the queue · built on boxd live-fork · live at forq.boxd.sh
