aharness

For long-running agentic workflows where "please follow this process" is not enough: define the path once, start the run, and let enforced gates keep the agent on track.

Hypothesis

Agents are now capable enough for long, multi-step work, but the main failure mode has shifted from task ability to process drift: skipping approval, forgetting recovery rules, claiming evidence that was not produced, or continuing from stale context.

Prompts and skills can describe the process, but they cannot enforce it. aharness turns the process into a runtime: states define what Codex may do next, typed submissions prove what happened, and transitions only occur through validated exits.
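
As a rough illustration of "transitions only occur through validated exits", the sketch below gates one state change behind a runtime check of the submitted payload. It is not the @aharness/core implementation: `Run`, `PlanSubmission`, `isPlanSubmission`, and `submitPlan` are hypothetical names chosen for this example.

```typescript
// Hypothetical illustration only; not the @aharness/core API.
type StateName = 'plan' | 'ownerApproval';

interface PlanSubmission {
  plan: string;
}

interface Run {
  state: StateName;
  plan: string | null;
}

// Runtime check that an untrusted payload really has the declared shape.
function isPlanSubmission(value: unknown): value is PlanSubmission {
  if (typeof value !== 'object' || value === null) return false;
  const plan = (value as { plan?: unknown }).plan;
  return typeof plan === 'string' && plan.trim().length > 0;
}

// The only way out of 'plan' is a payload that passes validation.
function submitPlan(run: Run, payload: unknown): Run {
  if (run.state !== 'plan') {
    throw new Error(`submitPlan is not a valid exit from state "${run.state}"`);
  }
  if (!isPlanSubmission(payload)) {
    throw new Error('submitPlan rejected: payload must be { plan: non-empty string }');
  }
  return { state: 'ownerApproval', plan: payload.plan };
}
```

In this sketch, a prose claim such as "I wrote a plan" changes nothing; only a payload that passes the check moves the run forward.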

aharness plugs into the Codex setup you already use. It runs with your AGENTS.md, skills, MCP servers, permissions, and local tools instead of asking you to adopt a separate agent ecosystem.

A finite state machine (FSM) is a graph of named states and allowed transitions. For Codex workflows, FSMs sit between two bad extremes: raw code is flexible but hard to constrain, while YAML or JSON is easy to validate but too rigid for real workflows. aharness defines FSMs in TypeScript so workflows stay enforceable while still using typed data, guards, reducers, effects, the npm package ecosystem, and ordinary code-level composition.
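
For readers new to FSMs, here is a generic sketch of "named states and allowed transitions" as a plain transition table. It does not use aharness; the state names and the `canTransition` and `step` helpers are made up for illustration.

```typescript
// Generic illustration of an FSM as a transition table; not aharness code.
type State = 'plan' | 'implement' | 'verify' | 'done';

// Each state lists the only states it may move to.
const transitions: Record<State, readonly State[]> = {
  plan: ['implement'],
  implement: ['verify'],
  verify: ['implement', 'done'], // verification may send work back for repair
  done: [],
};

function canTransition(from: State, to: State): boolean {
  return transitions[from].indexOf(to) !== -1;
}

// Move to the next state, refusing any edge the table does not declare.
function step(from: State, to: State): State {
  if (!canTransition(from, to)) {
    throw new Error(`Transition ${from} -> ${to} is not allowed`);
  }
  return to;
}
```

Because every edge is declared up front, skipping from `plan` straight to `done` is an error rather than a judgment call.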

The bet is that useful workflows are reusable software. They should be authored, reviewed, versioned, composed from smaller FSMs, and published as npm packages instead of copied around as prompts.

What aharness Adds

Vanilla coding agents	With aharness
Put the process in instructions or a skill and hope it sticks	Encode the process as states the agent cannot skip
"Don't continue until I approve"	Approval is a real workflow gate
"Tell me what you did when you're done"	Evidence is typed, validated, and saved into workflow data
"If this fails, try to fix it, but don't loop forever"	Repair paths, retry limits, and failure outcomes are explicit
Long runs survive by compaction, summaries, or subagents	State, context clearing, and subprocess boundaries are deliberate
"Use the stronger model for the hard part"	Model and effort are selected per state
"Can I inspect what happened?"	Runs write state history, events, artifacts, and browser views

Install

Prerequisites:

Node.js >=20
Codex CLI on PATH; see packages/core/SUPPORTED_CODEX.md for the current compatibility gate

npm install -g @aharness/core

The global install puts the aharness command on PATH. Scaffolded projects still get local authoring dependencies so editors and tsc can typecheck FSM source.

Quickstart

Start with Codeflow to turn a large implementation roadmap into reviewed, verified, committed slices. It is the packaged aharness workflow for changes that are too broad or risky for one implementation plan.

Install the @aharness/codeflow workflow package through aharness:

aharness install @aharness/codeflow

Then run its recipe-driven development command against an implementation roadmap in your repository:

aharness run recipe-driven-development --roadmap-path docs/plans/my-roadmap.md

The Codeflow package also ships process skills for preparing the roadmap: writing-ideas, grill-me, writing-specs, reviewing-specs, and writing-implementation-roadmaps. See the Alfredvc/codeflow repository for docs and more information.

Writing Workflows

Author workflows with the bundled aharness FSM authoring skill, not from a blank TypeScript file. The skill guides Codex through state design, typed exits, owner choices, recovery paths, verification, and current @aharness/core API rules.

Install the authoring skill with npx skills:

npx skills add Alfredvc/aharness

Then ask Codex to use it:

Use $aharness-fsm-authoring to design and author an aharness FSM for this workflow.

The skill lives at skills/aharness-fsm-authoring. Under the hood, generated workflows are TypeScript files built with createFsm, fsm.machine, fsm.state, fsm.submit, fsm.choice, and fsm.final; see docs/authoring.md when you need the API details.

A small FSM looks like this:

import { createFsm } from '@aharness/core';

// Workflow data carried between states.
interface Data {
  plan: string | null;
}

const fsm = createFsm<Data>();

export default fsm.machine({
  id: 'tiny-approval-workflow',
  initial: 'plan',
  data: () => ({ plan: null }),
  states: {
    plan: fsm.state({
      prompt:
        'Inspect the requested work, write a short plan, ' +
        'then submit it as { "plan": "..." }. Do not edit files yet.',
      on: {
        // Typed exit: this state is left by submitting a { plan } payload.
        submitPlan: fsm.submit<{ plan: string }>({
          to: 'ownerApproval',
          // Store the submitted plan in workflow data.
          reduce: (draft, payload) => {
            draft.plan = payload.plan;
          },
        }),
      },
    }),
    // Owner choice: the question shows the stored plan and each option names its target state.
    ownerApproval: fsm.choice({
      question: (data) => `Approve this plan before continuing?\n\n${data.plan}`,
      options: [{ label: 'Approve', to: 'done' }],
    }),
    // Terminal state with an explicit outcome.
    done: fsm.final({ outcome: 'success' }),
  },
});

Installing FSM Packages

Published workflows are normal npm packages with aharness command metadata. Install them through the global CLI:

aharness install workflow-package
aharness list
aharness verify build
aharness run build --project ./app

aharness install <source> accepts package specs npm accepts: registry packages, versions or dist-tags, GitHub repos, git URLs, local directories, and tarballs.

aharness install workflow-package@latest
aharness install github:owner/workflows
aharness install git+https://github.com/owner/workflows.git
aharness install ../workflows
aharness install ./workflows-1.0.0.tgz

During install, aharness lets npm materialize the package in its managed npm project, then validates package command metadata, package-relative assets, bundled skill declarations, and every declared FSM before writing trusted command records. Installs may run npm lifecycle scripts, so install packages from sources you trust. Unverified commands are not runnable.
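
The rule that unverified commands are not runnable can be pictured as a record that is trusted only when every install-time check passed. The sketch below is an assumption-laden model, not the real record format: `CheckResult`, `CommandRecord`, `buildRecord`, and `assertRunnable` are hypothetical names.

```typescript
// Hypothetical model of install-time trust; not the aharness record format.
interface CheckResult {
  name: string;
  ok: boolean;
}

interface CommandRecord {
  identity: string;
  trusted: boolean;
  failures: string[];
}

// A command is trusted only when every check passed.
function buildRecord(identity: string, checks: CheckResult[]): CommandRecord {
  const failures = checks.filter((c) => !c.ok).map((c) => c.name);
  return { identity, trusted: failures.length === 0, failures };
}

// Refuse to run anything that did not pass verification.
function assertRunnable(record: CommandRecord): void {
  if (!record.trusted) {
    throw new Error(
      `${record.identity} is not runnable; failed checks: ${record.failures.join(', ')}`,
    );
  }
}
```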

Installed commands can be run or verified by fully qualified command identity, or by bare command name when there is no collision. Package names by themselves are not accepted verification targets:

aharness run workflow-package/build
aharness run build
aharness verify workflow-package/build
aharness verify build
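
The lookup rule described above (fully qualified identity, or bare name when there is no collision) might be modeled as follows. This is a sketch under assumptions, not the aharness resolver: `InstalledCommand`, `Resolution`, and `resolveCommand` are hypothetical names.

```typescript
// Hypothetical model of command lookup; not the aharness implementation.
interface InstalledCommand {
  packageName: string;
  commandName: string;
}

type Resolution =
  | { kind: 'found'; command: InstalledCommand }
  | { kind: 'not-found' }
  | { kind: 'ambiguous'; candidates: string[] };

function resolveCommand(installed: InstalledCommand[], target: string): Resolution {
  const qualified = (c: InstalledCommand): string => `${c.packageName}/${c.commandName}`;

  // A fully qualified identity takes priority over bare-name matching.
  const exact = installed.filter((c) => qualified(c) === target);
  // Otherwise treat the target as a bare command name.
  const matches = exact.length > 0 ? exact : installed.filter((c) => c.commandName === target);

  const [first, ...rest] = matches;
  if (first === undefined) return { kind: 'not-found' };
  if (rest.length === 0) return { kind: 'found', command: first };
  // More than one package declares this command name.
  return { kind: 'ambiguous', candidates: matches.map(qualified) };
}
```

Under this model a bare package name such as `workflow-package` resolves to `not-found`, because it matches neither a qualified identity nor a command name.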

Remove a package by package identity, not by command name:

aharness uninstall workflow-package

Re-run aharness install <same-source> to refresh a package after a new npm version, Git ref, tarball, or local snapshot is available.

Try The Demo

After installing the global CLI, clone this repository so the demo FSM and fixture files are available:

git clone https://github.com/Alfredvc/aharness.git
cd aharness
aharness verify examples/coding-smoke.fsm.ts
aharness run examples/coding-smoke.fsm.ts

The demo files are:

examples/coding-smoke.fsm.ts - the FSM.
examples/coding-smoke/fixture - the tiny broken TypeScript fixture the agent repairs.
examples/coding-smoke/README.md - what to watch during the run.

After that, use examples/DEMOS.md as a catalog of focused mechanism demos for awaits, approvals, hooks, composition, skills, branching, and final artifacts.

How It Works

flowchart LR
    Codex["Codex CLI<br/>agent worker"]
    Aharness["aharness CLI<br/>FSM actor + verifier"]
    Browser["Loopback browser UI<br/>input + approvals + graph"]
    Runs[".aharness/runs/&lt;runId&gt;<br/>events.jsonl + reports + artifacts"]

    Aharness <--> Codex
    Aharness <--> Browser
    Aharness --> Runs

An aharness run has three jobs:

Verify the workflow before Codex starts. Invalid FSMs fail early, before the model can begin work.
Keep Codex inside the active state. aharness tells Codex the current state, valid exits, and required submit schema. Codex does the work; aharness validates submitted evidence and decides the next state.
Record the run. Every run writes canonical artifacts under .aharness/runs/<runId>/, including the event log, state history, final artifacts, and data used by the browser view.
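
The three jobs can be sketched as a small loop: verify first, accept only declared exits, and append every transition to a log. This is a simplified model rather than aharness internals; `Machine`, `RunEvent`, `verifyMachine`, and `runMachine` are hypothetical, and the in-memory `events` array stands in for the on-disk event log.

```typescript
// Hypothetical sketch of the verify / constrain / record loop; not aharness internals.
interface Machine {
  initial: string;
  finals: string[];
  exits: Record<string, Record<string, string>>; // state -> exit name -> next state
}

interface RunEvent {
  seq: number;
  from: string;
  exit: string;
  to: string;
}

// Job 1: report exits (or an initial state) that point at undeclared states.
function verifyMachine(m: Machine): string[] {
  const known = Object.keys(m.exits).concat(m.finals);
  const problems: string[] = [];
  if (known.indexOf(m.initial) === -1) problems.push(`unknown initial state "${m.initial}"`);
  for (const state of Object.keys(m.exits)) {
    const exits = m.exits[state];
    if (exits === undefined) continue;
    for (const exit of Object.keys(exits)) {
      const to = exits[exit];
      if (to === undefined || known.indexOf(to) === -1) {
        problems.push(`${state}.${exit} targets an unknown state`);
      }
    }
  }
  return problems;
}

// Jobs 2 and 3: accept only declared exits and record each transition.
function runMachine(m: Machine, chosenExits: string[]): { state: string; events: RunEvent[] } {
  const problems = verifyMachine(m);
  if (problems.length > 0) throw new Error(`invalid FSM: ${problems.join('; ')}`);
  let state = m.initial;
  const events: RunEvent[] = [];
  for (const exit of chosenExits) {
    const exitsHere = m.exits[state];
    const to = exitsHere === undefined ? undefined : exitsHere[exit];
    if (to === undefined) throw new Error(`"${exit}" is not a valid exit from "${state}"`);
    events.push({ seq: events.length + 1, from: state, exit, to });
    state = to;
  }
  return { state, events };
}
```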

The browser UI is the live operator surface. It shows the current state, graph, compact transcript, approvals, and owner-input controls. Use --no-open when you want aharness to serve and print the URL without opening a browser window.

To inspect a recorded run, use aharness view [run-id]. It reopens a completed run from .aharness/runs without starting Codex or resuming the workflow. Omit the run id to inspect the newest recorded run.

Run directories are sensitive. They can contain raw owner input, browser replies, tool arguments and results, command output, file diffs, approvals, token usage, and workflow context snapshots. Treat .aharness/runs as private runtime evidence, not as a sanitized transcript.

Common Commands

aharness init --dir <path>
aharness verify <file.fsm.ts|command>
aharness visualize <file.fsm.ts|command>
aharness run <file.fsm.ts|command> --help
aharness run [--ask|--yolo] [--no-open] <file.fsm.ts|command> [--<input-flag> <value>]...
aharness view [run-id]
aharness doctor
aharness install <source>

When the standard CI environment variable is set to a truthy value, aharness verify skips Codex-backed model catalog validation so structural FSM verification can run in environments without a Codex app-server. All other static verifier checks still run.
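
One way to picture the CI behavior is a check list that drops the Codex-backed step when CI is set. The sketch below is illustrative only: which values aharness treats as truthy is an assumption here, and `isCiEnvironment`, `plannedChecks`, and the check labels are hypothetical.

```typescript
// Hypothetical sketch; the exact values aharness treats as truthy are an assumption.
type Env = Record<string, string | undefined>;

function isCiEnvironment(env: Env): boolean {
  const raw = env['CI'];
  if (raw === undefined) return false;
  const value = raw.trim().toLowerCase();
  // Assumed rule: empty, "0", and "false" count as unset; anything else is truthy.
  return value !== '' && value !== '0' && value !== 'false';
}

// Structural checks always run; the model catalog check needs a Codex app-server.
function plannedChecks(env: Env): string[] {
  const checks = ['structure'];
  if (!isCiEnvironment(env)) checks.push('model-catalog');
  return checks;
}
```

Passing the environment in as a parameter, instead of reading a global, keeps the sketch easy to test.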

See docs/reference.md for the full CLI, authoring API, state options, hooks, installable package commands, completions, default Codex auto-review behavior, --ask, --yolo, and --no-open. See docs/advanced-runtime-surfaces.md for programmatic live runs and Codex sidecar threads.

Packages

@aharness/core provides the SDK, the aharness CLI binary, and the aharness-completion shell-completion helper binary.
@aharness/test-support provides integration-test fixtures for aharness runs.
packages/web-ui is the private React/Vite browser UI bundled into the core CLI build.

Documentation

docs/authoring.md teaches the workflow authoring mental model.
docs/fsm-packages.md explains how to publish, install, run, and compose reusable FSM packages.
docs/reference.md documents the public SDK and CLI.
docs/advanced-runtime-surfaces.md documents programmatic live runs and Codex sidecar threads.
docs/architecture.md explains the Codex/aharness runtime boundary.
docs/troubleshooting.md covers prerequisite and runtime failures.
packages/core/SUPPORTED_CODEX.md documents the Codex CLI compatibility gate.
CONTRIBUTING.md, CHANGELOG.md, and SECURITY.md cover project maintenance, release notes, and vulnerability reporting.

FAQ

How is this different from Claude Code Dynamic Workflows? Both address the same problem: agents lack determinism. The approaches differ. Dynamic Workflows are generated on the fly by Claude Code itself, while aharness FSMs are long-lived workflows that are iterated on and improved over time. aharness also supports single-use FSMs, but that is not the main use case.
Why Codex? This project was originally based on Claude Code, but Claude Code is closed source and changes often, which made it difficult to develop aharness while keeping up with upstream changes. Codex is open source, and its app-server split makes building on top of it much easier.
When should I use aharness instead of a normal Codex session? Use aharness when process drift matters: when ordered phases, approvals, typed evidence, recovery paths, or terminal outcomes should be enforced instead of remembered. For tiny one-shot edits or fully owner-steered sessions, a normal Codex session is usually simpler.
Does aharness replace Codex? No. Codex still does the language, code, and tool work. aharness owns the workflow boundary around that work: active states, valid exits, schema validation, owner choices, approval routing, hooks, transitions, and durable run evidence.
Will you ever support Claude Code or PI? It depends on traction. This is currently an experiment, and it is already useful to me in its current form.
Can I run many FSMs simultaneously from a single UI? Not yet. This also depends on traction. The long-term idea is to support aharness submit X together with a daemon that executes FSMs in the background. All UI <-> aharness communication is HTTP-based, so a local daemon could talk to a remote UI, or vice versa.
Can I share workflows with a team? Yes. Workflows can be shipped as npm packages with aharness command metadata, bundled skills, and package-relative assets. Install packages only from sources you trust, because npm lifecycle scripts may run during aharness install.
Do I have to hand-write FSMs? No. The intended authoring path is to use the bundled aharness FSM authoring skill with Codex, then use the docs as API reference when you need exact details.

License

Apache-2.0. See LICENSE.
