make-harness installs and maintains a project-local harness for each repository.
It is a durable contract bootstrap + maintenance tool: it captures repository-local rules, commands, constraints, and guardrails in one canonical contract, regenerates thin AGENTS.md, CLAUDE.md, and GEMINI.md projections from that contract, and checks whether the harness is still healthy.
make-harness is not a development methodology, not an execution framework, and not an orchestration layer. It is a framework-agnostic local rule layer that stays small enough to compose with stronger workflow systems.
Korean version: README.ko.md
Typical pain before make-harness:
- AGENTS.md, CLAUDE.md, and GEMINI.md slowly diverge
- the same project rules get re-explained in every new AI session
- nobody is sure which file is authoritative anymore
- risky changes depend on tribal knowledge instead of an explicit local contract
With make-harness:
- one durable contract becomes the source of truth
- thin entry files are regenerated instead of hand-maintained
- audit and completion checks tell you whether the harness is actually healthy
- repo-specific defaults survive across sessions without adopting a bigger agent runtime
In practice, the first win is simple: less repeated setup chatter and fewer drifted root files.
make-harness is for developers and teams who:
- use more than one AI coding tool on the same repository
- keep re-explaining repo rules, commands, and guardrails to new AI sessions
- want one local source of truth instead of hand-maintaining multiple root instruction files
Common examples include teams mixing Claude Code, Codex, Gemini CLI, Cursor, or other agentic coding workflows.
make-harness is especially useful when repository-local rules differ from project to project.
Strong-fit environments include:
- solution or service companies managing many client or product repositories with different repo-local defaults
- teams mixing legacy repositories, new greenfield repositories, and security-sensitive services
- organizations where multiple developers or AI tools touch the same repository over time
- repos where approval rules, verification commands, or sensitive paths differ by project
- long-lived repositories where repeated setup chatter and root-file drift keep coming back
Lower-fit environments include:
- disposable throwaway repositories
- very short-lived experiments or PoCs
- repos where no durable local rules are worth maintaining yet
make-harness is designed to compose with stronger workflow systems, not replace them.
A good pairing looks like this:
- make-harness owns the repository-local contract: defaults, commands, constraints, approval rules, and thin projection sync
- stronger workflow tools such as ecc, superpowers, or other agent frameworks can still own planning, TDD, code review loops, and sub-agent coordination
- the result is a cleaner split: upper layers decide how work gets executed, while make-harness keeps each repository's local rules stable
This is usually a better fit than trying to force one global workflow onto every repository.
In a practical multi-agent flow, a common split looks like this:
- a planner decomposes the work and chooses the execution path
- a generator or executor produces the code, UI, or migration changes
- an independent evaluator or reviewer checks the result against explicit quality criteria instead of relying on self-evaluation alone
- make-harness keeps the shared local contract those agents read: definition of done, verification commands, approval boundaries, constraints, and lightweight rubric hints
That last point matters: make-harness should preserve the repository's quality expectations, but it should not become the runtime that schedules planner / generator / evaluator loops.
make-harness assumes artifact quality is stronger when the same agent is not the only judge of its own work.
For that reason, the durable contract is a good place to store:
- explicit definition_of_done language
- verification_policy defaults
- project_commands for tests, lint, typecheck, e2e, or visual review
- short rubric hints that tell a reviewer what "good" means in this repository
But it is not the place to store:
- how many review iterations a workflow should run
- which orchestration framework should schedule the loop
- permanent planner / generator / evaluator topology
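As a rough illustration of that split, the sketch below builds a contract fragment as a Python dict and flags keys that look like workflow concerns. The field names definition_of_done, verification_policy, and project_commands come from this README; the values, the rubric_hints key, and the WORKFLOW_ONLY_KEYS names are hypothetical and do not describe the actual harness-contract.json schema.

```python
import json

# Hypothetical key names for things the text says do NOT belong in the contract.
WORKFLOW_ONLY_KEYS = {"review_iterations", "orchestration_framework", "agent_topology"}

# Illustrative values only; the real harness-contract.json schema may differ.
contract_fragment = {
    "definition_of_done": "tests and lint pass; reviewer confirms the rubric",
    "verification_policy": "run project_commands.test before reporting completion",
    "project_commands": {"test": "pytest", "lint": "ruff check ."},
    "rubric_hints": ["public API changes need a changelog entry"],  # assumed key
}


def workflow_keys_in(contract: dict) -> list[str]:
    """Return top-level keys that look like workflow concerns, sorted."""
    return sorted(WORKFLOW_ONLY_KEYS.intersection(contract))


if __name__ == "__main__":
    print(json.dumps(contract_fragment, indent=2))
    print("workflow-only keys found:", workflow_keys_in(contract_fragment))
```

A check like this is only a naming heuristic; deciding whether a rule is durable and repo-local still takes human judgment.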
Typical flow:
You: /make-harness
Agent:
1. inspect the repository and any existing harness files
2. infer what it can from repo signals first
3. ask only the missing durable questions
4. generate or repair the harness files
5. report whether the harness is healthy
Result:
- PROJECT_HARNESS.md becomes the human-readable contract
- harness-contract.json stores the machine-readable durable contract
- AGENTS.md, CLAUDE.md, and GEMINI.md stay aligned as thin projections
- audit and completion checks can verify whether the harness is still in a healthy state
npx skills add parkjangwon/make-harness
Install to a specific agent:
npx skills add parkjangwon/make-harness -a claude-code
npx skills add parkjangwon/make-harness -a codex
npx skills add parkjangwon/make-harness -a gemini-cli
List what the repository exposes before installing:
npx skills add parkjangwon/make-harness --list
Managed files:
- AGENTS.md
- CLAUDE.md
- GEMINI.md
- PROJECT_HARNESS.md
- harness-contract.json
- harness-runtime.json
Core rules:
- PROJECT_HARNESS.md + harness-contract.json are the canonical durable contract
- harness-runtime.json is volatile runtime state only
- AGENTS.md + CLAUDE.md + GEMINI.md are thin projections
- only durable defaults and guardrails belong in the durable contract
- drift should be visible and repairable
- this skill should stay easy to compose with stronger frameworks and specialist skills
- project-local security guardrails belong in the contract, and the repo ships only a lightweight path-based guardrail smoke check
Durable contract fields:
- communication_language
- project_type
- definition_of_done
- change_posture
- change_guardrails
- verification_policy
- approval_policy
- project_commands
- project_constraints
- rule_strengths
- communication_tone
- stack_summary
- environment
rule_strengths is the minimal enforcement layer for the contract. Use it to say whether a durable rule is advisory, guided, or enforced without turning the harness into a heavy execution framework.
Some fields can look broader than they really are. In make-harness, they must stay narrowly project-local:
- definition_of_done = the repository's default completion expectation or local completion gate, not a universal development philosophy
- verification_policy = the repository's default verification rule, not a statement about how all development should be done
- approval_policy = the repository's confirmation rule for risky or sensitive changes, not a full team workflow policy
- change_posture = a narrow local default for change scope and risk tolerance in this repository, not a general engineering philosophy
If a rule is really about planning, TDD, branch strategy, code review loops, brainstorming, or sub-agent coordination, it belongs in a stronger workflow layer above make-harness, not in the durable harness contract.
Modes:
- bootstrap: no harness yet
- update: healthy harness -> update from the same /make-harness command
- repair: missing files, contract drift, or invariant failure
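The choice between bootstrap, update, and repair can be sketched as a small function over the managed files on disk. This is a simplified illustration, not the skill's actual logic: it assumes that any managed file counts as a partial harness, and it takes drift as a caller-supplied flag (the drift_detected parameter is hypothetical).

```python
from pathlib import Path

# Managed file names as listed in this README.
MANAGED_FILES = (
    "AGENTS.md", "CLAUDE.md", "GEMINI.md",
    "PROJECT_HARNESS.md", "harness-contract.json", "harness-runtime.json",
)


def choose_mode(project_dir: str, drift_detected: bool = False) -> str:
    """Pick bootstrap, update, or repair from the files present on disk."""
    root = Path(project_dir)
    present = [name for name in MANAGED_FILES if (root / name).is_file()]
    if not present:
        return "bootstrap"  # no harness yet
    if len(present) < len(MANAGED_FILES) or drift_detected:
        return "repair"  # missing files or reported drift
    return "update"  # complete harness with no reported drift
```

For example, choose_mode(".") on a repository with none of the managed files returns "bootstrap".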
The skill inspects the repository first, then runs a short interview, one question at a time. It should detect the likely collaboration language from repo signals first, then confirm when unclear. Where the repository already gives a clear answer, it asks for confirmation instead of asking an open-ended question.
The interview is intentionally bounded:
- this is a one-time setup, so precision matters more than blindly minimizing question count
- existing repositories use repo-first gap-filling rather than broad setup questions
- blank projects use a small upfront setup-discovery interview because there is no code to inspect
- fixed question order for durable defaults
- detect-first branching based on repo confidence
- question budget targeting 5 or fewer explicit questions for common repos, while allowing denser blank-project discovery when needed
- canonical normalization for free-form answers like change posture, approval policy, and verification policy
- explicit resume and contradiction rules when harness-runtime.json shows an interrupted setup
- minimal security interview items for security-sensitive areas, forbidden secret/debug patterns, configuration-based TLS exception rules, and security verification commands
Details: references/interview-guide.md, references/interview-protocol.md
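The detect-first branching described above can be sketched as a deterministic planner: walk a fixed field order, and ask for confirmation when the repository signal is strong, otherwise ask an open question. This is a hypothetical illustration, not the code in tools/interview_planner.py; the order shown, the CONFIRM_THRESHOLD value, and the shape of the detected input are all assumptions.

```python
# Assumed question order; field names come from the contract field list in this README.
QUESTION_ORDER = [
    "communication_language", "project_type", "definition_of_done",
    "change_posture", "verification_policy", "approval_policy",
]

CONFIRM_THRESHOLD = 0.8  # assumed confidence cutoff for "the repo gives a clear answer"


def plan_questions(detected: dict[str, tuple[str, float]]) -> list[dict]:
    """Map detected (value, confidence) pairs to confirm or open questions."""
    plan = []
    for field in QUESTION_ORDER:
        value, confidence = detected.get(field, (None, 0.0))
        if value is not None and confidence >= CONFIRM_THRESHOLD:
            # Strong repo signal: confirm the detected value.
            plan.append({"field": field, "kind": "confirm", "suggested": value})
        else:
            # Weak or missing signal: ask an open-ended question.
            plan.append({"field": field, "kind": "open", "suggested": None})
    return plan
```

Because the output depends only on the detected input, the same repository signals always produce the same question plan, which is what makes this kind of planner easy to pin with fixtures.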
Commit harness-contract.json. It stores the durable machine-readable project contract.
Do not treat harness-runtime.json as canonical durable state. It stores interview progress, sync metadata, language detection hints, and audit timestamps.
For shared repositories, consider adding harness-runtime.json to .gitignore. A starter template is at assets/templates/.gitignore-harness.
A harness is healthy when:
- all managed files exist
- canonical durable sources agree on the shared contract schema
- entry files are thin and reflect the same contract summary
- runtime invariants hold
- sync metadata is healthy
Details: assets/healthy-checklist.md
Run to verify all fixture scenarios are structurally valid:
python tools/validate-fixtures.py
This validator now also cross-checks interview-heavy fixtures against a deterministic planner in tools/interview_planner.py, so blank-project discovery, detect-first confirmation, and resume behavior stay behaviorally pinned rather than living only in prose.
Run to check a repository-local harness layout without invoking an LLM:
python tools/audit-harness.py /path/to/project
This verifies the file set, contract/runtime split, required PROJECT_HARNESS.md sections, entry-file thinness, and a few critical runtime invariants.
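A minimal sketch of two of those checks, the file set and entry-file thinness, might look like the following. It is not the code in tools/audit-harness.py: the MAX_ENTRY_LINES limit and the thinness message are assumptions, and the contract/runtime split, section checks, and runtime invariants are left out.

```python
from pathlib import Path

ENTRY_FILES = ("AGENTS.md", "CLAUDE.md", "GEMINI.md")
CANONICAL_FILES = ("PROJECT_HARNESS.md", "harness-contract.json", "harness-runtime.json")
MAX_ENTRY_LINES = 40  # assumed thinness limit, not taken from the real tool


def audit_layout(project_dir: str) -> list[str]:
    """Return a list of problems; an empty list means the layout looks fine."""
    root = Path(project_dir)
    problems = []
    missing = [n for n in ENTRY_FILES + CANONICAL_FILES if not (root / n).is_file()]
    if missing:
        problems.append(f"missing managed files: {missing}")
    for name in ENTRY_FILES:
        path = root / name
        if path.is_file():
            line_count = len(path.read_text(encoding="utf-8").splitlines())
            if line_count > MAX_ENTRY_LINES:
                problems.append(f"{name} is not thin: {line_count} lines")
    return problems
```

Returning a list of problem strings rather than raising on the first failure lets a caller report every issue in one pass.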
Run to materialize PROJECT_HARNESS.md, AGENTS.md, CLAUDE.md, and GEMINI.md directly from harness-contract.json and harness-runtime.json:
python tools/apply-harness.py /path/to/project
This is the smallest execution layer in the repo: the LLM can decide the durable contract, but projection files no longer need to be handwritten.
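To make the idea of a generated projection concrete, here is a small sketch that renders a thin entry file from a contract dict. The layout of the rendered text is invented for illustration and is not the template tools/apply-harness.py uses; only the field names definition_of_done and project_commands come from this README.

```python
def render_projection(tool_name: str, contract: dict) -> str:
    """Render a thin entry file that points back at the canonical contract."""
    commands = contract.get("project_commands", {})
    lines = [
        f"# {tool_name} entry (generated, do not edit by hand)",
        "Canonical contract: PROJECT_HARNESS.md and harness-contract.json",
        f"Definition of done: {contract.get('definition_of_done', 'unspecified')}",
        "Commands:",
    ]
    # Sort commands so the same contract always renders the same text.
    lines += [f"- {name}: {cmd}" for name, cmd in sorted(commands.items())]
    return "\n".join(lines) + "\n"
```

Keeping the renderer deterministic matters because a later drift check can then compare checked-in files against fresh generator output byte for byte.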
Run to decide whether a harness is actually done, not just structurally present:
python tools/check-harness-done.py /path/to/project
This gate requires audit success, configured + healthy runtime state, full validated_shared_fields, and zero drift between the checked-in projections and deterministic generator output.
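The gate can be pictured as a conjunction of those conditions. The sketch below is an assumption-laden illustration rather than the logic of tools/check-harness-done.py: validated_shared_fields is named in this README, but the status and health keys, their values, and the function signatures are hypothetical.

```python
import difflib


def projection_drift(checked_in: str, generated: str) -> list[str]:
    """Return unified-diff lines; an empty list means zero drift."""
    return list(difflib.unified_diff(
        checked_in.splitlines(), generated.splitlines(),
        fromfile="checked-in", tofile="generated", lineterm="",
    ))


def is_done(audit_ok: bool, runtime_state: dict, required_fields: set[str],
            checked_in: str, generated: str) -> bool:
    """Combine the conditions described above into one boolean gate."""
    validated = set(runtime_state.get("validated_shared_fields", []))
    return (
        audit_ok
        and runtime_state.get("status") == "configured"  # assumed key and value
        and runtime_state.get("health") == "healthy"      # assumed key and value
        and required_fields <= validated                  # every required field validated
        and not projection_drift(checked_in, generated)   # zero drift
    )
```

When the gate fails on drift, printing the lines returned by projection_drift shows exactly which projection lines differ from the generator output.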
make-harness/
├── SKILL.md
├── README.md
├── README.ko.md
├── docs/
│   ├── coexistence.md
│   └── positioning.md
├── agents/
│   ├── openai.yaml
│   └── gemini.yaml
├── tools/
│   ├── apply-harness.py
│   ├── audit-harness.py
│   ├── check-harness-done.py
│   ├── check-sensitive-change.py
│   ├── interview_planner.py
│   └── validate-fixtures.py
└── assets/
    ├── templates/
    ├── fixtures/
    ├── examples/
    ├── healthy-checklist.md
    └── repair-playbook.md
Run to detect whether a diff touches auth / permissions / secrets / payments / encryption / public API areas:
python tools/check-sensitive-change.py /path/to/project --paths src/auth/login.py config/tls-dev.yaml
For git-based checks, use refs instead of explicit paths:
python tools/check-sensitive-change.py /path/to/project --base HEAD~1 --head HEAD
If the relevant rule_strengths are enforced, sensitive changes block completion and hooks/CI can fail. If they are only guided, the checker reports the category but does not block.
This checker is intentionally lightweight. Today it is a path-based smoke check, not a deep code-aware security engine.
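A path-based smoke check of this kind can be approximated with glob matching plus a lookup of rule strengths. The sketch below is illustrative only and is not the code in tools/check-sensitive-change.py: the SENSITIVE_PATTERNS globs are hypothetical, and keying rule_strengths by category is an assumption about the contract shape.

```python
from fnmatch import fnmatch

# Hypothetical glob patterns per category; real patterns would come from the contract.
SENSITIVE_PATTERNS = {
    "auth": ["*auth*"],
    "secrets": ["*secret*", "*.env"],
    "encryption": ["*tls*", "*crypto*"],
}


def classify_paths(paths: list[str]) -> dict[str, list[str]]:
    """Group changed paths by the sensitive category whose pattern they match."""
    hits: dict[str, list[str]] = {}
    for path in paths:
        for category, patterns in SENSITIVE_PATTERNS.items():
            if any(fnmatch(path.lower(), pattern) for pattern in patterns):
                hits.setdefault(category, []).append(path)
    return hits


def should_block(hits: dict[str, list[str]], rule_strengths: dict[str, str]) -> bool:
    """Block only when a touched category is marked enforced."""
    return any(rule_strengths.get(category, "advisory") == "enforced" for category in hits)
```

With the two paths from the command above, classify_paths reports src/auth/login.py under auth and config/tls-dev.yaml under encryption; whether that blocks depends entirely on the strengths passed to should_block. Matching on path names alone will produce false positives and misses, which is the trade-off of a smoke check.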
- CI now runs audit-harness, check-harness-done, and a diff-sensitive smoke check.
- A sample git hook is provided at assets/templates/pre-commit-harness.sh.
- The hook now checks staged files with git diff --cached --name-only --diff-filter=ACMR instead of assuming HEAD~1..HEAD.
- MAKE_HARNESS_HOOK_MODE supports strict (default), warn, and off.
- The hook prints short summaries like audit: pass, done-gate: pass, and sensitive-change: pass so failures are easier to understand.
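The hook behavior above can be sketched in Python, although the shipped sample is a shell script. The exact semantics of warn and off are assumptions here (warn reports without blocking, off skips the checks), as is the fallback to strict for unrecognized values; the git flags are the ones quoted above.

```python
import os
import subprocess

VALID_MODES = ("strict", "warn", "off")


def hook_mode(env=None) -> str:
    """Read MAKE_HARNESS_HOOK_MODE, defaulting to strict."""
    env = os.environ if env is None else env
    mode = env.get("MAKE_HARNESS_HOOK_MODE", "strict").strip().lower()
    return mode if mode in VALID_MODES else "strict"  # fallback is an assumption


def staged_files() -> list[str]:
    """List staged paths using the same git diff flags the hook is described as using."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACMR"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def exit_code(mode: str, checks_passed: bool) -> int:
    """Translate check results into a hook exit status."""
    if mode == "off" or checks_passed:
        return 0
    return 1 if mode == "strict" else 0  # warn reports but does not block
```

Checking staged files rather than HEAD~1..HEAD means the hook evaluates what is about to be committed, including on a repository's first commit where HEAD~1 does not exist.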
Example success output:
PASS: managed files present, contract/runtime split detected, entry files thin
Example failure output:
missing managed files: ['harness-contract.json', 'harness-runtime.json']
Use make-harness when:
- a repository keeps drifting across AGENTS.md, CLAUDE.md, and GEMINI.md
- the team wants durable local defaults without adopting a heavier framework
- hidden legacy constraints should be captured once and reused across sessions
Do not use it when:
- the repo is disposable
- the real problem is execution orchestration rather than local contract clarity
- you want the skill to manage runtime tools, plugins, or agent teams
- positioning: docs/positioning.md
- coexistence: docs/coexistence.md
- fixtures: assets/fixtures
- fixture validator: tools/validate-fixtures.py
- interview planner: tools/interview_planner.py
- lightweight audit: tools/audit-harness.py
- sample output: assets/examples
- rollout example: assets/examples/legacy-webapp-rollout.md
- interview guide: references/interview-guide.md
- interview protocol: references/interview-protocol.md
- repair playbook: assets/repair-playbook.md
- healthy checklist: assets/healthy-checklist.md
- gitignore template: assets/templates/.gitignore-harness
/make-harness
Use the single-entry `/make-harness` skill to bootstrap when no harness exists, update when a healthy harness already exists, or repair drift before continuing.