fix: close out the eight-issue audit (validator, test runner, skill runner, generator) by SingleSourceStudios · Pull Request #26 · SingularityAI-Dev/covenant-md

SingleSourceStudios · 2026-06-11T22:34:58Z

Closes out all eight findings from @antnewman's validator / test-runner / skill-runner / generator audit. One commit per issue, each with a regression fixture or unit test.

Validator

validator: fixture.expect shape not cross-checked against contracts.outputs #18 fixture.expect is now cross-checked against contracts.outputs, mirroring the existing input check. Comparison operators excepted.
validator: quality.gates section entirely unvalidated #19 quality.gates is now validated: array shape, required id and check, action in retry | fail, integer max_retries, string on_exhaustion, and operation cross-checked against interface.surface. Normative fields were taken from docs/COVENANT.md (id/check/action/max_retries/on_exhaustion), not the name/severity/on_fail guessed in the issue.
validator + test runner: fixture.retry silently coerced via "|| 0" #21 fixture.retry must be a non-negative integer; the runner now uses ?? 0 instead of || 0.
validator: depends_on cycle detection missing (runner is single point of failure) #24 depends_on cycle detection at validation time, reporting the cycle path. The existence check also now handles array-valued depends_on, which the spec schema allows but the old check mishandled.

Test runner

test runner: validateOutput partial-match allows undeclared output fields #20 Opt-in strict_output: true per fixture fails the run when actual output carries fields absent from contracts.outputs. Default behaviour unchanged (non-breaking); the default-mode question is deferred to the v1.1 spec discussion. Landing this surfaced two latent bugs, fixed in the same commit: executeFixture read this.expect (never set), so fixture.expect was never enforced at all; and partialEqual was partial only at the top level, contradicting the documented partial-match semantics. The docx-generation example expected file_path literals the contract-driven simulator cannot honour; those lines are dropped and validation_result.valid remains the real assertion.
test runner: hasCircularDependencies silently ignores deps not in fixture list #25 hasCircularDependencies now throws on a depends_on naming an unknown fixture instead of silently treating it as no dependency.

DX papercuts

skill-runner: stdout JSON parse failure mis-attributed to "process runner failed" #22 Process-runner stdout that fails to parse as JSON is now reported as a parse failure with the first 200 chars of raw stdout, instead of process runner failed: exit 0.
generate.js: fs.stat(...).catch(() => false) treats permission errors as nonexistence #23 Both fs.stat(...).catch(() => false) sites in generate.js narrowed to ENOENT; permission and I/O errors propagate.

Verification

Jest: 72 passed (13 new) across 7 suites
npm run test:fixtures: 4/4 example skills green
npm run test:conformance: 15/15

Fixes #18
Fixes #19
Fixes #20
Fixes #21
Fixes #22
Fixes #23
Fixes #24
Fixes #25

Summary by cubic

Closes out audit issues #18–#25 across the validator, test runner, skill runner, and generator. Adds strict output checks, stronger validation, and clearer error messages for safer, more predictable runs.

New Features
- strict_output: true per fixture to fail on undeclared output fields; default behavior unchanged (test runner: validateOutput partial-match allows undeclared output fields #20).
- Validate quality.gates: required id/check, action in retry|fail, integer max_retries, string on_exhaustion, and operation cross-checked with interface.surface (validator: quality.gates section entirely unvalidated #19).
Bug Fixes
- Validator: cross-check fixture.expect against contracts.outputs; enforce retry as a non-negative integer; detect depends_on cycles and support array-valued depends_on (validator: fixture.expect shape not cross-checked against contracts.outputs #18, validator + test runner: fixture.retry silently coerced via "|| 0" #21, validator: depends_on cycle detection missing (runner is single point of failure) #24).
- Test runner: enforce fixture.expect (fixes this.expect bug) and make partial matching deep; throw on unknown depends_on instead of silently ignoring it (test runner: validateOutput partial-match allows undeclared output fields #20, test runner: hasCircularDependencies silently ignores deps not in fixture list #25).
- Skill runner: surface JSON parse failures with a clear message and a 200-char stdout sample instead of “exit 0” (skill-runner: stdout JSON parse failure mis-attributed to "process runner failed" #22).
- CLI generator: narrow fs.stat catch to ENOENT so permission/I/O errors aren’t masked (generate.js: fs.stat(...).catch(() => false) treats permission errors as nonexistence #23).

^{Written for commit 2c42f33. Summary will update on new commits.}

Summary by CodeRabbit

New Features
- Added quality gates validation for defining and enforcing quality checks on skill operations
- Introduced strict output validation for fixtures to detect undeclared fields in responses
Bug Fixes
- Improved error handling during skill generation to distinguish permission and I/O failures from missing directories
- Enhanced JSON parsing error reporting with clearer messages and raw output context
- Strengthened fixture dependency validation with cycle detection and better error messages
Tests
- Added comprehensive test coverage for error handling scenarios, quality gates validation, and strict output behavior

The validator checked fixture.input field names against contracts.inputs but had no mirror check for expect blocks, so a typo in an expected output field passed validation and only surfaced as a confusing runtime mismatch. Adds the mirror cross-check (comparison operators excepted) plus regression fixtures. The strict-output-skill fixture doubles as the target for the strict_output work in #20. Fixes #18

retry: "yes", retry: "1", retry: 1.5 and friends were silently coerced to zero retries at execution time via || 0, losing the author's intent. The validator now requires a non-negative integer; the runner uses ?? 0 so only absence defaults, never truthiness. Same bug class as logic-md \#64 (recovery configuration silently lost between spec and execution). Fixes #21

The existence check passed any graph whose references all resolve, so an A -> B -> A cycle validated ok: true and only the test runner caught it at execution. Consumers that validate without executing (IDE plugins, lint-only CI, the MCP covenant_validate tool) were blind. Adds DFS cycle detection reporting the cycle path, and extends the existence check to array-valued depends_on, which the spec schema allows but the check previously mishandled. Fixes #24

quality.gates was entirely unvalidated; grep for gates in validator.js returned nothing while generate.js prompts users to author them. Adds validateQualityGates mirroring validateFixtures: array shape, required id and check, action in the retry|fail enum, integer max_retries, string on_exhaustion and description, and operation cross-checked against interface.surface. Normative fields taken from docs/COVENANT.md, which uses id/check/action/max_retries/on_exhaustion rather than the name/severity/on_fail guessed in the issue. Fixes #19

A fixture can now set strict_output: true to fail when the actual output carries fields absent from contracts.outputs (leaked secrets, undeclared side-effect counters). Default stays partial matching, non-breaking; the default-mode question is deferred to the v1.1 spec discussion. Landing this surfaced two latent problems, both fixed here: 1. executeFixture read this.expect, which is never set, so fixture.expect was never enforced at all. Now reads fixture.expect. 2. partialEqual was partial at the top level but demanded deep equality below, contradicting the documented partial-match semantics. Nested plain objects now match partially at every level (arrays and primitives still require exact equality). With expect enforcement live, the docx-generation example expected file_path literals the contract-driven simulator cannot honour (it synthesizes string defaults; echoing output_path into file_path would need skill-specific knowledge). Those two expect lines are dropped; validation_result.valid remains the real assertion. The validator also type-checks strict_output as boolean. Fixes #20

hasCircularDependencies silently skipped any dependency name absent from the fixture list, so a typo in depends_on became no dependency and the fixture ran as if its dep was satisfied. Unreachable via the validated path, but the runner can be invoked on raw fixture arrays (programmatic callers, MCP tools), where it reported green instead of erroring. Now throws with the offending fixture and dependency named, same posture as the existing cycle throw. Companion to the validator-side cycle fix. Fixes #25

…t error A subprocess that exited 0 but emitted non-JSON stdout (stray console.log before the payload, double-encoded stringify) was reported as process runner failed: exit 0, pointing the author away from the real cause. The parse error is now surfaced with the first 200 chars of raw stdout. Fixes #22

fs.stat(...).catch(() => false) treated EACCES, ENOTDIR and EIO as nonexistence, so a permission error during skill init silently fell into create-directory logic and the secondary error pointed at the wrong root cause. Both stat sites now return false only on ENOENT and rethrow everything else. Same family as the skill-runner stdout fix: swallowed errors narrow the root-cause signal. Fixes #23

coderabbitai · 2026-06-11T22:35:11Z

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Walkthrough

This PR addresses seven interconnected validation and error-handling issues across the COVENANT validator, test runner, and CLI. Changes narrow error handling to surface real failures, enhance the validator to enforce fixture schema contracts and quality gates, and improve the test runner to support strict output mode with better error attribution.

Changes

Validation and Test Execution Enhancements

Layer / File(s)	Summary
CLI and skill runner error handling `packages/cli/src/generate.js`, `packages/cli/tests/generate.test.js`, `packages/core/src/skill-runner.js`, `packages/core/tests/skill-runner.test.js`	Error handling in directory existence checks and JSON parsing now distinguishes real failures (permission, I/O, invalid format) from "missing" cases; error messages include relevant context (raw stdout samples, error codes) instead of generic fallback messages.
Validator fixture and gates schema checking `packages/core/src/validator.js`	Validator now cross-checks `fixture.expect` field names against declared `contracts.outputs`, enforces `fixture.retry` as non-negative integer and `strict_output` as boolean, validates `quality.gates` structure (required/optional fields, enum values, operation references), and detects cyclic `depends_on` graphs via new `findFixtureCycle` helper.
Test runner fixture parsing and dependency handling `packages/core/src/test.js`	Fixture parsing preserves `contracts.outputs` for strict output checking, defaults `retry` via nullish coalescing, throws on unknown `depends_on` references instead of silently ignoring, and rewrites `partialEqual` to apply `validateOutput` at nested object levels for consistent contract enforcement.
Test runner output validation and strict mode `packages/core/src/test.js`	Output validation now consults `fixture.expect` directly (instead of a runner-level property), implements `strict_output` mode that fails when actual output contains undeclared fields, and reports failures with fixture.expect JSON for clarity.
Test fixtures for validation rules `packages/core/tests/fixtures/{cyclic-fixture-deps,invalid-expect-cross-ref,invalid-gates,invalid-retry-type,invalid-strict-output-type,strict-output-skill,valid-gates}.md`	New fixture definitions exercise all validation paths: cross-reference mismatches, type errors, gate validation failure modes, cycle detection, and strict output behavior.
Test coverage for validator and test runner enhancements `packages/core/tests/{validator,test-runner,skill-runner}.test.js`	Jest suites add coverage for expect cross-refs, retry type validation, strict_output type checking, quality.gates malformed entries, depends_on cycle detection, and runtime strict output enforcement with proper error reporting.
Example fixture expectation updates `examples/docx-generation/COVENANT.md`	Remove `file_path` assertions from docx example fixtures; retain only `validation_result.valid` checks.

Sequence Diagrams

sequenceDiagram
    participant Validator
    participant TestRunner
    participant Execution
    Validator->>Validator: Parse and extract fixtures
    Validator->>Validator: Validate fixture.expect against contracts.outputs
    Validator->>Validator: Type-check retry, strict_output
    Validator->>Validator: Validate quality.gates structure and operations
    Validator->>Validator: Detect cycles in depends_on graph
    Validator-->>TestRunner: Report validation errors or pass
    alt Validation passes
        TestRunner->>TestRunner: Extract fixtures with contractsOutputs
        TestRunner->>TestRunner: Build topological sort (reject unknown deps)
        TestRunner->>Execution: Execute fixtures in dependency order
        Execution->>Execution: Run fixture operation
        Execution->>TestRunner: Return actual output
        TestRunner->>TestRunner: Validate output vs fixture.expect
        alt strict_output enabled
            TestRunner->>TestRunner: Check for undeclared fields
            TestRunner-->>Execution: Fail if leaked fields
        end
        TestRunner-->>TestRunner: Report pass/fail with error details
    else Validation fails
        Validator-->>TestRunner: Reject covenant
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

A rabbit hops through contracts neat,
Ensuring fixtures all compete—
No cycles hide, no fields leak through,
Each gate is checked, each promise true. 🐰✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 71.43% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title accurately summarizes the main changeset as closing eight audit issues across validator, test runner, skill runner, and generator components.
Description check	✅ Passed	The PR description is comprehensive, covering all required sections: summary of changes per component, linked issues with issue numbers, spec impacts, verification results, and a detailed cubic.dev summary.
Linked Issues check	✅ Passed	All eight linked issues (`#18`–#25) are addressed: validator expect cross-check (`#18`), quality.gates validation (`#19`), strict_output opt-in (`#20`), retry type-check (`#21`), JSON parse error reporting (`#22`), fs.stat error narrowing (`#23`), cycle detection (`#24`), and unknown depends_on throw (`#25`).
Out of Scope Changes check	✅ Passed	All changes are scoped to the eight audit issues: validator enhancements, test runner fixes, skill runner error reporting, CLI generator error handling, and corresponding regression tests/fixtures. No extraneous changes detected.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/issues-18-25

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

SingleSourceStudios added 8 commits June 12, 2026 00:18

SingleSourceStudios merged commit f55f908 into main Jun 11, 2026
6 of 7 checks passed

SingleSourceStudios deleted the fix/issues-18-25 branch June 11, 2026 22:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: close out the eight-issue audit (validator, test runner, skill runner, generator)#26

fix: close out the eight-issue audit (validator, test runner, skill runner, generator)#26
SingleSourceStudios merged 8 commits into
mainfrom
fix/issues-18-25

SingleSourceStudios commented Jun 11, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 11, 2026 •

edited

Loading

Review failed

Walkthrough

Changes

Sequence Diagrams

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

SingleSourceStudios commented Jun 11, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Validator

Test runner

DX papercuts

Verification

Summary by cubic

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Sequence Diagrams

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

SingleSourceStudios commented Jun 11, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 11, 2026 •

edited

Loading