Skip to content

fix: close out the eight-issue audit (validator, test runner, skill runner, generator)#26

Merged
SingleSourceStudios merged 8 commits into
mainfrom
fix/issues-18-25
Jun 11, 2026
Merged

fix: close out the eight-issue audit (validator, test runner, skill runner, generator)#26
SingleSourceStudios merged 8 commits into
mainfrom
fix/issues-18-25

Conversation

@SingleSourceStudios

@SingleSourceStudios SingleSourceStudios commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Closes out all eight findings from @antnewman's validator / test-runner / skill-runner / generator audit. One commit per issue, each with a regression fixture or unit test.

Validator

Test runner

  • test runner: validateOutput partial-match allows undeclared output fields #20 Opt-in strict_output: true per fixture fails the run when actual output carries fields absent from contracts.outputs. Default behaviour unchanged (non-breaking); the default-mode question is deferred to the v1.1 spec discussion. Landing this surfaced two latent bugs, fixed in the same commit: executeFixture read this.expect (never set), so fixture.expect was never enforced at all; and partialEqual was partial only at the top level, contradicting the documented partial-match semantics. The docx-generation example expected file_path literals the contract-driven simulator cannot honour; those lines are dropped and validation_result.valid remains the real assertion.
  • test runner: hasCircularDependencies silently ignores deps not in fixture list #25 hasCircularDependencies now throws on a depends_on naming an unknown fixture instead of silently treating it as no dependency.

DX papercuts

Verification

  • Jest: 72 passed (13 new) across 7 suites
  • npm run test:fixtures: 4/4 example skills green
  • npm run test:conformance: 15/15

Fixes #18
Fixes #19
Fixes #20
Fixes #21
Fixes #22
Fixes #23
Fixes #24
Fixes #25


Summary by cubic

Closes out audit issues #18#25 across the validator, test runner, skill runner, and generator. Adds strict output checks, stronger validation, and clearer error messages for safer, more predictable runs.

Written for commit 2c42f33. Summary will update on new commits.

Review in cubic

Summary by CodeRabbit

  • New Features

    • Added quality gates validation for defining and enforcing quality checks on skill operations
    • Introduced strict output validation for fixtures to detect undeclared fields in responses
  • Bug Fixes

    • Improved error handling during skill generation to distinguish permission and I/O failures from missing directories
    • Enhanced JSON parsing error reporting with clearer messages and raw output context
    • Strengthened fixture dependency validation with cycle detection and better error messages
  • Tests

    • Added comprehensive test coverage for error handling scenarios, quality gates validation, and strict output behavior

The validator checked fixture.input field names against contracts.inputs
but had no mirror check for expect blocks, so a typo in an expected
output field passed validation and only surfaced as a confusing runtime
mismatch. Adds the mirror cross-check (comparison operators excepted)
plus regression fixtures. The strict-output-skill fixture doubles as the
target for the strict_output work in #20.

Fixes #18
retry: "yes", retry: "1", retry: 1.5 and friends were silently coerced
to zero retries at execution time via || 0, losing the author's intent.
The validator now requires a non-negative integer; the runner uses ?? 0
so only absence defaults, never truthiness. Same bug class as logic-md
\#64 (recovery configuration silently lost between spec and execution).

Fixes #21
The existence check passed any graph whose references all resolve, so an
A -> B -> A cycle validated ok: true and only the test runner caught it
at execution. Consumers that validate without executing (IDE plugins,
lint-only CI, the MCP covenant_validate tool) were blind. Adds DFS cycle
detection reporting the cycle path, and extends the existence check to
array-valued depends_on, which the spec schema allows but the check
previously mishandled.

Fixes #24
quality.gates was entirely unvalidated; grep for gates in validator.js
returned nothing while generate.js prompts users to author them. Adds
validateQualityGates mirroring validateFixtures: array shape, required
id and check, action in the retry|fail enum, integer max_retries, string
on_exhaustion and description, and operation cross-checked against
interface.surface. Normative fields taken from docs/COVENANT.md, which
uses id/check/action/max_retries/on_exhaustion rather than the
name/severity/on_fail guessed in the issue.

Fixes #19
A fixture can now set strict_output: true to fail when the actual output
carries fields absent from contracts.outputs (leaked secrets, undeclared
side-effect counters). Default stays partial matching, non-breaking; the
default-mode question is deferred to the v1.1 spec discussion.

Landing this surfaced two latent problems, both fixed here:

1. executeFixture read this.expect, which is never set, so fixture.expect
   was never enforced at all. Now reads fixture.expect.
2. partialEqual was partial at the top level but demanded deep equality
   below, contradicting the documented partial-match semantics. Nested
   plain objects now match partially at every level (arrays and
   primitives still require exact equality).

With expect enforcement live, the docx-generation example expected
file_path literals the contract-driven simulator cannot honour (it
synthesizes string defaults; echoing output_path into file_path would
need skill-specific knowledge). Those two expect lines are dropped;
validation_result.valid remains the real assertion. The validator also
type-checks strict_output as boolean.

Fixes #20
hasCircularDependencies silently skipped any dependency name absent from
the fixture list, so a typo in depends_on became no dependency and the
fixture ran as if its dep was satisfied. Unreachable via the validated
path, but the runner can be invoked on raw fixture arrays (programmatic
callers, MCP tools), where it reported green instead of erroring. Now
throws with the offending fixture and dependency named, same posture as
the existing cycle throw. Companion to the validator-side cycle fix.

Fixes #25
…t error

A subprocess that exited 0 but emitted non-JSON stdout (stray
console.log before the payload, double-encoded stringify) was reported
as process runner failed: exit 0, pointing the author away from the real
cause. The parse error is now surfaced with the first 200 chars of raw
stdout.

Fixes #22
fs.stat(...).catch(() => false) treated EACCES, ENOTDIR and EIO as
nonexistence, so a permission error during skill init silently fell into
create-directory logic and the secondary error pointed at the wrong root
cause. Both stat sites now return false only on ENOENT and rethrow
everything else. Same family as the skill-runner stdout fix: swallowed
errors narrow the root-cause signal.

Fixes #23
@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown

Review Change Stack

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Walkthrough

This PR addresses seven interconnected validation and error-handling issues across the COVENANT validator, test runner, and CLI. Changes narrow error handling to surface real failures, enhance the validator to enforce fixture schema contracts and quality gates, and improve the test runner to support strict output mode with better error attribution.

Changes

Validation and Test Execution Enhancements

Layer / File(s) Summary
CLI and skill runner error handling
packages/cli/src/generate.js, packages/cli/tests/generate.test.js, packages/core/src/skill-runner.js, packages/core/tests/skill-runner.test.js
Error handling in directory existence checks and JSON parsing now distinguishes real failures (permission, I/O, invalid format) from "missing" cases; error messages include relevant context (raw stdout samples, error codes) instead of generic fallback messages.
Validator fixture and gates schema checking
packages/core/src/validator.js
Validator now cross-checks fixture.expect field names against declared contracts.outputs, enforces fixture.retry as non-negative integer and strict_output as boolean, validates quality.gates structure (required/optional fields, enum values, operation references), and detects cyclic depends_on graphs via new findFixtureCycle helper.
Test runner fixture parsing and dependency handling
packages/core/src/test.js
Fixture parsing preserves contracts.outputs for strict output checking, defaults retry via nullish coalescing, throws on unknown depends_on references instead of silently ignoring, and rewrites partialEqual to apply validateOutput at nested object levels for consistent contract enforcement.
Test runner output validation and strict mode
packages/core/src/test.js
Output validation now consults fixture.expect directly (instead of a runner-level property), implements strict_output mode that fails when actual output contains undeclared fields, and reports failures with fixture.expect JSON for clarity.
Test fixtures for validation rules
packages/core/tests/fixtures/{cyclic-fixture-deps,invalid-expect-cross-ref,invalid-gates,invalid-retry-type,invalid-strict-output-type,strict-output-skill,valid-gates}.md
New fixture definitions exercise all validation paths: cross-reference mismatches, type errors, gate validation failure modes, cycle detection, and strict output behavior.
Test coverage for validator and test runner enhancements
packages/core/tests/{validator,test-runner,skill-runner}.test.js
Jest suites add coverage for expect cross-refs, retry type validation, strict_output type checking, quality.gates malformed entries, depends_on cycle detection, and runtime strict output enforcement with proper error reporting.
Example fixture expectation updates
examples/docx-generation/COVENANT.md
Remove file_path assertions from docx example fixtures; retain only validation_result.valid checks.

Sequence Diagrams

sequenceDiagram
    participant Validator
    participant TestRunner
    participant Execution
    Validator->>Validator: Parse and extract fixtures
    Validator->>Validator: Validate fixture.expect against contracts.outputs
    Validator->>Validator: Type-check retry, strict_output
    Validator->>Validator: Validate quality.gates structure and operations
    Validator->>Validator: Detect cycles in depends_on graph
    Validator-->>TestRunner: Report validation errors or pass
    alt Validation passes
        TestRunner->>TestRunner: Extract fixtures with contractsOutputs
        TestRunner->>TestRunner: Build topological sort (reject unknown deps)
        TestRunner->>Execution: Execute fixtures in dependency order
        Execution->>Execution: Run fixture operation
        Execution->>TestRunner: Return actual output
        TestRunner->>TestRunner: Validate output vs fixture.expect
        alt strict_output enabled
            TestRunner->>TestRunner: Check for undeclared fields
            TestRunner-->>Execution: Fail if leaked fields
        end
        TestRunner-->>TestRunner: Report pass/fail with error details
    else Validation fails
        Validator-->>TestRunner: Reject covenant
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

A rabbit hops through contracts neat,
Ensuring fixtures all compete—
No cycles hide, no fields leak through,
Each gate is checked, each promise true. 🐰✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 71.43% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The PR title accurately summarizes the main changeset as closing eight audit issues across validator, test runner, skill runner, and generator components.
Description check ✅ Passed The PR description is comprehensive, covering all required sections: summary of changes per component, linked issues with issue numbers, spec impacts, verification results, and a detailed cubic.dev summary.
Linked Issues check ✅ Passed All eight linked issues (#18#25) are addressed: validator expect cross-check (#18), quality.gates validation (#19), strict_output opt-in (#20), retry type-check (#21), JSON parse error reporting (#22), fs.stat error narrowing (#23), cycle detection (#24), and unknown depends_on throw (#25).
Out of Scope Changes check ✅ Passed All changes are scoped to the eight audit issues: validator enhancements, test runner fixes, skill runner error reporting, CLI generator error handling, and corresponding regression tests/fixtures. No extraneous changes detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/issues-18-25

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@SingleSourceStudios SingleSourceStudios merged commit f55f908 into main Jun 11, 2026
6 of 7 checks passed
@SingleSourceStudios SingleSourceStudios deleted the fix/issues-18-25 branch June 11, 2026 22:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment