Guardrails for the requirement-analyzer agent, implemented with the Mastra
Processor API. They make the agent's requirement-coverage analyses reliable,
consistent and safe by preventing the two production failures documented in
test-data/:
- False negatives: declaring a requirement missing when it is actually
  implemented (e.g. concluding `REQ_15` `MISSING` without ever reading the
  `/docs/agents.md` the requirement names).
- False positives: declaring a requirement implemented on the strength of
  documentation prose or search snippets, without reading any code (e.g.
  `REQ_06` → `Implemented @0.95` with zero `submission_read` calls).
All guardrails are deterministic (no extra LLM), so they add < 1 ms/step, introduce no second model, and are fully unit-tested without Ollama.
Architecture details: architecture.md · Validation steps: Validation.md
| # | Guardrail | Stage | Where |
|---|---|---|---|
| 1 | False-Negative Minimization | output | guardrails/false-negative-processor.ts |
| 2 | False-Positive Prevention | output | guardrails/false-positive-processor.ts |
| 3 | Output Quality Verification | output | guardrails/output-quality-processor.ts |
| 4 | Result Consistency | model + post | providers/ollama.ts, guardrails/consistency.ts |
| — | Context-Window Management | input | guardrails/context-window-processor.ts |
| — | Smart tools | tools | tools/suggest-searches-tool.ts, tools/verify-evidence-tool.ts |
The pure decision logic lives in src/mastra/agents/requirement-analyzer/guardrails/
and is consumed by thin Mastra Processor adapters and by the tests.
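
As a rough illustration of this split, the sketch below shows the kind of pure decision function a thin adapter or a unit test could call. The type shapes, the `checkFalsePositive` name, the `"search"` tool name, and the feedback wording are assumptions made for illustration, not the repository's actual code; only the `submission_read` tool name and the REQ_06 scenario come from the failure description above.

```typescript
// Hypothetical shapes for illustration; the real types live under guardrails/.
type Verdict = "implemented" | "partial" | "missing";

interface ToolCall {
  tool: string; // e.g. "submission_read", or a search tool
}

interface Report {
  requirementId: string;
  verdict: Verdict;
  confidence: number;
}

interface FalsePositiveDecision {
  unsupported: boolean;
  feedback?: string;
}

// Pure function: no I/O and no model call, so a unit test can call it directly.
function checkFalsePositive(report: Report, history: ToolCall[]): FalsePositiveDecision {
  if (report.verdict !== "implemented") {
    return { unsupported: false };
  }
  // Count how many times the agent actually read submission files.
  const reads = history.filter((call) => call.tool === "submission_read").length;
  if (reads > 0) {
    return { unsupported: false };
  }
  return {
    unsupported: true,
    feedback: `${report.requirementId}: read the relevant source files before concluding "implemented".`,
  };
}

// The REQ_06 scenario: "implemented" at 0.95 with zero submission_read calls.
const decision = checkFalsePositive(
  { requirementId: "REQ_06", verdict: "implemented", confidence: 0.95 },
  [{ tool: "search" }],
);
```

Here `decision.unsupported` is `true`, which is the signal an adapter would translate into a retry request. Keeping the check free of Mastra types is what lets the same function run in the tests without Ollama.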
- Node.js ≥ 20 (`.nvmrc` pins v24; CI tested on v22).
- pnpm 10 (`packageManager` is pinned).
- Ollama with `qwen3:4b-instruct` for live runs.
pnpm install

| Variable | Default | Purpose |
|---|---|---|
| OLLAMA_HOST | http://localhost:11434 | Ollama endpoint |
| LLM_PROVIDER_NAME | TC-Ollama | Provider |
| LLM_MODEL_NAME | qwen3:4b-instruct | Model (agent and any LLM step share it) |
| MAX_CONTEXT_SIZE | 43960 | Context window; drives num_ctx and the tool-result budget |
| LLM_SEED | 42 | Fixed decoding seed (Result Consistency) |
| LOCAL_DEV | true | Enables LibSQL storage + observability + memory |
| WORKSPACE_PATH | (none) | Absolute path to <repo>/workspace (required) |

Guardrail tunables (all optional, env-overridable — see guardrails/config.ts):
FN_MIN_SEARCH_ATTEMPTS, FN_MIN_READ_ATTEMPTS, FN_NEGATIVE_SCORE_THRESHOLD,
FN_MAX_RETRIES, FN_EMPTY_WORKSPACE_FILES, FP_REQUIRE_CODE_EVIDENCE,
FP_MIN_EVIDENCE_LENGTH, FP_MAX_UNVERIFIED_PATHS, FP_MAX_RETRIES,
OQ_MIN_EVIDENCE_ITEMS, OQ_MIN_QUALITY_SCORE, OQ_MAX_STRUCTURE_RETRIES,
CONSISTENCY_SCORE_QUANTUM, CONSISTENCY_SCORE_TOLERANCE,
CONTEXT_RESERVE_TOKENS.
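
One way such env-overridable tunables can be read is sketched below. This is a minimal sketch under stated assumptions: the helper names, the config field names, the treatment of `FP_REQUIRE_CODE_EVIDENCE` as a boolean, and every default value are placeholders, and the actual defaults are the ones defined in guardrails/config.ts.

```typescript
type Env = Record<string, string | undefined>;

// Parse a numeric tunable; fall back when the variable is unset, blank, or not a finite number.
function numberFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isFinite(parsed) ? parsed : fallback;
}

// Parse a boolean tunable; only "true" and "false" (case-insensitive) are recognised.
function booleanFromEnv(env: Env, name: string, fallback: boolean): boolean {
  const raw = env[name]?.trim().toLowerCase();
  if (raw === "true") return true;
  if (raw === "false") return false;
  return fallback;
}

// NOTE: every default below is a placeholder, not the value in guardrails/config.ts.
function loadGuardrailConfig(env: Env) {
  return {
    fnMinSearchAttempts: numberFromEnv(env, "FN_MIN_SEARCH_ATTEMPTS", 2),
    fnMinReadAttempts: numberFromEnv(env, "FN_MIN_READ_ATTEMPTS", 1),
    fnMaxRetries: numberFromEnv(env, "FN_MAX_RETRIES", 2),
    fpRequireCodeEvidence: booleanFromEnv(env, "FP_REQUIRE_CODE_EVIDENCE", true),
    fpMaxRetries: numberFromEnv(env, "FP_MAX_RETRIES", 2),
  };
}

// Overrides win; anything unset keeps its (placeholder) default.
const config = loadGuardrailConfig({ FN_MAX_RETRIES: "3", FP_REQUIRE_CODE_EVIDENCE: "false" });
```

Passing the environment in as an argument, rather than reading `process.env` inside the helpers, keeps the loader easy to exercise in unit tests; a caller in a Node.js process would simply pass `process.env`.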
# challenge-context.json already lives in ./workspace
# The false-positive submission is preconfigured under workspace/submission.
# To use the false-negative submission instead:
rm -rf workspace/submission && mkdir -p workspace/submission
unzip test-data/false-NEGATIVE-submission.zip -d workspace/submission

Set WORKSPACE_PATH in .env to the absolute path of <repo>/workspace.

pnpm dev # Mastra Studio at http://localhost:3000/studio
pnpm start # CLI quality gate (src/cli/run-quality-gate.ts)

- Single requirement: `studio/agents/requirement-analyzer-agent/chat/new`, then paste a requirement JSON.
- Full review: `studio/workflows/requirementsAnalyzerWorkflow/graph`.

pnpm test # all tests (guardrail + pre-existing)
pnpm test:guardrails # guardrail suite only (43 tests)
pnpm lint
pnpm format:check

Two kinds of artifacts live under src/mastra/public/:
- Offline verification trace (committed):

      pnpm traces # writes src/mastra/public/guardrail-traces.db

  Replays the real `test-data/` reports through the guardrails and records each verdict (FN retry? FP block? quality score?) into a LibSQL DB, reproducible without Ollama. Inspect with any SQLite/LibSQL client:

      SELECT dataset, requirement_id, parsed_verdict, read_count,
             fn_should_retry, fp_unsupported, oq_quality_score
      FROM guardrail_traces
      ORDER BY dataset, requirement_id;

- Live agent-run traces (optional): with `LOCAL_DEV=true`, running the workflow on Ollama persists Mastra memory/observability to LibSQL (`ai-review-libsql-storage.db`, `requirement-analyzer-memory.db`). Copy the freshest of these into `src/mastra/public/` to ship real run traces:

      cp ai-review-libsql-storage.db src/mastra/public/
      cp requirement-analyzer-memory.db src/mastra/public/

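
Once rows from `guardrail_traces` have been fetched with whichever client is at hand, a small helper can condense them per dataset. The sketch below is illustrative only: the `summarizeTraces` name, the row and summary types, and the 0/1 encoding of the flag columns are assumptions, and the sample rows are made up to exercise the function rather than taken from a real trace.

```typescript
// Row shape mirroring a subset of the columns in guardrail_traces. The 0/1
// encoding of the flag columns is an assumption about how they are stored.
interface TraceRow {
  dataset: string;
  requirement_id: string;
  fn_should_retry: number;
  fp_unsupported: number;
}

interface DatasetSummary {
  total: number;
  fnRetries: number;
  fpBlocks: number;
}

// Count, per dataset, how many requirements triggered each guardrail.
function summarizeTraces(rows: TraceRow[]): Map<string, DatasetSummary> {
  const summary = new Map<string, DatasetSummary>();
  for (const row of rows) {
    const entry = summary.get(row.dataset) ?? { total: 0, fnRetries: 0, fpBlocks: 0 };
    entry.total += 1;
    if (row.fn_should_retry === 1) entry.fnRetries += 1;
    if (row.fp_unsupported === 1) entry.fpBlocks += 1;
    summary.set(row.dataset, entry);
  }
  return summary;
}

// Made-up rows with hypothetical dataset names; not real trace data.
const exampleSummary = summarizeTraces([
  { dataset: "dataset-a", requirement_id: "REQ_01", fn_should_retry: 1, fp_unsupported: 0 },
  { dataset: "dataset-a", requirement_id: "REQ_02", fn_should_retry: 0, fp_unsupported: 0 },
  { dataset: "dataset-b", requirement_id: "REQ_01", fn_should_retry: 0, fp_unsupported: 1 },
]);
```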
On the agent's conclusion step, a guardrail inspects the report and the
thread's tool-call history. If the conclusion is unsafe, it calls
args.abort(feedback, { retry: true }) with specific feedback (which files
to read, which patterns to search). The agent then performs the requested tool
calls and re-concludes. A hard maxRetries cap and an empty-workspace check
guarantee termination, so the retry loop cannot run indefinitely.
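
The termination argument can be sketched as a small decision function. This is a simplified model, not the actual processor: `guardConclusion`, `ConclusionCheck`, `AbortFn`, and the sample feedback string are hypothetical names introduced here, the `abort` parameter merely stands in for the `args.abort` callback that Mastra supplies, and the assumption that an empty workspace means "accept rather than retry" is a guess at what the empty-workspace check does.

```typescript
// Hypothetical stand-in for the abort callback described above. The real one
// is supplied by Mastra on the processor's arguments and may not return.
type AbortFn = (feedback: string, options: { retry: boolean }) => void;

interface ConclusionCheck {
  unsafe: boolean;            // outcome of the pure guardrail checks
  feedback: string;           // which files to read, which patterns to search
  retryCount: number;         // retries already spent on this conclusion
  maxRetries: number;         // hard cap
  workspaceFileCount: number; // 0 means there is nothing left to read
}

function guardConclusion(check: ConclusionCheck, abort: AbortFn): "accepted" | "retry-requested" {
  if (!check.unsafe) return "accepted";
  // Assumed behaviour: with no files to read, retrying cannot help.
  if (check.workspaceFileCount === 0) return "accepted";
  // Hard cap: stop asking once the retry budget is spent.
  if (check.retryCount >= check.maxRetries) return "accepted";
  abort(check.feedback, { retry: true });
  return "retry-requested";
}

// Simulate an agent whose conclusion never becomes safe: the loop still ends.
function simulateStubbornAgent(maxRetries: number): number {
  let abortCalls = 0;
  let retryCount = 0;
  while (
    guardConclusion(
      { unsafe: true, feedback: "read src/index.ts", retryCount, maxRetries, workspaceFileCount: 10 },
      () => {
        abortCalls += 1;
      },
    ) === "retry-requested"
  ) {
    retryCount += 1;
  }
  return abortCalls;
}
```

With a cap of 2, `simulateStubbornAgent(2)` requests exactly two retries and then accepts, which is the property that rules out an infinite loop even when the agent never improves its answer.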
See architecture.md §11 for the full file map.
Only libraries already present in the starter package.json are used (Mastra,
ai-sdk-ollama, zod, tokenx, @mastra/libsql). @libsql/client (already in
the dependency tree via @mastra/libsql, MIT-licensed) is declared as a
devDependency solely for the offline trace-generation script. No other
third-party code was added.