Playbook Agent: playbook loader with validation (2/8) by rashmiraghunandan · Pull Request #573 · roostorg/coop

rashmiraghunandan · 2026-05-23T00:05:13Z

depends on Playbook Agent: types and abstract interfaces (1/8) #572

Summary

Playbook loader that converts raw YAML/JSON configs into validated TypeScript Playbook objects with comprehensive cross-reference checking.

Part 2 of 8 — depends on #572 (types and interfaces).

`playbookLoader.ts`

parsePlaybook(): snake_case YAML input to camelCase TypeScript output
Cross-reference validation at load time:
- baseline_catalog_refs must be a subset of allowed_catalog_refs
- Deep-dive module query refs must be in the allowed set
- Hard rule outcomes must reference valid label IDs
- Default label must exist in labels
- Confidence weights must sum to ~1.0
Rejects forbidden floating versions (latest, head, main)

`playbookLoader.test.ts` — 7 tests

Parses valid playbook with correct type conversion
Applies default values for optional confidence fields
Rejects baseline ref not in allowed set
Rejects invalid default label
Rejects confidence weights that don't sum to ~1.0
Rejects invalid semantic versions
Validates hard rule outcomes against labels

Test plan

npm test -- --testPathPattern=playbookLoader passes (7 tests)

Depends on #572 Types and interfaces. Merge #572 first.

Summary by CodeRabbit

New Features
- Added a Playbook Agent framework for structured investigation workflows with playbook parsing, validation, and result types.
- Added semantic versioning and catalog reference validation for playbooks.
- Added configurable decision logic, confidence computation, hard rules, evidence policies, and output contracts.
Tests
- Added unit tests validating playbook parsing, schema enforcement, cross-reference checks, and error reporting.

coderabbitai · 2026-05-23T00:05:25Z

📝 Walkthrough

Walkthrough

This PR adds core Playbook domain types and utilities, a playbook parser with cross-reference validation and aggregate error reporting plus tests, and service-layer TypeScript interfaces for LLM, evidence store, artifact storage, and playbook orchestration.

Changes

Playbook Agent Framework

Layer / File(s)	Summary
Core domain types and utilities `server/services/playbookAgentService/playbookTypes.ts`	Defines semantic version parsing/formatting, catalog reference validation/formatting, input context and evidence ontology types, evidence request policy, deep-dive module types, label/hard-rule/decision logic types, confidence computation and phase budget types, output contract, top-level `Playbook`, and `PlaybookResult`/verdict/confidence types.
Playbook parsing and validation `server/services/playbookAgentService/playbookLoader.ts`, `server/services/playbookAgentService/playbookLoader.test.ts`	Adds raw snake_case input shapes, conversion helpers (`toCatalogReference`), `PlaybookValidationError`, `validatePlaybookCrossRefs`, and `parsePlaybook(raw)` which validates fields, parses/normalizes structures, applies defaults, checks cross-references (baseline ⊆ allowed, deep-dive refs in allowed, label IDs in hard rules/default), and enforces confidence weight sums; comprehensive Jest tests cover valid parsing and multiple rejection cases (invalid refs, labels, weights, versions, forbidden IDs, and hard-rule outcomes).
Service layer contracts and interfaces `server/services/playbookAgentService/interfaces.ts`	Adds `LLMRequest`/`LLMResponse` and `ILLMAdapter`, `EvidenceQuery` and `IEvidenceStore` returning `QueryResult`, `PlaybookArtifact` and `IArtifactStore` for immutable artifact storage, and `PlaybookRunInput` / `IPlaybookRunner` returning `PlaybookResult`.

Sequence Diagram(s)

sequenceDiagram
  participant Loader as parsePlaybook
  participant RawTypes as RawPlaybook
  participant Converter as toCatalogReference
  participant CrossRef as validatePlaybookCrossRefs
  participant Result as Playbook
  
  Loader->>RawTypes: validate raw fields
  RawTypes->>Converter: convert catalog refs
  Converter->>Loader: parse semantic versions
  Loader->>CrossRef: check label IDs, refs, weights
  CrossRef->>CrossRef: baseline subset of allowed
  CrossRef->>CrossRef: validate hard rule outcomes
  CrossRef->>CrossRef: confidence weight sum ~1.0
  CrossRef->>Result: return parsed Playbook

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Title check	✅ Passed	The title clearly and specifically summarizes the main change: implementing a playbook loader with validation (part 2 of an 8-part series).
Description check	✅ Passed	The PR description includes all required template sections with adequate detail about changes, but the Tests section lacks specific manual testing or screenshot information.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Introduces a typed, versioned “playbook” schema and a YAML-to-Playbook loader with validation to support deterministic risk investigation workflows.

Changes:

Added TypeScript schema/types for playbooks, catalog references, confidence computation, and results.
Implemented a playbook loader that converts snake_case YAML objects into typed Playbook objects and validates cross-references.
Added unit tests for loader validation behavior and defaults; defined public adapter interfaces for LLM/evidence/artifact stores.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File	Description
server/services/playbookAgentService/playbookTypes.ts	Adds core playbook/type system, semantic version helpers, and catalog reference validation utilities.
server/services/playbookAgentService/playbookLoader.ts	Adds YAML object parsing + structural/cross-ref validation to produce validated `Playbook` objects.
server/services/playbookAgentService/playbookLoader.test.ts	Adds unit coverage for successful parsing, defaults, and key validation failures.
server/services/playbookAgentService/interfaces.ts	Adds framework adapter interfaces for LLM, evidence execution, artifact storage, and runner API.

+function toCatalogReference(raw: RawCatalogRef): CatalogReference {
+  const ref: CatalogReference = {
+    catalogId: raw.catalog_id,
+    version: parseSemanticVersion(raw.version),
+    catalogType: raw.catalog_type as CatalogType,
+  };
+  validateCatalogReference(ref);
+  return ref;
+}


+  const evidenceRequestPolicy: EvidenceRequestPolicy = {
+    querySelectionMode: 'query_id',
+    maxRounds: r.evidence_request_policy.max_rounds,
+    targetConfidence: r.evidence_request_policy.target_confidence,
+    maxAdditionalQueries: r.evidence_request_policy.max_additional_queries,
+    stopIfNoNewEvidence: r.evidence_request_policy.stop_if_no_new_evidence,
+    triggerDeepDiveIf: r.evidence_request_policy.trigger_deep_dive_if,
+    stopInvestigationIf: r.evidence_request_policy.stop_investigation_if,
+    guidance: r.evidence_request_policy.guidance,
+  };


+export type EvidenceQuery = {
+  readonly catalogId: string;
+  readonly version: string;
+  readonly parameters: Record<string, unknown>;
+};


+describe('parsePlaybook', () => {
+  it('parses a valid playbook', () => {
+    const playbook = parsePlaybook(MINIMAL_PLAYBOOK);
+
+    expect(playbook.useCaseId).toBe('test_investigation');
+    expect(playbook.version).toEqual({ major: 1, minor: 0, patch: 0 });
+    expect(playbook.baselineCatalogRefs).toHaveLength(1);
+    expect(playbook.allowedCatalogRefs).toHaveLength(2);
+    expect(playbook.labels).toHaveLength(2);
+    expect(playbook.confidenceComputation.llmUncertaintyAlpha).toBe(0.4);
+  });


coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (2)

server/services/playbookAgentService/interfaces.ts (1)

42-45: 🏗️ Heavy lift

Use domain-level catalog/version types in EvidenceQuery to prevent contract drift.

catalogId and version are currently plain string, which weakens guarantees already modeled in the domain layer. Prefer reusing the typed catalog/version entities from playbookTypes.ts in this interface contract.

Proposed direction

-import type { PlaybookResult, QueryResult } from './playbookTypes.js';
+import type {
+  PlaybookResult,
+  QueryResult,
+  CatalogReference,
+  SemanticVersion,
+} from './playbookTypes.js';

 export type EvidenceQuery = {
-  readonly catalogId: string;
-  readonly version: string;
+  readonly catalogId: CatalogReference['catalogId'];
+  readonly version: SemanticVersion;
   readonly parameters: Record<string, unknown>;
 };

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/services/playbookAgentService/interfaces.ts` around lines 42 - 45, The
EvidenceQuery type uses plain strings for catalogId and version; replace those
with the domain-level types exported from playbookTypes.ts (e.g., CatalogId and
CatalogVersion or the corresponding names used there) by importing them into
server/services/playbookAgentService/interfaces.ts and updating EvidenceQuery's
catalogId and version properties to use those imported types while leaving
parameters as Record<string, unknown>.

server/services/playbookAgentService/playbookLoader.test.ts (1)

62-157: ⚡ Quick win

Add regression tests for catalog type and query-selection validation paths.

Given the parser invariants, add negative tests for: invalid catalog_type, mismatched catalogType between baseline and allowed refs, and unsupported query_selection_mode.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/services/playbookAgentService/playbookLoader.test.ts` around lines 62
- 157, Add three negative unit tests in the parsePlaybook suite using
MINIMAL_PLAYBOOK: one that sets a baseline_catalog_refs entry with an invalid
catalog_type (e.g., 'invalid_type') and asserts parsePlaybook throws about
invalid catalog_type; one that creates a mismatch between
baseline_catalog_refs[0].catalog_type and an allowed_catalog_refs entry’s
catalog_type (different values) and asserts parsePlaybook throws about
mismatched catalog types; and one that sets query_selection_mode to an
unsupported value (e.g., 'UNKNOWN_MODE') and asserts parsePlaybook throws about
unsupported query_selection_mode. Reference parsePlaybook, MINIMAL_PLAYBOOK,
baseline_catalog_refs, allowed_catalog_refs, and query_selection_mode to locate
where to add each test.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server/services/playbookAgentService/playbookLoader.ts`:
- Around line 303-305: The code creates evidenceRequestPolicy with a hard-coded
querySelectionMode 'query_id' which overwrites the authored value; update the
loader where evidenceRequestPolicy is constructed (evidenceRequestPolicy in
playbookLoader) to read the value from the authored config (e.g.,
r.evidence_request_policy.query_selection_mode) instead of hard-coding, and
validate it against allowed options (reject/throw a descriptive error if the
value is missing or not in the accepted set) so invalid configs are surfaced
rather than silently replaced.
- Around line 139-146: The function toCatalogReference currently type-casts
raw.catalog_type to CatalogType which skips runtime checks; instead perform a
runtime validation/parsing of raw.catalog_type (e.g., implement or call an
isValid/parse function) and set ref.catalogType only if the value is one of the
allowed CatalogType members, otherwise throw a descriptive error; keep
parseSemanticVersion(raw.version) and validateCatalogReference(ref) but remove
the "as CatalogType" cast and replace it with a runtime-validated assignment
(for example add a parseCatalogType(raw.catalog_type) or isCatalogType check
before constructing/returning the CatalogReference).
- Around line 177-199: The subset check builds allowedIds using
formatCatalogReference but only includes catalogId@version, allowing mismatched
catalogType to be accepted; update the matching key to incorporate catalogType
(e.g., catalogType:catalogId@version) and use that consistently when populating
allowedIds and when checking playbook.baselineCatalogRefs and each mod.queryRefs
in playbook.deepDiveModules; either extend formatCatalogReference to include
catalogType or add a new helper (referenced by allowedIds,
formatCatalogReference, playbook.allowedCatalogRefs,
playbook.baselineCatalogRefs, playbook.deepDiveModules, and mod.queryRefs) so
both creation and lookup use the same catalogType-aware key.

In `@server/services/playbookAgentService/playbookTypes.ts`:
- Around line 24-37: The parseSemanticVersion function currently uses
Number(...) which accepts malformed tokens (e.g., empty strings, "+1", " 1"), so
update parseSemanticVersion to first validate each token with a strict regex
(e.g., /^\d+$/) before converting to Number; if any token fails the regex, throw
the existing Invalid semantic version error (include input). Then convert tokens
to integers, check they are non-negative integers as before, and return the
SemanticVersion ({ major, minor, patch }). This change ensures
parseSemanticVersion rejects empty or non-digit tokens and only accepts strictly
numeric components.

---

Nitpick comments:
In `@server/services/playbookAgentService/interfaces.ts`:
- Around line 42-45: The EvidenceQuery type uses plain strings for catalogId and
version; replace those with the domain-level types exported from
playbookTypes.ts (e.g., CatalogId and CatalogVersion or the corresponding names
used there) by importing them into
server/services/playbookAgentService/interfaces.ts and updating EvidenceQuery's
catalogId and version properties to use those imported types while leaving
parameters as Record<string, unknown>.

In `@server/services/playbookAgentService/playbookLoader.test.ts`:
- Around line 62-157: Add three negative unit tests in the parsePlaybook suite
using MINIMAL_PLAYBOOK: one that sets a baseline_catalog_refs entry with an
invalid catalog_type (e.g., 'invalid_type') and asserts parsePlaybook throws
about invalid catalog_type; one that creates a mismatch between
baseline_catalog_refs[0].catalog_type and an allowed_catalog_refs entry’s
catalog_type (different values) and asserts parsePlaybook throws about
mismatched catalog types; and one that sets query_selection_mode to an
unsupported value (e.g., 'UNKNOWN_MODE') and asserts parsePlaybook throws about
unsupported query_selection_mode. Reference parsePlaybook, MINIMAL_PLAYBOOK,
baseline_catalog_refs, allowed_catalog_refs, and query_selection_mode to locate
where to add each test.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 53afe152-6371-47da-9b81-a20ec4a16843

📥 Commits

Reviewing files that changed from the base of the PR and between 9d6bf65 and 3a824d9.

📒 Files selected for processing (4)

server/services/playbookAgentService/interfaces.ts
server/services/playbookAgentService/playbookLoader.test.ts
server/services/playbookAgentService/playbookLoader.ts
server/services/playbookAgentService/playbookTypes.ts

coderabbitai · 2026-05-23T00:09:10Z

+  const allowedIds = new Set(
+    playbook.allowedCatalogRefs.map(formatCatalogReference),
+  );
+  for (const ref of playbook.baselineCatalogRefs) {
+    const refStr = formatCatalogReference(ref);
+    if (!allowedIds.has(refStr)) {
+      errors.push({
+        field: 'baselineCatalogRefs',
+        message: `Baseline ref '${refStr}' not found in allowed_catalog_refs`,
+      });
+    }
+  }
+
+  // deep-dive query refs must be in allowed set
+  if (playbook.deepDiveModules) {
+    for (const mod of playbook.deepDiveModules) {
+      for (const ref of mod.queryRefs) {
+        const refStr = formatCatalogReference(ref);
+        if (!allowedIds.has(refStr)) {
+          errors.push({
+            field: `deepDiveModules.${mod.moduleId}`,
+            message: `Deep-dive ref '${refStr}' not found in allowed_catalog_refs`,
+          });


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Cross-reference keys should include catalogType.

Current subset checks compare only catalogId@version, so mismatched catalogType values can be treated as valid matches.

Suggested fix

+const referenceKey = (ref: CatalogReference): string => + `${ref.catalogType}:${formatCatalogReference(ref)}`; + function validatePlaybookCrossRefs(playbook: Playbook): PlaybookValidationError[] { const errors: PlaybookValidationError[] = []; // baseline_catalog_refs must be a subset of allowed_catalog_refs - const allowedIds = new Set( - playbook.allowedCatalogRefs.map(formatCatalogReference), - ); + const allowedIds = new Set(playbook.allowedCatalogRefs.map(referenceKey)); for (const ref of playbook.baselineCatalogRefs) { - const refStr = formatCatalogReference(ref); - if (!allowedIds.has(refStr)) { + const refStr = referenceKey(ref); + if (!allowedIds.has(refStr)) { errors.push({ field: 'baselineCatalogRefs', message: `Baseline ref '${refStr}' not found in allowed_catalog_refs`, }); } } @@ for (const ref of mod.queryRefs) { - const refStr = formatCatalogReference(ref); + const refStr = referenceKey(ref); if (!allowedIds.has(refStr)) {

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const allowedIds = new Set(

playbook.allowedCatalogRefs.map(formatCatalogReference),

);

for (const ref of playbook.baselineCatalogRefs) {

const refStr = formatCatalogReference(ref);

if (!allowedIds.has(refStr)) {

errors.push({

field: 'baselineCatalogRefs',

message: `Baseline ref '${refStr}' not found in allowed_catalog_refs`,

});

}

}

// deep-dive query refs must be in allowed set

if (playbook.deepDiveModules) {

for (const mod of playbook.deepDiveModules) {

for (const ref of mod.queryRefs) {

const refStr = formatCatalogReference(ref);

if (!allowedIds.has(refStr)) {

errors.push({

field: `deepDiveModules.${mod.moduleId}`,

message: `Deep-dive ref '${refStr}' not found in allowed_catalog_refs`,

});

const referenceKey = (ref: CatalogReference): string =>

`${ref.catalogType}:${formatCatalogReference(ref)}`;

const allowedIds = new Set(

playbook.allowedCatalogRefs.map(referenceKey),

);

for (const ref of playbook.baselineCatalogRefs) {

const refStr = referenceKey(ref);

if (!allowedIds.has(refStr)) {

errors.push({

field: 'baselineCatalogRefs',

message: `Baseline ref '${refStr}' not found in allowed_catalog_refs`,

});

}

}

// deep-dive query refs must be in allowed set

if (playbook.deepDiveModules) {

for (const mod of playbook.deepDiveModules) {

for (const ref of mod.queryRefs) {

const refStr = referenceKey(ref);

if (!allowedIds.has(refStr)) {

errors.push({

field: `deepDiveModules.${mod.moduleId}`,

message: `Deep-dive ref '${refStr}' not found in allowed_catalog_refs`,

});

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/services/playbookAgentService/playbookLoader.ts` around lines 177 - 199, The subset check builds allowedIds using formatCatalogReference but only includes catalogId@version, allowing mismatched catalogType to be accepted; update the matching key to incorporate catalogType (e.g., catalogType:catalogId@version) and use that consistently when populating allowedIds and when checking playbook.baselineCatalogRefs and each mod.queryRefs in playbook.deepDiveModules; either extend formatCatalogReference to include catalogType or add a new helper (referenced by allowedIds, formatCatalogReference, playbook.allowedCatalogRefs, playbook.baselineCatalogRefs, playbook.deepDiveModules, and mod.queryRefs) so both creation and lookup use the same catalogType-aware key.

coderabbitai · 2026-05-23T00:09:10Z

+  const evidenceRequestPolicy: EvidenceRequestPolicy = {
+    querySelectionMode: 'query_id',
+    maxRounds: r.evidence_request_policy.max_rounds,


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not silently overwrite query_selection_mode.

The loader currently ignores the authored value and always sets 'query_id', which hides invalid configs instead of rejecting them.

Suggested fix

+ if (r.evidence_request_policy.query_selection_mode !== 'query_id') { + throw new Error( + `Invalid query_selection_mode '${r.evidence_request_policy.query_selection_mode}', expected 'query_id'`, + ); + } + const evidenceRequestPolicy: EvidenceRequestPolicy = { - querySelectionMode: 'query_id', + querySelectionMode: r.evidence_request_policy.query_selection_mode, maxRounds: r.evidence_request_policy.max_rounds,

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const evidenceRequestPolicy: EvidenceRequestPolicy = {

querySelectionMode: 'query_id',

maxRounds: r.evidence_request_policy.max_rounds,

if (r.evidence_request_policy.query_selection_mode !== 'query_id') {

throw new Error(

`Invalid query_selection_mode '${r.evidence_request_policy.query_selection_mode}', expected 'query_id'`,

);

}

const evidenceRequestPolicy: EvidenceRequestPolicy = {

querySelectionMode: r.evidence_request_policy.query_selection_mode,

maxRounds: r.evidence_request_policy.max_rounds,

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/services/playbookAgentService/playbookLoader.ts` around lines 303 - 305, The code creates evidenceRequestPolicy with a hard-coded querySelectionMode 'query_id' which overwrites the authored value; update the loader where evidenceRequestPolicy is constructed (evidenceRequestPolicy in playbookLoader) to read the value from the authored config (e.g., r.evidence_request_policy.query_selection_mode) instead of hard-coding, and validate it against allowed options (reject/throw a descriptive error if the value is missing or not in the accepted set) so invalid configs are surfaced rather than silently replaced.

Introduce the foundational type system for the playbook agent framework: playbookTypes.ts — TypeScript types for declarative investigation playbooks: - SemanticVersion with parsing and formatting - CatalogReference for versioned query references (no floating versions) - Playbook config: evidence ontology, labels, decision logic, confidence computation weights, output contracts - PlaybookVerdict and ConfidenceBreakdown result types - QueryResult/QueryResults for evidence store responses interfaces.ts — Abstract contracts that organizations implement: - ILLMAdapter: pluggable LLM provider (OpenAI, Anthropic, local, etc.) - IEvidenceStore: execute pre-approved queries by catalog ID - IArtifactStore: insert-only storage for immutable audit trail - IPlaybookRunner: core runner interface These types enforce the key safety property: playbooks reference queries by catalog_id@version and never contain SQL, table names, or field mappings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

server/services/playbookAgentService/playbookLoader.test.ts (1)

62-157: ⚡ Quick win

Add regressions for query_selection_mode and catalog_type-aware subset checks.

The suite misses two high-impact cases currently handled incorrectly in the loader: invalid query_selection_mode acceptance and cross-ref matching that ignores catalog_type.

Suggested tests

 describe('parsePlaybook', () => {
+  it('rejects invalid query_selection_mode', () => {
+    const bad = {
+      ...MINIMAL_PLAYBOOK,
+      evidence_request_policy: {
+        ...MINIMAL_PLAYBOOK.evidence_request_policy,
+        query_selection_mode: 'round_robin',
+      },
+    };
+
+    expect(() => parsePlaybook(bad)).toThrow('query_selection_mode');
+  });
+
+  it('treats catalog_type as part of allowed-set identity', () => {
+    const bad = {
+      ...MINIMAL_PLAYBOOK,
+      baseline_catalog_refs: [
+        { catalog_id: 'login_patterns', version: '1.0.0', catalog_type: 'query_bundle' },
+      ],
+      allowed_catalog_refs: [
+        { catalog_id: 'login_patterns', version: '1.0.0', catalog_type: 'query' },
+      ],
+    };
+
+    expect(() => parsePlaybook(bad)).toThrow('not found in allowed_catalog_refs');
+  });

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/services/playbookAgentService/playbookLoader.test.ts` around lines 62
- 157, Add two tests to the parsePlaybook suite: one that sets
query_selection_mode on MINIMAL_PLAYBOOK to an invalid value and asserts
parsePlaybook throws (to cover validation of query_selection_mode), and another
that crafts baseline_catalog_refs and allowed_catalog_refs with the same
catalog_id but different catalog_type (e.g., one 'query' vs 'dataset') and
asserts parsePlaybook rejects the baseline because catalog_type must match (to
cover catalog_type-aware subset checks). Use parsePlaybook, MINIMAL_PLAYBOOK,
baseline_catalog_refs, allowed_catalog_refs, and query_selection_mode in the new
cases so they fail if validation is missing.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server/services/playbookAgentService/playbookLoader.ts`:
- Around line 256-257: The function parsePlaybook currently casts raw to
Partial<RawPlaybook> (const r = raw as Partial<RawPlaybook>) and may access
properties when raw is null/undefined; add an explicit guard at the start of
parsePlaybook to verify raw is non-null and is an object (e.g., if (raw == null
|| typeof raw !== 'object') throw a validation/TypeError) before assigning to r
or reading properties; apply the same guard logic for the subsequent property
validations in the block that references r (lines around 259–274) so the
existing validation/error path is used instead of allowing a runtime TypeError
when parsePlaybook(null/undefined) is called.
- Around line 379-395: The optional confidence fields (base_coverage_weight,
additional_coverage_weight, critical_evidence_missing_penalty,
additional_query_bonus_max) are not being validated and can be outside [0,1];
add calls to validateRange for each optional field before building the
ConfidenceComputation object (use validateRange(cc.base_coverage_weight,
'base_coverage_weight') etc.), only invoking validateRange when the field is
defined, then construct ConfidenceComputation with the same defaults (??) as
now.

---

Nitpick comments:
In `@server/services/playbookAgentService/playbookLoader.test.ts`:
- Around line 62-157: Add two tests to the parsePlaybook suite: one that sets
query_selection_mode on MINIMAL_PLAYBOOK to an invalid value and asserts
parsePlaybook throws (to cover validation of query_selection_mode), and another
that crafts baseline_catalog_refs and allowed_catalog_refs with the same
catalog_id but different catalog_type (e.g., one 'query' vs 'dataset') and
asserts parsePlaybook rejects the baseline because catalog_type must match (to
cover catalog_type-aware subset checks). Use parsePlaybook, MINIMAL_PLAYBOOK,
baseline_catalog_refs, allowed_catalog_refs, and query_selection_mode in the new
cases so they fail if validation is missing.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 30645a54-1a30-4411-8edb-bb84b3f29ed5

📥 Commits

Reviewing files that changed from the base of the PR and between 3a824d9 and f40b387.

📒 Files selected for processing (4)

server/services/playbookAgentService/interfaces.ts
server/services/playbookAgentService/playbookLoader.test.ts
server/services/playbookAgentService/playbookLoader.ts
server/services/playbookAgentService/playbookTypes.ts

coderabbitai · 2026-05-23T02:00:31Z

+export function parsePlaybook(raw: unknown): Playbook {
+  const r = raw as Partial<RawPlaybook>;


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard raw before property access.

parsePlaybook(null)/parsePlaybook(undefined) can throw a runtime TypeError before your validation error path is reached.

Proposed fix

export function parsePlaybook(raw: unknown): Playbook { + if (typeof raw !== 'object' || raw == null) { + throw new Error('Playbook must be an object'); + } const r = raw as Partial<RawPlaybook>;

Also applies to: 259-274

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/services/playbookAgentService/playbookLoader.ts` around lines 256 - 257, The function parsePlaybook currently casts raw to Partial<RawPlaybook> (const r = raw as Partial<RawPlaybook>) and may access properties when raw is null/undefined; add an explicit guard at the start of parsePlaybook to verify raw is non-null and is an object (e.g., if (raw == null || typeof raw !== 'object') throw a validation/TypeError) before assigning to r or reading properties; apply the same guard logic for the subsequent property validations in the block that references r (lines around 259–274) so the existing validation/error path is used instead of allowing a runtime TypeError when parsePlaybook(null/undefined) is called.

coderabbitai · 2026-05-23T02:00:31Z

+  validateRange(cc.llm_uncertainty_alpha, 'llm_uncertainty_alpha');
+  validateRange(cc.evidence_completeness_weight, 'evidence_completeness_weight');
+  validateRange(cc.signal_strength_weight, 'signal_strength_weight');
+  validateRange(cc.signal_agreement_weight, 'signal_agreement_weight');
+  validateRange(cc.data_quality_weight, 'data_quality_weight');
+
+  const confidenceComputation: ConfidenceComputation = {
+    evidenceCompletenessWeight: cc.evidence_completeness_weight,
+    signalStrengthWeight: cc.signal_strength_weight,
+    signalAgreementWeight: cc.signal_agreement_weight,
+    dataQualityWeight: cc.data_quality_weight,
+    llmUncertaintyAlpha: cc.llm_uncertainty_alpha,
+    baseCoverageWeight: cc.base_coverage_weight ?? 0.7,
+    additionalCoverageWeight: cc.additional_coverage_weight ?? 0.3,
+    criticalEvidenceMissingPenalty: cc.critical_evidence_missing_penalty ?? 0.3,
+    additionalQueryBonusMax: cc.additional_query_bonus_max ?? 0.2,
+  };


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate optional confidence fields when provided.

Only the required confidence fields are range-checked. Optional authored values can currently be out of [0, 1] and still be accepted.

Proposed fix

validateRange(cc.llm_uncertainty_alpha, 'llm_uncertainty_alpha'); validateRange(cc.evidence_completeness_weight, 'evidence_completeness_weight'); validateRange(cc.signal_strength_weight, 'signal_strength_weight'); validateRange(cc.signal_agreement_weight, 'signal_agreement_weight'); validateRange(cc.data_quality_weight, 'data_quality_weight'); + + const baseCoverageWeight = cc.base_coverage_weight ?? 0.7; + const additionalCoverageWeight = cc.additional_coverage_weight ?? 0.3; + const criticalEvidenceMissingPenalty = + cc.critical_evidence_missing_penalty ?? 0.3; + const additionalQueryBonusMax = cc.additional_query_bonus_max ?? 0.2; + + validateRange(baseCoverageWeight, 'base_coverage_weight'); + validateRange(additionalCoverageWeight, 'additional_coverage_weight'); + validateRange( + criticalEvidenceMissingPenalty, + 'critical_evidence_missing_penalty', + ); + validateRange(additionalQueryBonusMax, 'additional_query_bonus_max'); const confidenceComputation: ConfidenceComputation = { evidenceCompletenessWeight: cc.evidence_completeness_weight, signalStrengthWeight: cc.signal_strength_weight, signalAgreementWeight: cc.signal_agreement_weight, dataQualityWeight: cc.data_quality_weight, llmUncertaintyAlpha: cc.llm_uncertainty_alpha, - baseCoverageWeight: cc.base_coverage_weight ?? 0.7, - additionalCoverageWeight: cc.additional_coverage_weight ?? 0.3, - criticalEvidenceMissingPenalty: cc.critical_evidence_missing_penalty ?? 0.3, - additionalQueryBonusMax: cc.additional_query_bonus_max ?? 0.2, + baseCoverageWeight, + additionalCoverageWeight, + criticalEvidenceMissingPenalty, + additionalQueryBonusMax, };

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

validateRange(cc.llm_uncertainty_alpha, 'llm_uncertainty_alpha');

validateRange(cc.evidence_completeness_weight, 'evidence_completeness_weight');

validateRange(cc.signal_strength_weight, 'signal_strength_weight');

validateRange(cc.signal_agreement_weight, 'signal_agreement_weight');

validateRange(cc.data_quality_weight, 'data_quality_weight');

const confidenceComputation: ConfidenceComputation = {

evidenceCompletenessWeight: cc.evidence_completeness_weight,

signalStrengthWeight: cc.signal_strength_weight,

signalAgreementWeight: cc.signal_agreement_weight,

dataQualityWeight: cc.data_quality_weight,

llmUncertaintyAlpha: cc.llm_uncertainty_alpha,

baseCoverageWeight: cc.base_coverage_weight ?? 0.7,

additionalCoverageWeight: cc.additional_coverage_weight ?? 0.3,

criticalEvidenceMissingPenalty: cc.critical_evidence_missing_penalty ?? 0.3,

additionalQueryBonusMax: cc.additional_query_bonus_max ?? 0.2,

};

validateRange(cc.llm_uncertainty_alpha, 'llm_uncertainty_alpha');

validateRange(cc.evidence_completeness_weight, 'evidence_completeness_weight');

validateRange(cc.signal_strength_weight, 'signal_strength_weight');

validateRange(cc.signal_agreement_weight, 'signal_agreement_weight');

validateRange(cc.data_quality_weight, 'data_quality_weight');

const baseCoverageWeight = cc.base_coverage_weight ?? 0.7;

const additionalCoverageWeight = cc.additional_coverage_weight ?? 0.3;

const criticalEvidenceMissingPenalty =

cc.critical_evidence_missing_penalty ?? 0.3;

const additionalQueryBonusMax = cc.additional_query_bonus_max ?? 0.2;

validateRange(baseCoverageWeight, 'base_coverage_weight');

validateRange(additionalCoverageWeight, 'additional_coverage_weight');

validateRange(

criticalEvidenceMissingPenalty,

'critical_evidence_missing_penalty',

);

validateRange(additionalQueryBonusMax, 'additional_query_bonus_max');

const confidenceComputation: ConfidenceComputation = {

evidenceCompletenessWeight: cc.evidence_completeness_weight,

signalStrengthWeight: cc.signal_strength_weight,

signalAgreementWeight: cc.signal_agreement_weight,

dataQualityWeight: cc.data_quality_weight,

llmUncertaintyAlpha: cc.llm_uncertainty_alpha,

baseCoverageWeight,

additionalCoverageWeight,

criticalEvidenceMissingPenalty,

additionalQueryBonusMax,

};

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/services/playbookAgentService/playbookLoader.ts` around lines 379 - 395, The optional confidence fields (base_coverage_weight, additional_coverage_weight, critical_evidence_missing_penalty, additional_query_bonus_max) are not being validated and can be outside [0,1]; add calls to validateRange for each optional field before building the ConfidenceComputation object (use validateRange(cc.base_coverage_weight, 'base_coverage_weight') etc.), only invoking validateRange when the field is defined, then construct ConfidenceComputation with the same defaults (??) as now.

Copilot AI review requested due to automatic review settings May 23, 2026 00:05

rashmiraghunandan requested review from cassidyjames, dom-notion, juanmrad, julietshen and vinaysrao1 as code owners May 23, 2026 00:05

rashmiraghunandan mentioned this pull request May 23, 2026

Playbook Agent: query catalog (3/8) #574

Open

1 task

Copilot AI reviewed May 23, 2026

View reviewed changes

coderabbitai Bot reviewed May 23, 2026

View reviewed changes

rashmiraghunandan and others added 2 commits May 22, 2026 18:00

Add playbook loader with YAML parsing and cross-reference validation

f40b387

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

rashmiraghunandan force-pushed the playbook-agent/02-playbook-loader branch from 3a824d9 to f40b387 Compare May 23, 2026 01:56

coderabbitai Bot reviewed May 23, 2026

View reviewed changes

cassidyjames changed the title ~~[2/8] Playbook Agent: playbook loader with validation~~ Playbook Agent: playbook loader with validation (2/8) May 26, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Playbook Agent: playbook loader with validation (2/8)#573

Playbook Agent: playbook loader with validation (2/8)#573
rashmiraghunandan wants to merge 2 commits into
roostorg:mainfrom
rashmiraghunandan:playbook-agent/02-playbook-loader

rashmiraghunandan commented May 23, 2026 •

edited by cassidyjames

Loading

Uh oh!

coderabbitai Bot commented May 23, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

Copilot AI left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot May 23, 2026

Uh oh!

coderabbitai Bot May 23, 2026

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 23, 2026

Uh oh!

coderabbitai Bot May 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		export function parsePlaybook(raw: unknown): Playbook {
		const r = raw as Partial<RawPlaybook>;

Conversation

rashmiraghunandan commented May 23, 2026 • edited by cassidyjames Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

playbookLoader.ts

playbookLoader.test.ts — 7 tests

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot May 23, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 23, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 23, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 23, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

rashmiraghunandan commented May 23, 2026 •

edited by cassidyjames

Loading

`playbookLoader.ts`

`playbookLoader.test.ts` — 7 tests

coderabbitai Bot commented May 23, 2026 •

edited

Loading