Playbook Agent: playbook loader with validation (2/8)#573
Playbook Agent: playbook loader with validation (2/8)#573rashmiraghunandan wants to merge 2 commits into
Conversation
📝 WalkthroughWalkthroughThis PR adds core Playbook domain types and utilities, a playbook parser with cross-reference validation and aggregate error reporting plus tests, and service-layer TypeScript interfaces for LLM, evidence store, artifact storage, and playbook orchestration. ChangesPlaybook Agent Framework
Sequence Diagram(s)sequenceDiagram
participant Loader as parsePlaybook
participant RawTypes as RawPlaybook
participant Converter as toCatalogReference
participant CrossRef as validatePlaybookCrossRefs
participant Result as Playbook
Loader->>RawTypes: validate raw fields
RawTypes->>Converter: convert catalog refs
Converter->>Loader: parse semantic versions
Loader->>CrossRef: check label IDs, refs, weights
CrossRef->>CrossRef: baseline subset of allowed
CrossRef->>CrossRef: validate hard rule outcomes
CrossRef->>CrossRef: confidence weight sum ~1.0
CrossRef->>Result: return parsed Playbook
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Introduces a typed, versioned “playbook” schema and a YAML-to-Playbook loader with validation to support deterministic risk investigation workflows.
Changes:
- Added TypeScript schema/types for playbooks, catalog references, confidence computation, and results.
- Implemented a playbook loader that converts snake_case YAML objects into typed Playbook objects and validates cross-references.
- Added unit tests for loader validation behavior and defaults; defined public adapter interfaces for LLM/evidence/artifact stores.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| server/services/playbookAgentService/playbookTypes.ts | Adds core playbook/type system, semantic version helpers, and catalog reference validation utilities. |
| server/services/playbookAgentService/playbookLoader.ts | Adds YAML object parsing + structural/cross-ref validation to produce validated Playbook objects. |
| server/services/playbookAgentService/playbookLoader.test.ts | Adds unit coverage for successful parsing, defaults, and key validation failures. |
| server/services/playbookAgentService/interfaces.ts | Adds framework adapter interfaces for LLM, evidence execution, artifact storage, and runner API. |
| function toCatalogReference(raw: RawCatalogRef): CatalogReference { | ||
| const ref: CatalogReference = { | ||
| catalogId: raw.catalog_id, | ||
| version: parseSemanticVersion(raw.version), | ||
| catalogType: raw.catalog_type as CatalogType, | ||
| }; | ||
| validateCatalogReference(ref); | ||
| return ref; | ||
| } |
| const evidenceRequestPolicy: EvidenceRequestPolicy = { | ||
| querySelectionMode: 'query_id', | ||
| maxRounds: r.evidence_request_policy.max_rounds, | ||
| targetConfidence: r.evidence_request_policy.target_confidence, | ||
| maxAdditionalQueries: r.evidence_request_policy.max_additional_queries, | ||
| stopIfNoNewEvidence: r.evidence_request_policy.stop_if_no_new_evidence, | ||
| triggerDeepDiveIf: r.evidence_request_policy.trigger_deep_dive_if, | ||
| stopInvestigationIf: r.evidence_request_policy.stop_investigation_if, | ||
| guidance: r.evidence_request_policy.guidance, | ||
| }; |
| export type EvidenceQuery = { | ||
| readonly catalogId: string; | ||
| readonly version: string; | ||
| readonly parameters: Record<string, unknown>; | ||
| }; |
| describe('parsePlaybook', () => { | ||
| it('parses a valid playbook', () => { | ||
| const playbook = parsePlaybook(MINIMAL_PLAYBOOK); | ||
|
|
||
| expect(playbook.useCaseId).toBe('test_investigation'); | ||
| expect(playbook.version).toEqual({ major: 1, minor: 0, patch: 0 }); | ||
| expect(playbook.baselineCatalogRefs).toHaveLength(1); | ||
| expect(playbook.allowedCatalogRefs).toHaveLength(2); | ||
| expect(playbook.labels).toHaveLength(2); | ||
| expect(playbook.confidenceComputation.llmUncertaintyAlpha).toBe(0.4); | ||
| }); |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (2)
server/services/playbookAgentService/interfaces.ts (1)
42-45: 🏗️ Heavy liftUse domain-level catalog/version types in
EvidenceQueryto prevent contract drift.
catalogIdandversionare currently plainstring, which weakens guarantees already modeled in the domain layer. Prefer reusing the typed catalog/version entities fromplaybookTypes.tsin this interface contract.Proposed direction
-import type { PlaybookResult, QueryResult } from './playbookTypes.js'; +import type { + PlaybookResult, + QueryResult, + CatalogReference, + SemanticVersion, +} from './playbookTypes.js'; export type EvidenceQuery = { - readonly catalogId: string; - readonly version: string; + readonly catalogId: CatalogReference['catalogId']; + readonly version: SemanticVersion; readonly parameters: Record<string, unknown>; };🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/services/playbookAgentService/interfaces.ts` around lines 42 - 45, The EvidenceQuery type uses plain strings for catalogId and version; replace those with the domain-level types exported from playbookTypes.ts (e.g., CatalogId and CatalogVersion or the corresponding names used there) by importing them into server/services/playbookAgentService/interfaces.ts and updating EvidenceQuery's catalogId and version properties to use those imported types while leaving parameters as Record<string, unknown>.server/services/playbookAgentService/playbookLoader.test.ts (1)
62-157: ⚡ Quick winAdd regression tests for catalog type and query-selection validation paths.
Given the parser invariants, add negative tests for: invalid
catalog_type, mismatchedcatalogTypebetween baseline and allowed refs, and unsupportedquery_selection_mode.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/services/playbookAgentService/playbookLoader.test.ts` around lines 62 - 157, Add three negative unit tests in the parsePlaybook suite using MINIMAL_PLAYBOOK: one that sets a baseline_catalog_refs entry with an invalid catalog_type (e.g., 'invalid_type') and asserts parsePlaybook throws about invalid catalog_type; one that creates a mismatch between baseline_catalog_refs[0].catalog_type and an allowed_catalog_refs entry’s catalog_type (different values) and asserts parsePlaybook throws about mismatched catalog types; and one that sets query_selection_mode to an unsupported value (e.g., 'UNKNOWN_MODE') and asserts parsePlaybook throws about unsupported query_selection_mode. Reference parsePlaybook, MINIMAL_PLAYBOOK, baseline_catalog_refs, allowed_catalog_refs, and query_selection_mode to locate where to add each test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@server/services/playbookAgentService/playbookLoader.ts`:
- Around line 303-305: The code creates evidenceRequestPolicy with a hard-coded
querySelectionMode 'query_id' which overwrites the authored value; update the
loader where evidenceRequestPolicy is constructed (evidenceRequestPolicy in
playbookLoader) to read the value from the authored config (e.g.,
r.evidence_request_policy.query_selection_mode) instead of hard-coding, and
validate it against allowed options (reject/throw a descriptive error if the
value is missing or not in the accepted set) so invalid configs are surfaced
rather than silently replaced.
- Around line 139-146: The function toCatalogReference currently type-casts
raw.catalog_type to CatalogType which skips runtime checks; instead perform a
runtime validation/parsing of raw.catalog_type (e.g., implement or call an
isValid/parse function) and set ref.catalogType only if the value is one of the
allowed CatalogType members, otherwise throw a descriptive error; keep
parseSemanticVersion(raw.version) and validateCatalogReference(ref) but remove
the "as CatalogType" cast and replace it with a runtime-validated assignment
(for example add a parseCatalogType(raw.catalog_type) or isCatalogType check
before constructing/returning the CatalogReference).
- Around line 177-199: The subset check builds allowedIds using
formatCatalogReference but only includes catalogId@version, allowing mismatched
catalogType to be accepted; update the matching key to incorporate catalogType
(e.g., catalogType:catalogId@version) and use that consistently when populating
allowedIds and when checking playbook.baselineCatalogRefs and each mod.queryRefs
in playbook.deepDiveModules; either extend formatCatalogReference to include
catalogType or add a new helper (referenced by allowedIds,
formatCatalogReference, playbook.allowedCatalogRefs,
playbook.baselineCatalogRefs, playbook.deepDiveModules, and mod.queryRefs) so
both creation and lookup use the same catalogType-aware key.
In `@server/services/playbookAgentService/playbookTypes.ts`:
- Around line 24-37: The parseSemanticVersion function currently uses
Number(...) which accepts malformed tokens (e.g., empty strings, "+1", " 1"), so
update parseSemanticVersion to first validate each token with a strict regex
(e.g., /^\d+$/) before converting to Number; if any token fails the regex, throw
the existing Invalid semantic version error (include input). Then convert tokens
to integers, check they are non-negative integers as before, and return the
SemanticVersion ({ major, minor, patch }). This change ensures
parseSemanticVersion rejects empty or non-digit tokens and only accepts strictly
numeric components.
---
Nitpick comments:
In `@server/services/playbookAgentService/interfaces.ts`:
- Around line 42-45: The EvidenceQuery type uses plain strings for catalogId and
version; replace those with the domain-level types exported from
playbookTypes.ts (e.g., CatalogId and CatalogVersion or the corresponding names
used there) by importing them into
server/services/playbookAgentService/interfaces.ts and updating EvidenceQuery's
catalogId and version properties to use those imported types while leaving
parameters as Record<string, unknown>.
In `@server/services/playbookAgentService/playbookLoader.test.ts`:
- Around line 62-157: Add three negative unit tests in the parsePlaybook suite
using MINIMAL_PLAYBOOK: one that sets a baseline_catalog_refs entry with an
invalid catalog_type (e.g., 'invalid_type') and asserts parsePlaybook throws
about invalid catalog_type; one that creates a mismatch between
baseline_catalog_refs[0].catalog_type and an allowed_catalog_refs entry’s
catalog_type (different values) and asserts parsePlaybook throws about
mismatched catalog types; and one that sets query_selection_mode to an
unsupported value (e.g., 'UNKNOWN_MODE') and asserts parsePlaybook throws about
unsupported query_selection_mode. Reference parsePlaybook, MINIMAL_PLAYBOOK,
baseline_catalog_refs, allowed_catalog_refs, and query_selection_mode to locate
where to add each test.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 53afe152-6371-47da-9b81-a20ec4a16843
📒 Files selected for processing (4)
server/services/playbookAgentService/interfaces.tsserver/services/playbookAgentService/playbookLoader.test.tsserver/services/playbookAgentService/playbookLoader.tsserver/services/playbookAgentService/playbookTypes.ts
| const allowedIds = new Set( | ||
| playbook.allowedCatalogRefs.map(formatCatalogReference), | ||
| ); | ||
| for (const ref of playbook.baselineCatalogRefs) { | ||
| const refStr = formatCatalogReference(ref); | ||
| if (!allowedIds.has(refStr)) { | ||
| errors.push({ | ||
| field: 'baselineCatalogRefs', | ||
| message: `Baseline ref '${refStr}' not found in allowed_catalog_refs`, | ||
| }); | ||
| } | ||
| } | ||
|
|
||
| // deep-dive query refs must be in allowed set | ||
| if (playbook.deepDiveModules) { | ||
| for (const mod of playbook.deepDiveModules) { | ||
| for (const ref of mod.queryRefs) { | ||
| const refStr = formatCatalogReference(ref); | ||
| if (!allowedIds.has(refStr)) { | ||
| errors.push({ | ||
| field: `deepDiveModules.${mod.moduleId}`, | ||
| message: `Deep-dive ref '${refStr}' not found in allowed_catalog_refs`, | ||
| }); |
There was a problem hiding this comment.
Cross-reference keys should include catalogType.
Current subset checks compare only catalogId@version, so mismatched catalogType values can be treated as valid matches.
Suggested fix
+const referenceKey = (ref: CatalogReference): string =>
+ `${ref.catalogType}:${formatCatalogReference(ref)}`;
+
function validatePlaybookCrossRefs(playbook: Playbook): PlaybookValidationError[] {
const errors: PlaybookValidationError[] = [];
// baseline_catalog_refs must be a subset of allowed_catalog_refs
- const allowedIds = new Set(
- playbook.allowedCatalogRefs.map(formatCatalogReference),
- );
+ const allowedIds = new Set(playbook.allowedCatalogRefs.map(referenceKey));
for (const ref of playbook.baselineCatalogRefs) {
- const refStr = formatCatalogReference(ref);
- if (!allowedIds.has(refStr)) {
+ const refStr = referenceKey(ref);
+ if (!allowedIds.has(refStr)) {
errors.push({
field: 'baselineCatalogRefs',
message: `Baseline ref '${refStr}' not found in allowed_catalog_refs`,
});
}
}
@@
for (const ref of mod.queryRefs) {
- const refStr = formatCatalogReference(ref);
+ const refStr = referenceKey(ref);
if (!allowedIds.has(refStr)) {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const allowedIds = new Set( | |
| playbook.allowedCatalogRefs.map(formatCatalogReference), | |
| ); | |
| for (const ref of playbook.baselineCatalogRefs) { | |
| const refStr = formatCatalogReference(ref); | |
| if (!allowedIds.has(refStr)) { | |
| errors.push({ | |
| field: 'baselineCatalogRefs', | |
| message: `Baseline ref '${refStr}' not found in allowed_catalog_refs`, | |
| }); | |
| } | |
| } | |
| // deep-dive query refs must be in allowed set | |
| if (playbook.deepDiveModules) { | |
| for (const mod of playbook.deepDiveModules) { | |
| for (const ref of mod.queryRefs) { | |
| const refStr = formatCatalogReference(ref); | |
| if (!allowedIds.has(refStr)) { | |
| errors.push({ | |
| field: `deepDiveModules.${mod.moduleId}`, | |
| message: `Deep-dive ref '${refStr}' not found in allowed_catalog_refs`, | |
| }); | |
| const referenceKey = (ref: CatalogReference): string => | |
| `${ref.catalogType}:${formatCatalogReference(ref)}`; | |
| const allowedIds = new Set( | |
| playbook.allowedCatalogRefs.map(referenceKey), | |
| ); | |
| for (const ref of playbook.baselineCatalogRefs) { | |
| const refStr = referenceKey(ref); | |
| if (!allowedIds.has(refStr)) { | |
| errors.push({ | |
| field: 'baselineCatalogRefs', | |
| message: `Baseline ref '${refStr}' not found in allowed_catalog_refs`, | |
| }); | |
| } | |
| } | |
| // deep-dive query refs must be in allowed set | |
| if (playbook.deepDiveModules) { | |
| for (const mod of playbook.deepDiveModules) { | |
| for (const ref of mod.queryRefs) { | |
| const refStr = referenceKey(ref); | |
| if (!allowedIds.has(refStr)) { | |
| errors.push({ | |
| field: `deepDiveModules.${mod.moduleId}`, | |
| message: `Deep-dive ref '${refStr}' not found in allowed_catalog_refs`, | |
| }); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/services/playbookAgentService/playbookLoader.ts` around lines 177 -
199, The subset check builds allowedIds using formatCatalogReference but only
includes catalogId@version, allowing mismatched catalogType to be accepted;
update the matching key to incorporate catalogType (e.g.,
catalogType:catalogId@version) and use that consistently when populating
allowedIds and when checking playbook.baselineCatalogRefs and each mod.queryRefs
in playbook.deepDiveModules; either extend formatCatalogReference to include
catalogType or add a new helper (referenced by allowedIds,
formatCatalogReference, playbook.allowedCatalogRefs,
playbook.baselineCatalogRefs, playbook.deepDiveModules, and mod.queryRefs) so
both creation and lookup use the same catalogType-aware key.
| const evidenceRequestPolicy: EvidenceRequestPolicy = { | ||
| querySelectionMode: 'query_id', | ||
| maxRounds: r.evidence_request_policy.max_rounds, |
There was a problem hiding this comment.
Do not silently overwrite query_selection_mode.
The loader currently ignores the authored value and always sets 'query_id', which hides invalid configs instead of rejecting them.
Suggested fix
+ if (r.evidence_request_policy.query_selection_mode !== 'query_id') {
+ throw new Error(
+ `Invalid query_selection_mode '${r.evidence_request_policy.query_selection_mode}', expected 'query_id'`,
+ );
+ }
+
const evidenceRequestPolicy: EvidenceRequestPolicy = {
- querySelectionMode: 'query_id',
+ querySelectionMode: r.evidence_request_policy.query_selection_mode,
maxRounds: r.evidence_request_policy.max_rounds,📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const evidenceRequestPolicy: EvidenceRequestPolicy = { | |
| querySelectionMode: 'query_id', | |
| maxRounds: r.evidence_request_policy.max_rounds, | |
| if (r.evidence_request_policy.query_selection_mode !== 'query_id') { | |
| throw new Error( | |
| `Invalid query_selection_mode '${r.evidence_request_policy.query_selection_mode}', expected 'query_id'`, | |
| ); | |
| } | |
| const evidenceRequestPolicy: EvidenceRequestPolicy = { | |
| querySelectionMode: r.evidence_request_policy.query_selection_mode, | |
| maxRounds: r.evidence_request_policy.max_rounds, |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/services/playbookAgentService/playbookLoader.ts` around lines 303 -
305, The code creates evidenceRequestPolicy with a hard-coded querySelectionMode
'query_id' which overwrites the authored value; update the loader where
evidenceRequestPolicy is constructed (evidenceRequestPolicy in playbookLoader)
to read the value from the authored config (e.g.,
r.evidence_request_policy.query_selection_mode) instead of hard-coding, and
validate it against allowed options (reject/throw a descriptive error if the
value is missing or not in the accepted set) so invalid configs are surfaced
rather than silently replaced.
Introduce the foundational type system for the playbook agent framework:
playbookTypes.ts — TypeScript types for declarative investigation playbooks:
- SemanticVersion with parsing and formatting
- CatalogReference for versioned query references (no floating versions)
- Playbook config: evidence ontology, labels, decision logic, confidence
computation weights, output contracts
- PlaybookVerdict and ConfidenceBreakdown result types
- QueryResult/QueryResults for evidence store responses
interfaces.ts — Abstract contracts that organizations implement:
- ILLMAdapter: pluggable LLM provider (OpenAI, Anthropic, local, etc.)
- IEvidenceStore: execute pre-approved queries by catalog ID
- IArtifactStore: insert-only storage for immutable audit trail
- IPlaybookRunner: core runner interface
These types enforce the key safety property: playbooks reference queries by
catalog_id@version and never contain SQL, table names, or field mappings.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3a824d9 to
f40b387
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
server/services/playbookAgentService/playbookLoader.test.ts (1)
62-157: ⚡ Quick winAdd regressions for
query_selection_modeandcatalog_type-aware subset checks.The suite misses two high-impact cases currently handled incorrectly in the loader: invalid
query_selection_modeacceptance and cross-ref matching that ignorescatalog_type.Suggested tests
describe('parsePlaybook', () => { + it('rejects invalid query_selection_mode', () => { + const bad = { + ...MINIMAL_PLAYBOOK, + evidence_request_policy: { + ...MINIMAL_PLAYBOOK.evidence_request_policy, + query_selection_mode: 'round_robin', + }, + }; + + expect(() => parsePlaybook(bad)).toThrow('query_selection_mode'); + }); + + it('treats catalog_type as part of allowed-set identity', () => { + const bad = { + ...MINIMAL_PLAYBOOK, + baseline_catalog_refs: [ + { catalog_id: 'login_patterns', version: '1.0.0', catalog_type: 'query_bundle' }, + ], + allowed_catalog_refs: [ + { catalog_id: 'login_patterns', version: '1.0.0', catalog_type: 'query' }, + ], + }; + + expect(() => parsePlaybook(bad)).toThrow('not found in allowed_catalog_refs'); + });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/services/playbookAgentService/playbookLoader.test.ts` around lines 62 - 157, Add two tests to the parsePlaybook suite: one that sets query_selection_mode on MINIMAL_PLAYBOOK to an invalid value and asserts parsePlaybook throws (to cover validation of query_selection_mode), and another that crafts baseline_catalog_refs and allowed_catalog_refs with the same catalog_id but different catalog_type (e.g., one 'query' vs 'dataset') and asserts parsePlaybook rejects the baseline because catalog_type must match (to cover catalog_type-aware subset checks). Use parsePlaybook, MINIMAL_PLAYBOOK, baseline_catalog_refs, allowed_catalog_refs, and query_selection_mode in the new cases so they fail if validation is missing.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@server/services/playbookAgentService/playbookLoader.ts`:
- Around line 256-257: The function parsePlaybook currently casts raw to
Partial<RawPlaybook> (const r = raw as Partial<RawPlaybook>) and may access
properties when raw is null/undefined; add an explicit guard at the start of
parsePlaybook to verify raw is non-null and is an object (e.g., if (raw == null
|| typeof raw !== 'object') throw a validation/TypeError) before assigning to r
or reading properties; apply the same guard logic for the subsequent property
validations in the block that references r (lines around 259–274) so the
existing validation/error path is used instead of allowing a runtime TypeError
when parsePlaybook(null/undefined) is called.
- Around line 379-395: The optional confidence fields (base_coverage_weight,
additional_coverage_weight, critical_evidence_missing_penalty,
additional_query_bonus_max) are not being validated and can be outside [0,1];
add calls to validateRange for each optional field before building the
ConfidenceComputation object (use validateRange(cc.base_coverage_weight,
'base_coverage_weight') etc.), only invoking validateRange when the field is
defined, then construct ConfidenceComputation with the same defaults (??) as
now.
---
Nitpick comments:
In `@server/services/playbookAgentService/playbookLoader.test.ts`:
- Around line 62-157: Add two tests to the parsePlaybook suite: one that sets
query_selection_mode on MINIMAL_PLAYBOOK to an invalid value and asserts
parsePlaybook throws (to cover validation of query_selection_mode), and another
that crafts baseline_catalog_refs and allowed_catalog_refs with the same
catalog_id but different catalog_type (e.g., one 'query' vs 'dataset') and
asserts parsePlaybook rejects the baseline because catalog_type must match (to
cover catalog_type-aware subset checks). Use parsePlaybook, MINIMAL_PLAYBOOK,
baseline_catalog_refs, allowed_catalog_refs, and query_selection_mode in the new
cases so they fail if validation is missing.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 30645a54-1a30-4411-8edb-bb84b3f29ed5
📒 Files selected for processing (4)
server/services/playbookAgentService/interfaces.tsserver/services/playbookAgentService/playbookLoader.test.tsserver/services/playbookAgentService/playbookLoader.tsserver/services/playbookAgentService/playbookTypes.ts
| export function parsePlaybook(raw: unknown): Playbook { | ||
| const r = raw as Partial<RawPlaybook>; |
There was a problem hiding this comment.
Guard raw before property access.
parsePlaybook(null)/parsePlaybook(undefined) can throw a runtime TypeError before your validation error path is reached.
Proposed fix
export function parsePlaybook(raw: unknown): Playbook {
+ if (typeof raw !== 'object' || raw == null) {
+ throw new Error('Playbook must be an object');
+ }
const r = raw as Partial<RawPlaybook>;Also applies to: 259-274
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/services/playbookAgentService/playbookLoader.ts` around lines 256 -
257, The function parsePlaybook currently casts raw to Partial<RawPlaybook>
(const r = raw as Partial<RawPlaybook>) and may access properties when raw is
null/undefined; add an explicit guard at the start of parsePlaybook to verify
raw is non-null and is an object (e.g., if (raw == null || typeof raw !==
'object') throw a validation/TypeError) before assigning to r or reading
properties; apply the same guard logic for the subsequent property validations
in the block that references r (lines around 259–274) so the existing
validation/error path is used instead of allowing a runtime TypeError when
parsePlaybook(null/undefined) is called.
| validateRange(cc.llm_uncertainty_alpha, 'llm_uncertainty_alpha'); | ||
| validateRange(cc.evidence_completeness_weight, 'evidence_completeness_weight'); | ||
| validateRange(cc.signal_strength_weight, 'signal_strength_weight'); | ||
| validateRange(cc.signal_agreement_weight, 'signal_agreement_weight'); | ||
| validateRange(cc.data_quality_weight, 'data_quality_weight'); | ||
|
|
||
| const confidenceComputation: ConfidenceComputation = { | ||
| evidenceCompletenessWeight: cc.evidence_completeness_weight, | ||
| signalStrengthWeight: cc.signal_strength_weight, | ||
| signalAgreementWeight: cc.signal_agreement_weight, | ||
| dataQualityWeight: cc.data_quality_weight, | ||
| llmUncertaintyAlpha: cc.llm_uncertainty_alpha, | ||
| baseCoverageWeight: cc.base_coverage_weight ?? 0.7, | ||
| additionalCoverageWeight: cc.additional_coverage_weight ?? 0.3, | ||
| criticalEvidenceMissingPenalty: cc.critical_evidence_missing_penalty ?? 0.3, | ||
| additionalQueryBonusMax: cc.additional_query_bonus_max ?? 0.2, | ||
| }; |
There was a problem hiding this comment.
Validate optional confidence fields when provided.
Only the required confidence fields are range-checked. Optional authored values can currently be out of [0, 1] and still be accepted.
Proposed fix
validateRange(cc.llm_uncertainty_alpha, 'llm_uncertainty_alpha');
validateRange(cc.evidence_completeness_weight, 'evidence_completeness_weight');
validateRange(cc.signal_strength_weight, 'signal_strength_weight');
validateRange(cc.signal_agreement_weight, 'signal_agreement_weight');
validateRange(cc.data_quality_weight, 'data_quality_weight');
+
+ const baseCoverageWeight = cc.base_coverage_weight ?? 0.7;
+ const additionalCoverageWeight = cc.additional_coverage_weight ?? 0.3;
+ const criticalEvidenceMissingPenalty =
+ cc.critical_evidence_missing_penalty ?? 0.3;
+ const additionalQueryBonusMax = cc.additional_query_bonus_max ?? 0.2;
+
+ validateRange(baseCoverageWeight, 'base_coverage_weight');
+ validateRange(additionalCoverageWeight, 'additional_coverage_weight');
+ validateRange(
+ criticalEvidenceMissingPenalty,
+ 'critical_evidence_missing_penalty',
+ );
+ validateRange(additionalQueryBonusMax, 'additional_query_bonus_max');
const confidenceComputation: ConfidenceComputation = {
evidenceCompletenessWeight: cc.evidence_completeness_weight,
signalStrengthWeight: cc.signal_strength_weight,
signalAgreementWeight: cc.signal_agreement_weight,
dataQualityWeight: cc.data_quality_weight,
llmUncertaintyAlpha: cc.llm_uncertainty_alpha,
- baseCoverageWeight: cc.base_coverage_weight ?? 0.7,
- additionalCoverageWeight: cc.additional_coverage_weight ?? 0.3,
- criticalEvidenceMissingPenalty: cc.critical_evidence_missing_penalty ?? 0.3,
- additionalQueryBonusMax: cc.additional_query_bonus_max ?? 0.2,
+ baseCoverageWeight,
+ additionalCoverageWeight,
+ criticalEvidenceMissingPenalty,
+ additionalQueryBonusMax,
};📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| validateRange(cc.llm_uncertainty_alpha, 'llm_uncertainty_alpha'); | |
| validateRange(cc.evidence_completeness_weight, 'evidence_completeness_weight'); | |
| validateRange(cc.signal_strength_weight, 'signal_strength_weight'); | |
| validateRange(cc.signal_agreement_weight, 'signal_agreement_weight'); | |
| validateRange(cc.data_quality_weight, 'data_quality_weight'); | |
| const confidenceComputation: ConfidenceComputation = { | |
| evidenceCompletenessWeight: cc.evidence_completeness_weight, | |
| signalStrengthWeight: cc.signal_strength_weight, | |
| signalAgreementWeight: cc.signal_agreement_weight, | |
| dataQualityWeight: cc.data_quality_weight, | |
| llmUncertaintyAlpha: cc.llm_uncertainty_alpha, | |
| baseCoverageWeight: cc.base_coverage_weight ?? 0.7, | |
| additionalCoverageWeight: cc.additional_coverage_weight ?? 0.3, | |
| criticalEvidenceMissingPenalty: cc.critical_evidence_missing_penalty ?? 0.3, | |
| additionalQueryBonusMax: cc.additional_query_bonus_max ?? 0.2, | |
| }; | |
| validateRange(cc.llm_uncertainty_alpha, 'llm_uncertainty_alpha'); | |
| validateRange(cc.evidence_completeness_weight, 'evidence_completeness_weight'); | |
| validateRange(cc.signal_strength_weight, 'signal_strength_weight'); | |
| validateRange(cc.signal_agreement_weight, 'signal_agreement_weight'); | |
| validateRange(cc.data_quality_weight, 'data_quality_weight'); | |
| const baseCoverageWeight = cc.base_coverage_weight ?? 0.7; | |
| const additionalCoverageWeight = cc.additional_coverage_weight ?? 0.3; | |
| const criticalEvidenceMissingPenalty = | |
| cc.critical_evidence_missing_penalty ?? 0.3; | |
| const additionalQueryBonusMax = cc.additional_query_bonus_max ?? 0.2; | |
| validateRange(baseCoverageWeight, 'base_coverage_weight'); | |
| validateRange(additionalCoverageWeight, 'additional_coverage_weight'); | |
| validateRange( | |
| criticalEvidenceMissingPenalty, | |
| 'critical_evidence_missing_penalty', | |
| ); | |
| validateRange(additionalQueryBonusMax, 'additional_query_bonus_max'); | |
| const confidenceComputation: ConfidenceComputation = { | |
| evidenceCompletenessWeight: cc.evidence_completeness_weight, | |
| signalStrengthWeight: cc.signal_strength_weight, | |
| signalAgreementWeight: cc.signal_agreement_weight, | |
| dataQualityWeight: cc.data_quality_weight, | |
| llmUncertaintyAlpha: cc.llm_uncertainty_alpha, | |
| baseCoverageWeight, | |
| additionalCoverageWeight, | |
| criticalEvidenceMissingPenalty, | |
| additionalQueryBonusMax, | |
| }; |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/services/playbookAgentService/playbookLoader.ts` around lines 379 -
395, The optional confidence fields (base_coverage_weight,
additional_coverage_weight, critical_evidence_missing_penalty,
additional_query_bonus_max) are not being validated and can be outside [0,1];
add calls to validateRange for each optional field before building the
ConfidenceComputation object (use validateRange(cc.base_coverage_weight,
'base_coverage_weight') etc.), only invoking validateRange when the field is
defined, then construct ConfidenceComputation with the same defaults (??) as
now.
Summary
Playbook loader that converts raw YAML/JSON configs into validated TypeScript
Playbookobjects with comprehensive cross-reference checking.Part 2 of 8 — depends on #572 (types and interfaces).
playbookLoader.tsparsePlaybook(): snake_case YAML input to camelCase TypeScript outputbaseline_catalog_refsmust be a subset ofallowed_catalog_refslatest,head,main)playbookLoader.test.ts— 7 testsTest plan
npm test -- --testPathPattern=playbookLoaderpasses (7 tests)Summary by CodeRabbit
New Features
Tests