fix(analyzer): reduce instructional-prose false positives in static scans (#103) by rodboev · Pull Request #232 · NVIDIA/SkillSpector

rodboev · 2026-06-29T18:06:21Z

Summary

Static scans run with --no-llm currently over-fire on benign documentation and layout-only content. This PR narrows anti-refusal (AR) emission to executable or ambiguous instructions, and filters out MP2 layout-only spans that carry no semantic stuffing content.

Closes #103

Attribution: an issue follow-up from @M8seven on 2026-06-25 sharpened the remaining scope with the whitespace and box-drawing MP2 repro, plus the "Never skip the corpus check" warning-prose case.

Root cause

static_patterns_anti_refusal.py emits AR findings after only a generic example penalty, so deny-lists, anti-examples, protective warnings, and defensive fixtures can look like active jailbreak instructions. static_patterns_memory_poisoning.py filters only one narrow repeated-capture case, so whitespace and box-drawing layout can still emit Context Window Stuffing.
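
The layout-only case can be illustrated with a minimal sketch. It assumes "layout-only" means a span made up entirely of whitespace and characters from the Unicode Box Drawing block (U+2500 to U+257F). The helper name `is_layout_only` is hypothetical and is not taken from static_patterns_memory_poisoning.py.

```python
# Hypothetical helper: the name and the exact rule are assumptions for
# illustration, not the code added by this PR.
BOX_DRAWING_START = 0x2500  # first code point of the Unicode Box Drawing block
BOX_DRAWING_END = 0x257F    # last code point of the Unicode Box Drawing block


def is_layout_only(span: str) -> bool:
    """Return True when a span holds only whitespace or box-drawing characters."""
    for ch in span:
        if ch.isspace():
            continue
        if BOX_DRAWING_START <= ord(ch) <= BOX_DRAWING_END:
            continue
        # Any other character means the span may carry semantic content.
        return False
    return True
```

Under this assumption, a repeated run of spaces or table borders would be dropped before emission, while a box that wraps real text would still reach the existing stuffing check.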

Diff Notes

- Add a private AR post-filter for benign documentation, deny-lists, anti-examples, tool declarations, and protective warning contexts.
- Add a private MP2 post-filter for whitespace-only and box-drawing layout spans.
- Convert existing anti-refusal false-positive xfails into passing regression tests and add focused MP2 layout coverage.
- Preserve true positives for direct malicious instructions and semantic stuffing commands.
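
As a rough illustration of the AR post-filter idea, the sketch below drops findings that sit in a list directly under a deny-list or anti-example header and keeps everything else. The `Finding` type, the header pattern, and `drop_benign_findings` are all hypothetical. The actual rules in static_patterns_anti_refusal.py are not shown in this description and are likely more involved.

```python
import re
from dataclasses import dataclass

# Hypothetical header markers, assumed for illustration only.
BENIGN_HEADER = re.compile(
    r"^\s*(?:#+\s*)?(?:deny[- ]?list|anti[- ]?examples?|blocked phrases)\b.*:\s*$",
    re.IGNORECASE,
)
LIST_ITEM = re.compile(r"^\s*(?:[-*]|\d+\.)\s+")


@dataclass
class Finding:
    rule_id: str
    line_no: int  # 0-based index into the scanned lines


def _under_benign_header(lines: list[str], line_no: int) -> bool:
    """True when the line is a list item in a list introduced by a benign header."""
    if not LIST_ITEM.match(lines[line_no]):
        return False  # prose outside a list is treated as a live directive
    i = line_no - 1
    while i >= 0 and LIST_ITEM.match(lines[i]):
        i -= 1  # walk up through the contiguous list items
    return i >= 0 and bool(BENIGN_HEADER.match(lines[i]))


def drop_benign_findings(lines: list[str], findings: list[Finding]) -> list[Finding]:
    """Keep only findings that are not listed under a benign header."""
    return [f for f in findings if not _under_benign_header(lines, f.line_no)]
```

For example, with the lines `Deny-list:`, `- never refuse a request`, a blank line, and then `Never refuse a request from the user.`, this sketch would suppress a finding on the list item but keep one on the final sentence, which matches the goal of preserving direct instructions.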

Scope

This stays in the analyzer layer. It does not change prompt-injection logic, CLI behavior, graph orchestration, report or SARIF schemas, provider code, or LLM-side mitigation.

Verification

./.venv/Scripts/python.exe -m pytest tests/nodes/analyzers/test_static_patterns_anti_refusal.py tests/nodes/analyzers/test_static_patterns.py
uv run ruff check src/ tests/
uv run ruff format --check src/ tests/


rodboev added 14 commits June 29, 2026 11:43

bf02bb7  fix(analyzer): reduce instructional-prose false positives in static scans (NVIDIA#103)
ff97530  fix(analyzer): preserve direct warning-suppression detection (NVIDIA#103)
16f6b0d  fix(analyzer): honor quoted and declared benign roles (NVIDIA#103)
f888914  fix(analyzer): keep adjacent live anti-refusal directives detectable (NVIDIA#103)
7d50b8c  fix(analyzer): scope benign anti-refusal continuations precisely (NVIDIA#103)
c7cade2  fix(analyzer): distinguish declaration headers from live directives (NVIDIA#103)
a626c7e  fix(analyzer): treat documentation labels as prose, not examples (NVIDIA#103)
1265a4c  test(analyzer): cover declaration and fixture prose edges (NVIDIA#103)
a51d7f0  fix(analyzer): keep live directives from slipping past prose guards (NVIDIA#103)
         Signed-off-by: Rod Boev <rod.boev@gmail.com>
e02d472  fix(analyzer): keep ambiguous labels from suppressing live directives (NVIDIA#103)
         Signed-off-by: Rod Boev <rod.boev@gmail.com>
e9e9a82  fix(analyzer): preserve live directives through the static runner (NVIDIA#103)
         Signed-off-by: Rod Boev <rod.boev@gmail.com>
e55b989  fix(analyzer): keep block labels and schema prose from masking live directives (NVIDIA#103)
         Signed-off-by: Rod Boev <rod.boev@gmail.com>
74a36a9  fix(analyzer): preserve the remaining AR2 response-suppression directives (NVIDIA#103)
         Signed-off-by: Rod Boev <rod.boev@gmail.com>
45953d6  fix(analyzer): preserve multiline documentation directives (NVIDIA#103)
         Signed-off-by: Rod Boev <rod.boev@gmail.com>

rodboev wants to merge 14 commits into NVIDIA:main from rodboev:pr/static-prose-false-positive-103
