test(ctf): adversarial coverage harness + first detector unit tests #529

Open
genesisadversary wants to merge 1 commit into GenAI-Security-Project:main from genesisadversary:test/adversarial-coverage
@genesisadversary

What

Adds an offline, deterministic adversarial coverage harness for the CTF detectors, plus the first
dedicated unit tests for a production detector.

  • tools/adversarial_fuzzer/: enumerates adversarial scenarios over a lever vocabulary, labels each one with
    a detector-independent OWASP-policy oracle, materializes it into the real data model and event stream, runs
    the production detectors, and reports a TP / FN (gap) / TN / FP coverage matrix.
    Run: uv run python -m tools.adversarial_fuzzer (exits non-zero on gaps, so it can gate CI).
  • tests/unit/ctf/test_invoice_threshold_bypass_detector.py: 10 unit tests for
    InvoiceThresholdBypassDetector (boundary, status-spoofing, custom threshold, graceful handling, config
    validation).
  • tests/unit/ctf/test_adversarial_fuzzer.py: 5 tests pinning the harness behaviour and the known
    coverage gaps.
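
For readers who have not opened the diff, the harness flow described above (enumerate, label with an oracle, run detectors, tabulate) can be sketched roughly as follows. This is an illustrative toy, not the code in tools/adversarial_fuzzer/: the lever names, the threshold value, `toy_detector`, and every function name here are assumptions made up for the example.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import product

# Hypothetical lever vocabulary; the real harness defines its own levers.
LEVERS = {
    "amount": [9_999, 10_000, 10_001],
    "status": ["pending", "approved"],
}
THRESHOLD = 10_000  # assumed policy limit, for this sketch only


@dataclass(frozen=True)
class Scenario:
    amount: int
    status: str


def enumerate_scenarios():
    """Yield one scenario per combination of lever values."""
    keys = sorted(LEVERS)
    for values in product(*(LEVERS[k] for k in keys)):
        yield Scenario(**dict(zip(keys, values)))


def oracle(s: Scenario) -> bool:
    """Policy oracle: knows nothing about any detector."""
    return s.amount > THRESHOLD


def toy_detector(s: Scenario) -> bool:
    """Stand-in detector with a deliberate gap: it ignores 'pending'."""
    return s.amount > THRESHOLD and s.status == "approved"


def classify(violation: bool, detected: bool) -> str:
    if violation:
        return "TP" if detected else "FN"
    return "FP" if detected else "TN"


def sweep(detector) -> Counter:
    """Run the detector over every scenario and tally the matrix."""
    return Counter(
        classify(oracle(s), detector(s)) for s in enumerate_scenarios()
    )


def exit_code(matrix: Counter) -> int:
    """Non-zero when any violation went undetected, so CI can gate on it."""
    return 1 if matrix["FN"] else 0


if __name__ == "__main__":
    matrix = sweep(toy_detector)
    print(dict(matrix))
    raise SystemExit(exit_code(matrix))
```

Keeping the oracle separate from the detector is what lets a false negative show up as a gap instead of being silently defined away.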

Why

The production detectors currently ship without dedicated unit tests, and there is no systematic way to ask
"which attack variants does no detector catch?". This PR follows the existing event-driven test patterns and
extends them.

Results

  • 15 passed, fully offline: no Redis, network, or LLM calls, at $0 cost, using in-memory SQLite.
  • Coverage sweep: 20 scenarios, 6 policy violations, 3 uncovered variants, 0 false positives. No "caught
    %" figure is reported, on purpose: the oracle is broader than any single challenge-scoped detector, so a
    ratio would mislead. Details in #528 (Adversarial coverage sweep: uncovered attack variants in two
    detectors).
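
As a sanity check on the sweep numbers, the four matrix cells can be recovered from the reported counts. This is a small sketch and rests on two assumptions not stated in the PR: each uncovered variant is exactly one violating scenario that no detector flagged, and every scenario falls into exactly one cell. The function name is hypothetical.

```python
def matrix_from_counts(scenarios, violations, uncovered, false_positives):
    """Derive TP / FN / TN / FP from the headline sweep counts."""
    fn = uncovered                   # violations no detector flagged
    tp = violations - fn             # violations at least one detector flagged
    fp = false_positives             # benign scenarios that were flagged
    tn = scenarios - violations - fp # benign scenarios left alone
    matrix = {"TP": tp, "FN": fn, "TN": tn, "FP": fp}
    # Every scenario must land in exactly one cell.
    assert sum(matrix.values()) == scenarios
    return matrix


# Counts reported in the Results section of this PR.
print(matrix_from_counts(scenarios=20, violations=6, uncovered=3,
                         false_positives=0))
```

Under those assumptions the cells come out as TP 3, FN 3, TN 14, FP 0.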

Notes

Checklist

  • uv run black . / uv run isort .
  • uv run pytest tests/unit/ctf/ green
