feat: agentops assert run + agentops redteam run as active CI gates by placerda · Pull Request #283 · Azure/agentops

placerda · 2026-06-09T17:06:58Z

What

Turns ASSERT (open-source assert-ai framework) and the Foundry/PyRIT AI Red Teaming agent into active, gated CI steps that AgentOps orchestrates end-to-end — instead of just consuming pre-generated artifacts via assert_path: / redteam_path:.

Also rewrites the README Overview to position AgentOps as an open-source orchestration framework + CLI rather than a feature bullet list.

New commands

agentops assert run — subprocess wrapper around the assert-ai CLI. Parses metrics.json + scores.jsonl, writes a normalized summary at .agentops/assert/latest.json, and exits 2 on any policy violation (unless --no-gate).
agentops redteam run — invokes azure.ai.evaluation.red_team.RedTeam against an Azure OpenAI deployment, Foundry agent, or HTTP endpoint. Aggregates per-category and per-strategy attack-success-rate into .agentops/redteam/latest.json and exits 2 when ASR exceeds the configured threshold (unless --no-gate).

Both runners produce artifacts that agentops doctor --evidence-pack ingests automatically.
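
The red-team gate described above can be sketched roughly as follows. This is a minimal illustration, not the code in redteam_runner.py: the record shape (`category`, `succeeded`), the function names, and the choice to compare each category's rate against the threshold are assumptions, since the description does not say whether the real gate uses the overall or the per-category ASR.

```python
from collections import defaultdict


def attack_success_rates(results):
    """Return {category: successes / attempts} for a list of attack records.

    The record keys used here are hypothetical, chosen for illustration.
    """
    attempts = defaultdict(int)
    successes = defaultdict(int)
    for record in results:
        attempts[record["category"]] += 1
        if record["succeeded"]:
            successes[record["category"]] += 1
    return {cat: successes[cat] / attempts[cat] for cat in attempts}


def gate_exit_code(rates, threshold, no_gate=False):
    """Return 2 if any rate exceeds the threshold and gating is on, else 0."""
    if not no_gate and any(rate > threshold for rate in rates.values()):
        return 2
    return 0


# Made-up sample data: one of two violence probes succeeded.
sample = [
    {"category": "violence", "succeeded": False},
    {"category": "violence", "succeeded": True},
    {"category": "self_harm", "succeeded": False},
    {"category": "self_harm", "succeeded": False},
]
rates = attack_success_rates(sample)
exit_code = gate_exit_code(rates, threshold=0.2)
```

With the sample data the violence rate is 0.5, which exceeds 0.2, so the sketch returns 2; passing `no_gate=True` returns 0 regardless of the rates.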

Schema additions (`agentops.yaml`)

assert:
  cli: assert-ai
  config: assert/eval_config.yaml
  fail_on_violations: true

redteam:
  target:
    type: model_deployment
    deployment: gpt-4o-mini
  risk_categories: [violence, hate_unfairness, sexual, self_harm]
  attack_strategies: [base64, rot13]
  num_objectives: 5
  fail_on_attack_success_rate: 0.2

assert / redteam are YAML aliases for the Python fields assert_run / redteam_run (the keyword assert cannot be a Python identifier).
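
To illustrate why the alias is needed, here is a stdlib-only sketch of the same idea. The PR itself uses Pydantic field aliases; this stand-in uses a dataclass and a hand-written rename table instead, and `ConfigSketch`, `ALIASES`, and `from_mapping` are hypothetical names that do not exist in AgentOps.

```python
import keyword
from dataclasses import dataclass
from typing import Any, Dict, Optional

# YAML key -> Python field name. Mirrors the alias idea, not the real model.
ALIASES = {"assert": "assert_run", "redteam": "redteam_run"}


@dataclass
class ConfigSketch:
    assert_run: Optional[Dict[str, Any]] = None
    redteam_run: Optional[Dict[str, Any]] = None

    @classmethod
    def from_mapping(cls, raw: Dict[str, Any]) -> "ConfigSketch":
        # Accept either the YAML alias or the Python field name.
        renamed = {ALIASES.get(key, key): value for key, value in raw.items()}
        return cls(**renamed)


# "assert" is a reserved word, so it cannot be used as an attribute name.
is_reserved = keyword.iskeyword("assert")

cfg = ConfigSketch.from_mapping(
    {"assert": {"cli": "assert-ai"}, "redteam_run": {"num_objectives": 5}}
)
```

Both spellings load into the same fields, which is the behavior the alias arrangement is meant to give YAML authors.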

Docs

Tutorial: docs/tutorial-prompt-agent-quickstart.md step 10 rewritten as 10a (ASSERT) / 10b (Red Team) / 10c (evidence pack pickup), with install steps for assert-ai and azure-ai-evaluation[redteam].
README: new tagline, new Overview as a six-step orchestration narrative (Evaluate → Probe → Diagnose → Gate → Prove → Learn-from-production), updated Foundry boundary table with a new "Probe safety" row.

Tests

New file tests/unit/test_assert_and_redteam_runners.py — 24 tests covering schema aliases, run-output discovery, dimension summarization, totals aggregation, target-callback shapes, normalized JSON writing, gating, and CLI smoke (missing config / missing dependency / explain manuals).
Full suite: 921 passed, 1 skipped.

Exit-code contract preserved

0 — all gates passed
2 — threshold, ASSERT violation, or red-team ASR gate failed
1 — runtime / configuration error
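
A CI wrapper that consumes this contract might interpret return codes along these lines. The enum and helper names are illustrative assumptions; only the three numeric codes and their meanings come from the contract above.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    PASSED = 0       # all gates passed
    ERROR = 1        # runtime / configuration error
    GATE_FAILED = 2  # threshold, ASSERT violation, or red-team ASR gate failed


def describe(returncode: int) -> str:
    """Map a process return code to the contract's meaning."""
    messages = {
        ExitCode.PASSED: "all gates passed",
        ExitCode.ERROR: "runtime or configuration error",
        ExitCode.GATE_FAILED: "gate failed",
    }
    try:
        return messages[ExitCode(returncode)]
    except ValueError:
        # Codes outside the contract are reported rather than guessed at.
        return f"unexpected exit code {returncode}"


def should_block_release(returncode: int) -> bool:
    """Anything other than 0 blocks; the code only tells the reviewer why."""
    return returncode != ExitCode.PASSED
```

Keeping 1 and 2 distinct lets a pipeline tell a broken setup apart from a genuine gate failure.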

Scope notes

Active runners are additive. Existing assert_path: / redteam_path: evidence-only paths still work.
AgentOps does not reimplement ASSERT or PyRIT — it shells out to / imports the upstream tools and normalizes their output.
No changes to agentops doctor, cockpit, or eval runner internals beyond the new normalized JSON inputs.

Follow-ups (out of scope)

Cockpit cards for ASSERT / Red Team latest runs.
Patch release to PyPI (currently v0.3.13; user tutorial workspace will need v0.3.14 to consume).

…ates

Turn ASSERT (open-source assert-ai framework) and the Foundry/PyRIT AI Red Teaming agent from passive evidence-only references into active, gated CI steps that AgentOps orchestrates end-to-end.

New commands

- agentops assert run invokes the assert-ai CLI as a subprocess, locates the run output, parses metrics.json and scores.jsonl, and writes a normalized summary at .agentops/assert/latest.json. Exits 2 on any policy violation unless --no-gate or assert.fail_on_violations: false.
- agentops redteam run invokes azure.ai.evaluation.red_team.RedTeam against an Azure OpenAI deployment, Foundry agent, or HTTP endpoint, then aggregates per-category and per-strategy attack-success-rate into .agentops/redteam/latest.json. Exits 2 when ASR exceeds redteam.fail_on_attack_success_rate unless --no-gate.

Schema

- Adds AssertRunConfig and RedTeamRunConfig Pydantic models.
- Adds assert_run / redteam_run fields on AgentOpsConfig with aliases assert / redteam so YAML stays natural while Python avoids the reserved keyword. Enables populate_by_name on the root model.

Services

- src/agentops/services/assert_runner.py: subprocess wrapper, run-output locator with suite/run/most-recent fallback, dimension summarizer, normalized JSON writer.
- src/agentops/services/redteam_runner.py: lazy import of the Foundry Red Team SDK, target callback builder for deployment/agent/endpoint shapes, per-category and per-strategy aggregation, normalized JSON writer.

CLI

- New assert_app and redteam_app Typer groups with run and explain subcommands.
- Long-form manuals added to EXPLAIN_PAGES for both groups and surfaced via agentops explain.
- Fixes a stale loaded.config access in the new command handlers.

Tutorial

- docs/tutorial-prompt-agent-quickstart.md replaces the passive assert_path evidence section with active 10a/10b/10c subsections that install assert-ai and azure-ai-evaluation[redteam], scaffold assert/eval_config.yaml and the redteam block, and pull both runners into the evidence pack.
- Success criteria updated accordingly.

README

- Repositions the accelerator as an open-source framework + CLI that orchestrates continuous evaluation, safety testing, and release readiness (rather than reinventing them).
- Tagline, six-step release loop, core-outputs table, and exit-code contract reworked. Foundry boundary table now lists ASSERT and the AI Red Teaming agent under "Probe safety" with active commands.

Tests

- tests/unit/test_assert_and_redteam_runners.py covers schema aliases, run-output discovery, dimension summarization, totals aggregation, target callback resolution, normalized JSON writing, gating, and CLI smoke (missing config block, missing dependency, explain manuals).
- Full suite: 921 passed, 1 skipped.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
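
The lazy import of the Red Team SDK mentioned in this commit can be pictured with a sketch like the one below. It is an assumption about the general pattern, not the code in redteam_runner.py: `load_optional` is a hypothetical helper, the exception class here is a local stand-in for the runner's RedTeamRunnerError, and the install command in the usage line is inferred from the tutorial's mention of the azure-ai-evaluation[redteam] extra.

```python
import importlib


class RedTeamRunnerError(RuntimeError):
    """Local stand-in for the runner's error type."""


def load_optional(module_name: str, install_hint: str):
    """Import a module only when it is needed, with an actionable error.

    Deferring the import keeps the CLI usable for users who never run
    the red-team command and have not installed the optional SDK.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RedTeamRunnerError(
            f"{module_name} is not installed. Try: {install_hint}"
        ) from exc


# Hypothetical call site; it fails cleanly when the SDK is absent:
# red_team = load_optional(
#     "azure.ai.evaluation.red_team",
#     "pip install 'azure-ai-evaluation[redteam]'",
# )
```

A wrapper like this is also what makes a "missing dependency" smoke test practical, because the failure surfaces as a single typed error instead of a traceback at import time.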

- assert_runner._aggregate_totals: narrow Optional dict from metrics.get totals before subscripting, by binding the result to a typed local.
- redteam_runner.run_redteam: validate azure_ai_project is not None before passing it to the RedTeam SDK (raises RedTeamRunnerError with a clear hint when project metadata is missing).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
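
The narrowing fix in the first bullet follows a common typing pattern, sketched here under assumptions. The function name echoes the commit message, but the body, the `passed` / `failed` keys, and the zero defaults are hypothetical and not taken from assert_runner.py.

```python
from typing import Any, Dict, Optional


def aggregate_totals(metrics: Dict[str, Any]) -> Dict[str, int]:
    """Read pass/fail counts without subscripting a possibly-None value."""
    # Bind the lookup to a typed local so a type checker can narrow it.
    totals: Optional[Dict[str, Any]] = metrics.get("totals")
    if totals is None:
        # No totals block in the metrics file: report zeros, do not crash.
        return {"passed": 0, "failed": 0}
    # After the None check, totals is known to be a dict.
    return {
        "passed": int(totals.get("passed", 0)),
        "failed": int(totals.get("failed", 0)),
    }
```

Writing `metrics.get("totals")["passed"]` directly would raise a TypeError at runtime when the key is absent, and a static checker flags it for the same reason.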

placerda and others added 2 commits June 9, 2026 14:06

placerda merged commit f3647d3 into develop Jun 9, 2026
12 checks passed

placerda deleted the feature/assert-runner-integration branch June 9, 2026 17:33
