feat(si): sponsored_context_accountability storyboard (first pass)#5551
kapoost wants to merge 4 commits into
Refs adcontextprotocol#5541, adcontextprotocol#5486, adcontextprotocol#5501.

Adds compliance/source/protocols/sponsored-intelligence/sponsored-context-accountability.yaml, a brand-side conformance storyboard exercising the PR adcontextprotocol#5501 sponsored_context envelope and sponsored_context_receipt allOf invariants in four phases inside a single yaml (small review surface, whole contract visible together).

Phases:
- presentation_only_happy_path
- required_disclosure_commitment
- rejected_receipt
- silent_downgrade_rejected (regression anchor: error.message contains "silent downgrade forbidden"; error code is one of INVALID_PARAMS / VALIDATION_ERROR / INVALID_REQUEST so the invariant decouples from any single transport)

Uses existing storyboard matchers only (response_schema, field_present, field_value, field_value_or_absent, field_contains, error_code). LLM-generated response.message is asserted as present/non-empty only, so language and model provider remain implementation choices.

The reference implementation kapoost/bragent (v0.2.0+) served as the empirical surface from which the assertions were derived; the storyboard itself does not depend on bragent running.

Open question for review: whether to also assert that rejected receipts surface in an audit endpoint. bragent has the shape (admin-gated /sessions/<id>/audit returns the dual-trail) but it is bragent-specific rather than a spec invariant, so this first pass stays on the wire-level SI surfaces.
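The receipt handling that the last two phases exercise can be sketched roughly as follows. This is a minimal illustration and not bragent's or the spec's implementation: the names `check_receipt` and `ReceiptError`, the dict layout (including where `accepted_context_use` sits in the receipt), and the sample `context_use` values are all assumptions. Only the field names and the `VALIDATION_ERROR` code are taken from the description above.

```python
class ReceiptError(Exception):
    """Hypothetical error type carrying an error code and message."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(message)
        self.code = code
        self.message = message


def check_receipt(declared: dict, receipt: dict) -> str:
    """Return the host's receipt status, or raise on a silent downgrade.

    `declared` stands in for the agent's own sponsored_context declaration,
    `receipt` for the host's sponsored_context_receipt (layouts assumed).
    """
    status = receipt["host_receipt"]["status"]
    if status == "rejected":
        # An explicit rejection is a valid wire response, not an error.
        return status
    # An accepted receipt must echo the declared context_use unchanged.
    if receipt.get("accepted_context_use") != declared["context_use"]:
        raise ReceiptError("VALIDATION_ERROR", "silent downgrade forbidden")
    return status
```

Under these assumptions, a receipt whose `accepted_context_use` matches the declaration returns `"accepted"`, an explicit rejection returns `"rejected"`, and a mismatched accepted receipt raises `ReceiptError`.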
The automated review encountered an issue (possibly reached max turns, timed out, or failed to post the final gh pr review). A human reviewer should take this PR.
This is an automated message from the Argus AI review workflow.
Three corrections after first push surfaced CI failures:

- Inline literal sponsored_context in every receipt sample_request instead of `$context.sponsored_context`. The schema lint substitutes string placeholders only; object-shaped fields need a literal so the allOf chain in si-sponsored-context-receipt.json can validate.
- Allow-grow the deliberately schema-invalid mismatch sample (silent_downgrade_rejected/si_send_message_with_mismatched_receipt) into tests/storyboard-sample-request-schema-allowlist.json. The whole point of the step is to send an envelope the spec rejects.
- Drop the field_contains assertion on `error.message` containing "silent downgrade forbidden". The storyboard primitives don't currently expose an error-message matcher and Bill's call on adcontextprotocol#5541 was to avoid adding a new assertion language in the first pass. The wording recommendation now lives in the step's `expected:` text as a manual review pointer; promoting it to a hard check is a follow-up if the WG wants an `error_message_contains` matcher.

Also constrained error_code to the canonical AdCP enum (VALIDATION_ERROR, INVALID_REQUEST) per the error-code lint: INVALID_PARAMS is JSON-RPC, not AdCP.

All lint + integration tests now pass locally (264 test files, 4162 tests).
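To illustrate why a string placeholder cannot stand in for an object-shaped field, here is a small sketch. It is an assumption about how a string-only substitution pass might behave, not the repository's actual lint code; `substitute_strings`, `is_object`, and the stand-in value are hypothetical names chosen for the example.

```python
import re

# Matches a whole-string placeholder such as "$context.sponsored_context".
PLACEHOLDER = re.compile(r"^\$context\.[A-Za-z0-9_.]+$")


def substitute_strings(sample, stand_in="placeholder"):
    """Recursively replace "$context.<path>" strings with a string stand-in."""
    if isinstance(sample, dict):
        return {key: substitute_strings(value, stand_in) for key, value in sample.items()}
    if isinstance(sample, list):
        return [substitute_strings(value, stand_in) for value in sample]
    if isinstance(sample, str) and PLACEHOLDER.match(sample):
        return stand_in
    return sample


def is_object(value) -> bool:
    """Stand-in for a schema's `type: object` check."""
    return isinstance(value, dict)
```

With this sketch, `{"sponsored_context": "$context.sponsored_context"}` still holds a string after substitution and so fails the object check, while an inlined literal object passes through unchanged and can be validated.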
Force-pushed: e1edd46 to dec0547
Force-pushed twice; heads-up so the review history reads cleanly:
Local Ready for review.
This shape looks right to me. Thanks for keeping it to one YAML, four inline phases, and existing storyboard matchers only.

+1 on leaving the audit-endpoint round trip out of this first pass. I’d keep #5551 at the wire-level SI conformance boundary: sponsored_context declaration, accepted receipt, rejected receipt, and mismatch rejection. Audit readback feels like a follow-up only if/when the WG decides there is a normative SI audit surface rather than a bragent-local admin API.

One small cleanup before merge: after the follow-up commit, the YAML no longer hard-asserts
PR description updated to match the final YAML:
One item I can't push to directly since it lives on @kapoost's branch: the changeset at
@kapoost, a small push to your branch with those two changeset edits will close the loop.

Generated by Claude Code
Two lines drifted from the YAML during the lint-alignment force-push.
Match the PR description bokelley updated:
- silent_downgrade_rejected bullet now reflects error_code ∈
{VALIDATION_ERROR, INVALID_REQUEST}; the "silent downgrade
forbidden" message wording is a manual-review pointer in
`expected:`, not a hard check.
- Matchers sentence trimmed to the four primitives the final yaml
actually uses: response_schema, field_present, field_value,
error_code. field_value_or_absent and field_contains were
removed during the lint-alignment pass.
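
As a rough picture of what three of these primitives check, the sketch below implements `field_present`, `field_value`, and `error_code` over plain dicts. It is illustrative only and not the storyboard runner's code: the dotted-path lookup, the `error.code` location, and the function signatures are assumptions. `response_schema` is omitted because it would need a JSON Schema validator.

```python
from typing import Any

_MISSING = object()  # sentinel: distinguishes "absent" from a None value


def lookup(payload: dict, path: str) -> Any:
    """Walk a dotted path such as "sponsored_context.context_use"."""
    current: Any = payload
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def field_present(payload: dict, path: str) -> bool:
    """True when the path resolves to any value, including None."""
    return lookup(payload, path) is not _MISSING


def field_value(payload: dict, path: str, expected: Any) -> bool:
    """True when the path resolves and equals the expected value."""
    return lookup(payload, path) == expected


def error_code(payload: dict, allowed: set) -> bool:
    """True when the (assumed) error.code field is one of the allowed codes."""
    return lookup(payload, "error.code") in allowed
```

Passing a set to `error_code` mirrors the "one of VALIDATION_ERROR, INVALID_REQUEST" shape of the final assertion without tying the check to a single code.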
@bokelley, done in
I think this is the right shape overall, but I would like one cleanup before approval:
CI is green and I agree with leaving audit-endpoint readback out of this first pass. This just needs the fixture/prose alignment so the storyboard is testing the contract it describes.
All three fixes are clear. The PR branch is in a fork outside my push scope, so here are the exact replacements for @kapoost to land in one commit.

1. Remove the runtime-substitution claim

with:

Also in the

2. Normative wording

-"Validates that a conformant brand-side SI agent emits sponsored_context on every response, accepts well-formed sponsored_context_receipt envelopes, and rejects receipts that silently downgrade declared context_use."
+"Validates that a brand-side SI agent claiming sponsored_context_accountability conformance emits sponsored_context on each response, accepts well-formed sponsored_context_receipt envelopes, and rejects receipts that silently downgrade declared context_use."

-AdCP 3.1.0-rc.13/rc.14 introduced the sponsored_context_accountability surface (PR #5501): every SI
-response carries a `sponsored_context` envelope
+AdCP 3.1.0-rc.13/rc.14 introduced the sponsored_context_accountability surface (PR #5501): in this
+conformance profile, a brand-side SI agent's every response carries a `sponsored_context` envelope

Validation in

-description: "PR #5501 envelope: every SI response must carry sponsored_context"
+description: "sponsored_context_accountability conformance target: brand-side SI agent must emit sponsored_context on each response"

3. Changeset drift

In

-- `required_disclosure_commitment` — host's receipt carries `disclosure_commitment.status=accepted`; agent accepts the receipt without erroring regardless of whether its own declaration carried `required=true`.
+- `required_disclosure_commitment` — literal `sponsored_context` carries `disclosure_obligation.required=true`; host's receipt carries `disclosure_commitment.status=accepted`; agent accepts the well-formed receipt without error.

Triaged by Claude Code. Session: https://claude.ai/code/session_01M3SfYLeRAe9pDHn1QtkdAr

Generated by Claude Code
…cope

Three fixes per bokelley's adcontextprotocol#5551 review (2026-06-16):

1. Remove runtime-substitution claim. The storyboard runner does not perform object-shaped context substitution, so the YAML's claim that sample_request envelopes are replaced with the prior emission was inaccurate. The samples send static Acme outdoor receipt literals; the assertions test receipt acceptance/rejection semantics, not principal-matching against the agent's own declaration. Updates `prerequisites.description` and drops the misleading last sentence from `si_send_message_presentation_accepted` narrative.

2. Normative wording. Baseline SI frames `sponsored_context` emission as MAY/SHOULD; this storyboard can be stricter as an opt-in conformance profile, but the prose should say so explicitly. Updates `summary`, top-level `narrative`, and the `si_initiate_session_presentation` validation description to scope the MUST to agents claiming sponsored_context_accountability conformance.

3. Changeset drift. The `required_disclosure_commitment` bullet still read "regardless of whether its own declaration carried required=true", but the YAML now deliberately uses a `required=true` literal. Rewrites the bullet to match the actual phase behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First-pass SI brand-side conformance storyboard for the surfaces shipped in #5501, scoped per the discussion under #5541.

What's in the box

One yaml at `static/compliance/source/protocols/sponsored-intelligence/sponsored-context-accountability.yaml` with four phases:

- `presentation_only_happy_path`: `sponsored_context` envelope (paying_principal nested, context_use, disclosure_obligation, declared_by.role=brand_agent), host returns an accepted receipt with matching `accepted_context_use`, second turn lands cleanly.
- `required_disclosure_commitment`: `disclosure_commitment.status=accepted`; agent accepts the well-formed receipt regardless of whether its own declaration carried `required=true`.
- `rejected_receipt`: `host_receipt.status=rejected` with a `rejection_reason`; agent accepts the rejection as a valid wire response.
- `silent_downgrade_rejected`: `accepted_context_use`; agent MUST reject. Regression anchor on `error_code ∈ {VALIDATION_ERROR, INVALID_REQUEST}`; recommended error message wording ("silent downgrade forbidden") is documented in `expected:` as a manual-review pointer, not a hard assertion.

Design choices, per the thread on #5541

- `response_schema`, `field_present`, `field_value`, `error_code`. No new assertion language.
- `error_code` (accepting `VALIDATION_ERROR` or `INVALID_REQUEST`); recommended error message wording is captured in `expected:` as a manual-review pointer.

Empirical surface

kapoost/bragent v0.2.0+ was the reference implementation from which the assertions were derived. The storyboard itself does not depend on bragent running; the suite stays self-contained.

Open question for review

Whether to also assert that the rejected receipt surfaces in an audit endpoint. bragent has the shape (admin-gated `/sessions/<id>/audit` returns the dual-trail with `rejection_reason` round-tripped) but it is bragent-specific rather than a spec invariant, so this first pass stays on the wire-level SI surfaces. Happy to add an audit phase as a follow-up commit if the spec wants to elevate audit to a normative SI surface.

cc @sangilish (matrix authoring per the #5541 thread) and @bokelley (PR #5501 author).