ci(claude-review): submit a formal PR review (verdict + summary) via the Reviews API by umangyadav · Pull Request #2391 · ROCm/rocMLIR

umangyadav · 2026-05-29T15:44:05Z

Summary

Switch the bot's overall review from a stand-alone issue comment to a formal pull-request review (gh pr review --comment) carrying a new verdict field (APPROVE / REQUEST_CHANGES / COMMENT) plus the Markdown summary.

Every verdict is unconditionally submitted as --comment (advisory). This prevents the bot from ever ticking a branch-protection approving-review slot or leaving a sticky CHANGES_REQUESTED block (the gh-aw#27655 / PR #27662 wedge). The model's verdict is preserved in a computed body header. Full rationale in CLAUDE_AUTO_REVIEW.md §13.

Also adds defense-in-depth sanitizer re-checks (verdict enum, summary non-empty string, every string-typed field really is a string or null) to close a redaction-bypass class where a schema regression could partially leak values via jq's truncated error message.

Test plan

Sanitizer fixture suite: 148/148 PASS (121 → 148, +27 covering verdict / summary / field-type / redaction).
End-to-end smoke with stubbed gh: COMMENT initial, APPROVE re-review with all 3 thread-update types, REQUEST_CHANGES + suggestion fence, bad-verdict defensive arm, 422 soft-skip, 500 hard-failure resilience.
shellcheck clean on the modified scripts.
CI: validate on first claude-review label after merge.

…the Reviews API Post the model's verdict + Markdown summary as a formal pull-request review (`gh pr review --comment`) instead of a stand-alone issue comment. The verdict is unconditionally submitted as `--comment` (advisory) so the bot can never tick a branch-protection approving- review slot or leave a sticky CHANGES_REQUESTED block (the github/gh-aw#27655 + PR #27662 wedge for the same class of tool). The review body has a deterministic header computed from `verdict` + finding counts; the model owns only the `summary` section (`## Scope` / `## Findings` / `## Notes` / `## CI status`). The header label switches from `Findings:` to `New findings:` on re-review runs so a fully-resolved PR cannot be misread as "always clean". Defense-in-depth sanitizer re-checks of schema-expressible fields: the `verdict` enum; that `summary` is a non-empty string; that every other string-typed field (`inline_comments[].body` / `.suggestion`, `thread_updates[].body`) actually is a string (or null). Closes a redaction-bypass class where a future schema regression slipping a non-string into the later `utf8bytelength` / `test` / `contains` / `split` scans would partially leak the value via jq's truncated error message. Fail-closed with a fixed-text error; no model- controlled content reaches the public Actions log. Sanitizer fixture suite: 121 -> 148 (verdict enum, summary type, field-type checks, plus redaction pins on each rejection path). Architecture, threat model, and porting checklist in CLAUDE_AUTO_REVIEW.md sections 8, 12, 13. Co-authored-by: Cursor <cursoragent@cursor.com>

rocmlir-pr-reviewer · 2026-05-29T15:44:58Z

⚠️ This PR modifies the Claude-review CI security perimeter

The following files in this PR control whether and how the
claude-review workflow protects its LLM Gateway secrets at
runtime (see .github/workflows/CLAUDE_AUTO_REVIEW.md):

.claude/skills/review-rocmlir-pr/SKILL.md
.claude/skills/update-pr-review/SKILL.md
.github/scripts/post_claude_review.sh
.github/scripts/sanitize_claude_actions.sh
.github/scripts/tests/test_sanitize.sh
.github/workflows/CLAUDE_AUTO_REVIEW.md
.github/workflows/claude_auto_review.yml

Before applying the claude-review label on this PR, please:

Audit the diff in these paths line-by-line. A malicious or
accidental change could disable the --allowedTools
restriction, the overlay step, the sanitizer, or the
review/post job split -- any of which would expose the
secrets in env to the PR-modified workflow.
If the changes are legitimate and you want a Claude review,
do NOT apply the claude-review label. Instead, run via
Actions → Claude Auto Review → Run workflow and enter
this PR's number. The dispatch path runs from the trusted,
code-owner-approved version of the workflow on
develop, so a malicious PR-side
modification cannot affect the run.

(This banner is automated. The modifies-ci-paths label
will be removed automatically if a future push removes the
perimeter modifications. The Layer-3 in-workflow guard will
additionally fail the claude-review label-triggered run
on this PR if the label is applied while perimeter changes
are present.)

…n-0 comments Address PR review feedback (#2391): 1. The COMMENT-case comment on line 299 read "model verdict matches event", which could be misread as "the bot's submitted event sometimes varies". Reword to be explicit that the bot always submits as COMMENT, so a model verdict of COMMENT specifically needs no annotation. 2. The defensive *) arm's `return 0` is essential flow control (skipping the body-build / `gh pr review --comment` submission below), not just an exit-status statement. Without it a sanitizer-bypassed bad verdict would land verbatim in the rendered review header. Add a clarifying comment so a future reader doesn't mistake it for dead code. No behavioural change. Smoke + sanitizer fixture suite (148/148) unchanged. Co-authored-by: Cursor <cursoragent@cursor.com>

Copilot

Pull request overview

This PR updates the Claude auto-review pipeline to submit its result as a formal pull-request review (via the Reviews API / gh pr review --comment) instead of a top-level PR conversation comment, while adding a model-emitted verdict field (APPROVE / REQUEST_CHANGES / COMMENT) that is displayed in a deterministic header but always submitted as an advisory COMMENT review. It also adds defense-in-depth sanitizer checks (verdict enum, non-empty summary, and string-typed field enforcement) plus expanded sanitizer fixtures.

Changes:

Switch posting behavior from “top-level summary comment” to “formal PR review submission” (always --comment), including a computed header showing verdict and finding counts.
Extend the structured output contract to require verdict and a non-empty Markdown summary, and mirror that across workflow schema + skills docs.
Harden sanitize_claude_actions.sh against schema-regression redaction bypasses and expand sanitizer test coverage accordingly.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
`.github/workflows/claude_auto_review.yml`	Updates the model JSON schema to require `verdict`, updates prompt guidance, and renames the post step to reflect review submission.
`.github/workflows/CLAUDE_AUTO_REVIEW.md`	Documents the new “formal review” behavior, COMMENT-only submission policy, and updated output contract/security rationale.
`.github/scripts/tests/test_sanitize.sh`	Adds extensive fixtures for verdict/summary validation, type enforcement, and stderr redaction guarantees.
`.github/scripts/sanitize_claude_actions.sh`	Adds verdict/summary validation and string-type defense-in-depth checks before later jq string operations.
`.github/scripts/post_claude_review.sh`	Replaces the prior top-level summary comment with `gh pr review --comment`, adding a deterministic header and preserving marker attribution.
`.claude/skills/update-pr-review/SKILL.md`	Updates the required output schema/rules to include `verdict` and Markdown `summary` sections.
`.claude/skills/review-rocmlir-pr/SKILL.md`	Updates the output schema/rules to include `verdict` and the new structured Markdown summary expectations.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+**Review body layout** (built by `post_claude_review.sh`, NOT the model):
+
+```markdown
+**Verdict:** APPROVE -- submitted as COMMENT (automated reviews are advisory)  ·  **Findings:** N (C Critical, J Major, K Minor)


rocmlir-pr-reviewer · 2026-05-29T16:02:25Z

Scope

CI-only change: switches the bot's overall review from a stand-alone issue comment to a formal pull-request review (gh pr review --comment) carrying a new verdict field, and adds defense-in-depth sanitizer re-checks for the verdict enum, the summary non-empty-string contract, and the string-typedness of inline_comments[].body/.suggestion and thread_updates[].body. Touches 7 files under .github/ and .claude/skills/ only -- no MLIR / Rock / C++ surface.

Findings

No blocking issues found.

Notes

post_review correctly returns 0 (but sets HAD_FAILURE=1) on an unknown verdict, leaving inline comments and thread updates already posted and surfacing the failure at script exit -- the right behaviour given the post-step ordering.
The all(... ; . == null or type == "string") predicate is vacuously true on empty arrays, so an APPROVE with no inline comments and no thread updates still passes the new type re-check, as intended.
COMMENT-only submission is implemented as a single hardcoded rule in post_review with no env/workflow toggle, matching the §13 threat-model rationale (a misconfiguration cannot flip it).
One previously-noted Copilot comment (header doc example uses regular spaces / letter placeholders while the script outputs  ·  / numeric counts) is doc-only and already raised in a prior thread; not duplicating it here.

CI status

No non-self CI failures. Jenkins / Build-and-Test / MIGraphX still pending; sanitizer-fixture-suite passing (148/148 per PR body).

…ew, not thread_updates.length Address PR review feedback (#2391): the post-review header label switched from `Findings:` to `New findings:` when `thread_updates` was non-empty. That heuristic missed a real re-review state: on a re-run where every prior thread was suppressed by its dedup gate (Scenario D non-regression -- still present, no human reply, last bot reply was clarify/null; plus Scenarios A/B/C dedup-gated as already-resolved or already-clarified), the model emits inline_comments == [] AND thread_updates == [], the header reads `Findings: 0`, and a maintainer can misread the PR as "always clean" -- exactly what the New-findings: switch exists to prevent. Fix: derive the signal in the pre-fetch step (which already has prev_comments.json in hand) using the same BOT_LOGIN + marker + root-comment filter the prompt's Step 1 uses, and stash `is_re_review` on meta.json (already uploaded as an artifact). The post job reads it via `.is_re_review // false`, so a workflow rollback between runs is still safe. The two derivations (pre-fetch jq and prompt Step 1) MUST stay byte-for-byte identical; added a row in the doc §15 sync table. Verified: - 4 smoke cases (initial / dedup-gated re-review / missing field fallback / re-review with updates emitted) all render the correct header. - 7 unit tests on the pre-fetch jq derivation against mock prev_comments.json shapes (empty, only humans, bot reply sub-thread, bot root w/o marker, bot root w/ marker, mixed, wrong bot identity) all PASS. - Sanitizer fixture suite still 148/148. Co-authored-by: Cursor <cursoragent@cursor.com>

Address Copilot review feedback (#2391): the `markdown`-fenced example of the review body layout drifted from what `post_claude_review.sh` actually writes: - Dot separator: doc had ` · ` (2 ASCII spaces + middle-dot + 2 ASCII spaces); the script writes ` · ` (literal HTML entities around the middle-dot). The rendered output looks the same, but a maintainer grepping a real review body for ` · ` would never match -- the doc's `markdown` fence claims to show source, so showing source-that-isn't is misleading. - Per-severity counts: doc used opaque placeholders `N (C Critical, J Major, K Minor)`; the script always writes concrete integers (`%d`). The `C/J/K` letters aren't mnemonic (`J`, `K` have no relation to "Major"/"Minor") and the APPROVE verdict in the example is inconsistent with non-zero counts (the decision rule says zero findings -> APPROVE). Switch the example to a self-consistent REQUEST_CHANGES + 3 (1/1/1) that also exercises the `submitted as COMMENT` annotation, and add the helper sentence "Numbers shown are illustrative" so a future reader doesn't read the integers as a literal template. Verified byte-for-byte equality of the doc example against the script's `printf` output. No code change; sanitizer suite still 148/148. Co-authored-by: Cursor <cursoragent@cursor.com>

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated no new comments.

umangyadav requested a review from causten as a code owner May 29, 2026 15:44

rocmlir-pr-reviewer Bot added the modifies-ci-paths PR modifies the Claude review CI security perimeter; audit before applying claude-review label May 29, 2026

umangyadav requested a review from Copilot May 29, 2026 15:52

Copilot started reviewing on behalf of umangyadav May 29, 2026 15:52 View session

Copilot AI reviewed May 29, 2026

View reviewed changes

Comment thread .github/workflows/CLAUDE_AUTO_REVIEW.md Outdated

**Review body layout** (built by `post_claude_review.sh`, NOT the model):

```markdown

**Verdict:** APPROVE -- submitted as COMMENT (automated reviews are advisory) · **Findings:** N (C Critical, J Major, K Minor)

umangyadav and others added 2 commits May 29, 2026 16:04

umangyadav requested a review from Copilot May 29, 2026 16:31

Copilot started reviewing on behalf of umangyadav May 29, 2026 16:31 View session

Copilot AI reviewed May 29, 2026

View reviewed changes

causten merged commit 80b3fe0 into develop May 29, 2026
7 of 16 checks passed

causten deleted the users/umayadav/claude-review-formal-review branch May 29, 2026 16:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ci(claude-review): submit a formal PR review (verdict + summary) via the Reviews API#2391

ci(claude-review): submit a formal PR review (verdict + summary) via the Reviews API#2391
causten merged 4 commits into
developfrom
users/umayadav/claude-review-formal-review

umangyadav commented May 29, 2026 •

edited

Loading

Uh oh!

rocmlir-pr-reviewer Bot commented May 29, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

rocmlir-pr-reviewer Bot commented May 29, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

umangyadav commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

rocmlir-pr-reviewer Bot commented May 29, 2026

⚠️ This PR modifies the Claude-review CI security perimeter

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

rocmlir-pr-reviewer Bot commented May 29, 2026

Scope

Findings

Notes

CI status

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

umangyadav commented May 29, 2026 •

edited

Loading