[PM-36854] feat: Add generating-review-storybook skill (POC) by SaintPatrck · Pull Request #111 · bitwarden/ai-plugins

SaintPatrck · 2026-05-07T20:38:22Z

🎟️ Tracking

PM-36854

⚠️ POC — proof of concept exploring whether a static, self-contained storybook artifact can meaningfully reduce reviewer cognitive load on stacked AI-written PRs. Not intended for merge without further discussion on scope, hosting, and maintenance.

📔 Objective

Adds a new generating-review-storybook skill to the bitwarden-code-review plugin. The skill packages a stack of PRs (or commits) into a self-contained, double-clickable HTML walkthrough designed for humans reviewing AI-written code:

Verdict-first hierarchy — cover surfaces approve / approve-fix / block / pending before any diff
Narrative chapters — each PR is broken into 2–5 logical chapters with a narrative paragraph, not a flat alphabetical file list
Inline annotations as marginalia — Claude findings (from code-review-local) and optionally human reviewer threads (via gh api graphql reviewThreads) render at the diff line they reference
Copy-as-Markdown export — reviewer notes leave the storybook as a Markdown handoff
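To make the marginalia idea concrete, here is a minimal Python sketch of grouping annotations by the diff line they reference. The PR does this grouping in JavaScript (`buildAnnotations` in `app.js.tmpl`). The Python function name `build_annotations` and the field names `file`, `line`, and `kind` are assumptions for illustration, not the skill's actual schema.

```python
from collections import defaultdict


def build_annotations(findings, comments):
    """Group findings and human comments by (file, line).

    Hypothetical shape: every item is a dict with at least "file" and "line".
    Findings are listed before comments at the same location.
    """
    by_location = defaultdict(list)
    for finding in findings:
        by_location[(finding["file"], finding["line"])].append(
            {"kind": "finding", **finding}
        )
    for comment in comments:
        by_location[(comment["file"], comment["line"])].append(
            {"kind": "comment", **comment}
        )
    return dict(by_location)
```

A renderer could then look up `(path, line_number)` while emitting each diff row and attach whatever annotations it finds there.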

What ships

SKILL.md (1,087 words, ~106 lines) interviews the user, builds a config, and runs the scaffolder
scripts/scaffold.py — deterministic renderer; default output $CLAUDE_PLUGIN_DATA/storybooks/<slug>-<timestamp>/
scripts/capture_diffs.py — fetch PR/commit diffs as base64
scripts/parse_review_md.py — extract verdicts and findings from existing bitwarden-code-review summary files
scripts/fetch_pr_threads.py — pull existing GitHub review threads into the storybook comments[] shape (outdated skipped, resolved kept with _(resolved)_ suffix)
assets/template/ — bundled HTML/CSS/JS template
references/data-schema.md and references/customization.md
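The thread-handling rule described for `fetch_pr_threads.py` (outdated threads skipped, resolved threads kept with a `_(resolved)_` suffix) could be sketched roughly as below. This is not the script's actual code. The input mirrors the `isOutdated`, `isResolved`, `path`, and `line` fields of GitHub's GraphQL review threads, while the output keys (`file`, `line`, `author`, `body`), the `ghost` fallback for a missing author, and the choice to append the suffix to the comment body are assumptions.

```python
def threads_to_comments(threads):
    """Flatten review threads into a flat list of comment dicts.

    Assumed rules: skip outdated threads entirely; keep resolved threads
    but mark each comment body with a " _(resolved)_" suffix.
    """
    comments = []
    for thread in threads:
        if thread.get("isOutdated"):
            continue
        suffix = " _(resolved)_" if thread.get("isResolved") else ""
        for node in thread["comments"]["nodes"]:
            # The author can be null (for example, a deleted account).
            author = (node.get("author") or {}).get("login", "ghost")
            comments.append(
                {
                    "file": thread["path"],
                    "line": thread["line"],
                    "author": author,
                    "body": node["body"] + suffix,
                }
            )
    return comments
```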

Why a POC

The storybook design makes several bets that should be tested with real reviewers before this graduates:

(1) verdict-first hierarchy actually reduces triage time vs. a flat PR list,
(2) narrative chapters are worth the synthesis cost on Claude's end,
(3) static-site-as-artifact is the right delivery mode rather than e.g. a Slack thread or GitHub Check.

Feedback on these is the explicit goal of the POC.

Example

Example storybook based on bitwarden/android/pull/6863

storybook-andriod-6863.zip

📸 Screenshots

Screen.Recording.2026-05-07.at.4.57.20.PM.mp4

Packages a stack of PRs or commits into a self-contained HTML walkthrough for humans reviewing AI-generated code. Verdict-first cover, severity-grouped findings (with file:line locations and suggestions), per-PR diffs as drill-down detail, inline notes, and copy-as-Markdown export. Optional pre-baking from existing bitwarden-code-review summary files via parse_review_md.py. Bumps plugin to 1.11.0; adds SKILL.md, three Python scripts (scaffold, capture_diffs, parse_review_md), data-schema and customization references, bundled HTML/CSS/JS template, and CHANGELOG entry.

…table

… annotations in review storybooks

The objective is to enhance the `generating-review-storybook` skill to provide a structured, scene-based walkthrough of code changes. This replaces the flat, per-PR file list with a narrative-driven experience where findings and comments are rendered inline.

### Behavioral Changes

* **Narrative-First Navigation:** Storybooks now partition PRs into logical "Chapters" and "Scenes," each with its own page, title, and descriptive narrative to guide the reviewer.
* **Inline Annotations:** AI-generated findings and human comments are now rendered as "gloss" marginalia directly at the relevant diff lines, rather than being grouped in a separate section.
* **Enhanced Triage:** The file index and table of contents now include "severity dots" (visual indicators) to show where critical issues or comments exist within specific files.
* **Walkthrough Hierarchy:** The cover page now displays a structured hierarchy of chapters and scenes instead of a simple PR list.

### Specific Changes

#### Core Logic and Scaffolding

* **`scaffold.py`**: Introduced `build_page_order` to map the nested PR/Chapter/Scene structure into a linear sequence of HTML pages. Added validation for the new `chapters` and `comments` schema fields.
* **`app.js.tmpl`**: Refactored the rendering engine to support multi-page walkthroughs. Implemented `buildAnnotations` to group findings and comments by file and line number for inline rendering.
* **`data-schema.md`**: Updated the configuration schema to include `chapters` (logical groupings with titles and narratives) and `comments` (human reviewer feedback).

#### UI and Styling

* **`styles.css`**: Added comprehensive styles for the new walkthrough components, including `.chapter-intro`, `.scene-page`, and the `.gloss` annotation system. Replaced the old card-based finding styles with a marginalia-inspired design.
* **File Row Enhancements**: Updated `buildFileRow` to render status dots that reflect the severity and quantity of annotations within a file.
* **Inline Gloss**: Implemented `renderGloss` to display AI findings (severity-colored) and human comments (accent-colored) with author metadata and suggested fixes.

#### Agent Guidance

* **`SKILL.md`**: Updated the AI agent instructions to prioritize "synthesizing chapters." The agent is now directed to group files logically and provide narrative context that explains the *intent* of changes rather than just describing file modifications.
* **`CHANGELOG.md`**: Noted the shift to a verdict-first hierarchy and inline marginalia for improved reviewer triage.
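As an illustration of what "mapping the nested PR/Chapter/Scene structure into a linear sequence of HTML pages" might involve, here is a minimal sketch. It is not the code in `scaffold.py`. The config keys (`number`, `chapters`, `scenes`, `title`) and the page dict layout are assumptions, and each chapter is assumed to get an intro page ahead of its scene pages.

```python
def build_page_order(prs):
    """Flatten PRs -> chapters -> scenes into one ordered list of pages.

    Hypothetical input: a list of PR dicts, each with "number" and an
    optional "chapters" list; each chapter has "title" and optional "scenes".
    """
    pages = []
    for pr in prs:
        for c_idx, chapter in enumerate(pr.get("chapters", []), start=1):
            # Chapter intro page comes first, then its scenes in order.
            pages.append(
                {"kind": "chapter", "pr": pr["number"],
                 "chapter": c_idx, "title": chapter["title"]}
            )
            for s_idx, scene in enumerate(chapter.get("scenes", []), start=1):
                pages.append(
                    {"kind": "scene", "pr": pr["number"], "chapter": c_idx,
                     "scene": s_idx, "title": scene["title"]}
                )
    # Record each page's position so previous/next links can be derived.
    for index, page in enumerate(pages):
        page["index"] = index
    return pages
```

With a flat list like this, previous and next navigation reduces to `pages[i - 1]` and `pages[i + 1]` with bounds checks.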

…en gh_repo

Address two findings from local review of the generating-review-storybook skill:

- fetch_pr_threads.py was on disk but never referenced from SKILL.md, while the docs separately claimed real-PR-thread fetching was a "future integration." Wire it in as optional step 3b, update SKILL.md preamble, and reflect it as a shipping producer in references/data-schema.md.
- gh_repo was the only user-supplied config value flowing into the app.js.tmpl JS template literal without escaping or validation. Add an owner/name regex check in scaffold.load_config so a malformed value is rejected up front, matching the rigor already applied to PR numbers in capture_diffs.py.
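A rough sketch of an owner/name check of the kind described for `gh_repo` follows. The exact pattern used in `scaffold.load_config` is not shown in this PR text, so the regex, the constant `GH_REPO_RE`, and the helper name `validate_gh_repo` are assumptions. The pattern is deliberately strict, so characters such as backticks, `$`, and braces can never reach a JS template literal.

```python
import re

# Assumed pattern: owner is alphanumerics and hyphens; name also allows
# dots and underscores. Exactly one slash separates the two parts.
GH_REPO_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]*/[A-Za-z0-9._-]+")


def validate_gh_repo(value):
    """Return value unchanged if it looks like 'owner/name', else raise."""
    if not isinstance(value, str) or not GH_REPO_RE.fullmatch(value):
        raise ValueError(f"gh_repo must look like 'owner/name', got {value!r}")
    return value
```

Using `fullmatch` rather than `match` is what rejects trailing junk after an otherwise valid prefix.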

SaintPatrck added 4 commits May 4, 2026 17:01

Drop stale role-aware references in SKILL.md step 6 and README skill …

46bbd04

…table

SaintPatrck wants to merge 4 commits into main from feat/generating-review-storybook
