Refactor review_analysis + add 3-system and human-vs-AI overlap analyses by dangng2004 · Pull Request #91 · ChicagoHAI/OpenAIReview

dangng2004 · 2026-05-21T20:54:22Z

Summary

Refactors the review_analysis venn/cluster plumbing into a shared helper module, then adds two new comparison axes on top of it.

Refactor

utils.py — shared load / para_set / regions_{2,3} / draw_venn{2,3} / save_fig; plots now written to plots/ in both PNG and PDF
analysis.py, analysis_gpt_claude.py — refactored to use utils; old top-level venn PNGs deleted (regenerated under plots/)
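
For readers who have not opened utils.py, the two-set region counting and the dual PNG/PDF output paths might look roughly like the sketch below. The names `regions_2_sketch` and `plot_paths_sketch` and the sample paragraph indices are assumptions made for illustration; they are not the actual signatures in this PR.

```python
from pathlib import Path
from typing import Dict, Hashable, List, Set


def regions_2_sketch(a: Set[Hashable], b: Set[Hashable]) -> Dict[str, int]:
    """Return the sizes of the three disjoint regions of a two-set Venn diagram."""
    return {
        "only_a": len(a - b),   # flagged by the first system only
        "only_b": len(b - a),   # flagged by the second system only
        "both": len(a & b),     # flagged by both systems
    }


def plot_paths_sketch(name: str, out_dir: str = "plots") -> List[Path]:
    """Build the PNG and PDF output paths for one figure (nothing is written)."""
    return [Path(out_dir) / f"{name}.{ext}" for ext in ("png", "pdf")]


# Hypothetical paragraph indices flagged by two systems on one paper.
system_a = {1, 4, 7, 9}
system_b = {4, 9, 12}

counts = regions_2_sketch(system_a, system_b)
paths = [p.as_posix() for p in plot_paths_sketch("venn_cp")]
print(counts)
print(paths)
```

The three region sizes always sum to the size of the union, which is a cheap sanity check before drawing.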

New comparisons

analysis_three_systems.py — 3-way paragraph-index overlap of coarse / OpenAIReview / Reviewer 3 on their common ~70-paper cohort
analysis_with_humans.py — overlap between human OpenReview reviewers and the AI-system union; two-pass LLM concern-extraction + paragraph-mapping with on-disk .cache/
cluster_new.py — KMeans clustering for the two new comparisons
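
As a rough illustration of the 3-way comparison, the sketch below first restricts to papers that every system reviewed, then counts the seven disjoint Venn regions of the flagged paragraph indices. The function names, the dictionary layout, and the sample data are assumptions; analysis_three_systems.py may organize this differently.

```python
from typing import Dict, Set


def common_cohort(per_system: Dict[str, Dict[str, Set[int]]]) -> Set[str]:
    """Return the paper ids that appear in every system's output."""
    paper_sets = [set(papers) for papers in per_system.values()]
    return set.intersection(*paper_sets)


def regions_3_sketch(a: Set[int], b: Set[int], c: Set[int]) -> Dict[str, int]:
    """Return the sizes of the seven disjoint regions of a three-set Venn diagram."""
    return {
        "a_only": len(a - b - c),
        "b_only": len(b - a - c),
        "c_only": len(c - a - b),
        "ab_only": len((a & b) - c),
        "ac_only": len((a & c) - b),
        "bc_only": len((b & c) - a),
        "abc": len(a & b & c),
    }


# Hypothetical per-paper flagged paragraph indices for the three systems.
per_system = {
    "coarse": {"p1": {1, 2, 3}, "p2": {5}},
    "openaireview": {"p1": {2, 3, 4}, "p3": {1}},
    "reviewer3": {"p1": {3, 4, 5}, "p2": {5}},
}

cohort = common_cohort(per_system)
p1_regions = regions_3_sketch(
    per_system["coarse"]["p1"],
    per_system["openaireview"]["p1"],
    per_system["reviewer3"]["p1"],
)
print(sorted(cohort))
print(p1_regions)
```

Only p1 survives the cohort filter here, because p2 and p3 are each missing from at least one system.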

Other

.gitignore — cover plots/, .cache/, generated cluster_*.json / per_paper_*.json, and the local frontier_subset_progressive symlink
benchmarks/perturbation/_combine_gpt_claude.py — paper-table helper: combined (GPT-5.5 OR Claude-Opus-4.7) recall on the 24-paper frontier subset for tab:recall-overall in perturbation.tex
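
One plausible reading of "combined (GPT-5.5 OR Claude-Opus-4.7) recall" is that an injected error counts as recalled if either model detects it. The sketch below shows that computation under this assumption; `combined_recall_sketch`, the error ids, and the totals are invented for illustration and are not taken from _combine_gpt_claude.py or the paper table.

```python
from typing import Iterable


def combined_recall_sketch(
    detected_a: Iterable[str], detected_b: Iterable[str], total: int
) -> float:
    """Recall of the union: an item counts if either detector found it."""
    if total <= 0:
        raise ValueError("total must be positive")
    return len(set(detected_a) | set(detected_b)) / total


# Hypothetical ids of injected errors that each model detected.
gpt_hits = {"e1", "e2", "e3"}
claude_hits = {"e3", "e4"}
total_errors = 8  # hypothetical number of injected errors

recall = combined_recall_sketch(gpt_hits, claude_hits, total_errors)
print(f"combined recall: {recall:.3f}")
```

Under this definition the combined recall can never be lower than either model's individual recall.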

Test plan

python analysis.py produces plots/venn_cp.{png,pdf} and plots/venn_all.{png,pdf} without errors
python analysis_three_systems.py produces the 3-way overlap plot
python analysis_with_humans.py produces the human-vs-AI venn (uses cached LLM outputs on rerun)
python _combine_gpt_claude.py from benchmarks/perturbation/ prints combined recall numbers matching the paper table
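
The rerun behavior in the analysis_with_humans.py step depends on the on-disk .cache/. A minimal sketch of that pattern follows, assuming one JSON file per prompt keyed by a SHA-256 hash; the function `cached_call`, the key scheme, and the stand-in `fake_llm` are assumptions, not the PR's actual implementation.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable


def cached_call(prompt: str, compute: Callable[[str], dict], cache_dir: Path) -> dict:
    """Return compute(prompt), reusing a JSON file on disk when one exists."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.json"
    if path.exists():
        # Cache hit: skip the expensive call entirely.
        return json.loads(path.read_text(encoding="utf-8"))
    result = compute(prompt)
    path.write_text(json.dumps(result), encoding="utf-8")
    return result


calls = []


def fake_llm(prompt: str) -> dict:
    """Stand-in for an LLM call; records each invocation."""
    calls.append(prompt)
    return {"concerns": [prompt.upper()]}


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp) / ".cache"
    first = cached_call("weak baseline", fake_llm, cache_dir)
    second = cached_call("weak baseline", fake_llm, cache_dir)

print(first == second, len(calls))
```

A check like this could back the "uses cached LLM outputs on rerun" item: the second call should return the same result without invoking the model again.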

🤖 Generated with Claude Code

* utils.py: extract shared load/para_set/regions/venn helpers; move plots under plots/ in both PNG and PDF
* analysis.py, analysis_gpt_claude.py: refactor to use utils; add docstrings; drop the old top-level venn PNGs (regenerated to plots/)
* analysis_three_systems.py: 3-way paragraph-index overlap of coarse / OpenAIReview / Reviewer 3 on their common 70-paper cohort
* analysis_with_humans.py: overlap between human OpenReview reviewers and the AI-system union; two-pass LLM concern-extraction + paragraph mapping with on-disk .cache/
* cluster_new.py: KMeans clustering for the two new comparisons
* .gitignore: cover plots/, .cache/, generated cluster/per-paper JSONs, and the local frontier_subset_progressive symlink
* _combine_gpt_claude.py: compute combined (GPT-5.5 OR Claude-Opus-4.7) recall on the 24-paper frontier subset for tab:recall-overall

dangng2004 marked this pull request as draft May 21, 2026 20:57

dangng2004 wants to merge 1 commit into main from feat/venn-analyses
