
fix(automerge): narrow infra/ regex; let cron-only changes auto-merge #70

Merged
topcoder1 merged 1 commit into main from claude/narrow-infra-risk-pattern
May 20, 2026

Conversation

@topcoder1
Owner

Summary

Replaces the over-broad ^infra/.* pattern in claude-author-automerge.yml with explicit risky-subcategory rules. Cron-only changes, runbook docs, and example configs under infra/ are now correctly auto-merge eligible; IAM policies, nginx configs, systemd units, Terraform, and deploy scripts stay manual-merge.

Adds selftest/test_automerge_risk_patterns.sh to lock the matching behavior in version control so future regex edits can't silently regress.

Why

The bot's ^infra/.* rule over-classified cron-only changes as manual-merge. Concrete cases triggering this PR:

  • wxa_vpn#439 (rDNS-driven CDN upgrade, Tier B3)
  • wxa_vpn#441 (per-operator CloudWatch telemetry, Tier B5)

Both PRs touched only infra/crontabs/wxa-scanner.crontab under infra/, and both were blocked from auto-merge by the bot's path classifier. Per the wxa_vpn#250 lesson codified in CLAUDE.md (2026-05-04) about over-classification, those PRs are auto-merge territory: "the existing system runs slightly differently for one cycle, git revert and we're back." The bot saying "manual-merge required" was wrong; the policy text in CLAUDE.md says `infra/iam/**`, not all of `infra/`.

What changes

Risky (manual-merge), behavior unchanged: IAM; the deploy, terraform, pulumi, k8s, cloudformation, ansible, digitalocean, and scanner-id subdirs; nginx; and any .service, .slice, .timer, .tf, .hcl, or .sh file anywhere under infra/.

Auto-merge eligible (the historical false positives):

  • `infra/crontabs/*.crontab` — cron schedule changes
  • `infra/*.md` — runbook docs
  • `infra/crontab.example` — example configs
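The split between the two buckets can be sketched as a small path classifier. This is a minimal illustration, not the workflow's actual code: the four patterns are copied from this PR's commit message, while the function names (`is_risky`, `needs_manual_merge`) and the use of Python's `re` module are assumptions made for the example. The real workflow may evaluate the patterns with a different regex engine.

```python
import re

# Patterns as quoted in the commit message of this PR.
RISKY_INFRA_PATTERNS = [
    r"^infra/iam/.*",
    r"^infra/(deploy|terraform|pulumi|k8s|cloudformation|ansible|digitalocean|scanner-id)/.*",
    r"^infra/nginx.*",
    r"^infra/.*\.(service|slice|timer|tf|hcl|sh)$",
]
_COMPILED = [re.compile(p) for p in RISKY_INFRA_PATTERNS]


def is_risky(path: str) -> bool:
    """Hypothetical helper: True if any risky pattern matches the path."""
    return any(rx.search(path) for rx in _COMPILED)


def needs_manual_merge(changed_paths) -> bool:
    """Hypothetical helper: one risky file is enough to block auto-merge."""
    return any(is_risky(p) for p in changed_paths)
```

Under these assumptions, `needs_manual_merge(["infra/crontabs/wxa-scanner.crontab"])` is false, while adding any path under `infra/iam/` to the same change set makes it true.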

Tests

`selftest/test_automerge_risk_patterns.sh` — 33 paths that MUST match (risky) + 11 paths that MUST NOT match (safe). Mirrors the regex block; edit both in lockstep.

The `infra/crontabs/*.crontab` and `infra/aws-scanner-setup.md` cases that motivated this fix are explicitly in the SAFE bucket, so a future broadening of the regex would fail the test instead of silently re-introducing the false positive.
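The two-bucket idea can be illustrated with a short table-driven check. This is a sketch in Python rather than the bash script the PR adds, and it covers only a handful of paths: the SAFE entries are the ones named in this PR, while the RISKY entries and the `check_buckets` helper are hypothetical examples chosen for illustration.

```python
import re


def check_buckets(patterns, risky, safe):
    """Return failure messages; an empty list means every case passed."""
    compiled = [re.compile(p) for p in patterns]

    def matches(path):
        return any(rx.search(path) for rx in compiled)

    failures = [f"RISKY not matched: {p}" for p in risky if not matches(p)]
    failures += [f"SAFE matched: {p}" for p in safe if matches(p)]
    return failures


# Patterns as quoted in the commit message of this PR.
NARROWED = [
    r"^infra/iam/.*",
    r"^infra/(deploy|terraform|pulumi|k8s|cloudformation|ansible|digitalocean|scanner-id)/.*",
    r"^infra/nginx.*",
    r"^infra/.*\.(service|slice|timer|tf|hcl|sh)$",
]

# Hypothetical risky paths, one per rule family.
RISKY = ["infra/iam/trust-policy.json", "infra/terraform/main.tf",
         "infra/nginx.conf", "infra/units/scanner.timer"]
# Safe paths named in the PR description.
SAFE = ["infra/crontabs/wxa-scanner.crontab", "infra/aws-scanner-setup.md",
        "infra/crontab.example"]

ok = check_buckets(NARROWED, RISKY, SAFE)
# Re-adding the old catch-all rule should trip every SAFE case.
regressed = check_buckets(NARROWED + [r"^infra/.*"], RISKY, SAFE)
```

With the narrowed rules `ok` is empty; with the catch-all restored, `regressed` lists each SAFE path as wrongly matched, which is the regression the self-test is meant to catch.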

Test plan

  • `bash selftest/test_automerge_risk_patterns.sh` passes locally (44 cases)
  • After this lands and fleet picks it up: re-test on a Claude-authored PR touching only `infra/crontabs/foo.crontab` — should auto-enable merge

Auto-merge rationale

This PR itself touches `.github/workflows/claude-author-automerge.yml`, which is caught (correctly) by the `^\.github/workflows/.*` rule, so it will need a manual click-merge. Expected and policy-aligned.

🤖 Generated with Claude Code

The broad `^infra/.*` pattern over-classified cron-only changes as manual-
merge. Concrete case: wxa_vpn#439 (rDNS-driven CDN upgrade, Tier B3) and
wxa_vpn#441 (per-operator CloudWatch telemetry, Tier B5) both added a single
weekly/daily cron entry to `infra/crontabs/wxa-scanner.crontab` and were
blocked on auto-merge by the bot's classifier. Per the wxa_vpn#250 lesson
(codified 2026-05-04) about over-classification, those PRs are auto-merge
territory: "the existing system runs slightly differently for one cycle,
git revert and we're back."

Replaces the single `^infra/.*` rule with explicit risky-subcategory rules:
  * ^infra/iam/.*   — IAM trust policies (security-critical)
  * ^infra/(deploy|terraform|pulumi|k8s|cloudformation|ansible|
    digitalocean|scanner-id)/.* — IaC + host config
  * ^infra/nginx.*  — nginx prod config (top-level + subdir form)
  * ^infra/.*\.(service|slice|timer|tf|hcl|sh)$ — systemd units +
    Terraform files + shell scripts anywhere under infra/

What's now correctly auto-merge eligible under infra/:
  * infra/crontabs/*.crontab — cron schedule changes
  * infra/*.md             — runbook docs
  * infra/crontab.example  — example config

What stays manual-merge (genuinely risky):
  * IAM policies, nginx configs, systemd units, Terraform .tf, deploy
    shell scripts, host identity, IaC under infra/

Adds selftest/test_automerge_risk_patterns.sh with 33 RISKY cases that MUST
match and 11 SAFE cases that MUST NOT match. The test mirrors the regex
block from claude-author-automerge.yml — if you edit one, edit the other.
The script keeps the wxa_vpn#439/#441-style false positives (cron files,
runbook docs) explicitly listed in the SAFE bucket so a future broadening
of the pattern fails the test instead of silently regressing.

Refs the 2026-05-04 over-classification lesson + the 2026-05-20 wxa_vpn
incidents (#439, #441) that surfaced the bug.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions bot added the risk:blocked (Risk class: blocked) label on May 20, 2026
@github-actions

Risk class: blocked — manual merge required.

This PR touches one of the blocked path categories from .github/risk-paths.yml (Dockerfiles, docker-compose, .github/workflows/**, **/.env*, **/secrets*, infra/, terraform/, k8s/, or the classifier config itself).

Auto-merge is refused by claude-author-automerge.yml. A maintainer should review the diff and click "Squash and merge" themselves.

(This is a policy notice, not a code-quality failure. The classify job itself does not fail — required CI checks remain authoritative for "is the code green.")

@github-actions

Coverage Floor — mode: enforce

| metric | value |
| --- | --- |
| measured | 100.0% |
| floor (current) | 99.0% |
| target | 100.0% |
| last bumped | 2026-05-12 |

@claude

claude Bot commented May 20, 2026

No issues found. Regex narrowing is deliberate and correct; the 44-case self-test script validates all risky and safe buckets. One minor uncovered gap: infra/docker-compose.yml (and similarly named yaml files directly under infra/) is no longer caught as risky after removing ^infra/.*, since ^docker-compose.*\.ya?ml$ is anchored to the repo root — worth adding to the RISKY test list if docker-compose ever appears under infra/ in the fleet.
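The gap described in this review comment can be reproduced with a few lines of Python. This is a sketch under assumptions: the root-anchored compose pattern is the one quoted in the comment, the infra rules are those listed in the commit message, and `PROPOSED_INFRA_COMPOSE` is a hypothetical extra rule that is not part of the merged change.

```python
import re

# Root-anchored compose rule, as quoted in the review comment.
ROOT_COMPOSE = re.compile(r"^docker-compose.*\.ya?ml$")

# Narrowed infra rules, as quoted in the commit message.
INFRA_RULES = [re.compile(p) for p in (
    r"^infra/iam/.*",
    r"^infra/(deploy|terraform|pulumi|k8s|cloudformation|ansible|digitalocean|scanner-id)/.*",
    r"^infra/nginx.*",
    r"^infra/.*\.(service|slice|timer|tf|hcl|sh)$",
)]

# Hypothetical additional rule that would close the gap; NOT in the merged PR.
PROPOSED_INFRA_COMPOSE = re.compile(r"^infra/docker-compose.*\.ya?ml$")


def caught_by_current_rules(path: str) -> bool:
    """True if the root compose rule or any narrowed infra rule matches."""
    return bool(ROOT_COMPOSE.search(path)) or any(
        rx.search(path) for rx in INFRA_RULES
    )


root_file_caught = caught_by_current_rules("docker-compose.yml")
infra_file_caught = caught_by_current_rules("infra/docker-compose.yml")
infra_file_caught_if_extended = bool(
    PROPOSED_INFRA_COMPOSE.search("infra/docker-compose.yml")
)
```

Here the root-level file is caught, the copy under `infra/` is not, and the hypothetical extra rule would catch it. Whether such a rule is wanted depends on whether compose files ever appear under `infra/` in the fleet, as the comment notes.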

@topcoder1 topcoder1 merged commit 21b354e into main May 20, 2026
13 checks passed
@topcoder1 topcoder1 deleted the claude/narrow-infra-risk-pattern branch May 20, 2026 19:45

Labels

risk:blocked Risk class: blocked


1 participant