From ab66da08a7ae2b66b18f5affd9e3a9b476909622 Mon Sep 17 00:00:00 2001 From: Pieter Viljoen Date: Mon, 29 Jun 2026 20:31:37 -0700 Subject: [PATCH 1/2] fix: address filed configuration and doc defects - configure.sh: drop 2>/dev/null in check_security so gh's error surfaces (matches check_secrets). - WORKFLOW.md D7.2/section 2: reword so it no longer claims reusable tasks *declare* permissions (they run under the caller's least-privilege grant). Closes #226 Co-Authored-By: Claude Opus 4.8 (1M context) --- WORKFLOW.md | 4 +- plans/00-GOTCHA-BRIEFING.md | 136 +++++ plans/INDEX.md | 97 ++++ plans/esphome-nonroot-migration-plan.md | 457 +++++++++++++++++ .../homeassistant-purpleair-migration-plan.md | 423 ++++++++++++++++ plans/nxwitness-migration-plan.md | 466 ++++++++++++++++++ repo-config/configure.sh | 2 +- 7 files changed, 1582 insertions(+), 3 deletions(-) create mode 100644 plans/00-GOTCHA-BRIEFING.md create mode 100644 plans/INDEX.md create mode 100644 plans/esphome-nonroot-migration-plan.md create mode 100644 plans/homeassistant-purpleair-migration-plan.md create mode 100644 plans/nxwitness-migration-plan.md diff --git a/WORKFLOW.md b/WORKFLOW.md index ffe3d58..cd93277 100644 --- a/WORKFLOW.md +++ b/WORKFLOW.md @@ -109,7 +109,7 @@ violate section 4. trigger blocks. `workflow_dispatch` delivers the string `"true"`/`"false"`, so any `if:` consuming it compares both forms: `${{ inputs.foo == true || inputs.foo == 'true' }}`. - **Reusable-workflow permissions.** Job-level `permissions:` are validated before `if:`, so even a - skipped job needs valid permissions declared. Grant least privilege. A callee's extra scope (e.g. + skipped job's declared permissions must be valid. Grant least privilege. A callee's extra scope (e.g. `actions: write` to delete artifacts) is granted by the caller at the `uses:` job. - **Allowlist `success` and `skipped` explicitly** when chaining across an optional dependency. `!= 'failure'` lets `cancelled` through. 
Use `(needs.X.result == 'success' || needs.X.result == @@ -468,7 +468,7 @@ applicable guarantee is not operational (section 1). ref-independent group with `cancel-in-progress: false`. All other entry workflows use the `...-${{ github.ref }}` group with `cancel-in-progress: true`, except the merge-bot (PR-number group, D8.1) and the daily codegen workflow (ref-independent `${{ github.workflow }}` group with `cancel-in-progress: true`, section 2). -- **D7.2 Skipped jobs still need valid permissions.** Output: every reusable job declares valid +- **D7.2 Skipped jobs still need valid permissions.** Output: every reusable job runs under valid least-privilege `permissions:`, and a callee's extra scope is granted by the caller. - **D7.3 Boolean inputs both forms.** Output: boolean inputs are declared in both trigger blocks and compared against `true` and `'true'`. diff --git a/plans/00-GOTCHA-BRIEFING.md b/plans/00-GOTCHA-BRIEFING.md new file mode 100644 index 0000000..f1b1328 --- /dev/null +++ b/plans/00-GOTCHA-BRIEFING.md @@ -0,0 +1,136 @@ +# Branch-scoped CI/CD migration — shared planning briefing + +You are writing a **detailed, review-ready migration + convergence plan** for one ptr727 repo. Four +sibling repos (LanguageTags, Utilities, PlexCleaner, VSCode-Server-DotNetCore) are already migrated to +this model and live. Your plan must **front-load every gotcha below** so nothing is rediscovered after +merge (the maintainer lost a lot of time to post-merge surprises in the last round and explicitly wants +that avoided this round). + +Do **NOT** make code changes. Produce a plan file only. Be concrete: name files, jobs, triggers, and the +exact guards/SHAs. Where the repo's release model genuinely conflicts with the canonical model, FLAG the +tension explicitly with a recommendation rather than papering over it. 
+ +## Canonical references to read first (ground every claim in these, not assumptions) + +- Docker model + terse comments + 5D audit: `/home/pieter/PlexCleaner/` — read `WORKFLOW.md`, + `.github/workflows/*.yml`, `repo-config/configure.sh`, `repo-config/README.md`, `AGENTS.md`, + `CODESTYLE.md`, `.github/copilot-instructions.md`. +- NuGet/.NET model: `/home/pieter/LanguageTags/` (same files). (Only relevant if your repo publishes + NuGet — none of the three remaining do, but the .NET tooling sections still apply to NxWitness.) +- The just-completed Docker-only plan (closest template for a no-NuGet Docker repo): + `/home/pieter/.claude/plans/we-are-investigate-enhancement-peppy-bentley.md`. +- Memory (durable gotchas, read all): `/home/pieter/.claude/projects/-home-pieter-LanguageTags/memory/`. + Especially `branch-scoped-cicd-review-gotchas.md`, `nbgv-publicrelease-githubref-leak.md`, + `docker-publishing-pattern.md`, `branch-scoped-migration-playbook.md`, `github-auto-delete-branch-gotcha.md`, + `copilot-review-flow.md`, `terse-comments.md`. + +## The model (one sentence): one run = one branch. + +- **CI (`test-pull-request.yml`)**: triggers on **push to every branch** (not `pull_request`); runs + validate + the project's smoke/test + a single aggregator job named exactly + `Check pull request workflow status job` (the ruleset's required check, matched by string). Self-testing: + pushing the branch IS the PR check. +- **Publisher (`publish-release.yml`)**: `workflow_dispatch` + `schedule` only. **NO `push` trigger for + Docker repos.** Schedule builds **main only** via `github` context. Dispatch builds `github.ref_name`, + guarded `if: github.ref_name == 'main' || github.ref_name == 'develop'`, passing `ref`/`branch` = + `github.ref_name`. **No branch matrix, no branch switching, no two-phase setup job, no PUBLISH_ON_MERGE.** + Merges never publish. +- **Promotion**: `develop -> main` PR, **merge-commit**, Copilot-reviewed, **no admin bypass**. 
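A minimal sketch of the CI entry workflow described above, offered as an illustration rather than the canonical file. The `push` trigger, the branch-deletion guard, the success-or-skipped allowlist, and the exact aggregator job name come from this briefing. The job id `check-workflow-status`, the `ubuntu-latest` runner, and the existence of a reusable `validate-task.yml` are assumptions.

```yaml
# test-pull-request.yml (sketch): pushing the branch is the PR check.
name: Test pull request

on:
  push:
    branches: ['**']
  workflow_dispatch:

jobs:
  validate:
    # Skip the phantom run that a branch deletion would otherwise fire.
    if: ${{ !github.event.deleted }}
    # Assumed reusable lint/test workflow; substitute the repo's own.
    uses: ./.github/workflows/validate-task.yml

  check-workflow-status:
    # The ruleset matches this name by string, so it must not drift.
    name: Check pull request workflow status job
    if: ${{ always() && !github.event.deleted }}
    needs: [validate]
    runs-on: ubuntu-latest
    steps:
      # Explicit allowlist: anything other than success or skipped fails the gate.
      - name: Fail unless every needed job succeeded or was skipped
        if: ${{ !(needs.validate.result == 'success' || needs.validate.result == 'skipped') }}
        run: exit 1
```

The guard is written inside `${{ }}` because a bare leading `!` is parsed by YAML as a tag indicator. A real workflow would list every needed job in `needs:` and repeat the allowlist per job.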
+ +## Gotchas to bake into the plan (each caused a real post-merge fix last round) + +1. **NBGV GITHUB_REF classification.** NBGV picks prerelease vs stable from `GITHUB_REF`, which is + **read-only** (cannot be overridden in-job — a past attempt was a silent no-op). The one-branch model is + the fix: `github.ref` already matches the built branch, so no `IGNORE_GITHUB_REF` hack is needed (that + override only ever made sense for the old branch-matrix publisher). `version.json` floor + + `publicReleaseRefSpec` `^refs/heads/main$`: main ships clean `X.Y.`, develop ships + `X.Y.-g` prerelease. Bump the version floor to exercise the publish path. +2. **NBGV threading.** Run NBGV **once** per leg in `get-version`/`build-release-task` and **thread the + computed `semver2` down** into the docker/build task. Do **not** add a nested `get-version` inside the + build task — a second NBGV run can reclassify and produce a `:SemVer2` tag collision or a wrong + stable/prerelease tag. +3. **Docker creds in BOTH secret stores.** Docker Hub (and App) credentials must exist in **both** the + Actions **and** Dependabot secret stores: a Dependabot-triggered run is given the Dependabot store, and + that run's push-CI does the Docker smoke/login. The `configure.sh` required-secret lists encode this. +4. **Branch-deletion guard.** Every `push`-triggered workflow job must guard `if: !github.event.deleted` + (and the head jobs) so deleting a branch does not fire a phantom run. +5. **merge-bot `--delete-branch`.** The merge-bot must merge bot PRs with + `gh pr merge --auto --delete-branch`. The repo-wide **auto-delete-on-merge setting stays OFF** (so a + `develop -> main` promotion does not delete `develop`); per-merge deletion is explicit instead. Without + `--delete-branch`, bot branches accumulate. +6. **5D audit (`repo-config/configure.sh`) — use the final hardened canonical form:** + - `jq_lacks`: `jq -e ... 
>/dev/null || rc=$?; case "$rc" in 0) return 1 ;; 1|4) return 0 ;; *) return "$rc" ;; esac` + — exit **4** (no output) is a "lacks" case (NOT just 1); keep **stderr** (only redirect stdout) so a real + jq error (2/3/5) shows its diagnostic. + - `check_secrets`: do **not** swallow gh stderr; an API/auth error **FAILs** the audit (cannot verify = + must fail), distinct from a genuinely missing secret. + - `ruleset_id`: let gh print its own error (no `2>/dev/null`), add a context line, return non-zero; select + the first match **inside jq** (`first // empty`), not `| head -1` (SIGPIPE under pipefail). + - `check_app`: best-effort **note**, never fails the audit (precise check needs app-level auth). + - The audit must **fail when it cannot verify**, never pass by default. +7. **Required-check name lockstep.** The ruleset's required status-check string, the aggregator job name + (`Check pull request workflow status job`), and the ruleset JSON must move in lockstep. The **first + `apply`** against the live repo is what lets a PR on the new workflows go green. Run `apply` in the same + change that ships the workflow edit, then `check`. +8. **Aggregator success/skipped allowlist (D7.4).** The aggregator gate must treat **success OR skipped** + as passing for conditionally-skipped jobs, else a legitimately-skipped job blocks the PR. +9. **EOL discipline.** CRLF for `.md`/`.yml`/`.json`/`.code-workspace`; LF for `.sh`/`Dockerfile`/`.py`. + `file` does NOT report CRLF for JSON — verify with `tr -cd '\r' | wc -c` or `grep -c $'\r'`. Pin these in + `.gitattributes`/`.editorconfig`. +10. **Copilot review loop.** Comments lag the run (wait + buffer). Threads must be **resolved** to merge. + Re-request review via the `requestReviews` GraphQL mutation with bot id `BOT_kgDOCnlnWA` — Copilot does + **not** auto-review every push, so re-request after each new head. 
`gh pr edit --body` fails on the + projects-classic GraphQL error → use `gh api -X PATCH repos/OWNER/REPO/pulls/N -F body=@file`. snupkg / + OIDC / NBGV-prerelease are recurring **false positives** — decline with rationale. Pure-prose/format + nits on a promotion PR: decline (would diverge main from develop). Every PR finds something new — expect + 1–3 rounds and budget for them. +11. **Dependabot + codegen dual-target main AND develop.** Deliberate — it solves the non-linear rebase / + merge-block conflicts that arose with single-target. Keep both targets unless you can prove a simpler + scheme merges develop->main without bypass; the maintainer has already rejected single-target. +12. **Strip template cruft.** Remove `build-datebadge-task.yml`, `publish-docker-readme-task.yml` (fold the + Docker Hub overview into the docker task via a `peter-evans/dockerhub-description` step on a **main** + publish), `dorny/paths-filter`, the `setup`/`PUBLISH_ON_MERGE` machinery, and any `merge-codegen` / + `merge-upstream-version` jobs the repo does not actually use. +13. **Action SHAs.** Use the converged newer pins (e.g. `actions/setup-dotnet` v5.4.0, `actions/checkout` + v7.0.0). **Verify any SHA->version claim against the GitHub API** before asserting it — Copilot has + hallucinated SHA/version mappings; do not trust them. **EXCEPTION — `dotnet/nbgv` is consumed via + `@master`, NEVER SHA-pinned.** The upstream tag stream lags `master` substantially, so a SHA pin draws + spurious Dependabot downgrade PRs; this is a deliberate documented exception (see ESPHome AGENTS.md + "Action pinning"). Do not convert `nbgv@master` to a SHA. Repos are currently split (PlexCleaner / + VSCode-Server / HA-PurpleAir SHA-pin it; ESPHome / NxWitness float `@master`) — converge toward `@master` + and never the other way. Likewise never edit a human-authored rule/rationale comment during a + terse-comment pass (the `@master` rationale comment must survive). +14. 
**Prose rules.** No em-dashes anywhere (hard rule). US English. Terse comments: one line if it fits + ~120 cols, structured, ASCII only; each workflow gets a top-of-file summary comment. Never edit + human-authored comments — only agent-authored ones, to the terse style. + +## Convergence requirement + +The shared sections must converge **byte-for-byte where possible** across all repos: the Comments +subsection, Git/Commit rules, "Where rules live", PR Review Etiquette, Documentation Style Conventions +(incl. the "write docs in the current state" rule), the `repo-config/` ruleset JSON and `configure.sh` +helper bodies, the copilot Review Runbook. Per-repo differences are limited to: project description, +secret names, target-specific D-guarantees, and the publish mechanics. Your plan must call out which +canonical sections port verbatim and which adapt. + +## What the plan file must contain (sections) + +1. **Current-state assessment** — exact triggers, jobs, SHAs, version scheme, branch hygiene, and how the + repo publishes today. Note partial-migration state and any messy branch backlog. +2. **Target architecture** — each workflow file: triggers, jobs, guards, threaded values, the publish + trigger, and the release artifact. Reconcile the repo's release cadence with the one-branch model and + FLAG conflicts. +3. **Release model decision** — the crux for each repo (see per-project note). State the recommended + trigger explicitly and why. +4. **Files to create / edit / delete** — full list. +5. **Convergence + backports** — canonical sections to port verbatim vs adapt; any drift to backport to the + four live repos. +6. **Gotcha checklist** — map each numbered gotcha above to where it applies in THIS repo (or "n/a, why"). +7. **Verification** — static (actionlint/markdownlint/cspell/parse/bash -n/EOL/em-dash sweep), config audit + (`configure.sh check` expected drift then "matches"), and live dispatch verification. +8. 
**Go-live sequence** — PR -> develop -> Copilot dance -> `apply` lockstep -> squash to develop -> + promote develop->main (no bypass) -> dispatch publisher -> verify artifacts -> confirm develop survives. +9. **Open questions for the maintainer** — anything genuinely ambiguous, with your recommended default. + +Write the plan to the path given in your task prompt. Make it thorough enough to execute from without +re-deriving the model. diff --git a/plans/INDEX.md b/plans/INDEX.md new file mode 100644 index 0000000..00976fe --- /dev/null +++ b/plans/INDEX.md @@ -0,0 +1,97 @@ +# Next-round migration plans — review index + +Three execute-ready plans, one per remaining repo, plus a future PyPI project. Each follows the same +9-section structure and maps the shared gotcha checklist ([00-GOTCHA-BRIEFING.md](./00-GOTCHA-BRIEFING.md)) +into repo-specific findings, so the post-merge surprises that cost time last round are surfaced BEFORE any +code changes. + +- [esphome-nonroot-migration-plan.md](./esphome-nonroot-migration-plan.md) — Docker, upstream-pin check +- [homeassistant-purpleair-migration-plan.md](./homeassistant-purpleair-migration-plan.md) — Python/HACS zip +- [nxwitness-migration-plan.md](./nxwitness-migration-plan.md) — .NET codegen + multi-target Docker + +> This INDEX is the **authoritative aligned decision record** (updated 2026-06-29 after maintainer review). +> Where an individual plan's prose predates these decisions, this file wins; the plans' release-model +> sections have been patched to match. + +## Release-model playbooks (the framing) + +We are converging on **release-model playbooks**: a WORKFLOW set defined per *release model*, the same way +CODESTYLE defines rules per *language*. Same-model repos share workflow definitions; a fix to one backports +to its model-siblings. A model can have sub-models (e.g. Docker splits by whether an external update trigger +exists). 
Current map: + +| Release model | Publish trigger | Repos | +|---|---|---| +| NuGet push-publish (+ `Directory.Packages.props` as shipped input) | push-on-dep-change + dispatch | LanguageTags, Utilities | +| Native binary + multi-arch Docker | dispatch / on-demand | PlexCleaner | +| **Docker — vanilla** (no external trigger) | weekly schedule(main) + dispatch | VSCode-Server | +| **Docker — triggered** (daily external signal) | weekly schedule(main) + **push-on-pin/matrix-change(main)** + dispatch | ESPHome-NonRoot, NxWitness | +| HACS/Python — manual release + upstream-retest tripwire | dispatch only (schedule retests, never publishes) | HA-PurpleAir | +| PyPI (the "never done" one) | TBD | *future repo — which one?* | + +**Capstone deliverable (after all migrations):** per-workflow flow diagrams in text notation (Mermaid) + +rendered PNG, showing entry points, triggers, outputs, decisions, and branching — making common-vs-unique +obvious across models and sub-models. + +## Maintainer clarifications (incorporated) + +- **`upstream-version.json` / `Matrix.json` are maintained by the repo's own daily scheduled job** (the + upstream-version check for ESPHome; codegen for NxWitness), which checks upstream and records the + last-released versions. NOT dependabot-updated. This is the repo's self-owned upstream-state pin. +- **The daily detection signal is 100% certain** an update is required; vanilla Docker has no such signal so + it can only assume a weekly apt/base change. Hence: triggered repos publish on the signal AND weekly; + vanilla publishes weekly only. +- **Dependabot (and codegen) dual-target main AND develop on every repo** — purely to avoid merge drift. + develop's pin/matrix/dep updates are sync-only and never publish. +- **HA monitoring is a breakage tripwire**: the HA-version monitor bumps the test matrix so a breaking + upstream change FAILS the PR build; a human fixes it and releases manually. That is why HA publish is + dispatch-only. 
+ +## Release-trigger decision per repo (signed off) + +- **ESPHome-NonRoot:** weekly `schedule` on main (baseline apt/base CVE refresh) **+ path-scoped `push` on + main when `upstream-version.json` changes** (the daily upstream-check commits a real update -> publish + now) **+ `workflow_dispatch`**. The daily upstream-check workflow stays as the detection mechanism. Cheap + single image, so publish-on-trigger is clearly worth it. Drops PUBLISH_ON_MERGE; ordinary code merges do + not touch the pin so they do not publish. +- **NxWitness:** weekly `schedule` on main (builds the full product matrix, baseline refresh + release) + **+ path-scoped `push` on main when `Matrix.json` changes** (codegen commits a new matrix -> publish now) + **+ `workflow_dispatch`**. **Supersedes the earlier weekly-only decision** — publish on matrix change is + accepted despite the full-matrix build cost. Schedule is **main-only** (the earlier "paired develop + dispatch to refresh :develop" is dropped; `:develop` builds on manual dispatch only). Codegen runs daily, + dual-target main+develop. +- **HomeAssistant-PurpleAir:** **dispatch-only** publish; `schedule` retests main only and **never + publishes**. The HA-version monitor updates the test matrix to trip on breaking changes (fail the PR -> + human fix -> manual release). Drops the current `push:[develop]` auto-prerelease that violates the "merges + never publish" invariant the docs already claim. + +## Cross-repo recurring findings (same root causes as last round — now pre-empted) + +1. **Nested NBGV in `build-docker-task.yml`** (ESPHome, NxWitness) — second NBGV run risks `:SemVer2` tag + collision / misclassification. Fix: thread one `semver2` down, delete the nested `get-version`. +2. **merge-bot missing `--delete-branch`** (all three) — bot branches accumulate; auto-delete-setting stays + OFF to protect develop on promotion. +3. 
**No `repo-config/` 5D audit** (ESPHome, NxWitness; HA also lacks it) — created from the canonical + hardened `configure.sh` + ruleset JSON. +4. **Required-check name mismatch** (NxWitness aggregator is `...status` missing trailing ` job`; HA + similar) — must move in lockstep with the first `apply` or PRs never go green. +5. **CI still on `pull_request` + `dorny/paths-filter`** (NxWitness) — move to `push: ['**']`, drop the + filter, add the `!github.event.deleted` guard. (Note: the publisher's path-scoped `push` is separate and + intentional — it lives in `publish-release.yml`, not CI.) +6. **Messy branch backlog** (all three; HA worst at 40+) — prune superseded branches after verifying they + are merged/abandoned. +7. **Version-floor regressions to set first** — HA main is 0.3.0 but NBGV base would ship 0.1.x (bump base + to >=0.3 first); ESPHome/NxWitness bump the floor to exercise the publish path. + +## Suggested execution order (simplest -> hardest) + +1. **ESPHome-NonRoot** — closest to the converged Docker reference; smallest delta; first to prove the + triggered-Docker sub-model (weekly + push-on-pin). +2. **HomeAssistant-PurpleAir** — develop already ~90% converged; mostly main-catch-up + Python validate + mapping + branch cleanup; the version regression needs deciding first. +3. **NxWitness** — most complex (codegen + multi-target matrix + push-on-matrix publish); do last with the + pattern fresh. +4. **PyPI project** (future) — identify the repo and plan once the above land. + +All SHA->version claims in the plans were verified against the GitHub API (no hallucinated pins). No code +was changed — these are plans only. 
diff --git a/plans/esphome-nonroot-migration-plan.md b/plans/esphome-nonroot-migration-plan.md new file mode 100644 index 0000000..31c799c --- /dev/null +++ b/plans/esphome-nonroot-migration-plan.md @@ -0,0 +1,457 @@ +# Migrate ESPHome-NonRoot to branch-scoped CI/CD + converge + +Target: `ptr727/ESPHome-NonRoot` (`/home/pieter/ESPHome-NonRoot`). Docker-only image: layers a non-root +ESPHome + `esphome-device-builder` dashboard onto `python:3.14-slim`, ships one multi-arch image to +`docker.io/ptr727/esphome-nonroot`. No buildable app, no NuGet, no tests. Unique trait vs the other Docker +repos: it **tracks an upstream PyPI release** (`esphome` + `device_builder`) via a daily tracker that writes +`upstream-version.json` and opens auto-merged bump PRs. NBGV (`version.json`, floor `1.7`) still drives the +GitHub-release tag and `LABEL_VERSION`. + +Canonical references read: PlexCleaner (`/home/pieter/PlexCleaner`) WORKFLOW.md + workflows + repo-config +(the converged Docker model, **one branch per run, no matrix**), the prior Docker-only plan +(`we-are-investigate-enhancement-peppy-bentley.md`, VSCode-Server), and memory (nbgv-githubref-leak, +docker-publishing-pattern, branch-scoped-cicd-review-gotchas, migration-playbook, auto-delete gotcha). +**Do not change code from this plan; this is a plan only.** All converged action SHAs were verified MATCH +against the GitHub API (setup-dotnet v5.4.0, checkout v7.0.0, nbgv v0.5.2, build-push v7.2.0). + +--- + +## 1. Current-state assessment + +ESPHome-NonRoot is on the **older ProjectTemplate two-phase model** - further behind than PlexCleaner was, +and it carries the upstream-tracker layer the pure-Dockerfile VSCode-Server repo did not. 
+ +**Workflows present (`.github/workflows/`):** + +| file | trigger | shape | +|---|---|---| +| `publish-release.yml` | `push: [main, develop]` + `workflow_dispatch` + `schedule` (Mon 02:00) | two-phase: `setup` plan job reading `vars.PUBLISH_ON_MERGE`, then `[main,develop]` **matrix** `publish`, plus `date-badge`, `docker-readme`, `cleanup-artifacts` jobs | +| `test-pull-request.yml` | `pull_request: [main, develop]` + dispatch | `dorny/paths-filter` `changes` -> `smoke-build` -> aggregator `Check pull request workflow status` (**no ` job` suffix**) -> `cleanup-artifacts` | +| `build-release-task.yml` | `workflow_call` | `get-version` -> `build-docker` (gated `enable_docker`) -> `github-release`; **nested `get-version` re-run inside `build-docker-task`** | +| `build-docker-task.yml` | `workflow_call` | **has its own nested `get-version` job**; reads `upstream-version.json`; login gated on `push` (not on smoke); tags `latest`/`develop` + pinned esphome version | +| `get-version-task.yml` | `workflow_call` | NBGV; `setup-dotnet` **v5.3.0** (old SHA); **`nbgv@master`** (floated, not pinned) | +| `check-upstream-version.yml` | `schedule` (daily 05:00) + dispatch | entry-point; resolver curls PyPI for `esphome` + `esphome-device-builder` | +| `check-upstream-version-task.yml` | `workflow_call` | generic tracker: matrix over `[main, develop]`, App-signed `create-pull-request`, writes `upstream-version.json` (CRLF) | +| `merge-bot-pull-request.yml` | `pull_request_target` | `merge-dependabot` + **`merge-upstream-version`** + `disable-auto-merge`; **merges WITHOUT `--delete-branch`** | +| `build-datebadge-task.yml` | `workflow_call` | BYOB date badge (template cruft) | +| `publish-docker-readme-task.yml` | `workflow_call` | full generic docker-readme task (template cruft - fold into docker task) | + +**Drift / debt vs canonical:** + +- **Two-phase machinery present:** `setup` plan job, `vars.PUBLISH_ON_MERGE`, `push` publish trigger, the + `[main,develop]` **branch 
matrix** in `publish` (the cross-branch NBGV-ref leak class, per + `nbgv-publicrelease-githubref-leak`). Canonical is now **one branch per run, no matrix, no setup job**. +- **`pull_request` CI (not `push`):** PRs from forks satisfy the check, but a workflow-edit PR does not test + its own copy, and there is no branch-deletion guard (none needed under `pull_request`, but needed after + the switch to `push`). +- **`dorny/paths-filter`** gates the smoke build (strip per gotcha 12; always smoke, buildcache keeps fast). +- **Nested NBGV:** both `build-release-task` and `build-docker-task` run `get-version` - a double NBGV run, + the exact gotcha-2 collision risk. Must thread `SemVer2` instead. +- **Aggregator name `Check pull request workflow status`** lacks the canonical ` job` suffix - the + required-check string must move to `Check pull request workflow status job` in lockstep with the ruleset. +- **No `repo-config/`** at all (no ruleset JSON, no `configure.sh`, no `settings.json`). The live ruleset + (if any) is hand-managed. This is the biggest missing piece. +- **No `WORKFLOW.md`**, **no `cspell.json`** (words live in the `.code-workspace`, ~60+ entries). +- **Old/floated SHAs:** `setup-dotnet` v5.3.0 (-> v5.4.0 `26b0ec1...`), `nbgv@master` (-> v0.5.2 + `705dad1...`). All Docker action SHAs already match canonical (qemu v4.1.0, buildx v4.1.0, login v4.2.0, + build-push v7.2.0) - verified. +- **`docker-readme` is a separate task** + a `date-badge` task - both template cruft to strip. +- **CODESTYLE.md carries large `.NET` and `Python` sections** that are inert (no .cs, no buildable .py - the + Python lives only inside the Dockerfile's uv venv). Per the repo's own "carry whole, don't trim" rule it + kept them; the converged Docker repos drop them. **Flag for maintainer** (see open questions). + +**Versioning:** NBGV, `version.json` floor `1.7`, `publicReleaseRefSpec ^refs/heads/main$`. 
Already correct +shape; floor bump to `1.8` to exercise the publish path (HISTORY tops out at 1.7). + +**Branch hygiene / backlog (messy - clean before go-live):** `main` and `develop` have **diverged +substantially** - `git diff --stat origin/main origin/develop` = 12 files / ~1034 insertions (develop is +well ahead: `.editorconfig` +176, `CODESTYLE.md` +463 new, `Docker/Dockerfile` rewritten ~334 lines, +`merge-bot`, `publish-release`, `publish-docker-readme` all changed). Stale remote branches to triage/delete: +`chore/sync-template`, `feature/sync-versioned-rulesets`, `fix-devcontainer-venv-path`, +`fix/python-314-doc-refs`, `reconverge-upstream-tracker`, `resync-copilot-runbook-178`, +`resync-template-pr167`, `resync/projecttemplate-pr184`, `resync/projecttemplate-pr190`, `shields`, +`support-device-builder`, plus live tracker heads `upstream-version-main` / `upstream-version-develop` and +4 dependabot heads. **The migration must land on `develop` (the ahead branch), then promote develop->main** +- a promotion here also resolves the large develop/main content drift, so expect a substantive promotion PR. + +**How a new upstream version becomes a published image today (the crux, traced):** + +1. `check-upstream-version.yml` runs daily (05:00 UTC) + on dispatch; resolver curls PyPI for the latest + `esphome` and `esphome-device-builder`, prints `{esphome, device_builder}`. +2. `check-upstream-version-task.yml` (matrix `[main, develop]`) rewrites `upstream-version.json` and opens an + App-signed `upstream-version-` bump PR per branch. +3. `merge-bot-pull-request.yml`'s `merge-upstream-version` auto-merges each (squash to develop, merge to main). +4. On main, the merge is a `push` -> **today** `publish-release.yml` publishes **only if `PUBLISH_ON_MERGE` + is `true`** (the maintainer's "releases on a dependabot/bump PR"); otherwise it waits for the weekly + schedule. `build-docker-task` reads `upstream-version.json` for the image tag + build-args. 
+ +So `upstream-version.json` is **a committed build input**, read at build time - directly analogous to +`Directory.Packages.props` for the NuGet repos. The bump PR keeps it current; *something* then has to build. + +--- + +## 2. Target architecture + +Port PlexCleaner's converged Docker model **verbatim in shape**, dropping the executable target, and keep +ESPHome's two repo-specific leaves: `build-docker-task` (reads `upstream-version.json`, 3 build-args) and the +upstream-version tracker. **One branch per run, no matrix, no setup job, no PUBLISH_ON_MERGE.** + +**`publish-release.yml`** (rewrite to the triggered-Docker one-branch model): +- Triggers: `workflow_dispatch` + `schedule` (`0 2 * * MON`, weekly baseline) **+ path-scoped `push` on + main when the upstream pin changes** (`push: { branches: [main], paths: [upstream-version.json] }`). The + daily upstream-check commits a real update to the pin -> this push publishes immediately; the weekly + schedule covers base/apt rot when nothing upstream changed. Ordinary code merges do not touch the pin, so + "merges never publish" still holds (a Dockerfile/README change ships on the next weekly run or a manual + dispatch - see open questions for whether to widen the path set). Global concurrency group + `${{ github.workflow }}`, `cancel-in-progress: false`. +- Single `publish` job, `if: github.ref_name == 'main' || github.ref_name == 'develop'`, calls + `build-release-task.yml` with `ref: github.ref_name`, `branch: github.ref_name`, `smoke: false`, + `github: true`, `dockerhub: true`. **Delete** `setup`, `date-badge`, `docker-readme`, and the run-level + `cleanup-artifacts` jobs (the Docker push uploads no artifact; nothing to clean - matches PlexCleaner which + has none). Schedule runs main only (`github.ref` = default branch); dispatch publishes its own ref. 
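The triggered-Docker publisher described above could look roughly like the following. This is a hedged sketch, not the final workflow. The triggers, path filter, concurrency group, guard, and `with:` inputs are taken from the bullets above. `secrets: inherit` and the workflow `name` are assumptions.

```yaml
# publish-release.yml (sketch): weekly baseline, push-on-pin on main, manual dispatch.
name: Publish release

on:
  workflow_dispatch:
  schedule:
    # Weekly baseline; scheduled runs use the default branch (main).
    - cron: '0 2 * * MON'
  push:
    # Publish only when the upstream pin changes on main.
    branches: [main]
    paths: [upstream-version.json]

concurrency:
  # Ref-independent group; queue rather than cancel an in-flight publish.
  group: ${{ github.workflow }}
  cancel-in-progress: false

jobs:
  publish:
    # A dispatch from any other ref is skipped.
    if: github.ref_name == 'main' || github.ref_name == 'develop'
    uses: ./.github/workflows/build-release-task.yml
    # Assumption: the callee reads Docker Hub credentials from inherited secrets.
    secrets: inherit
    with:
      ref: ${{ github.ref_name }}
      branch: ${{ github.ref_name }}
      smoke: false
      github: true
      dockerhub: true
```

Because the `push` trigger is scoped to the pin file, an ordinary code merge to main does not match it and therefore does not publish.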
+ +**`build-release-task.yml`** (rewrite to thread NBGV, drop nested get-version): +- `validate` (gated `!smoke`, calls `validate-task`) + `get-version` (single NBGV) -> `build-docker` + (gated `!cancelled() && get-version success && (validate success || skipped)`) -> `github-release`. +- `build-docker` is passed `ref: GitCommitId`, `branch`, `smoke`, `push: dockerhub && !smoke`, and the + threaded `semver2` (+ the assembly versions, even though only `semver2` is consumed - keep the converged + input set for byte-convergence). **Remove `enable_docker`** (always build the one target). +- `github-release`: unchanged canonical shape - main-only prerelease backstop, download + `release-asset--*` (matches nothing here, succeeds), no-op-if-tag-exists guard, dispatch refreshes, + `target_commitish: GitCommitId`, `prerelease: branch != 'main'`, files `LICENSE` + `README.md`. **Keep + `fail_on_unmatched_files` OFF** (or omit) - the Docker target ships no asset and the glob legitimately + matches zero; PlexCleaner sets it true *because* its executable target must upload the 7z. Here it would + red a clean Docker-only release. (This is a real per-repo divergence from PlexCleaner - flag in WORKFLOW.md.) + +**`build-docker-task.yml`** (trim + thread, keep the upstream-version read): +- Inputs: `push`, `ref`, `branch` (required), `smoke`, **`semver2` (required, threaded)**. **Delete the + nested `get-version` job** (gotcha 2) - consume `inputs.semver2`. +- Keep the `Get pinned versions step` reading `.esphome` / `.device_builder` from `upstream-version.json`. +- **Login on every build (incl. smoke)** like canonical (higher pull/cache rate limits; forks can't push) - + change from today's `if: push`. This is what makes the Dependabot-store creds gotcha (3) load-bearing. +- Tags: `docker.io/ptr727/esphome-nonroot:${branch=='main' ? 'latest':'develop'}` + (main only) the pinned + `:` tag. 
**Add a `:` tag** to match canonical (every image carries its release + version) - decide with maintainer whether to keep BOTH the esphome-version tag and the SemVer2 tag (see + open questions). Branch-scoped buildcache (read both, write this branch on push), `mode=max`, + `ignore-error=true`. Build-args `LABEL_VERSION=semver2`, `ESPHOME_VERSION`, `DEVICE_BUILDER_VERSION`. +- **Fold the Docker Hub overview in:** add a `peter-evans/dockerhub-description@v5.0.0` step gated + `if: inputs.push && inputs.branch == 'main'`, `repository: ptr727/esphome-nonroot`, + `readme-filepath: ./Docker/README.md` (already exists, 28 lines). Then **delete** + `publish-docker-readme-task.yml`. + +**`get-version-task.yml`**: bump `setup-dotnet` to v5.4.0 (`26b0ec14...`); **keep `nbgv@master`** (documented no-SHA-pin exception - do NOT pin it). Otherwise canonical. + +**`validate-task.yml`** (NEW - lint-only, no unit-test): markdownlint (`**/*.md`) + cspell (README.md + +HISTORY.md) + actionlint, on `setup`-free runners. **No `unit-test`, no CSharpier/`dotnet format`** (no C#); +**no Python lint** (the Python is inside the Dockerfile only, exercised by the smoke build). Reused by CI and +each publish leg so the gates are identical. (PlexCleaner's `validate-task` has unit-test + C# lint; this is +the documented per-repo adapt.) + +**`test-pull-request.yml`** (rewrite to push-CI canonical): +- `on: push: branches: ['**']` (not tags) + `workflow_dispatch`. **Drop `pull_request` and + `dorny/paths-filter`.** +- `validate` (`if: !github.event.deleted`) + `smoke-build` (`build-release-task` with `smoke:true`, + `github:false`, `dockerhub:false`, `branch: github.ref_name`, `if: !github.event.deleted`) + + aggregator **`Check pull request workflow status job`** (`if: always() && !github.event.deleted`, fails + unless every need is `success`). **Drop the terminal `cleanup-artifacts`** (smoke uploads nothing).
+- Document the fork-PR exception (a fork PR produces no push -> no required check; a maintainer lands it on + an in-repo branch first), same wording as PlexCleaner. + +**`check-upstream-version.yml` + `check-upstream-version-task.yml`** (KEEP, light touch): +- These already match the canonical multi-key tracker shape (App-signed CRLF-writing PR, `[main,develop]` + matrix, resolver inputs). Keep both. Only verify: action SHAs (`create-github-app-token` v3.2.0, + `checkout` v7.0.0, `create-pull-request` v8.1.1 - already current), terse-comment conformance, and that + the head-ref prefix `upstream-version` still matches the merge-bot's `merge-upstream-version` job refs. +- **The tracker's daily cadence is the staleness floor** for `upstream-version.json` currency; the *publish* + cadence (section 3) is what turns a current pin into a pushed image. + +**`merge-bot-pull-request.yml`** (converge, add `--delete-branch`): +- Keep `merge-dependabot`, `merge-upstream-version`, `disable-auto-merge-on-maintainer-push`. +- **Add `--delete-branch`** to both auto-merge calls (gotcha 5) - currently missing; bot/upstream branches + accumulate without it. Repo-wide auto-delete stays OFF (settings.json) so develop survives promotion. +- The `disable-auto-merge` job already lists both bot logins (`dependabot[bot]`, `ptr727-codegen[bot]`) - + keep (PlexCleaner only has dependabot, since it has no codegen/tracker bot; this is a justified per-repo + superset, not drift to "fix"). +- Converge comments to terse canonical. + +**Release artifact:** a GitHub release **as version anchor** (tag on the built commit + `LICENSE` + +`README.md`, generated notes, no binary asset) plus the multi-arch Docker image on Docker Hub +(`latest`/`develop` + version tags) + the Docker Hub overview on a main publish. 
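
As a closing sketch for this section, the push-CI rewrite of `test-pull-request.yml` described earlier (deletion guards plus a success-only aggregator) might look like the fragment below. It is illustrative only: the job keys, the concurrency block, the `ref` input, and the `jq`-based aggregator step are assumptions about one way to implement "fails unless every need is `success`", not the canonical body.

```yaml
# Sketch only: push-CI with branch-deletion guards and a success-only aggregator.
on:
  push:
    branches: ['**']  # branch pushes only, so tag pushes do not trigger CI
  workflow_dispatch:

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  validate:
    # Wrapped in ${{ }} because a bare leading "!" is a YAML tag indicator.
    if: ${{ !github.event.deleted }}
    uses: ./.github/workflows/validate-task.yml

  smoke-build:
    if: ${{ !github.event.deleted }}
    uses: ./.github/workflows/build-release-task.yml
    with:
      ref: ${{ github.ref_name }}  # assumed: callee takes the same ref input as on publish
      branch: ${{ github.ref_name }}
      smoke: true
      github: false
      dockerhub: false

  check-workflow-status:  # hypothetical job key
    # The display name is the required-check string; it must match the rulesets.
    name: Check pull request workflow status job
    if: ${{ always() && !github.event.deleted }}
    needs: [validate, smoke-build]
    runs-on: ubuntu-latest
    steps:
      - name: Require success from every need
        env:
          NEEDS: ${{ toJSON(needs) }}
        run: |
          # jq -e exits non-zero unless every needed job reported "success".
          echo "$NEEDS" | jq -e 'all(.[]; .result == "success")'
```

Requiring `success` explicitly, rather than testing `!= 'failure'`, keeps a `cancelled` need from passing the required check.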
+ +### RESOLVED: release cadence vs the one-branch model (signed off 2026-06-29) + +ESPHome is a **triggered Docker** repo: the daily upstream-check gives a 100%-certain "update required" +signal. Agreed model: **publish on the update trigger AND weekly** - a path-scoped `push` on main when +`upstream-version.json` changes publishes the new upstream immediately, and the weekly schedule refreshes the +base/apt layer when nothing upstream changed. This is NOT a merge-publish of arbitrary code (only the pin +file change publishes), so the "merges never publish" invariant and the one-branch NBGV correctness both +hold. See section 3 for the full rationale. + +--- + +## 3. Release model decision (signed off 2026-06-29) + +**Decision: triggered-Docker model - publish on `weekly schedule (main)` + `push-on-upstream-version.json-change (main)` + `workflow_dispatch`.** The daily upstream-check stays as the +detection mechanism (it owns the pin); a real upstream bump it commits to main publishes immediately via the +path-scoped push, and the weekly schedule refreshes the base/apt layer when nothing upstream changed. + +**Why this shape:** + +- **Best of both, no merge-publish.** ESPHome's daily check is a 100%-certain "update required" signal, so + unlike vanilla Docker (VSCode-Server, which can only assume weekly apt rot) we publish the instant the pin + changes - ~immediate upstream response - and still publish weekly for base CVEs. Only the pin file change + publishes; arbitrary code merges do not (path filter), so the "merges never publish" invariant holds. +- **It is the `Directory.Packages.props` pattern.** The NuGet repos already publish on a push that touches + their shipped dependency input; `upstream-version.json` is the exact Docker analog. This is a converged + pattern, not a new one - and it is shared with NxWitness (push-on-`Matrix.json`-change), defining the + **triggered-Docker sub-model**. 
+- **One-branch correctness preserved.** Every publish run is single-branch (`github.ref` == built branch), + so NBGV classifies natively with no matrix, no `IGNORE_GITHUB_REF`, no cross-branch leak + (`nbgv-publicrelease-githubref-leak`); `develop->main` still promotes via a normal Copilot-reviewed PR with + no admin bypass. The push trigger is branch-filtered to `main`, so develop's daily pin update is sync-only + (drift-avoidance) and never publishes. + +**Staleness window:** new-upstream exposure is ~the daily-check interval (publish fires on the pin commit), +not a week; the weekly schedule only bounds *base-image* rot to <=7 days. Manual `gh workflow run +publish-release.yml` remains a zero-wait escape hatch. + +**Open sub-decision (see section 9):** whether the publish `paths` filter should also include `Docker/**` +(so a Dockerfile change to main publishes too) or stay pin-only (Dockerfile/code changes wait for the weekly +run or a dispatch). Default recommended: **pin-only**, to keep "merges never publish" literal. + +--- + +## 4. Files to create / edit / delete + +**Create (8):** +- `WORKFLOW.md` - port PlexCleaner's, Docker-only + upstream-tracker variant (see section 5). +- `.github/workflows/validate-task.yml` - lint-only (markdownlint + cspell + actionlint). +- `repo-config/ruleset-develop.json` - byte-identical to canonical. +- `repo-config/ruleset-main.json` - byte-identical to canonical. +- `repo-config/settings.json` - byte-identical to canonical. +- `repo-config/configure.sh` - canonical helper bodies; secrets `DOCKER_HUB_USERNAME`, + `DOCKER_HUB_ACCESS_TOKEN`, `CODEGEN_APP_CLIENT_ID`, `CODEGEN_APP_PRIVATE_KEY` in **both** stores; Docker + Hub repo string `ptr727/esphome-nonroot`. +- `repo-config/README.md` - canonical, adapted repo name / Docker Hub repo. +- `cspell.json` - migrate the `.code-workspace` `cSpell.words` (~60+ entries) + any README/HISTORY words. 
+ +**Edit (10):** +- `.github/workflows/publish-release.yml` - rewrite to one-branch publisher (drop setup/matrix/push/ + PUBLISH_ON_MERGE/date-badge/docker-readme/cleanup jobs). +- `.github/workflows/build-release-task.yml` - add `validate` + thread NBGV; drop `enable_docker`; keep + github-release (no `fail_on_unmatched_files`). +- `.github/workflows/build-docker-task.yml` - drop nested `get-version`; consume `semver2`; login always; + add `:SemVer2` tag; fold in dockerhub-description step (main publish). +- `.github/workflows/get-version-task.yml` - setup-dotnet v5.4.0; **keep `nbgv@master`** (do NOT SHA-pin; documented exception). +- `.github/workflows/test-pull-request.yml` - push-CI, drop pull_request + dorny + cleanup; deletion guards; + aggregator rename to `...status job`. +- `.github/workflows/merge-bot-pull-request.yml` - add `--delete-branch`; terse comments. +- `.github/workflows/check-upstream-version.yml` / `-task.yml` - terse-comment + SHA conformance only + (no behavior change). +- `.github/dependabot.yml` - keep dual-target github-actions + docker (already correct); de-template + comments only. +- `AGENTS.md` - rewrite Release Model (one-branch publisher + upstream-tracker), strip two-phase / + PUBLISH_ON_MERGE / date-badge / docker-readme / paths-filter framing, point to `WORKFLOW.md`; add + "Shared Configuration and Tooling" + "Write docs in the current state" + repo-config pointer; converge the + shared sections (Comments, Git, Where rules live, PR Review Etiquette, Doc Style) byte-for-byte. +- `CODESTYLE.md` - **see open question** (drop inert .NET + Python sections to converge, OR keep per the + repo's "carry whole" rule). `.github/copilot-instructions.md` - confirm byte-identical Runbook + repo + placeholders (likely already current; edit only if drift). +- `version.json` - floor `1.7` -> `1.8`. `HISTORY.md` + `README.md` - add the 1.8 "CI/CD rework" entry. 
+- `ESPHome-NonRoot.code-workspace` / `.editorconfig` / `.gitattributes` - remove `cSpell.words` (moved to + cspell.json), de-template comments, keep LF pins for `.sh`/`Dockerfile`/entrypoint scripts. + +**Delete (2):** +- `.github/workflows/build-datebadge-task.yml` +- `.github/workflows/publish-docker-readme-task.yml` + +(Net: ~8 create, ~14 edit, 2 delete. `Docker/Dockerfile`, `Docker/Compose.yml`, `Docker/entrypoint/*`, +`Docker/README.md`, `.devcontainer/*`, `LICENSE`, `.markdownlint-cli2.jsonc`, `.dockerignore`, `.gitignore`, +`.vscode/*` unchanged.) + +--- + +## 5. Convergence + backports + +**Port verbatim (byte-for-byte) from PlexCleaner:** +- `repo-config/ruleset-develop.json`, `ruleset-main.json`, `settings.json` (only the integration_id 15368 + and check string are shared constants - identical). +- `repo-config/configure.sh` helper bodies: `jq_lacks` (exit-4 + stderr handling), `check_secrets` + (API-error-FAILs), `ruleset_id` (`first // empty`, no `2>/dev/null`), `check_app` (note-only), + `check_ruleset` / `check_settings` / `check_security`. Only `REQUIRED_*_SECRETS`, the Docker Hub repo + string, and `REPO` differ. +- AGENTS shared sections: **Comments** subsection, **Git and Commit Rules**, **Where rules live / + Shared Configuration and Tooling**, **PR Review Etiquette** (Merge Gate, Expected Review Loop, Triaging, + Responding, Escalating), **Documentation Style Conventions** incl. the **"Write docs in the current + state"** rule, **Workflow YAML Conventions** pointer. +- `.github/copilot-instructions.md` GitHub Copilot Review Runbook (placeholders `//` only). +- WORKFLOW.md section skeleton (0 model + glossary, 1-2 style, 3 architecture, 4 D0-D10, 5 methodology incl. + 5D audit, 6 config). 
+ +**Adapt (per-repo):** +- WORKFLOW.md D4 (release/publish): **single Docker target + GitHub release as version anchor** (no + executable/7z seam); add a **D-guarantee for the upstream-version tracker** (daily PyPI resolve -> + App-signed bump PR -> auto-merge -> consumed by the next scheduled/dispatch publish) - the one genuine + ESPHome-specific contract the siblings lack. Note the `fail_on_unmatched_files: false` divergence and why. +- `validate-task` is lint-only (no unit-test / C# / Python lint) - the documented adapt. +- WORKFLOW.md "Self-sufficiency": ESPHome **has** an upstream-version tracker (PlexCleaner's says "no + codegen and no upstream-version tracker") - invert that sentence. +- `merge-bot` carries `merge-upstream-version` + a second bot login (justified superset). + +**Backports to the four live repos (drift found):** +- None *new* surfaced beyond what the VSCode-Server plan already lists (the "Write docs in current state" + rule missing from Utilities + LanguageTags; terse-comment alignment of `validate-task` / `merge-bot` in + Utilities + LanguageTags). If those backports already landed with VSCode-Server, this migration introduces + no new canonical drift - it **consumes** the converged form. Confirm the canonical `configure.sh` / + ruleset JSON match the now-live PlexCleaner copies before porting (they are the source of truth). + +--- + +## 6. Gotcha checklist (mapped to this repo) + +1. **NBGV GITHUB_REF classification** - APPLIES. One-branch publisher fixes it natively: `github.ref` == + built branch, no `IGNORE_GITHUB_REF`. Floor `1.7`->`1.8` exercises the publish path. The dropped matrix + removes the leak class entirely. +2. **NBGV threading (run once)** - **DIRECTLY APPLIES, current bug.** `build-docker-task` has its OWN nested + `get-version`, *and* `build-release-task` has one - a double run. Delete the nested job; thread `semver2` + from `build-release-task`'s single `get-version` into `build-docker-task` as an input. 
The pinned-version + tag would otherwise be fine, but the `:SemVer2` tag could collide/misclassify. +3. **Docker creds in BOTH secret stores** - APPLIES (Dependabot auto-merges docker + actions bumps; their + push-CI smoke build now logs in to Docker Hub because login moves to *always*). `configure.sh` requires + `DOCKER_HUB_USERNAME` / `DOCKER_HUB_ACCESS_TOKEN` in both Actions and Dependabot stores; App creds too + (the tracker is App-signed). **Verify the Dependabot store has them before go-live** or bot auto-merge + stalls on a red smoke check. +4. **Branch-deletion guard** - APPLIES once CI moves to `push: ['**']`. Add `if: !github.event.deleted` to + every `test-pull-request` job and `always() && !github.event.deleted` to the aggregator. (Not needed + today under `pull_request`.) +5. **merge-bot `--delete-branch`** - **APPLIES, current gap.** Today's merge-bot merges WITHOUT + `--delete-branch`, so `upstream-version-*` and dependabot heads accumulate (visible in the backlog). Add + it to both merge calls; keep repo-wide auto-delete OFF in settings.json (so a develop->main promotion does + not delete develop - `github-auto-delete-branch-gotcha`). +6. **5D audit hardened form** - APPLIES (creating `configure.sh` fresh). Use the final canonical bodies + verbatim: `jq_lacks` exit-4-is-lacks + stderr kept; `check_secrets` API-error-FAILs; `ruleset_id` + `first // empty` no-stderr-suppress; `check_app` note-only; audit FAILs when it cannot verify. +7. **Required-check name lockstep** - APPLIES. Rename aggregator to `Check pull request workflow status job` + in `test-pull-request.yml`, set the same string in both ruleset JSONs and in `configure.sh`'s + `REQUIRED_CHECK`. First `apply` against the live repo (in the same change shipping the workflow) is what + lets the migration PR's required check resolve; then `check`. +8. **Aggregator success/skipped allowlist (D7.4)** - APPLIES. 
`validate` always runs; `smoke-build` always + runs (no paths-filter now) so it won't skip - but use the canonical `success`-required loop (and the + `build-release-task` build gate uses `(success || skipped)` with `!cancelled()` for the `validate`-skip + on smoke). Don't use `!= 'failure'` (lets cancelled through). +9. **EOL discipline** - APPLIES. CRLF for `.md`/`.yml`/`.json`/`.code-workspace`; LF for `.sh`/`Dockerfile`/ + `entrypoint/*` (extensionless - `.gitattributes` already pins `*.sh` + `Dockerfile`; **add an explicit + `Docker/entrypoint/** text eol=lf`** if the entrypoint scripts are extensionless, or confirm they end + `.sh`). `upstream-version.json` must stay CRLF - the tracker writes it CRLF via `sed 's/$/\r/'`; verify + with `tr -cd '\r' | wc -c`, not `file`. +10. **Copilot review loop** - APPLIES at go-live. Comments lag; resolve threads to merge; re-request via the + `requestReviews` mutation (bot id `BOT_kgDOCnlnWA`); `gh api -X PATCH .../pulls/N -F body=@file` for body + edits. OIDC/NBGV-prerelease/snupkg are false positives (no NuGet/OIDC here, so fewer). Budget 1-3 rounds. +11. **Dependabot dual-target main AND develop** - APPLIES. `dependabot.yml` already dual-targets both + `github-actions` and `docker` - keep. (The *upstream-version tracker* also dual-targets via its + `[main,develop]` matrix - same rationale: avoid non-linear rebase/merge-block conflicts.) +12. **Strip template cruft** - APPLIES. Delete `build-datebadge-task.yml` + `publish-docker-readme-task.yml` + (fold overview into the docker task via `peter-evans/dockerhub-description` on a main publish); remove + `dorny/paths-filter` (test-pull-request); remove `setup` / `PUBLISH_ON_MERGE` (publish-release). **Keep** + `merge-upstream-version` and the `check-upstream-version*` tracker - this repo *does* use them (unlike the + pure-Docker siblings); that is the documented exception to "drop unused merge-bot jobs." +13. **Action SHAs** - APPLIES. 
setup-dotnet -> v5.4.0 (`26b0ec14cb23fa6904739307f278c14f94c95bf1`); Docker + actions already match canonical. **`nbgv` STAYS `@master` - do NOT SHA-pin it** (documented exception, + ESPHome AGENTS.md "Action pinning": the tag stream lags master so a pin draws Dependabot downgrade PRs; + the inline `@master` rationale comment is human-authored and must be preserved). Verify SHA->version + claims against the GitHub API (do not trust Copilot's SHA claims). +14. **Prose rules** - APPLIES. No em-dashes; US English; terse comments (one line <=~120, top-of-file + summary per workflow); never edit human-authored comments. Sweep the new/edited files. + +--- + +## 7. Verification + +**Static (local, before push):** +- `actionlint` (Docker: `docker run --rm -v "$PWD":/repo --workdir /repo rhysd/actionlint:latest -color`) + over all `.github/workflows/*.yml`. +- `markdownlint-cli2` (`docker run --rm -v "$PWD":/workdir davidanson/markdownlint-cli2:latest "**/*.md"`). +- `cspell` over README.md + HISTORY.md (CI scope) using the new `cspell.json`. +- YAML + JSON parse (`python -c 'import yaml,sys,glob; [yaml.safe_load(open(f)) for f in glob.glob(...)]'`; + `jq . repo-config/*.json upstream-version.json version.json`). +- `bash -n repo-config/configure.sh` and `bash -n` over inline resolver/run scripts where extractable. +- **EOL audit:** CRLF for `.md`/`.yml`/`.json`/`.code-workspace` (`grep -c $'\r'` or `tr -cd '\r' | wc -c`, + NOT `file`); LF for `.sh`/`Dockerfile`/`Docker/entrypoint/*`. Re-check any `.md`/`.json` that Edit/Write + touched (they can flip CRLF->LF). +- **Token sweep:** em-dash (`grep -rn $'—'`), plus + `PUBLISH_ON_MERGE|two-phase|dorny|datebadge|build-datebadge|publish-docker-readme|setup job|ProjectTemplate` + to confirm the cruft is gone (ProjectTemplate may legitimately remain in the Template-Lineage section if + kept - decide with maintainer). 
+ +**Config audit (`repo-config/configure.sh`):** +- Before `apply`: `check` shows expected drift - the required-check rename (old live check + `Check pull request workflow status` -> new `...status job`), Docker secrets possibly missing from the + Dependabot store, no ruleset present (this repo has no repo-config today). +- `REPO=ptr727/ESPHome-NonRoot ./repo-config/configure.sh apply` then `check` -> "Configuration matches." + +**Live (dispatch, never via merge):** +- `gh workflow run publish-release.yml --ref develop` -> develop prerelease `1.8.-g`, + `:develop` image (multi-arch: `docker buildx imagetools inspect` shows amd64+arm64), GitHub release + (tag + LICENSE + README, marked prerelease, no binary asset). +- `gh workflow run publish-release.yml --ref main` (after promotion) -> clean `1.8.`, `:latest` + + `:` (+ `:SemVer2`) image, non-prerelease release, Docker Hub overview = `Docker/README.md`. +- Re-dispatch main -> no duplicate release (no-op guard), image still re-pushed (base refresh). +- Confirm the image runs non-root (the repo's reason to exist): `docker run --user 1001:100 ... esphome + version` succeeds against `/cache`. + +--- + +## 8. Go-live sequence + +1. **Triage the branch backlog first.** Decide which of the ~15 stale remote branches are obsolete; delete + them (`gh pr close` / `git push origin --delete`). Confirm the live tracker heads `upstream-version-*` + and dependabot heads are either merged or closed so they don't fight the migration. +2. Branch `feature/branch-scoped-cicd` off **`develop`** (the ahead branch). Apply iterations: + (i) workflows + delete cruft; (ii) `WORKFLOW.md` + `repo-config/` + `cspell.json`; (iii) docs (AGENTS / + CODESTYLE / copilot / dependabot / editorconfig / gitattributes / workspace / README / HISTORY / + version 1.8). Run the full static verify after each. +3. Open PR -> `develop`. 
**Copilot dance** (`copilot-review-flow`): wait + buffer, re-request via mutation, + resolve threads, decline false positives with rationale. Budget 1-3 rounds. +4. **`configure.sh apply` in lockstep** with the workflow edit (same change set), against the live repo - + this creates the rulesets + renames the required check so the PR's `...status job` check can resolve and + the Copilot rule attaches. Then `check`. +5. **Squash-merge to `develop`** (does NOT publish - one-branch model). Verify a develop push runs CI green. +6. **Promote `develop` -> `main` via a merge-commit PR, Copilot-reviewed, NO admin bypass.** Because main is + well behind develop, this is a substantive promotion (12 files / ~1k lines of legit content drift, not + just the migration). Watch for `migration-promotion-conflict` if main has straggler dependabot bumps in + files develop rewrote; if so, a local signed merge commit (tree = develop) may be needed - but try the + clean PR first. Decline pure-prose nits on the promotion PR (would diverge main from develop). +7. **Dispatch the publisher** on main (`gh workflow run publish-release.yml --ref main`) -> verify the main + `1.8.x` stable artifacts (image + release + overview). Dispatch on develop to verify the prerelease leg. +8. **Confirm `develop` survives** the promotion (`github-auto-delete-branch-gotcha`: auto-delete is OFF; + develop must still exist). Confirm the daily upstream tracker still opens bump PRs and the merge-bot + auto-merges + deletes their heads. +9. Set/confirm the publish schedule cadence the maintainer signed off in section 3 (daily vs weekly). + +--- + +## 9. Open questions for the maintainer + +1. **Release cadence (the crux - REQUIRES SIGN-OFF).** Confirm option (a): drop `PUBLISH_ON_MERGE`/merge- + publish; publish on schedule(main)+dispatch only; `upstream-version.json` is a shipped input. 
**Choose the + schedule:** recommended **daily** (`0 2 * * *`, ~24h CVE/upstream window, matches the daily tracker), vs + weekly (canonical default, ~7-day window), vs twice-weekly. Recommended default if no preference: **daily**. + The manual `gh workflow run` escape hatch covers any "release this bump now" case regardless. +2. **CODESTYLE.md inert sections.** Drop the large `.NET` and `Python` sections to converge with the Docker + siblings (which keep a General-only CODESTYLE), OR keep them per this repo's own "carry every section of a + carried file even when inert here" rule? Recommended: **drop them** (the Python is Dockerfile-internal, + not a maintained source tree; convergence wins). Needs a decision because it contradicts a stated repo rule. +3. **Image tag set.** Keep BOTH the pinned `:` tag (current behavior, user-meaningful) AND + add the canonical `:` tag, or only one? Recommended: **keep both** - the esphome tag is what + users pin to; the SemVer2 tag matches the GitHub release. (Costs one extra tag push.) +4. **`fail_on_unmatched_files`.** Confirm leaving it OFF/omitted on the Docker-only `github-release` (the + Docker target uploads no `release-asset-*`, so PlexCleaner's `true` would red a clean release). Recommended: + **off**, documented as the per-repo divergence in WORKFLOW.md D4. +5. **Version floor bump 1.7 -> 1.8.** Confirm the deliberate infra bump to exercise the publish path (matches + the sibling migrations: PlexCleaner 3.18->3.19, VSCode 1.0->1.1). Reconcile with the "routine edits leave + version.json untouched" rule as a maintainer-directed overhaul bump. +6. **Template lineage framing.** AGENTS.md currently frames the repo as ProjectTemplate-derived "two-phase". + Keep a (rewritten) Template-Lineage section pointing at the converged model, or drop the lineage framing + entirely as the siblings did? 
Recommended: **keep a trimmed lineage note** (it is still a real downstream + of ProjectTemplate) but rewrite it to the one-branch reality. diff --git a/plans/homeassistant-purpleair-migration-plan.md b/plans/homeassistant-purpleair-migration-plan.md new file mode 100644 index 0000000..5962a48 --- /dev/null +++ b/plans/homeassistant-purpleair-migration-plan.md @@ -0,0 +1,423 @@ +# HomeAssistant-PurpleAir branch-scoped CI/CD migration + convergence plan + +Repo: github.com/ptr727/HomeAssistant-PurpleAir (local clone: /home/pieter/homeassistant-purpleair, lowercase). +Type: Python Home Assistant custom integration distributed via HACS (custom_components/purpleair). No Docker, no NuGet. +Release artifact: a GitHub Release whose asset is `purpleair.zip` (the integration's files at archive root, HACS `zip_release` layout). + +Reference ground truth read for this plan: +- /home/pieter/LanguageTags (NuGet canonical), /home/pieter/PlexCleaner (Docker canonical) workflows, repo-config, AGENTS.md, copilot-instructions.md. +- This repo's 8 workflows, version.json, hacs.json, manifest.json (main + develop), dependabot.yml, ha-test-versions.json, AGENTS.md, CODESTYLE.md, .gitattributes, git history. + +HEADLINE: this repo is NOT a greenfield migration. `develop` has already been migrated most of the way to a +branch-scoped model in a prior round, and is heavily converged (AGENTS.md PR Review Etiquette, copilot Review +Runbook, NBGV versioning, HA-matrix bot, HACS zip-layout assertion are all already present on develop). The work +is (a) closing the remaining gaps to canon, (b) the crux release-trigger decision, (c) creating the missing +`repo-config/` 5D audit, and (d) a large branch-backlog + main/develop reconciliation cleanup. Do NOT rebuild from +the template; converge develop forward. + +--- + +## 1. Current-state assessment + +### 1.1 Branch hygiene (messy backlog - the big one) +- `develop` is 92 commits AHEAD of `main`, 0 behind. 
`git merge-base main develop` = `01e4292` (main tip, + PR #60 "reseed main manifest to 0.3.0 after pipeline regression"). Local `develop` == `origin/develop` (`dc8b34b`), clean. +- `main` is stuck in the OLD release-please era: it still carries `.github/workflows/release-please.yml`, has NO + `version.json`, NO `repo-config/`, manifest.json `version: 0.3.0`. develop has none of that legacy and has the + full NBGV + HA-matrix + HACS-zip pipeline. +- 54 remote branches. Abandoned/superseded migration attempts and one-offs: + - NBGV era leftovers: `nbgv`, `nbgv-prerelease-fix`, `restore-prerelease-gate`, `chore/seed-develop-prerelease-manifest`, + `chore/restore-develop-manifests-beta`, `chore/reseed-main-manifest-0.3.0`. + - HACS-zip pain (ALREADY merged into develop via PRs #80, #82): `fix-hacs-zip-layout`, `zip-layout-assertion-prefix-fix`. STALE. + - Sync churn: `sync-main-into-develop`, `chore/sync-main-into-develop-2`, `ruleset-and-refs-followup`, `feature/sync-versioned-rulesets`. + - release-please era: `release-please--branches--develop--...`, `release-please--branches--main--...`, `chore/remove-changelog`. + - Many open dependabot/* branches targeting develop, plus feature branches (`subentry-reconfigure-readkey`, + `feat/org-name-title-and-first-refresh-fix`, etc). +- VERIFY-before-delete: of the named "abandoned migration" branches, all six checked + (`nbgv`, `nbgv-prerelease-fix`, `restore-prerelease-gate`, `fix-hacs-zip-layout`, `zip-layout-assertion-prefix-fix`, + `sync-main-into-develop`) are NOT ancestors of develop, but the HACS-zip *content* already landed via squash PRs, + so the branches are stale duplicates, not lost work. Treat NBGV branches the same way (the NBGV pipeline is live on develop). + +### 1.2 Triggers and jobs today (on develop) +- `test-pull-request.yml`: triggers `pull_request: [main, develop]` + `push: [main, develop]` + `workflow_dispatch`. 
+ Jobs: `test-release` (calls test-release-task.yml) and aggregator `check-workflow-status` + named **"Check pull request workflow status"** (NO trailing " job"). Concurrency `${{ github.workflow }}-${{ github.ref }}`. + -> DIVERGES from canon: canon triggers `push: ['**']` (every branch), no `pull_request`; aggregator name is + **"Check pull request workflow status job"**; aggregator must guard `!github.event.deleted` and treat success-only per job. +- `test-release-task.yml` (reusable, the validate+test set): jobs `ruff` (check + format --check), `mypy` (--strict), + `pyright`, `read-versions` (parses ha-test-versions.json), `pytest` (matrix over minimum/latest-stable/latest-beta, + uploads to Codecov), `hassfest`, `hacs`, and `build-release` (no-publish build gated by `build` input). +- `publish-release.yml`: triggers `workflow_dispatch: {}` + **`push: [develop]`**. Jobs: `gate` (assert dispatch from + main), `test-release` (build:false), `create-release` (github:true), `date-badge` (dispatch-only), `cleanup-artifacts`. + Concurrency `${{ github.workflow }}` global, cancel-in-progress:false. + -> DIVERGES from canon: canon publisher has NO push trigger; merges never publish. THIS repo auto-publishes a + prerelease on every develop push. This is the crux - see section 3. +- `build-release-task.yml` (reusable): `get-version` (calls get-version-task), `build` (stamp manifest with NBGV + SemVer2, zip custom_components/purpleair at root, **zip-layout assertion already present**, upload artifact), + `release` (download + softprops GitHub Release, `target_commitish: github.sha`, prerelease flag from get-version). +- `get-version-task.yml`: single NBGV run (setup-dotnet v10), outputs SemVer2/Tag/Prerelease. Prerelease + detected by `-` in SemVer2. CORRECT single-run threading (gotcha 2 already satisfied). 
**nbgv currently + SHA-pinned to v0.5.2 (`705dad19`) - CONVERT to `@master`** per the documented no-SHA-pin exception (the tag + stream lags master; a pin draws Dependabot downgrade PRs). See gotcha 13 / briefing. +- `build-datebadge-task.yml`: BYOB "Last Build" badge, gated on Prerelease==false. Template cruft per gotcha 12. +- `check-ha-version.yml`: daily cron + dispatch. Monitors `pytest-homeassistant-custom-component` on PyPI (whose + `homeassistant==` pin IS the upstream HA core version being tracked), resolves latest-stable + latest-beta HA pairs, + opens ONE bundled PR on rolling branch `ha-version-bump/matrix` -> develop via the codegen App. Does NOT publish. +- `merge-bot-pull-request.yml`: `pull_request_target`. Jobs merge-dependabot (squash develop / merge main by base), + merge-ha-version-bump (squash develop), disable-auto-merge-on-maintainer-push. Uses App token. + -> DIVERGES: `gh pr merge --auto` with NO `--delete-branch` (gotcha 5). + +### 1.3 Version scheme (NBGV present) +- `version.json`: base `version: "0.1"`, `publicReleaseRefSpec: ["^refs/heads/main$"]`, nugetPackageVersion.semVer 2. + Standard branch-scoped floor. main ships `0.1.`, develop ships `0.1.-g`. +- manifest.json `version`: develop = `0.0.0` placeholder (stamped at build time from NBGV); main = `0.3.0` (legacy, + release-please era). hacs.json has NO `version` field (HACS reads the stamped manifest). hacs.json `homeassistant` + = `2026.4.0` (the user-facing MINIMUM, hand-maintained, must match ha-test-versions.json `minimum.ha` and the + requirements.txt bootstrap pin series). +- The NBGV/version.json model is already correctly wired on develop; gotcha 1 is structurally satisfied. See 6.1. + +### 1.4 repo-config / 5D audit +- DOES NOT EXIST. No `repo-config/` directory on develop or main. This is the single largest missing canonical piece. + Rulesets are presumably configured live in the UI but are not codified or auditable. 
configure.sh + ruleset JSON + must be created from the LanguageTags canonical (adapting only secret names + the publish manual-verify note). + +### 1.5 SHA pins (gotcha 13, verified against GitHub API) +- setup-dotnet `9a946fd` = v5.3.0 (correct; canon prefers newer v5.4.0 - optional bump). +- nbgv `705dad19` = v0.5.2 - **SHA-pinned today, but CONVERT to `@master`** (documented no-SHA-pin exception; + do NOT keep the pin). +- checkout `df4cb1c069...` = the v6.0.3 commit (annotated tag `v6.0.3` -> `df4cb1c`; verified by dereference). Correct + (canon mentions v7.0.0 - a dependabot bump branch `dependabot/.../actions/checkout-7.0.0` already exists; let it land). +- All other pins (create-github-app-token v3.2.0 `bcd2ba49`, setup-python v6.3.0 `ece7cb06`, softprops v3.0.1 + `718ea10b`, codecov v6.0.1, upload/download-artifact, hacs/action 22.5.0) appear consistent; spot-verify any the + reviewer challenges - do NOT trust Copilot SHA->version claims. + +--- + +## 2. Target architecture + +One run = one branch. Reconcile per-file. The HACS pull-model release cadence is the one place the repo legitimately +diverges from the Docker/NuGet canon; section 3 resolves it. + +### 2.1 test-pull-request.yml (CI - align to canon) +- Triggers: `push: branches: ['**']` + `workflow_dispatch`. REMOVE the `pull_request` trigger and the `push:[main,develop]` + restriction. Self-testing: pushing any branch IS the PR check. + - Tension: the existing `push:[main,develop]` exists to re-upload Codecov on the post-merge SHA (default-branch badge). + Under `push:['**']` that still happens (main/develop are members of `**`), so the Codecov goal is preserved. Keep the + explanatory comment, retargeted. +- Jobs: keep `test-release` (calls test-release-task.yml). RENAME aggregator job to exactly + **"Check pull request workflow status job"** (job key may stay `check-workflow-status`). Add `if: ${{ !github.event.deleted }}` + to the head job(s) and the aggregator. 
Aggregator gate: per-need loop treating `success` as the only pass for a + required need; for any conditionally-skipped need use the success-OR-skipped allowlist (gotcha 8). Today there is one + need (`test-release`) which is never skipped, so the simple `!= 'success' -> exit 1` is correct; keep it but ensure the + `!github.event.deleted` guard so deleting a branch does not red a phantom run (gotcha 4). +- Concurrency unchanged (`${{ github.workflow }}-${{ github.ref }}`, cancel-in-progress true). + +### 2.2 test-release-task.yml (validate + test set - this is the Python adaptation of canon's validate/smoke) +- KEEP AS-IS structurally. This is the correct Python mapping of the canonical "validate + smoke/test" pair: + - validate role: `ruff` (lint+format), `mypy --strict`, `pyright`, `hassfest`, `hacs` (replaces canon's + markdownlint/cspell-only validate). + - smoke/test role: the `pytest` matrix over `minimum` / `latest-stable` / `latest-beta` HA versions (this IS the + HA-core-version test matrix the per-project note calls for) + `build-release` (no-publish build that exercises the + zip/HACS-layout path, analogous to canon's `smoke-build`). +- Optionally add `markdownlint`/`cspell` validate jobs if the repo wants the canonical doc-lint parity (the repo has + `.markdownlint-cli2.jsonc`); RECOMMEND keeping these as separate validate jobs only if they already pass clean - + do not introduce new failing gates during migration. Flag as an open question (9.4). + +### 2.3 publish-release.yml (the crux - see section 3 for the decision) +RECOMMENDED target (decision: dispatch-only publish, schedule retests only): +- Triggers: `workflow_dispatch: {}` + `schedule` (daily/weekly cron). REMOVE `push: [develop]`. +- `gate` job: keep, generalize to allow dispatch from `main` OR `develop`, guard + `if: github.ref_name == 'main' || github.ref_name == 'develop'` (canon shape). main dispatch = stable; develop + dispatch = prerelease (NBGV `-g` makes the tag unique). 
This restores "merges never publish". +- `schedule` leg: RETEST ONLY, NEVER publish. The schedule runs on the default branch (main) by GitHub rule; it should + run `test-release` (validate+test) against main and STOP - no `create-release`. Rationale: HACS is a pull model and + the maintainer explicitly does not auto-push; the schedule's job is to catch upstream HA/pytest-hacc drift breaking + the shipped main, surfacing a red run for a maintainer to act on, not to cut a release. (`check-ha-version.yml` + already handles the develop-side retest-and-bump; the publisher schedule covers main, which the matrix bot does not touch.) + - Implement: `create-release` (and date-badge) gate `if: github.event_name == 'workflow_dispatch'`. The schedule path + runs gate(skipped)->test-release->[create-release skipped]. Concurrency global, cancel-in-progress:false (unchanged). +- `create-release` job: unchanged otherwise (build:false on test-release, github:true on create-release, softprops with + target_commitish github.sha, prerelease from NBGV). +- `date-badge`: FOLD or DELETE per gotcha 12 (template cruft). RECOMMEND delete build-datebadge-task.yml and the + date-badge job - it is a "Last Build" vanity badge with no release-correctness role. Flag as open question 9.3 since it + is currently wired and green. +- `cleanup-artifacts`: keep (always(), best-effort). + +### 2.4 build-release-task.yml (release artifact - keep, it is the HACS zip producer) +- KEEP. The zip-layout assertion (manifest.json + __init__.py at root, no `purpleair/` wrapper, `./` normalization) is + ALREADY baked in (lines 68-99) from PRs #80/#82 - gotcha-equivalent "bake the zip-layout assertion" is DONE. Do not + remove it; verify it survives any edit. Single NBGV run via get-version-task threaded down as SemVer2 (gotcha 2 satisfied). + +### 2.5 get-version-task.yml / merge-bot / check-ha-version / dependabot +- get-version-task: keep (correct single NBGV run). 
+- merge-bot-pull-request.yml: add `--delete-branch` to BOTH `gh pr merge --auto` calls (merge-dependabot, + merge-ha-version-bump). Keep repo-wide auto-delete-on-merge OFF in settings.json so develop->main promotion does not + delete develop (gotcha 5). +- check-ha-version.yml: keep as-is (it is the repo's upstream-monitor; already retest-not-publish). Confirm its rolling + PR + bundle design is documented in AGENTS.md (it is). +- dependabot.yml: TODAY single-targets develop only (every ecosystem `target-branch: develop`). **DECIDED + (2026-06-29): dual-target main AND develop** (gotcha 11) - the maintainer confirmed "dependabot should + still keep main and develop updated, i.e. avoid merge drift." Converge all ecosystems (pip + + github-actions) to dual-target like the four live repos; develop's bumps are sync-only and never publish. + Open question 9.1 is resolved. + +--- + +## 3. Release model decision (THE CRUX) + +Question: how does the HACS-zip release map onto one-branch, given releases are NOT automatic on merge and HACS is a +pull model? + +What is monitored upstream: `check-ha-version.yml` monitors `pytest-homeassistant-custom-component` on PyPI, whose +`homeassistant==` pin is the de-facto upstream **HA core version** (stable and beta). `aiopurpleair` (the API client, +pinned as `aiopurpleair-ptr727==` in manifest/requirements) is NOT auto-monitored; it moves via Dependabot pip +PRs (note develop manifest pins 2026.8.0 vs main 2026.4.0). So "upstream" = HA core (via pytest-hacc), monitored daily, +retested on develop via a bundled bot PR; it does not publish. + +**The monitor is a breakage tripwire (maintainer-confirmed intent):** the HA-version bump updates the test +matrix with the new HA release specifically so a breaking upstream change makes the bot PR's CI **fail**; a +human then intervenes, fixes, and releases manually via dispatch. 
The monitoring exists to surface breakage +early, not to ship - which is exactly why publish is dispatch-only and the schedule retests but never +publishes. + +How/when the HACS zip is cut today: TWO paths - (a) every develop push auto-cuts a PRERELEASE GitHub Release; (b) +`workflow_dispatch` from main cuts a STABLE release. Path (a) directly contradicts the maintainer's stated model ("does +NOT auto push") and the canonical rule "merges never publish". + +DECISION: **dispatch-gated publish; schedule retests only; NO push-to-develop publish.** +- Stable release: `gh workflow run publish-release.yml --ref main` (gate asserts main). Cuts `0.1.` clean. +- Prerelease (when a beta tester build is wanted): `gh workflow run publish-release.yml --ref develop`. NBGV emits + `0.1.-g`, softprops marks it prerelease. This replaces the automatic develop-push prerelease with an + on-demand one - same artifact, same uniqueness guarantee, but maintainer-initiated. +- Schedule (daily/weekly): runs `test-release` against main ONLY (retest the shipped integration against the latest HA + matrix). NEVER publishes. This is the publisher-side complement to check-ha-version's develop-side retest. + +Rationale: +1. Matches the maintainer's explicit "monitor + retest but do NOT auto push (HACS is pull)" requirement verbatim. +2. Restores the canonical invariant "merges never publish" (briefing model + gotcha-adjacent). The AGENTS.md "Merging is + not releasing" verbatim contract currently lies, because develop-push DOES release; this decision makes the docs true. +3. Removes the develop-push trigger entirely, eliminating the only `push`-on-merge publish path - aligns with the + PlexCleaner Docker canon (dispatch + schedule, no push) which is the closest precedent for a non-NuGet artifact. +4. 
Keeps NBGV GITHUB_REF classification correct (gotcha 1): a develop dispatch has `github.ref = refs/heads/develop` + (one run = one branch), so NBGV classifies it prerelease with no IGNORE_GITHUB_REF hack. A main dispatch is clean. + +TENSION FLAGGED: the existing code+docs treat develop-push prerelease as a feature ("beta testers always have the +latest"). Dropping it trades automatic prereleases for on-demand. If the maintainer wants beta testers to keep getting +every develop push automatically, the alternative is to KEEP `push:[develop]` as a documented, deliberate exception to +the one-branch publisher (the repo already gates it correctly on test success). Surface both; RECOMMEND dispatch-only to +honor the stated "does NOT auto push" requirement. See open question 9.2. + +--- + +## 4. Files to create / edit / delete + +### CREATE (8) +1. `repo-config/configure.sh` - port LanguageTags verbatim; change only: + - `REQUIRED_ACTIONS_SECRETS=(CODEGEN_APP_CLIENT_ID CODEGEN_APP_PRIVATE_KEY CODECOV_TOKEN)` (no NuGet/Docker; Codecov + token is the repo's one publish-ish secret - it is Actions-only, NOT Dependabot). + - `REQUIRED_DEPENDABOT_SECRETS=(CODEGEN_APP_CLIENT_ID CODEGEN_APP_PRIVATE_KEY)` (the codegen App secrets must be in BOTH + stores so a Dependabot-triggered push-CI can mint the App token; Codecov is not needed by Dependabot runs). + - `REQUIRED_CHECK="Check pull request workflow status job"` (identical string). + - cmd_check manual-verify note: "GitHub Releases / HACS zip publish is dispatch-gated; no external publish policy to verify." + - check_app note: "verify the codegen App is installed". + - Keep jq_lacks / check_secrets / ruleset_id / check_app helper BODIES byte-identical to canon (gotcha 6). +2. `repo-config/ruleset-develop.json` - port verbatim (squash-only, linear, signed, deletion+non_fast_forward, + required check "Check pull request workflow status job" with integration_id 15368, copilot_code_review, 0 approvals, + strict false). +3. 
`repo-config/ruleset-main.json` - port verbatim (merge-only, NO linear, signed, same required check, copilot review). +4. `repo-config/settings.json` - port verbatim (allow_squash+merge, rebase off, auto_merge on, delete_branch_on_merge:false). +5. `repo-config/README.md` - port verbatim, substitute repo name; document the dispatch-only HACS publish in place of the + NuGet/Docker publish line. +6. (optional) `markdownlint`/`cspell` validate jobs - only if 9.4 says yes; else skip. + +### EDIT (8-9) +1. `.github/workflows/test-pull-request.yml` - triggers -> `push:['**']` + dispatch (drop pull_request); aggregator name + -> "Check pull request workflow status job"; add `!github.event.deleted` guards. +2. `.github/workflows/publish-release.yml` - drop `push:[develop]`; add `schedule`; gate dispatch main||develop; gate + create-release/date-badge on `github.event_name == 'workflow_dispatch'`; (delete date-badge if 9.3 yes). +3. `.github/workflows/merge-bot-pull-request.yml` - add `--delete-branch` to both `gh pr merge --auto` calls. +4. `.github/dependabot.yml` - dual-target main AND develop (if 9.1 yes). +5. `AGENTS.md` - update "Release flow" + "Merging is not releasing" to match the dispatch-only decision (remove the + "Push to develop -> automatic prerelease" bullet; document dispatch-from-develop + schedule-retests-main). Add a + "Where rules live" pointer if missing; verify Comments subsection matches canon verbatim. Add `repo-config/` pointer. +6. `CODESTYLE.md` - STRIP the `.NET` section (lines 39-343) - template cruft for a Python-only repo (gotcha 12). Keep + General + Python. +7. `.github/copilot-instructions.md` - verify Review Runbook is byte-converged with canon; substitute owner/name + `ptr727/homeassistant-purpleair` in any hardcoded GraphQL snippets; confirm bot login `copilot-pull-request-reviewer`. +8. 
`custom_components/purpleair/manifest.json` (main side, during reconciliation) - main's `0.3.0` must become `0.0.0` + placeholder to match the stamp-at-build model when develop lands on main (handled by the promotion, see 8). +9. (if SHA convergence chosen) bump checkout v6.0.3->v7.0.0, setup-dotnet v5.3.0->v5.4.0 across workflows - or let the + existing dependabot/checkout-7 branch land. + +### DELETE (2-4) +1. `.github/workflows/build-datebadge-task.yml` - template cruft (gotcha 12), if 9.3 yes. +2. `.github/workflows/release-please.yml` - ONLY EXISTS ON MAIN (legacy). Removed automatically when develop overwrites + main in the promotion; no develop-side action. +3. 40+ stale remote branches (backlog cleanup, see 8.6) - not files but a delete pass. +4. (n/a) No `dorny/paths-filter`, no `setup`/`PUBLISH_ON_MERGE`, no `merge-codegen`/`merge-upstream-version`, + no `publish-docker-readme-task.yml` exist here - already absent. + +Net: ~8 create, ~8-9 edit, ~2-4 delete (workflow-file deletes 1-2; plus a branch-backlog delete pass). + +--- + +## 5. Convergence + backports + +PORT VERBATIM from LanguageTags (byte-for-byte target): +- `repo-config/configure.sh` helper bodies (jq_lacks, check_secrets, ruleset_id, check_app, assert, apply_ruleset, + check_ruleset, check_settings, check_security, cmd_apply/cmd_check/dispatch). Only the secret arrays + two note strings differ. +- `repo-config/ruleset-develop.json`, `ruleset-main.json`, `settings.json` (only the required-check string is shared and + already canonical). +- AGENTS.md: Comments subsection, Git/Commit rules, "Where rules live" lead paragraph, PR Review Etiquette (already + present and looks converged - diff against canon), Documentation Style Conventions incl. the line-endings + "preserve current state" rule. +- `.github/copilot-instructions.md` Review Runbook (bot id read-from-review pattern, requestReviews mutation, coverage + check, thread resolution). 
+- `.editorconfig` EOL rules (CRLF for .md/.yml/.json; LF for .sh/.py) - the repo already has a 10KB .editorconfig; diff + the relevant blocks against canon. + +ADAPT (Python-specific, not verbatim): +- `test-release-task.yml` validate/test set (ruff/mypy/pyright/hassfest/hacs/pytest-matrix) - no canonical sibling; this + is the repo's correct Python mapping. Keep. +- publish-release.yml mechanics (HACS zip via softprops, dispatch+schedule-retest) - per-repo publish seam. +- configure.sh secret arrays + manual-verify note. + +BACKPORT TO THE FOUR LIVE REPOS (drift found here worth propagating): +- The HACS-zip-layout assertion pattern is repo-unique; nothing to backport. +- check-ha-version.yml's PEP-440 walk + bundled rolling-PR design is more robust than a naive sort; if any sibling repo + monitors a PyPI/upstream version, consider porting the `packaging.version` ordering. LanguageTags/PlexCleaner do not, + so likely n/a. +- If the .editorconfig or Comments subsection here has drifted ahead of canon, reconcile toward the canonical text (do + not let this repo's copy become a fork). + +--- + +## 6. Gotcha checklist (mapped to THIS repo) + +1. NBGV GITHUB_REF classification - SATISFIED on develop. version.json floor `0.1` + `publicReleaseRefSpec ^refs/heads/main$`; + one-run-one-branch means github.ref already matches; no IGNORE_GITHUB_REF hack present or needed. ACTION: when adopting + dispatch-only, a develop dispatch keeps github.ref=refs/heads/develop so NBGV stays prerelease - correct. Bump version.json + `0.1`->`0.2` once to exercise the publish path during go-live verification. +2. NBGV threading - SATISFIED. Single nbgv run in get-version-task; SemVer2 threaded into build-release-task's stamp + + softprops. No nested get-version. Keep it that way. +3. Docker creds in both secret stores - N/A (no Docker). 
The ANALOG: codegen App secrets (CODEGEN_APP_CLIENT_ID/PRIVATE_KEY) + MUST be in both Actions AND Dependabot stores, because a Dependabot-merged push fires develop CI and the merge-bot mints + the App token. configure.sh REQUIRED_DEPENDABOT_SECRETS encodes this. +4. Branch-deletion guard - NOT PRESENT today (test-pull-request uses pull_request, not push:['**']). ADD `!github.event.deleted` + when switching to push:['**'] so deleting a branch does not fire a phantom CI run. +5. merge-bot --delete-branch - MISSING. ADD to both merge calls. Keep repo-wide auto-delete OFF. +6. 5D audit hardened form - configure.sh DOES NOT EXIST; create from the final hardened canonical (jq_lacks exit-4 case, + check_secrets fail-on-API-error, ruleset_id first//empty + visible gh error, check_app best-effort note, fail-when-cannot-verify). +7. Required-check name lockstep - the ruleset JSON required-check, the aggregator job name, and the live ruleset must all + read "Check pull request workflow status job". Today the workflow says "...status" (no " job") and there is no ruleset JSON. + Fix the workflow name AND create the JSON AND run `apply` in the same change that ships the workflow edit, then `check`. +8. Aggregator success/skipped allowlist - today one never-skipped need, so success-only is fine. If validate jobs are split + out (markdownlint/cspell) keep the simple loop but ensure any conditionally-skipped need uses success-OR-skipped. +9. EOL discipline - CRLF for .md/.yml/.json/.code-workspace, LF for .sh/.py/Dockerfile. Repo has .gitattributes (`* -text` + + LF pins for *.sh/scripts/*) and a large .editorconfig. VERIFY .json/.yml CRLF with `tr -cd '\r' | wc -c` (file(1) + lies for JSON). New repo-config/*.json must be CRLF; configure.sh must be LF. +10. Copilot review loop - copilot-instructions.md Review Runbook already present; bot login `copilot-pull-request-reviewer` + (GraphQL, no [bot]) / `...[bot]` (REST). Re-request via requestReviews mutation each head. 
Expect 1-3 rounds; snupkg/OIDC + false positives are N/A (no NuGet); HACS-zip / NBGV-prerelease are the likely recurring false positives here - decline + with rationale. Use `gh api -X PATCH .../pulls/N -F body=@file` for body edits. +11. Dependabot dual-target - today develop-only. DECISION (9.1): converge to dual-target main AND develop to match canon + and avoid the non-linear merge-block; unless maintainer confirms main-bumps are pointless under HACS pull. Default: dual. +12. Strip template cruft - DELETE build-datebadge-task.yml + date-badge job; STRIP CODESTYLE.md .NET section. No + paths-filter / PUBLISH_ON_MERGE / merge-codegen / docker-readme exist (already clean). release-please.yml is main-only + legacy, removed by the promotion. +13. Action SHAs - VERIFIED: checkout df4cb1c=v6.0.3, setup-dotnet 9a946fd=v5.3.0, nbgv 705dad19=v0.5.2. Canon + prefers checkout v7 / setup-dotnet v5.4 (optional; dependabot/checkout-7 branch exists - let it land). + **nbgv must become `@master`, NOT stay SHA-pinned** (documented no-SHA-pin exception; the pin draws + Dependabot downgrade PRs). Re-verify any SHA the reviewer disputes; never trust Copilot's SHA->version mapping. +14. Prose rules - no em-dashes (sweep of workflows/AGENTS/CODESTYLE = clean today; re-sweep after edits). US English. Terse + comments, one line <=120, top-of-file workflow summary. Never edit human-authored comments. NOTE: check-ha-version.yml's + apply step deliberately preserves a U+2014 em-dash in ha-test-versions.json's `$comment` via ensure_ascii=False - that + em-dash is in a DATA file's content, not prose authored by us; leave the workflow logic, but confirm the comment text we + author elsewhere stays em-dash-free. + +--- + +## 7. Verification + +Static (run all before pushing): +- `actionlint` on every .github/workflows/*.yml. +- `bash -n repo-config/configure.sh`; `shellcheck` it. +- `python3 -m json.tool` parse on each repo-config/*.json + ha-test-versions.json + manifest.json + hacs.json. 
+- EOL: `for f in repo-config/*.json .editorconfig; do printf '%s ' "$f"; tr -cd '\r' < "$f" | wc -c; done` (expect >0 for + CRLF JSON, 0 for configure.sh). `grep -rIl $'\r' repo-config/configure.sh` must be empty. +- Em-dash sweep: `grep -rn $'—' .github/ AGENTS.md CODESTYLE.md repo-config/` must be empty (the ha-test-versions.json + `$comment` em-dash is acceptable data, exclude it). +- markdownlint-cli2 (config present) / cspell if the workspace has a dict - report-only if not gating. +- zip-layout assertion smoke: locally `cd custom_components/purpleair && zip -r /tmp/p.zip . && unzip -Z1 /tmp/p.zip | + sed 's|^\./||'` and confirm manifest.json + __init__.py at root, no `purpleair/` prefix. + +Config audit: +- `REPO=ptr727/HomeAssistant-PurpleAir ./repo-config/configure.sh check` BEFORE apply -> expect drift (rulesets not codified + / required-check name mismatch). Run `apply`. Re-run `check` -> expect "matches". + +Live dispatch verification (after merge): +- `gh workflow run publish-release.yml --ref develop` -> confirm a PRERELEASE GitHub Release with `0.1.-g` tag and + a `purpleair.zip` asset whose layout passes the assertion. +- `gh workflow run publish-release.yml --ref main` -> confirm a STABLE `0.1.` release. +- Confirm the schedule leg (or a dispatched no-publish test) retests without creating a release. +- Bump version.json `0.1`->`0.2` to exercise a new minor and confirm NBGV height resets. + +--- + +## 8. Go-live sequence + +8.1 Branch from develop: `feature/branch-scoped-cicd-convergence`. Make ALL edits + creates from section 4 there. +8.2 Push -> the new `push:['**']` CI runs on the feature branch (self-testing). Green it (ruff/mypy/pyright/pytest-matrix/ + hassfest/hacs + no-publish build). +8.3 Open PR feature -> develop (squash). Copilot dance: re-request via requestReviews on each head; resolve every thread; + expect 1-3 rounds; HACS-zip/NBGV-prerelease comments are likely false positives - decline with rationale. 
Maintainer + approves explicitly (Merge Gate). +8.4 Required-check lockstep: in the SAME change, `REPO=ptr727/HomeAssistant-PurpleAir ./repo-config/configure.sh apply` + against the live repo so the ruleset's required check becomes "Check pull request workflow status job" and matches the + renamed aggregator - otherwise the PR's new check name is not the required one and the PR cannot satisfy the old name. + Then `check` to confirm "matches". (Order: apply right before/with the merge so the new green check is the required one.) +8.5 Squash-merge to develop. develop CI re-runs (no publish - push trigger removed). +8.6 Branch-backlog cleanup (do this around the promotion, carefully): + - Confirm each candidate is fully merged or truly abandoned: `git branch -r --merged origin/develop` for the safe set; + for the NBGV/HACS/sync branches verify their content is on develop (HACS-zip already is via #80/#82; NBGV pipeline is + live) then `git push origin --delete `. + - Delete release-please-- branches, nbgv*, *sync*, ruleset-*, fix-hacs-zip-layout, zip-layout-assertion-prefix-fix, + chore/seed-*, chore/reseed-*, chore/restore-* after confirming superseded. + - Leave OPEN dependabot/* branches (they auto-close on merge or get superseded); leave live feature branches the + maintainer still wants. Get maintainer sign-off on the delete list (open question 9.5) - do not bulk-delete unilaterally. +8.7 Reconcile main: develop is 92 ahead / 0 behind, so a develop->main merge-commit PR is clean. Open develop->main PR, + "Create a merge commit" (main ruleset is merge-only; develop becomes a real ancestor so the next promotion is clean). + NO admin bypass. This brings version.json, repo-config, the new workflows, the 0.0.0 placeholder manifest, and DELETES + release-please.yml from main in one node. Copilot-review the promotion; decline pure-prose nits (would diverge main). + - Watch the manifest version: main currently 0.3.0, develop 0.0.0. 
The merge takes develop's 0.0.0 (correct - stamped + at build). The first STABLE dispatch from main will then publish NBGV `0.1.` (NOT 0.3.x). FLAG to maintainer + (9.6): the public version series effectively resets from release-please 0.3.0 to NBGV 0.1.. If continuity matters, + bump version.json base to `0.3` (or higher) BEFORE the first main dispatch so NBGV emits >=0.3.x. +8.8 Apply main ruleset: `configure.sh apply` already wrote both rulesets in 8.4; re-`check` post-promotion to confirm main + ruleset live. +8.9 Dispatch the publisher from main: `gh workflow run publish-release.yml --ref main`; verify clean release + zip asset + + layout. Optionally dispatch from develop for a prerelease. +8.10 Confirm develop SURVIVES: repo-wide auto-delete is OFF; the merge-commit promotion + per-merge --delete-branch on bot + PRs only delete bot branches, never develop. Verify `origin/develop` still exists post-promotion. + +--- + +## 9. Open questions for the maintainer (with recommended defaults) + +9.1 Dependabot targeting: converge to dual-target main AND develop (canon, avoids non-linear merge-block) vs keep + develop-only (current, documented)? DEFAULT: dual-target (match the four live repos; the maintainer already rejected + single-target elsewhere). +9.2 Release trigger (THE CRUX): adopt dispatch-only publish + schedule-retests-main, dropping the automatic develop-push + prerelease? DEFAULT: YES - it honors the stated "monitor + retest but do NOT auto push (HACS pull)" requirement and + restores "merges never publish". Alternative if beta testers must keep auto-prereleases: keep push:[develop] as a + documented deliberate exception. +9.3 Delete build-datebadge-task.yml + the date-badge job (template cruft)? DEFAULT: YES (vanity "Last Build" badge, no + release-correctness role). Keep only if the README badge is load-bearing for the maintainer. +9.4 Add markdownlint/cspell validate jobs for canonical doc-lint parity? 
DEFAULT: only if they already pass clean; do not + add a new failing gate during migration. Otherwise defer to a follow-up. +9.5 Branch-backlog delete list: approve bulk deletion of the ~40 superseded/abandoned branches (nbgv*, release-please--*, + *sync*, ruleset-*, hacs-zip-*, chore/seed|reseed|restore-*)? DEFAULT: delete after per-branch superseded-confirmation; + leave live dependabot/* and wanted feature branches. +9.6 Version continuity: main is at release-please 0.3.0; NBGV base is 0.1, so the first stable dispatch ships 0.1., + a version REGRESSION below 0.3.0. Bump version.json base to >=0.3 before the first main dispatch to preserve monotonic + public versions? DEFAULT: YES, set base to `0.3` (or `0.4`) so HACS users do not see a downgrade. diff --git a/plans/nxwitness-migration-plan.md b/plans/nxwitness-migration-plan.md new file mode 100644 index 0000000..13aed52 --- /dev/null +++ b/plans/nxwitness-migration-plan.md @@ -0,0 +1,466 @@ +# NxWitness branch-scoped CI/CD migration + convergence plan + +Repo: `/home/pieter/NxWitness` (GitHub `ptr727/NxWitness`). Default branch `main`; integration branch `develop`. +Target model: the branch-scoped self-publishing CI/CD proven on LanguageTags, Utilities, PlexCleaner, VSCode-Server. +Canonical Docker reference: `/home/pieter/PlexCleaner/`. Canonical .NET-tooling reference: `/home/pieter/LanguageTags/`. + +This is the MOST complex remaining repo: it couples (1) a .NET **codegen tool** (`CreateMatrix` + `CreateMatrixTests` + `Make/`) that +fetches upstream Nx product versions and regenerates `Make/Version.json`, `Make/Matrix.json`, `Docker/*.Dockerfile`, and `Make/Test*.yml`; +with (2) a **multi-stage, multi-product, multi-base Docker** build (5 products x {plain, LSIO} = 10 product images, plus 2 shared base +images) published to 12 Docker Hub repos. 
There is **no NuGet publish** (`CreateMatrix.csproj` and the test csproj both set +`IsPackable=false`; repo-wide grep for `nuget push` / `dotnet pack` / `PackageId` / `NuGetApiKey` returns zero). The .NET is a build-time +generator only; the published artifacts are exclusively Docker Hub images. + +IMPORTANT: NxWitness is **partially migrated already**. `publish-release.yml` is dispatch+schedule (no push), and codegen/Dependabot +already dual-target main AND develop. The work here is (a) closing the gaps to the canonical model, (b) adding the missing governance +surface (`WORKFLOW.md`, `repo-config/`, the converged AGENTS/CODESTYLE/copilot sections), and (c) **deciding the one big architectural +tension**: the publisher today builds BOTH branches in one run (build-main + build-develop legs), which is exactly the cross-branch NBGV +classification hazard the canonical one-branch model exists to remove. + +--- + +## 1. Current-state assessment + +### 1.1 Workflows present (`.github/workflows/`) +| File | Trigger | Role | Canonical status | +|---|---|---|---| +| `test-pull-request.yml` | `pull_request` [main, develop] + `workflow_dispatch` | CI: paths-filter -> test-release + smoke-build + aggregator | **DIVERGES**: uses `pull_request` not `push: ['**']`; uses `dorny/paths-filter`; aggregator name is `Check pull request workflow status` (missing trailing ` job`) | +| `publish-release.yml` | `workflow_dispatch` + `schedule` (Mon 02:00 UTC) | Publisher: get-version(main) + build-base(main) + build-main + build-develop + github-release + docker-readme + date-badge + cleanup | **DIVERGES**: builds BOTH branches in one run (two legs), not one-branch-per-run | +| `build-base-images-task.yml` | `workflow_call` | Builds shared `nx-base` + `nx-base-lsio` (matrix of 2), branch-scoped buildcache | repo-owned, keep | +| `build-docker-task.yml` | `workflow_call` | Builds product images from `Make/Matrix.json` (matrix over `.Images`), threads SemVer2 in | repo-owned, keep; NBGV 
threading OK (see 1.4) | +| `get-version-task.yml` | `workflow_call` (input `ref`) | Single NBGV run, exposes SemVer2 + assembly versions + GitCommitId | canonical-shaped; no IGNORE_GITHUB_REF | +| `test-release-task.yml` | `workflow_call` + `workflow_dispatch` | husky lint + `dotnet test` | rename/fold into `validate-task.yml` (see 2) | +| `run-codegen-pull-request-task.yml` | `workflow_call` | Matrix codegen main+develop -> `codegen-main`/`codegen-develop` PRs | dual-target, keep (gotcha 11) | +| `run-periodic-codegen-pull-request.yml` | `workflow_dispatch` + `schedule` (daily 04:00) | Calls the codegen task | keep | +| `merge-bot-pull-request.yml` | `pull_request_target` [opened, reopened, synchronize] | merge-dependabot + merge-codegen + disable-on-maintainer-push | **DIVERGES**: merge step lacks `--delete-branch` (gotcha 5) | +| `build-datebadge-task.yml` | `workflow_call` | BYOB "Last Build" badge | **STRIP** (gotcha 12) | +| `publish-docker-readme-task.yml` | `workflow_call` | Docker Hub overview via `peter-evans/dockerhub-description`, manifest-derived repo list | **FOLD** into the docker task as a main-publish step (gotcha 12) | + +### 1.2 Governance surface (the big gap) +- **No `WORKFLOW.md`.** No `repo-config/` (no `configure.sh`, no ruleset JSON, no `settings.json`, no `repo-config/README.md`). + The 5D audit and the GitHub-side config are entirely absent. This is the largest single body of new work. +- `AGENTS.md` exists with: Solution Structure, Build/Validation, Image Architecture, CI Pipeline, Versioning, Git and Commit Rules, + PR Title/Commit Conventions, PR Review Etiquette (full canonical contract, with the "Mandatory in every derived repo" banner + Merge + Gate), Coding Conventions (Highlights), Notes for Changes, Template adaptations. It does NOT reference WORKFLOW.md/repo-config. + It LACKS a dedicated `### Comments` subsection and a `## Documentation Style Conventions` section (those live in PlexCleaner AGENTS.md). 
+- `.github/copilot-instructions.md` exists with the full canonical Review Runbook (requestReviews mutation, GraphQL-vs-REST login split, + head-SHA coverage, bounded retry, thread resolution). Good - ports nearly verbatim, only owner/name strings change. +- `CODESTYLE.md` exists (25 KB). Needs a heading-by-heading diff against PlexCleaner's to converge the shared structure. + +### 1.3 Version scheme +`version.json` floor `2.14`, `publicReleaseRefSpec: ["^refs/heads/main$"]`, `nugetPackageVersion.semVer: 2`. NBGV computes +`X.Y.` on main, `X.Y.-g` elsewhere. `get-version-task.yml` runs NBGV once (dotnet/nbgv@master, floated - tag +stream lags so Dependabot would propose a downgrade; a deliberate `@master` float, keep with the existing comment). + +### 1.4 NBGV threading today (gotcha 2 status) +- `get-version-task.yml` runs NBGV **once** and exposes outputs. Good. +- `build-docker-task.yml` ALSO contains a nested `get-version` call (`needs: [get-version, ...]`) and threads + `needs.get-version.outputs.SemVer2` into `LABEL_VERSION`. So in the PUBLISHER path, NBGV runs in TWO places: once at top-level + `publish-release.yml::get-version` (ref: main, for the release tag) AND once inside each `build-docker-task` invocation (ref: the leg's + ref). These are different NBGV runs on different refs. For the image LABEL_VERSION this is arguably intentional (the develop leg should + label its images with the develop prerelease version), but it is a SECOND NBGV run per leg - exactly what gotcha 2 says to avoid, and it + means the develop leg's NBGV classification depends on which `ref` it was handed, not on GITHUB_REF. This works today only because the + legs pass an explicit `ref` (main commit / develop) and NBGV keys off the checked-out branch when given a real branch ref. See 3 for the + recommended consolidation. + +### 1.5 Cross-branch / NBGV-leak exposure (the crux) +`publish-release.yml` builds main and develop in ONE run via separate jobs. 
`GITHUB_REF` for the whole run is the dispatch ref. The +build legs each pass an explicit `ref` (`build-main` pins `get-version.outputs.GitCommitId`; `build-develop` passes `ref: develop`). +The repo does NOT set `IGNORE_GITHUB_REF` anywhere. The safety net is the **`Verify public release version step`** (the D2.2 backstop) in +`github-release`, which refuses to publish a `main` GitHub release carrying a prerelease `-`. So the GitHub RELEASE classification is +protected. The IMAGE tag/label classification, however, is governed by the nested `get-version` in each docker leg keyed on the leg's +`ref`, not on GITHUB_REF - so as long as `build-develop` passes `ref: develop`, its nested NBGV checks out develop and classifies +prerelease. This is the matrix-publisher case the memory `nbgv-publicrelease-githubref-leak.md` says either needs `IGNORE_GITHUB_REF` OR +(preferred) should migrate to one-branch-per-run. + +### 1.6 Smoke / CI path +`test-pull-request.yml` runs on `pull_request`. `dorny/paths-filter` gates: builds smoke (NxMeta + NxMeta-LSIO, amd64, no push) only when +`Docker/**` / `Make/Matrix.json` / `Make/Version.json` changed; `build_base` only when a base Dockerfile changed. Aggregator +`check-workflow-status` (name `Check pull request workflow status`) treats success|skipped as pass, fails on failure|cancelled. Has a +branch-deletion concern: on `pull_request` there is no branch-deletion event, so gotcha 4's `!github.event.deleted` guard is **n/a while +the trigger stays `pull_request`** - but if we move to `push: ['**']` (canonical), the guard becomes mandatory. 
+ +### 1.7 Branch hygiene / backlog (messy) +`git branch -a` shows a substantial backlog of remote branches that must be reconciled or pruned before/around go-live: +`codegen`, `codegen-main`, `codegen-develop` (codegen working branches - expected, transient), `dependabot/...` on BOTH main and develop +(3 live), plus stragglers: `backport-cicd-fixes`, `bump-version-2.13`, `chore/sync-template`, `feature/sync-versioned-rulesets`, +`fix-lsio-puid-pgid-ordering`, `fix-release-version-tag-race`, `fix/release-skip-log-message`, `fix/release-tag-pinning-and-skip-existing`, +`propagate-versioning-policy`, `realign-template-lint-config`, `release-notes-2.14`, `shields`. Several look like prior CI-fix attempts. +Action: audit each before go-live; the migration branch should supersede the relevant `fix-release-*` / `*cicd*` / `*versioning*` +branches, and those should be closed (not merged) to avoid re-introducing superseded mechanics. Local-only branches not on origin +(`fix/release-skip-log-message`, `release-notes-2.14`) can be deleted locally. + +### 1.8 EOL / prose +All 10 workflow files are CRLF (verified by `grep -c $'\r'`). `AGENTS.md` has 0 em-dashes; `README.md` has 1 em-dash (must be swept). +`.slnx` is **stale**: it references LanguageTags-shaped workflow files that do not exist here (`build-executable-task.yml`, +`build-library-task.yml`, `build-release-task.yml`, `publish-periodic-docker-release.yml`) and omits files that DO exist +(`build-base-images-task.yml`, `build-docker-task.yml`). Must be rebuilt to the real file set (gotcha: do not mirror LanguageTags' .slnx). + +--- + +## 2. Target architecture + +The target keeps NxWitness's legitimately repo-specific build layer (shared base + per-product matrix) while converging the +orchestration, governance, and classification model. 
Per-file target: + +### 2.1 `test-pull-request.yml` (CI) -> converge to push-on-every-branch self-test +- **Trigger:** `push: branches: ['**']` (NOT `pull_request`) + `workflow_dispatch`. Rationale: the canonical model self-tests by pushing + the branch; reusable `./...` logic resolves from the head; the aggregator's ruleset-bound context has a single producer (the push run). + This is also required for the Dependabot-in-repo-branch path to produce the required check. +- **Concurrency:** `group: ${{ github.workflow }}-${{ github.ref }}`, `cancel-in-progress: true`. +- **Branch-deletion guard (gotcha 4):** every job `if: ${{ !github.event.deleted }}`; aggregator `if: ${{ always() && !github.event.deleted }}`. +- **Drop `dorny/paths-filter` (gotcha 12).** Two options for the smoke gate, FLAGGED for the maintainer (open question 9.1): + - (A) Always run a minimal smoke (NxMeta + NxMeta-LSIO, amd64, no push) on every branch push. Simplest, matches PlexCleaner's + unconditional smoke; costs a docker build on every doc-only push. + - (B) Replace paths-filter with a cheap inline `git diff --name-only` step inside a single `changes` job (no third-party action) to + keep the "only build images when image files changed" optimization. Preserves today's behavior without the dropped action. + - **Recommendation: (B).** NxWitness pushes are frequent (codegen + 6 Dependabot ecosystems x 2 branches) and a full product smoke is + heavier than PlexCleaner's single-target smoke; keep the change-gate but implement it inline to honor "strip paths-filter." +- **Jobs:** `validate` (lint + `dotnet test`, was `test-release-task.yml`), `changes` (inline diff, if option B), `smoke-build` + (calls `build-docker-task.yml` smoke), `check-workflow-status` (aggregator), `cleanup-artifacts`. +- **Aggregator name MUST become exactly `Check pull request workflow status job`** (add trailing ` job`) to match the canonical + required-check string used by `repo-config` (gotcha 7). 
Keep the success|skipped allowlist (gotcha 8) it already has; for `changes` keep + the "must succeed" semantics (a failed `changes` must not let an image-changing PR through as a skip). +- **Smoke `branch` input:** today passes `github.base_ref` (a PR concept). Under `push`, there is no base_ref. Pass `github.ref_name` + so a push to develop validates develop's Matrix rows and a push to main validates main's (matches the build task's branch filter). + +### 2.2 `validate-task.yml` (NEW, rename of `test-release-task.yml`) +- `workflow_call` + `workflow_dispatch`. Jobs: dotnet restore/tool-restore, husky lint, `dotnet test`, plus the static doc validators + if the canonical validate carries them (markdownlint/cspell scoped to README+HISTORY per LanguageTags convention). Name the aggregated + job `Validate job` per canonical. This is the CI's quality gate; it does not build images. + +### 2.3 `publish-release.yml` (Publisher) - DECIDED (triggered-Docker, one-branch-per-run) +**Decided shape (signed off 2026-06-29): triggered-Docker.** Triggers: `workflow_dispatch` + `schedule` +(weekly Mon 02:00, main baseline) **+ path-scoped `push` on main when the codegen matrix changes** +(`push: { branches: [main], paths: [ ] }`). +Keep: global non-ref-scoped concurrency (`group: ${{ github.workflow }}`, `cancel-in-progress: false`); the +`Verify public release version` D2.2 backstop on the main release; the skip-existing-release guard; the +artifact cleanup job. + +**Run shape: one-branch-per-run, schedule is main-only.** +- Schedule -> builds `main` only (full product matrix; baseline base/CVE refresh + versioned release). +- Push-on-`Matrix.json`-change (main) -> builds `main` (codegen committed a new matrix => publish the new + product versions immediately; this is the accepted publish-on-matrix-change, superseding the earlier + weekly-only decision). Only the matrix file change publishes; ordinary code merges do not (path filter). 
+- Dispatch -> builds `github.ref_name`, guarded `if: github.ref_name == 'main' || github.ref_name == 'develop'`. **`:develop` is refreshed by manual dispatch only** (the earlier "paired develop re-dispatch to + refresh :develop weekly" is DROPPED per maintainer: weekly builds main only). +- Jobs: `get-version` (ref: `github.ref_name`), `build-base` (push: true, ref: `github.ref_name`), + `build-docker` (push: true, branch: `github.ref_name`, ref: pinned to `get-version.outputs.GitCommitId` + for main / `github.ref_name` for develop), `github-release` (`if: github.ref_name == 'main'` - a develop + dispatch publishes images + the `:develop` tag but cuts no versioned GitHub release), `docker-readme` + (main only), `cleanup-artifacts`. +- NBGV classifies natively: main schedule/push/dispatch => clean version; a develop dispatch => prerelease. + No `IGNORE_GITHUB_REF`, no cross-branch leg. The D2.2 backstop stays as defense-in-depth. The push trigger + is branch-filtered to main, so develop's daily codegen matrix update is sync-only and never publishes. +- **Base-image sharing under one-branch:** the shared `nx-base` tag (`:ubuntu-noble`) is branch-agnostic. A + develop dispatch rebuilding the base would overwrite the shared tag with develop's base. Recommendation: + the develop dispatch sets `build_base: false` and pulls the main-built shared base; the weekly main + schedule refreshes the base for CVEs. If base divergence between branches is a real risk, FLAG (open + question 9.2). + +**Cost note:** publish-on-matrix-change rebuilds the full product matrix on every codegen bump (the +maintainer accepted this over the cheaper weekly-only, for the tightest upstream-vuln window). The matrix +build keeps `max-parallel: 4` and branch-scoped buildcache to bound runner time. 
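
A minimal sketch of the run-shape rules above, written as a shell decision table. The function name `publish_plan` and its output format are hypothetical and exist only for illustration; the real gating would live in the workflow's trigger filters and job-level `if:` expressions. It assumes the recommendation above that a develop dispatch sets `build_base: false`.

```shell
# Hypothetical helper (not part of the repo): maps a trigger event and ref name
# to the branch a publish run builds and the main-only jobs it enables.
publish_plan() {
  event="$1"
  ref_name="$2"

  case "$event" in
    # The weekly schedule always builds main.
    schedule) branch="main" ;;
    # The path-scoped push trigger is branch-filtered to main.
    push)
      if [ "$ref_name" = "main" ]; then
        branch="main"
      else
        echo "skip"
        return 0
      fi
      ;;
    # A dispatch builds the branch it was dispatched from.
    workflow_dispatch) branch="$ref_name" ;;
    *)
      echo "skip"
      return 0
      ;;
  esac

  case "$branch" in
    main) echo "branch=main build_base=true github_release=true docker_readme=true" ;;
    develop) echo "branch=develop build_base=false github_release=false docker_readme=false" ;;
    # Dispatch guard: only main and develop are publishable.
    *) echo "skip" ;;
  esac
}

publish_plan schedule main
publish_plan workflow_dispatch develop
publish_plan workflow_dispatch feature/foo
```

Keeping the mapping in one table makes the invariant easy to review: exactly one branch per run, and only a main run cuts a GitHub release or updates the Docker Hub overview.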
+ +### 2.4 `build-docker-task.yml` (repo-owned build layer) - keep, with NBGV consolidation +- Keep the `get-matrix` job (smoke filter / branch filter / full), the product matrix over `.Images`, the multi-arch build, the + branch-scoped registry buildcache (read both `buildcache-main` + `buildcache-develop`, write only this branch on push). +- **NBGV (gotcha 2):** remove the nested `get-version` job; instead accept threaded inputs `semver2` (and assembly versions if the image + embeds them) as REQUIRED workflow_call inputs, passed by the orchestrator's single `get-version` run. The orchestrator computes the + version for the branch being built and threads it down. This matches PlexCleaner's `build-docker-task` (version threaded, never + re-run). Smoke callers can pass a placeholder/threaded smoke version. (If the maintainer prefers each leg to label with its own branch + version under shape 3.2, the orchestrator runs get-version per leg and threads each leg's value - still single-NBGV-per-leg, no nested + re-run.) +- Keep `max-parallel: 4` on the product matrix. + +### 2.5 `build-base-images-task.yml` - keep verbatim (repo-owned) +Shared `nx-base` / `nx-base-lsio` matrix, branch-scoped buildcache + inline cache. The `ref` input lets the publisher build from main. +Under shape 3.1, the develop run sets `build_base: false`. + +### 2.6 `get-version-task.yml` - keep; conditionally add IGNORE_GITHUB_REF +Single NBGV run. Under shape 3.1: NO `IGNORE_GITHUB_REF` (native classification). Under shape 3.2: ADD `env: IGNORE_GITHUB_REF: "true"`. + +### 2.7 Codegen (`run-codegen-pull-request-task.yml` + `run-periodic-codegen-pull-request.yml`) - keep, dual-target (gotcha 11) +Matrix runs codegen on main AND develop, opens `codegen-main->main` and `codegen-develop->develop` PRs via the App token (so +`pull_request`/push events fire), CSharpier formats, merge-bot auto-merges each independently. This is the canonical dual-branch codegen +case the briefing's gotcha 11 protects. 
KEEP both targets. The daily schedule (04:00) is staggered after the weekly publish (Mon 02:00). +Note: the codegen task today runs ONLY `matrix --updateversion` (regenerates `Make/Version.json` + `Make/Matrix.json`); it does NOT run +`make` (Dockerfiles/compose are regenerated by a human via `Make/Create.sh`). Document this seam in WORKFLOW.md (S-section): the codegen +PR keeps the matrix current with upstream Nx versions; Dockerfile changes are a separate human-driven path. FLAG (open question 9.3): +should codegen also run `make` so new upstream versions auto-regenerate Dockerfiles? Current answer: no - Dockerfile structure changes +are reviewed; only version data auto-updates. Recommend keeping as-is. + +### 2.8 `merge-bot-pull-request.yml` - add `--delete-branch` (gotcha 5) +Add `--delete-branch` to the `gh pr merge --auto "$method"` calls in BOTH `merge-dependabot` and `merge-codegen` jobs (NOT to the +disable-auto job). Keep repo-wide `delete_branch_on_merge: false` in `settings.json` (gotcha 5 + github-auto-delete-branch-gotcha: +prevents a develop->main promotion from deleting develop). Per-merge deletion is explicit. Keep the per-base method case +(develop=squash, main=merge), the major-NuGet skip, the strict codegen head/base pairing, and `pull_request_target` + App-token model. + +### 2.9 Docker Hub overview - fold into the docker task (gotcha 12) +Delete `publish-docker-readme-task.yml` as a standalone and add a `peter-evans/dockerhub-description` step that runs ONLY on a main +publish. Because NxWitness has 12 repos, the fold must iterate the repo list. Cleanest: a small `docker-readme` job inside +`publish-release.yml` gated `if: github.ref_name == 'main'`, deriving the repo list from `Make/Matrix.json` via the existing +`manifest-jq` (`[.Images[].Name | ascii_downcase | "ptr727/\(.)"] + ["ptr727/nx-base","ptr727/nx-base-lsio"] | sort | unique`) and +matrixing `peter-evans/dockerhub-description` over it. 
This keeps the behavior but removes the standalone reusable file the briefing +says to strip. (If the maintainer prefers to keep the reusable task file for clarity given 12 repos, that is a defensible per-repo +deviation - document it. Recommendation: fold, to match the canonical strip.) + +### 2.10 `build-datebadge-task.yml` - DELETE (gotcha 12) +Remove the file and the `date-badge` job from `publish-release.yml`, and strip the badge from README if it points at the BYOB gist. + +### 2.11 `WORKFLOW.md` (NEW) - port from PlexCleaner, adapt for codegen + multi-image +Structure mirrors PlexCleaner: model-at-a-glance, glossary, architecture, the D0..D10 behavioral contract, 5-test methodology (5A static, +5B trace scenarios S1..Sn, 5C live probe, 5D config audit), repository configuration. NxWitness-specific additions: +- D-guarantees for the **codegen seam**: codegen dual-targets main+develop; codegen PR regenerates `Version.json`+`Matrix.json` only; + forward-only version guard (`ReleaseVersionForward`) prevents generic-tag regression; merge-bot auto-merges each codegen PR. +- D-guarantees for the **multi-image matrix**: shared base built once and reused; product matrix from `Matrix.json`; per-product Docker + Hub repos + base repos; multi-arch amd64+arm64; branch-scoped buildcache; weekly base refresh for CVEs. +- The NBGV-classification guarantee adapted to the chosen shape (3.1 native vs 3.2 IGNORE_GITHUB_REF), with the D2.2 backstop guarantee. +- A "Template adaptations" appendix documenting the legitimate divergences (shared-base fan-out, Docker-only release with no + release-asset files, folded docker-readme, codegen replacing merge-upstream-version). + +### 2.12 `repo-config/` (NEW) - port from PlexCleaner verbatim, retarget strings +- `configure.sh`: copy PlexCleaner's verbatim (helper bodies `jq_lacks`, `check_secrets`, `ruleset_id`, `check_app`, `assert`, `pass`, + `fail`, `note`, `apply_ruleset`, `cmd_apply`, `cmd_check`). 
Retarget: `REQUIRED_CHECK="Check pull request workflow status job"`, + `REQUIRED_ACTIONS_SECRETS`/`REQUIRED_DEPENDABOT_SECRETS` = `(DOCKER_HUB_USERNAME DOCKER_HUB_ACCESS_TOKEN CODEGEN_APP_CLIENT_ID + CODEGEN_APP_PRIVATE_KEY)` (identical set, both stores - gotcha 3), and the manual-verify note to enumerate the 12 NxWitness Docker Hub + repos (or note "push to docker.io/ptr727/"). `cmd_check` order: ruleset develop (squash, linear), + ruleset main (merge, non-linear), settings, security, secrets, app. +- `ruleset-develop.json`: condition `refs/heads/develop`; rules `deletion`, `non_fast_forward`, `required_linear_history`, + `required_signatures`, `pull_request` (allowed_merge_methods `["squash"]`, `required_review_thread_resolution: true`, + `dismiss_stale_reviews_on_push: true`, approvals 0), the required status check context + `Check pull request workflow status job` (integration_id 15368), `copilot_code_review` (review_on_push true). +- `ruleset-main.json`: condition `refs/heads/main`; SAME minus `required_linear_history` (must allow the develop->main merge commit); + `allowed_merge_methods: ["merge"]`; same status check + copilot rules. +- `settings.json`: `{ allow_squash_merge true, allow_merge_commit true, allow_rebase_merge false, allow_auto_merge true, + delete_branch_on_merge false }`. +- `repo-config/README.md`: port PlexCleaner's; retarget the repo slug (`ptr727/NxWitness`), the secret set, the Docker Hub repo + enumeration, and the required-check lockstep note. 
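
The required-check lockstep noted above can be spot-checked locally before `configure.sh apply`. A minimal sketch, assuming the aggregator name appears literally in both the workflow file and the ruleset JSON; `check_lockstep` is a hypothetical helper, not one of the ported `configure.sh` functions, and a fixed-string match is weaker than parsing the YAML or JSON.

```shell
REQUIRED_CHECK="Check pull request workflow status job"

# Hypothetical helper: succeeds only when every given file contains the exact
# required-check string (fixed-string match, so no regex surprises).
check_lockstep() {
  status=0
  for file in "$@"; do
    if grep -qF -- "$REQUIRED_CHECK" "$file"; then
      echo "ok: $file"
    else
      echo "drift: $file"
      status=1
    fi
  done
  return "$status"
}

# Example invocation (paths as named in this plan):
# check_lockstep .github/workflows/test-pull-request.yml \
#   repo-config/ruleset-main.json repo-config/ruleset-develop.json
```

A file that still carries the old name without the trailing ` job` reports `drift`, which is the active bug gotcha 7 describes.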
+ +### 2.13 `.slnx` - rebuild to the real file set +Replace the stale LanguageTags-shaped file list with the actual files: workflows +`build-base-images-task.yml`, `build-docker-task.yml`, `get-version-task.yml`, `merge-bot-pull-request.yml`, `publish-release.yml`, +`run-codegen-pull-request-task.yml`, `run-periodic-codegen-pull-request.yml`, `test-pull-request.yml`, `validate-task.yml`; plus +`CreateMatrix.csproj` + `CreateMatrixTests.csproj` projects and the solution items (`WORKFLOW.md`, `version.json`, etc.). Remove the +deleted `build-datebadge-task.yml` / `publish-docker-readme-task.yml` / non-existent template names. + +### 2.14 Release artifact +None as a file. The GitHub release carries auto source zip + README + LICENSE only (Docker-only repo; `fail_on_unmatched_files` omitted). +The published artifacts are the 12 Docker Hub repos' multi-arch images. `:latest`/`:stable` from main, `:develop` from develop, plus +`:` and `:develop-` tags from `Matrix.json`. + +--- + +## 3. Release model decision (signed off 2026-06-29) + +**Decision: triggered-Docker, one-branch-per-run.** Publish on `weekly schedule (main)` + +`push-on-Matrix.json-change (main)` + `workflow_dispatch`. Schedule is **main-only**; `:develop` refreshes by +manual dispatch only. + +**How it reconciles the codegen cadence:** +- Daily codegen keeps `Matrix.json`/`Version.json` current on **main AND develop** (dual-target, gotcha 11 - + drift-avoidance only; develop's update is sync-only and never publishes). +- When codegen commits a new matrix **to main**, the path-scoped push publishes the new product versions + immediately (the accepted **publish-on-matrix-change**, superseding the earlier weekly-only decision - the + maintainer accepted the full-matrix rebuild cost for the tightest upstream window). +- The weekly main schedule still runs even with no matrix change, to refresh the shared base image for CVEs + and re-cut from the current pin. 
+- A maintainer wanting an off-cycle or develop-channel build dispatches the publisher from the branch. + Ordinary code merges never publish (only the matrix file path triggers). + +**Why one-branch (not the old two-leg combined run):** each publish run is single-branch, so NBGV classifies +natively - no `IGNORE_GITHUB_REF`, no cross-branch leg, none of the `nbgv-publicrelease-githubref-leak` class +of bugs - and it converges NxWitness's publisher with the four live repos and the ESPHome triggered-Docker +sub-model (identical shape, only the trigger path file differs: `Matrix.json` here, `upstream-version.json` +there). The D2.2 `Verify public release version` backstop stays as defense-in-depth. + +**Confirm during execution:** the exact repo-relative path of the codegen matrix output for the push +`paths` filter (e.g. `CreateMatrix/Matrix.json` vs `Make/Matrix.json`), and that codegen writes it on the +main branch directly (or via an auto-merged `codegen-main` PR whose merge is the publishing push). + +--- + +## 4. Files to create / edit / delete + +### Create (7) +1. `WORKFLOW.md` (port from PlexCleaner + codegen/multi-image D-guarantees). +2. `repo-config/configure.sh` (verbatim helpers, retargeted secrets/check/repo). +3. `repo-config/ruleset-main.json`. +4. `repo-config/ruleset-develop.json`. +5. `repo-config/settings.json`. +6. `repo-config/README.md` (port + retarget). +7. `.github/workflows/validate-task.yml` (rename of `test-release-task.yml`, canonical `Validate job`). + +### Edit (10) +1. `.github/workflows/test-pull-request.yml` - trigger `push: ['**']`+dispatch; drop paths-filter (inline diff, option B); + `!github.event.deleted` guards; aggregator rename to `Check pull request workflow status job`; smoke `branch: github.ref_name`; + call `validate-task.yml`. +2. 
`.github/workflows/publish-release.yml` - one-branch-per-run (shape 3.1): single get-version/build-base/build-docker per run, main-only + release+readme, develop re-dispatch for the develop channel; remove the `date-badge` job; fold docker-readme as a main-only job. +3. `.github/workflows/build-docker-task.yml` - remove nested `get-version`, accept threaded `semver2` (+ assembly versions) inputs. +4. `.github/workflows/get-version-task.yml` - keep single NBGV; (shape 3.2 only) add `IGNORE_GITHUB_REF`. Under 3.1 unchanged. +5. `.github/workflows/merge-bot-pull-request.yml` - add `--delete-branch` to both merge jobs. +6. `.github/dependabot.yml` - verify the 6 ecosystem x 2 branch entries cover the actions used in new/edited workflows; keep dual-target. +7. `AGENTS.md` - add `### Comments` subsection + `## Documentation Style Conventions` (converge with PlexCleaner); add a reference to + `WORKFLOW.md` + `repo-config/`; refresh the "Template adaptations" section to match the chosen publisher shape and the folded + docker-readme / dropped date-badge. +8. `CODESTYLE.md` - converge shared headings with PlexCleaner (diff and align General/.NET structure; keep NxWitness specifics). +9. `.github/copilot-instructions.md` - retarget owner/name strings in the Review Runbook (`ptr727/NxWitness`); otherwise verbatim. +10. `NxWitness.slnx` - rebuild to the real file set. + (Plus: `README.md` em-dash sweep + strip date badge; `version.json` floor bump to mark the overhaul and exercise the publish path - + reconcile with the AGENTS "routine edits leave version.json untouched" rule by noting a deliberate maintainer-directed infra bump.) + +### Delete (2) +1. `.github/workflows/build-datebadge-task.yml`. +2. `.github/workflows/publish-docker-readme-task.yml` (folded into `publish-release.yml`). + +Net: 7 create, ~12 edit (incl. README + version.json), 2 delete. (Counts exclude the branch-backlog cleanup in 1.7.) + +--- + +## 5. 
Convergence + backports + +### 5.1 Port VERBATIM (byte-for-byte, owner/name strings only) +- `repo-config/configure.sh` helper bodies (`jq_lacks`, `check_secrets`, `ruleset_id`, `check_app`, `assert`, `pass`/`fail`/`note`, + `apply_ruleset`) - the hardened canonical forms (gotcha 6). +- `repo-config/ruleset-*.json` structure (only condition/merge-method/linear-history differ between main and develop, already canonical). +- `repo-config/settings.json` (identical to PlexCleaner). +- `.github/copilot-instructions.md` Review Runbook (only `ptr727/NxWitness` substitutions; bot id `BOT_kgDOCnlnWA`, the requestReviews + mutation, GraphQL-vs-REST login split, known-broken `POST /requested_reviewers` note all carry verbatim). +- AGENTS.md shared subsections: `### Comments`, `## Git and Commit Rules`, the "Where rules live" lead-in, `## PR Review Etiquette` + (already present and canonical here), `## Documentation Style Conventions` incl. "write docs in the current state". + +### 5.2 Adapt (repo-specific) +- `configure.sh` required-secret list (same 4 names but the manual Docker Hub note enumerates 12 NxWitness repos) and `REPO` slug. +- `WORKFLOW.md` Docker mechanics + the NEW codegen and multi-image D-guarantees and the "Template adaptations" appendix. +- AGENTS.md "Template adaptations" and the codegen/Image-Architecture sections. +- `dependabot.yml` (6 ecosystems incl. docker, dual-target - already adapted). + +### 5.3 Backports to the four live repos (drift found) +- Confirm all four live repos' `merge-bot` use `--delete-branch` (gotcha 5). NxWitness lacked it; the others were noted to have it - + spot-check and backport if any regressed. +- If NxWitness's `configure.sh` helpers (ported from PlexCleaner) reveal any newer hardening than what LanguageTags/Utilities carry, + backport the hardened helper to those NuGet repos (they share the helper bodies verbatim). 
+- The folded docker-readme pattern (main-only `dockerhub-description` step) should match PlexCleaner's approach; if PlexCleaner kept a + reusable file vs an inline step, align NxWitness to whichever the maintainer blessed as canonical (PlexCleaner stripped the standalone - + fold here too). + +--- + +## 6. Gotcha checklist mapped to NxWitness + +1. **NBGV GITHUB_REF classification.** APPLIES. Under shape 3.1 (recommended) github.ref matches the built branch -> native + classification, no IGNORE_GITHUB_REF. Under shape 3.2 (combined two-leg) IGNORE_GITHUB_REF is REQUIRED. `version.json` floor 2.14, + `publicReleaseRefSpec ^refs/heads/main$`. Bump the floor to exercise the publish path. +2. **NBGV threading.** APPLIES. Today `build-docker-task.yml` re-runs NBGV via a nested `get-version`. Fix: remove the nested job, thread + `semver2` (+ assembly versions) from the orchestrator's single get-version run. This also removes the `:SemVer2`-tag-collision risk + across the image matrix (one classification feeds all product legs). +3. **Docker creds in BOTH stores.** APPLIES. `DOCKER_HUB_USERNAME` + `DOCKER_HUB_ACCESS_TOKEN` must be in Actions AND Dependabot stores + (Dependabot push CI smoke-builds and logs in to Docker Hub). `configure.sh` enforces both. Same for `CODEGEN_APP_*` (merge-bot). +4. **Branch-deletion guard.** APPLIES ONCE we move CI to `push: ['**']`. Add `!github.event.deleted` to every CI job + aggregator. n/a + while the trigger stays `pull_request` (no such event), but the move to push makes it mandatory. +5. **merge-bot `--delete-branch`.** APPLIES. Missing today; add to both merge jobs. Repo-wide auto-delete stays OFF in settings.json. +6. **5D audit hardened helpers.** APPLIES. Port the final hardened `jq_lacks` (exit 4 = lacks; keep stderr), `check_secrets` + (API error FAILs, both stores, paginate), `ruleset_id` (`first // empty` in jq, no `head -1`, let gh print error), `check_app` + (best-effort note, never fails). 
Audit must fail when it cannot verify. +7. **Required-check name lockstep.** APPLIES + ACTIVE BUG. The aggregator is named `Check pull request workflow status` (missing + ` job`). The required-check string, the aggregator job `name:`, and the ruleset JSON must all read `Check pull request workflow status + job`. Run `configure.sh apply` in the same change that ships the workflow edit, then `check`. +8. **Aggregator success/skipped allowlist.** APPLIES; already correct (success|skipped pass, failure|cancelled fail). Keep the `changes` + "must succeed" carve-out so an image-changing PR cannot merge on a `changes` failure treated as skip. +9. **EOL discipline.** APPLIES. All workflows are CRLF today; keep CRLF for md/yml/json/code-workspace/slnx, LF for .sh/Dockerfile/.py. + Pin in `.gitattributes`/`.editorconfig` (present - verify they cover `.slnx`). Re-check after Write/Edit (they can flip CRLF to LF); + verify with `grep -c $'\r'` not `file`. +10. **Copilot review loop.** APPLIES. Runbook already in `.github/copilot-instructions.md`. snupkg/OIDC/NBGV-prerelease false positives; + decline with rationale. Expect 1-3 rounds. Use `gh api -X PATCH .../pulls/N -F body=@file` for body edits. +11. **Dependabot + codegen dual-target main AND develop.** APPLIES - this is the canonical case. Both `dependabot.yml` and + `run-codegen-pull-request-task.yml` already dual-target. KEEP both; do not collapse to single-target (maintainer rejected it; it + caused non-linear rebase/merge-block conflicts). +12. **Strip template cruft.** APPLIES. Delete `build-datebadge-task.yml`; fold `publish-docker-readme-task.yml` into a main-publish step; + drop `dorny/paths-filter`; there is no `setup`/`PUBLISH_ON_MERGE` machinery here (already absent); merge-bot already omits + `merge-upstream-version` (codegen replaces it) - keep that. +13. **Action SHAs.** APPLIES. 
Current pins look converged (setup-dotnet v5.4.0 `26b0ec14...`, checkout v7.0.0 `9c091bb2...`, + create-github-app-token v3.2.0, docker actions v4.x/v7.2.0, softprops v3.0.1). VERIFY every SHA->version against the GitHub API before + asserting in review; do not trust Copilot's SHA/version mapping. +14. **Prose rules.** APPLIES. No em-dashes (README has 1 - sweep it). US English. Terse comments, one line if <~120 cols, top-of-file + workflow summaries (present). Never edit human-authored comments. + +--- + +## 7. Verification + +### 7.1 Static (local, before push) +- `actionlint` on all workflows (Docker image or npx). +- `markdownlint-cli2` on `WORKFLOW.md`, `AGENTS.md`, `CODESTYLE.md`, `README.md`, `repo-config/README.md`. +- `cspell` (scope = README + HISTORY per convention; add product/codegen terms to the dictionary as needed). +- YAML + JSON parse: every workflow, `Make/Matrix.json`, `Make/Version.json`, `version.json`, the three `repo-config/*.json`, + `.slnx` well-formedness. +- `bash -n repo-config/configure.sh` and `shellcheck` (the helpers carry `# shellcheck disable` directives - preserve them). +- `dotnet build` + `dotnet test` (CreateMatrixTests) green; `dotnet csharpier --check` + `dotnet husky run` clean. +- EOL audit: `grep -c $'\r'` to confirm CRLF on md/yml/json/code-workspace/slnx and LF on .sh/Dockerfile (gotcha 9). +- Em-dash sweep: `grep -rn '—'` across the tree (expect 0 after the README fix). +- Token sweep for stale template references: `LanguageTags|ProjectTemplate|build-executable-task|build-library-task| + build-release-task|publish-periodic-docker-release|datebadge|paths-filter|PUBLISH_ON_MERGE` (expect only intentional historical mentions). +- Codegen smoke: `dotnet run --project ./CreateMatrix -- matrix --versionpath=./Make/Version.json --matrixpath=/tmp/m.json + --updateversion` against a copy, confirm it still produces a valid Matrix.json (and that the forward-only guard holds). 
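
The EOL audit and em-dash sweep above can be scripted. A minimal sketch under assumptions: the helper names are hypothetical, the extension map follows gotcha 9, and the CR and em-dash bytes are built with `printf` so the script stays portable to plain `sh` and contains no literal em-dash.

```shell
cr=$(printf '\r')
emdash=$(printf '\342\200\224') # UTF-8 bytes of U+2014

# Expected line ending by file name, per gotcha 9.
expected_eol() {
  case "$1" in
    *.md | *.yml | *.json | *.code-workspace | *.slnx) echo crlf ;;
    *.sh | *.py | *Dockerfile) echo lf ;;
    *) echo any ;;
  esac
}

# Actual line ending: crlf, lf, or mixed (counts lines containing a CR).
actual_eol() {
  total=$(grep -c '' "$1" || true)
  with_cr=$(grep -c "$cr" "$1" || true)
  if [ "$with_cr" -eq 0 ]; then
    echo lf
  elif [ "$with_cr" -eq "$total" ]; then
    echo crlf
  else
    echo mixed
  fi
}

# Number of lines containing an em-dash (expect 0 after the README fix).
emdash_lines() {
  grep -c "$emdash" "$1" || true
}

# Report every file whose actual EOL differs from the expected one.
audit_eol() {
  status=0
  for file in "$@"; do
    want=$(expected_eol "$file")
    have=$(actual_eol "$file")
    if [ "$want" != "any" ] && [ "$want" != "$have" ]; then
      echo "EOL mismatch: $file (want $want, have $have)"
      status=1
    fi
  done
  return "$status"
}
```

The sketch counts CR bytes with `grep -c` rather than trusting `file`, matching the verification rule in gotcha 9.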
+ +### 7.2 Config audit +- `repo-config/configure.sh check` BEFORE `apply`: expect drift (no ruleset yet, required-check name mismatch, possibly missing secrets + in one store). Document the expected drift list. +- `repo-config/configure.sh apply` then `check`: expect "Configuration matches" (modulo the App best-effort note and the manual Docker + Hub push note). + +### 7.3 Live dispatch verification (post-merge) +- Dispatch `publish-release.yml` from `main`: confirm clean `X.Y.` images on the 12 repos, `:latest`/`:stable` tags, multi-arch + manifest (amd64+arm64), the versioned GitHub release, the Docker Hub overview pushed, NO prerelease `-` on the main release. +- Dispatch from `develop`: confirm `:develop` + `:develop-` tags, prerelease classification (`X.Y.-g` label), NO + versioned GitHub release. +- Confirm the shared base tags (`nx-base:ubuntu-noble`, `nx-base-lsio:ubuntu-noble`) are intact and not overwritten by a develop run. +- Trigger codegen via dispatch: confirm both `codegen-main->main` and `codegen-develop->develop` PRs open and merge-bot auto-merges with + branch deletion. + +--- + +## 8. Go-live sequence + +1. Branch `migrate/branch-scoped-cicd` off `develop`. Verify SSH signing is live before the first commit (committing is enabled here). +2. Apply all create/edit/delete (section 4). Run the full static + codegen verification (7.1) locally. +3. Push the branch (CI now runs via `push: ['**']`). Open PR -> `develop`. +4. Copilot dance (gotcha 10): poll for auto-review, re-request via `requestReviews` mutation after each push, resolve every thread, + decline false positives (snupkg/OIDC n/a here; NBGV-prerelease, the IGNORE_GITHUB_REF presence/absence, and the two-leg-vs-one-branch + choice are the likely debate points) with rationale. Budget 1-3 rounds. +5. 
`repo-config/configure.sh apply` against the live repo IN THE SAME change window (gotcha 7): this writes the rulesets, renames the + required check to `Check pull request workflow status job` in lockstep with the workflow edit, and adds the Copilot rule - unblocking + the PR. Then `configure.sh check` -> matches. +6. Squash-merge to `develop`. Confirm CI green on develop. +7. Promote `develop -> main` via a merge-commit PR, Copilot-reviewed, NO admin bypass (main ruleset allows the merge commit by omitting + `required_linear_history`). Watch for the migration-promotion-conflict pattern if main has straggler bumps in rewritten files; if it + bites, a local signed merge commit (tree=develop) is the documented escape, but try the normal PR first. +8. Dispatch `publish-release.yml` from `main` to verify the publish path end-to-end (7.3). Then dispatch from `develop` to verify the + develop channel. +9. Confirm `develop` survives the promotion (github-auto-delete-branch-gotcha: delete_branch_on_merge stays OFF). +10. Prune the branch backlog (1.7): close superseded `fix-release-*` / `*cicd*` / `*versioning*` / `chore/sync-template` / + `release-notes-2.14` / `shields` branches (do NOT merge them - the migration supersedes their mechanics); delete merged dependabot + branches; let codegen branches recreate themselves on the next daily run. + +--- + +## 9. Open questions for the maintainer (with recommended defaults) + +1. **Smoke gate (2.1).** Always-smoke (A) vs inline-diff change gate (B)? **Default: B** - NxWitness has many frequent pushes and a + full product smoke is heavier than PlexCleaner's single-target smoke; keep the change gate but implement inline (drop paths-filter). +2. **Base-image sharing under one-branch (2.3).** Build the base only on the main run and have develop reuse the published shared tag, or + build per-branch? 
**Default: build on main, develop reuses (`build_base: false`)** - the `:ubuntu-noble` base tag is branch-agnostic; a + develop rebuild would churn the shared tag. Confirm base Dockerfiles never diverge between branches. +3. **Codegen scope (2.7).** Should codegen also run `make` to auto-regenerate Dockerfiles/compose on a new upstream version, or stay + version-data-only? **Default: stay version-data-only** - Dockerfile structure changes warrant human review; the matrix data is the + safe-to-automate part. +4. **Publisher shape (3).** One-branch-per-run + develop re-dispatch (3.1) vs hardened combined two-leg run with IGNORE_GITHUB_REF (3.2)? + **Default: 3.1** - converges with the four live repos and deletes the cross-branch NBGV-leak class. 3.2 is acceptable only if the + maintainer specifically wants a single combined weekly run; then harden with IGNORE_GITHUB_REF + the D2.2 backstop and document the + divergence. +5. **version.json floor bump.** Bump from 2.14 to mark the overhaul and exercise the publish path? **Default: yes (deliberate + maintainer-directed infra bump)**, reconciled in AGENTS.md so it does not contradict the "routine edits leave version.json untouched" + rule. +6. **Docker-readme fold (2.9).** Fold into a main-only `publish-release.yml` job (canonical strip) vs keep a reusable task file given the + 12-repo list? **Default: fold** - matches PlexCleaner's strip; the manifest-jq derivation moves inline. diff --git a/repo-config/configure.sh b/repo-config/configure.sh index d8ca484..2c4c22e 100755 --- a/repo-config/configure.sh +++ b/repo-config/configure.sh @@ -139,7 +139,7 @@ check_security() { # 404 when disabled; automated-security-fixes returns { "enabled": true/false }. 
assert "Dependabot vulnerability alerts enabled" gh_ok "repos/$REPO/vulnerability-alerts" assert "Dependabot automated security updates enabled" \ - jq_has '.enabled == true' < <(gh api "repos/$REPO/automated-security-fixes" 2>/dev/null) + jq_has '.enabled == true' < <(gh api "repos/$REPO/automated-security-fixes") } check_secrets() { From a6a653e89bd438d501a7a895685cbe12b324ee1a Mon Sep 17 00:00:00 2001 From: Pieter Viljoen Date: Mon, 29 Jun 2026 20:54:26 -0700 Subject: [PATCH 2/2] Remove accidentally-staged plans/ scratch folder plans/ is local migration-planning scratch (created for review during the CI/CD work); it was swept in by git add -A and should not be tracked - it also contains non-ASCII punctuation that violates the repo doc rules. Co-Authored-By: Claude Opus 4.8 (1M context) --- plans/00-GOTCHA-BRIEFING.md | 136 ----- plans/INDEX.md | 97 ---- plans/esphome-nonroot-migration-plan.md | 457 ----------------- .../homeassistant-purpleair-migration-plan.md | 423 ---------------- plans/nxwitness-migration-plan.md | 466 ------------------ 5 files changed, 1579 deletions(-) delete mode 100644 plans/00-GOTCHA-BRIEFING.md delete mode 100644 plans/INDEX.md delete mode 100644 plans/esphome-nonroot-migration-plan.md delete mode 100644 plans/homeassistant-purpleair-migration-plan.md delete mode 100644 plans/nxwitness-migration-plan.md diff --git a/plans/00-GOTCHA-BRIEFING.md b/plans/00-GOTCHA-BRIEFING.md deleted file mode 100644 index f1b1328..0000000 --- a/plans/00-GOTCHA-BRIEFING.md +++ /dev/null @@ -1,136 +0,0 @@ -# Branch-scoped CI/CD migration — shared planning briefing - -You are writing a **detailed, review-ready migration + convergence plan** for one ptr727 repo. Four -sibling repos (LanguageTags, Utilities, PlexCleaner, VSCode-Server-DotNetCore) are already migrated to -this model and live. 
Your plan must **front-load every gotcha below** so nothing is rediscovered after -merge (the maintainer lost a lot of time to post-merge surprises in the last round and explicitly wants -that avoided this round). - -Do **NOT** make code changes. Produce a plan file only. Be concrete: name files, jobs, triggers, and the -exact guards/SHAs. Where the repo's release model genuinely conflicts with the canonical model, FLAG the -tension explicitly with a recommendation rather than papering over it. - -## Canonical references to read first (ground every claim in these, not assumptions) - -- Docker model + terse comments + 5D audit: `/home/pieter/PlexCleaner/` — read `WORKFLOW.md`, - `.github/workflows/*.yml`, `repo-config/configure.sh`, `repo-config/README.md`, `AGENTS.md`, - `CODESTYLE.md`, `.github/copilot-instructions.md`. -- NuGet/.NET model: `/home/pieter/LanguageTags/` (same files). (Only relevant if your repo publishes - NuGet — none of the three remaining do, but the .NET tooling sections still apply to NxWitness.) -- The just-completed Docker-only plan (closest template for a no-NuGet Docker repo): - `/home/pieter/.claude/plans/we-are-investigate-enhancement-peppy-bentley.md`. -- Memory (durable gotchas, read all): `/home/pieter/.claude/projects/-home-pieter-LanguageTags/memory/`. - Especially `branch-scoped-cicd-review-gotchas.md`, `nbgv-publicrelease-githubref-leak.md`, - `docker-publishing-pattern.md`, `branch-scoped-migration-playbook.md`, `github-auto-delete-branch-gotcha.md`, - `copilot-review-flow.md`, `terse-comments.md`. - -## The model (one sentence): one run = one branch. - -- **CI (`test-pull-request.yml`)**: triggers on **push to every branch** (not `pull_request`); runs - validate + the project's smoke/test + a single aggregator job named exactly - `Check pull request workflow status job` (the ruleset's required check, matched by string). Self-testing: - pushing the branch IS the PR check. 
-- **Publisher (`publish-release.yml`)**: `workflow_dispatch` + `schedule` only. **NO `push` trigger for - Docker repos.** Schedule builds **main only** via `github` context. Dispatch builds `github.ref_name`, - guarded `if: github.ref_name == 'main' || github.ref_name == 'develop'`, passing `ref`/`branch` = - `github.ref_name`. **No branch matrix, no branch switching, no two-phase setup job, no PUBLISH_ON_MERGE.** - Merges never publish. -- **Promotion**: `develop -> main` PR, **merge-commit**, Copilot-reviewed, **no admin bypass**. - -## Gotchas to bake into the plan (each caused a real post-merge fix last round) - -1. **NBGV GITHUB_REF classification.** NBGV picks prerelease vs stable from `GITHUB_REF`, which is - **read-only** (cannot be overridden in-job — a past attempt was a silent no-op). The one-branch model is - the fix: `github.ref` already matches the built branch, so no `IGNORE_GITHUB_REF` hack is needed (that - override only ever made sense for the old branch-matrix publisher). `version.json` floor + - `publicReleaseRefSpec` `^refs/heads/main$`: main ships clean `X.Y.`, develop ships - `X.Y.-g` prerelease. Bump the version floor to exercise the publish path. -2. **NBGV threading.** Run NBGV **once** per leg in `get-version`/`build-release-task` and **thread the - computed `semver2` down** into the docker/build task. Do **not** add a nested `get-version` inside the - build task — a second NBGV run can reclassify and produce a `:SemVer2` tag collision or a wrong - stable/prerelease tag. -3. **Docker creds in BOTH secret stores.** Docker Hub (and App) credentials must exist in **both** the - Actions **and** Dependabot secret stores: a Dependabot-triggered run is given the Dependabot store, and - that run's push-CI does the Docker smoke/login. The `configure.sh` required-secret lists encode this. -4. 
**Branch-deletion guard.** Every `push`-triggered workflow job must guard `if: !github.event.deleted` - (and the head jobs) so deleting a branch does not fire a phantom run. -5. **merge-bot `--delete-branch`.** The merge-bot must merge bot PRs with - `gh pr merge --auto --delete-branch`. The repo-wide **auto-delete-on-merge setting stays OFF** (so a - `develop -> main` promotion does not delete `develop`); per-merge deletion is explicit instead. Without - `--delete-branch`, bot branches accumulate. -6. **5D audit (`repo-config/configure.sh`) — use the final hardened canonical form:** - - `jq_lacks`: `jq -e ... >/dev/null || rc=$?; case "$rc" in 0) return 1 ;; 1|4) return 0 ;; *) return "$rc" ;; esac` - — exit **4** (no output) is a "lacks" case (NOT just 1); keep **stderr** (only redirect stdout) so a real - jq error (2/3/5) shows its diagnostic. - - `check_secrets`: do **not** swallow gh stderr; an API/auth error **FAILs** the audit (cannot verify = - must fail), distinct from a genuinely missing secret. - - `ruleset_id`: let gh print its own error (no `2>/dev/null`), add a context line, return non-zero; select - the first match **inside jq** (`first // empty`), not `| head -1` (SIGPIPE under pipefail). - - `check_app`: best-effort **note**, never fails the audit (precise check needs app-level auth). - - The audit must **fail when it cannot verify**, never pass by default. -7. **Required-check name lockstep.** The ruleset's required status-check string, the aggregator job name - (`Check pull request workflow status job`), and the ruleset JSON must move in lockstep. The **first - `apply`** against the live repo is what lets a PR on the new workflows go green. Run `apply` in the same - change that ships the workflow edit, then `check`. -8. **Aggregator success/skipped allowlist (D7.4).** The aggregator gate must treat **success OR skipped** - as passing for conditionally-skipped jobs, else a legitimately-skipped job blocks the PR. -9. 
**EOL discipline.** CRLF for `.md`/`.yml`/`.json`/`.code-workspace`; LF for `.sh`/`Dockerfile`/`.py`. - `file` does NOT report CRLF for JSON — verify with `tr -cd '\r' | wc -c` or `grep -c $'\r'`. Pin these in - `.gitattributes`/`.editorconfig`. -10. **Copilot review loop.** Comments lag the run (wait + buffer). Threads must be **resolved** to merge. - Re-request review via the `requestReviews` GraphQL mutation with bot id `BOT_kgDOCnlnWA` — Copilot does - **not** auto-review every push, so re-request after each new head. `gh pr edit --body` fails on the - projects-classic GraphQL error → use `gh api -X PATCH repos/OWNER/REPO/pulls/N -F body=@file`. snupkg / - OIDC / NBGV-prerelease are recurring **false positives** — decline with rationale. Pure-prose/format - nits on a promotion PR: decline (would diverge main from develop). Every PR finds something new — expect - 1–3 rounds and budget for them. -11. **Dependabot + codegen dual-target main AND develop.** Deliberate — it solves the non-linear rebase / - merge-block conflicts that arose with single-target. Keep both targets unless you can prove a simpler - scheme merges develop->main without bypass; the maintainer has already rejected single-target. -12. **Strip template cruft.** Remove `build-datebadge-task.yml`, `publish-docker-readme-task.yml` (fold the - Docker Hub overview into the docker task via a `peter-evans/dockerhub-description` step on a **main** - publish), `dorny/paths-filter`, the `setup`/`PUBLISH_ON_MERGE` machinery, and any `merge-codegen` / - `merge-upstream-version` jobs the repo does not actually use. -13. **Action SHAs.** Use the converged newer pins (e.g. `actions/setup-dotnet` v5.4.0, `actions/checkout` - v7.0.0). **Verify any SHA->version claim against the GitHub API** before asserting it — Copilot has - hallucinated SHA/version mappings; do not trust them. 
**EXCEPTION — `dotnet/nbgv` is consumed via - `@master`, NEVER SHA-pinned.** The upstream tag stream lags `master` substantially, so a SHA pin draws - spurious Dependabot downgrade PRs; this is a deliberate documented exception (see ESPHome AGENTS.md - "Action pinning"). Do not convert `nbgv@master` to a SHA. Repos are currently split (PlexCleaner / - VSCode-Server / HA-PurpleAir SHA-pin it; ESPHome / NxWitness float `@master`) — converge toward `@master` - and never the other way. Likewise never edit a human-authored rule/rationale comment during a - terse-comment pass (the `@master` rationale comment must survive). -14. **Prose rules.** No em-dashes anywhere (hard rule). US English. Terse comments: one line if it fits - ~120 cols, structured, ASCII only; each workflow gets a top-of-file summary comment. Never edit - human-authored comments — only agent-authored ones, to the terse style. - -## Convergence requirement - -The shared sections must converge **byte-for-byte where possible** across all repos: the Comments -subsection, Git/Commit rules, "Where rules live", PR Review Etiquette, Documentation Style Conventions -(incl. the "write docs in the current state" rule), the `repo-config/` ruleset JSON and `configure.sh` -helper bodies, the copilot Review Runbook. Per-repo differences are limited to: project description, -secret names, target-specific D-guarantees, and the publish mechanics. Your plan must call out which -canonical sections port verbatim and which adapt. - -## What the plan file must contain (sections) - -1. **Current-state assessment** — exact triggers, jobs, SHAs, version scheme, branch hygiene, and how the - repo publishes today. Note partial-migration state and any messy branch backlog. -2. **Target architecture** — each workflow file: triggers, jobs, guards, threaded values, the publish - trigger, and the release artifact. Reconcile the repo's release cadence with the one-branch model and - FLAG conflicts. -3. 
**Release model decision** — the crux for each repo (see per-project note). State the recommended - trigger explicitly and why. -4. **Files to create / edit / delete** — full list. -5. **Convergence + backports** — canonical sections to port verbatim vs adapt; any drift to backport to the - four live repos. -6. **Gotcha checklist** — map each numbered gotcha above to where it applies in THIS repo (or "n/a, why"). -7. **Verification** — static (actionlint/markdownlint/cspell/parse/bash -n/EOL/em-dash sweep), config audit - (`configure.sh check` expected drift then "matches"), and live dispatch verification. -8. **Go-live sequence** — PR -> develop -> Copilot dance -> `apply` lockstep -> squash to develop -> - promote develop->main (no bypass) -> dispatch publisher -> verify artifacts -> confirm develop survives. -9. **Open questions for the maintainer** — anything genuinely ambiguous, with your recommended default. - -Write the plan to the path given in your task prompt. Make it thorough enough to execute from without -re-deriving the model. diff --git a/plans/INDEX.md b/plans/INDEX.md deleted file mode 100644 index 00976fe..0000000 --- a/plans/INDEX.md +++ /dev/null @@ -1,97 +0,0 @@ -# Next-round migration plans — review index - -Three execute-ready plans, one per remaining repo, plus a future PyPI project. Each follows the same -9-section structure and maps the shared gotcha checklist ([00-GOTCHA-BRIEFING.md](./00-GOTCHA-BRIEFING.md)) -into repo-specific findings, so the post-merge surprises that cost time last round are surfaced BEFORE any -code changes. 
- -- [esphome-nonroot-migration-plan.md](./esphome-nonroot-migration-plan.md) — Docker, upstream-pin check -- [homeassistant-purpleair-migration-plan.md](./homeassistant-purpleair-migration-plan.md) — Python/HACS zip -- [nxwitness-migration-plan.md](./nxwitness-migration-plan.md) — .NET codegen + multi-target Docker - -> This INDEX is the **authoritative aligned decision record** (updated 2026-06-29 after maintainer review). -> Where an individual plan's prose predates these decisions, this file wins; the plans' release-model -> sections have been patched to match. - -## Release-model playbooks (the framing) - -We are converging on **release-model playbooks**: a WORKFLOW set defined per *release model*, the same way -CODESTYLE defines rules per *language*. Same-model repos share workflow definitions; a fix to one backports -to its model-siblings. A model can have sub-models (e.g. Docker splits by whether an external update trigger -exists). Current map: - -| Release model | Publish trigger | Repos | -|---|---|---| -| NuGet push-publish (+ `Directory.Packages.props` as shipped input) | push-on-dep-change + dispatch | LanguageTags, Utilities | -| Native binary + multi-arch Docker | dispatch / on-demand | PlexCleaner | -| **Docker — vanilla** (no external trigger) | weekly schedule(main) + dispatch | VSCode-Server | -| **Docker — triggered** (daily external signal) | weekly schedule(main) + **push-on-pin/matrix-change(main)** + dispatch | ESPHome-NonRoot, NxWitness | -| HACS/Python — manual release + upstream-retest tripwire | dispatch only (schedule retests, never publishes) | HA-PurpleAir | -| PyPI (the "never done" one) | TBD | *future repo — which one?* | - -**Capstone deliverable (after all migrations):** per-workflow flow diagrams in text notation (Mermaid) + -rendered PNG, showing entry points, triggers, outputs, decisions, and branching — making common-vs-unique -obvious across models and sub-models. 
- -## Maintainer clarifications (incorporated) - -- **`upstream-version.json` / `Matrix.json` are maintained by the repo's own daily scheduled job** (the - upstream-version check for ESPHome; codegen for NxWitness), which checks upstream and records the - last-released versions. NOT dependabot-updated. This is the repo's self-owned upstream-state pin. -- **The daily detection signal is 100% certain** an update is required; vanilla Docker has no such signal so - it can only assume a weekly apt/base change. Hence: triggered repos publish on the signal AND weekly; - vanilla publishes weekly only. -- **Dependabot (and codegen) dual-target main AND develop on every repo** — purely to avoid merge drift. - develop's pin/matrix/dep updates are sync-only and never publish. -- **HA monitoring is a breakage tripwire**: the HA-version monitor bumps the test matrix so a breaking - upstream change FAILS the PR build; a human fixes it and releases manually. That is why HA publish is - dispatch-only. - -## Release-trigger decision per repo (signed off) - -- **ESPHome-NonRoot:** weekly `schedule` on main (baseline apt/base CVE refresh) **+ path-scoped `push` on - main when `upstream-version.json` changes** (the daily upstream-check commits a real update -> publish - now) **+ `workflow_dispatch`**. The daily upstream-check workflow stays as the detection mechanism. Cheap - single image, so publish-on-trigger is clearly worth it. Drops PUBLISH_ON_MERGE; ordinary code merges do - not touch the pin so they do not publish. -- **NxWitness:** weekly `schedule` on main (builds the full product matrix, baseline refresh + release) - **+ path-scoped `push` on main when `Matrix.json` changes** (codegen commits a new matrix -> publish now) - **+ `workflow_dispatch`**. **Supersedes the earlier weekly-only decision** — publish on matrix change is - accepted despite the full-matrix build cost. 
Schedule is **main-only** (the earlier "paired develop - dispatch to refresh :develop" is dropped; `:develop` builds on manual dispatch only). Codegen runs daily, - dual-target main+develop. -- **HomeAssistant-PurpleAir:** **dispatch-only** publish; `schedule` retests main only and **never - publishes**. The HA-version monitor updates the test matrix to trip on breaking changes (fail the PR -> - human fix -> manual release). Drops the current `push:[develop]` auto-prerelease that violates the "merges - never publish" invariant the docs already claim. - -## Cross-repo recurring findings (same root causes as last round — now pre-empted) - -1. **Nested NBGV in `build-docker-task.yml`** (ESPHome, NxWitness) — second NBGV run risks `:SemVer2` tag - collision / misclassification. Fix: thread one `semver2` down, delete the nested `get-version`. -2. **merge-bot missing `--delete-branch`** (all three) — bot branches accumulate; auto-delete-setting stays - OFF to protect develop on promotion. -3. **No `repo-config/` 5D audit** (ESPHome, NxWitness; HA also lacks it) — created from the canonical - hardened `configure.sh` + ruleset JSON. -4. **Required-check name mismatch** (NxWitness aggregator is `...status` missing trailing ` job`; HA - similar) — must move in lockstep with the first `apply` or PRs never go green. -5. **CI still on `pull_request` + `dorny/paths-filter`** (NxWitness) — move to `push: ['**']`, drop the - filter, add the `!github.event.deleted` guard. (Note: the publisher's path-scoped `push` is separate and - intentional — it lives in `publish-release.yml`, not CI.) -6. **Messy branch backlog** (all three; HA worst at 40+) — prune superseded branches after verifying they - are merged/abandoned. -7. **Version-floor regressions to set first** — HA main is 0.3.0 but NBGV base would ship 0.1.x (bump base - to >=0.3 first); ESPHome/NxWitness bump the floor to exercise the publish path. - -## Suggested execution order (simplest -> hardest) - -1. 
**ESPHome-NonRoot** — closest to the converged Docker reference; smallest delta; first to prove the - triggered-Docker sub-model (weekly + push-on-pin). -2. **HomeAssistant-PurpleAir** — develop already ~90% converged; mostly main-catch-up + Python validate - mapping + branch cleanup; the version regression needs deciding first. -3. **NxWitness** — most complex (codegen + multi-target matrix + push-on-matrix publish); do last with the - pattern fresh. -4. **PyPI project** (future) — identify the repo and plan once the above land. - -All SHA->version claims in the plans were verified against the GitHub API (no hallucinated pins). No code -was changed — these are plans only. diff --git a/plans/esphome-nonroot-migration-plan.md b/plans/esphome-nonroot-migration-plan.md deleted file mode 100644 index 31c799c..0000000 --- a/plans/esphome-nonroot-migration-plan.md +++ /dev/null @@ -1,457 +0,0 @@ -# Migrate ESPHome-NonRoot to branch-scoped CI/CD + converge - -Target: `ptr727/ESPHome-NonRoot` (`/home/pieter/ESPHome-NonRoot`). Docker-only image: layers a non-root -ESPHome + `esphome-device-builder` dashboard onto `python:3.14-slim`, ships one multi-arch image to -`docker.io/ptr727/esphome-nonroot`. No buildable app, no NuGet, no tests. Unique trait vs the other Docker -repos: it **tracks an upstream PyPI release** (`esphome` + `device_builder`) via a daily tracker that writes -`upstream-version.json` and opens auto-merged bump PRs. NBGV (`version.json`, floor `1.7`) still drives the -GitHub-release tag and `LABEL_VERSION`. - -Canonical references read: PlexCleaner (`/home/pieter/PlexCleaner`) WORKFLOW.md + workflows + repo-config -(the converged Docker model, **one branch per run, no matrix**), the prior Docker-only plan -(`we-are-investigate-enhancement-peppy-bentley.md`, VSCode-Server), and memory (nbgv-githubref-leak, -docker-publishing-pattern, branch-scoped-cicd-review-gotchas, migration-playbook, auto-delete gotcha). 
-**Do not change code from this plan; this is a plan only.** All converged action SHAs were verified MATCH -against the GitHub API (setup-dotnet v5.4.0, checkout v7.0.0, nbgv v0.5.2, build-push v7.2.0). - ---- - -## 1. Current-state assessment - -ESPHome-NonRoot is on the **older ProjectTemplate two-phase model** - further behind than PlexCleaner was, -and it carries the upstream-tracker layer the pure-Dockerfile VSCode-Server repo did not. - -**Workflows present (`.github/workflows/`):** - -| file | trigger | shape | -|---|---|---| -| `publish-release.yml` | `push: [main, develop]` + `workflow_dispatch` + `schedule` (Mon 02:00) | two-phase: `setup` plan job reading `vars.PUBLISH_ON_MERGE`, then `[main,develop]` **matrix** `publish`, plus `date-badge`, `docker-readme`, `cleanup-artifacts` jobs | -| `test-pull-request.yml` | `pull_request: [main, develop]` + dispatch | `dorny/paths-filter` `changes` -> `smoke-build` -> aggregator `Check pull request workflow status` (**no ` job` suffix**) -> `cleanup-artifacts` | -| `build-release-task.yml` | `workflow_call` | `get-version` -> `build-docker` (gated `enable_docker`) -> `github-release`; **nested `get-version` re-run inside `build-docker-task`** | -| `build-docker-task.yml` | `workflow_call` | **has its own nested `get-version` job**; reads `upstream-version.json`; login gated on `push` (not on smoke); tags `latest`/`develop` + pinned esphome version | -| `get-version-task.yml` | `workflow_call` | NBGV; `setup-dotnet` **v5.3.0** (old SHA); **`nbgv@master`** (floated, not pinned) | -| `check-upstream-version.yml` | `schedule` (daily 05:00) + dispatch | entry-point; resolver curls PyPI for `esphome` + `esphome-device-builder` | -| `check-upstream-version-task.yml` | `workflow_call` | generic tracker: matrix over `[main, develop]`, App-signed `create-pull-request`, writes `upstream-version.json` (CRLF) | -| `merge-bot-pull-request.yml` | `pull_request_target` | `merge-dependabot` + **`merge-upstream-version`** + 
`disable-auto-merge`; **merges WITHOUT `--delete-branch`** | -| `build-datebadge-task.yml` | `workflow_call` | BYOB date badge (template cruft) | -| `publish-docker-readme-task.yml` | `workflow_call` | full generic docker-readme task (template cruft - fold into docker task) | - -**Drift / debt vs canonical:** - -- **Two-phase machinery present:** `setup` plan job, `vars.PUBLISH_ON_MERGE`, `push` publish trigger, the - `[main,develop]` **branch matrix** in `publish` (the cross-branch NBGV-ref leak class, per - `nbgv-publicrelease-githubref-leak`). Canonical is now **one branch per run, no matrix, no setup job**. -- **`pull_request` CI (not `push`):** PRs from forks satisfy the check, but a workflow-edit PR does not test - its own copy, and there is no branch-deletion guard (none needed under `pull_request`, but needed after - the switch to `push`). -- **`dorny/paths-filter`** gates the smoke build (strip per gotcha 12; always smoke, buildcache keeps fast). -- **Nested NBGV:** both `build-release-task` and `build-docker-task` run `get-version` - a double NBGV run, - the exact gotcha-2 collision risk. Must thread `SemVer2` instead. -- **Aggregator name `Check pull request workflow status`** lacks the canonical ` job` suffix - the - required-check string must move to `Check pull request workflow status job` in lockstep with the ruleset. -- **No `repo-config/`** at all (no ruleset JSON, no `configure.sh`, no `settings.json`). The live ruleset - (if any) is hand-managed. This is the biggest missing piece. -- **No `WORKFLOW.md`**, **no `cspell.json`** (words live in the `.code-workspace`, ~60+ entries). -- **Old/floated SHAs:** `setup-dotnet` v5.3.0 (-> v5.4.0 `26b0ec1...`), `nbgv@master` (-> v0.5.2 - `705dad1...`). All Docker action SHAs already match canonical (qemu v4.1.0, buildx v4.1.0, login v4.2.0, - build-push v7.2.0) - verified. -- **`docker-readme` is a separate task** + a `date-badge` task - both template cruft to strip. 
-- **CODESTYLE.md carries large `.NET` and `Python` sections** that are inert (no .cs, no buildable .py - the - Python lives only inside the Dockerfile's uv venv). Per the repo's own "carry whole, don't trim" rule it - kept them; the converged Docker repos drop them. **Flag for maintainer** (see open questions). - -**Versioning:** NBGV, `version.json` floor `1.7`, `publicReleaseRefSpec ^refs/heads/main$`. Already correct -shape; floor bump to `1.8` to exercise the publish path (HISTORY tops out at 1.7). - -**Branch hygiene / backlog (messy - clean before go-live):** `main` and `develop` have **diverged -substantially** - `git diff --stat origin/main origin/develop` = 12 files / ~1034 insertions (develop is -well ahead: `.editorconfig` +176, `CODESTYLE.md` +463 new, `Docker/Dockerfile` rewritten ~334 lines, -`merge-bot`, `publish-release`, `publish-docker-readme` all changed). Stale remote branches to triage/delete: -`chore/sync-template`, `feature/sync-versioned-rulesets`, `fix-devcontainer-venv-path`, -`fix/python-314-doc-refs`, `reconverge-upstream-tracker`, `resync-copilot-runbook-178`, -`resync-template-pr167`, `resync/projecttemplate-pr184`, `resync/projecttemplate-pr190`, `shields`, -`support-device-builder`, plus live tracker heads `upstream-version-main` / `upstream-version-develop` and -4 dependabot heads. **The migration must land on `develop` (the ahead branch), then promote develop->main** -- a promotion here also resolves the large develop/main content drift, so expect a substantive promotion PR. - -**How a new upstream version becomes a published image today (the crux, traced):** - -1. `check-upstream-version.yml` runs daily (05:00 UTC) + on dispatch; resolver curls PyPI for the latest - `esphome` and `esphome-device-builder`, prints `{esphome, device_builder}`. -2. `check-upstream-version-task.yml` (matrix `[main, develop]`) rewrites `upstream-version.json` and opens an - App-signed `upstream-version-` bump PR per branch. -3. 
`merge-bot-pull-request.yml`'s `merge-upstream-version` auto-merges each (squash to develop, merge to main). -4. On main, the merge is a `push` -> **today** `publish-release.yml` publishes **only if `PUBLISH_ON_MERGE` - is `true`** (the maintainer's "releases on a dependabot/bump PR"); otherwise it waits for the weekly - schedule. `build-docker-task` reads `upstream-version.json` for the image tag + build-args. - -So `upstream-version.json` is **a committed build input**, read at build time - directly analogous to -`Directory.Packages.props` for the NuGet repos. The bump PR keeps it current; *something* then has to build. - ---- - -## 2. Target architecture - -Port PlexCleaner's converged Docker model **verbatim in shape**, dropping the executable target, and keep -ESPHome's two repo-specific leaves: `build-docker-task` (reads `upstream-version.json`, 3 build-args) and the -upstream-version tracker. **One branch per run, no matrix, no setup job, no PUBLISH_ON_MERGE.** - -**`publish-release.yml`** (rewrite to the triggered-Docker one-branch model): -- Triggers: `workflow_dispatch` + `schedule` (`0 2 * * MON`, weekly baseline) **+ path-scoped `push` on - main when the upstream pin changes** (`push: { branches: [main], paths: [upstream-version.json] }`). The - daily upstream-check commits a real update to the pin -> this push publishes immediately; the weekly - schedule covers base/apt rot when nothing upstream changed. Ordinary code merges do not touch the pin, so - "merges never publish" still holds (a Dockerfile/README change ships on the next weekly run or a manual - dispatch - see open questions for whether to widen the path set). Global concurrency group - `${{ github.workflow }}`, `cancel-in-progress: false`. -- Single `publish` job, `if: github.ref_name == 'main' || github.ref_name == 'develop'`, calls - `build-release-task.yml` with `ref: github.ref_name`, `branch: github.ref_name`, `smoke: false`, - `github: true`, `dockerhub: true`. 
**Delete** `setup`, `date-badge`, `docker-readme`, and the run-level - `cleanup-artifacts` jobs (the Docker push uploads no artifact; nothing to clean - matches PlexCleaner which - has none). Schedule runs main only (`github.ref` = default branch); dispatch publishes its own ref. - -**`build-release-task.yml`** (rewrite to thread NBGV, drop nested get-version): -- `validate` (gated `!smoke`, calls `validate-task`) + `get-version` (single NBGV) -> `build-docker` - (gated `!cancelled() && get-version success && (validate success || skipped)`) -> `github-release`. -- `build-docker` is passed `ref: GitCommitId`, `branch`, `smoke`, `push: dockerhub && !smoke`, and the - threaded `semver2` (+ the assembly versions, even though only `semver2` is consumed - keep the converged - input set for byte-convergence). **Remove `enable_docker`** (always build the one target). -- `github-release`: unchanged canonical shape - main-only prerelease backstop, download - `release-asset--*` (matches nothing here, succeeds), no-op-if-tag-exists guard, dispatch refreshes, - `target_commitish: GitCommitId`, `prerelease: branch != 'main'`, files `LICENSE` + `README.md`. **Keep - `fail_on_unmatched_files` OFF** (or omit) - the Docker target ships no asset and the glob legitimately - matches zero; PlexCleaner sets it true *because* its executable target must upload the 7z. Here it would - red a clean Docker-only release. (This is a real per-repo divergence from PlexCleaner - flag in WORKFLOW.md.) - -**`build-docker-task.yml`** (trim + thread, keep the upstream-version read): -- Inputs: `push`, `ref`, `branch` (required), `smoke`, **`semver2` (required, threaded)**. **Delete the - nested `get-version` job** (gotcha 2) - consume `inputs.semver2`. -- Keep the `Get pinned versions step` reading `.esphome` / `.device_builder` from `upstream-version.json`. -- **Login on every build (incl. smoke)** like canonical (higher pull/cache rate limits; forks can't push) - - change from today's `if: push`. 
This is what makes the Dependabot-store creds gotcha (3) load-bearing. -- Tags: `docker.io/ptr727/esphome-nonroot:${branch=='main' ? 'latest':'develop'}` + (main only) the pinned - `:` tag. **Add a `:` tag** to match canonical (every image carries its release - version) - decide with maintainer whether to keep BOTH the esphome-version tag and the SemVer2 tag (see - open questions). Branch-scoped buildcache (read both, write this branch on push), `mode=max`, - `ignore-error=true`. Build-args `LABEL_VERSION=semver2`, `ESPHOME_VERSION`, `DEVICE_BUILDER_VERSION`. -- **Fold the Docker Hub overview in:** add a `peter-evans/dockerhub-description@v5.0.0` step gated - `if: inputs.push && inputs.branch == 'main'`, `repository: ptr727/esphome-nonroot`, - `readme-filepath: ./Docker/README.md` (already exists, 28 lines). Then **delete** - `publish-docker-readme-task.yml`. - -**`get-version-task.yml`**: bump `setup-dotnet` to v5.4.0 (`26b0ec14...`); **keep `nbgv@master`** (documented no-SHA-pin exception - do NOT pin it). -Otherwise canonical. - -**`validate-task.yml`** (NEW - lint-only, no unit-test): markdownlint (`**/*.md`) + cspell (README.md + -HISTORY.md) + actionlint, on `setup`-free runners. **No `unit-test`, no CSharpier/`dotnet format`** (no C#); -**no Python lint** (the Python is inside the Dockerfile only, exercised by the smoke build). Reused by CI and -each publish leg so the gates are identical. (PlexCleaner's `validate-task` has unit-test + C# lint; this is -the documented per-repo adapt.) - -**`test-pull-request.yml`** (rewrite to push-CI canonical): -- `on: push: branches: ['**']` (not tags) + `workflow_dispatch`. 
**Drop `pull_request` and - `dorny/paths-filter`.** -- `validate` (`if: !github.event.deleted`) + `smoke-build` (`build-release-task` with `smoke:true`, - `github:false`, `dockerhub:false`, `branch: github.ref_name`, `if: !github.event.deleted`) + - aggregator **`Check pull request workflow status job`** (`if: always() && !github.event.deleted`, fails - unless every need is `success`). **Drop the terminal `cleanup-artifacts`** (smoke uploads nothing). -- Document the fork-PR exception (a fork PR produces no push -> no required check; a maintainer lands it on - an in-repo branch first), same wording as PlexCleaner. - -**`check-upstream-version.yml` + `check-upstream-version-task.yml`** (KEEP, light touch): -- These already match the canonical multi-key tracker shape (App-signed CRLF-writing PR, `[main,develop]` - matrix, resolver inputs). Keep both. Only verify: action SHAs (`create-github-app-token` v3.2.0, - `checkout` v7.0.0, `create-pull-request` v8.1.1 - already current), terse-comment conformance, and that - the head-ref prefix `upstream-version` still matches the merge-bot's `merge-upstream-version` job refs. -- **The tracker's daily cadence is the staleness floor** for `upstream-version.json` currency; the *publish* - cadence (section 3) is what turns a current pin into a pushed image. - -**`merge-bot-pull-request.yml`** (converge, add `--delete-branch`): -- Keep `merge-dependabot`, `merge-upstream-version`, `disable-auto-merge-on-maintainer-push`. -- **Add `--delete-branch`** to both auto-merge calls (gotcha 5) - currently missing; bot/upstream branches - accumulate without it. Repo-wide auto-delete stays OFF (settings.json) so develop survives promotion. -- The `disable-auto-merge` job already lists both bot logins (`dependabot[bot]`, `ptr727-codegen[bot]`) - - keep (PlexCleaner only has dependabot, since it has no codegen/tracker bot; this is a justified per-repo - superset, not drift to "fix"). -- Converge comments to terse canonical. 
- -**Release artifact:** a GitHub release **as version anchor** (tag on the built commit + `LICENSE` + -`README.md`, generated notes, no binary asset) plus the multi-arch Docker image on Docker Hub -(`latest`/`develop` + version tags) + the Docker Hub overview on a main publish. - -### RESOLVED: release cadence vs the one-branch model (signed off 2026-06-29) - -ESPHome is a **triggered Docker** repo: the daily upstream-check gives a 100%-certain "update required" -signal. Agreed model: **publish on the update trigger AND weekly** - a path-scoped `push` on main when -`upstream-version.json` changes publishes the new upstream immediately, and the weekly schedule refreshes the -base/apt layer when nothing upstream changed. This is NOT a merge-publish of arbitrary code (only the pin -file change publishes), so the "merges never publish" invariant and the one-branch NBGV correctness both -hold. See section 3 for the full rationale. - ---- - -## 3. Release model decision (signed off 2026-06-29) - -**Decision: triggered-Docker model - publish on `weekly schedule (main)` + `push-on-upstream-version.json-change (main)` + `workflow_dispatch`.** The daily upstream-check stays as the -detection mechanism (it owns the pin); a real upstream bump it commits to main publishes immediately via the -path-scoped push, and the weekly schedule refreshes the base/apt layer when nothing upstream changed. - -**Why this shape:** - -- **Best of both, no merge-publish.** ESPHome's daily check is a 100%-certain "update required" signal, so - unlike vanilla Docker (VSCode-Server, which can only assume weekly apt rot) we publish the instant the pin - changes - ~immediate upstream response - and still publish weekly for base CVEs. Only the pin file change - publishes; arbitrary code merges do not (path filter), so the "merges never publish" invariant holds. 
-- **It is the `Directory.Packages.props` pattern.** The NuGet repos already publish on a push that touches - their shipped dependency input; `upstream-version.json` is the exact Docker analog. This is a converged - pattern, not a new one - and it is shared with NxWitness (push-on-`Matrix.json`-change), defining the - **triggered-Docker sub-model**. -- **One-branch correctness preserved.** Every publish run is single-branch (`github.ref` == built branch), - so NBGV classifies natively with no matrix, no `IGNORE_GITHUB_REF`, no cross-branch leak - (`nbgv-publicrelease-githubref-leak`); `develop->main` still promotes via a normal Copilot-reviewed PR with - no admin bypass. The push trigger is branch-filtered to `main`, so develop's daily pin update is sync-only - (drift-avoidance) and never publishes. - -**Staleness window:** new-upstream exposure is ~the daily-check interval (publish fires on the pin commit), -not a week; the weekly schedule only bounds *base-image* rot to <=7 days. Manual `gh workflow run -publish-release.yml` remains a zero-wait escape hatch. - -**Open sub-decision (see section 9):** whether the publish `paths` filter should also include `Docker/**` -(so a Dockerfile change to main publishes too) or stay pin-only (Dockerfile/code changes wait for the weekly -run or a dispatch). Default recommended: **pin-only**, to keep "merges never publish" literal. - ---- - -## 4. Files to create / edit / delete - -**Create (8):** -- `WORKFLOW.md` - port PlexCleaner's, Docker-only + upstream-tracker variant (see section 5). -- `.github/workflows/validate-task.yml` - lint-only (markdownlint + cspell + actionlint). -- `repo-config/ruleset-develop.json` - byte-identical to canonical. -- `repo-config/ruleset-main.json` - byte-identical to canonical. -- `repo-config/settings.json` - byte-identical to canonical. 
-- `repo-config/configure.sh` - canonical helper bodies; secrets `DOCKER_HUB_USERNAME`, - `DOCKER_HUB_ACCESS_TOKEN`, `CODEGEN_APP_CLIENT_ID`, `CODEGEN_APP_PRIVATE_KEY` in **both** stores; Docker - Hub repo string `ptr727/esphome-nonroot`. -- `repo-config/README.md` - canonical, adapted repo name / Docker Hub repo. -- `cspell.json` - migrate the `.code-workspace` `cSpell.words` (~60+ entries) + any README/HISTORY words. - -**Edit (10):** -- `.github/workflows/publish-release.yml` - rewrite to one-branch publisher (drop setup/matrix/push/ - PUBLISH_ON_MERGE/date-badge/docker-readme/cleanup jobs). -- `.github/workflows/build-release-task.yml` - add `validate` + thread NBGV; drop `enable_docker`; keep - github-release (no `fail_on_unmatched_files`). -- `.github/workflows/build-docker-task.yml` - drop nested `get-version`; consume `semver2`; login always; - add `:SemVer2` tag; fold in dockerhub-description step (main publish). -- `.github/workflows/get-version-task.yml` - setup-dotnet v5.4.0; **keep `nbgv@master`** (do NOT SHA-pin; documented exception). -- `.github/workflows/test-pull-request.yml` - push-CI, drop pull_request + dorny + cleanup; deletion guards; - aggregator rename to `...status job`. -- `.github/workflows/merge-bot-pull-request.yml` - add `--delete-branch`; terse comments. -- `.github/workflows/check-upstream-version.yml` / `-task.yml` - terse-comment + SHA conformance only - (no behavior change). -- `.github/dependabot.yml` - keep dual-target github-actions + docker (already correct); de-template - comments only. -- `AGENTS.md` - rewrite Release Model (one-branch publisher + upstream-tracker), strip two-phase / - PUBLISH_ON_MERGE / date-badge / docker-readme / paths-filter framing, point to `WORKFLOW.md`; add - "Shared Configuration and Tooling" + "Write docs in the current state" + repo-config pointer; converge the - shared sections (Comments, Git, Where rules live, PR Review Etiquette, Doc Style) byte-for-byte. 
-- `CODESTYLE.md` - **see open question** (drop inert .NET + Python sections to converge, OR keep per the - repo's "carry whole" rule). `.github/copilot-instructions.md` - confirm byte-identical Runbook + repo - placeholders (likely already current; edit only if drift). -- `version.json` - floor `1.7` -> `1.8`. `HISTORY.md` + `README.md` - add the 1.8 "CI/CD rework" entry. -- `ESPHome-NonRoot.code-workspace` / `.editorconfig` / `.gitattributes` - remove `cSpell.words` (moved to - cspell.json), de-template comments, keep LF pins for `.sh`/`Dockerfile`/entrypoint scripts. - -**Delete (2):** -- `.github/workflows/build-datebadge-task.yml` -- `.github/workflows/publish-docker-readme-task.yml` - -(Net: ~8 create, ~14 edit, 2 delete. `Docker/Dockerfile`, `Docker/Compose.yml`, `Docker/entrypoint/*`, -`Docker/README.md`, `.devcontainer/*`, `LICENSE`, `.markdownlint-cli2.jsonc`, `.dockerignore`, `.gitignore`, -`.vscode/*` unchanged.) - ---- - -## 5. Convergence + backports - -**Port verbatim (byte-for-byte) from PlexCleaner:** -- `repo-config/ruleset-develop.json`, `ruleset-main.json`, `settings.json` (only the integration_id 15368 - and check string are shared constants - identical). -- `repo-config/configure.sh` helper bodies: `jq_lacks` (exit-4 + stderr handling), `check_secrets` - (API-error-FAILs), `ruleset_id` (`first // empty`, no `2>/dev/null`), `check_app` (note-only), - `check_ruleset` / `check_settings` / `check_security`. Only `REQUIRED_*_SECRETS`, the Docker Hub repo - string, and `REPO` differ. -- AGENTS shared sections: **Comments** subsection, **Git and Commit Rules**, **Where rules live / - Shared Configuration and Tooling**, **PR Review Etiquette** (Merge Gate, Expected Review Loop, Triaging, - Responding, Escalating), **Documentation Style Conventions** incl. the **"Write docs in the current - state"** rule, **Workflow YAML Conventions** pointer. -- `.github/copilot-instructions.md` GitHub Copilot Review Runbook (placeholders `//` only). 
-- WORKFLOW.md section skeleton (0 model + glossary, 1-2 style, 3 architecture, 4 D0-D10, 5 methodology incl. - 5D audit, 6 config). - -**Adapt (per-repo):** -- WORKFLOW.md D4 (release/publish): **single Docker target + GitHub release as version anchor** (no - executable/7z seam); add a **D-guarantee for the upstream-version tracker** (daily PyPI resolve -> - App-signed bump PR -> auto-merge -> consumed by the next scheduled/dispatch publish) - the one genuine - ESPHome-specific contract the siblings lack. Note the `fail_on_unmatched_files: false` divergence and why. -- `validate-task` is lint-only (no unit-test / C# / Python lint) - the documented adapt. -- WORKFLOW.md "Self-sufficiency": ESPHome **has** an upstream-version tracker (PlexCleaner's says "no - codegen and no upstream-version tracker") - invert that sentence. -- `merge-bot` carries `merge-upstream-version` + a second bot login (justified superset). - -**Backports to the four live repos (drift found):** -- None *new* surfaced beyond what the VSCode-Server plan already lists (the "Write docs in current state" - rule missing from Utilities + LanguageTags; terse-comment alignment of `validate-task` / `merge-bot` in - Utilities + LanguageTags). If those backports already landed with VSCode-Server, this migration introduces - no new canonical drift - it **consumes** the converged form. Confirm the canonical `configure.sh` / - ruleset JSON match the now-live PlexCleaner copies before porting (they are the source of truth). - ---- - -## 6. Gotcha checklist (mapped to this repo) - -1. **NBGV GITHUB_REF classification** - APPLIES. One-branch publisher fixes it natively: `github.ref` == - built branch, no `IGNORE_GITHUB_REF`. Floor `1.7`->`1.8` exercises the publish path. The dropped matrix - removes the leak class entirely. -2. **NBGV threading (run once)** - **DIRECTLY APPLIES, current bug.** `build-docker-task` has its OWN nested - `get-version`, *and* `build-release-task` has one - a double run. 
Delete the nested job; thread `semver2` - from `build-release-task`'s single `get-version` into `build-docker-task` as an input. The pinned-version - tag would otherwise be fine, but the `:SemVer2` tag could collide/misclassify. -3. **Docker creds in BOTH secret stores** - APPLIES (Dependabot auto-merges docker + actions bumps; their - push-CI smoke build now logs in to Docker Hub because login moves to *always*). `configure.sh` requires - `DOCKER_HUB_USERNAME` / `DOCKER_HUB_ACCESS_TOKEN` in both Actions and Dependabot stores; App creds too - (the tracker is App-signed). **Verify the Dependabot store has them before go-live** or bot auto-merge - stalls on a red smoke check. -4. **Branch-deletion guard** - APPLIES once CI moves to `push: ['**']`. Add `if: !github.event.deleted` to - every `test-pull-request` job and `always() && !github.event.deleted` to the aggregator. (Not needed - today under `pull_request`.) -5. **merge-bot `--delete-branch`** - **APPLIES, current gap.** Today's merge-bot merges WITHOUT - `--delete-branch`, so `upstream-version-*` and dependabot heads accumulate (visible in the backlog). Add - it to both merge calls; keep repo-wide auto-delete OFF in settings.json (so a develop->main promotion does - not delete develop - `github-auto-delete-branch-gotcha`). -6. **5D audit hardened form** - APPLIES (creating `configure.sh` fresh). Use the final canonical bodies - verbatim: `jq_lacks` exit-4-is-lacks + stderr kept; `check_secrets` API-error-FAILs; `ruleset_id` - `first // empty` no-stderr-suppress; `check_app` note-only; audit FAILs when it cannot verify. -7. **Required-check name lockstep** - APPLIES. Rename aggregator to `Check pull request workflow status job` - in `test-pull-request.yml`, set the same string in both ruleset JSONs and in `configure.sh`'s - `REQUIRED_CHECK`. First `apply` against the live repo (in the same change shipping the workflow) is what - lets the migration PR's required check resolve; then `check`. -8. 
**Aggregator success/skipped allowlist (D7.4)** - APPLIES. `validate` always runs; `smoke-build` always - runs (no paths-filter now) so it won't skip - but use the canonical `success`-required loop (and the - `build-release-task` build gate uses `(success || skipped)` with `!cancelled()` for the `validate`-skip - on smoke). Don't use `!= 'failure'` (lets cancelled through). -9. **EOL discipline** - APPLIES. CRLF for `.md`/`.yml`/`.json`/`.code-workspace`; LF for `.sh`/`Dockerfile`/ - `entrypoint/*` (extensionless - `.gitattributes` already pins `*.sh` + `Dockerfile`; **add an explicit - `Docker/entrypoint/** text eol=lf`** if the entrypoint scripts are extensionless, or confirm they end - `.sh`). `upstream-version.json` must stay CRLF - the tracker writes it CRLF via `sed 's/$/\r/'`; verify - with `tr -cd '\r' | wc -c`, not `file`. -10. **Copilot review loop** - APPLIES at go-live. Comments lag; resolve threads to merge; re-request via the - `requestReviews` mutation (bot id `BOT_kgDOCnlnWA`); `gh api -X PATCH .../pulls/N -F body=@file` for body - edits. OIDC/NBGV-prerelease/snupkg are false positives (no NuGet/OIDC here, so fewer). Budget 1-3 rounds. -11. **Dependabot dual-target main AND develop** - APPLIES. `dependabot.yml` already dual-targets both - `github-actions` and `docker` - keep. (The *upstream-version tracker* also dual-targets via its - `[main,develop]` matrix - same rationale: avoid non-linear rebase/merge-block conflicts.) -12. **Strip template cruft** - APPLIES. Delete `build-datebadge-task.yml` + `publish-docker-readme-task.yml` - (fold overview into the docker task via `peter-evans/dockerhub-description` on a main publish); remove - `dorny/paths-filter` (test-pull-request); remove `setup` / `PUBLISH_ON_MERGE` (publish-release). **Keep** - `merge-upstream-version` and the `check-upstream-version*` tracker - this repo *does* use them (unlike the - pure-Docker siblings); that is the documented exception to "drop unused merge-bot jobs." -13. 
**Action SHAs** - APPLIES. setup-dotnet -> v5.4.0 (`26b0ec14cb23fa6904739307f278c14f94c95bf1`); Docker - actions already match canonical. **`nbgv` STAYS `@master` - do NOT SHA-pin it** (documented exception, - ESPHome AGENTS.md "Action pinning": the tag stream lags master so a pin draws Dependabot downgrade PRs; - the inline `@master` rationale comment is human-authored and must be preserved). Verify SHA->version - claims against the GitHub API (do not trust Copilot's SHA claims). -14. **Prose rules** - APPLIES. No em-dashes; US English; terse comments (one line <=~120, top-of-file - summary per workflow); never edit human-authored comments. Sweep the new/edited files. - ---- - -## 7. Verification - -**Static (local, before push):** -- `actionlint` (Docker: `docker run --rm -v "$PWD":/repo --workdir /repo rhysd/actionlint:latest -color`) - over all `.github/workflows/*.yml`. -- `markdownlint-cli2` (`docker run --rm -v "$PWD":/workdir davidanson/markdownlint-cli2:latest "**/*.md"`). -- `cspell` over README.md + HISTORY.md (CI scope) using the new `cspell.json`. -- YAML + JSON parse (`python -c 'import yaml,sys,glob; [yaml.safe_load(open(f)) for f in glob.glob(...)]'`; - `jq . repo-config/*.json upstream-version.json version.json`). -- `bash -n repo-config/configure.sh` and `bash -n` over inline resolver/run scripts where extractable. -- **EOL audit:** CRLF for `.md`/`.yml`/`.json`/`.code-workspace` (`grep -c $'\r'` or `tr -cd '\r' | wc -c`, - NOT `file`); LF for `.sh`/`Dockerfile`/`Docker/entrypoint/*`. Re-check any `.md`/`.json` that Edit/Write - touched (they can flip CRLF->LF). -- **Token sweep:** em-dash (`grep -rn $'—'`), plus - `PUBLISH_ON_MERGE|two-phase|dorny|datebadge|build-datebadge|publish-docker-readme|setup job|ProjectTemplate` - to confirm the cruft is gone (ProjectTemplate may legitimately remain in the Template-Lineage section if - kept - decide with maintainer). 
- -**Config audit (`repo-config/configure.sh`):** -- Before `apply`: `check` shows expected drift - the required-check rename (old live check - `Check pull request workflow status` -> new `...status job`), Docker secrets possibly missing from the - Dependabot store, no ruleset present (this repo has no repo-config today). -- `REPO=ptr727/ESPHome-NonRoot ./repo-config/configure.sh apply` then `check` -> "Configuration matches." - -**Live (dispatch, never via merge):** -- `gh workflow run publish-release.yml --ref develop` -> develop prerelease `1.8.-g`, - `:develop` image (multi-arch: `docker buildx imagetools inspect` shows amd64+arm64), GitHub release - (tag + LICENSE + README, marked prerelease, no binary asset). -- `gh workflow run publish-release.yml --ref main` (after promotion) -> clean `1.8.`, `:latest` + - `:` (+ `:SemVer2`) image, non-prerelease release, Docker Hub overview = `Docker/README.md`. -- Re-dispatch main -> no duplicate release (no-op guard), image still re-pushed (base refresh). -- Confirm the image runs non-root (the repo's reason to exist): `docker run --user 1001:100 ... esphome - version` succeeds against `/cache`. - ---- - -## 8. Go-live sequence - -1. **Triage the branch backlog first.** Decide which of the ~15 stale remote branches are obsolete; delete - them (`gh pr close` / `git push origin --delete`). Confirm the live tracker heads `upstream-version-*` - and dependabot heads are either merged or closed so they don't fight the migration. -2. Branch `feature/branch-scoped-cicd` off **`develop`** (the ahead branch). Apply iterations: - (i) workflows + delete cruft; (ii) `WORKFLOW.md` + `repo-config/` + `cspell.json`; (iii) docs (AGENTS / - CODESTYLE / copilot / dependabot / editorconfig / gitattributes / workspace / README / HISTORY / - version 1.8). Run the full static verify after each. -3. Open PR -> `develop`. 
**Copilot dance** (`copilot-review-flow`): wait + buffer, re-request via mutation, - resolve threads, decline false positives with rationale. Budget 1-3 rounds. -4. **`configure.sh apply` in lockstep** with the workflow edit (same change set), against the live repo - - this creates the rulesets + renames the required check so the PR's `...status job` check can resolve and - the Copilot rule attaches. Then `check`. -5. **Squash-merge to `develop`** (does NOT publish - one-branch model). Verify a develop push runs CI green. -6. **Promote `develop` -> `main` via a merge-commit PR, Copilot-reviewed, NO admin bypass.** Because main is - well behind develop, this is a substantive promotion (12 files / ~1k lines of legit content drift, not - just the migration). Watch for `migration-promotion-conflict` if main has straggler dependabot bumps in - files develop rewrote; if so, a local signed merge commit (tree = develop) may be needed - but try the - clean PR first. Decline pure-prose nits on the promotion PR (would diverge main from develop). -7. **Dispatch the publisher** on main (`gh workflow run publish-release.yml --ref main`) -> verify the main - `1.8.x` stable artifacts (image + release + overview). Dispatch on develop to verify the prerelease leg. -8. **Confirm `develop` survives** the promotion (`github-auto-delete-branch-gotcha`: auto-delete is OFF; - develop must still exist). Confirm the daily upstream tracker still opens bump PRs and the merge-bot - auto-merges + deletes their heads. -9. Set/confirm the publish schedule cadence the maintainer signed off in section 3 (daily vs weekly). - ---- - -## 9. Open questions for the maintainer - -1. **Release cadence (the crux - REQUIRES SIGN-OFF).** Confirm option (a): drop `PUBLISH_ON_MERGE`/merge- - publish; publish on schedule(main)+dispatch only; `upstream-version.json` is a shipped input. 
**Choose the - schedule:** recommended **daily** (`0 2 * * *`, ~24h CVE/upstream window, matches the daily tracker), vs - weekly (canonical default, ~7-day window), vs twice-weekly. Recommended default if no preference: **daily**. - The manual `gh workflow run` escape hatch covers any "release this bump now" case regardless. -2. **CODESTYLE.md inert sections.** Drop the large `.NET` and `Python` sections to converge with the Docker - siblings (which keep a General-only CODESTYLE), OR keep them per this repo's own "carry every section of a - carried file even when inert here" rule? Recommended: **drop them** (the Python is Dockerfile-internal, - not a maintained source tree; convergence wins). Needs a decision because it contradicts a stated repo rule. -3. **Image tag set.** Keep BOTH the pinned `:` tag (current behavior, user-meaningful) AND - add the canonical `:` tag, or only one? Recommended: **keep both** - the esphome tag is what - users pin to; the SemVer2 tag matches the GitHub release. (Costs one extra tag push.) -4. **`fail_on_unmatched_files`.** Confirm leaving it OFF/omitted on the Docker-only `github-release` (the - Docker target uploads no `release-asset-*`, so PlexCleaner's `true` would red a clean release). Recommended: - **off**, documented as the per-repo divergence in WORKFLOW.md D4. -5. **Version floor bump 1.7 -> 1.8.** Confirm the deliberate infra bump to exercise the publish path (matches - the sibling migrations: PlexCleaner 3.18->3.19, VSCode 1.0->1.1). Reconcile with the "routine edits leave - version.json untouched" rule as a maintainer-directed overhaul bump. -6. **Template lineage framing.** AGENTS.md currently frames the repo as ProjectTemplate-derived "two-phase". - Keep a (rewritten) Template-Lineage section pointing at the converged model, or drop the lineage framing - entirely as the siblings did? 
Recommended: **keep a trimmed lineage note** (it is still a real downstream - of ProjectTemplate) but rewrite it to the one-branch reality. diff --git a/plans/homeassistant-purpleair-migration-plan.md b/plans/homeassistant-purpleair-migration-plan.md deleted file mode 100644 index 5962a48..0000000 --- a/plans/homeassistant-purpleair-migration-plan.md +++ /dev/null @@ -1,423 +0,0 @@ -# HomeAssistant-PurpleAir branch-scoped CI/CD migration + convergence plan - -Repo: github.com/ptr727/HomeAssistant-PurpleAir (local clone: /home/pieter/homeassistant-purpleair, lowercase). -Type: Python Home Assistant custom integration distributed via HACS (custom_components/purpleair). No Docker, no NuGet. -Release artifact: a GitHub Release whose asset is `purpleair.zip` (the integration's files at archive root, HACS `zip_release` layout). - -Reference ground truth read for this plan: -- /home/pieter/LanguageTags (NuGet canonical), /home/pieter/PlexCleaner (Docker canonical) workflows, repo-config, AGENTS.md, copilot-instructions.md. -- This repo's 8 workflows, version.json, hacs.json, manifest.json (main + develop), dependabot.yml, ha-test-versions.json, AGENTS.md, CODESTYLE.md, .gitattributes, git history. - -HEADLINE: this repo is NOT a greenfield migration. `develop` has already been migrated most of the way to a -branch-scoped model in a prior round, and is heavily converged (AGENTS.md PR Review Etiquette, copilot Review -Runbook, NBGV versioning, HA-matrix bot, HACS zip-layout assertion are all already present on develop). The work -is (a) closing the remaining gaps to canon, (b) the crux release-trigger decision, (c) creating the missing -`repo-config/` 5D audit, and (d) a large branch-backlog + main/develop reconciliation cleanup. Do NOT rebuild from -the template; converge develop forward. - ---- - -## 1. Current-state assessment - -### 1.1 Branch hygiene (messy backlog - the big one) -- `develop` is 92 commits AHEAD of `main`, 0 behind. 
`git merge-base main develop` = `01e4292` (main tip, - PR #60 "reseed main manifest to 0.3.0 after pipeline regression"). Local `develop` == `origin/develop` (`dc8b34b`), clean. -- `main` is stuck in the OLD release-please era: it still carries `.github/workflows/release-please.yml`, has NO - `version.json`, NO `repo-config/`, manifest.json `version: 0.3.0`. develop has none of that legacy and has the - full NBGV + HA-matrix + HACS-zip pipeline. -- 54 remote branches. Abandoned/superseded migration attempts and one-offs: - - NBGV era leftovers: `nbgv`, `nbgv-prerelease-fix`, `restore-prerelease-gate`, `chore/seed-develop-prerelease-manifest`, - `chore/restore-develop-manifests-beta`, `chore/reseed-main-manifest-0.3.0`. - - HACS-zip pain (ALREADY merged into develop via PRs #80, #82): `fix-hacs-zip-layout`, `zip-layout-assertion-prefix-fix`. STALE. - - Sync churn: `sync-main-into-develop`, `chore/sync-main-into-develop-2`, `ruleset-and-refs-followup`, `feature/sync-versioned-rulesets`. - - release-please era: `release-please--branches--develop--...`, `release-please--branches--main--...`, `chore/remove-changelog`. - - Many open dependabot/* branches targeting develop, plus feature branches (`subentry-reconfigure-readkey`, - `feat/org-name-title-and-first-refresh-fix`, etc). -- VERIFY-before-delete: of the named "abandoned migration" branches, all six checked - (`nbgv`, `nbgv-prerelease-fix`, `restore-prerelease-gate`, `fix-hacs-zip-layout`, `zip-layout-assertion-prefix-fix`, - `sync-main-into-develop`) are NOT ancestors of develop, but the HACS-zip *content* already landed via squash PRs, - so the branches are stale duplicates, not lost work. Treat NBGV branches the same way (the NBGV pipeline is live on develop). - -### 1.2 Triggers and jobs today (on develop) -- `test-pull-request.yml`: triggers `pull_request: [main, develop]` + `push: [main, develop]` + `workflow_dispatch`. 
- Jobs: `test-release` (calls test-release-task.yml) and aggregator `check-workflow-status` - named **"Check pull request workflow status"** (NO trailing " job"). Concurrency `${{ github.workflow }}-${{ github.ref }}`. - -> DIVERGES from canon: canon triggers `push: ['**']` (every branch), no `pull_request`; aggregator name is - **"Check pull request workflow status job"**; aggregator must guard `!github.event.deleted` and treat success-only per job. -- `test-release-task.yml` (reusable, the validate+test set): jobs `ruff` (check + format --check), `mypy` (--strict), - `pyright`, `read-versions` (parses ha-test-versions.json), `pytest` (matrix over minimum/latest-stable/latest-beta, - uploads to Codecov), `hassfest`, `hacs`, and `build-release` (no-publish build gated by `build` input). -- `publish-release.yml`: triggers `workflow_dispatch: {}` + **`push: [develop]`**. Jobs: `gate` (assert dispatch from - main), `test-release` (build:false), `create-release` (github:true), `date-badge` (dispatch-only), `cleanup-artifacts`. - Concurrency `${{ github.workflow }}` global, cancel-in-progress:false. - -> DIVERGES from canon: canon publisher has NO push trigger; merges never publish. THIS repo auto-publishes a - prerelease on every develop push. This is the crux - see section 3. -- `build-release-task.yml` (reusable): `get-version` (calls get-version-task), `build` (stamp manifest with NBGV - SemVer2, zip custom_components/purpleair at root, **zip-layout assertion already present**, upload artifact), - `release` (download + softprops GitHub Release, `target_commitish: github.sha`, prerelease flag from get-version). -- `get-version-task.yml`: single NBGV run (setup-dotnet v10), outputs SemVer2/Tag/Prerelease. Prerelease - detected by `-` in SemVer2. CORRECT single-run threading (gotcha 2 already satisfied). 
**nbgv currently - SHA-pinned to v0.5.2 (`705dad19`) - CONVERT to `@master`** per the documented no-SHA-pin exception (the tag - stream lags master; a pin draws Dependabot downgrade PRs). See gotcha 13 / briefing. -- `build-datebadge-task.yml`: BYOB "Last Build" badge, gated on Prerelease==false. Template cruft per gotcha 12. -- `check-ha-version.yml`: daily cron + dispatch. Monitors `pytest-homeassistant-custom-component` on PyPI (whose - `homeassistant==` pin IS the upstream HA core version being tracked), resolves latest-stable + latest-beta HA pairs, - opens ONE bundled PR on rolling branch `ha-version-bump/matrix` -> develop via the codegen App. Does NOT publish. -- `merge-bot-pull-request.yml`: `pull_request_target`. Jobs merge-dependabot (squash develop / merge main by base), - merge-ha-version-bump (squash develop), disable-auto-merge-on-maintainer-push. Uses App token. - -> DIVERGES: `gh pr merge --auto` with NO `--delete-branch` (gotcha 5). - -### 1.3 Version scheme (NBGV present) -- `version.json`: base `version: "0.1"`, `publicReleaseRefSpec: ["^refs/heads/main$"]`, nugetPackageVersion.semVer 2. - Standard branch-scoped floor. main ships `0.1.`, develop ships `0.1.-g`. -- manifest.json `version`: develop = `0.0.0` placeholder (stamped at build time from NBGV); main = `0.3.0` (legacy, - release-please era). hacs.json has NO `version` field (HACS reads the stamped manifest). hacs.json `homeassistant` - = `2026.4.0` (the user-facing MINIMUM, hand-maintained, must match ha-test-versions.json `minimum.ha` and the - requirements.txt bootstrap pin series). -- The NBGV/version.json model is already correctly wired on develop; gotcha 1 is structurally satisfied. See 6.1. - -### 1.4 repo-config / 5D audit -- DOES NOT EXIST. No `repo-config/` directory on develop or main. This is the single largest missing canonical piece. - Rulesets are presumably configured live in the UI but are not codified or auditable. 
configure.sh + ruleset JSON - must be created from the LanguageTags canonical (adapting only secret names + the publish manual-verify note). - -### 1.5 SHA pins (gotcha 13, verified against GitHub API) -- setup-dotnet `9a946fd` = v5.3.0 (correct; canon prefers newer v5.4.0 - optional bump). -- nbgv `705dad19` = v0.5.2 - **SHA-pinned today, but CONVERT to `@master`** (documented no-SHA-pin exception; - do NOT keep the pin). -- checkout `df4cb1c069...` = the v6.0.3 commit (annotated tag `v6.0.3` -> `df4cb1c`; verified by dereference). Correct - (canon mentions v7.0.0 - a dependabot bump branch `dependabot/.../actions/checkout-7.0.0` already exists; let it land). -- All other pins (create-github-app-token v3.2.0 `bcd2ba49`, setup-python v6.3.0 `ece7cb06`, softprops v3.0.1 - `718ea10b`, codecov v6.0.1, upload/download-artifact, hacs/action 22.5.0) appear consistent; spot-verify any the - reviewer challenges - do NOT trust Copilot SHA->version claims. - ---- - -## 2. Target architecture - -One run = one branch. Reconcile per-file. The HACS pull-model release cadence is the one place the repo legitimately -diverges from the Docker/NuGet canon; section 3 resolves it. - -### 2.1 test-pull-request.yml (CI - align to canon) -- Triggers: `push: branches: ['**']` + `workflow_dispatch`. REMOVE the `pull_request` trigger and the `push:[main,develop]` - restriction. Self-testing: pushing any branch IS the PR check. - - Tension: the existing `push:[main,develop]` exists to re-upload Codecov on the post-merge SHA (default-branch badge). - Under `push:['**']` that still happens (main/develop are members of `**`), so the Codecov goal is preserved. Keep the - explanatory comment, retargeted. -- Jobs: keep `test-release` (calls test-release-task.yml). RENAME aggregator job to exactly - **"Check pull request workflow status job"** (job key may stay `check-workflow-status`). Add `if: ${{ !github.event.deleted }}` - to the head job(s) and the aggregator. 
Aggregator gate: per-need loop treating `success` as the only pass for a - required need; for any conditionally-skipped need use the success-OR-skipped allowlist (gotcha 8). Today there is one - need (`test-release`) which is never skipped, so the simple `!= 'success' -> exit 1` is correct; keep it but ensure the - `!github.event.deleted` guard so deleting a branch does not red a phantom run (gotcha 4). -- Concurrency unchanged (`${{ github.workflow }}-${{ github.ref }}`, cancel-in-progress true). - -### 2.2 test-release-task.yml (validate + test set - this is the Python adaptation of canon's validate/smoke) -- KEEP AS-IS structurally. This is the correct Python mapping of the canonical "validate + smoke/test" pair: - - validate role: `ruff` (lint+format), `mypy --strict`, `pyright`, `hassfest`, `hacs` (replaces canon's - markdownlint/cspell-only validate). - - smoke/test role: the `pytest` matrix over `minimum` / `latest-stable` / `latest-beta` HA versions (this IS the - HA-core-version test matrix the per-project note calls for) + `build-release` (no-publish build that exercises the - zip/HACS-layout path, analogous to canon's `smoke-build`). -- Optionally add `markdownlint`/`cspell` validate jobs if the repo wants the canonical doc-lint parity (the repo has - `.markdownlint-cli2.jsonc`); RECOMMEND keeping these as separate validate jobs only if they already pass clean - - do not introduce new failing gates during migration. Flag as an open question (9.4). - -### 2.3 publish-release.yml (the crux - see section 3 for the decision) -RECOMMENDED target (decision: dispatch-only publish, schedule retests only): -- Triggers: `workflow_dispatch: {}` + `schedule` (daily/weekly cron). REMOVE `push: [develop]`. -- `gate` job: keep, generalize to allow dispatch from `main` OR `develop`, guard - `if: github.ref_name == 'main' || github.ref_name == 'develop'` (canon shape). main dispatch = stable; develop - dispatch = prerelease (NBGV `-g` makes the tag unique). 
This restores "merges never publish". -- `schedule` leg: RETEST ONLY, NEVER publish. The schedule runs on the default branch (main) by GitHub rule; it should - run `test-release` (validate+test) against main and STOP - no `create-release`. Rationale: HACS is a pull model and - the maintainer explicitly does not auto-push; the schedule's job is to catch upstream HA/pytest-hacc drift breaking - the shipped main, surfacing a red run for a maintainer to act on, not to cut a release. (`check-ha-version.yml` - already handles the develop-side retest-and-bump; the publisher schedule covers main, which the matrix bot does not touch.) - - Implement: `create-release` (and date-badge) gate `if: github.event_name == 'workflow_dispatch'`. The schedule path - runs gate(skipped)->test-release->[create-release skipped]. Concurrency global, cancel-in-progress:false (unchanged). -- `create-release` job: unchanged otherwise (build:false on test-release, github:true on create-release, softprops with - target_commitish github.sha, prerelease from NBGV). -- `date-badge`: FOLD or DELETE per gotcha 12 (template cruft). RECOMMEND delete build-datebadge-task.yml and the - date-badge job - it is a "Last Build" vanity badge with no release-correctness role. Flag as open question 9.3 since it - is currently wired and green. -- `cleanup-artifacts`: keep (always(), best-effort). - -### 2.4 build-release-task.yml (release artifact - keep, it is the HACS zip producer) -- KEEP. The zip-layout assertion (manifest.json + __init__.py at root, no `purpleair/` wrapper, `./` normalization) is - ALREADY baked in (lines 68-99) from PRs #80/#82 - gotcha-equivalent "bake the zip-layout assertion" is DONE. Do not - remove it; verify it survives any edit. Single NBGV run via get-version-task threaded down as SemVer2 (gotcha 2 satisfied). - -### 2.5 get-version-task.yml / merge-bot / check-ha-version / dependabot -- get-version-task: keep (correct single NBGV run). 
-- merge-bot-pull-request.yml: add `--delete-branch` to BOTH `gh pr merge --auto` calls (merge-dependabot, - merge-ha-version-bump). Keep repo-wide auto-delete-on-merge OFF in settings.json so develop->main promotion does not - delete develop (gotcha 5). -- check-ha-version.yml: keep as-is (it is the repo's upstream-monitor; already retest-not-publish). Confirm its rolling - PR + bundle design is documented in AGENTS.md (it is). -- dependabot.yml: TODAY single-targets develop only (every ecosystem `target-branch: develop`). **DECIDED - (2026-06-29): dual-target main AND develop** (gotcha 11) - the maintainer confirmed "dependabot should - still keep main and develop updated, i.e. avoid merge drift." Converge all ecosystems (pip + - github-actions) to dual-target like the four live repos; develop's bumps are sync-only and never publish. - Open question 9.1 is resolved. - ---- - -## 3. Release model decision (THE CRUX) - -Question: how does the HACS-zip release map onto one-branch, given releases are NOT automatic on merge and HACS is a -pull model? - -What is monitored upstream: `check-ha-version.yml` monitors `pytest-homeassistant-custom-component` on PyPI, whose -`homeassistant==` pin is the de-facto upstream **HA core version** (stable and beta). `aiopurpleair` (the API client, -pinned as `aiopurpleair-ptr727==` in manifest/requirements) is NOT auto-monitored; it moves via Dependabot pip -PRs (note develop manifest pins 2026.8.0 vs main 2026.4.0). So "upstream" = HA core (via pytest-hacc), monitored daily, -retested on develop via a bundled bot PR; it does not publish. - -**The monitor is a breakage tripwire (maintainer-confirmed intent):** the HA-version bump updates the test -matrix with the new HA release specifically so a breaking upstream change makes the bot PR's CI **fail**; a -human then intervenes, fixes, and releases manually via dispatch. 
The monitoring exists to surface breakage -early, not to ship - which is exactly why publish is dispatch-only and the schedule retests but never -publishes. - -How/when the HACS zip is cut today: TWO paths - (a) every develop push auto-cuts a PRERELEASE GitHub Release; (b) -`workflow_dispatch` from main cuts a STABLE release. Path (a) directly contradicts the maintainer's stated model ("does -NOT auto push") and the canonical rule "merges never publish". - -DECISION: **dispatch-gated publish; schedule retests only; NO push-to-develop publish.** -- Stable release: `gh workflow run publish-release.yml --ref main` (gate asserts main). Cuts `0.1.` clean. -- Prerelease (when a beta tester build is wanted): `gh workflow run publish-release.yml --ref develop`. NBGV emits - `0.1.-g`, softprops marks it prerelease. This replaces the automatic develop-push prerelease with an - on-demand one - same artifact, same uniqueness guarantee, but maintainer-initiated. -- Schedule (daily/weekly): runs `test-release` against main ONLY (retest the shipped integration against the latest HA - matrix). NEVER publishes. This is the publisher-side complement to check-ha-version's develop-side retest. - -Rationale: -1. Matches the maintainer's explicit "monitor + retest but do NOT auto push (HACS is pull)" requirement verbatim. -2. Restores the canonical invariant "merges never publish" (briefing model + gotcha-adjacent). The AGENTS.md "Merging is - not releasing" verbatim contract currently lies, because develop-push DOES release; this decision makes the docs true. -3. Removes the develop-push trigger entirely, eliminating the only `push`-on-merge publish path - aligns with the - PlexCleaner Docker canon (dispatch + schedule, no push) which is the closest precedent for a non-NuGet artifact. -4. 
Keeps NBGV GITHUB_REF classification correct (gotcha 1): a develop dispatch has `github.ref = refs/heads/develop` - (one run = one branch), so NBGV classifies it prerelease with no IGNORE_GITHUB_REF hack. A main dispatch is clean. - -TENSION FLAGGED: the existing code+docs treat develop-push prerelease as a feature ("beta testers always have the -latest"). Dropping it trades automatic prereleases for on-demand. If the maintainer wants beta testers to keep getting -every develop push automatically, the alternative is to KEEP `push:[develop]` as a documented, deliberate exception to -the one-branch publisher (the repo already gates it correctly on test success). Surface both; RECOMMEND dispatch-only to -honor the stated "does NOT auto push" requirement. See open question 9.2. - ---- - -## 4. Files to create / edit / delete - -### CREATE (8) -1. `repo-config/configure.sh` - port LanguageTags verbatim; change only: - - `REQUIRED_ACTIONS_SECRETS=(CODEGEN_APP_CLIENT_ID CODEGEN_APP_PRIVATE_KEY CODECOV_TOKEN)` (no NuGet/Docker; Codecov - token is the repo's one publish-ish secret - it is Actions-only, NOT Dependabot). - - `REQUIRED_DEPENDABOT_SECRETS=(CODEGEN_APP_CLIENT_ID CODEGEN_APP_PRIVATE_KEY)` (the codegen App secrets must be in BOTH - stores so a Dependabot-triggered push-CI can mint the App token; Codecov is not needed by Dependabot runs). - - `REQUIRED_CHECK="Check pull request workflow status job"` (identical string). - - cmd_check manual-verify note: "GitHub Releases / HACS zip publish is dispatch-gated; no external publish policy to verify." - - check_app note: "verify the codegen App is installed". - - Keep jq_lacks / check_secrets / ruleset_id / check_app helper BODIES byte-identical to canon (gotcha 6). -2. `repo-config/ruleset-develop.json` - port verbatim (squash-only, linear, signed, deletion+non_fast_forward, - required check "Check pull request workflow status job" with integration_id 15368, copilot_code_review, 0 approvals, - strict false). -3. 
`repo-config/ruleset-main.json` - port verbatim (merge-only, NO linear, signed, same required check, copilot review). -4. `repo-config/settings.json` - port verbatim (allow_squash+merge, rebase off, auto_merge on, delete_branch_on_merge:false). -5. `repo-config/README.md` - port verbatim, substitute repo name; document the dispatch-only HACS publish in place of the - NuGet/Docker publish line. -6. (optional) `markdownlint`/`cspell` validate jobs - only if 9.4 says yes; else skip. - -### EDIT (8-9) -1. `.github/workflows/test-pull-request.yml` - triggers -> `push:['**']` + dispatch (drop pull_request); aggregator name - -> "Check pull request workflow status job"; add `!github.event.deleted` guards. -2. `.github/workflows/publish-release.yml` - drop `push:[develop]`; add `schedule`; gate dispatch main||develop; gate - create-release/date-badge on `github.event_name == 'workflow_dispatch'`; (delete date-badge if 9.3 yes). -3. `.github/workflows/merge-bot-pull-request.yml` - add `--delete-branch` to both `gh pr merge --auto` calls. -4. `.github/dependabot.yml` - dual-target main AND develop (if 9.1 yes). -5. `AGENTS.md` - update "Release flow" + "Merging is not releasing" to match the dispatch-only decision (remove the - "Push to develop -> automatic prerelease" bullet; document dispatch-from-develop + schedule-retests-main). Add a - "Where rules live" pointer if missing; verify Comments subsection matches canon verbatim. Add `repo-config/` pointer. -6. `CODESTYLE.md` - STRIP the `.NET` section (lines 39-343) - template cruft for a Python-only repo (gotcha 12). Keep - General + Python. -7. `.github/copilot-instructions.md` - verify Review Runbook is byte-converged with canon; substitute owner/name - `ptr727/homeassistant-purpleair` in any hardcoded GraphQL snippets; confirm bot login `copilot-pull-request-reviewer`. -8. 
`custom_components/purpleair/manifest.json` (main side, during reconciliation) - main's `0.3.0` must become `0.0.0` - placeholder to match the stamp-at-build model when develop lands on main (handled by the promotion, see 8). -9. (if SHA convergence chosen) bump checkout v6.0.3->v7.0.0, setup-dotnet v5.3.0->v5.4.0 across workflows - or let the - existing dependabot/checkout-7 branch land. - -### DELETE (2-4) -1. `.github/workflows/build-datebadge-task.yml` - template cruft (gotcha 12), if 9.3 yes. -2. `.github/workflows/release-please.yml` - ONLY EXISTS ON MAIN (legacy). Removed automatically when develop overwrites - main in the promotion; no develop-side action. -3. 40+ stale remote branches (backlog cleanup, see 8.6) - not files but a delete pass. -4. (n/a) No `dorny/paths-filter`, no `setup`/`PUBLISH_ON_MERGE`, no `merge-codegen`/`merge-upstream-version`, - no `publish-docker-readme-task.yml` exist here - already absent. - -Net: ~8 create, ~8-9 edit, ~2-4 delete (workflow-file deletes 1-2; plus a branch-backlog delete pass). - ---- - -## 5. Convergence + backports - -PORT VERBATIM from LanguageTags (byte-for-byte target): -- `repo-config/configure.sh` helper bodies (jq_lacks, check_secrets, ruleset_id, check_app, assert, apply_ruleset, - check_ruleset, check_settings, check_security, cmd_apply/cmd_check/dispatch). Only the secret arrays + two note strings differ. -- `repo-config/ruleset-develop.json`, `ruleset-main.json`, `settings.json` (only the required-check string is shared and - already canonical). -- AGENTS.md: Comments subsection, Git/Commit rules, "Where rules live" lead paragraph, PR Review Etiquette (already - present and looks converged - diff against canon), Documentation Style Conventions incl. the line-endings - "preserve current state" rule. -- `.github/copilot-instructions.md` Review Runbook (bot id read-from-review pattern, requestReviews mutation, coverage - check, thread resolution). 
-- `.editorconfig` EOL rules (CRLF for .md/.yml/.json; LF for .sh/.py) - the repo already has a 10KB .editorconfig; diff - the relevant blocks against canon. - -ADAPT (Python-specific, not verbatim): -- `test-release-task.yml` validate/test set (ruff/mypy/pyright/hassfest/hacs/pytest-matrix) - no canonical sibling; this - is the repo's correct Python mapping. Keep. -- publish-release.yml mechanics (HACS zip via softprops, dispatch+schedule-retest) - per-repo publish seam. -- configure.sh secret arrays + manual-verify note. - -BACKPORT TO THE FOUR LIVE REPOS (drift found here worth propagating): -- The HACS-zip-layout assertion pattern is repo-unique; nothing to backport. -- check-ha-version.yml's PEP-440 walk + bundled rolling-PR design is more robust than a naive sort; if any sibling repo - monitors a PyPI/upstream version, consider porting the `packaging.version` ordering. LanguageTags/PlexCleaner do not, - so likely n/a. -- If the .editorconfig or Comments subsection here has drifted ahead of canon, reconcile toward the canonical text (do - not let this repo's copy become a fork). - ---- - -## 6. Gotcha checklist (mapped to THIS repo) - -1. NBGV GITHUB_REF classification - SATISFIED on develop. version.json floor `0.1` + `publicReleaseRefSpec ^refs/heads/main$`; - one-run-one-branch means github.ref already matches; no IGNORE_GITHUB_REF hack present or needed. ACTION: when adopting - dispatch-only, a develop dispatch keeps github.ref=refs/heads/develop so NBGV stays prerelease - correct. Bump version.json - `0.1`->`0.2` once to exercise the publish path during go-live verification. -2. NBGV threading - SATISFIED. Single nbgv run in get-version-task; SemVer2 threaded into build-release-task's stamp + - softprops. No nested get-version. Keep it that way. -3. Docker creds in both secret stores - N/A (no Docker). 
The ANALOG: codegen App secrets (CODEGEN_APP_CLIENT_ID/PRIVATE_KEY) - MUST be in both Actions AND Dependabot stores, because a Dependabot-merged push fires develop CI and the merge-bot mints - the App token. configure.sh REQUIRED_DEPENDABOT_SECRETS encodes this. -4. Branch-deletion guard - NOT PRESENT today (test-pull-request uses pull_request, not push:['**']). ADD `!github.event.deleted` - when switching to push:['**'] so deleting a branch does not fire a phantom CI run. -5. merge-bot --delete-branch - MISSING. ADD to both merge calls. Keep repo-wide auto-delete OFF. -6. 5D audit hardened form - configure.sh DOES NOT EXIST; create from the final hardened canonical (jq_lacks exit-4 case, - check_secrets fail-on-API-error, ruleset_id first//empty + visible gh error, check_app best-effort note, fail-when-cannot-verify). -7. Required-check name lockstep - the ruleset JSON required-check, the aggregator job name, and the live ruleset must all - read "Check pull request workflow status job". Today the workflow says "...status" (no " job") and there is no ruleset JSON. - Fix the workflow name AND create the JSON AND run `apply` in the same change that ships the workflow edit, then `check`. -8. Aggregator success/skipped allowlist - today one never-skipped need, so success-only is fine. If validate jobs are split - out (markdownlint/cspell) keep the simple loop but ensure any conditionally-skipped need uses success-OR-skipped. -9. EOL discipline - CRLF for .md/.yml/.json/.code-workspace, LF for .sh/.py/Dockerfile. Repo has .gitattributes (`* -text` - + LF pins for *.sh/scripts/*) and a large .editorconfig. VERIFY .json/.yml CRLF with `tr -cd '\r' | wc -c` (file(1) - lies for JSON). New repo-config/*.json must be CRLF; configure.sh must be LF. -10. Copilot review loop - copilot-instructions.md Review Runbook already present; bot login `copilot-pull-request-reviewer` - (GraphQL, no [bot]) / `...[bot]` (REST). Re-request via requestReviews mutation each head. 
Expect 1-3 rounds; snupkg/OIDC - false positives are N/A (no NuGet); HACS-zip / NBGV-prerelease are the likely recurring false positives here - decline - with rationale. Use `gh api -X PATCH .../pulls/N -F body=@file` for body edits. -11. Dependabot dual-target - today develop-only. DECISION (9.1): converge to dual-target main AND develop to match canon - and avoid the non-linear merge-block; unless maintainer confirms main-bumps are pointless under HACS pull. Default: dual. -12. Strip template cruft - DELETE build-datebadge-task.yml + date-badge job; STRIP CODESTYLE.md .NET section. No - paths-filter / PUBLISH_ON_MERGE / merge-codegen / docker-readme exist (already clean). release-please.yml is main-only - legacy, removed by the promotion. -13. Action SHAs - VERIFIED: checkout df4cb1c=v6.0.3, setup-dotnet 9a946fd=v5.3.0, nbgv 705dad19=v0.5.2. Canon - prefers checkout v7 / setup-dotnet v5.4 (optional; dependabot/checkout-7 branch exists - let it land). - **nbgv must become `@master`, NOT stay SHA-pinned** (documented no-SHA-pin exception; the pin draws - Dependabot downgrade PRs). Re-verify any SHA the reviewer disputes; never trust Copilot's SHA->version mapping. -14. Prose rules - no em-dashes (sweep of workflows/AGENTS/CODESTYLE = clean today; re-sweep after edits). US English. Terse - comments, one line <=120, top-of-file workflow summary. Never edit human-authored comments. NOTE: check-ha-version.yml's - apply step deliberately preserves a U+2014 em-dash in ha-test-versions.json's `$comment` via ensure_ascii=False - that - em-dash is in a DATA file's content, not prose authored by us; leave the workflow logic, but confirm the comment text we - author elsewhere stays em-dash-free. - ---- - -## 7. Verification - -Static (run all before pushing): -- `actionlint` on every .github/workflows/*.yml. -- `bash -n repo-config/configure.sh`; `shellcheck` it. -- `python3 -m json.tool` parse on each repo-config/*.json + ha-test-versions.json + manifest.json + hacs.json. 
-- EOL: `for f in repo-config/*.json .editorconfig; do printf '%s ' "$f"; tr -cd '\r' < "$f" | wc -c; done` (expect >0 for - CRLF JSON, 0 for configure.sh). `grep -rIl $'\r' repo-config/configure.sh` must be empty. -- Em-dash sweep: `grep -rn $'—' .github/ AGENTS.md CODESTYLE.md repo-config/` must be empty (the ha-test-versions.json - `$comment` em-dash is acceptable data, exclude it). -- markdownlint-cli2 (config present) / cspell if the workspace has a dict - report-only if not gating. -- zip-layout assertion smoke: locally `cd custom_components/purpleair && zip -r /tmp/p.zip . && unzip -Z1 /tmp/p.zip | - sed 's|^\./||'` and confirm manifest.json + __init__.py at root, no `purpleair/` prefix. - -Config audit: -- `REPO=ptr727/HomeAssistant-PurpleAir ./repo-config/configure.sh check` BEFORE apply -> expect drift (rulesets not codified - / required-check name mismatch). Run `apply`. Re-run `check` -> expect "matches". - -Live dispatch verification (after merge): -- `gh workflow run publish-release.yml --ref develop` -> confirm a PRERELEASE GitHub Release with `0.1.-g` tag and - a `purpleair.zip` asset whose layout passes the assertion. -- `gh workflow run publish-release.yml --ref main` -> confirm a STABLE `0.1.` release. -- Confirm the schedule leg (or a dispatched no-publish test) retests without creating a release. -- Bump version.json `0.1`->`0.2` to exercise a new minor and confirm NBGV height resets. - ---- - -## 8. Go-live sequence - -8.1 Branch from develop: `feature/branch-scoped-cicd-convergence`. Make ALL edits + creates from section 4 there. -8.2 Push -> the new `push:['**']` CI runs on the feature branch (self-testing). Green it (ruff/mypy/pyright/pytest-matrix/ - hassfest/hacs + no-publish build). -8.3 Open PR feature -> develop (squash). Copilot dance: re-request via requestReviews on each head; resolve every thread; - expect 1-3 rounds; HACS-zip/NBGV-prerelease comments are likely false positives - decline with rationale. 
Maintainer - approves explicitly (Merge Gate). -8.4 Required-check lockstep: in the SAME change, `REPO=ptr727/HomeAssistant-PurpleAir ./repo-config/configure.sh apply` - against the live repo so the ruleset's required check becomes "Check pull request workflow status job" and matches the - renamed aggregator - otherwise the PR's new check name is not the required one and the PR cannot satisfy the old name. - Then `check` to confirm "matches". (Order: apply right before/with the merge so the new green check is the required one.) -8.5 Squash-merge to develop. develop CI re-runs (no publish - push trigger removed). -8.6 Branch-backlog cleanup (do this around the promotion, carefully): - - Confirm each candidate is fully merged or truly abandoned: `git branch -r --merged origin/develop` for the safe set; - for the NBGV/HACS/sync branches verify their content is on develop (HACS-zip already is via #80/#82; NBGV pipeline is - live) then `git push origin --delete `. - - Delete release-please-- branches, nbgv*, *sync*, ruleset-*, fix-hacs-zip-layout, zip-layout-assertion-prefix-fix, - chore/seed-*, chore/reseed-*, chore/restore-* after confirming superseded. - - Leave OPEN dependabot/* branches (they auto-close on merge or get superseded); leave live feature branches the - maintainer still wants. Get maintainer sign-off on the delete list (open question 9.5) - do not bulk-delete unilaterally. -8.7 Reconcile main: develop is 92 ahead / 0 behind, so a develop->main merge-commit PR is clean. Open develop->main PR, - "Create a merge commit" (main ruleset is merge-only; develop becomes a real ancestor so the next promotion is clean). - NO admin bypass. This brings version.json, repo-config, the new workflows, the 0.0.0 placeholder manifest, and DELETES - release-please.yml from main in one node. Copilot-review the promotion; decline pure-prose nits (would diverge main). - - Watch the manifest version: main currently 0.3.0, develop 0.0.0. 
The merge takes develop's 0.0.0 (correct - stamped - at build). The first STABLE dispatch from main will then publish NBGV `0.1.` (NOT 0.3.x). FLAG to maintainer - (9.6): the public version series effectively resets from release-please 0.3.0 to NBGV 0.1.. If continuity matters, - bump version.json base to `0.3` (or higher) BEFORE the first main dispatch so NBGV emits >=0.3.x. -8.8 Apply main ruleset: `configure.sh apply` already wrote both rulesets in 8.4; re-`check` post-promotion to confirm main - ruleset live. -8.9 Dispatch the publisher from main: `gh workflow run publish-release.yml --ref main`; verify clean release + zip asset + - layout. Optionally dispatch from develop for a prerelease. -8.10 Confirm develop SURVIVES: repo-wide auto-delete is OFF; the merge-commit promotion + per-merge --delete-branch on bot - PRs only delete bot branches, never develop. Verify `origin/develop` still exists post-promotion. - ---- - -## 9. Open questions for the maintainer (with recommended defaults) - -9.1 Dependabot targeting: converge to dual-target main AND develop (canon, avoids non-linear merge-block) vs keep - develop-only (current, documented)? DEFAULT: dual-target (match the four live repos; the maintainer already rejected - single-target elsewhere). -9.2 Release trigger (THE CRUX): adopt dispatch-only publish + schedule-retests-main, dropping the automatic develop-push - prerelease? DEFAULT: YES - it honors the stated "monitor + retest but do NOT auto push (HACS pull)" requirement and - restores "merges never publish". Alternative if beta testers must keep auto-prereleases: keep push:[develop] as a - documented deliberate exception. -9.3 Delete build-datebadge-task.yml + the date-badge job (template cruft)? DEFAULT: YES (vanity "Last Build" badge, no - release-correctness role). Keep only if the README badge is load-bearing for the maintainer. -9.4 Add markdownlint/cspell validate jobs for canonical doc-lint parity? 
DEFAULT: only if they already pass clean; do not - add a new failing gate during migration. Otherwise defer to a follow-up. -9.5 Branch-backlog delete list: approve bulk deletion of the ~40 superseded/abandoned branches (nbgv*, release-please--*, - *sync*, ruleset-*, hacs-zip-*, chore/seed|reseed|restore-*)? DEFAULT: delete after per-branch superseded-confirmation; - leave live dependabot/* and wanted feature branches. -9.6 Version continuity: main is at release-please 0.3.0; NBGV base is 0.1, so the first stable dispatch ships 0.1., - a version REGRESSION below 0.3.0. Bump version.json base to >=0.3 before the first main dispatch to preserve monotonic - public versions? DEFAULT: YES, set base to `0.3` (or `0.4`) so HACS users do not see a downgrade. diff --git a/plans/nxwitness-migration-plan.md b/plans/nxwitness-migration-plan.md deleted file mode 100644 index 13aed52..0000000 --- a/plans/nxwitness-migration-plan.md +++ /dev/null @@ -1,466 +0,0 @@ -# NxWitness branch-scoped CI/CD migration + convergence plan - -Repo: `/home/pieter/NxWitness` (GitHub `ptr727/NxWitness`). Default branch `main`; integration branch `develop`. -Target model: the branch-scoped self-publishing CI/CD proven on LanguageTags, Utilities, PlexCleaner, VSCode-Server. -Canonical Docker reference: `/home/pieter/PlexCleaner/`. Canonical .NET-tooling reference: `/home/pieter/LanguageTags/`. - -This is the MOST complex remaining repo: it couples (1) a .NET **codegen tool** (`CreateMatrix` + `CreateMatrixTests` + `Make/`) that -fetches upstream Nx product versions and regenerates `Make/Version.json`, `Make/Matrix.json`, `Docker/*.Dockerfile`, and `Make/Test*.yml`; -with (2) a **multi-stage, multi-product, multi-base Docker** build (5 products x {plain, LSIO} = 10 product images, plus 2 shared base -images) published to 12 Docker Hub repos. 
There is **no NuGet publish** (`CreateMatrix.csproj` and the test csproj both set -`IsPackable=false`; repo-wide grep for `nuget push` / `dotnet pack` / `PackageId` / `NuGetApiKey` returns zero). The .NET is a build-time -generator only; the published artifacts are exclusively Docker Hub images. - -IMPORTANT: NxWitness is **partially migrated already**. `publish-release.yml` is dispatch+schedule (no push), and codegen/Dependabot -already dual-target main AND develop. The work here is (a) closing the gaps to the canonical model, (b) adding the missing governance -surface (`WORKFLOW.md`, `repo-config/`, the converged AGENTS/CODESTYLE/copilot sections), and (c) **deciding the one big architectural -tension**: the publisher today builds BOTH branches in one run (build-main + build-develop legs), which is exactly the cross-branch NBGV -classification hazard the canonical one-branch model exists to remove. - ---- - -## 1. Current-state assessment - -### 1.1 Workflows present (`.github/workflows/`) -| File | Trigger | Role | Canonical status | -|---|---|---|---| -| `test-pull-request.yml` | `pull_request` [main, develop] + `workflow_dispatch` | CI: paths-filter -> test-release + smoke-build + aggregator | **DIVERGES**: uses `pull_request` not `push: ['**']`; uses `dorny/paths-filter`; aggregator name is `Check pull request workflow status` (missing trailing ` job`) | -| `publish-release.yml` | `workflow_dispatch` + `schedule` (Mon 02:00 UTC) | Publisher: get-version(main) + build-base(main) + build-main + build-develop + github-release + docker-readme + date-badge + cleanup | **DIVERGES**: builds BOTH branches in one run (two legs), not one-branch-per-run | -| `build-base-images-task.yml` | `workflow_call` | Builds shared `nx-base` + `nx-base-lsio` (matrix of 2), branch-scoped buildcache | repo-owned, keep | -| `build-docker-task.yml` | `workflow_call` | Builds product images from `Make/Matrix.json` (matrix over `.Images`), threads SemVer2 in | repo-owned, keep; NBGV 
threading OK (see 1.4) | -| `get-version-task.yml` | `workflow_call` (input `ref`) | Single NBGV run, exposes SemVer2 + assembly versions + GitCommitId | canonical-shaped; no IGNORE_GITHUB_REF | -| `test-release-task.yml` | `workflow_call` + `workflow_dispatch` | husky lint + `dotnet test` | rename/fold into `validate-task.yml` (see 2) | -| `run-codegen-pull-request-task.yml` | `workflow_call` | Matrix codegen main+develop -> `codegen-main`/`codegen-develop` PRs | dual-target, keep (gotcha 11) | -| `run-periodic-codegen-pull-request.yml` | `workflow_dispatch` + `schedule` (daily 04:00) | Calls the codegen task | keep | -| `merge-bot-pull-request.yml` | `pull_request_target` [opened, reopened, synchronize] | merge-dependabot + merge-codegen + disable-on-maintainer-push | **DIVERGES**: merge step lacks `--delete-branch` (gotcha 5) | -| `build-datebadge-task.yml` | `workflow_call` | BYOB "Last Build" badge | **STRIP** (gotcha 12) | -| `publish-docker-readme-task.yml` | `workflow_call` | Docker Hub overview via `peter-evans/dockerhub-description`, manifest-derived repo list | **FOLD** into the docker task as a main-publish step (gotcha 12) | - -### 1.2 Governance surface (the big gap) -- **No `WORKFLOW.md`.** No `repo-config/` (no `configure.sh`, no ruleset JSON, no `settings.json`, no `repo-config/README.md`). - The 5D audit and the GitHub-side config are entirely absent. This is the largest single body of new work. -- `AGENTS.md` exists with: Solution Structure, Build/Validation, Image Architecture, CI Pipeline, Versioning, Git and Commit Rules, - PR Title/Commit Conventions, PR Review Etiquette (full canonical contract, with the "Mandatory in every derived repo" banner + Merge - Gate), Coding Conventions (Highlights), Notes for Changes, Template adaptations. It does NOT reference WORKFLOW.md/repo-config. - It LACKS a dedicated `### Comments` subsection and a `## Documentation Style Conventions` section (those live in PlexCleaner AGENTS.md). 
-- `.github/copilot-instructions.md` exists with the full canonical Review Runbook (requestReviews mutation, GraphQL-vs-REST login split, - head-SHA coverage, bounded retry, thread resolution). Good - ports nearly verbatim, only owner/name strings change. -- `CODESTYLE.md` exists (25 KB). Needs a heading-by-heading diff against PlexCleaner's to converge the shared structure. - -### 1.3 Version scheme -`version.json` floor `2.14`, `publicReleaseRefSpec: ["^refs/heads/main$"]`, `nugetPackageVersion.semVer: 2`. NBGV computes -`X.Y.` on main, `X.Y.-g` elsewhere. `get-version-task.yml` runs NBGV once (dotnet/nbgv@master, floated - tag -stream lags so Dependabot would propose a downgrade; a deliberate `@master` float, keep with the existing comment). - -### 1.4 NBGV threading today (gotcha 2 status) -- `get-version-task.yml` runs NBGV **once** and exposes outputs. Good. -- `build-docker-task.yml` ALSO contains a nested `get-version` call (`needs: [get-version, ...]`) and threads - `needs.get-version.outputs.SemVer2` into `LABEL_VERSION`. So in the PUBLISHER path, NBGV runs in TWO places: once at top-level - `publish-release.yml::get-version` (ref: main, for the release tag) AND once inside each `build-docker-task` invocation (ref: the leg's - ref). These are different NBGV runs on different refs. For the image LABEL_VERSION this is arguably intentional (the develop leg should - label its images with the develop prerelease version), but it is a SECOND NBGV run per leg - exactly what gotcha 2 says to avoid, and it - means the develop leg's NBGV classification depends on which `ref` it was handed, not on GITHUB_REF. This works today only because the - legs pass an explicit `ref` (main commit / develop) and NBGV keys off the checked-out branch when given a real branch ref. See 3 for the - recommended consolidation. - -### 1.5 Cross-branch / NBGV-leak exposure (the crux) -`publish-release.yml` builds main and develop in ONE run via separate jobs. 
`GITHUB_REF` for the whole run is the dispatch ref. The -build legs each pass an explicit `ref` (`build-main` pins `get-version.outputs.GitCommitId`; `build-develop` passes `ref: develop`). -The repo does NOT set `IGNORE_GITHUB_REF` anywhere. The safety net is the **`Verify public release version`** step (the D2.2 backstop) in -`github-release`, which refuses to publish a `main` GitHub release carrying a prerelease `-`. So the GitHub RELEASE classification is -protected. The IMAGE tag/label classification, however, is governed by the nested `get-version` in each docker leg keyed on the leg's -`ref`, not on GITHUB_REF - so as long as `build-develop` passes `ref: develop`, its nested NBGV checks out develop and classifies -prerelease. This is the matrix-publisher case the memory `nbgv-publicrelease-githubref-leak.md` says either needs `IGNORE_GITHUB_REF` OR -(preferred) should migrate to one-branch-per-run. - -### 1.6 Smoke / CI path -`test-pull-request.yml` runs on `pull_request`. `dorny/paths-filter` gates: builds smoke (NxMeta + NxMeta-LSIO, amd64, no push) only when -`Docker/**` / `Make/Matrix.json` / `Make/Version.json` changed; `build_base` only when a base Dockerfile changed. Aggregator -`check-workflow-status` (name `Check pull request workflow status`) treats success|skipped as pass, fails on failure|cancelled. Has a -branch-deletion concern: on `pull_request` there is no branch-deletion event, so gotcha 4's `!github.event.deleted` guard is **n/a while -the trigger stays `pull_request`** - but if we move to `push: ['**']` (canonical), the guard becomes mandatory. 
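The two rules in 1.6 (the image change gate and the aggregator allowlist) can be sketched in plain shell. This is an illustrative sketch only, not the repo's actual workflow code: the function names `needs_smoke` and `results_pass` are hypothetical, and the gate assumes changed paths arrive one per line on stdin.

```shell
# Hypothetical helper (not from the repo): decide whether the image smoke
# build is needed. Reads changed paths on stdin, one per line.
needs_smoke() {
  local path
  while IFS= read -r path; do
    case "$path" in
      # Paths named in 1.6 as the smoke gate; `*` also matches nested paths.
      Docker/*|Make/Matrix.json|Make/Version.json) return 0 ;;
    esac
  done
  return 1
}

# Hypothetical aggregator rule: pass only when every job result is
# success or skipped; failure, cancelled, or anything else fails.
results_pass() {
  local result
  for result in "$@"; do
    case "$result" in
      success|skipped) ;;
      *) return 1 ;;
    esac
  done
  return 0
}
```

A caller could feed the gate from something like `git diff --name-only <base> <head> | needs_smoke`, where the two revisions are whatever the CI run treats as the before and after commits (an assumption, not specified by the plan).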
- -### 1.7 Branch hygiene / backlog (messy) -`git branch -a` shows a substantial backlog of remote branches that must be reconciled or pruned before/around go-live: -`codegen`, `codegen-main`, `codegen-develop` (codegen working branches - expected, transient), `dependabot/...` on BOTH main and develop -(3 live), plus stragglers: `backport-cicd-fixes`, `bump-version-2.13`, `chore/sync-template`, `feature/sync-versioned-rulesets`, -`fix-lsio-puid-pgid-ordering`, `fix-release-version-tag-race`, `fix/release-skip-log-message`, `fix/release-tag-pinning-and-skip-existing`, -`propagate-versioning-policy`, `realign-template-lint-config`, `release-notes-2.14`, `shields`. Several look like prior CI-fix attempts. -Action: audit each before go-live; the migration branch should supersede the relevant `fix-release-*` / `*cicd*` / `*versioning*` -branches, and those should be closed (not merged) to avoid re-introducing superseded mechanics. Local-only branches not on origin -(`fix/release-skip-log-message`, `release-notes-2.14`) can be deleted locally. - -### 1.8 EOL / prose -All 10 workflow files are CRLF (verified by `grep -c $'\r'`). `AGENTS.md` has 0 em-dashes; `README.md` has 1 em-dash (must be swept). -`.slnx` is **stale**: it references LanguageTags-shaped workflow files that do not exist here (`build-executable-task.yml`, -`build-library-task.yml`, `build-release-task.yml`, `publish-periodic-docker-release.yml`) and omits files that DO exist -(`build-base-images-task.yml`, `build-docker-task.yml`). Must be rebuilt to the real file set (gotcha: do not mirror LanguageTags' .slnx). - ---- - -## 2. Target architecture - -The target keeps NxWitness's legitimately repo-specific build layer (shared base + per-product matrix) while converging the -orchestration, governance, and classification model. 
Per-file target: - -### 2.1 `test-pull-request.yml` (CI) -> converge to push-on-every-branch self-test -- **Trigger:** `push: branches: ['**']` (NOT `pull_request`) + `workflow_dispatch`. Rationale: the canonical model self-tests by pushing - the branch; reusable `./...` logic resolves from the head; the aggregator's ruleset-bound context has a single producer (the push run). - This is also required for the Dependabot-in-repo-branch path to produce the required check. -- **Concurrency:** `group: ${{ github.workflow }}-${{ github.ref }}`, `cancel-in-progress: true`. -- **Branch-deletion guard (gotcha 4):** every job `if: ${{ !github.event.deleted }}`; aggregator `if: ${{ always() && !github.event.deleted }}`. -- **Drop `dorny/paths-filter` (gotcha 12).** Two options for the smoke gate, FLAGGED for the maintainer (open question 9.1): - - (A) Always run a minimal smoke (NxMeta + NxMeta-LSIO, amd64, no push) on every branch push. Simplest, matches PlexCleaner's - unconditional smoke; costs a docker build on every doc-only push. - - (B) Replace paths-filter with a cheap inline `git diff --name-only` step inside a single `changes` job (no third-party action) to - keep the "only build images when image files changed" optimization. Preserves today's behavior without the dropped action. - - **Recommendation: (B).** NxWitness pushes are frequent (codegen + 6 Dependabot ecosystems x 2 branches) and a full product smoke is - heavier than PlexCleaner's single-target smoke; keep the change-gate but implement it inline to honor "strip paths-filter." -- **Jobs:** `validate` (lint + `dotnet test`, was `test-release-task.yml`), `changes` (inline diff, if option B), `smoke-build` - (calls `build-docker-task.yml` smoke), `check-workflow-status` (aggregator), `cleanup-artifacts`. -- **Aggregator name MUST become exactly `Check pull request workflow status job`** (add trailing ` job`) to match the canonical - required-check string used by `repo-config` (gotcha 7). 
Keep the success|skipped allowlist (gotcha 8) it already has; for `changes` keep - the "must succeed" semantics (a failed `changes` must not let an image-changing PR through as a skip). -- **Smoke `branch` input:** today passes `github.base_ref` (a PR concept). Under `push`, there is no base_ref. Pass `github.ref_name` - so a push to develop validates develop's Matrix rows and a push to main validates main's (matches the build task's branch filter). - -### 2.2 `validate-task.yml` (NEW, rename of `test-release-task.yml`) -- `workflow_call` + `workflow_dispatch`. Jobs: dotnet restore/tool-restore, husky lint, `dotnet test`, plus the static doc validators - if the canonical validate carries them (markdownlint/cspell scoped to README+HISTORY per LanguageTags convention). Name the aggregated - job `Validate job` per canonical. This is the CI's quality gate; it does not build images. - -### 2.3 `publish-release.yml` (Publisher) - DECIDED (triggered-Docker, one-branch-per-run) -**Decided shape (signed off 2026-06-29): triggered-Docker.** Triggers: `workflow_dispatch` + `schedule` -(weekly Mon 02:00, main baseline) **+ path-scoped `push` on main when the codegen matrix changes** -(`push: { branches: [main], paths: [ ] }`). -Keep: global non-ref-scoped concurrency (`group: ${{ github.workflow }}`, `cancel-in-progress: false`); the -`Verify public release version` D2.2 backstop on the main release; the skip-existing-release guard; the -artifact cleanup job. - -**Run shape: one-branch-per-run, schedule is main-only.** -- Schedule -> builds `main` only (full product matrix; baseline base/CVE refresh + versioned release). -- Push-on-`Matrix.json`-change (main) -> builds `main` (codegen committed a new matrix => publish the new - product versions immediately; this is the accepted publish-on-matrix-change, superseding the earlier - weekly-only decision). Only the matrix file change publishes; ordinary code merges do not (path filter). 
-- Dispatch -> builds `github.ref_name`, guarded `if: github.ref_name == 'main' || github.ref_name == 'develop'`. **`:develop` is refreshed by manual dispatch only** (the earlier "paired develop re-dispatch to - refresh :develop weekly" is DROPPED per maintainer: weekly builds main only). -- Jobs: `get-version` (ref: `github.ref_name`), `build-base` (push: true, ref: `github.ref_name`), - `build-docker` (push: true, branch: `github.ref_name`, ref: pinned to `get-version.outputs.GitCommitId` - for main / `github.ref_name` for develop), `github-release` (`if: github.ref_name == 'main'` - a develop - dispatch publishes images + the `:develop` tag but cuts no versioned GitHub release), `docker-readme` - (main only), `cleanup-artifacts`. -- NBGV classifies natively: main schedule/push/dispatch => clean version; a develop dispatch => prerelease. - No `IGNORE_GITHUB_REF`, no cross-branch leg. The D2.2 backstop stays as defense-in-depth. The push trigger - is branch-filtered to main, so develop's daily codegen matrix update is sync-only and never publishes. -- **Base-image sharing under one-branch:** the shared `nx-base` tag (`:ubuntu-noble`) is branch-agnostic. A - develop dispatch rebuilding the base would overwrite the shared tag with develop's base. Recommendation: - the develop dispatch sets `build_base: false` and pulls the main-built shared base; the weekly main - schedule refreshes the base for CVEs. If base divergence between branches is a real risk, FLAG (open - question 9.2). - -**Cost note:** publish-on-matrix-change rebuilds the full product matrix on every codegen bump (the -maintainer accepted this over the cheaper weekly-only, for the tightest upstream-vuln window). The matrix -build keeps `max-parallel: 4` and branch-scoped buildcache to bound runner time. 
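The run shape in 2.3 reduces to a small decision table, sketched below under stated assumptions. The function name `publish_plan` and its output format are hypothetical. `build_base=false` on develop reflects the recommendation above, not a confirmed setting. The real guards live in workflow `if:` expressions, not in shell.

```shell
# Hypothetical decision table for one publisher run (illustration only).
# $1 = triggering event, $2 = branch name (github.ref_name).
publish_plan() {
  local event="$1" ref_name="$2"
  case "$event" in
    schedule|push)
      # Schedule and the path-scoped push are main-only in the decided shape.
      [ "$ref_name" = "main" ] || return 1 ;;
    workflow_dispatch)
      # Dispatch is guarded to main or develop.
      [ "$ref_name" = "main" ] || [ "$ref_name" = "develop" ] || return 1 ;;
    *)
      return 1 ;;
  esac
  if [ "$ref_name" = "main" ]; then
    # main: versioned GitHub release plus shared base refresh.
    echo "branch=main release=true build_base=true"
  else
    # develop: images and the :develop tag only; reuses the main-built base
    # (recommended default, assumed here).
    echo "branch=develop release=false build_base=false"
  fi
}
```

For example, `publish_plan workflow_dispatch develop` prints `branch=develop release=false build_base=false`, while a dispatch from any other branch returns non-zero and builds nothing.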
- -### 2.4 `build-docker-task.yml` (repo-owned build layer) - keep, with NBGV consolidation -- Keep the `get-matrix` job (smoke filter / branch filter / full), the product matrix over `.Images`, the multi-arch build, the - branch-scoped registry buildcache (read both `buildcache-main` + `buildcache-develop`, write only this branch on push). -- **NBGV (gotcha 2):** remove the nested `get-version` job; instead accept threaded inputs `semver2` (and assembly versions if the image - embeds them) as REQUIRED workflow_call inputs, passed by the orchestrator's single `get-version` run. The orchestrator computes the - version for the branch being built and threads it down. This matches PlexCleaner's `build-docker-task` (version threaded, never - re-run). Smoke callers can pass a placeholder/threaded smoke version. (If the maintainer prefers each leg to label with its own branch - version under shape 3.2, the orchestrator runs get-version per leg and threads each leg's value - still single-NBGV-per-leg, no nested - re-run.) -- Keep `max-parallel: 4` on the product matrix. - -### 2.5 `build-base-images-task.yml` - keep verbatim (repo-owned) -Shared `nx-base` / `nx-base-lsio` matrix, branch-scoped buildcache + inline cache. The `ref` input lets the publisher build from main. -Under shape 3.1, the develop run sets `build_base: false`. - -### 2.6 `get-version-task.yml` - keep; conditionally add IGNORE_GITHUB_REF -Single NBGV run. Under shape 3.1: NO `IGNORE_GITHUB_REF` (native classification). Under shape 3.2: ADD `env: IGNORE_GITHUB_REF: "true"`. - -### 2.7 Codegen (`run-codegen-pull-request-task.yml` + `run-periodic-codegen-pull-request.yml`) - keep, dual-target (gotcha 11) -Matrix runs codegen on main AND develop, opens `codegen-main->main` and `codegen-develop->develop` PRs via the App token (so -`pull_request`/push events fire), CSharpier formats, merge-bot auto-merges each independently. This is the canonical dual-branch codegen -case the briefing's gotcha 11 protects. 
KEEP both targets. The daily schedule (04:00) is staggered after the weekly publish (Mon 02:00). -Note: the codegen task today runs ONLY `matrix --updateversion` (regenerates `Make/Version.json` + `Make/Matrix.json`); it does NOT run -`make` (Dockerfiles/compose are regenerated by a human via `Make/Create.sh`). Document this seam in WORKFLOW.md (S-section): the codegen -PR keeps the matrix current with upstream Nx versions; Dockerfile changes are a separate human-driven path. FLAG (open question 9.3): -should codegen also run `make` so new upstream versions auto-regenerate Dockerfiles? Current answer: no - Dockerfile structure changes -are reviewed; only version data auto-updates. Recommend keeping as-is. - -### 2.8 `merge-bot-pull-request.yml` - add `--delete-branch` (gotcha 5) -Add `--delete-branch` to the `gh pr merge --auto "$method"` calls in BOTH `merge-dependabot` and `merge-codegen` jobs (NOT to the -disable-auto job). Keep repo-wide `delete_branch_on_merge: false` in `settings.json` (gotcha 5 + github-auto-delete-branch-gotcha: -prevents a develop->main promotion from deleting develop). Per-merge deletion is explicit. Keep the per-base method case -(develop=squash, main=merge), the major-NuGet skip, the strict codegen head/base pairing, and `pull_request_target` + App-token model. - -### 2.9 Docker Hub overview - fold into the docker task (gotcha 12) -Delete `publish-docker-readme-task.yml` as a standalone and add a `peter-evans/dockerhub-description` step that runs ONLY on a main -publish. Because NxWitness has 12 repos, the fold must iterate the repo list. Cleanest: a small `docker-readme` job inside -`publish-release.yml` gated `if: github.ref_name == 'main'`, deriving the repo list from `Make/Matrix.json` via the existing -`manifest-jq` (`[.Images[].Name | ascii_downcase | "ptr727/\(.)"] + ["ptr727/nx-base","ptr727/nx-base-lsio"] | sort | unique`) and -matrixing `peter-evans/dockerhub-description` over it. 
This keeps the behavior but removes the standalone reusable file the briefing -says to strip. (If the maintainer prefers to keep the reusable task file for clarity given 12 repos, that is a defensible per-repo -deviation - document it. Recommendation: fold, to match the canonical strip.) - -### 2.10 `build-datebadge-task.yml` - DELETE (gotcha 12) -Remove the file and the `date-badge` job from `publish-release.yml`, and strip the badge from README if it points at the BYOB gist. - -### 2.11 `WORKFLOW.md` (NEW) - port from PlexCleaner, adapt for codegen + multi-image -Structure mirrors PlexCleaner: model-at-a-glance, glossary, architecture, the D0..D10 behavioral contract, 5-test methodology (5A static, -5B trace scenarios S1..Sn, 5C live probe, 5D config audit), repository configuration. NxWitness-specific additions: -- D-guarantees for the **codegen seam**: codegen dual-targets main+develop; codegen PR regenerates `Version.json`+`Matrix.json` only; - forward-only version guard (`ReleaseVersionForward`) prevents generic-tag regression; merge-bot auto-merges each codegen PR. -- D-guarantees for the **multi-image matrix**: shared base built once and reused; product matrix from `Matrix.json`; per-product Docker - Hub repos + base repos; multi-arch amd64+arm64; branch-scoped buildcache; weekly base refresh for CVEs. -- The NBGV-classification guarantee adapted to the chosen shape (3.1 native vs 3.2 IGNORE_GITHUB_REF), with the D2.2 backstop guarantee. -- A "Template adaptations" appendix documenting the legitimate divergences (shared-base fan-out, Docker-only release with no - release-asset files, folded docker-readme, codegen replacing merge-upstream-version). - -### 2.12 `repo-config/` (NEW) - port from PlexCleaner verbatim, retarget strings -- `configure.sh`: copy PlexCleaner's verbatim (helper bodies `jq_lacks`, `check_secrets`, `ruleset_id`, `check_app`, `assert`, `pass`, - `fail`, `note`, `apply_ruleset`, `cmd_apply`, `cmd_check`). 
Retarget: `REQUIRED_CHECK="Check pull request workflow status job"`, - `REQUIRED_ACTIONS_SECRETS`/`REQUIRED_DEPENDABOT_SECRETS` = `(DOCKER_HUB_USERNAME DOCKER_HUB_ACCESS_TOKEN CODEGEN_APP_CLIENT_ID - CODEGEN_APP_PRIVATE_KEY)` (identical set, both stores - gotcha 3), and the manual-verify note to enumerate the 12 NxWitness Docker Hub - repos (or note "push to docker.io/ptr727/"). `cmd_check` order: ruleset develop (squash, linear), - ruleset main (merge, non-linear), settings, security, secrets, app. -- `ruleset-develop.json`: condition `refs/heads/develop`; rules `deletion`, `non_fast_forward`, `required_linear_history`, - `required_signatures`, `pull_request` (allowed_merge_methods `["squash"]`, `required_review_thread_resolution: true`, - `dismiss_stale_reviews_on_push: true`, approvals 0), the required status check context - `Check pull request workflow status job` (integration_id 15368), `copilot_code_review` (review_on_push true). -- `ruleset-main.json`: condition `refs/heads/main`; SAME minus `required_linear_history` (must allow the develop->main merge commit); - `allowed_merge_methods: ["merge"]`; same status check + copilot rules. -- `settings.json`: `{ allow_squash_merge true, allow_merge_commit true, allow_rebase_merge false, allow_auto_merge true, - delete_branch_on_merge false }`. -- `repo-config/README.md`: port PlexCleaner's; retarget the repo slug (`ptr727/NxWitness`), the secret set, the Docker Hub repo - enumeration, and the required-check lockstep note. 
- -### 2.13 `.slnx` - rebuild to the real file set -Replace the stale LanguageTags-shaped file list with the actual files: workflows -`build-base-images-task.yml`, `build-docker-task.yml`, `get-version-task.yml`, `merge-bot-pull-request.yml`, `publish-release.yml`, -`run-codegen-pull-request-task.yml`, `run-periodic-codegen-pull-request.yml`, `test-pull-request.yml`, `validate-task.yml`; plus -`CreateMatrix.csproj` + `CreateMatrixTests.csproj` projects and the solution items (`WORKFLOW.md`, `version.json`, etc.). Remove the -deleted `build-datebadge-task.yml` / `publish-docker-readme-task.yml` / non-existent template names. - -### 2.14 Release artifact -None as a file. The GitHub release carries auto source zip + README + LICENSE only (Docker-only repo; `fail_on_unmatched_files` omitted). -The published artifacts are the 12 Docker Hub repos' multi-arch images. `:latest`/`:stable` from main, `:develop` from develop, plus -`:` and `:develop-` tags from `Matrix.json`. - ---- - -## 3. Release model decision (signed off 2026-06-29) - -**Decision: triggered-Docker, one-branch-per-run.** Publish on `weekly schedule (main)` + -`push-on-Matrix.json-change (main)` + `workflow_dispatch`. Schedule is **main-only**; `:develop` refreshes by -manual dispatch only. - -**How it reconciles the codegen cadence:** -- Daily codegen keeps `Matrix.json`/`Version.json` current on **main AND develop** (dual-target, gotcha 11 - - drift-avoidance only; develop's update is sync-only and never publishes). -- When codegen commits a new matrix **to main**, the path-scoped push publishes the new product versions - immediately (the accepted **publish-on-matrix-change**, superseding the earlier weekly-only decision - the - maintainer accepted the full-matrix rebuild cost for the tightest upstream window). -- The weekly main schedule still runs even with no matrix change, to refresh the shared base image for CVEs - and re-cut from the current pin. 
-- A maintainer wanting an off-cycle or develop-channel build dispatches the publisher from the branch. - Ordinary code merges never publish (only the matrix file path triggers). - -**Why one-branch (not the old two-leg combined run):** each publish run is single-branch, so NBGV classifies -natively - no `IGNORE_GITHUB_REF`, no cross-branch leg, none of the `nbgv-publicrelease-githubref-leak` class -of bugs - and it converges NxWitness's publisher with the four live repos and the ESPHome triggered-Docker -sub-model (identical shape, only the trigger path file differs: `Matrix.json` here, `upstream-version.json` -there). The D2.2 `Verify public release version` backstop stays as defense-in-depth. - -**Confirm during execution:** the exact repo-relative path of the codegen matrix output for the push -`paths` filter (e.g. `CreateMatrix/Matrix.json` vs `Make/Matrix.json`), and that codegen writes it on the -main branch directly (or via an auto-merged `codegen-main` PR whose merge is the publishing push). - ---- - -## 4. Files to create / edit / delete - -### Create (7) -1. `WORKFLOW.md` (port from PlexCleaner + codegen/multi-image D-guarantees). -2. `repo-config/configure.sh` (verbatim helpers, retargeted secrets/check/repo). -3. `repo-config/ruleset-main.json`. -4. `repo-config/ruleset-develop.json`. -5. `repo-config/settings.json`. -6. `repo-config/README.md` (port + retarget). -7. `.github/workflows/validate-task.yml` (rename of `test-release-task.yml`, canonical `Validate job`). - -### Edit (10) -1. `.github/workflows/test-pull-request.yml` - trigger `push: ['**']`+dispatch; drop paths-filter (inline diff, option B); - `!github.event.deleted` guards; aggregator rename to `Check pull request workflow status job`; smoke `branch: github.ref_name`; - call `validate-task.yml`. -2. 
`.github/workflows/publish-release.yml` - one-branch-per-run (shape 3.1): single get-version/build-base/build-docker per run, main-only - release+readme, develop re-dispatch for the develop channel; remove the `date-badge` job; fold docker-readme as a main-only job. -3. `.github/workflows/build-docker-task.yml` - remove nested `get-version`, accept threaded `semver2` (+ assembly versions) inputs. -4. `.github/workflows/get-version-task.yml` - keep single NBGV; (shape 3.2 only) add `IGNORE_GITHUB_REF`. Under 3.1 unchanged. -5. `.github/workflows/merge-bot-pull-request.yml` - add `--delete-branch` to both merge jobs. -6. `.github/dependabot.yml` - verify the 6 ecosystem x 2 branch entries cover the actions used in new/edited workflows; keep dual-target. -7. `AGENTS.md` - add `### Comments` subsection + `## Documentation Style Conventions` (converge with PlexCleaner); add a reference to - `WORKFLOW.md` + `repo-config/`; refresh the "Template adaptations" section to match the chosen publisher shape and the folded - docker-readme / dropped date-badge. -8. `CODESTYLE.md` - converge shared headings with PlexCleaner (diff and align General/.NET structure; keep NxWitness specifics). -9. `.github/copilot-instructions.md` - retarget owner/name strings in the Review Runbook (`ptr727/NxWitness`); otherwise verbatim. -10. `NxWitness.slnx` - rebuild to the real file set. - (Plus: `README.md` em-dash sweep + strip date badge; `version.json` floor bump to mark the overhaul and exercise the publish path - - reconcile with the AGENTS "routine edits leave version.json untouched" rule by noting a deliberate maintainer-directed infra bump.) - -### Delete (2) -1. `.github/workflows/build-datebadge-task.yml`. -2. `.github/workflows/publish-docker-readme-task.yml` (folded into `publish-release.yml`). - -Net: 7 create, ~12 edit (incl. README + version.json), 2 delete. (Counts exclude the branch-backlog cleanup in 1.7.) - ---- - -## 5. 
Convergence + backports - -### 5.1 Port VERBATIM (byte-for-byte, owner/name strings only) -- `repo-config/configure.sh` helper bodies (`jq_lacks`, `check_secrets`, `ruleset_id`, `check_app`, `assert`, `pass`/`fail`/`note`, - `apply_ruleset`) - the hardened canonical forms (gotcha 6). -- `repo-config/ruleset-*.json` structure (only condition/merge-method/linear-history differ between main and develop, already canonical). -- `repo-config/settings.json` (identical to PlexCleaner). -- `.github/copilot-instructions.md` Review Runbook (only `ptr727/NxWitness` substitutions; bot id `BOT_kgDOCnlnWA`, the requestReviews - mutation, GraphQL-vs-REST login split, known-broken `POST /requested_reviewers` note all carry verbatim). -- AGENTS.md shared subsections: `### Comments`, `## Git and Commit Rules`, the "Where rules live" lead-in, `## PR Review Etiquette` - (already present and canonical here), `## Documentation Style Conventions` incl. "write docs in the current state". - -### 5.2 Adapt (repo-specific) -- `configure.sh` required-secret list (same 4 names but the manual Docker Hub note enumerates 12 NxWitness repos) and `REPO` slug. -- `WORKFLOW.md` Docker mechanics + the NEW codegen and multi-image D-guarantees and the "Template adaptations" appendix. -- AGENTS.md "Template adaptations" and the codegen/Image-Architecture sections. -- `dependabot.yml` (6 ecosystems incl. docker, dual-target - already adapted). - -### 5.3 Backports to the four live repos (drift found) -- Confirm all four live repos' `merge-bot` use `--delete-branch` (gotcha 5). NxWitness lacked it; the others were noted to have it - - spot-check and backport if any regressed. -- If NxWitness's `configure.sh` helpers (ported from PlexCleaner) reveal any newer hardening than what LanguageTags/Utilities carry, - backport the hardened helper to those NuGet repos (they share the helper bodies verbatim). 
-- The folded docker-readme pattern (main-only `dockerhub-description` step) should match PlexCleaner's approach; if PlexCleaner kept a - reusable file vs an inline step, align NxWitness to whichever the maintainer blessed as canonical (PlexCleaner stripped the standalone - - fold here too). - ---- - -## 6. Gotcha checklist mapped to NxWitness - -1. **NBGV GITHUB_REF classification.** APPLIES. Under shape 3.1 (recommended) github.ref matches the built branch -> native - classification, no IGNORE_GITHUB_REF. Under shape 3.2 (combined two-leg) IGNORE_GITHUB_REF is REQUIRED. `version.json` floor 2.14, - `publicReleaseRefSpec ^refs/heads/main$`. Bump the floor to exercise the publish path. -2. **NBGV threading.** APPLIES. Today `build-docker-task.yml` re-runs NBGV via a nested `get-version`. Fix: remove the nested job, thread - `semver2` (+ assembly versions) from the orchestrator's single get-version run. This also removes the `:SemVer2`-tag-collision risk - across the image matrix (one classification feeds all product legs). -3. **Docker creds in BOTH stores.** APPLIES. `DOCKER_HUB_USERNAME` + `DOCKER_HUB_ACCESS_TOKEN` must be in Actions AND Dependabot stores - (Dependabot push CI smoke-builds and logs in to Docker Hub). `configure.sh` enforces both. Same for `CODEGEN_APP_*` (merge-bot). -4. **Branch-deletion guard.** APPLIES ONCE we move CI to `push: ['**']`. Add `!github.event.deleted` to every CI job + aggregator. n/a - while the trigger stays `pull_request` (no such event), but the move to push makes it mandatory. -5. **merge-bot `--delete-branch`.** APPLIES. Missing today; add to both merge jobs. Repo-wide auto-delete stays OFF in settings.json. -6. **5D audit hardened helpers.** APPLIES. Port the final hardened `jq_lacks` (exit 4 = lacks; keep stderr), `check_secrets` - (API error FAILs, both stores, paginate), `ruleset_id` (`first // empty` in jq, no `head -1`, let gh print error), `check_app` - (best-effort note, never fails). 
Audit must fail when it cannot verify. -7. **Required-check name lockstep.** APPLIES + ACTIVE BUG. The aggregator is named `Check pull request workflow status` (missing - ` job`). The required-check string, the aggregator job `name:`, and the ruleset JSON must all read `Check pull request workflow status - job`. Run `configure.sh apply` in the same change that ships the workflow edit, then `check`. -8. **Aggregator success/skipped allowlist.** APPLIES; already correct (success|skipped pass, failure|cancelled fail). Keep the `changes` - "must succeed" carve-out so an image-changing PR cannot merge on a `changes` failure treated as skip. -9. **EOL discipline.** APPLIES. All workflows are CRLF today; keep CRLF for md/yml/json/code-workspace/slnx, LF for .sh/Dockerfile/.py. - Pin in `.gitattributes`/`.editorconfig` (present - verify they cover `.slnx`). Re-check after Write/Edit (they can flip CRLF to LF); - verify with `grep -c $'\r'` not `file`. -10. **Copilot review loop.** APPLIES. Runbook already in `.github/copilot-instructions.md`. snupkg/OIDC/NBGV-prerelease false positives; - decline with rationale. Expect 1-3 rounds. Use `gh api -X PATCH .../pulls/N -F body=@file` for body edits. -11. **Dependabot + codegen dual-target main AND develop.** APPLIES - this is the canonical case. Both `dependabot.yml` and - `run-codegen-pull-request-task.yml` already dual-target. KEEP both; do not collapse to single-target (maintainer rejected it; it - caused non-linear rebase/merge-block conflicts). -12. **Strip template cruft.** APPLIES. Delete `build-datebadge-task.yml`; fold `publish-docker-readme-task.yml` into a main-publish step; - drop `dorny/paths-filter`; there is no `setup`/`PUBLISH_ON_MERGE` machinery here (already absent); merge-bot already omits - `merge-upstream-version` (codegen replaces it) - keep that. -13. **Action SHAs.** APPLIES. 
Current pins look converged (setup-dotnet v5.4.0 `26b0ec14...`, checkout v7.0.0 `9c091bb2...`, - create-github-app-token v3.2.0, docker actions v4.x/v7.2.0, softprops v3.0.1). VERIFY every SHA->version against the GitHub API before - asserting in review; do not trust Copilot's SHA/version mapping. -14. **Prose rules.** APPLIES. No em-dashes (README has 1 - sweep it). US English. Terse comments, one line if <~120 cols, top-of-file - workflow summaries (present). Never edit human-authored comments. - ---- - -## 7. Verification - -### 7.1 Static (local, before push) -- `actionlint` on all workflows (Docker image or npx). -- `markdownlint-cli2` on `WORKFLOW.md`, `AGENTS.md`, `CODESTYLE.md`, `README.md`, `repo-config/README.md`. -- `cspell` (scope = README + HISTORY per convention; add product/codegen terms to the dictionary as needed). -- YAML + JSON parse: every workflow, `Make/Matrix.json`, `Make/Version.json`, `version.json`, the three `repo-config/*.json`, - `.slnx` well-formedness. -- `bash -n repo-config/configure.sh` and `shellcheck` (the helpers carry `# shellcheck disable` directives - preserve them). -- `dotnet build` + `dotnet test` (CreateMatrixTests) green; `dotnet csharpier --check` + `dotnet husky run` clean. -- EOL audit: `grep -c $'\r'` to confirm CRLF on md/yml/json/code-workspace/slnx and LF on .sh/Dockerfile (gotcha 9). -- Em-dash sweep: `grep -rn '—'` across the tree (expect 0 after the README fix). -- Token sweep for stale template references: `LanguageTags|ProjectTemplate|build-executable-task|build-library-task| - build-release-task|publish-periodic-docker-release|datebadge|paths-filter|PUBLISH_ON_MERGE` (expect only intentional historical mentions). -- Codegen smoke: `dotnet run --project ./CreateMatrix -- matrix --versionpath=./Make/Version.json --matrixpath=/tmp/m.json - --updateversion` against a copy, confirm it still produces a valid Matrix.json (and that the forward-only guard holds). 
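The EOL audit in 7.1 can be scripted rather than checked by eye. The sketch below is a minimal example, assuming the policy stated in gotcha 9 (CRLF for md/yml/json/code-workspace/slnx, LF for .sh/Dockerfile/.py). The helper name `eol_ok` is hypothetical. It counts carriage returns with `grep -c`, as the plan recommends, and builds the CR with `printf` so it does not depend on bash-only `$'\r'` quoting.

```shell
# Hypothetical helper (not from the repo): succeed when a file's line
# endings match the EOL policy for its extension.
eol_ok() {
  local file="$1" cr crlf total
  cr=$(printf '\r')
  # grep -c prints 0 and exits non-zero when nothing matches; keep the count.
  crlf=$(grep -c "$cr" "$file" || true)
  total=$(grep -c '' "$file" || true)
  case "$file" in
    # CRLF files: every line must carry a carriage return.
    *.md|*.yml|*.json|*.code-workspace|*.slnx) [ "$crlf" -eq "$total" ] ;;
    # LF files: no line may carry a carriage return.
    *.sh|*.py|*Dockerfile) [ "$crlf" -eq 0 ] ;;
    # Anything else is outside the stated policy; do not flag it.
    *) return 0 ;;
  esac
}
```

Run it over the tracked files (for instance in a loop over `git ls-files`) after any tool-driven edit, since the plan notes that Write/Edit operations can flip CRLF to LF.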
-
-### 7.2 Config audit
-- `repo-config/configure.sh check` BEFORE `apply`: expect drift (no ruleset yet, required-check name mismatch, possibly missing secrets
-  in one store). Document the expected drift list.
-- `repo-config/configure.sh apply` then `check`: expect "Configuration matches" (modulo the App best-effort note and the manual Docker
-  Hub push note).
-
-### 7.3 Live dispatch verification (post-merge)
-- Dispatch `publish-release.yml` from `main`: confirm clean `X.Y.` images on the 12 repos, `:latest`/`:stable` tags, multi-arch
-  manifest (amd64+arm64), the versioned GitHub release, the Docker Hub overview pushed, NO prerelease `-` on the main release.
-- Dispatch from `develop`: confirm `:develop` + `:develop-` tags, prerelease classification (`X.Y.-g` label), NO
-  versioned GitHub release.
-- Confirm the shared base tags (`nx-base:ubuntu-noble`, `nx-base-lsio:ubuntu-noble`) are intact and not overwritten by a develop run.
-- Trigger codegen via dispatch: confirm both `codegen-main->main` and `codegen-develop->develop` PRs open and merge-bot auto-merges with
-  branch deletion.
-
----
-
-## 8. Go-live sequence
-
-1. Branch `migrate/branch-scoped-cicd` off `develop`. Verify SSH signing is live before the first commit (committing is enabled here).
-2. Apply all create/edit/delete (section 4). Run the full static + codegen verification (7.1) locally.
-3. Push the branch (CI now runs via `push: ['**']`). Open PR -> `develop`.
-4. Copilot dance (gotcha 10): poll for auto-review, re-request via `requestReviews` mutation after each push, resolve every thread,
-   decline false positives (snupkg/OIDC n/a here; NBGV-prerelease, the IGNORE_GITHUB_REF presence/absence, and the two-leg-vs-one-branch
-   choice are the likely debate points) with rationale. Budget 1-3 rounds.
-5.
`repo-config/configure.sh apply` against the live repo IN THE SAME change window (gotcha 7): this writes the rulesets, renames the
-   required check to `Check pull request workflow status job` in lockstep with the workflow edit, and adds the Copilot rule - unblocking
-   the PR. Then `configure.sh check` -> matches.
-6. Squash-merge to `develop`. Confirm CI green on develop.
-7. Promote `develop -> main` via a merge-commit PR, Copilot-reviewed, NO admin bypass (main ruleset allows the merge commit by omitting
-   `required_linear_history`). Watch for the migration-promotion-conflict pattern if main has straggler bumps in rewritten files; if it
-   bites, a local signed merge commit (tree=develop) is the documented escape, but try the normal PR first.
-8. Dispatch `publish-release.yml` from `main` to verify the publish path end-to-end (7.3). Then dispatch from `develop` to verify the
-   develop channel.
-9. Confirm `develop` survives the promotion (github-auto-delete-branch-gotcha: delete_branch_on_merge stays OFF).
-10. Prune the branch backlog (1.7): close superseded `fix-release-*` / `*cicd*` / `*versioning*` / `chore/sync-template` /
-    `release-notes-2.14` / `shields` branches (do NOT merge them - the migration supersedes their mechanics); delete merged dependabot
-    branches; let codegen branches recreate themselves on the next daily run.
-
----
-
-## 9. Open questions for the maintainer (with recommended defaults)
-
-1. **Smoke gate (2.1).** Always-smoke (A) vs inline-diff change gate (B)? **Default: B** - NxWitness has many frequent pushes and a
-   full product smoke is heavier than PlexCleaner's single-target smoke; keep the change gate but implement inline (drop paths-filter).
-2. **Base-image sharing under one-branch (2.3).** Build the base only on the main run and have develop reuse the published shared tag, or
-   build per-branch?
**Default: build on main, develop reuses (`build_base: false`)** - the `:ubuntu-noble` base tag is branch-agnostic; a
-   develop rebuild would churn the shared tag. Confirm base Dockerfiles never diverge between branches.
-3. **Codegen scope (2.7).** Should codegen also run `make` to auto-regenerate Dockerfiles/compose on a new upstream version, or stay
-   version-data-only? **Default: stay version-data-only** - Dockerfile structure changes warrant human review; the matrix data is the
-   safe-to-automate part.
-4. **Publisher shape (3).** One-branch-per-run + develop re-dispatch (3.1) vs hardened combined two-leg run with IGNORE_GITHUB_REF (3.2)?
-   **Default: 3.1** - converges with the four live repos and deletes the cross-branch NBGV-leak class. 3.2 is acceptable only if the
-   maintainer specifically wants a single combined weekly run; then harden with IGNORE_GITHUB_REF + the D2.2 backstop and document the
-   divergence.
-5. **version.json floor bump.** Bump from 2.14 to mark the overhaul and exercise the publish path? **Default: yes (deliberate
-   maintainer-directed infra bump)**, reconciled in AGENTS.md so it does not contradict the "routine edits leave version.json untouched"
-   rule.
-6. **Docker-readme fold (2.9).** Fold into a main-only `publish-release.yml` job (canonical strip) vs keep a reusable task file given the
-   12-repo list? **Default: fold** - matches PlexCleaner's strip; the manifest-jq derivation moves inline.