From 43b8ad99ad139e20e49f4ac3fd5ad815c9fdcea1 Mon Sep 17 00:00:00 2001
From: ccross
Date: Thu, 28 May 2026 14:51:04 -0400
Subject: [PATCH] chore(packaging): close `sha256sums=SKIP` TODO + add ADR 012 + update STRATEGY for v0.3.x
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

`packaging/aur/PKGBUILD` has shipped `sha256sums=('SKIP')` since v0.1.0
(and a `# TODO: compute real sha256sum at release time` reminder). With
v0.3.3 now published as a GitHub Release, the tarball URL resolves
deterministically:

  https://github.com/sovren-software/visage/archive/refs/tags/v0.3.3.tar.gz

Fetched + hashed:
`e018fcc08dbb3aba381306424fc1fd94eaddc0a5da0d47437f17487f29b76f99`
(199,300 bytes).

PKGBUILD now declares the real hash + a bump-procedure comment for
future maintainers (compute via curl ... | sha256sum; must re-compute on
every pkgver bump).

Closes the v0.1.0-era TODO. `makepkg` will now reject any tampered or
corrupted download — the AUR integrity gate is finally active for
visage.

Trade-off: GitHub's git-archive output has changed compression behavior
in the past (2023); if it changes again the hash will mismatch and AUR
users will see an integrity error instead of a silent install.
Operationally noisy but better than the prior SKIP state.

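For reference, the re-verification on a pkgver bump could look like the
sketch below. It is a minimal illustration, not part of the PKGBUILD: the
`verify_sha256` helper name and its arguments are hypothetical, and it
assumes GNU coreutils `sha256sum` and a tarball that has already been
downloaded to disk.

```shell
#!/usr/bin/env bash
# Sketch only: compare a local file's SHA-256 against a pinned value.
# `verify_sha256` is a hypothetical helper, not something the PKGBUILD ships.
set -euo pipefail

verify_sha256() {
    local file="$1" expected="$2" actual
    # sha256sum prints "<hash>  <filename>"; keep only the hash field.
    actual="$(sha256sum "$file" | cut -d' ' -f1)"
    if [[ "$actual" == "$expected" ]]; then
        echo "OK: $file matches the pinned hash"
    else
        echo "MISMATCH: $file (expected $expected, got $actual)" >&2
        return 1
    fi
}
```

Running `verify_sha256 visage-0.3.3.tar.gz <pinned hash>` after a manual
download would approximate the check `makepkg` performs against
`sha256sums`.
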
Adds ADR 012 (`docs/decisions/012-post-launch-stabilization-v0.3.2-v0.3.3.md`)
documenting the v0.3.x post-launch stabilization arc:

  - Two prioritized releases (v0.3.2 bug fixes → v0.3.3 deps + community)
  - PAM `success=end → success=done` fleet sweep across 9 sites
  - `visaged` SIGTERM handler + `TimeoutStopSec=10s` unit override
  - Devshell parity (`rustfmt`/`clippy`/`libclang`)
  - X1 Carbon Gen 9 quirk + AUR `!lto !debug`
  - Closing the `sha256sums=SKIP` TODO (this commit)
  - Dependency cohort (7 merged + 1 closed)
  - Trade-offs and known limitations
  - Remaining work for the arc

Updates `docs/STRATEGY.md` "Where We Are" section from v0.3.0 → v0.3.3
with the bug-fix-wave context, the corrected component delivery
descriptions (quirks DB now covers two hardware targets; visaged ships
dual-signal handler; PAM module's corrected control flow), and a
cross-reference to ADR 012 for the full story.

Skipping ADR 011's numbering convention extension since v0.3 wasn't
planned to need post-launch stabilization ADRs at the v0.3.x point
releases — 012 fits the next-available slot. Future stabilization ADRs
follow the same pattern.

No version bump in this commit; lands as a packaging+docs hygiene patch
that picks up on next release cut (v0.3.4 or v0.4.0).
---
 docs/STRATEGY.md                              |  31 +++-
 ...post-launch-stabilization-v0.3.2-v0.3.3.md | 169 ++++++++++++++++++
 packaging/aur/PKGBUILD                        |   7 +-
 3 files changed, 198 insertions(+), 9 deletions(-)
 create mode 100644 docs/decisions/012-post-launch-stabilization-v0.3.2-v0.3.3.md

diff --git a/docs/STRATEGY.md b/docs/STRATEGY.md
index 0aaf3a8..ab1de52 100644
--- a/docs/STRATEGY.md
+++ b/docs/STRATEGY.md
@@ -32,23 +32,40 @@ Linux deserves a biometric authentication layer that is **reliable, secure, and
 
 ---
 
-## Where We Are: v0.3.0
+## Where We Are: v0.3.3
 
-**Shipped 2026-02-23. All 6 implementation steps complete. End-to-end tested on Ubuntu 24.04.4 LTS.**
+**Shipped 2026-05-28. Bug-fix release wave (v0.3.2 → v0.3.3) on top of the v0.3.0 foundation.**
+
+v0.3.0 (2026-02-23) shipped all 6 implementation steps end-to-end on Ubuntu 24.04.4 LTS.
+The v0.3.x point releases since then addressed two silent ship-time bugs and added
+broader hardware + packaging coverage:
+
+- **v0.3.2 (2026-05-28)** — fixed `PAM success=end → success=done` keyword (libpam was
+  silently treating the unknown keyword as `ignore` since v0.1.0, so face auth was a
+  silent no-op on the documented setup paths). Closed Issue #26 — `visaged` now handles
+  SIGTERM correctly, dropping the ~90s post-hibernate `systemctl restart` hang to ~10s.
+- **v0.3.3 (2026-05-28)** — Lenovo X1 Carbon Gen 9 IR camera quirk (second
+  Tier-1-verified hardware target after ASUS Zenbook 14 UM3406HA); AUR `!lto !debug`
+  fix so `makepkg -si` succeeds on stock Arch; devshell parity with CI;
+  7 dependency bumps.
 
 | Component | What it delivers |
 |-----------|-----------------|
-| `visage-hw` | V4L2 capture, GREY/YUYV/Y16 format detection, CLAHE preprocessing, dark frame rejection |
+| `visage-hw` | V4L2 capture, GREY/YUYV/Y16 format detection, CLAHE preprocessing, dark frame rejection. Quirks DB covers ASUS Zenbook 14 + Lenovo X1 Carbon Gen 9 |
 | `visage-core` | SCRFD face detection + ArcFace recognition via ONNX Runtime — CPU-capable, no CUDA required |
-| `visaged` | Persistent daemon — holds camera and model weights across auth requests, D-Bus IPC, SQLite WAL |
-| `pam-visage` | Thin PAM module — `PAM_IGNORE` fallback, never blocks, system bus |
-| IR emitter | UVC extension unit control, hardware quirks database, ASUS Zenbook 14 UM3406HA confirmed |
-| Packaging | `.deb` with `pam-auth-update`, systemd hardening, AES-256-GCM embeddings at rest |
+| `visaged` | Persistent daemon — holds camera and model weights across auth requests, D-Bus IPC, SQLite WAL. SIGINT + SIGTERM shutdown handlers; `TimeoutStopSec=10s` defense in depth |
+| `pam-visage` | Thin PAM module — `PAM_IGNORE` fallback, never blocks, system bus. `[success=done default=ignore]` control flow (corrected v0.3.2) |
+| IR emitter | UVC extension unit control, hardware quirks database |
+| Packaging | `.deb` with `pam-auth-update`, AUR `!lto !debug` PKGBUILD with verified `sha256sums`, NixOS module, systemd hardening, AES-256-GCM embeddings at rest |
 
 Visage authenticates in ~1.4s on CPU with a USB webcam. Howdy's Python subprocess cold-start is 2-3s.
 Visage is already faster — without IR camera or GPU — because model weights are loaded once at
 daemon start, not per attempt. That is the architectural advantage.
 
+See [ADR 012](decisions/012-post-launch-stabilization-v0.3.2-v0.3.3.md) for the full
+v0.3.x stabilization context, the rationale behind each fix, and the trade-offs
+accepted.
+
 ---
 
 ## Ecosystem Position
diff --git a/docs/decisions/012-post-launch-stabilization-v0.3.2-v0.3.3.md b/docs/decisions/012-post-launch-stabilization-v0.3.2-v0.3.3.md
new file mode 100644
index 0000000..e3cb4ad
--- /dev/null
+++ b/docs/decisions/012-post-launch-stabilization-v0.3.2-v0.3.3.md
@@ -0,0 +1,169 @@
+# ADR 012 — Post-Launch Stabilization: v0.3.2 + v0.3.3 Bug-Fix Wave + Community PR Intake
+
+**Date:** 2026-05-28
+**Status:** Implemented (v0.3.2 + v0.3.3 shipped)
+**Scope:** `visaged`, `pam-visage`, `packaging/aur`, `packaging/debian`, `packaging/nix`, `flake.nix`, docs, CHANGELOG
+
+---
+
+## Context
+
+Three months after v0.3.0 shipped (2026-02-23), the visage PR/issue/discussion queue had accumulated:
+
+- **7 open dependabot PRs** dating back to late March (3 GitHub Action major bumps + 4 Rust dep bumps + 1 failing `ort` rc bump)
+- **3 community PRs** from external contributors: @themariusus's Lenovo X1 Carbon Gen 9 IR camera quirk (#29), @SelfRef's Arch README docs update (#27), @SomeCodecat's AUR `!lto !debug` fix (#25)
+- **1 open issue** from @SomeCodecat (#26) — `visaged` blocks for 90s on `systemctl restart` after hibernate due to stale camera fd
+- **1 open discussion** from @Alex52Github (#28) — when is Intel IPU6 / MIPI / libcamera support expected, and is Fedora supported?
+
+Investigation while preparing to respond to PR #27 surfaced a **fleet-wide PAM bug shipped since v0.1.0**: `/etc/pam.d/*` files generated by the repo (README, NixOS module, Debian pam-auth-update profile, etc.) used the keyword `success=end` in 9 places. `pam.conf(5)` documents only `ignore | bad | die | ok | done | reset | N` as valid value-keywords. libpam logs a warning and silently treats the unknown keyword as `ignore`, dropping `pam_visage.so`'s `PAM_SUCCESS`. Result: every Visage install on the documented setup paths has had face auth as a no-op since v0.1.0 — face match succeeded, libpam ignored the result, stack fell through to `pam_unix.so` → password prompt.
+
+Investigation of @SomeCodecat's hibernate-hang report (#26) also surfaced a second silent ship-time bug: `crates/visaged/src/main.rs` used `tokio::signal::ctrl_c().await?` for shutdown. On Unix, `tokio::signal::ctrl_c()` is SIGINT-only — it does NOT catch SIGTERM. systemd's `systemctl stop|restart` sends SIGTERM, which `visaged` was ignoring, so systemd waited the default `TimeoutStopSec=90s` and SIGKILL'd. Manifested as the 90s hang reported in #26 whenever `visage-resume.service` fired after hibernate.
+
+## Decision
+
+### 1. Two prioritized release cuts: v0.3.2 (bug fixes) → v0.3.3 (community + deps)
+
+Cut **v0.3.2 first** to ship the two real bug fixes (PAM keyword + SIGTERM handler) to users on v0.3.0 as fast as possible. Cut **v0.3.3 second** to bundle the X1 Carbon hardware quirk, the AUR LTO fix, the devshell parity improvement, and the dependabot cohort. **Skip v0.3.1 entirely** — its number is permanently unused.
+
+| Release | Tag | Date | Asset | Contents |
+|---|---|---|---|---|
+| Bug-fix release | `v0.3.2` | 2026-05-28 | `visage_0.3.2-1_amd64.deb` (9.46 MB) | PAM `end → done` fleet sweep + visaged SIGTERM handler + `TimeoutStopSec=10s` |
+| Deps + community | `v0.3.3` | 2026-05-28 | `visage_0.3.3-1_amd64.deb` (9.45 MB) | X1 Carbon quirk + AUR `!lto !debug` + devshell parity + 7 dep bumps |
+
+### 2. PAM keyword fleet sweep: `success=end → success=done` across 9 sites
+
+Swept in PR #31 (squash-merged into v0.3.2). Affected files:
+
+- `README.md` (Arch install instructions, 1 site)
+- `docs/operations-guide.md` (setup + verification, 2 sites)
+- `docs/architecture.md` (PAM stack integration narrative, 1 site)
+- `docs/research/architecture-review-and-roadmap.md` (roadmap context, 2 sites)
+- `docs/research/domain-audit.md` (implementation plan, 1 site)
+- `docs/research/howdy-analysis-and-visage-design.md` (design comparison, 1 site)
+- `packaging/debian/pam-auth-update` (Ubuntu profile — live config, 1 site)
+- `packaging/nix/module.nix` (NixOS module — `sudo` + `login` rules, 2 sites)
+
+**Reported by** @SelfRef in PR #27 (their PR caught 1 of 9 sites); commit `daa9903` credits them via `Reported-by:`. Their original PR (#27) is held open pending amend on cosmetic items unrelated to the keyword fix.
+
+**Existing-user upgrade path:**
+- **Debian/Ubuntu** — `postinst` runs `pam-auth-update --package visage` on every install, which regenerates `/etc/pam.d/common-auth` from our corrected profile. Auto-recovery on next `.deb` upgrade.
+- **NixOS** — corrected `security.pam.services.{sudo,login}.rules.auth.visage.control` value picked up on next `nixos-rebuild switch`.
+- **Arch (manual)** — operators who copied the prior README's example into `/etc/pam.d/system-auth` must manually swap `success=end` for `success=done`. CHANGELOG calls this out.
+
+### 3. `visaged` SIGTERM handler + `TimeoutStopSec=10s` unit override
+
+Swept in PR #30 (squash-merged into v0.3.2). Replaced the SIGINT-only `tokio::signal::ctrl_c().await?` with a dual-signal handler matching the pattern at `esver-capture/crates/esver-capture-cli/src/daemon.rs::wait_for_shutdown_signal`:
+
+```rust
+use tokio::signal::unix::{signal, SignalKind};
+let mut sigterm =
+    signal(SignalKind::terminate()).context("failed to install SIGTERM handler")?;
+let mut sigint =
+    signal(SignalKind::interrupt()).context("failed to install SIGINT handler")?;
+tokio::select! {
+    _ = sigterm.recv() => tracing::info!(signal = "SIGTERM", "received shutdown signal"),
+    _ = sigint.recv() => tracing::info!(signal = "SIGINT", "received shutdown signal"),
+}
+```
+
+Added `TimeoutStopSec=10s` to `packaging/systemd/visaged.service` as defense in depth — covers the edge case where a `v4l2 VIDIOC_DQBUF` is mid-flight on shutdown (e.g. a stale camera fd after hibernate resume that isn't promptly interruptible). Worst-case `systemctl restart` drops from ~90s to ~10s.
+
+**Closes Issue #26.**
+
+### 4. Devshell parity (`rustfmt` + `clippy` + `libclang`) in `flake.nix`
+
+Swept in PR #32 (squash-merged into v0.3.3). The `nix develop` shell brought the package's build inputs via `inputsFrom = [ visage ]` but didn't include the cargo subcommands CI runs (`cargo fmt --check`, `cargo clippy --workspace -- -D warnings`) or `libclang.so` (transitively needed by `v4l2-sys-mit`'s `bindgen`). Devshell now declares:
+
+```nix
+packages = with pkgs; [
+  rustfmt
+  clippy
+  llvmPackages.libclang
+  rust-analyzer
+  cargo-deb
+  cargo-watch
+];
+LIBCLANG_PATH = "${pkgs.llvmPackages.libclang.lib}/lib";
+```
+
+Verified: `cargo fmt --all -- --check`, `cargo clippy --workspace -- -D warnings`, and `cargo build -p visaged` all run inside `nix develop` without further env tweaking.
+
+### 5. Lenovo ThinkPad X1 Carbon Gen 9 IR camera quirk (`174f:2454`)
+
+Merge-committed via PR #29 (per CONTRIBUTING.md "Hardware quirks: Merge commit"). Quirk file at `contrib/hw/174f-2454.toml`. Verified on hardware by @themariusus. Now embedded at compile time alongside the existing ASUS Zenbook 14 UM3406HA quirk.
+
+### 6. AUR `PKGBUILD: options=(!lto !debug)`
+
+Squash-merged via PR #25. Fixes the link-time `undefined symbol: ring_core_0_17_14__LIMBS_window5_split_window` (and many more from `ring` + `libsqlite3-sys`) failure on Arch's stock `makepkg.conf` (which defaults to `OPTIONS=(... lto ...)`). Root cause: LTO operates on LLVM IR, but `ring` ships hand-written assembly via `cc` and `libsqlite3-sys` (rusqlite's `bundled` feature) compiles `sqlite3.c` via `cc` — neither produces LTO-compatible IR. Reported and fixed by @SomeCodecat.
+
+### 7. Close the v0.1.0-era `sha256sums=('SKIP')` TODO in `packaging/aur/PKGBUILD`
+
+This ADR ships alongside the `SKIP` → real-hash fix. The PKGBUILD now declares the SHA-256 of the v0.3.3 source tarball at `github.com/sovren-software/visage/archive/refs/tags/v0.3.3.tar.gz`. `makepkg` will reject any tampered or corrupted download.
+
+A comment block explains the bump procedure for future maintainers (compute via `curl ... | sha256sum`; must re-compute on every `pkgver` bump).
+
+### 8. Dependency cohort
+
+Merged: `tokio` 1.49→1.50 (#17), `nix` 0.31.1→0.31.2 (#18), `uuid` 1.21→1.23 (#19), `image` 0.25.9→0.25.10 (#23), `actions/checkout` 4→6 (#15), `actions/upload-artifact` 4→7 (#16), `actions/download-artifact` 4→8 (#14). All bumps passed CI on the visage workspace.
+
+Closed: `ort` 2.0.0-rc.11 → 2.0.0-rc.12 (#20). CI failed on rc.12 — likely API drift in the `ort` 2.0.0-rc series. Will reattempt at rc.13+ or 2.0.0 final.
+
+### 9. Documentation updates
+
+- `README.md` Status line bumped `v0.3.0` → `v0.3.3` with a brief summary of the intervening fixes + dual-hardware support.
+- `docs/STATUS.md` last-updated bumped to **2026-05-28**; build-state rewritten to reflect v0.3.3, post-v0.3.0 bug-fix wave, and quirks DB now covering ASUS Zenbook 14 UM3406HA + Lenovo X1 Carbon Gen 9 20XW00FPUS.
+- `CHANGELOG.md` entries dated and structured under Keep-a-Changelog format (`[Unreleased]` rolled over each release cut).
+
+## Trade-offs
+
+### v0.3.1 numerical skip (D1)
+
+**Trade-off accepted:** Anyone reading the release list sees a missing v0.3.1. Once v0.3.2 ships, that number is permanently dead — no path back.
+
+**Benefit:** Users on v0.3.0 with face-auth silently broken get the fix in v0.3.2 within hours. The dep cohort + community PRs land in a clean v0.3.3 without commingling.
+
+### PAM sweep separate from PR #27 (D2)
+
+**Trade-off accepted:** @SelfRef's contribution looks "partial" until they respond to the amend request — their PR currently sits at "Changes requested" while the actual fleet-sweep fix shipped in #31. Mitigated by:
+1. Crediting `Reported-by: @SelfRef` in commit `daa9903`.
+2. Posting a follow-up clarification comment on PR #27 explicitly acknowledging the catch was theirs and that #31 carries the full fix.
+3. Offering @SelfRef inclusion in `CODEOWNERS` for `packaging/aur/` (they already maintain `visage` / `visage-git` / `visage-bin` on AUR — they're our de facto AUR maintainer).
+
+**Benefit:** The PAM bug is fixed atomically across all 9 sites — no Debian/NixOS user is missed.
+
+### `TimeoutStopSec=10s` as defense in depth (not primary fix) (D3)
+
+**Trade-off accepted:** The SIGTERM handler is the primary fix; the unit timeout is a backstop. If a `v4l2 VIDIOC_DQBUF` is genuinely stuck (e.g. driver bug on stale fd), the synchronous capture inside `tokio::task::spawn_blocking` still needs the full 10s before systemd escalates to SIGKILL.
+
+**Benefit:** Worst-case operational `systemctl restart` is bounded — 90s is no longer a possibility. 10s gives operational headroom; could be tightened to 5s if measurement supports it.
+
+### `sha256sums=` real hash without changing source URL (this ADR)
+
+**Trade-off accepted:** Source URL is still `https://github.com/sovren-software/visage/archive/refs/tags/v$pkgver.tar.gz` (GitHub's git-archive endpoint). GitHub has historically (in 2023) changed git-archive compression behavior, breaking many projects' AUR PKGBUILDs that had pinned hashes. If that happens again, our hash will mismatch and AUR users will see a `makepkg` integrity error.
+
+**Benefit:** Closes the v0.1.0-era TODO without introducing a release-asset tarball generator. AUR integrity check is now active for the v0.3.3 tarball. The bump-procedure comment in the PKGBUILD documents how to re-verify on future bumps.
+
+**Alternative not chosen:** Add a tarball generation step to `.github/workflows/ci.yml`'s `release` job (produce a deterministic `.tar.gz` asset on each release tag, point the PKGBUILD source URL at that asset). More work; defer to v0.4 packaging arc if GitHub changes git-archive again.
+
+## Drawbacks / Known Limitations
+
+1. **`sha256sums` requires manual bump every release.** The comment in the PKGBUILD documents the procedure. If a future release cut forgets to update the hash, `makepkg` will fail with an integrity mismatch — operationally noisy but not silent (which is preferable to the prior `SKIP` state).
+2. **PR #27 still open.** @SelfRef's PR carries non-PAM improvements (visage-resume enable, `visage verify` step, AUR variant documentation) that have not yet landed. Wait clock until 2026-06-04; close-and-redo as our own PR with `Reported-by:` credit if they don't amend.
+3. **Discussion #28 answer drafted but not posted.** Operator constraint (fleet PAT lacks `discussions:write`). Drafted answer to @Alex52Github covers IPU6 path (v0.5 arc, depends on a libcamera backend behind `visage-hw`'s `Camera` trait) and Fedora packaging gap (no fundamental blocker — needs RPM `.spec` and pam-auth-update equivalent).
+4. **Issue #33 dependabot security alerts (14 open, severity 6h/3m/5l) not triaged.** Operator constraint (PAT lacks `security_events`). Tracked at `https://github.com/sovren-software/visage/security/dependabot` for triage via the browser UI. Some may be advisory-database flags that don't reach our code path; others may require dependency pins. Gate v0.3.4 if any severity-high are reachable in `visaged` or `pam-visage`.
+5. **Bypass-merge precedent on release PRs.** Single-maintainer repos with branch protection requiring 1 approval need the bypass for release PRs (operator is the only writer AND the PR author; GitHub explicitly forbids self-approval). Scope discipline: release PRs only — never bypass for code-change PRs. Documented in commit messages and CONTRIBUTING.md remains unchanged.
+
+## Companion documents
+
+- Engram session ADR: `~/cDesign/dendrite/Projects/Visage/Decisions/SESSION-2026-05-28-VISAGE-V0.3.2-V0.3.3-BUG-FIX-WAVE-AND-COMMUNITY-PR-INTAKE-ADR.md` — full per-decision rationale + remaining-work tracking for the org-internal audience.
+- Engram dev-log: `~/cDesign/dendrite/Projects/Visage/dev-log.md` — session entry summarizing the cohort + key discoveries.
+
+## Remaining work for the v0.3.x post-launch stabilization arc
+
+| Item | Gate | Owner |
+|---|---|---|
+| PR #27 amend or close-and-redo | by 2026-06-04 | @SelfRef → maintainer fallback |
+| Discussion #28 answer posted | operator UI paste OR PAT `discussions:write` | Operator |
+| Issue #33 dependabot security alerts triaged | operator UI access OR PAT `security_events` | Operator |
+| CODEOWNERS for `packaging/aur/` | @SelfRef accepts offer in PR #27 thread | @SelfRef |
+| v0.4.0 packaging arc | scope decision | Maintainer |
diff --git a/packaging/aur/PKGBUILD b/packaging/aur/PKGBUILD
index f7f70d6..63ff7ae 100644
--- a/packaging/aur/PKGBUILD
+++ b/packaging/aur/PKGBUILD
@@ -18,8 +18,11 @@ install="$pkgname.install"
 # Preserve user data and face database across upgrades
 backup=('var/lib/visage/faces.db')
 source=("$pkgname-$pkgver.tar.gz::https://github.com/sovren-software/visage/archive/refs/tags/v$pkgver.tar.gz")
-# TODO: compute real sha256sum at release time: sha256sum visage-0.3.3.tar.gz
-sha256sums=('SKIP')
+# sha256 of the v$pkgver tarball at github.com/sovren-software/visage/archive/refs/tags/v$pkgver.tar.gz
+# Compute via:
+#   curl -fsSL https://github.com/sovren-software/visage/archive/refs/tags/v$pkgver.tar.gz | sha256sum
+# Must be re-computed on every pkgver bump.
+sha256sums=('e018fcc08dbb3aba381306424fc1fd94eaddc0a5da0d47437f17487f29b76f99')
 
 build() {
   cd "$pkgname-$pkgver"