
Automatic updates for v4.1 — bayanat update + admin UI + opt-in patch auto-apply#320

Open
level09 wants to merge 37 commits into main from auto-update-simplified
@level09
Collaborator

@level09 level09 commented Apr 18, 2026

Summary

Adds a one-click update flow for partner operators. When an admin clicks "Update now" in the nav-bar banner, the system clones the new release, takes a pg_dump snapshot, stops services, runs migrations, swaps the symlink, restarts, and verifies health. It then either lands on the new version (after roughly 30-90 s of 502 responses) or automatically reverts the symlink and restarts on the previous release.
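
The swap-verify-revert step can be sketched roughly as follows. This is a minimal Python illustration, not the code in the bayanat script (which is bash): the function names, the `current` link name in the usage note, and the injected `healthy` callback are all assumptions, and service stop/restart is left out.

```python
import os
from pathlib import Path
from typing import Callable


def point_symlink(link: Path, target: Path) -> None:
    """Repoint `link` at `target` by renaming a temporary symlink over it."""
    tmp = link.with_name(link.name + ".tmp")
    if os.path.lexists(tmp):
        os.unlink(tmp)
    os.symlink(target, tmp)
    # rename(2) replaces the old link in a single step on POSIX systems,
    # so readers never observe a missing link.
    os.replace(tmp, link)


def switch_and_verify(link: Path, new_release: Path,
                      healthy: Callable[[], bool]) -> str:
    """Switch to the new release; revert to the previous one if unhealthy."""
    previous = Path(os.readlink(link))
    point_symlink(link, new_release)
    # A real flow would restart services here before probing health.
    if healthy():
        return "SUCCESS"
    point_symlink(link, previous)
    return "ROLLED_BACK"
```

Called as `switch_and_verify(Path("current"), Path("releases/v4.1.1"), probe)`, the link ends up on the new release only when `probe()` returns true; otherwise it points at whatever it pointed at before the call.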

Optional AUTO_APPLY_PATCH_UPDATES toggle (default off) lets partners opt in to silent background application of patch-only bumps (4.1.0 → 4.1.1, never 4.1.x → 4.2). Minor and major bumps always require a human click.
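
The patch-only rule can be illustrated with a small version comparison. This is a sketch under assumptions: `parse_version` and `is_patch_bump` are hypothetical names, tags are assumed to be MAJOR.MINOR.PATCH with an optional `v` prefix, and anything that does not parse is treated as not eligible for auto-apply.

```python
import re

# Accepts "4.1.0" or "v4.1.0"; anything else is rejected.
_VERSION = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse_version(tag: str) -> tuple[int, int, int]:
    match = _VERSION.match(tag.strip())
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH tag: {tag!r}")
    return int(match[1]), int(match[2]), int(match[3])


def is_patch_bump(current: str, candidate: str) -> bool:
    """True only when major and minor match and the patch number increases."""
    try:
        cur = parse_version(current)
        new = parse_version(candidate)
    except ValueError:
        return False  # unparseable tags always need a human click
    return new[:2] == cur[:2] and new[2] > cur[2]
```

Under this rule 4.1.0 → 4.1.1 qualifies, while 4.1.3 → 4.2.0 and 4.1.0 → 5.1.1 do not.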

This is the simplified design approved in the "Bayanat Automatic Updates, Simplified Design (2026-04-16)" Notion page — 3 layers, 4 phases, single code path, reuses every v4 installer primitive. See the linked Notion page for the full spec; this PR description is intentionally short.

What ships

CLI (bayanat script):

  • bayanat update [<tag>] / --check / --recover
  • bayanat snapshots, bayanat restore <name>
  • bayanat status extended with update phase
  • /usr/local/sbin/bayanat-start-update root wrapper + fixed sudoers grants
  • State file at /opt/bayanat/state/update.json, atomic writes, crash-safe recovery dispatch
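
For readers unfamiliar with the atomic-write pattern behind the state file, here is a minimal Python sketch. It is an illustration only: the real writer lives in the bash CLI, and `write_state`, `read_state`, and the `phase` key are hypothetical names.

```python
import json
import os
import tempfile
from pathlib import Path


def write_state(path: Path, state: dict) -> None:
    """Write JSON to a temp file in the same directory, then rename over `path`."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent,
                                    prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle)
            handle.flush()
            os.fsync(handle.fileno())  # make sure the bytes are on disk first
        os.replace(tmp_name, path)     # readers see the old or new file, never half
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def read_state(path: Path, default=None):
    """Return the parsed state, or `default` if no state file exists yet."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return default
```

Because the rename is the last step, a crash mid-write leaves the previous state file intact, which is what lets a recovery command decide where to resume.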

Flask:

  • GET /health (exempt from setup-wizard redirect + rate limiter)
  • GET/POST /admin/api/updates/{available,start,status} with @auth_required(within=15, grace=0) on the privileged start endpoint
  • /admin/snapshots/ read-only page
  • Celery beat: check_for_updates every 6h with opt-in patch auto-apply

Vue (Vuetify 3):

  • UpdateBanner in the admin nav-bar
  • UpdateProgressDialog polling /admin/api/updates/status
  • SnapshotsList page (read-only; restore is CLI-only by design)
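
The progress dialog's polling behaviour (the commit notes mention a 2 s interval) can be sketched as a loop like the one below. This is Python rather than the Vue component itself; the `phase` field, the timeout, and the injected `fetch_status` callable are assumptions, and the terminal state names are taken from the states mentioned elsewhere in this PR.

```python
import time
from typing import Callable, Optional

# States named in this PR that end an update run.
TERMINAL = {"SUCCESS", "ROLLED_BACK", "NEEDS_INTERVENTION"}


def poll_until_done(fetch_status: Callable[[], dict],
                    interval: float = 2.0,
                    timeout: float = 600.0,
                    sleep: Callable[[float], None] = time.sleep,
                    clock: Callable[[], float] = time.monotonic) -> dict:
    """Poll the status source until it reports a terminal phase."""
    deadline = clock() + timeout
    while True:
        status: Optional[dict]
        try:
            status = fetch_status()
        except OSError:
            # Connection errors are expected while services restart.
            status = None
        if status and status.get("phase") in TERMINAL:
            return status
        if clock() >= deadline:
            raise TimeoutError("update did not reach a terminal phase")
        sleep(interval)
```

Tolerating failed requests matters here because the endpoint being polled goes away for the duration of the restart.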

Config:

  • AUTO_APPLY_PATCH_UPDATES wired through DEFAULT_CONFIG, CONFIG_LABELS, settings.Config, ConfigManager.serialize(), and FullConfigValidationModel — admin UI can show and save it.

Docs + tests:

  • docs/deployment/auto-update-runbook.md — operator reference
  • 19 pytest tests (health, config, endpoints, auto-apply logic)
  • e2e-auto-update.sh — one-command Hetzner E2E against the public test fork level09/bayanat-update-test

Security-relevant design notes

  • Sudoers pinned to exact paths, no globs. The wrapper takes zero arguments; Flask passes none.
  • Fresh authentication required on POST /admin/api/updates/start (within=15, i.e. the user must have authenticated within the last 15 minutes). A stolen cookie alone is not enough to trigger a privileged update.
  • _pg_load parses shared/.env natively — no eval on values written by the bayanat OS user. Prevents lateral-to-root escalation via malicious .env.
  • _install_self copies from $RELEASES_DIR/$tag/bayanat, never $0. Safe under the standard curl | sudo bash -s install pattern.
  • Health probe uses unix socket for Flask + peer-auth psql + redis ping — exercises real dependencies, minimal surface.
  • Restore is CLI-only. No sudoers grant, no web UI. Intentionally requires SSH.
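
The idea behind the _pg_load note (treat .env values as data, never as shell code) can be shown with a Python analogue. The actual helper is bash; the function name, the comment handling, and the quote stripping below are assumptions made for illustration.

```python
def load_pg_env(text: str, prefix: str = "POSTGRES_") -> dict[str, str]:
    """Parse KEY=VALUE lines, keeping only keys that start with `prefix`.

    Values are stored as literal strings and are never evaluated.
    """
    env: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if not key.startswith(prefix):
            continue
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        env[key] = value
    return env
```

A line such as `POSTGRES_PASSWORD=foo; echo PWNED > /tmp/x` yields the literal password `foo; echo PWNED > /tmp/x`; nothing after the semicolon is executed.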

Testing done

  • pytest: uv run pytest tests/test_health.py tests/test_update_check.py tests/test_update_endpoints.py — 19/19 passing.
  • E2E on Hetzner CPX22 Ubuntu 24.04: ./e2e-auto-update.sh — S1 happy path + S2 bad migration (NEEDS_INTERVENTION) + S3 bad /health (ROLLED_BACK) + S4 recovery, all green. Install path uses `curl | sudo bash -s install` against the test fork's v4.0.0 tag.

The 15-entry gotchas list from the initial build + first E2E run is captured in the local skill at `.claude/skills/bayanat-auto-update/references/gotchas.md` for future contributors — covers everything from v-prefix handling to eval-on-.env to setup-wizard redirect to socket vs TCP pg auth.

Known limitations (deliberate, not blocking)

  • No maintenance-window gate on auto-apply yet. AUTO_APPLY_WINDOW_UTC=02:00-04:00 is a planned follow-up. Until then, an opted-in instance applies a patch release whenever the 6-hourly check_for_updates beat task finds one, at any time of day.
  • /health is intentionally minimal (DB + Redis). A release where Flask starts but application routes return 500 would still pass the check. A richer /api/smoke endpoint is a planned follow-up.
  • MIGRATE_DONE crash recovery needs an operator to re-run bayanat update. A bayanat update --resume command is a planned follow-up.
  • SETUP_COMPLETE is still a manual post-install step; the bash installer does not auto-create an admin. The E2E script works around this by writing shared/config.json directly; a proper installer change is pending.
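
To make the /health limitation concrete, a dependency-only health check can be sketched as below. This is not the endpoint's actual code: the function name, the response shape, and the 503 status on failure are assumptions, and the check callables stand in for the real DB and Redis probes.

```python
from typing import Callable, Mapping


def run_health_checks(
    checks: Mapping[str, Callable[[], bool]],
) -> tuple[int, dict]:
    """Run each named check; any failure or exception marks the probe unhealthy."""
    results: dict[str, bool] = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    ok = all(results.values())
    body = {"status": "ok" if ok else "degraded", "checks": results}
    return (200 if ok else 503), body
```

Nothing in such a probe exercises application routes, so a release whose views raise errors while the database and Redis are reachable would still report healthy.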

Not in scope

  • FAST/HEAVY branching, release manifest, two-directory migration layout, Squawk CI gating — all explicitly deferred (see spec §16 "What we are NOT doing, and why").
  • Pre-v4 install support. Hard floor at v4.0.0 + Alembic revision cdaa80fb493a.
  • Docker / Swarm / blue-green topology.

Review focus

Suggested reading order:

  1. bayanat script — the CLI pipeline (~500 added lines, mostly small functions)
  2. enferno/tasks/maintenance.py::check_for_updates — auto-apply logic
  3. enferno/admin/views/system.py — the three update endpoints + /admin/snapshots/
  4. enferno/public/views.py + enferno/setup/views.py — /health + its setup-redirect exemption
  5. docs/deployment/auto-update-runbook.md — operator-facing flows

Test plan

  • CI pytest passes (19 tests)
  • Clone the branch, run `./e2e-auto-update.sh` against a test fork — expect all 4 scenarios green in ~10 min
  • Fresh install via `curl https://raw.githubusercontent.com///v4.0.0/bayanat | sudo bash -s install localhost` → verify `/health` returns 200, services active, `bayanat update --check` works
  • Trigger an update from the UI banner against a test fork with a newer tag — watch progress dialog, verify final version
  • Force a failure (tag with a bad migration) → verify NEEDS_INTERVENTION surfaces cleanly in the UI

level09 added 30 commits April 18, 2026 00:06
Adds nav-bar chip that polls /admin/api/updates/available every 6h and
shows a confirm dialog to trigger an update. Adds polling progress dialog
that opens on update-started event and tracks /admin/api/updates/status
at 2s intervals until the run completes.
level09 added 4 commits April 18, 2026 12:45
…for update start

_install_self previously relied on $0 to locate the installer script.
Under the standard 'curl ... | sudo bash -s install' pipe pattern, $0 is
'bash' and readlink resolves to the shell binary, which would then be
installed as /usr/local/bin/bayanat. Every subsequent CLI invocation and
wrapper-driven update would exec bash with 'update' as argv. Silent.
Catastrophic on first use.

Fix: _install_self now takes a tag arg and copies from
$RELEASES_DIR/$tag/bayanat (the just-cloned release, known-good source).
Also called in do_switch_verify's SUCCESS branch so the system CLI stays
in lockstep with the active release after every successful update.

Separately: @auth_required(within=15, grace=0) on POST /api/updates/start
matches the freshness discipline already used by /system-administration/
and other sensitive admin endpoints. A stale cookie is no longer enough
to trigger a privileged update.
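
The freshness rule can be pictured with a simplified check like the one below. It illustrates the concept only and is not Flask-Security's implementation; `is_fresh` and its parameters are hypothetical, and grace=0 is assumed to mean no grace period is granted after a successful check.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_fresh(authenticated_at: Optional[datetime],
             now: datetime,
             within_minutes: float = 15) -> bool:
    """True if the user last authenticated less than `within_minutes` ago."""
    if authenticated_at is None:
        return False  # no recorded authentication time: force re-authentication
    return now - authenticated_at < timedelta(minutes=within_minutes)


# Example: a session authenticated 16 minutes ago is stale.
now = datetime(2026, 4, 18, 12, 0, tzinfo=timezone.utc)
stale = is_fresh(now - timedelta(minutes=16), now)
```

A request carrying a valid but old session cookie fails this check and has to re-authenticate before the update can start.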
…ore command

_pg_env previously emitted 'export KEY=VALUE' lines from shared/.env and
callers ran eval on the output. A malicious .env value (writable by the
bayanat user) could inject shell commands that run as root under
'bayanat update'. Lateral-to-root escalation path.

Replace _pg_env with _pg_load: parses POSTGRES_* keys natively in bash
and uses 'export "$key=$val"' which never interprets the value.
Verified a malicious value like 'foo; echo PWNED > /tmp/...' is stored
as a literal password instead of being executed.

Separately: the snapshots UI showed 'sudo -u bayanat bayanat restore',
which hits require_root and fails. Matches the same bug class as P4
(runbook) that was fixed in docs but missed in the Vue component.
Updated to 'sudo bayanat restore <name>'.
Codifies the manual 45-min ad-hoc test into a single-command script.
Provisions a disposable CPX22 Ubuntu 24.04 VM, installs via curl|bash
from the test fork's v4.0.0 tag, runs S1-S4 (happy / bad-migration /
bad-health / recovery), asserts final state per scenario, tears down.

Env knobs:
  KEEP_VM=1                 leave VM running for iteration
  VM_IP=x.y.z.w             reuse existing VM, skip provision/install
  SCENARIOS='S1 S2'         run subset
  TEST_FORK=you/yourfork    point at a different test fork

Requires hcloud CLI + ssh-agent loaded + gh CLI authenticated. Full run
~8-10 minutes, costs well under one cent per cycle.
@level09 level09 self-assigned this Apr 18, 2026
@level09 level09 requested a review from apodacaduron as a code owner April 18, 2026 16:15
@coderabbitai

coderabbitai Bot commented Apr 18, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3a37be15-a0f8-48fb-ba6d-e9729615cf19

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Comment thread enferno/public/views.py Fixed
@level09 level09 changed the title feat: automatic updates for v4.1 — bayanat update + admin UI + opt-in patch auto-apply Automatic updates for v4.1 — bayanat update + admin UI + opt-in patch auto-apply Apr 18, 2026
level09 and others added 3 commits April 19, 2026 11:31
2 participants