Skip to content

Add a client-side persistence-status layer with save-failure recovery#1859

Open
kmcginnes wants to merge 25 commits into
mainfrom
persistence-status-layer
Open

Add a client-side persistence-status layer with save-failure recovery#1859
kmcginnes wants to merge 25 commits into
mainfrom
persistence-status-layer

Conversation

@kmcginnes

@kmcginnes kmcginnes commented Jun 22, 2026

Copy link
Copy Markdown
Collaborator

Description

Graph Explorer stores all user data client-side in IndexedDB. Previously, when a durable write failed, in-memory state silently diverged from disk and was lost on reload — with no signal to the user.

This adds a persistence-status layer that owns write failure centrally. Every IndexedDB write now routes through a shared retry/coalescing queue that classifies failures, retries transient ones with backoff, and reports terminal failures into a single global status store. A "Changes not saved" indicator surfaces in the nav bar only when a write has terminally failed; clicking it opens a dialog with the failure details and — when storage is full (and therefore still readable) — a button to back up the configuration to a file.

The persisted setter now returns void: callers just set state, and the storage layer is solely responsible for getting it to disk. This composes with the cross-tab merge work (ADR per-key-diff-merge-cross-tab-reconciliation) at the same flush seam without conflicting.

How to read

  1. classifyStorageError.ts — start here; the pure error taxonomy (quota/access terminal, everything else retryable)
  2. persistenceStatusStore.ts — the global external store (idle | saving | failed) outside React/Jotai
  3. writeQueue.ts — per-key serialize + coalesce-to-latest + retry/backoff, knows only a flush thunk
  4. index.ts — wires the singleton store + queue into persistThroughQueue
  5. writeThroughAtom.ts — the setter-returns-void change (shared by the localForage and active-connection paths)
  6. PersistenceStatusIndicator.tsx — the nav-bar indicator + detail dialog
  7. ADR — the design rationale and rejected alternatives

Core vs supporting: the four persistence/ modules + the indicator are the feature. The route files (Connections, DataExplorer, GraphExplorer, SchemaExplorer, SettingsRoot), NavBar spacing, and ConnectionDetail debug actions are wiring. The test-helper and *.test.ts changes are the mechanical fallout of the setter returning void (they synchronize on waitForIdle() instead of awaiting a per-write promise).

Validation

  • pnpm checks passes (lint, format, types).
  • pnpm test passes (1854 passed, 1 expected-fail — the pre-existing Concurrent edits in multiple browser tabs silently clobber shared persisted collections (styling, schema, connections, sessions) #1820 cross-tab clobber regression marker, unchanged).
  • New unit tests cover the error taxonomy, status store (aggregation precedence, per-key clear, waitForIdle, reset), and write queue (coalescing, retry-then-succeed, terminal no-retry, escalation after cap, per-key independence), plus a component test for the indicator.
  • Manual: enable debug actions (Settings → General), open a connection's detail screen, and use "Force Save Failure (Out of Storage / Storage Blocked)" and "Reset Save Status" to exercise the indicator and dialog through the real pipeline.

Related Issues

Check List

  • I confirm that my contribution is made under the terms of the Apache 2.0 license.
  • I have verified pnpm checks passes with no errors.
  • I have verified pnpm test passes with no failures.
  • I have covered new added functionality with unit tests if necessary.
  • I have updated documentation if necessary.

kmcginnes added 25 commits June 19, 2026 16:31
- waitForIdle now gates on in-flight count, not status, so a terminal
  failure on one key no longer signals settled while another key drains
- skip no-op status recomputes to avoid redundant indicator re-renders
- reclassify InvalidStateError as retryable (often a transient IDB state)
- give terminal-failure toasts stable ids so they replace, not stack
- remove the DEV-only window.__persistence hook (broke node-env tests)
The failure Alert now opens a detail dialog listing each failed
collection (humanized from its storage key) and why it failed, with a
Download backup button only when storage is full. Toasts are removed:
the standing Alert plus dialog is the single, non-stacking recovery
surface. Failure records now carry attemptCount and lastAttemptAt.

Syncs CONTEXT.md and the ADR to the shipped detail view, fixes the ADR's
waitForIdle seam name and the non-existent superseded-ADR reference, and
adds a troubleshooting entry for save failures.
The indicator is now a rounded danger "Changes not saved" button that
opens the detail dialog, rather than an Alert with an action slot. Revert
the AlertAction addition to the shared Alert component since nothing uses
it anymore, and sync the docs.
- waitForIdle drives off a new snapshot isSettling flag via the single
  subscribe path; drop the parallel idleWaiters mechanism
- add store.reset(); use it for test cleanup and a Reset Save Status debug
  action instead of enumerating keys
- detail dialog is now generic: always offers Save Configuration backup
  regardless of failure reason, matching the settings screen wording
- write queue running set is a Set, not a Map with an unread Promise value
- debug failure helper uses synthetic keys, decoupled from real atoms
- revert the global leading-none button base change
- sync ADR, CONTEXT.md, and troubleshooting docs
@kmcginnes kmcginnes marked this pull request as ready for review June 23, 2026 00:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Storage layer owns persistence failure: global save-status indicator + retry

1 participant