Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
25 commits
Select commit Hold shift + click to select a range
ea285d9
Add persistence-status layer design docs (CONTEXT.md, ADR)
kmcginnes Jun 19, 2026
256c217
Add persistence-status core: error taxonomy, status store, write queue
kmcginnes Jun 19, 2026
4151acb
Route IndexedDB writes through the persistence queue; setter returns …
kmcginnes Jun 19, 2026
7b48ebe
Add save-status indicator wired to the persistence status store
kmcginnes Jun 19, 2026
c1a55cb
Record rebase-driven design refinements in persistence ADR
kmcginnes Jun 19, 2026
08305c0
Add diagnostic logging to the persistence status store
kmcginnes Jun 19, 2026
ddc23c8
Move save-status indicator into the nav bar after the page title
kmcginnes Jun 19, 2026
dfb4905
Render save-status indicator as a danger Alert, only on failure
kmcginnes Jun 19, 2026
31f467c
Keep the out-of-storage backup toast until the user dismisses it
kmcginnes Jun 19, 2026
78360a7
Reword storage-full toast to reference browser storage, not disk
kmcginnes Jun 19, 2026
a9b0706
Split terminal-failure toast: backup for full storage, plain warning …
kmcginnes Jun 22, 2026
4bdaa50
Rename SaveStatusIndicator to PersistenceStatusIndicator for glossary…
kmcginnes Jun 22, 2026
97d6f23
Fix persistence status edge cases and remove debug hook
kmcginnes Jun 22, 2026
f1b86d8
Add failure detail dialog and drop terminal-failure toasts
kmcginnes Jun 22, 2026
95b1929
Add debug actions to force persistence failures with and without backup
kmcginnes Jun 22, 2026
740d60d
Give failure alert a Details button and show raw failure records as JSON
kmcginnes Jun 22, 2026
1e458ab
Add AlertAction slot from shadcn and use it for the failure Details b…
kmcginnes Jun 22, 2026
b1a8211
Reserve room for the Details button so it doesn't overlap the alert t…
kmcginnes Jun 22, 2026
a35b74a
Make failure indicator a danger button; drop unused Alert changes
kmcginnes Jun 22, 2026
0b04b2c
Address code review: simpler store, generic recovery dialog, cleanups
kmcginnes Jun 22, 2026
5d8de5f
Only offer the backup when storage is readable; restore reason-specif…
kmcginnes Jun 22, 2026
d1ba3b1
Update dialog test assertion after title cleanup
kmcginnes Jun 22, 2026
14d6cbc
Capture underlying error details on persistence failures
kmcginnes Jun 23, 2026
e7dfe13
Describe the indicator's current state only, not the interim toast model
kmcginnes Jun 23, 2026
16e156b
Test that the write queue captures error details onto failure records
kmcginnes Jun 23, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions CONTEXT.md
Original file line number Diff line number Diff line change
Expand Up @@ -78,6 +78,14 @@ _Avoid_: Model, structure
The on-disk JSON format a user gets when they export a Connection (`saveConfigurationToFile`), and which import consumes. It bundles the connection config with a snapshot of the Schema (`lastUpdate` is an ISO string on disk). A single Zod schema in `parseConnectionFile.ts` is the source of truth: both the writer and the importer target the same inferred type (`ExportedConnectionFile`). The writer assigns a `Date` for `lastUpdate` and `JSON.stringify` serializes it to the ISO string; the parser coerces it back via `z.coerce.date()`. The schema is lenient (every level is a `looseObject`), so unknown and legacy fields — styling, `__inferred`/`__matches` on prefixes, attribute `dataType` — pass through untouched. It is intentionally decoupled from the in-memory configuration and from the IndexedDB storage shape, so the wire format can evolve independently. On import it is split — the connection lands in `configurationAtom`, the schema in `schemaAtom`.
_Avoid_: Configuration file (the wire format is not the in-memory or persisted shape)

**Persistence Status**:
The single, global state of whether the app's client-side data is safely written to IndexedDB — `idle | saving | failed`. One source of truth that the Persistence Status Indicator subscribes to. `idle` = nothing queued and everything durable (no separate "saved" state — visually identical to a fresh session). `saving` = at least one write queued or in flight, including retryable retries. `failed` = at least one terminal failure outstanding, carrying failure records for a drill-in detail view. Aggregated across all per-key write queues by precedence: any terminal → `failed`; else any in-flight → `saving`; else `idle`. A `failed` key clears on its next successful write; status returns to `idle` when no failure records remain. Deliberately **not** per-key: a failed write flips the whole status rather than naming which collection failed, because the user cannot act on an individual key (the storage layer retries on their behalf). Lives in a plain external store outside React/Jotai; the React edge bridges it via `useSyncExternalStore`.
_Avoid_: Save state (ambiguous with Session)

**Persistence Status Indicator**:
The UI element in the nav bar (after the page title) that renders Persistence Status. It surfaces only on `failed` — a standing danger "Changes not saved" button — and stays absent at `idle` and `saving`. Clicking it opens a dialog showing the raw failure records (key, reason, attempt count, last attempt, and the underlying error's name/message/cause) in a read-only JSON editor. The dialog offers to save the configuration to a file via `saveLocalForageToFile` (`core/StateProvider/localDb.ts`) when storage is full (quota) — IndexedDB is still readable then — but not when storage is inaccessible (private mode, blocked), since the database never opened and there is nothing to read. Recovery scope is retry (transient failures) plus that backup (terminal-quota failures) — it does not guarantee the write eventually lands.
_Avoid_: Save-status indicator

## Relationships

- Each browser tab has at most one **Active Connection**; different tabs may have different ones
Expand Down
42 changes: 42 additions & 0 deletions docs/adr/20260619-storage-layer-owns-persistence-failure.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
# ADR — The storage layer owns persistence failure; status lives outside React

- **Status:** Accepted
- **Date:** 2026-06-19
- **Related:** ADR `per-key-diff-merge-cross-tab-reconciliation` (the re-read-merge that occupies the same write-path seam this builds on); ADR `indexeddb-not-localstorage-for-persistence` (the substrate, and source of the error taxonomy). Supersedes the interim per-call-site floating-promise convention (a lint-enforced code convention, not an ADR). Issue #1854.

## Context

Graph Explorer stores all user data client-side in IndexedDB via `atomWithLocalForage` (`core/StateProvider/atomWithLocalForage.ts`). The write path updates an in-memory Jotai atom synchronously and flushes to IndexedDB in the background. When the durable write **fails**, nothing recovers: in-memory state silently diverges from disk and is lost on reload.

The setter currently returns `Promise<void>`, exposing the background write as interface surface. Each of ~36 call sites must then decide how to handle that promise — an implementation detail (the write is async) leaking into every caller. The interim convention (`logAndNotify`/`logAndIgnore`/`void`) made the leak explicit per site, but the per-site model means each caller re-derives failure handling and message vocabulary the storage layer is better positioned to own.

## Decision

The **storage layer owns the write, so it owns the failure.** Concretely:

1. **The persisted setter returns `void` again.** Failure is handled internally; there is no promise for callers to mishandle. This is a deliberately _deep_ module — a trivial interface (`set(atom, next)`) hiding substantial machinery (queue, retry, taxonomy, merge, status).
2. **Persistence Status is a single, global, three-state model**`idle | saving | failed` — aggregated across all per-key write queues (any terminal → `failed`; else any in-flight → `saving`; else `idle`). It is **not** per-key: users cannot act on an individual key, so the status names no key. Each failure carries a record (`key`, `reason`, `attemptCount`, `lastAttemptAt`); these feed a drill-in detail dialog opened from the standing failure indicator.
3. **The status store lives outside React/Jotai** — a plain external store (`subscribe`/`getSnapshot`/`emit`). The React edge bridges it with `useSyncExternalStore`. The store emits raw `{ key, reason, attemptCount, lastAttemptAt }` records; the detail dialog currently renders them verbatim as JSON (a diagnostic surface). Any humanization (friendly labels, engine-specific vocabulary via `useTranslations`) is a React-edge concern if added later — the store stays raw and React-free.
4. **Failures are classified by a pure `classifyStorageError` function**`QuotaExceededError` and security/access errors are terminal; everything else (including unknown `error.name`) is retryable up to a cap, then terminal. Retryable-by-default is the safer error: a transient wrongly called terminal loses recoverable data, while a terminal wrongly retried only wastes a few capped backoff cycles.
5. **A per-key write queue** coalesces rapid successive writes, retries transient failures with backoff, and reports outcomes into the status store. Its `flush` body is the only place that touches IndexedDB.

### Internal seams (deep ≠ monolithic)

The depth is hidden behind composition, not crammed into one file: `classifyStorageError` (pure taxonomy) · per-key write queue (coalesce + retry, knows only a `flush` thunk) · the merge-flush (the re-read-diff-write from the cross-tab ADR) · the global status store (write-only from the queue's side). `atomWithLocalForage` composes them.

## Considered Options

- **Storage-layer ownership + global status store (chosen).** Smallest caller-facing interface, single source of truth, React-free core. Collapses the per-site model and the engine-vocabulary-outside-hooks problem (the layer stays raw; React humanizes).
- **Keep the per-call-site `.catch`/`logAndNotify` model.** Rejected: re-derives handling and vocabulary at every site; status is scattered, not a single source of truth.
- **Status as a Jotai atom.** Rejected: every persisted atom would spawn status wiring that must aggregate to one global value, and it would join the fragile top-level-`await` preload dance in `storageAtoms.ts`. A plain store has no mount ordering and is the correct primitive for an external mutable source.
- **Per-key status (name which collection failed).** Rejected: not actionable for the user (the layer retries on their behalf); adds the engine-specific vocabulary burden to the core; the detail view covers the diagnostic need.

## Consequences

- **The ~36 call sites lose their floating-promise handling** — the interim convention is superseded, and that migration is expected to be largely undone. Net caller code shrinks.
- **Tests can no longer await a per-write promise.** They synchronize on a layer-level `persistenceStatusStore.waitForIdle()` derived from the status store — a better signal that survives coalescing/retry and exercises the production status path.
- **Status is strictly per-tab.** Consistent with the substrate ADR's "no cross-tab sync primitives," a failed write in one tab shows `failed` there regardless of other tabs, and clears only when that tab itself successfully flushes the key. Honest about whose edit is at risk.
- **Recovery is retry + backup, not a write guarantee.** The failure indicator opens a detail dialog; for terminal-quota failures the dialog offers a full configuration backup (`saveLocalForageToFile`), which is read-mostly and so remains viable under quota pressure. Terminal-access failures (private mode, blocked) offer no backup — IndexedDB never opened, so there is nothing to read. We do **not** block reload (`beforeunload`).
- **This effort and the cross-tab merge are orthogonal**, meeting only at the `flush` seam — either can ship first without blocking the other.
- **Every IndexedDB write routes through one shared queue, including the Active Connection breadcrumb.** The queue/status are not embedded in `atomWithLocalForage`; they live in a shared `persistThroughQueue` helper that wraps the localForage write itself. Both `atomWithLocalForage` and the per-tab active-connection path (whose synchronous sessionStorage write stays outside the queue) call it, so all IndexedDB writes feed one global status. This dropped the special-case that would have excluded the breadcrumb.
- **The `void`-setter change lives in the shared `createWriteThroughAtom`**, which the active-connection path also uses. No production call site awaited the old promise (only tests did), so the migration was mechanical. The test seam moved to a layer-level `waitForIdle()` on the status store, replacing per-write promise awaits across the persistence test helpers and the active-connection tests.
12 changes: 12 additions & 0 deletions docs/guides/troubleshooting.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ This page contains workarounds for common issues and information on how to diagn
- [Docker Container Issues](#docker-container-issues)
- [Schema Sync Fails](#schema-sync-fails)
- [Save & Load Configuration](#save--load-configuration)
- [Graph Explorer Can't Save Your Changes](#graph-explorer-cant-save-your-changes)
- [Gathering SageMaker Logs](#gathering-sagemaker-logs)

## Docker Container Issues
Expand Down Expand Up @@ -131,6 +132,17 @@ To save the configuration data:

This configuration can be restored using the "Load Configuration" button in the same settings page.

## Graph Explorer Can't Save Your Changes

Graph Explorer stores all of your data — connections, schema, styling, and exploration sessions — in your browser, not on the server. When it can't write that data to browser storage, a "Changes not saved" indicator appears in the navigation bar. Click it to see which data failed to save and why.

There are two common causes:

- **Browser storage is full.** Graph Explorer has run out of space to save changes. The dialog offers a "Save Configuration" button — use it to export your data to a file (the same format as [Save & Load Configuration](#save--load-configuration)) so you don't lose your work, then free up browser storage and reload the page. You can restore the file afterward with "Load Configuration". Exploration sessions are the largest data and the most likely cause; clearing old sessions or other sites' storage frees space.
- **Storage is blocked or inaccessible.** Graph Explorer can't access browser storage at all, so changes won't be saved for the session. No backup is offered here, since the data can't be read out. This commonly happens in private/incognito windows or when browser settings or policy block storage for the site. Open Graph Explorer in a normal window, or allow storage for the site, to persist your work.

Transient storage hiccups are retried automatically, so the indicator only appears for failures that could not be recovered.

## Gathering SageMaker Logs

The Graph Explorer proxy server outputs log statements to standard out. By default, these logs will be forwarded to CloudWatch if the Notebook has the proper permissions.
Expand Down
2 changes: 1 addition & 1 deletion packages/graph-explorer/src/components/NavBar.tsx
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@ export function NavBarContent({
return (
<div
data-slot="nav-bar-content"
className={cn("flex flex-1 flex-row items-center gap-3", className)}
className={cn("flex flex-1 flex-row items-center gap-6", className)}
{...props}
/>
);
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,99 @@
// @vitest-environment happy-dom
import { act, render, screen } from "@testing-library/react";
import userEvent from "@testing-library/user-event";
import { afterEach, describe, expect, test } from "vitest";

import { TooltipProvider } from "@/components";
import { persistenceStatusStore } from "@/core/StateProvider/persistence";

import { PersistenceStatusIndicator } from "./PersistenceStatusIndicator";

function renderIndicator() {
return render(<PersistenceStatusIndicator />, { wrapper: TooltipProvider });
}

// The indicator reads the app-wide singleton store, so reset it between tests.
afterEach(() => {
act(() => persistenceStatusStore.reset());
});

function fail(key: string, reason: "terminal-quota" | "terminal-access") {
act(() =>
persistenceStatusStore.markFailed(key, reason, 1, {
name: "QuotaExceededError",
message: "storage full",
}),
);
}

describe("PersistenceStatusIndicator", () => {
test("shows nothing while idle", () => {
renderIndicator();

expect(
screen.queryByRole("button", { name: /changes not saved/i }),
).not.toBeInTheDocument();
});

test("shows nothing while a write is merely in flight", () => {
renderIndicator();

act(() => persistenceStatusStore.markSaving("configuration"));

expect(
screen.queryByRole("button", { name: /changes not saved/i }),
).not.toBeInTheDocument();
});

test("shows a standing indicator when a write fails terminally", () => {
renderIndicator();

fail("configuration", "terminal-access");

expect(
screen.getByRole("button", { name: /changes not saved/i }),
).toBeInTheDocument();
});

test("opens a detail dialog from the indicator", async () => {
const user = userEvent.setup();
renderIndicator();

fail("configuration", "terminal-access");
expect(screen.queryByRole("dialog")).not.toBeInTheDocument();

await user.click(
screen.getByRole("button", { name: /changes not saved/i }),
);

expect(screen.getByRole("dialog")).toHaveTextContent(/failed writes/i);
});

test("offers to save the configuration when storage is full", async () => {
const user = userEvent.setup();
renderIndicator();

fail("graph-sessions", "terminal-quota");
await user.click(
screen.getByRole("button", { name: /changes not saved/i }),
);

expect(
screen.getByRole("button", { name: /save configuration/i }),
).toBeInTheDocument();
});

test("offers no backup when storage is inaccessible", async () => {
const user = userEvent.setup();
renderIndicator();

fail("configuration", "terminal-access");
await user.click(
screen.getByRole("button", { name: /changes not saved/i }),
);

expect(
screen.queryByRole("button", { name: /save configuration/i }),
).not.toBeInTheDocument();
});
});
Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
import localforage from "localforage";
import { InfoIcon } from "lucide-react";

import { saveLocalForageToFile } from "@/core/StateProvider/localDb";
import { usePersistenceStatus } from "@/core/StateProvider/persistence/usePersistenceStatus";

import { Button } from "../Button";
import { CodeEditor } from "../CodeEditor";
import {
Dialog,
DialogBody,
DialogClose,
DialogContent,
DialogFooter,
DialogHeader,
DialogTitle,
DialogTrigger,
} from "../Dialog";
import { FormItem } from "../Form";
import { Label } from "../Label";

/**
* A standing warning that the app failed to write data to IndexedDB. Renders
* nothing while idle or saving — only a terminal failure surfaces, as a danger
* "Changes not saved" button that opens a detail dialog.
*
* The dialog is the single recovery surface: it shows the raw failure records
* and offers to save the configuration to a file when (and only when) storage
* is full — IndexedDB is still readable then, so a backup can capture whatever
* did persist. When storage is inaccessible (private mode, blocked) no backup is
* offered: the database never opened, so there is nothing to read.
*/
export function PersistenceStatusIndicator() {
const { status, failures } = usePersistenceStatus();

if (status !== "failed") {
return null;
}

const canBackUp = failures.some(
failure => failure.reason === "terminal-quota",
);

return (
<Dialog>
<DialogTrigger asChild>
<Button variant="danger" className="rounded-full">
Changes not saved
<InfoIcon />
</Button>
</DialogTrigger>
<DialogContent>
<DialogHeader>
<DialogTitle>Changes not saved</DialogTitle>
</DialogHeader>
<DialogBody>
<FormItem>
<Label>Details</Label>
<p className="text-base leading-snug">
{canBackUp
? "Your browser is out of storage. Save your configuration to a file so you don't lose your work, then free up space and reload."
: "Graph Explorer can't access browser storage, so these changes won't be saved this session. This often happens in private browsing or when storage is blocked."}
</p>
</FormItem>
<FormItem>
<Label>Failed writes</Label>
<div className="grid min-h-64 overflow-auto rounded-lg border bg-gray-50 shadow-xs">
<CodeEditor
defaultLanguage="json"
value={JSON.stringify(failures, null, 2)}
options={{
readOnly: true,
ariaLabel: "Persistence failure details",
// Matches current tailwind padding of 2 or 0.5rem
padding: { top: 7, bottom: 7 },
}}
wrapperProps={{ className: "pl-2" }}
/>
</div>
</FormItem>
</DialogBody>
<DialogFooter>
{canBackUp ? (
<Button
variant="primary"
onClick={() => void saveLocalForageToFile(localforage)}
>
Save Configuration
</Button>
) : null}
<DialogClose asChild>
<Button variant={canBackUp ? "secondary" : "primary"}>Close</Button>
</DialogClose>
</DialogFooter>
</DialogContent>
</Dialog>
);
}
2 changes: 2 additions & 0 deletions packages/graph-explorer/src/components/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,8 @@ export * from "./NavBar";
export * from "./NoEdgeTypesEmptyState";
export * from "./NoNodeTypesEmptyState";

export * from "./PersistenceStatusIndicator/PersistenceStatusIndicator";

export * from "./Toaster";

export * from "./RouteButton";
Expand Down
Loading
Loading