Skip to content

Fix persistent login: /session race + stored brokerOrigin + StrictMode duplicate#33

Draft
MichielMAnalytics wants to merge 3 commits into
chad:mainfrom
MichielMAnalytics:fix/persistent-login-oauth-dev
Draft

Fix persistent login: /session race + stored brokerOrigin + StrictMode duplicate#33
MichielMAnalytics wants to merge 3 commits into
chad:mainfrom
MichielMAnalytics:fix/persistent-login-oauth-dev

Conversation

@MichielMAnalytics

@MichielMAnalytics MichielMAnalytics commented May 19, 2026

Copy link
Copy Markdown

Three small fixes for AT Protocol persistent-login. Each commit is self-contained; happy to split if easier to review.

Draft because this is my first contribution against this repo and I'd love a sanity check before marking ready.


1. fix(broker): serialize concurrent /session calls per broker_token

AT Protocol refresh tokens are single-use: the PDS rotates the refresh_token on each /token call and immediately invalidates the previous one. The broker stores the rotated value, so serial /session callers each read the latest token and rotate again.

Concurrent /session calls for the same broker_token break this:

  • Both callers read the same refresh_token R from the DB.
  • Caller A wins at the PDS: gets new token R', broker stores R'.
  • Caller B sends R: PDS replies invalid_grant "Refresh token replayed". Broker returns 502 to B without touching the DB.

The web client retries /session once on 502, and React 18 StrictMode's effect double-invocation means each refresh fires the /session effect twice in parallel — kicking the race off again. After a few rounds the stored refresh_token has been rotated to a value the PDS has since invalidated, and every subsequent /session returns 502 forever — the session is bricked until full re-OAuth.

Fix: a per-broker_token Mutex in BrokerState. Concurrent callers for the same token queue on the same Mutex, so the second one sees R' (already rotated) instead of racing on R. Different broker_tokens are independent, so /session concurrency across users is preserved. Bounded with an opportunistic GC pass (entries with Arc::strong_count == 1 get dropped on the next /session call), so session_locks stays proportional to concurrent refresh callers rather than the cumulative set of tokens served.

Also adds tracing::info!/warn! on the /session path so this kind of failure is visible without source diving.

Repro before patch: open the web client, log in via AT Protocol, refresh the page twice. Second refresh shows the login screen; broker log shows Refresh token replayed followed by permanent Invalid refresh token until you re-OAuth.

2. fix(web): ignore stored brokerOrigin equal to webOrigin

ConnectScreen initialises brokerOrigin from localStorage unconditionally. The broker session-refresh effect, however, treats a brokerOrigin equal to webOrigin as "no external broker — server handles auth directly" and short-circuits without calling /session, silently dropping persistent login.

This strands any user whose previous session ran in single-server mode: even after the deployment grows a real broker, the stored value wins the precedence check and refresh never works again, forcing manual localStorage cleanup. Treat stored === webOrigin as absent so the broker-derivation rules below get a chance to run.

3. fix(web): in-flight guard against parallel /session calls

Belt-and-suspenders with #1. Once the broker handles concurrent callers correctly, the SPA itself still fires two /session calls per refresh (React 18 StrictMode double-invokes the effect in dev). Both now succeed thanks to the broker mutex, but each invokes setSaslCredentials + connect(), causing a visible "Establishing session" flash mid-load when the second response lands.

A module-level sessionRefreshInFlight flag — same pattern as the existing oauthConnectInProgress next to it — short-circuits the duplicate before the fetch. Reset in .finally() so failures don't permanently block future refreshes. The broker mutex remains the source-of-truth defence; this is just the SPA being a well-behaved caller.


Test plan

  • Reproduced the /session race with React StrictMode + two quick refreshes; observed Refresh token replayed → permanent 502s in broker log.
  • With all three patches applied, completed AT Protocol OAuth end-to-end on a fresh local clone and refreshed the page 20+ times — single PDS round-trip per refresh, no login-screen fallback, no mid-load flash.
  • Existing tests still pass (couldn't run the full suite locally — would appreciate CI confirmation).
  • Broker /session behaviour unchanged when calls are serial (per-token mutex is uncontended in that case).

@MichielMAnalytics MichielMAnalytics force-pushed the fix/persistent-login-oauth-dev branch from eb9ec23 to b998381 Compare May 19, 2026 09:26
@MichielMAnalytics MichielMAnalytics changed the title Fix persistent login: /session race + stored brokerOrigin + dev OAuth metadata Fix persistent login: /session race + stored brokerOrigin guard May 19, 2026
MichielMAnalytics and others added 3 commits May 19, 2026 09:35
AT Protocol refresh tokens are single-use: the PDS rotates the
refresh_token on each /token call and immediately invalidates the
previous one. The broker stores the rotated value, so on the happy
path each subsequent /session call reads the latest token from the
DB and rotates again.

Concurrent /session calls for the same broker_token break this:

  - Both callers read the same refresh_token R from the DB.
  - Caller A wins at the PDS: gets new token R', broker stores R'.
  - Caller B sends R: PDS replies invalid_grant "Refresh token
    replayed". Broker returns 502 to B without touching the DB.

That alone is recoverable (the DB now holds R'). But the web client
retries /session once on 502, and React StrictMode's effect double-
invocation means each refresh fires the /session effect twice in
parallel, kicking the race off again. After a few rounds the stored
refresh_token has been rotated to a value the PDS has since
invalidated, and every subsequent /session returns 502 forever — the
session is bricked until full re-OAuth.

Fix: a per-broker-token Mutex in BrokerState. Concurrent callers for
the same token queue on the same Mutex, so caller B sees R' (already
rotated) instead of racing on R. Different broker_tokens are
independent, so the per-token map preserves /session concurrency.

Also adds tracing::info!/warn! on the /session path so this kind of
failure is visible in logs without source diving.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ConnectScreen initialises `brokerOrigin` from `localStorage`
unconditionally. The broker session-refresh effect, however, treats
a brokerOrigin equal to webOrigin as "no external broker — server
handles auth directly" and short-circuits without trying /session,
silently dropping persistent login.

This combination strands any user whose previous session ran in
single-server mode: even after the deployment grows a real broker,
the stored value wins the precedence and refresh never works again,
forcing manual localStorage cleanup. Treat stored == webOrigin as
absent so the broker-derivation rules below get a chance to run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous commit handles concurrent /session calls correctly on
the broker side, but the SPA itself fires two of them per refresh:
React 18 StrictMode invokes the broker-session effect twice in dev,
and the existing module-level guards (`brokerAutoAttempts`,
`oauthConnectInProgress`) don't cover this path. Both fetches
succeed sequentially thanks to the broker's per-token mutex, but
each triggers `setSaslCredentials` + `connect()` and the second
landing flashes the "Establishing session" screen a second time
mid-load.

Add a synchronous module-level `sessionRefreshInFlight` flag —
matches the existing pattern at line 50 — so StrictMode's repeat
invocation short-circuits before the duplicate fetch. Reset in
`.finally()` so failures don't permanently block future refreshes.

The broker mutex remains the source-of-truth defence against any
client (mobile, Tauri, future) that might still race; this guard
is the SPA being a well-behaved caller.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@MichielMAnalytics MichielMAnalytics force-pushed the fix/persistent-login-oauth-dev branch from b998381 to 5fd5734 Compare May 19, 2026 09:54
@MichielMAnalytics MichielMAnalytics changed the title Fix persistent login: /session race + stored brokerOrigin guard Fix persistent login: /session race + stored brokerOrigin + StrictMode duplicate May 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant