Fix persistent login: /session race + stored brokerOrigin + StrictMode duplicate#33
Draft
MichielMAnalytics wants to merge 3 commits into
Draft
Conversation
eb9ec23 to
b998381
Compare
AT Protocol refresh tokens are single-use: the PDS rotates the
refresh_token on each /token call and immediately invalidates the
previous one. The broker stores the rotated value, so on the happy
path each subsequent /session call reads the latest token from the
DB and rotates again.
Concurrent /session calls for the same broker_token break this:
- Both callers read the same refresh_token R from the DB.
- Caller A wins at the PDS: gets new token R', broker stores R'.
- Caller B sends R: PDS replies invalid_grant "Refresh token
replayed". Broker returns 502 to B without touching the DB.
That alone is recoverable (the DB now holds R'). But the web client
retries /session once on 502, and React StrictMode's effect double-
invocation means each refresh fires the /session effect twice in
parallel, kicking the race off again. After a few rounds the stored
refresh_token has been rotated to a value the PDS has since
invalidated, and every subsequent /session returns 502 forever — the
session is bricked until full re-OAuth.
Fix: a per-broker-token Mutex in BrokerState. Concurrent callers for
the same token queue on the same Mutex, so caller B sees R' (already
rotated) instead of racing on R. Different broker_tokens are
independent, so the per-token map preserves /session concurrency.
Also adds tracing::info!/warn! on the /session path so this kind of
failure is visible in logs without source diving.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ConnectScreen initialises `brokerOrigin` from `localStorage` unconditionally. The broker session-refresh effect, however, treats a brokerOrigin equal to webOrigin as "no external broker — server handles auth directly" and short-circuits without trying /session, silently dropping persistent login. This combination strands any user whose previous session ran in single-server mode: even after the deployment grows a real broker, the stored value wins the precedence and refresh never works again, forcing manual localStorage cleanup. Treat stored == webOrigin as absent so the broker-derivation rules below get a chance to run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous commit handles concurrent /session calls correctly on the broker side, but the SPA itself fires two of them per refresh: React 18 StrictMode invokes the broker-session effect twice in dev, and the existing module-level guards (`brokerAutoAttempts`, `oauthConnectInProgress`) don't cover this path. Both fetches succeed sequentially thanks to the broker's per-token mutex, but each triggers `setSaslCredentials` + `connect()` and the second landing flashes the "Establishing session" screen a second time mid-load. Add a synchronous module-level `sessionRefreshInFlight` flag — matches the existing pattern at line 50 — so StrictMode's repeat invocation short-circuits before the duplicate fetch. Reset in `.finally()` so failures don't permanently block future refreshes. The broker mutex remains the source-of-truth defence against any client (mobile, Tauri, future) that might still race; this guard is the SPA being a well-behaved caller. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
b998381 to
5fd5734
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Three small fixes for AT Protocol persistent-login. Each commit is self-contained; happy to split if easier to review.
1.
fix(broker): serialize concurrent /session calls per broker_tokenAT Protocol refresh tokens are single-use: the PDS rotates the
refresh_tokenon each/tokencall and immediately invalidates the previous one. The broker stores the rotated value, so serial/sessioncallers each read the latest token and rotate again.Concurrent
/sessioncalls for the samebroker_tokenbreak this:refresh_tokenR from the DB.invalid_grant "Refresh token replayed". Broker returns 502 to B without touching the DB.The web client retries
/sessiononce on 502, and React 18 StrictMode's effect double-invocation means each refresh fires the/sessioneffect twice in parallel — kicking the race off again. After a few rounds the storedrefresh_tokenhas been rotated to a value the PDS has since invalidated, and every subsequent/sessionreturns 502 forever — the session is bricked until full re-OAuth.Fix: a per-
broker_tokenMutexinBrokerState. Concurrent callers for the same token queue on the same Mutex, so the second one sees R' (already rotated) instead of racing on R. Differentbroker_tokens are independent, so/sessionconcurrency across users is preserved. Bounded with an opportunistic GC pass (entries withArc::strong_count == 1get dropped on the next/sessioncall), sosession_locksstays proportional to concurrent refresh callers rather than the cumulative set of tokens served.Also adds
tracing::info!/warn!on the/sessionpath so this kind of failure is visible without source diving.Repro before patch: open the web client, log in via AT Protocol, refresh the page twice. Second refresh shows the login screen; broker log shows
Refresh token replayedfollowed by permanentInvalid refresh tokenuntil you re-OAuth.2.
fix(web): ignore stored brokerOrigin equal to webOriginConnectScreeninitialisesbrokerOriginfromlocalStorageunconditionally. The broker session-refresh effect, however, treats abrokerOriginequal towebOriginas "no external broker — server handles auth directly" and short-circuits without calling/session, silently dropping persistent login.This strands any user whose previous session ran in single-server mode: even after the deployment grows a real broker, the stored value wins the precedence check and refresh never works again, forcing manual
localStoragecleanup. Treatstored === webOriginas absent so the broker-derivation rules below get a chance to run.3.
fix(web): in-flight guard against parallel /session callsBelt-and-suspenders with #1. Once the broker handles concurrent callers correctly, the SPA itself still fires two
/sessioncalls per refresh (React 18 StrictMode double-invokes the effect in dev). Both now succeed thanks to the broker mutex, but each invokessetSaslCredentials+connect(), causing a visible "Establishing session" flash mid-load when the second response lands.A module-level
sessionRefreshInFlightflag — same pattern as the existingoauthConnectInProgressnext to it — short-circuits the duplicate before the fetch. Reset in.finally()so failures don't permanently block future refreshes. The broker mutex remains the source-of-truth defence; this is just the SPA being a well-behaved caller.Test plan
/sessionrace with React StrictMode + two quick refreshes; observedRefresh token replayed→ permanent 502s in broker log./sessionbehaviour unchanged when calls are serial (per-token mutex is uncontended in that case).