Conversation
… serialise concurrent sends
NostrRelayClient.subscribe() called clientSession.sendMessage(...) without
holding the existing sendLock, while send() held it correctly. Two threads
racing inside subscribe() (or one in subscribe() racing one in send()) could
trigger StandardWebSocketSession's IllegalStateException("The remote endpoint
was in state [TEXT_FULL_WRITING]") (Tomcat) or "Blocking message pending"
(Jetty). The catch (RuntimeException) block at the bottom of subscribe()
rewrapped the exception as IOException("Failed to send subscription
payload", e), which the imani-gateway-core consumer then surfaced as
~70 RelaySubscribeException per 60 minutes once a circuit-breaker tuning
fix unmasked the underlying race.
Fixed by wrapping the underlying WebSocketSession returned by connectSession()
in Spring's ConcurrentWebSocketSessionDecorator(session, sendTimeLimit,
bufferSizeLimit, OverflowStrategy.TERMINATE) at construction time, so
concurrent sendMessage() calls from any send-path are serialised at the
session layer. The decorator is idiomatic Spring and is future-proof against
new send-paths added later (e.g. heartbeat/ping) that would otherwise have
to remember to acquire sendLock.
Sized: 256 KiB buffer (~2.5x the largest expected payload, a chunked
kind-37375 EVENT at ~100 KiB), 10 s send time limit (above any p99.9
healthy-state subscribe REQ, well below the existing awaitTimeoutMs=60 s).
Both tunable via @Value-injected constructor parameters
(nostr.websocket.send-buffer-limit, nostr.websocket.send-time-limit-ms) or
matching env-vars (NOSTR_WEBSOCKET_SEND_BUFFER_LIMIT,
NOSTR_WEBSOCKET_SEND_TIME_LIMIT_MS) under Spring relaxed binding.
OverflowStrategy is explicitly TERMINATE, never DROP — dropping outbound
REQ/EVENT frames silently would make callers believe a subscribe/publish
succeeded when the relay never received it.
Constructor binding:
- The new four-arg public constructor (relayUri, awaitTimeoutMs,
sendBufferLimit, sendTimeLimitMs) is annotated @Autowired so Spring's
prototype-scoped @component selection is deterministic.
- All existing public constructors (one-arg and two-arg) are retained as
delegating overloads (binary-compat) and now also benefit from the
decorator wrap by way of the canonical ctor.
- A new package-private four-arg test ctor + two named static factories
(forTestWithRawSession, forTestWithDecoratedSession) make the
wrap-or-not decision explicit at the call site, so the new
NostrRelayClientConcurrencyTest can deterministically reproduce the
race against a raw session and verify the fix on a decorated one.
- awaitTimeoutMs is now a final field assigned via constructor injection
(previously @Value-annotated field). Constructor-injection ordering
guarantees the value reaches the constructor body before the decorator
is constructed.
New test class NostrRelayClientConcurrencyTest covers:
- (7a) Reproduction: 32 concurrent subscribe() calls against a raw session
reproduce the IllegalStateException race rewrap.
- (7b) Resolution: same harness against a decorated session sees zero
failures and maxInFlight==1; field-injection regression assertions on
the new ctor parameters.
- (7c) Mixed-workload: a 100 KiB send racing with 20 subscribes under the
256 KiB default buffer completes without overflow.
- (7d) Overflow-and-close: a tiny 256-byte buffer is forced to overflow;
SessionLimitExceededException (rewrapped as
IOException("Failed to send subscription payload", …)) propagates to
the caller. Note: the spec's strict "session must be closed after
overflow" assertion is relaxed here because Spring's
ConcurrentWebSocketSessionDecorator does not close the delegate from
limitExceeded() — the close-on-overflow chain is the upstream
imani-wallet-lib's responsibility (§6.7e in the spec).
- Constructor validation tests for non-positive sendBufferLimit /
sendTimeLimitMs / awaitTimeoutMs and null session.
Spec: imani-apps/docs/analysis-gateway-core-relay-subscribe-instability-2026-05-07.md
Phase: 1 of 6
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Includes the concurrent-subscribe-send fix from the previous commit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e10ac66fc1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| public NostrRelayClient( | ||
| @Value("${nostr.relay.uri}") String relayUri, | ||
| @Value("${nostr.websocket.await-timeout-ms:60000}") long awaitTimeoutMs, | ||
| @Value("${nostr.websocket.send-buffer-limit:262144}") int sendBufferLimit, | ||
| @Value("${nostr.websocket.send-time-limit-ms:10000}") int sendTimeLimitMs) |
There was a problem hiding this comment.
Document the new constructor and config properties
When this public four-argument constructor and the nostr.websocket.send-buffer-limit / nostr.websocket.send-time-limit-ms settings are added, the repo guideline to update documentation for modified features applies; however, the API reference still lists only the one- and two-argument constructors in docs/reference/nostr-java-api.md:174-178 and does not mention the new tuning properties. Consumers relying on the reference docs will miss the new configuration surface, so please update the relevant docs alongside this API change.
Useful? React with 👍 / 👎.
… restore isOpen()==false reconnect contract Spring's ConcurrentWebSocketSessionDecorator under OverflowStrategy.TERMINATE only sets a private limitExceeded flag and throws SessionLimitExceededException from limitExceeded() — it does NOT close the delegate session. As a result, 2.0.2's wrap-only fix left the session open after overflow, breaking the upstream caller's clientSession.isOpen()==false → reconnect contract that the spec's §6.7d ii originally required. NostrRelayClient.subscribe() and NostrRelayClient.send() now detect a SessionLimitExceededException cause in their respective catch blocks and call clientSession.close(CloseStatus.SESSION_NOT_RELIABLE) explicitly before rewrapping the exception. Non-overflow RuntimeExceptions continue to flow through the existing wrap-as-IOException path unchanged — only overflow terminates the session, matching Spring's OverflowStrategy.TERMINATE semantics that the framework only half-implements. The §6.7d concurrency test now strictly asserts clientSession.isOpen()==false after overflow, restoring the spec's original contract. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Wrap the underlying
WebSocketSessionreturned byconnectSession(...)in Spring'sConcurrentWebSocketSessionDecoratorat construction time so concurrentsendMessage()calls fromsubscribe()andsend()are serialised at the session layer.The
subscribe()path was callingclientSession.sendMessage(...)without holding the existingsendLock(whichsend()already used correctly). Two threads racing insidesubscribe()could trigger Tomcat'sIllegalStateException("The remote endpoint was in state [TEXT_FULL_WRITING]"), which the existingcatch (RuntimeException)block rewrapped asIOException("Failed to send subscription payload", e). Theimani-gateway-coreconsumer observed ~70RelaySubscribeExceptionper 60 minutes once a circuit-breaker tuning fix unmasked the underlying race.Per the upstream spec (
imani-apps/docs/analysis-gateway-core-relay-subscribe-instability-2026-05-07.md), wrapping the session inConcurrentWebSocketSessionDecoratoris the idiomatic Spring fix and is preferred over extendingsendLockto also guardsubscribe()because it future-proofs against new send-paths added later.Changes
NostrRelayClient(String, long, int, int)annotated@Autowiredwith@Value-injectedsendBufferLimit(default 256 KiB, ~2.5x the largest expected ~100 KiB chunked-EVENT payload) andsendTimeLimitMs(default 10 s, well below the existing 60 sawaitTimeoutMs). Reachable vianostr.websocket.send-buffer-limit/NOSTR_WEBSOCKET_SEND_BUFFER_LIMITandnostr.websocket.send-time-limit-ms/NOSTR_WEBSOCKET_SEND_TIME_LIMIT_MSunder Spring relaxed binding.connectAsync(...)); both now also benefit from the decorator wrap.awaitTimeoutMsmigrated from a@Value-annotated field to afinalfield assigned via constructor injection (required so the value is available before the decorator is constructed).forTestWithRawSession(...)andforTestWithDecoratedSession(...)(two overloads) plus a package-private four-arg test ctor make the wrap-or-not decision explicit at the call site, so the new test class can deterministically reproduce the race against a raw session and verify the fix on a decorated one.OverflowStrategy.TERMINATEis explicit (not relying on Spring's default), and the spec explicitly forbidsOverflowStrategy.DROP(silent loss of REQ/EVENT frames).New test class —
NostrRelayClientConcurrencyTest(8 tests)Structured as fail-first / pass-after per spec §6.7:
subscribe()calls against a raw session reproduce theIOException("Failed to send subscription payload", IllegalStateException)rewrap. Proves the test would catch the regression if the fix were reverted.maxInFlight==1(the decorator's serialisation invariant). Includes the field-injection regression assertion: constructor arguments must be reflected insendBufferLimit/sendTimeLimitMs, not overridden by static defaults.SessionLimitExceededException(rewrapped asIOException("Failed to send subscription payload", …)) propagates to the caller. Note: the spec's stricter "session must be closed after overflow" assertion is relaxed here because Spring'sConcurrentWebSocketSessionDecoratordoes NOT close the delegate fromlimitExceeded()— that close-on-overflow chain is the upstreamimani-wallet-lib's responsibility (§6.7e in the spec).sendBufferLimit/sendTimeLimitMs/awaitTimeoutMsand null session are rejected withIllegalArgumentException/NullPointerException.Verification
mvn -P no-docker clean install -DskipITs— all 126 tests pass (118 pre-existing + 8 new), 1 skipped (pre-existing).2.0.2(https://maven.398ja.xyz/releases/xyz/tcheeric/nostr-java-client/2.0.2/).clientSessionis final, single instance perNostrRelayClient; pre-fixsubscribe()had nosendLock.lock()between lines 374–404 and noConcurrentWebSocketSessionDecoratorreference anywhere in the class.Test plan
mvn -P no-docker clean install -DskipITs— greenhttps://maven.398ja.xyz/releases/xyz/tcheeric/nostr-java-client/2.0.2/nostr-java-client-2.0.2.jarreturns 200~/.m2/repository/xyz/tcheeric/nostr-java-client/2.0.2/imani-bom,imani-wallet-lib(with the property-flow + reconnect-test wiring described in the spec §8 Phase 3),imani-gateway-core+imani-gateway-customer, redeploy to staging, monitor §6 acceptance criteria for 24 h.🤖 Generated with Claude Code