XIP-83: Mutable subscription streams with liveness #139

Open
tylerhawkes wants to merge 2 commits into main from tyler/xip-83-mutable-subscription-streams

Conversation

@tylerhawkes
Contributor

@tylerhawkes tylerhawkes commented Jun 4, 2026

Draft XIP, opened for circulation/discussion.

Summary

Defines a single bidirectional subscription RPC (Subscribe) on the MLS API. A client opens one long-lived stream and mutates its topic set in place (add/remove deltas) instead of tearing down and reopening on every membership change. The server delivers messages plus a periodic liveness heartbeat (StatusUpdate{ WAITING }) so clients can detect silent stream death that transport keepalives miss: a terminating L7 proxy can keep answering HTTP/2 pings at the edge after the backend subscription is gone. One connection can carry the union of many topics, which is the enabling primitive for multi-tenant agent gateways.
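
As a rough illustration of the in-place mutation described above, the sketch below computes the add/remove delta a client might send when its desired topic set changes. `SubscribeDelta` and `compute_delta` are hypothetical names used only for illustration; they are not the XIP's protobuf definitions.

```python
from dataclasses import dataclass, field


@dataclass
class SubscribeDelta:
    """Hypothetical stand-in for one client-to-server frame on the stream."""
    add: dict = field(default_factory=dict)     # topic -> last_seen_id to resume from
    remove: list = field(default_factory=list)  # topics to stop receiving


def compute_delta(current: set, desired: dict) -> SubscribeDelta:
    """Return the delta that turns the `current` topic set into `desired`.

    `desired` maps each wanted topic to a cursor (last_seen_id). Topics that
    are already subscribed are left alone, so the stream is never reopened.
    """
    add = {topic: cursor for topic, cursor in desired.items() if topic not in current}
    remove = sorted(current - set(desired))
    return SubscribeDelta(add=add, remove=remove)


delta = compute_delta({"t1", "t2"}, {"t2": 0, "t3": 41})
print(delta)  # SubscribeDelta(add={'t3': 41}, remove=['t1'])
```

Only the difference goes on the wire: `t2` is untouched, `t3` is added with its cursor, and `t1` is removed.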

Compatibility

Additive and backward-compatible: existing SubscribeGroupMessages / SubscribeWelcomeMessages are untouched, and WASM/browser clients (no bidirectional gRPC) stay on them. A client calling Subscribe against a node that lacks it gets UNIMPLEMENTED and falls back.
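
The fallback described above could look roughly like the following. This is a minimal sketch under stated assumptions: `RpcError`, `open_subscription`, and the two opener callables are hypothetical, and a real gRPC client may only surface the UNIMPLEMENTED status on the first read from the stream rather than when the call is opened.

```python
class RpcError(Exception):
    """Hypothetical transport error carrying a gRPC-style status code name."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def open_subscription(open_mutable, open_legacy):
    """Try the mutable Subscribe stream; fall back only on UNIMPLEMENTED."""
    try:
        return "mutable", open_mutable()
    except RpcError as err:
        if err.code != "UNIMPLEMENTED":
            raise  # other failures are not a capability signal
        return "legacy", open_legacy()


def old_node_subscribe():
    # Simulates a node that predates the Subscribe RPC.
    raise RpcError("UNIMPLEMENTED")


mode, stream = open_subscription(old_node_subscribe, lambda: ["legacy-stream"])
print(mode)  # legacy
```

Treating only UNIMPLEMENTED as the fallback signal keeps transient errors from silently downgrading a client to the legacy path.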

Status

Draft, for discussion. The client-side liveness floor — a WatchdogStream combinator that reconnects a stale subscription from its persisted cursor — is already implemented in libxmtp against the existing server-streaming subscriptions (xmtp/libxmtp#3718). The Subscribe RPC + node heartbeat handler standardized here are the remaining protocol work.
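
To make the watchdog idea concrete, here is a minimal sketch of a liveness tracker that remembers per-topic cursors and reports when the stream has gone quiet for too long. It is not the libxmtp `WatchdogStream` implementation; the class name, the 90-second timeout, and the injected `now` timestamps are assumptions chosen so the example is deterministic.

```python
class Watchdog:
    """Tracks the last frame time and the highest message id seen per topic."""

    def __init__(self, timeout_s: float, now: float):
        self.timeout_s = timeout_s
        self.last_frame_at = now
        self.cursors = {}  # topic -> highest message id seen so far

    def on_heartbeat(self, now: float) -> None:
        # Any frame, including a heartbeat, proves the stream is alive.
        self.last_frame_at = now

    def on_message(self, topic: str, msg_id: int, now: float) -> None:
        self.last_frame_at = now
        self.cursors[topic] = max(msg_id, self.cursors.get(topic, 0))

    def is_stale(self, now: float) -> bool:
        return now - self.last_frame_at > self.timeout_s

    def resume_request(self) -> dict:
        """Topics and cursors to resubscribe with after a reconnect."""
        return dict(self.cursors)


wd = Watchdog(timeout_s=90.0, now=0.0)
wd.on_message("t1", 7, now=10.0)
wd.on_heartbeat(now=40.0)
print(wd.is_stale(now=100.0))  # False
print(wd.is_stale(now=131.0))  # True
print(wd.resume_request())     # {'t1': 7}
```

When `is_stale` returns true, the client would drop the stream and resubscribe using `resume_request()`, so no messages after the persisted cursor are skipped.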

Note

Add XIP-83 specification for mutable subscription streams with liveness

Adds xip-83-mutable-subscription-streams.md, a new protocol specification describing bidirectional mutable subscription streams with application-level keepalive heartbeats. The spec covers the protocol overview, protobuf definitions, server/client requirements, rationale, backward compatibility, test cases, and security considerations. Also adds "keepalive" and "keepalives" to the cspell.json dictionary.

📊 Macroscope summarized 59dc043. 1 file reviewed, 0 issues evaluated, 0 issues filtered, 0 comments posted

🗂️ Filtered Issues

No issues evaluated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@tylerhawkes tylerhawkes requested a review from a team as a code owner June 4, 2026 17:25
Comment on lines +148 to +150
3. The node MUST process `add`/`remove` deltas that arrive **after** the initial request, mutating
the live subscription **without** terminating or reopening the stream. Removed topics MUST stop
being delivered; added topics MUST follow rule (2).

🟡 Medium XIPs/xip-83-mutable-subscription-streams.md:148

The add/remove protocol at lines 148-150 lacks an acknowledgement or ordering barrier, so the take-effect time of a delta is undefined. With concurrent delivery, the server may already have t1 messages buffered when it processes remove:[t1], making it impossible to satisfy test case 5's requirement that the client "MUST stop receiving t1 messages." Implementations will diverge: some will leak post-remove messages, others will drop in-flight ones, producing nondeterministic cross-node behavior.

-  3. The node MUST process `add`/`remove` deltas that arrive **after** the initial request, mutating
-    the live subscription **without** terminating or reopening the stream. Removed topics MUST stop
-    being delivered; added topics MUST follow rule (2).
+  3. The node MUST process `add`/`remove` deltas that arrive **after** the initial request, mutating
+    the live subscription **without** terminating or reopening the stream. The node MUST send a
+    `StatusUpdate{ SUBSCRIPTION_UPDATED }` acknowledging each processed delta before delivering any
+    messages under the new subscription state. Removed topics MUST stop being delivered after this
+    acknowledgement; added topics MUST follow rule (2).
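
A sketch of how a client could use the proposed acknowledgement as an ordering barrier: messages for a removed topic that arrive before the ack are treated as legitimately in flight, and any that arrive after it are flagged. The tuple frame encoding and the `split_at_barrier` helper are assumptions for illustration only.

```python
def split_at_barrier(frames, pending_remove):
    """Partition message frames around a SUBSCRIPTION_UPDATED acknowledgement.

    `frames` is an iterable of ("message", topic, msg_id) or ("status", name)
    tuples in arrival order. `pending_remove` holds topics whose removal has
    been sent but not yet acknowledged.
    Returns (delivered, violations) as lists of (topic, msg_id).
    """
    pending = set(pending_remove)
    removed = set()
    delivered, violations = [], []
    for frame in frames:
        if frame[0] == "status":
            if frame[1] == "SUBSCRIPTION_UPDATED":
                removed |= pending  # the removal is now in effect
                pending.clear()
            continue
        _, topic, msg_id = frame
        if topic in removed:
            violations.append((topic, msg_id))  # arrived after the barrier
        else:
            delivered.append((topic, msg_id))
    return delivered, violations


frames = [
    ("message", "t1", 5),               # buffered before the ack: still valid
    ("status", "SUBSCRIPTION_UPDATED"),
    ("message", "t2", 9),
    ("message", "t1", 6),               # after the ack: should not happen
]
delivered, violations = split_at_barrier(frames, {"t1"})
print(delivered)   # [('t1', 5), ('t2', 9)]
print(violations)  # [('t1', 6)]
```

With the barrier in place, both sides agree on exactly which `t1` message is the last valid one, which makes test case 5 checkable.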

Comment on lines +152 to +155
a bounded interval. The interval is server-controlled and RECOMMENDED to be **≤ 30 seconds**. The
idle timer MUST reset whenever any `Messages` or other frame is delivered, so heartbeats add **no
per-message overhead** and impose **no per-topic broadcast** — they are a property of the
connection, not of any conversation.

🟡 Medium XIPs/xip-83-mutable-subscription-streams.md:152

The keepalive_interval_ms field is OPTIONAL in the protobuf, and the server heartbeat interval is only recommended to be ≤ 30 seconds, not required. A compliant server could send heartbeats every 60 seconds while omitting the field, yet clients following line 167 assume a 30-second default and would declare the stream dead at 60–90 seconds, triggering infinite false reconnects. Consider requiring servers to either (a) always include keepalive_interval_ms when heartbeats are sent, or (b) mandate a hard upper bound on the heartbeat interval so the default assumption holds.

-4. The node MUST emit a `StatusUpdate{ WAITING }` heartbeat whenever no other frame has been sent for
-  a bounded interval. The interval is server-controlled and RECOMMENDED to be **≤ 30 seconds**. The
-  idle timer MUST reset whenever any `Messages` or other frame is delivered, so heartbeats add **no
-  per-message overhead** and impose **no per-topic broadcast** — they are a property of the
-  connection, not of any conversation.
+4. The node MUST emit a `StatusUpdate{ WAITING }` heartbeat whenever no other frame has been sent for
+  a bounded interval. The interval MUST be **≤ 30 seconds**, and the node MUST include
+  `keepalive_interval_ms` in every `StatusUpdate` so clients can derive accurate watchdog thresholds.
+  The idle timer MUST reset whenever any `Messages` or other frame is delivered, so heartbeats add
+  **no per-message overhead** and impose **no per-topic broadcast** — they are a property of the
+  connection, not of any conversation.
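
The failure mode in this comment can be checked with a little arithmetic. The sketch below assumes a client that declares the stream dead after a fixed number of missed intervals; the multiplier of 2 (the low end of the 60 to 90 second range mentioned above) and the function names are assumptions, not values taken from the XIP.

```python
DEFAULT_INTERVAL_MS = 30_000  # default the client assumes when the field is absent
MISSED_BEATS = 2              # assumed multiplier: 2 x 30 s = 60 s threshold


def watchdog_threshold_ms(keepalive_interval_ms=None, missed_beats=MISSED_BEATS):
    """Silence, in ms, after which the client treats the stream as dead."""
    # A missing field and 0 (the proto3 default) both fall back to the default.
    interval = keepalive_interval_ms or DEFAULT_INTERVAL_MS
    return interval * missed_beats


def would_false_reconnect(server_interval_ms, advertised_ms=None):
    """True if an idle but healthy stream would trip the client's watchdog."""
    return server_interval_ms >= watchdog_threshold_ms(advertised_ms)


# Server heartbeats every 60 s but omits keepalive_interval_ms.
print(would_false_reconnect(60_000))          # True
# Same server, but it advertises its interval.
print(would_false_reconnect(60_000, 60_000))  # False
# Server honours a hard 30 s bound, field omitted.
print(would_false_reconnect(30_000))          # False
```

Either remedy proposed in the comment, always advertising the interval or enforcing the 30-second bound, makes the false-reconnect case disappear.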

2. For each `TopicFilter` in `add`, the node MUST deliver messages with id greater than
`last_seen_id` (or from the live edge if `last_seen_id == 0`), performing catch-up from history
then transitioning to live delivery, and MUST NOT deliver an id at or below a cursor it has
already advanced past for that topic on this stream (no duplicates across catch-up/live).

🟡 Medium XIPs/xip-83-mutable-subscription-streams.md:147

Server requirement 2 forbids redelivering IDs below the highest cursor ever seen for a topic on the same stream. After a remove and re-add, the server cannot honor a lower last_seen_id in the new TopicFilter, so clients requesting replay from an older durable checkpoint silently miss messages in that window.

-and MUST NOT deliver an id at or below a cursor it has already advanced past for that topic on this stream (no duplicates across catch-up/live).
+and MUST NOT deliver an id at or below a cursor it has already advanced past for that topic on this stream *unless* the topic was removed and is being re-added with an explicit `last_seen_id`, in which case the cursor is reset to the requested value (allowing intentional replay).
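
One way to picture the suggested semantics is a per-stream cursor table that never moves backwards while a topic is subscribed, but is reset when a removed topic is re-added. This is a hypothetical sketch (`StreamCursors` is not a name from the XIP), and it leaves out the live-edge handling for `last_seen_id == 0`.

```python
class StreamCursors:
    """Per-stream delivery cursors with reset-on-re-add semantics."""

    def __init__(self):
        self._cursor = {}     # topic -> highest id delivered on this stream
        self._active = set()  # topics currently subscribed

    def add(self, topic: str, last_seen_id: int) -> None:
        if topic in self._active:
            # Still subscribed: the cursor never moves backwards.
            self._cursor[topic] = max(self._cursor[topic], last_seen_id)
        else:
            # Fresh add, or re-add after remove: honor the requested cursor.
            self._cursor[topic] = last_seen_id
            self._active.add(topic)

    def remove(self, topic: str) -> None:
        self._active.discard(topic)

    def should_deliver(self, topic: str, msg_id: int) -> bool:
        if topic not in self._active or msg_id <= self._cursor[topic]:
            return False
        self._cursor[topic] = msg_id
        return True


c = StreamCursors()
c.add("t1", 10)
print(c.should_deliver("t1", 11))  # True
print(c.should_deliver("t1", 11))  # False (no duplicate)
c.remove("t1")
c.add("t1", 5)                     # replay from an older checkpoint
print(c.should_deliver("t1", 6))   # True
```

Without the reset branch in `add`, the final call would return `False`, which is the silent gap the comment describes.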

@macroscopeapp

macroscopeapp Bot commented Jun 4, 2026

Approvability

Verdict: Needs human review

3 blocking correctness issues found. This XIP specification document is owned by @jhaaaa, not the PR author, and there are 3 unresolved review comments identifying substantive protocol ambiguities (delta acknowledgment ordering, keepalive interval bounds, cursor reset semantics) that warrant owner review before merging.

Contributor

neekolas commented Jun 4, 2026

Love this direction. Would be a huge unlock for Herald and also better for all mobile apps.

Main concern is that we have an even-more-different code path for the browser and everything else, and that browser issues can go unnoticed. Not a blocker, just an unfortunate side effect. Also means that our client streaming implementation needs to handle both mutable and immutable streams, which makes things harder to maintain. Such is life.
