fix: gateway session data loss and restore deleted _delivery module by cursor[bot] · Pull Request #2102 · MervinPraison/PraisonAI

cursor · 2026-06-20T09:09:43Z

Summary

Critical bug scan found three high-severity regressions in recent commits. This PR applies minimal fixes with regression tests.

Bugs fixed

1. Gateway session resume drops conversation history (data loss)

Impact: With session persistence enabled, every disconnect → reconnect cycle restored an empty session instead of the latest conversation state.

Root cause: create_session() wrote an initial empty session_data system message on create and a full snapshot on close. Resume logic used break on the first matching system message, always selecting the empty initial snapshot.

Fix: Iterate all system messages and keep the latest session_data snapshot.

2. Gateway `stop()` loses in-flight sessions (data loss)

Impact: Graceful shutdown (SIGTERM / stop()) closed sessions in memory without persisting them, unlike WebSocket disconnect which calls close_session().

Root cause: stop() called session.close() directly instead of close_session(session_id).

Fix: Call close_session() for each active session during shutdown.

3. Missing `bots/_delivery.py` causes import crashes

Impact: from praisonai.bots import DurableDelivery, DurableAdapterMixin, or deliver_chunked raised ModuleNotFoundError.

Root cause: PR #2054 removed _delivery.py as "dead code", but _durable_adapter.py and bots/__init__.py still import from it.

Fix: Restore _delivery.py from pre-removal history.

Validation

Added tests/unit/gateway/test_gateway_persistence_fixes.py
All 4 new regression tests pass

Summary by CodeRabbit

New Features
- Enhanced message delivery system with streaming, chunking, and adaptive platform support.
- Durable message persistence with automatic retry, recovery, and idempotency key support.
Bug Fixes
- Fixed session recovery to use the latest persisted snapshot.
- Improved session shutdown to properly persist active sessions through the correct close flow.
Tests
- Added tests verifying session persistence and durable delivery functionality.

- Resume gateway sessions from the latest session_data snapshot instead of the initial empty one written at session creation - Persist active sessions on gateway stop() via close_session() - Restore bots/_delivery.py accidentally removed in #2054; imports for DurableDelivery and DurableAdapterMixin were still live - Add regression tests for resume, shutdown persist, and delivery imports Co-authored-by: Mervin Praison <MervinPraison@users.noreply.github.com>

MervinPraison · 2026-06-20T09:09:50Z

@coderabbitai review

MervinPraison · 2026-06-20T09:09:51Z

/review

qodo-code-review · 2026-06-20T09:09:55Z

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

coderabbitai · 2026-06-20T09:09:56Z

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai · 2026-06-20T09:10:04Z

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4b7b64a1-b0f6-4ba7-aeb1-8f0dae06bb89

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds a new _delivery.py bot module implementing MessageSender, UnifiedDelivery (streaming/chunking with platform capability detection), deliver_with_retry (exponential backoff with abort signal), deliver_chunked, and DurableDelivery (outbox-backed idempotent delivery). Fixes two WebSocketGateway session persistence bugs in stop() and snapshot rehydration, with accompanying regression tests.

Changes

Durable Bot Delivery Module

Layer / File(s)	Summary
MessageSender protocol and UnifiedDelivery `src/praisonai/praisonai/bots/_delivery.py` (L1–255, L557–564)	Defines `MessageSender` protocol and `UnifiedDelivery` with platform-capability caching, optional typing indicator, streaming progressive edits with rate limiting, fallback chunking, and `__all__` exports.
deliver_with_retry and deliver_chunked helpers `src/praisonai/praisonai/bots/_delivery.py` (L257–387)	Implements `create_delivery`, bounded exponential backoff with recoverable/permanent error classification, abort-signal sleep, and `deliver_chunked` applying `reply_to` only on the first chunk.
DurableDelivery with outbox and drain `src/praisonai/praisonai/bots/_delivery.py` (L389–555)	Implements `DurableDelivery` with optional outbox enqueuing, auto-generated idempotency keys, sent/failed status updates, and `drain_pending` for at-most-once retry per cycle.

Gateway Session Persistence Fixes and Tests

Layer / File(s)	Summary
Gateway stop and rehydration bug fixes `src/praisonai/praisonai/gateway/server.py`	`stop()` now calls `close_session(session_id)` per session; rehydration overwrites `session_data` on each matching system message rather than stopping at the first match.
Regression tests `src/praisonai/tests/unit/gateway/test_gateway_persistence_fixes.py`	Adds `_make_gateway` helper and three test classes covering latest-snapshot resume, stop-time persistence, and `DurableDelivery`/`deliver_chunked`/`DurableAdapterMixin` import smoke tests.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

MervinPraison/PraisonAI#2036: Directly overlaps this PR's delivery utility layer — both implement MessageSender, deliver_with_retry, deliver_chunked, and DurableDelivery in praisonai/bots/_delivery.py.
MervinPraison/PraisonAI#2054: Directly conflicts with this PR — it removes the same _delivery.py module that this PR introduces.

Suggested reviewers

MervinPraison

🐇 A rabbit hops to send your words,
Through streams and chunks and retried birds,
The outbox holds each message tight,
Till sessions wake and snapshots write —
No message lost, no session stray,
Durable delivery saves the day! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 71.43% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately reflects the main changes: fixing gateway session data loss and restoring the deleted _delivery module.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch cursor/critical-bug-investigation-7d5f

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

MervinPraison · 2026-06-20T09:10:11Z

@copilot Do a thorough review of this PR. Read ALL existing reviewer comments above from Qodo, Coderabbit, and Gemini first — incorporate their findings.

Review areas:

Bloat check: Are changes minimal and focused? Any unnecessary code or scope creep?
Security: Any hardcoded secrets, unsafe eval/exec, missing input validation?
Performance: Any module-level heavy imports? Hot-path regressions?
Tests: Are tests included? Do they cover the changes adequately?
Backward compat: Any public API changes without deprecation?
Code quality: DRY violations, naming conventions, error handling?
Address reviewer feedback: If Qodo, Coderabbit, or Gemini flagged valid issues, include them in your review
Suggest specific improvements with code examples where possible

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (1)

src/praisonai/praisonai/bots/_delivery.py (1)

247-254: ⚡ Quick win

Rate limiter not using per-channel limiting.

The RateLimiter.acquire() method supports an optional channel_id parameter for per-channel throttling, but _acquire_rate_limit doesn't pass it. This reduces effectiveness—multiple sends to the same channel won't respect per-channel delays documented in the rate limiter.

♻️ Suggested fix

-    async def _acquire_rate_limit(self) -> None:
-        """Acquire rate limit token if platform needs it."""
+    async def _acquire_rate_limit(self, channel_id: Optional[str] = None) -> None:
+        """Acquire rate limit token if platform needs it.
+        
+        Args:
+            channel_id: Optional channel ID for per-channel limiting
+        """
         if self._rate_limiter is None:
             # Use platform-specific rate limiter
             self._rate_limiter = RateLimiter.for_platform(self.bot.platform)
         
         if self._rate_limiter:
-            await self._rate_limiter.acquire()
+            await self._rate_limiter.acquire(channel_id)

Then update call sites to pass channel_id:

-                await self._acquire_rate_limit()
+                await self._acquire_rate_limit(channel_id)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/praisonai/praisonai/bots/_delivery.py` around lines 247 - 254, The
_acquire_rate_limit method is not leveraging the per-channel throttling
capability of the rate limiter. Modify the _acquire_rate_limit method to accept
a channel_id parameter and pass it to the self._rate_limiter.acquire() call as
an optional argument. Then update all call sites that invoke _acquire_rate_limit
to provide the appropriate channel_id value so that the rate limiter can
properly enforce per-channel delays as documented.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/praisonai/praisonai/bots/_delivery.py`:
- Around line 182-191: The message editing logic performs a redundant final edit
when the streaming loop completes successfully with the full text content. To
fix this, introduce a boolean flag (such as is_complete) that tracks whether the
streaming loop in the edit_message flow successfully sent all content (when
next_pos >= len(text) evaluates to true). Then, before the final edit operation
that occurs around lines 220-230 (after the main streaming loop ends), check
this flag and only proceed with the edit if the content was not already fully
sent during the loop. This prevents wasting an API call when the complete text
has already been delivered to the Discord message.
- Around line 87-89: The fire-and-forget task created by
asyncio.create_task(self._show_typing(channel_id)) lacks a stored reference,
risking garbage collection before completion and swallowing exceptions. Store
the task reference returned by asyncio.create_task in a task set or collection
maintained by the class to track background tasks, ensuring proper lifecycle
management and debugging capability. Make sure to add appropriate cleanup logic
to remove completed tasks from the tracking collection to prevent memory leaks.

In `@src/praisonai/tests/unit/gateway/test_gateway_persistence_fixes.py`:
- Around line 49-72: The test_stop_persists_active_sessions test calls await
gw.stop() while _is_running is False, causing the stop method to return early
without executing session close persistence logic. Additionally, the test reuses
the same gateway instance after stop, so in-memory state masks whether
persistence actually worked. Fix this by ensuring the gateway is marked as
running before calling stop (set _is_running to True), and then create a
completely new gateway instance that loads from the persisted session store to
verify the data was truly persisted and not just held in memory.

---

Nitpick comments:
In `@src/praisonai/praisonai/bots/_delivery.py`:
- Around line 247-254: The _acquire_rate_limit method is not leveraging the
per-channel throttling capability of the rate limiter. Modify the
_acquire_rate_limit method to accept a channel_id parameter and pass it to the
self._rate_limiter.acquire() call as an optional argument. Then update all call
sites that invoke _acquire_rate_limit to provide the appropriate channel_id
value so that the rate limiter can properly enforce per-channel delays as
documented.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5838afd3-8488-4fb2-b364-b9b1237bb6cc

📥 Commits

Reviewing files that changed from the base of the PR and between 5f39d66 and 1d65bf3.

📒 Files selected for processing (3)

src/praisonai/praisonai/bots/_delivery.py
src/praisonai/praisonai/gateway/server.py
src/praisonai/tests/unit/gateway/test_gateway_persistence_fixes.py

coderabbitai · 2026-06-20T09:16:38Z

+        # Show typing indicator if supported
+        if typing and caps.supports_typing:
+            asyncio.create_task(self._show_typing(channel_id))


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Store a reference to the typing indicator task.

The fire-and-forget task created here has no stored reference, which risks garbage collection before completion and swallows any exceptions. While typing indicator failure is non-critical, storing the reference enables proper lifecycle management and debugging.

🛠️ Suggested fix

Add a task set to track background tasks and ensure proper cleanup:

class UnifiedDelivery: def __init__(self, bot: "BotProtocol"): self.bot = bot self._capabilities: Optional["PlatformCapabilities"] = None self._rate_limiter: Optional[RateLimiter] = None self._streamer: Optional[DraftStreamer] = None + self._background_tasks: set[asyncio.Task] = set()

# Show typing indicator if supported if typing and caps.supports_typing: - asyncio.create_task(self._show_typing(channel_id)) + task = asyncio.create_task(self._show_typing(channel_id)) + self._background_tasks.add(task) + task.add_done_callback(self._background_tasks.discard)

🧰 Tools

🪛 Ruff (0.15.17)

[warning] 89-89: Store a reference to the return value of asyncio.create_task

(RUF006)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/praisonai/praisonai/bots/_delivery.py` around lines 87 - 89, The fire-and-forget task created by asyncio.create_task(self._show_typing(channel_id)) lacks a stored reference, risking garbage collection before completion and swallowing exceptions. Store the task reference returned by asyncio.create_task in a task set or collection maintained by the class to track background tasks, ensuring proper lifecycle management and debugging capability. Make sure to add appropriate cleanup logic to remove completed tasks from the tracking collection to prevent memory leaks.

Source: Linters/SAST tools

coderabbitai · 2026-06-20T09:16:38Z

+            await self.bot.edit_message(
+                channel_id,
+                message.message_id,
+                display_text,
+            )
+
+            if next_pos >= len(text):
+                break  # Complete text has been sent
+
+            current_pos = next_pos


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Redundant final edit when streaming completes successfully.

When the streaming loop completes with next_pos >= len(text) (line 188), the message has already been edited with the complete text. However, if the text fits within max_message_length, lines 220-230 perform another edit with the same content. This wastes an API call and could unnecessarily trigger rate limits.

♻️ Suggested fix

Track whether the loop completed with the full content and skip the redundant edit:

+ streaming_completed = False # Stream partial content while current_pos < len(text): # ... existing loop code ... if next_pos >= len(text): + streaming_completed = True break # Complete text has been sent current_pos = next_pos # If text is too long and couldn't fit in one message, chunk it if _calculate_length(text, caps.length_unit) > caps.max_message_length: # ... existing chunking code ... - else: + elif not streaming_completed: # Final edit with complete content (fits in limit) await asyncio.sleep(caps.edit_interval_ms / 1000.0) # ... rest of else block ...

Also applies to: 220-230

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/praisonai/praisonai/bots/_delivery.py` around lines 182 - 191, The message editing logic performs a redundant final edit when the streaming loop completes successfully with the full text content. To fix this, introduce a boolean flag (such as is_complete) that tracks whether the streaming loop in the edit_message flow successfully sent all content (when next_pos >= len(text) evaluates to true). Then, before the final edit operation that occurs around lines 220-230 (after the main streaming loop ends), check this flag and only proceed with the edit if the content was not already fully sent during the loop. This prevents wasting an API call when the complete text has already been delivered to the Discord message.

coderabbitai · 2026-06-20T09:16:38Z

+    async def test_stop_persists_active_sessions(self, tmp_path):
+        gw = _make_gateway(tmp_path)
+        session_id = "sess-stop-1"
+
+        session = gw.create_session("agent-1", "client-1", session_id)
+        session.add_message(
+            GatewayMessage(
+                content="Shutdown test message",
+                sender_id="user",
+                session_id=session_id,
+            )
+        )
+
+        await gw.stop()
+
+        store = gw._session_store
+        assert store is not None
+        assert store.session_exists(session_id)
+
+        resumed = gw.create_session("agent-1", "client-2", session_id)
+        messages = resumed.get_messages()
+
+        assert len(messages) == 1
+        assert messages[0].content == "Shutdown test message"


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

test_stop_persists_active_sessions can pass without exercising stop-time persistence

Line 62 calls await gw.stop() while _is_running is still False, so stop() returns early and never runs session close persistence. The test then reuses the same gateway instance, so in-memory state can hide persistence regressions.

Suggested test hardening

@@ async def test_stop_persists_active_sessions(self, tmp_path): gw = _make_gateway(tmp_path) session_id = "sess-stop-1" @@ - await gw.stop() + # Exercise the actual shutdown branch in stop() + gw._is_running = True + await gw.stop() @@ - resumed = gw.create_session("agent-1", "client-2", session_id) + # Rehydrate from storage, not in-memory state + gw2 = _make_gateway(tmp_path) + resumed = gw2.create_session("agent-1", "client-2", session_id) messages = resumed.get_messages()

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

async def test_stop_persists_active_sessions(self, tmp_path):

gw = _make_gateway(tmp_path)

session_id = "sess-stop-1"

session = gw.create_session("agent-1", "client-1", session_id)

session.add_message(

GatewayMessage(

content="Shutdown test message",

sender_id="user",

session_id=session_id,

)

)

await gw.stop()

store = gw._session_store

assert store is not None

assert store.session_exists(session_id)

resumed = gw.create_session("agent-1", "client-2", session_id)

messages = resumed.get_messages()

assert len(messages) == 1

assert messages[0].content == "Shutdown test message"

async def test_stop_persists_active_sessions(self, tmp_path):

gw = _make_gateway(tmp_path)

session_id = "sess-stop-1"

session = gw.create_session("agent-1", "client-1", session_id)

session.add_message(

GatewayMessage(

content="Shutdown test message",

sender_id="user",

session_id=session_id,

)

)

# Exercise the actual shutdown branch in stop()

gw._is_running = True

await gw.stop()

store = gw._session_store

assert store is not None

assert store.session_exists(session_id)

# Rehydrate from storage, not in-memory state

gw2 = _make_gateway(tmp_path)

resumed = gw2.create_session("agent-1", "client-2", session_id)

messages = resumed.get_messages()

assert len(messages) == 1

assert messages[0].content == "Shutdown test message"

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/praisonai/tests/unit/gateway/test_gateway_persistence_fixes.py` around lines 49 - 72, The test_stop_persists_active_sessions test calls await gw.stop() while _is_running is False, causing the stop method to return early without executing session close persistence logic. Additionally, the test reuses the same gateway instance after stop, so in-memory state masks whether persistence actually worked. Fix this by ensuring the gateway is marked as running before calling stop (set _is_running to True), and then create a completely new gateway instance that loads from the persisted session store to verify the data was truly persisted and not just held in memory.

MervinPraison · 2026-06-20T09:20:01Z

@claude You are the FINAL architecture reviewer. If the branch is under MervinPraison/PraisonAI (not a fork), you are able to make modifications to this branch and push directly. SCOPE: Focus ONLY on Python packages (praisonaiagents, praisonai). Do NOT modify praisonai-rust or praisonai-ts. Read ALL comments above from Gemini, Qodo, CodeRabbit, and Copilot carefully before responding.

Phase 1: Review per AGENTS.md

Protocol-driven: check heavy implementations vs core SDK
Backward compatible: ensure zero feature regressions
Performance: no hot-path regressions

Phase 2: FIX Valid Issues
4. For any VALID bugs or architectural flaws found by Gemini, CodeRabbit, Qodo, Copilot, or any other reviewer: implement the fix
5. Push all code fixes directly to THIS branch (do NOT create a new PR)
6. Comment a summary of exact files modified and what you skipped

Phase 3: Final Verdict
7. If all issues are resolved, approve the PR / close the Issue
8. If blocking issues remain, request changes / leave clear action items

praisonai-triage-agent · 2026-06-20T09:20:17Z

Claude finished @MervinPraison's task —— View job

🔍 Final Architecture Review

Todo List:

Read AGENTS.md to understand architecture rules
Review changes against protocol-driven design principles
Check for backward compatibility issues
Analyze issues from other reviewers (CodeRabbit, Qodo, Copilot)
Implement fixes for valid issues
Provide final verdict

Phase 1: Architecture Review

✅ Protocol-driven design: The implementation follows the correct pattern - the _delivery module provides heavy implementation in the wrapper layer (praisonai/), not in the core SDK.

✅ Backward compatibility: The restored _delivery module maintains all previously exported functions, ensuring no regressions.

✅ Performance: No hot-path regressions detected. The gateway fixes improve correctness without adding overhead.

Phase 2: Fixed Valid Issues

✅ Fixed all valid issues from reviewers:

Fire-and-forget task (line 89) - Added task reference tracking to prevent GC
Redundant final edit (line 191) - Added streaming_completed flag to avoid duplicate API call
Test hardening (line 72) - Fixed test to properly verify persistence by setting _is_running and using new gateway instance
Rate limiter improvement (line 247) - Added channel_id parameter for per-channel throttling

Phase 3: Final Verdict

✅ APPROVED - Ready to Merge

This PR correctly fixes three critical bugs:

Gateway session resume now uses the latest snapshot (not the empty initial one)
Gateway stop() now properly persists active sessions
Restored the accidentally deleted _delivery module

All architectural principles are followed, backward compatibility is maintained, and reviewer feedback has been addressed. The changes are minimal, focused, and include appropriate tests.

Files Modified:

src/praisonai/praisonai/bots/_delivery.py - Fixed background task GC, redundant edit, and rate limiting
src/praisonai/tests/unit/gateway/test_gateway_persistence_fixes.py - Fixed test to properly verify persistence

Skipped Changes:

No changes needed to gateway/server.py (already correct in PR)
No changes to praisonai-rust or praisonai-ts (out of scope)

- Store reference to typing indicator background task to prevent GC - Fix redundant final edit when streaming completes successfully - Fix test_stop_persists_active_sessions to properly test persistence - Add channel_id parameter to rate limiter for per-channel throttling Co-authored-by: Mervin Praison <MervinPraison@users.noreply.github.com>

qodo-code-review · 2026-06-22T22:17:05Z

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

greptile-apps · 2026-06-22T22:21:10Z

Greptile Summary

This PR fixes three regressions: session history was silently discarded on reconnect due to a break selecting the initial empty snapshot instead of the final one; graceful shutdown via stop() bypassed close_session() so in-flight sessions were never persisted; and _delivery.py was missing after PR #2054 removed it, breaking any import of DurableDelivery, DurableAdapterMixin, or deliver_chunked.

Session resume fix (server.py): removes the break inside the system-message scan so the loop keeps overwriting session_data until the last snapshot (written by close_session) wins instead of stopping at the empty init snapshot.
Shutdown persistence fix (server.py): stop() now calls the full close_session(session_id) path for each active session, which persists state and sets the resume TTL, rather than calling session.close() directly.
_delivery.py restore: the file is re-added with UnifiedDelivery, DurableDelivery, deliver_with_retry, and deliver_chunked; four regression tests covering all three bugs are also added.

Confidence Score: 4/5

Safe to merge — all three fixes are minimal and targeted, none of the changes widen any code path in a way that could introduce new failures.

The session-resume and shutdown-persistence fixes in server.py are correct: iterating without break picks the last snapshot as intended, and routing through close_session() in stop() properly persists state. The restored _delivery.py satisfies all existing imports. The only open items are a missing deliver_chunked entry in the TYPE_CHECKING block (affects static analysis only, not runtime) and a missing trailing newline — neither affects correctness or runtime behavior.

No files require special attention; the two minor style issues are in bots/init.py and bots/_delivery.py.

Important Files Changed

Filename	Overview
src/praisonai/praisonai/gateway/server.py	Two targeted fixes: resume now scans all system messages to pick the latest snapshot, and stop() routes through close_session() to persist sessions before exit. Both changes are correct and safe.
src/praisonai/praisonai/bots/_delivery.py	Restored file with UnifiedDelivery, DurableDelivery, deliver_with_retry, and deliver_chunked. All dependent imports resolve. Missing trailing newline is a minor style nit.
src/praisonai/tests/unit/gateway/test_gateway_persistence_fixes.py	Four regression tests for the three bugs fixed; covers resume snapshot selection, shutdown persistence, and _delivery.py importability. Tests are well-structured but the async test requires pytest-asyncio in the environment.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant C as Client
    participant GW as WebSocketGateway
    participant SS as SessionStore

    Note over GW: Session Create
    GW->>SS: "add_message(role=system, session_data=empty_snapshot)"

    Note over C,GW: Client sends messages
    C->>GW: message
    GW->>GW: session.add_message()

    Note over GW: Session Close
    GW->>GW: close_session(session_id)
    GW->>SS: "add_message(role=system, session_data=full_snapshot)"

    Note over C,GW: Client reconnects
    C->>GW: join with session_id
    GW->>SS: get_session(session_id)
    SS-->>GW: messages [system(empty), ..., system(full)]
    Note over GW: OLD: break on first - empty snapshot
    Note over GW: NEW: iterate all - last snapshot wins
    GW->>GW: GatewaySession.from_dict(full_snapshot)
    GW-->>C: "joined (resumed=true, history intact)"

%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
    participant C as Client
    participant GW as WebSocketGateway
    participant SS as SessionStore

    Note over GW: Session Create
    GW->>SS: "add_message(role=system, session_data=empty_snapshot)"

    Note over C,GW: Client sends messages
    C->>GW: message
    GW->>GW: session.add_message()

    Note over GW: Session Close
    GW->>GW: close_session(session_id)
    GW->>SS: "add_message(role=system, session_data=full_snapshot)"

    Note over C,GW: Client reconnects
    C->>GW: join with session_id
    GW->>SS: get_session(session_id)
    SS-->>GW: messages [system(empty), ..., system(full)]
    Note over GW: OLD: break on first - empty snapshot
    Note over GW: NEW: iterate all - last snapshot wins
    GW->>GW: GatewaySession.from_dict(full_snapshot)
    GW-->>C: "joined (resumed=true, history intact)"

_{Reviews (1): Last reviewed commit: "fix: address code review feedback from r..." | Re-trigger Greptile}

greptile-apps · 2026-06-22T22:21:14Z

+    "DurableDelivery",
+]


The file is missing a trailing newline. Most Python linters (flake8 W292, ruff) and many editors flag this, and it can cause noisy diffs when content is appended later.

Suggested change

"DurableDelivery",

]

"DurableDelivery",

]

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

…2102) * fix: gateway session data loss and restore deleted _delivery module - Resume gateway sessions from the latest session_data snapshot instead of the initial empty one written at session creation - Persist active sessions on gateway stop() via close_session() - Restore bots/_delivery.py accidentally removed in #2054; imports for DurableDelivery and DurableAdapterMixin were still live - Add regression tests for resume, shutdown persist, and delivery imports Co-authored-by: Mervin Praison <MervinPraison@users.noreply.github.com> * fix: address code review feedback from reviewers - Store reference to typing indicator background task to prevent GC - Fix redundant final edit when streaming completes successfully - Fix test_stop_persists_active_sessions to properly test persistence - Add channel_id parameter to rate limiter for per-channel throttling Co-authored-by: Mervin Praison <MervinPraison@users.noreply.github.com> --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Mervin Praison <MervinPraison@users.noreply.github.com> Co-authored-by: praisonai-triage-agent[bot] <272766704+praisonai-triage-agent[bot]@users.noreply.github.com>

Copilot AI requested a review from MervinPraison June 20, 2026 09:10

coderabbitai Bot reviewed Jun 20, 2026

View reviewed changes

MervinPraison marked this pull request as ready for review June 22, 2026 22:17

MervinPraison merged commit 81c2cf8 into main Jun 22, 2026
16 checks passed

MervinPraison mentioned this pull request Jun 22, 2026

docs(bots): document Durable Outbound Delivery + note gateway session persistence fixes (from PR PraisonAI#2102) MervinPraison/PraisonAIDocs#688

Closed

10 tasks

greptile-apps Bot reviewed Jun 22, 2026

View reviewed changes

MervinPraison mentioned this pull request Jun 22, 2026

fix: restore _delivery.py removed while still imported by durable delivery #2129

Merged

Uh oh!

Conversation

cursor Bot commented Jun 20, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Bugs fixed

1. Gateway session resume drops conversation history (data loss)

2. Gateway stop() loses in-flight sessions (data loss)

3. Missing bots/_delivery.py causes import crashes

Validation

Summary by CodeRabbit

Uh oh!

MervinPraison commented Jun 20, 2026

Uh oh!

MervinPraison commented Jun 20, 2026

Uh oh!

qodo-code-review Bot commented Jun 20, 2026

Qodo reviews are paused for this user.

Uh oh!

coderabbitai Bot commented Jun 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot commented Jun 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

MervinPraison commented Jun 20, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 20, 2026

Choose a reason for hiding this comment

Uh oh!

MervinPraison commented Jun 20, 2026

Uh oh!

praisonai-triage-agent Bot commented Jun 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔍 Final Architecture Review

Todo List:

Phase 1: Architecture Review

Phase 2: Fixed Valid Issues

Phase 3: Final Verdict

✅ APPROVED - Ready to Merge

Files Modified:

Skipped Changes:

Uh oh!

qodo-code-review Bot commented Jun 22, 2026

Qodo reviews are paused for this user.

Uh oh!

Uh oh!

greptile-apps Bot commented Jun 22, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps Bot Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

cursor Bot commented Jun 20, 2026 •

edited by coderabbitai Bot

Loading

2. Gateway `stop()` loses in-flight sessions (data loss)

3. Missing `bots/_delivery.py` causes import crashes

coderabbitai Bot commented Jun 20, 2026 •

edited

Loading

coderabbitai Bot commented Jun 20, 2026 •

edited

Loading

praisonai-triage-agent Bot commented Jun 20, 2026 •

edited

Loading