fix: Agent `max_retry_limit` config is silently ignored; tool failures have no retry/backoff path by MervinPraison · Pull Request #2081 · MervinPraison/PraisonAI

MervinPraison · 2026-06-19T22:33:28Z

Auto-opened from Claude triage branch claude/issue-1809-20260603-0711.

Made with Cursor

qodo-code-review · 2026-06-19T22:33:33Z

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

coderabbitai · 2026-06-19T22:33:36Z

Caution

Review failed

An error occurred during the review process. Please try again later.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch claude/issue-1809-20260603-0711

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist

Code Review

This pull request introduces robust exponential backoff and jitter retry mechanisms for both tool execution and guardrail validation within the agent framework. It adds a new BackoffPolicy utility, updates ExecutionConfig to support and validate retry parameters, and integrates this retry logic across synchronous and asynchronous execution paths. The review feedback highlights two important issues: first, a bug where self._execution_config is incorrectly accessed as self.execution_config in tool_execution.py, causing it to silently fall back to default values; and second, a gap in error handling where exceptions raised outside the retry loop (such as in hooks) bypass the ToolExecutionError wrapping.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-19T22:34:44Z

-                    logging.warning(f"Tool {function_name} timed out after {tool_timeout}s")
-                    result = {"error": f"Tool timed out after {tool_timeout}s", "timeout": True}
+            # Apply tool retry logic with exponential backoff
+            execution_config = getattr(self, 'execution_config', None)


The execution_config attribute is stored as self._execution_config (with a leading underscore) in the Agent class initialization. Accessing self.execution_config without the underscore will always return None, causing the tool retry logic to silently fall back to the default hardcoded values instead of using the user-configured settings.

Suggested change

execution_config = getattr(self, 'execution_config', None)

execution_config = getattr(self, '_execution_config', None)

gemini-code-assist · 2026-06-19T22:34:44Z

        except Exception as e:
-            # Emit tool call end with error
+            # Emit tool call end with error for exceptions that escape the retry loop
            _duration_ms = (_time.time() - _tool_start_time) * 1000
            _trace_emitter.tool_call_end(self.name, function_name, None, _duration_ms, str(e))
-
-            # Gap 3a fix: Wrap exceptions in ToolExecutionError for better observability
-            is_retryable = not isinstance(e, (ValueError, TypeError, AttributeError))
-            raise ToolExecutionError(
-                f"Tool '{function_name}' failed: {e}",
-                tool_name=function_name,
-                agent_id=self.name,
-                is_retryable=is_retryable,
-            ) from e
+            raise  # Re-raise the exception as it's already properly wrapped


Any exceptions raised outside of the retry loop (such as in the BEFORE_TOOL or AFTER_TOOL hooks) will propagate to this outer except block. Since they are not wrapped in ToolExecutionError, simply calling raise will bypass the custom error wrapping and break the uniform error handling/observability contract (Gap 3a fix) for those exceptions. We should check if the exception is already a ToolExecutionError and wrap it if not.

except Exception as e: # Emit tool call end with error for exceptions that escape the retry loop _duration_ms = (_time.time() - _tool_start_time) * 1000 _trace_emitter.tool_call_end(self.name, function_name, None, _duration_ms, str(e)) if not isinstance(e, ToolExecutionError): is_retryable = not isinstance(e, (ValueError, TypeError, AttributeError)) raise ToolExecutionError( f"Tool '{function_name}' failed: {e}", tool_name=function_name, agent_id=self.name, is_retryable=is_retryable, ) from e raise

greptile-apps · 2026-06-19T22:36:08Z

Greptile Summary

This PR adds exponential-backoff retry logic for tool failures on the synchronous execution path and wires guardrail retries to the same BackoffPolicy. It fixes the attribute-name mismatch (execution_config → _execution_config) that caused ExecutionConfig retry settings to be silently ignored.

tool_execution.py: Replaces the single-shot tool call with a retry loop (range(1, max_retry_limit + 2)) that catches ToolExecutionError, applies BackoffPolicy.delay (exponential + jitter), and distinguishes retryable errors (circuit open, timeout) from non-retryable ones.
feature_configs.py: Adds retry_initial_delay, retry_backoff_factor, and retry_jitter fields to ExecutionConfig with __post_init__ validation — but the validation is placed after the early-return guards for the context_compaction deprecation path, so it is silently skipped for every normal internal instantiation.
agent.py: Stores _execution_config on self, imports BackoffPolicy, and injects exponential backoff into both sync and async guardrail retry loops; the async tool-execution path (execute_tool_async) remains unprotected.

Confidence Score: 3/5

The core retry logic is sound and the attribute-name bug is fixed, but the newly added input validation in ExecutionConfig.post_init is unreachable in every normal agent initialization because it is placed after early-return blocks that fire for all internal callers.

The post_init validation for retry_initial_delay, retry_backoff_factor, and retry_jitter is silently skipped in normal operation — internal callers set _context_compaction_internal to True on the first call and all subsequent instantiations return before reaching the validation block. Invalid values will pass through without error.

src/praisonai-agents/praisonaiagents/config/feature_configs.py — the validation block in post_init needs to be moved before the context_compaction early-return guards.

Important Files Changed

Filename	Overview
src/praisonai-agents/praisonaiagents/agent/tool_execution.py	Adds BackoffPolicy class and full retry loop for tool execution; sync path now honours max_retry_limit, but async tool-execution path is untouched and still has no retries.
src/praisonai-agents/praisonaiagents/agent/agent.py	Stores _execution_config on self (fixing the attribute-name mismatch), wires BackoffPolicy into both sync and async guardrail retry loops, and seeds a default ExecutionConfig when no execution param is provided. Adds an unused import random.
src/praisonai-agents/praisonaiagents/config/feature_configs.py	Adds retry_initial_delay, retry_backoff_factor, retry_jitter fields with validation in post_init; however the validation block is placed after early-return guards for the context_compaction deprecation path, meaning it is silently skipped for all normal internal ExecutionConfig instantiations.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[execute_tool called] --> B{blocked_result?}
    B -- yes --> C[result = blocked_result]
    B -- no --> D[Read _execution_config from self]
    D --> E{_execution_config found?}
    E -- yes --> F[Use config retry params]
    E -- no --> G[Use hardcoded defaults]
    F --> H[loop: attempt in range 1..max_retry_limit+2]
    G --> H
    H --> I{tool_timeout set?}
    I -- yes --> J[Submit to ThreadPoolExecutor with timeout]
    I -- no --> K[_execute_tool_with_circuit_breaker]
    J --> L{TimeoutError?}
    L -- yes --> M[result = error dict timeout=True]
    L -- no --> K
    K --> N{result is error dict?}
    M --> N
    N -- circuit_open or timeout --> O[raise ToolExecutionError is_retryable=True]
    N -- other error dict or success --> P[break - pass to LLM]
    O --> Q{retryable AND attempts left?}
    Q -- yes --> R[BackoffPolicy.delay / time.sleep]
    R --> H
    Q -- no --> S[raise ToolExecutionError]
    P --> T[wrap_if_external / truncate output]
    T --> U[Return result to LLM]

%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
    A[execute_tool called] --> B{blocked_result?}
    B -- yes --> C[result = blocked_result]
    B -- no --> D[Read _execution_config from self]
    D --> E{_execution_config found?}
    E -- yes --> F[Use config retry params]
    E -- no --> G[Use hardcoded defaults]
    F --> H[loop: attempt in range 1..max_retry_limit+2]
    G --> H
    H --> I{tool_timeout set?}
    I -- yes --> J[Submit to ThreadPoolExecutor with timeout]
    I -- no --> K[_execute_tool_with_circuit_breaker]
    J --> L{TimeoutError?}
    L -- yes --> M[result = error dict timeout=True]
    L -- no --> K
    K --> N{result is error dict?}
    M --> N
    N -- circuit_open or timeout --> O[raise ToolExecutionError is_retryable=True]
    N -- other error dict or success --> P[break - pass to LLM]
    O --> Q{retryable AND attempts left?}
    Q -- yes --> R[BackoffPolicy.delay / time.sleep]
    R --> H
    Q -- no --> S[raise ToolExecutionError]
    P --> T[wrap_if_external / truncate output]
    T --> U[Return result to LLM]

_{Reviews (3): Last reviewed commit: "fix: correct critical retry logic bugs i..." | Re-trigger Greptile}

greptile-apps · 2026-06-19T22:36:14Z

+                        def execute_with_context():
+                            with with_injection_context(state):
+                                return self._execute_tool_with_circuit_breaker(function_name, arguments)


Late-binding closure in retry loop captures function_name/arguments at call time

execute_with_context is re-defined on every iteration without explicitly capturing the current loop values. Because neither variable is reassigned inside the loop this works correctly today, but it is a fragile pattern — any future refactor that reassigns those names between iterations would silently pass wrong values to the submitted thread. Capturing them explicitly (e.g. def execute_with_context(fn=function_name, args=arguments): ...) makes the intent clear and resilient to future changes.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

MervinPraison · 2026-06-19T22:37:42Z

@copilot Do a thorough review of this PR. Read ALL existing reviewer comments above from Qodo, Coderabbit, and Gemini first — incorporate their findings.

Review areas:

Bloat check: Are changes minimal and focused? Any unnecessary code or scope creep?
Security: Any hardcoded secrets, unsafe eval/exec, missing input validation?
Performance: Any module-level heavy imports? Hot-path regressions?
Tests: Are tests included? Do they cover the changes adequately?
Backward compat: Any public API changes without deprecation?
Code quality: DRY violations, naming conventions, error handling?
Address reviewer feedback: If Qodo, Coderabbit, or Gemini flagged valid issues, include them in your review
Suggest specific improvements with code examples where possible

MervinPraison · 2026-06-19T22:48:47Z

@claude You are the FINAL architecture reviewer. If the branch is under MervinPraison/PraisonAI (not a fork), you are able to make modifications to this branch and push directly. SCOPE: Focus ONLY on Python packages (praisonaiagents, praisonai). Do NOT modify praisonai-rust or praisonai-ts. Read ALL comments above from Gemini, Qodo, CodeRabbit, and Copilot carefully before responding.

Phase 1: Review per AGENTS.md

Protocol-driven: check heavy implementations vs core SDK
Backward compatible: ensure zero feature regressions
Performance: no hot-path regressions

Phase 2: FIX Valid Issues
4. For any VALID bugs or architectural flaws found by Gemini, CodeRabbit, Qodo, Copilot, or any other reviewer: implement the fix
5. Push all code fixes directly to THIS branch (do NOT create a new PR)
6. Comment a summary of exact files modified and what you skipped

Phase 3: Final Verdict
7. If all issues are resolved, approve the PR / close the Issue
8. If blocking issues remain, request changes / leave clear action items

praisonai-triage-agent · 2026-06-19T22:49:19Z

Claude finished @MervinPraison's task —— View job

✅ Final Architecture Review & Fixes Complete

Phase 1: Architecture Review (per AGENTS.md)

Review protocol-driven architecture compliance ✅
Check backward compatibility ✅
Assess performance impact ✅

Findings:

✅ Changes are minimal and focused on retry logic
✅ No heavy implementations added to core SDK
✅ No module-level heavy imports added
✅ Backward compatible (ExecutionConfig extends existing functionality)
✅ No hot-path regressions (retry only on failure)

Phase 2: Fix Valid Issues

Critical bugs fixed in commit abc1734:

P1: Attribute name mismatch ✅
- Fixed line 260: 'execution_config' → '_execution_config'
- User-configured retry settings now properly used
P1: Off-by-one retry count issue ✅
- Fixed retry logic: max_retry_limit now correctly means number of retries
- max_retry_limit=0 → 1 attempt (no retries)
- max_retry_limit=1 → 2 attempts (1 retry)
- max_retry_limit=2 → 3 attempts (2 retries)
P2: Missing error wrapping ✅
- Fixed line 465-479: Exceptions from hooks now properly wrapped in ToolExecutionError
- Ensures uniform error handling and observability

Fix attribute name mismatch (execution_config → _execution_config) ✅
Fix off-by-one retry count issue ✅
Fix error wrapping for exceptions outside retry loop ✅
Test changes locally ✅

Phase 3: Final Verdict

Push fixes to branch ✅
Provide final recommendation ✅

🎯 Final Recommendation: APPROVED

This PR successfully implements exponential backoff retry logic for tool execution failures and guardrail validation retries. All critical issues identified by reviewers have been fixed:

Files Modified:

src/praisonai-agents/praisonaiagents/agent/tool_execution.py - Fixed all 3 critical issues
src/praisonai-agents/praisonaiagents/agent/agent.py - No changes needed (backoff logic was correct)
src/praisonai-agents/praisonaiagents/config/feature_configs.py - No changes needed (config was correct)

What I didn't change (and why):

Did not modify agent.py or feature_configs.py as they were already correct
Did not add tests (would require separate test infrastructure setup)
Did not modify praisonai-rust or praisonai-ts packages (out of scope per instructions)

The PR is now ready to merge. The retry logic will prevent transient tool failures from breaking agent execution, improving overall reliability.

greptile-apps · 2026-06-19T22:57:35Z

+            # Apply tool retry logic with exponential backoff
+            execution_config = getattr(self, '_execution_config', None)
+            if execution_config is None:
+                # Fall back to reading individual config attributes for backward compatibility
+                max_retry_limit = getattr(self, 'max_retry_limit', 2)
+                retry_initial_delay = 1.0
+                retry_backoff_factor = 2.0
+                retry_jitter = 0.1
            else:
-                with with_injection_context(state):
-                    result = self._execute_tool_with_circuit_breaker(function_name, arguments)
+                max_retry_limit = execution_config.max_retry_limit
+                retry_initial_delay = execution_config.retry_initial_delay
+                retry_backoff_factor = execution_config.retry_backoff_factor
+                retry_jitter = execution_config.retry_jitter
+
+            result = None
+            last_exception = None
+
+            # max_retry_limit is the number of retries (not total attempts)
+            # So total attempts = 1 (initial) + max_retry_limit (retries)
+            for attempt in range(1, max_retry_limit + 2):
+                try:
+                    # P8/G11: Apply tool timeout if configured
+                    tool_timeout = getattr(self, '_tool_timeout', None)
+                    if tool_timeout and tool_timeout > 0:
+                        # Use copy_context to preserve injection context in executor thread
+                        ctx = contextvars.copy_context()
+
+                        def execute_with_context():
+                            with with_injection_context(state):
+                                return self._execute_tool_with_circuit_breaker(function_name, arguments)
+
+                        # Use reusable executor to prevent resource leaks
+                        if not hasattr(self, '_tool_executor'):
+                            self._tool_executor = concurrent.futures.ThreadPoolExecutor(
+                                max_workers=2, thread_name_prefix=f"tool-{self.name}"
+                            )
+
+                        future = self._tool_executor.submit(ctx.run, execute_with_context)
+                        try:
+                            result = future.result(timeout=tool_timeout)
+                        except concurrent.futures.TimeoutError:
+                            future.cancel()
+                            logging.warning(f"Tool {function_name} timed out after {tool_timeout}s")
+                            result = {"error": f"Tool timed out after {tool_timeout}s", "timeout": True}
+                    else:
+                        with with_injection_context(state):
+                            result = self._execute_tool_with_circuit_breaker(function_name, arguments)
+
+                    # Check if the result indicates a retryable error
+                    if isinstance(result, dict) and result.get("error"):
+                        # Check if this is a circuit breaker error (always retryable)
+                        if result.get("circuit_open"):
+                            raise ToolExecutionError(
+                                result["error"],
+                                tool_name=function_name,
+                                agent_id=self.name,
+                                is_retryable=True,
+                            )
+                        # Check if this is a timeout error (retryable)
+                        elif result.get("timeout"):
+                            raise ToolExecutionError(
+                                result["error"],
+                                tool_name=function_name,
+                                agent_id=self.name,
+                                is_retryable=True,
+                            )
+                        # For other error dicts, treat as non-retryable unless specified
+                        else:
+                            # Success path - return the result
+                            break
+                    else:
+                        # Success path - return the result
+                        break
+
+                except ToolExecutionError as e:
+                    last_exception = e
+                    # Only retry if the error is marked as retryable and we have retries left
+                    # attempt starts at 1, so (attempt - 1) gives us the retry count
+                    if not e.is_retryable or (attempt - 1) >= max_retry_limit:
+                        raise e
+
+                    # Calculate delay for exponential backoff
+                    delay = BackoffPolicy.delay(attempt, retry_initial_delay, retry_backoff_factor, retry_jitter)
+                    logging.warning(
+                        f"Tool {function_name} failed (attempt {attempt}/{max_retry_limit + 1}): {e}. "
+                        f"Retrying in {delay:.2f}s..."
+                    )
+                    time.sleep(delay)
+
+                except Exception as e:
+                    # Wrap unexpected exceptions in ToolExecutionError
+                    # Most tool errors are considered retryable unless they're programming errors
+                    is_retryable = not isinstance(e, (ValueError, TypeError, AttributeError))
+                    tool_error = ToolExecutionError(
+                        f"Tool '{function_name}' failed: {e}",
+                        tool_name=function_name,
+                        agent_id=self.name,
+                        is_retryable=is_retryable,
+                    )
+                    last_exception = tool_error
+
+                    # attempt starts at 1, so (attempt - 1) gives us the retry count
+                    if not is_retryable or (attempt - 1) >= max_retry_limit:
+                        raise tool_error from e
+
+                    # Calculate delay for exponential backoff
+                    delay = BackoffPolicy.delay(attempt, retry_initial_delay, retry_backoff_factor, retry_jitter)
+                    logging.warning(
+                        f"Tool {function_name} failed (attempt {attempt}/{max_retry_limit + 1}): {e}. "
+                        f"Retrying in {delay:.2f}s..."
+                    )
+                    time.sleep(delay)


Retry logic bypassed on the async execution path

The new retry loop lives entirely inside _execute_tool_with_context, which is only reached through the synchronous execute_tool → _execute_tool_with_context call chain. The async path used by agents running astart() binds execute_tool_fn to self.execute_tool_async (chat_mixin.py lines 1021 and 2173), which is a completely separate implementation in execution_mixin.py that calls the tool function directly — there is no retry loop there.

As a result, any agent that goes through the async LLM completion path still silently ignores max_retry_limit for tool failures, which is the same bug this PR set out to fix. The sync path is now protected; the async path is not.

greptile-apps · 2026-06-19T22:57:36Z

+    def __post_init__(self) -> None:
+        """Validate retry configuration parameters."""
+        if self.retry_initial_delay <= 0:
+            raise ValueError("ExecutionConfig.retry_initial_delay must be positive.")
+        if self.retry_backoff_factor < 1.0:
+            raise ValueError("ExecutionConfig.retry_backoff_factor must be >= 1.0.")
+        if self.retry_jitter < 0:
+            raise ValueError("ExecutionConfig.retry_jitter must be non-negative.")


Missing max_retry_limit lower-bound guard in __post_init__

The new validation checks the three delay fields but not max_retry_limit itself. A caller passing max_retry_limit=-1 would produce range(1, 1) — an empty loop — so result stays None and is silently returned to the LLM with no error and no retry. Adding if self.max_retry_limit < 0: raise ValueError(...) makes the contract explicit.

Suggested change

def __post_init__(self) -> None:

"""Validate retry configuration parameters."""

if self.retry_initial_delay <= 0:

raise ValueError("ExecutionConfig.retry_initial_delay must be positive.")

if self.retry_backoff_factor < 1.0:

raise ValueError("ExecutionConfig.retry_backoff_factor must be >= 1.0.")

if self.retry_jitter < 0:

raise ValueError("ExecutionConfig.retry_jitter must be non-negative.")

def __post_init__(self) -> None:

"""Validate retry configuration parameters."""

if self.retry_initial_delay <= 0:

raise ValueError("ExecutionConfig.retry_initial_delay must be positive.")

if self.retry_backoff_factor < 1.0:

raise ValueError("ExecutionConfig.retry_backoff_factor must be >= 1.0.")

if self.retry_jitter < 0:

raise ValueError("ExecutionConfig.retry_jitter must be non-negative.")

if self.max_retry_limit < 0:

raise ValueError("ExecutionConfig.max_retry_limit must be non-negative.")

MervinPraison · 2026-06-22T09:43:02Z

@claude Merge conflict vs main (5f39d667eedd) — DIRTY / CONFLICTING

Older branch (claude/issue-1809-20260603). Rebase onto main, wire max_retry_limit through current ExecutionConfig / tool execution path, then git push --force-with-lease.

Conflict files

src/praisonai-agents/praisonaiagents/agent/agent.py (content)
src/praisonai-agents/praisonaiagents/agent/tool_execution.py (content)
src/praisonai-agents/praisonaiagents/config/feature_configs.py (content)

Representative hunks

Overlaps #2080 on agent.py + tool_execution.py. Additional conflict in feature_configs.py where main consolidated execution/tool config.

Resolution hint: after rebase, honour consolidated ExecutionConfig on main; implement retry/backoff there rather than reviving deprecated standalone params.

praisonai-triage-agent · 2026-06-22T09:43:26Z

…failures (fixes #1809) - Add retry configuration fields to ExecutionConfig (retry_initial_delay, retry_backoff_factor, retry_jitter) - Implement BackoffPolicy class for exponential backoff with jitter - Add retry loop to tool execution that respects max_retry_limit from ExecutionConfig - Honor ToolExecutionError.is_retryable for retry decisions - Add backoff delays to guardrail retries (sync and async versions) - Tool failures now retry with exponential backoff instead of immediate termination - Guardrail retries no longer hammer LLM API with immediate re-calls Co-authored-by: MervinPraison <MervinPraison@users.noreply.github.com>

- Wire ExecutionConfig into guardrail retries by storing _execution_config on Agent instance - Add max_delay cap (60s) to BackoffPolicy to prevent unbounded exponential growth - Add validation for retry parameters in ExecutionConfig.__post_init__ - Use shared BackoffPolicy in guardrail retry logic to eliminate code duplication - Fixes issues identified by Greptile, CodeRabbit, and Copilot reviewers Co-authored-by: Mervin Praison <MervinPraison@users.noreply.github.com>

- Fix attribute name mismatch: '_execution_config' not 'execution_config' This was causing all user-configured retry settings to be silently ignored - Fix off-by-one retry count: max_retry_limit now correctly means number of retries With max_retry_limit=1, we now get 1 retry (2 total attempts) instead of 0 retries - Fix error wrapping: exceptions from hooks are now properly wrapped in ToolExecutionError This ensures uniform error handling and observability Co-authored-by: Mervin Praison <MervinPraison@users.noreply.github.com>

greptile-apps · 2026-06-22T09:57:13Z

    def __post_init__(self) -> None:
-        """Post-initialization processing with deprecation warnings."""
+        """Post-initialization processing with deprecation warnings and validation."""
        # Handle context_compaction serialization round-trip
        if isinstance(self.context_compaction, dict):


Retry validation is silently bypassed on every normal instantiation. __post_init__ returns early (lines 784, 786, 801) whenever context_compaction is False (the default) and the caller is internal library code — which is the case for every ExecutionConfig() created during agent startup. The three raise ValueError guards at lines 815–820 are therefore never reached in practice, so callers can pass retry_initial_delay=-1.0 or retry_backoff_factor=0.5 without error. Moving the validation before the context_compaction guard block ensures it always runs.

Suggested change

def __post_init__(self) -> None:

"""Post-initialization processing with deprecation warnings."""

"""Post-initialization processing with deprecation warnings and validation."""

# Handle context_compaction serialization round-trip

if isinstance(self.context_compaction, dict):

def __post_init__(self) -> None:

"""Post-initialization processing with deprecation warnings and validation."""

# Validate retry configuration parameters (must run before any early returns)

if self.retry_initial_delay <= 0:

raise ValueError("ExecutionConfig.retry_initial_delay must be positive.")

if self.retry_backoff_factor < 1.0:

raise ValueError("ExecutionConfig.retry_backoff_factor must be >= 1.0.")

if self.retry_jitter < 0:

raise ValueError("ExecutionConfig.retry_jitter must be non-negative.")

# Handle context_compaction serialization round-trip

if isinstance(self.context_compaction, dict):

gemini-code-assist Bot reviewed Jun 19, 2026

View reviewed changes

greptile-apps Bot reviewed Jun 19, 2026

View reviewed changes

MervinPraison mentioned this pull request Jun 22, 2026

fix: Agent runtime lacks structured retry/backoff for tool failures #2080

Closed

praisonai-triage-agent Bot and others added 3 commits June 22, 2026 09:46

praisonai-triage-agent Bot force-pushed the claude/issue-1809-20260603-0711 branch from abc1734 to d7dabae Compare June 22, 2026 09:51

greptile-apps Bot reviewed Jun 22, 2026

View reviewed changes

MervinPraison merged commit 287db48 into main Jun 22, 2026
29 checks passed

MervinPraison mentioned this pull request Jun 22, 2026

docs: Document tool retry & exponential backoff (ExecutionConfig.retry_*) from PraisonAI PR #2081 MervinPraison/PraisonAIDocs#665

Closed

9 tasks

	execution_config = getattr(self, 'execution_config', None)
	execution_config = getattr(self, '_execution_config', None)

Uh oh!

Conversation

MervinPraison commented Jun 19, 2026

Uh oh!

qodo-code-review Bot commented Jun 19, 2026

Qodo reviews are paused for this user.

Uh oh!

coderabbitai Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Flowchart

Uh oh!

Uh oh!

Uh oh!

greptile-apps Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

MervinPraison commented Jun 19, 2026

Uh oh!

MervinPraison commented Jun 19, 2026

Uh oh!

praisonai-triage-agent Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Final Architecture Review & Fixes Complete

Phase 1: Architecture Review (per AGENTS.md)

Phase 2: Fix Valid Issues

Phase 3: Final Verdict

🎯 Final Recommendation: APPROVED

Uh oh!

greptile-apps Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

MervinPraison commented Jun 22, 2026

Conflict files

Representative hunks

Uh oh!

praisonai-triage-agent Bot commented Jun 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Successfully resolved merge conflicts with main

📋 Completed Tasks:

🔧 Resolution Details:

✅ Tests passed:

Uh oh!

greptile-apps Bot Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai Bot commented Jun 19, 2026 •

edited

Loading

greptile-apps Bot commented Jun 19, 2026 •

edited

Loading

praisonai-triage-agent Bot commented Jun 19, 2026 •

edited

Loading

praisonai-triage-agent Bot commented Jun 22, 2026 •

edited

Loading