fix(codex): retry transient backend errors and restore web search in …#637

Merged
yaozheng-fang merged 2 commits into main from fix/codex-runtime-longtask
Jun 30, 2026

Conversation

@yaozheng-fang

Collaborator

…Responses shim

The Codex Responses shim called the Ark backend with num_retries=0 and no timeout, so a single transient 429/5xx/overloaded error or a stalled connection failed the turn outright: the eval client's read timeout fired before any recovery could happen, and the failure surfaced as ReadTimeout. The shim also stripped Codex's hosted web_search tool before forwarding the request to Ark (Ark rejects its schema) and put nothing in its place, leaving Codex+Ark agents with no web capability.
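To make the failure mode concrete, here is a minimal sketch of a retry wrapper with exponential backoff. It is not the shim's code and does not use litellm; `TransientBackendError`, `call_with_retries`, `make_flaky_backend`, and the 0.5 s base delay are hypothetical names and values chosen for illustration. With a retry budget of 0, one transient error fails the call; with a budget of 2, the same backend recovers.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientBackendError(Exception):
    """Hypothetical stand-in for a 429/5xx/overloaded backend response."""


def call_with_retries(
    fn: Callable[[], T],
    num_retries: int,
    base_delay: float = 0.5,  # assumed value, not taken from the PR
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying transient failures with exponential backoff.

    Assumes num_retries is a non-negative integer.
    """
    for attempt in range(num_retries + 1):
        try:
            return fn()
        except TransientBackendError:
            if attempt == num_retries:
                raise  # retry budget exhausted: surface the error
            # Delay doubles on each attempt: base, 2*base, 4*base, ...
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable for non-negative num_retries")


def make_flaky_backend(failures: int) -> Callable[[], str]:
    """Return a callable that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def backend() -> str:
        if state["left"] > 0:
            state["left"] -= 1
            raise TransientBackendError("429 overloaded")
        return "ok"

    return backend
```

Passing `sleep=delays.append` (with `delays = []`) instead of the real `time.sleep` records the backoff schedule without waiting, which is handy when checking the behavior in a test.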

  • num_retries is now env-tunable (CODEX_SHIM_NUM_RETRIES, default 2), so litellm retries with exponential backoff. An optional per-call timeout is also added (CODEX_SHIM_TIMEOUT, default off).
  • Behind CODEX_SHIM_MAX_TOOL_ITERS (default 0 = off, byte-identical to the previous behavior), the shim translates the hosted web_search/web_fetch tools into Ark-accepted function tools and runs a bounded shim-internal tool loop that executes the veADK builtins and feeds the results back, so the agent regains web search.
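The two changes above can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged implementation: only the three environment variable names and their defaults come from this PR. The helper names (`read_shim_settings`, `run_tool_loop`), the message and tool-call dictionary shapes, and the choice to return the last response when the iteration budget runs out are all hypothetical.

```python
import os
from typing import Any, Callable, Dict, List, Mapping, Optional

Message = Dict[str, Any]


def read_shim_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Read the tunables named in this PR, using the defaults it states."""
    env = os.environ if env is None else env
    raw_timeout = env.get("CODEX_SHIM_TIMEOUT", "")
    return {
        "num_retries": int(env.get("CODEX_SHIM_NUM_RETRIES", "2")),
        # Unset or empty means no per-call timeout ("default off").
        "timeout": float(raw_timeout) if raw_timeout else None,
        "max_tool_iters": int(env.get("CODEX_SHIM_MAX_TOOL_ITERS", "0")),
    }


def run_tool_loop(
    call_model: Callable[[List[Message]], Message],
    tools: Mapping[str, Callable[..., str]],
    messages: List[Message],
    max_tool_iters: int,
) -> Message:
    """Bounded tool loop over a hypothetical message format.

    call_model returns a dict that may carry a "tool_calls" list of
    {"name": ..., "arguments": {...}} entries (an assumed shape).
    """
    messages = list(messages)  # do not mutate the caller's list
    response = call_model(messages)
    for _ in range(max_tool_iters):
        tool_calls = response.get("tool_calls") or []
        if not tool_calls:
            break  # the model produced a final answer
        for call in tool_calls:
            result = tools[call["name"]](**call["arguments"])
            messages.append(
                {"role": "tool", "name": call["name"], "content": result}
            )
        response = call_model(messages)
    # If the budget is exhausted, the last response is returned as-is
    # (an assumption; the real shim may handle this case differently).
    return response
```

With `max_tool_iters=0` the loop body never runs, so the function makes exactly one model call and returns its response untouched, which mirrors the "default 0 = off" behavior described above.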

tiantt and others added 2 commits June 30, 2026 18:46
…Responses shim

The Codex Responses shim called the Ark backend with num_retries=0 and no
timeout, so a transient 429/5xx/overloaded error or a stalled connection failed
the turn outright — the eval client's read timeout fired before any recovery
(surfacing as ReadTimeout). It also stripped Codex's hosted web_search before
Ark (Ark rejects its schema) with nothing replacing it, leaving Codex+Ark
agents with no web capability.

- num_retries is now env-tunable (CODEX_SHIM_NUM_RETRIES, default 2) so litellm
  applies exponential backoff; add an optional per-call timeout
  (CODEX_SHIM_TIMEOUT, default off).
- Behind CODEX_SHIM_MAX_TOOL_ITERS (default 0 = off, byte-identical to before),
  translate the hosted web_search/web_fetch into Ark-accepted function tools and
  run a bounded shim-internal tool loop that executes the veADK builtins and
  feeds results back, so the agent regains web search.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
pre-commit ruff-format collapsed a multi-line .extend() call in the
web-tool translation block; no behavior change.
@yaozheng-fang merged commit 466d19b into main on Jun 30, 2026
16 checks passed

3 participants