Reduce EnvGenAgent spurious failures by qianl-nv · Pull Request #782 · isaac-sim/IsaacLab-Arena

qianl-nv · 2026-06-11T15:01:23Z

Summary

Reduce the failure rate of the EnvGenAgent.

Detailed description

What was the reason for the change?
A single EnvGenAgent call fails about 20% of the time, mostly because of server-side issues. We need to reduce this for a better user experience.
What has been changed?
-- Doubled the max token length to reduce the chance of truncation (which occasionally happens with long reasoning text)
-- Added retries to recover from spurious network issues or timeouts
What is the impact of this change?
Reduces the failure rate to <1% with the default prompt.
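
As a rough sanity check on the <1% figure: if each attempt fails independently with probability about 0.2, and a call makes 1 + max_retries attempts (3 retries is the default reported in the review summary), then the call as a whole fails only when every attempt fails. The independence assumption is mine, not the PR's; server-side outages are often correlated, so the real number may be higher. The helper name `failure_after_retries` is hypothetical.

```python
def failure_after_retries(p_single: float, max_retries: int) -> float:
    """Probability that all 1 + max_retries attempts fail, assuming independent attempts."""
    return p_single ** (1 + max_retries)


# Print the overall failure probability for 0 to 3 retries at a 20% per-attempt rate.
for retries in range(4):
    print(retries, f"{failure_after_retries(0.2, retries):.4%}")
```

Under these assumptions, two retries already bring the failure rate to about 0.8%, and three retries to about 0.16%.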

greptile-apps · 2026-06-11T15:07:31Z

Greptile Summary

This PR reduces EnvGenAgent failure rates by doubling max_tokens from 2000 to 4096 and wrapping the entire API call + response-parsing pipeline in a retry loop (default 3 retries). The previous code only caught parse/validation errors; the new structure catches all exceptions including network errors and assertion failures.

Retry loop (max_retries=3 default) now covers API-level failures (network errors, timeouts, empty responses, malformed JSON) — the whole client.chat.completions.create call plus response parsing is inside the try block.
max_tokens doubled to 4096 to prevent response truncation on long reasoning outputs.
Tests added for the retry path: one that succeeds on the second attempt after a ConnectionError, and one that exhausts retries and raises RuntimeError.
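
The two test cases described above could look roughly like the following. This is a sketch, not the PR's test file: `call_with_retries` is a hypothetical stand-in for the agent's retry loop, and the mocked callable stands in for the API call.

```python
from unittest.mock import Mock


def call_with_retries(fn, max_retries=3):
    """Hypothetical stand-in for the agent's retry loop (not the PR's code)."""
    last_exc = None
    for _ in range(1 + max_retries):
        try:
            return fn()
        except Exception as exc:  # broad catch, as the summary describes
            last_exc = exc
    raise RuntimeError(f"failed after {1 + max_retries} attempts") from last_exc


def test_succeeds_on_second_attempt():
    # First call raises, second call returns a value.
    fn = Mock(side_effect=[ConnectionError("boom"), "ok"])
    assert call_with_retries(fn, max_retries=3) == "ok"
    assert fn.call_count == 2


def test_exhausts_retries():
    # Every call raises, so the loop should give up after 1 + max_retries calls.
    fn = Mock(side_effect=ConnectionError("boom"))
    try:
        call_with_retries(fn, max_retries=2)
    except RuntimeError as err:
        assert isinstance(err.__cause__, ConnectionError)
    else:
        raise AssertionError("expected RuntimeError")
    assert fn.call_count == 3
```

Asserting on `call_count` is what distinguishes "retried and recovered" from "succeeded on the first try".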

Confidence Score: 5/5

Safe to merge — the retry wrapper correctly covers the entire API call and response-parsing pipeline, and the new tests validate both the success-after-retry and the exhaust-retries paths.

The change is small and targeted: a retry loop around an existing API call and a token-limit bump. The retry loop now correctly covers all failure modes (network errors, empty responses, parse failures) that it previously missed. New unit tests verify the retry behavior end-to-end.

No files require special attention; both changed files are straightforward and well-tested.

Important Files Changed

Filename	Overview
isaaclab_arena/agentic_environment_generation/environment_generation_agent.py	Retry loop correctly wraps the full API call and response parsing; max_tokens doubled; minor: no inter-retry delay.
isaaclab_arena/tests/test_environment_generation_agent.py	New unit tests cover the retry-then-succeed and exhaust-retries paths; existing no-choices test updated to assert retry count; stale @pytest.mark.flaky TODO not removed despite its trigger condition now being met.
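
The "no inter-retry delay" note in the overview could be addressed with a capped exponential backoff plus jitter. The sketch below is not part of this PR; the function names, base delay, and cap are all hypothetical values chosen for illustration.

```python
import random
import time


def backoff_delays(max_retries, base=0.5, cap=8.0):
    """Return the capped exponential delay (in seconds) scheduled before each retry."""
    return [min(cap, base * 2 ** i) for i in range(max_retries)]


def call_with_backoff(fn, max_retries=3, sleep=time.sleep):
    """Retry fn on any exception, sleeping a jittered delay between attempts."""
    last_exc = None
    delays = backoff_delays(max_retries)
    for attempt in range(1 + max_retries):
        try:
            return fn()
        except Exception as exc:
            last_exc = exc
            if attempt < max_retries:
                # Full jitter: sleep a random amount up to the scheduled delay.
                sleep(random.uniform(0, delays[attempt]))
    raise RuntimeError(f"failed after {1 + max_retries} attempts") from last_exc
```

Passing `sleep` as a parameter keeps unit tests fast, since a test can inject a no-op or a recorder instead of actually waiting.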

Sequence Diagram

sequenceDiagram
    participant Caller
    participant generate_spec
    participant OpenAI_API as OpenAI API

    Caller->>generate_spec: "generate_spec(prompt, max_retries=3)"
    loop attempt in range(1 + max_retries)
        generate_spec->>OpenAI_API: chat.completions.create(...)
        alt Success
            OpenAI_API-->>generate_spec: resp with choices
            generate_spec->>generate_spec: extract_response_text()
            generate_spec->>generate_spec: json.loads() + model_validate()
            generate_spec-->>Caller: (EnvironmentIntentSpec, raw_text)
        else Any Exception
            OpenAI_API-->>generate_spec: raises Exception
            generate_spec->>generate_spec: store last_exc, continue
        end
    end
    generate_spec-->>Caller: raise RuntimeError("failed after N attempts")
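
The flow in the diagram can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the code from the PR: `create_fn` is a hypothetical callable standing in for `client.chat.completions.create`, responses are plain dicts rather than client objects, and the `IntentSpec` dataclass stands in for `EnvironmentIntentSpec` and its `model_validate()` step.

```python
import json
from dataclasses import dataclass


@dataclass
class IntentSpec:
    """Hypothetical stand-in for EnvironmentIntentSpec."""
    task: str


def generate_spec(create_fn, prompt, max_retries=3):
    """Call, extract text, parse, validate; retry the whole pipeline on any error."""
    last_exc = None
    attempts = 1 + max_retries
    for _ in range(attempts):
        try:
            resp = create_fn(prompt)              # stands in for chat.completions.create(...)
            choices = resp.get("choices") or []
            if not choices:
                raise ValueError("response has no choices")
            raw_text = choices[0]["text"]         # stands in for extract_response_text()
            data = json.loads(raw_text)           # malformed or truncated JSON raises here
            spec = IntentSpec(task=data["task"])  # stands in for model_validate()
            return spec, raw_text
        except Exception as exc:
            last_exc = exc                        # remember the failure and try again
    raise RuntimeError(f"failed after {attempts} attempts") from last_exc


# Usage: an empty response, then truncated JSON, then a valid response.
responses = iter([
    {"choices": []},
    {"choices": [{"text": '{"task": "pick'}]},
    {"choices": [{"text": '{"task": "pick_and_place"}'}]},
])
spec, raw = generate_spec(lambda prompt: next(responses), "put the cube in the bin")
print(spec)
```

Because the API call and the parsing both sit inside the `try` block, an empty or truncated response is retried the same way as a network error.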

Reviews (2): Last reviewed commit: "Fix exception handling"

qianl-nv added 2 commits June 11, 2026 07:57

Increase max token to avoid truncating response

0f5ce1e

Add agent retries

54cdcc2

qianl-nv requested review from alexmillane, cvolkcvolk, peterd-NV, viiik-inside, xyao-nv and zhx06 as code owners June 11, 2026 15:01

Merge remote-tracking branch 'origin/main' into qianl/dev/agentic_retry

e0c5a55

greptile-apps Bot reviewed Jun 11, 2026

Comment thread isaaclab_arena/agentic_environment_generation/environment_generation_agent.py Outdated

Comment thread isaaclab_arena/agentic_environment_generation/environment_generation_agent.py Outdated

Fix exception handling

ee04bd8

xyao-nv approved these changes Jun 11, 2026


qianl-nv wants to merge 4 commits into main from qianl/dev/agentic_retry

2 participants
