Reduce EnvGenAgent spurious failures #782
Greptile Summary
This PR reduces
Confidence Score: 5/5
Safe to merge: the retry wrapper correctly covers the entire API call and response-parsing pipeline, and the new tests validate both the success-after-retry and the exhaust-retries paths.
The change is small and targeted: a retry loop around an existing API call and a token-limit bump. The retry loop now correctly covers all failure modes (network errors, empty responses, parse failures) that it previously missed. New unit tests verify the retry behavior end-to-end.
No files require special attention; both changed files are straightforward and well-tested.
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Caller
participant generate_spec
participant OpenAI_API as OpenAI API
Caller->>generate_spec: "generate_spec(prompt, max_retries=3)"
loop attempt in range(1 + max_retries)
generate_spec->>OpenAI_API: chat.completions.create(...)
alt Success
OpenAI_API-->>generate_spec: resp with choices
generate_spec->>generate_spec: extract_response_text()
generate_spec->>generate_spec: json.loads() + model_validate()
generate_spec-->>Caller: (EnvironmentIntentSpec, raw_text)
else Any Exception
OpenAI_API-->>generate_spec: raises Exception
generate_spec->>generate_spec: store last_exc, continue
end
end
generate_spec-->>Caller: raise RuntimeError("failed after N attempts")
Reviews (2): Last reviewed commit: "Fix exception handling"
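The flow in the sequence diagram can be sketched in a few lines of Python. This is a minimal illustration only, not the PR's actual code: `generate_with_retries`, `call_model`, and `make_flaky` are hypothetical names, the model call is injected as a plain callable instead of a real OpenAI client, and a simple "must be a JSON object" check stands in for `model_validate()`.

```python
import json
from typing import Callable, Optional, Tuple


def generate_with_retries(
    call_model: Callable[[str], str],  # hypothetical stand-in for the API call
    prompt: str,
    max_retries: int = 3,
) -> Tuple[dict, str]:
    """Run the call + parse pipeline, retrying the whole thing on any error."""
    last_exc: Optional[Exception] = None
    attempts = 1 + max_retries
    for _ in range(attempts):
        try:
            raw_text = call_model(prompt)
            if not raw_text:
                raise ValueError("empty response")
            spec = json.loads(raw_text)  # parse failures are retried too
            if not isinstance(spec, dict):
                raise ValueError("expected a JSON object")
            return spec, raw_text
        except Exception as exc:
            # Remember the most recent failure and try again.
            last_exc = exc
    raise RuntimeError(f"failed after {attempts} attempts") from last_exc


def make_flaky(replies):
    """Test helper: returns a callable that replays canned replies or errors."""
    it = iter(replies)

    def call(prompt: str) -> str:
        item = next(it)
        if isinstance(item, Exception):
            raise item
        return item

    return call


# Two bad attempts (a timeout, then an empty reply) followed by a valid one.
flaky = make_flaky([TimeoutError("timed out"), "", '{"name": "env"}'])
spec, raw = generate_with_retries(flaky, "describe an environment")
```

Keeping the parsing and validation inside the `try` block is what lets a retry recover from empty or truncated responses as well as from network errors.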
Summary
Reduce failure rate of the EnvGenAgent
Detailed description
A single EnvGenAgent call fails about 20% of the time, mostly because of server-side issues. We need to reduce this for a better user experience.
-- double the max token length to reduce the chance of truncation (which occasionally happens with long reasoning text)
-- add retries to recover from spurious network issues or timeouts
This reduces the failure rate to <1% with the default prompt.
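As a rough sanity check on the <1% figure, the sketch below estimates the residual failure rate under a simplifying assumption that is not stated in the PR: each attempt fails independently with the same probability. The function name `residual_failure_rate` is hypothetical, and the retry count of 3 is taken from the `max_retries=3` call shown in the sequence diagram.

```python
def residual_failure_rate(p_single: float, max_retries: int) -> float:
    """Probability that every one of the (1 + max_retries) attempts fails.

    Assumes attempts fail independently with the same probability, which may
    not hold if failures come from a sustained server-side outage.
    """
    return p_single ** (1 + max_retries)


# ~20% single-call failure rate with 3 retries (4 attempts in total).
rate = residual_failure_rate(0.20, 3)
print(f"{rate:.2%}")
```

Under these assumptions the result is about 0.16%, which is consistent with the reported <1%. The token-limit increase also lowers the per-attempt failure probability, which this estimate does not model.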