fix(core): bound streamText fullStream iteration with an inactivity timeout (#515)#518
fix(core): bound streamText fullStream iteration with an inactivity timeout (#515)#518MohammadYusif wants to merge 2 commits into
Conversation
…imeout (mozilla#515) The `timeout`/`abortSignal` passed to `streamText` in `WebAgent.generateAndProcessAction` only bound the initial request. The `for await (...streamResult.fullStream)` that consumes the stream was not independently bounded, so a provider that stalled mid-stream (partial data then silence, or a half-open TCP connection that never sends a FIN) could leave the iteration pending forever and freeze the agent with no error and no progress. Wrap the stream consumption in `iterateWithInactivityTimeout`, which races each `next()` against the existing `llmProviderTimeoutMs` as an inactivity window and rejects with a descriptive error if no chunk arrives in time. The rejection flows into the existing try/catch and is surfaced like any other generation failure; the underlying iterator's `return()` is invoked on exit so the stream's resources are released. Reuses the configured provider timeout rather than adding a new knob, mirroring the request-level bound. Same indefinite-hang class as the aria-tree frame.evaluate fix in mozilla#511.
|
Thank you for this, @MohammadYusif — and apologies for the slow turnaround. 🙏 You actually filed this before #521, and you correctly diagnosed the same unbounded-stream hang. I'm going to close this one as superseded rather than rejected — and I want to explain the why, because your approach is genuinely the tighter of the two. #521 ended up bounding the whole iteration with a watchdog ( Your
So merging on top wouldn't break anything. The reason I'm holding off is that the marginal benefit is small relative to the cost:
If we find in practice that the 300s watchdog is too coarse for real mid-stream stalls, your So: close-as-superseded, not a rejection — just unlucky timing on parallel work. I really appreciate the clear writeup, the tests, and the careful reasoning, and I hope to see more PRs from you. 🙂 |
Closes #515.
Problem
WebAgent.generateAndProcessActionpassestimeout/abortSignaltostreamText, but those only bound the initial request. Thefor await (...streamResult.fullStream)that consumes the stream is not independently bounded. A provider that stalls mid-stream (partial data then silence, or a half-open TCP connection that never sends FIN) can leave the iteration hanging forever and freeze the agent with no error and no progress.Same indefinite-hang class as the aria-tree
frame.evaluatefix in #511.Fix
Wrap the stream consumption in
iterateWithInactivityTimeout, which races eachnext()against the existingllmProviderTimeoutMsas an inactivity window and rejects with a descriptive error if no chunk arrives in time. The rejection flows into the existing try/catch and surfaces like any generation failure; the iterator'sreturn()is invoked on exit to release stream resources. Reuses the configured provider timeout rather than adding a new knob — mirrors the request-level bound.Testing
return()is called on timeout — 3 tests inpackages/core/test/webAgent.test.tsconfig.test.ts, confirmed on clean checkout, unrelated)PR Type
Checklist
pnpm test)AI Usage