fix runtime issue by arkml · Pull Request #516 · rowboatlabs/rowboat

arkml · 2026-04-21T07:34:43Z

Fix intermittent assistant no-response issue by making run triggering resilient and runtime
failures observable

This change addresses the intermittent cases where the assistant appeared to stop responding,
sometimes after a few tool calls and sometimes even after the first user message.

Primary fix: prevent queued work from getting stranded

The assistant runtime uses a per-run lock so only one processing loop runs for a conversation
at a time. Previously, if a new trigger arrived while that lock was held, the trigger was
dropped after logging “unable to acquire lock.” That is fine only if the active loop naturally
discovers all new work before it exits. In practice, there are edge cases around run shutdown,
permission/human-input continuations, and queued user messages where work can be enqueued while
the active loop is near completion or paused, then never picked up again.

The runtime now records that a rerun was requested when it cannot acquire the lock. Once the
active run releases the lock, it checks that flag and immediately triggers the run again. This
makes trigger delivery level-triggered as well as edge-triggered: the rerun flag persists until
it is serviced, so even if the immediate trigger loses the lock race, the queued message or
continuation is processed after the current cycle ends.

This directly addresses the “assistant does not respond” symptom because messages and
continuation events are no longer silently stranded behind a missed trigger.
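
The rerun-request pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual runtime code: RunScheduler, processRun, and rerunRequested are hypothetical names, and the per-run lock is modeled as a simple in-memory set.

```typescript
// Hypothetical sketch: RunScheduler, processRun, and rerunRequested are
// illustrative names, not identifiers taken from the rowboat codebase.
type ProcessRun = (runId: string) => Promise<void>;

class RunScheduler {
  private readonly processRun: ProcessRun;
  // Runs that currently have an active processing loop (the "lock").
  private readonly locked = new Set<string>();
  // Runs that were triggered while their lock was held.
  private readonly rerunRequested = new Set<string>();

  constructor(processRun: ProcessRun) {
    this.processRun = processRun;
  }

  async trigger(runId: string): Promise<void> {
    if (this.locked.has(runId)) {
      // Instead of dropping the trigger, record that another pass is needed.
      this.rerunRequested.add(runId);
      return;
    }
    this.locked.add(runId);
    try {
      await this.processRun(runId);
    } finally {
      // Release the lock first, then honor any trigger that lost the race.
      this.locked.delete(runId);
      if (this.rerunRequested.delete(runId)) {
        await this.trigger(runId);
      }
    }
  }
}
```

In this sketch the flag is checked inside finally so that a rerun still happens when the first pass throws. Whether the real runtime does the same is an assumption.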

Secondary fix: surface runtime failures instead of ending silently

Previously, runs:createMessage fired runtime.trigger(runId) in the background and returned
success to the renderer. If the background runtime later threw outside the model stream’s own
error-event path, the UI could receive only run-processing-end without any error event or
assistant message. From the user’s perspective, the assistant simply stopped.

The runtime now catches non-abort failures at the top level, logs them, persists a run error
event, publishes that event to the renderer, and still emits run-processing-end for cleanup.
That means model setup failures, provider/client exceptions, run-state issues, serialization
failures, and other unexpected runtime exceptions are visible in the chat instead of looking
like a silent no-op.
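
A rough sketch of that top-level handling, under stated assumptions: RunEvent, RunDeps, isAbortError, and runWithErrorSurfacing are hypothetical names, the error event shape is invented for illustration, and aborts are assumed to be identifiable by the error name "AbortError". Only the run-processing-end event name comes from the description above.

```typescript
// Hypothetical event and dependency shapes, for illustration only.
type RunEvent =
  | { type: "error"; runId: string; message: string }
  | { type: "run-processing-end"; runId: string };

interface RunDeps {
  persist(event: RunEvent): Promise<void>; // store the event with the run
  publish(event: RunEvent): void;          // push the event to the renderer
}

// Assumption: aborts surface as errors named "AbortError".
function isAbortError(err: unknown): boolean {
  return err instanceof Error && err.name === "AbortError";
}

async function runWithErrorSurfacing(
  runId: string,
  body: () => Promise<void>,
  deps: RunDeps,
): Promise<void> {
  try {
    await body();
  } catch (err) {
    if (!isAbortError(err)) {
      // Log, persist, and publish the failure so the chat can show it.
      console.error(`run ${runId} failed:`, err);
      const message = err instanceof Error ? err.message : String(err);
      const event: RunEvent = { type: "error", runId, message };
      await deps.persist(event);
      deps.publish(event);
    }
  } finally {
    // Cleanup signal is emitted on success, failure, and abort alike.
    deps.publish({ type: "run-processing-end", runId });
  }
}
```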

Additional robustness: tool and model stream errors recover more cleanly

Tool execution exceptions are now converted into tool-result error payloads when they are not
aborts. This lets the model observe that a tool failed and continue or explain the issue,
instead of the entire run dying mid-tool-call.
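
One way to picture this conversion, as a sketch only: ToolResult and executeToolSafely are hypothetical names, the payload shape is invented, and aborts are assumed to be rethrown so the run can still be cancelled.

```typescript
// Hypothetical payload shape handed back to the model after a tool call.
type ToolResult = { toolCallId: string; isError: boolean; output: unknown };

async function executeToolSafely(
  toolCallId: string,
  run: () => Promise<unknown>,
): Promise<ToolResult> {
  try {
    return { toolCallId, isError: false, output: await run() };
  } catch (err) {
    // Assumption: aborts are named "AbortError" and must keep propagating.
    if (err instanceof Error && err.name === "AbortError") {
      throw err;
    }
    // Any other failure becomes a result the model can read and react to.
    const message = err instanceof Error ? err.message : String(err);
    return { toolCallId, isError: true, output: { error: message } };
  }
}
```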

Model stream setup and iteration errors are also converted into model stream error events.
The existing stream error handling can then produce a run error event and show it in the UI.
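
As an illustration of the idea, an async generator can wrap both stream setup and iteration so that a thrown exception becomes an ordinary error event. StreamEvent and withStreamErrors are hypothetical names, and the event shapes are assumptions rather than the runtime's real types.

```typescript
// Hypothetical event shapes for a model stream.
type StreamEvent =
  | { type: "text-delta"; text: string }
  | { type: "error"; error: string };

type StreamSource = AsyncIterable<StreamEvent> | Promise<AsyncIterable<StreamEvent>>;

async function* withStreamErrors(
  open: () => StreamSource,
): AsyncGenerator<StreamEvent> {
  try {
    // Failures while opening the stream are caught below...
    const stream = await open();
    // ...and so are failures raised part-way through iteration.
    for await (const event of stream) {
      yield event;
    }
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    // Emit the failure as an event so downstream error handling can see it.
    yield { type: "error", error: message };
  }
}
```

Events that arrived before the failure are still delivered. The error event simply becomes the last item in the stream.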

Small cleanup

The call sites that intentionally fire runtime.trigger(runId) in the background now use
void runtime.trigger(runId) to make that behavior explicit.
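
For readers unfamiliar with the idiom, here is a small sketch of a fire-and-forget call marked with void. The runtime object and createMessage function below are hypothetical stand-ins, not the real handlers.

```typescript
// Hypothetical stand-in for the runtime; it records which runs finished.
const runtime = {
  completed: [] as string[],
  async trigger(runId: string): Promise<void> {
    await Promise.resolve(); // simulate asynchronous processing
    runtime.completed.push(runId);
  },
};

function createMessage(runId: string): { success: boolean } {
  // `void` evaluates the call and discards the returned promise, which
  // signals to readers that the promise is deliberately not awaited.
  void runtime.trigger(runId);
  return { success: true };
}
```

Because nothing awaits the promise, the caller never observes a rejection, which is why the top-level error handling described earlier matters for these call sites.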

Net effect

The main behavior change is that assistant work is no longer dropped when a trigger races with
an active run lock. If something still fails, the failure is now surfaced as a visible run
error rather than leaving the user with an indefinitely silent or completed-looking
conversation.

fix runtime issue

f422d37

arkml wants to merge 1 commit into dev from fix_runtime2