Skip to content

refactor(desktop): share tool activity block updates#8470

Draft
eulicesl wants to merge 2 commits into
BasedHardware:mainfrom
eulicesl:draft/dry-tool-activity
Draft

refactor(desktop): share tool activity block updates#8470
eulicesl wants to merge 2 commits into
BasedHardware:mainfrom
eulicesl:draft/dry-tool-activity

Conversation

@eulicesl

@eulicesl eulicesl commented Jun 27, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Extracts shared tool-call content-block update logic into ChatContentBlock.applyToolActivity.
  • Routes both main chat (ChatProvider) and task chat (TaskChatState) through the shared helper.
  • Aligns task chat terminal status mapping with ChatProvider.mapBridgeToolStatus, so failed/cancelled/interrupted statuses do not get flattened to completed.

Why draft

This is follow-up cleanup from PR #7555. It is not urgent for the stall telemetry rollout, but it removes duplicated logic that had already started to drift.

Verification

  • xcrun swift test -c debug --package-path desktop/macos/Desktop --filter 'StallDetectorTests|ToolCallStatusTests' -> 23 tests, 0 failures
  • xcrun swift build -c debug --package-path desktop/macos/Desktop -> build complete
  • python3 .github/scripts/check-desktop-changelog.py --base upstream/main --head HEAD -> Desktop changelog fragment found
  • git diff --check -> clean

Review in cubic

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the desktop chat tool-activity rendering logic to reduce duplication and prevent status-handling drift between the main chat surface and task chat surface. It centralizes the “tool call content block update” behavior and reuses the same bridge-status mapping in both locations so terminal failures aren’t accidentally shown as successful completions.

Changes:

  • Extracted shared tool-call content-block update logic into ChatContentBlock.applyToolActivity.
  • Routed both ChatProvider and TaskChatState tool-activity updates through the shared helper.
  • Updated task chat to use ChatProvider.mapBridgeToolStatus so failed/cancelled/interrupted aren’t flattened into .completed.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
desktop/macos/Desktop/Sources/Providers/ChatProvider.swift Adds the shared ChatContentBlock.applyToolActivity helper and routes ChatProvider.addToolActivity through it.
desktop/macos/Desktop/Sources/ProactiveAssistants/Assistants/TaskAgent/TaskChatState.swift Uses the shared bridge-status mapping and shared tool-activity block update helper to match main chat behavior.
desktop/macos/changelog/unreleased/dry-tool-activity.json Adds a desktop changelog fragment documenting the refactor.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@Git-on-my-level

Copy link
Copy Markdown
Collaborator

Nice cleanup. Routing both chat surfaces through ChatContentBlock.applyToolActivity and aligning TaskChatState on mapBridgeToolStatus removes real drift — the old task-chat path was flattening failed/cancelled/interrupted to .completed, so this is a correctness fix, not just deduplication.

A couple of observations for when you mark this ready for review:

  • The shared helper (and its in-flight matching via isInFlight, plus the toolUseId fallback to name) is currently only covered transitively via ToolCallStatusTests/StallDetectorTests. Since it is now the single source of truth for tool-call block updates across both surfaces, a focused test on applyToolActivity itself (create-new, update-in-flight, resolve-terminal by id vs by name, no-op when already terminal) would lock the shared contract in place.
  • Minor: the changelog fragment says "Shared desktop chat tool activity updates across main and task chat surfaces" — accurate and source-backed, good.

No formal approval since this is still a draft. Will review the final version.

@Git-on-my-level Git-on-my-level added the needs-tests Needs additional or corrected test coverage label Jun 28, 2026
@eulicesl

Copy link
Copy Markdown
Collaborator Author

Added focused unit coverage for ChatContentBlock.applyToolActivity in be1e3e278: create-new, update in-flight by toolUseId, legacy name fallback when toolUseId arrives later, terminal resolution by id vs name, latest-name fallback, and no-op when the matching row is already terminal.

@eulicesl eulicesl removed the needs-tests Needs additional or corrected test coverage label Jun 28, 2026
@Git-on-my-level

Copy link
Copy Markdown
Collaborator

Nice cleanup — extracting the duplicated addToolActivity logic into ChatContentBlock.applyToolActivity(to:) and routing both ChatProvider and TaskChatState through it reads cleanly and removes the // TODO: DRY these debt.

Worth flagging that this is more than a mechanical refactor — it corrects two latent behaviors along the way:

  1. Task chat now goes through ChatProvider.mapBridgeToolStatus instead of status == "started" ? .running : .completed, so failed/cancelled/interrupted tool calls now resolve as .failed rather than being flattened to .completed. Good catch — that's a real bug fix.
  2. The shared running-branch in-place-update guard is unified on st.isInFlight (previously ChatProvider matched only st == .running), so a .slow/.stalled tool that later receives a .running status with input no longer appends a duplicate running block. Consistent with the terminal-state branch, which already used isInFlight at both call sites.

One gap before this is marked ready for review: the new shared helper (applyToolActivity) — its update-in-place vs. append branches and the slow/stalled + late-input path — doesn't have direct unit coverage. The existing ToolCallStatusTests/StallDetectorTests cover the primitives and the detector, but not this centralized block-mutation logic. A few focused tests on the helper (update by toolUseId; update by name when isInFlight; append when no match; terminal status preserved; no duplicate on slow/stalled + late running+input; task-chat failed/cancelled/interrupted → .failed) would let this graduate cleanly.

Adding needs-tests for that. Otherwise looks good.

@Git-on-my-level Git-on-my-level added needs-tests Needs additional or corrected test coverage and removed needs-tests Needs additional or corrected test coverage labels Jun 28, 2026
@Git-on-my-level

Copy link
Copy Markdown
Collaborator

Thanks @euliceslbe1e3e278 addresses the testing gap. The six cases in ChatContentBlockToolActivityTests cover the shared applyToolActivity contract directly: create-new, update-in-flight by toolUseId, legacy name fallback when toolUseId arrives later (and correctly preserves a detector-promoted .slow rather than clobbering it to .running), terminal resolution by id-before-name, latest-name fallback, and no-op when the matching row is already terminal. That matches every scenario from the earlier review.

Removing needs-tests. The refactor itself reads cleanly: the single helper is now the source of truth, the task-chat mapBridgeToolStatus alignment is a genuine correctness fix, and the changelog fragment is accurate. One note for a human maintainer before merge — the running-branch matching was unified on isInFlight (previously ChatProvider matched only == .running), which is the intended behavior but is a subtle behavioral change worth a glance on macOS.

Still a draft, so no formal approval from me. Looks good to mark ready once a maintainer confirms.

@Git-on-my-level

Copy link
Copy Markdown
Collaborator

Nice cleanup. Extracting the duplicated tool-activity logic into ChatContentBlock.applyToolActivity reads cleanly, and the new tests cover the important branches well (toolUseId match, legacy name fallback, latest-in-flight resolution, and no-rewrite of terminal blocks).

One thing worth surfacing beyond the pure DRY win: TaskChatState was previously mapping bridge status as started ? .running : .completed, which silently flattened failed/cancelled/interrupted tool calls to completed in the task chat surface. Routing it through mapBridgeToolStatus fixes that, so failed calls now resolve correctly. Good catch folding that into the refactor rather than preserving the drift.

This looks mergeable to me once you're ready to flip it out of draft. Is anything still pending, or was the draft marker just process/caution?

@eulicesl

eulicesl commented Jun 28, 2026

Copy link
Copy Markdown
Collaborator Author

No implementation work is still pending from my side. The draft marker is process/caution: this was created as a later cleanup PR, and the requested focused ChatContentBlock.applyToolActivity coverage is now in place. I’ll leave it draft until ready to review/promote the cleanup, but the reviewer comments are addressed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants