[WIP] Start on a coding agent demo #117

Draft: msullivan wants to merge 60 commits into main from tau-agent

Conversation

@msullivan msullivan commented May 13, 2026

tau, a crappy coding agent using Textual.
(Even though I don't really like alternate screen coding agents, but.)

vercel Bot commented May 13, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| ai-python | Ready | Preview, Comment | May 22, 2026 6:12pm |

socket-security Bot commented May 13, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

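| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| --- | --- | --- | --- | --- | --- | --- |
| Added | pypi/rich@15.0.0 | 98 | 100 | 100 | 100 | 100 |
| Added | pypi/pydantic@2.13.4 | 100 | 100 | 100 | 100 | 100 |
| Added | pypi/modelsdotdev@0.20260516.1 | 100 | 100 | 100 | 100 | 100 |
| Added | pypi/ruff@0.15.13 | 100 | 100 | 100 | 100 | 100 |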

msullivan added 14 commits May 22, 2026 11:02
The composer is no longer disabled mid-turn.  Submissions go into a
pending queue; run_turn is the sole consumer and loops until drained,
popping one queued message per turn so user/assistant alternation
stays clean.
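
As a rough illustration of the queueing behavior described above, here is a minimal synchronous sketch. `TurnQueue` and its `respond` callback are hypothetical stand-ins (the real app is async and calls the `ai` library); only the "one queued message per turn, loop until drained" shape is taken from the commit message.

```python
from collections import deque


class TurnQueue:
    """Hypothetical stand-in for the app's pending queue and busy flag."""

    def __init__(self, respond):
        self.pending = deque()   # submissions waiting for a turn
        self.messages = []       # alternating user/assistant transcript
        self.busy = False
        self._respond = respond  # placeholder for the real model call

    def submit(self, text):
        """Queue a submission; start draining unless a turn is already running."""
        self.pending.append(text)
        if not self.busy:
            self.run_turn()

    def run_turn(self):
        """Sole consumer: pop one message per turn until the queue is empty."""
        self.busy = True
        try:
            while self.pending:
                user_text = self.pending.popleft()
                self.messages.append(("user", user_text))
                reply = self._respond(user_text)
                self.messages.append(("assistant", reply))
        finally:
            self.busy = False
```

Because a submission that arrives while `busy` is set is only appended, the transcript keeps strict user/assistant alternation even when several messages are typed during one turn.
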
All interaction with the `ai` library now lives in `chat_loop(app)`
at the top of the file.  TauApp owns public state (model, agent,
messages, pending) that chat_loop reads.  run_turn shrinks to a thin
worker wrapper around the busy flag.
Streaming deltas only call scroll_end when the transcript was already
at the bottom — scroll up to read earlier output and the stream keeps
writing offscreen.  The scrollbar itself is hidden (scrollbar-size: 0
0) because its per-chunk thumb motion was visually noisy; arrow keys,
pageup/pagedown, and mouse wheel still scroll.
tools.py mirrors pi's built-in surface: read, write, edit, bash, grep,
find, ls.  Same schema shapes and continuation/truncation behavior:

- read: 1-indexed offset/limit; head truncation at 2000 lines or 50KB
  with a 'use offset=N to continue' hint; first-line-too-big escape
  pointing at sed | head -c.
- write/edit/bash: require_approval=True so the agent's default loop
  gates them behind a ToolApproval hook.
- edit: exact-match, must-be-unique str_replace; multiple disjoint
  edits per call, applied right-to-left against the original file.
- bash: tail truncation (errors live at the end), exit-code footer.
- grep/find: skip .git/node_modules/etc; respect limit/byte caps.
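
The edit semantics in the list above (exact match, must be unique, several disjoint edits applied right-to-left against the original) can be sketched as follows. `apply_edits` is a hypothetical helper written for illustration, not the function in tools.py, and its error messages are assumptions.

```python
def apply_edits(original: str, edits: list[tuple[str, str]]) -> str:
    """Apply (old, new) replacements, each of which must match exactly once."""
    spans = []
    for old, new in edits:
        if not old:
            raise ValueError("old text must be non-empty")
        start = original.find(old)
        if start == -1:
            raise ValueError(f"no match for {old!r}")
        if original.find(old, start + 1) != -1:
            raise ValueError(f"match for {old!r} is not unique")
        spans.append((start, start + len(old), new))

    # All offsets refer to the original text, so edits must not overlap.
    spans.sort()
    for (_, prev_end, _), (next_start, _, _) in zip(spans, spans[1:]):
        if next_start < prev_end:
            raise ValueError("edits overlap")

    # Applying from the rightmost span first keeps earlier offsets valid.
    result = original
    for start, end, new in reversed(spans):
        result = result[:start] + new + result[end:]
    return result
```

Working right-to-left means no offset needs adjusting after a replacement changes the length of the text.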

chat_loop now handles ToolEnd, ToolCallResult, and HookEvent — tool
calls and their results render as 'tool' bubbles below the streaming
text, and pending approvals turn the composer placeholder into a
[y/n] prompt.  Non-y/n input during an approval falls through to the
message queue without resolving the hook.
Replaces the y/n-in-the-composer approval flow with a HookPrompt
widget that mounts above the composer when a tool fires its approval
hook.  Yellow rounded border, shows the tool name + args; single-key
shortcuts (y/a approve, n/d deny) resolve the hook.  Focus shifts to
the prompt automatically and returns to the composer on resolution.
Tab cycles between prompt and composer if the user wants to look
something up before deciding; the hook stays pending until y/n.

Drops the parallel y/n branch from on_composer_submitted; the prompt
owns the decision now.
Only bash still requires approval.  File mutations through write/edit
are fine — they're targeted, reversible, and the approval prompt
became friction more than safety once both fired several times per
turn.
… messages

When TAU_ADVERTISE=1 is set, appends an instruction to the system prompt
asking the model to include a Co-authored-by trailer in any commit messages
it writes or suggests.
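
A minimal sketch of the opt-in described above. The instruction wording and the `build_system_prompt` helper are assumptions for illustration; only the `TAU_ADVERTISE=1` switch and the trailer format come from this PR.

```python
import os

# Hypothetical wording; the commit does not show the real instruction text.
ADVERTISE_INSTRUCTION = (
    "When you write or suggest a commit message, include the trailer "
    "'Co-authored-by: {model}, via tau'."
)


def build_system_prompt(base: str, model: str, env=None) -> str:
    """Append the trailer instruction only when TAU_ADVERTISE=1 is set."""
    env = os.environ if env is None else env
    if env.get("TAU_ADVERTISE") == "1":
        return base + "\n\n" + ADVERTISE_INSTRUCTION.format(model=model)
    return base
```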

Co-authored-by: anthropic/claude-sonnet-4.6, via tau
Co-authored-by: anthropic/claude-opus-4.6, via tau
Add a Static widget below the composer that displays running totals
for input, output, cache-read, and cache-write tokens.  Updated on
each turn and when restoring a session.

Co-authored-by: anthropic/claude-opus-4.6, via tau
Derive context estimate from the last assistant turn's input + output
tokens and display it as 'ctx: ~N' in the footer bar.
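
The estimate works because the last assistant turn's input presumably already contains the whole prior conversation, so adding that turn's output approximates what the next request will carry. A small sketch, assuming a hypothetical `Usage` record with `input_tokens` and `output_tokens` fields:

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Hypothetical per-turn usage record; the field names are assumptions."""
    input_tokens: int = 0
    output_tokens: int = 0


def context_estimate(turns: list[Usage]) -> int:
    """Estimate context size from the most recent assistant turn only."""
    if not turns:
        return 0
    last = turns[-1]
    return last.input_tokens + last.output_tokens


def format_ctx(turns: list[Usage]) -> str:
    """Render the footer fragment in the 'ctx: ~N' form."""
    return f"ctx: ~{context_estimate(turns)}"
```

For example, `format_ctx([Usage(100, 20), Usage(1500, 250)])` gives `ctx: ~1750`; earlier turns are ignored rather than summed.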

Co-authored-by: anthropic/claude-opus-4.6, via tau
Pass providerOptions.gateway.caching=auto as params to agent.run().
Update the usage footer to show uncached input as 'in' and
cache-read tokens separately as 'cached'.

Co-authored-by: anthropic/claude-opus-4.6, via tau
Co-authored-by: anthropic/claude-opus-4.6, via tau
Bash approval prompts now offer four options: [y] approve once,
[n] deny, [!] always approve this exact command, [a] always approve
all commands for the rest of the session.
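
One way the four options could map onto session state is sketched below. `BashApprovals` is a hypothetical class written for this illustration; the key meanings are taken from the commit message, everything else is an assumption.

```python
class BashApprovals:
    """Hypothetical session-scoped record of bash approval decisions."""

    def __init__(self):
        self.approved_commands = set()  # exact command strings approved via [!]
        self.approve_all = False        # set by [a] for the rest of the session

    def is_preapproved(self, command: str) -> bool:
        """True when no prompt is needed for this command."""
        return self.approve_all or command in self.approved_commands

    def decide(self, command: str, key: str) -> bool:
        """Record the operator's key press and return whether to run the command."""
        if key == "y":
            return True
        if key == "n":
            return False
        if key == "!":
            self.approved_commands.add(command)
            return True
        if key == "a":
            self.approve_all = True
            return True
        raise ValueError(f"unknown approval key: {key!r}")
```

Note that `[!]` matches the exact command string only, so `ls -la` stays unapproved after `ls` is remembered.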

Co-authored-by: anthropic/claude-opus-4.6, via tau
msullivan added 19 commits May 22, 2026 11:02
Co-authored-by: anthropic/claude-opus-4.6, via tau
Session replay (iterating messages and creating bubbles) is now a free
function alongside chat_loop, following the same pattern of taking the
app and calling its rendering methods.

Co-authored-by: anthropic/claude-opus-4.6, via tau
Co-authored-by: anthropic/claude-opus-4.6, via tau
…ember

Replace the three separate methods (approve_command, approve_directory,
approve_all) and the if-chain in on_hook_prompt_decided with a single
remember(hook, decision) method on ApprovalTracker.

Co-authored-by: anthropic/claude-opus-4.6, via tau
ApprovalTracker.resolve handles both updating approval state and
calling ai.resolve_hook. Removes _resolve_hook from TauApp — the app
just shows the system message based on the returned bool.

Co-authored-by: anthropic/claude-opus-4.6, via tau
All UI code (HookPrompt, ApprovalTracker, TauApp hook plumbing) now
works with our own Hook type. The ai.messages.HookPart is converted
once in _run_turn via Hook.from_event. The only remaining reference
to HookPart is in that constructor.

Co-authored-by: anthropic/claude-opus-4.6, via tau
Co-authored-by: anthropic/claude-opus-4.6, via tau
ApprovalTracker.check now returns a list of PromptOption when the
operator needs to decide, instead of bare None. HookPrompt renders
whatever options it's given and uses _on_key for dynamic dispatch
instead of static BINDINGS. The widget no longer knows about tool
categories.

Co-authored-by: anthropic/claude-opus-4.6, via tau
Put the arrow marker before the newline, not after it, so tool result
bubbles don't start with a blank line.

Co-authored-by: anthropic/claude-opus-4.6, via tau
…omment

- Move _bell from TauApp staticmethod to module-level function.
- Drop unnecessary async from on_text_area_changed.
- Update tools.py docstring to explain why all tools use
  require_approval=True.

Co-authored-by: anthropic/claude-opus-4.6, via tau
Handle ai.events.PartialToolCallResult in the agent loop to stream
tool output (e.g. bash) into the transcript as it arrives, instead of
waiting for the final ToolCallResult. Each tool_call_id gets its own
bubble; final results are skipped for tools that were already streamed.
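
The bookkeeping described above can be sketched without any UI. `ToolBubbles` is a hypothetical stand-in for the transcript, and the sketch assumes each partial event carries a new chunk of text rather than the cumulative output.

```python
class ToolBubbles:
    """Hypothetical transcript model: one text bubble per tool_call_id."""

    def __init__(self):
        self.bubbles = {}      # tool_call_id -> text shown so far
        self.streamed = set()  # ids that already received partial output

    def on_partial(self, tool_call_id, chunk: str) -> None:
        """Append a streamed chunk to the bubble for this call."""
        if tool_call_id is None:
            return  # nothing to attach the chunk to
        self.bubbles[tool_call_id] = self.bubbles.get(tool_call_id, "") + chunk
        self.streamed.add(tool_call_id)

    def on_final(self, tool_call_id, output: str) -> None:
        """Render the final result unless the call was already streamed."""
        if tool_call_id in self.streamed:
            return
        self.bubbles[tool_call_id] = output
```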

Co-authored-by: anthropic/claude-opus-4.6, via tau
Prevents child processes from inheriting the terminal's stdin, which
would conflict with Textual's input handling.
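
The commit does not show how this is done; one common approach with the standard library is to point the child's stdin at the null device, as in this sketch (`run_without_stdin` is a hypothetical helper name).

```python
import subprocess
import sys


def run_without_stdin(argv: list[str]) -> subprocess.CompletedProcess:
    """Run a command whose stdin is the null device instead of the terminal."""
    return subprocess.run(
        argv,
        stdin=subprocess.DEVNULL,  # child reads immediate EOF, never the TTY
        capture_output=True,
        text=True,
    )


child_code = "import sys; print(repr(sys.stdin.read()))"
result = run_without_stdin([sys.executable, "-c", child_code])
print(result.stdout.strip())  # '' (the child sees end-of-file right away)
```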

Co-authored-by: anthropic/claude-opus-4.6, via tau
Iterate assistant message parts instead of just using msg.text, so
ReasoningPart (thinking), ToolCallPart (tool invocations), and
TextPart are all rendered into the transcript on resume.

Co-authored-by: anthropic/claude-opus-4.6, via tau
Walk up from cwd looking for AGENTS.md and append its contents to the
system prompt. Follows the convention used by other coding agents for
project-level instructions.
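
A minimal sketch of the upward search, assuming the nearest `AGENTS.md` wins and that its contents are appended after a blank line. Both choices, and the helper names, are assumptions rather than details confirmed by the commit.

```python
from pathlib import Path
from typing import Optional


def find_agents_md(start: Path) -> Optional[Path]:
    """Return the nearest AGENTS.md at or above `start`, or None."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / "AGENTS.md"
        if candidate.is_file():
            return candidate
    return None


def with_project_instructions(system_prompt: str, start: Path) -> str:
    """Append the project's AGENTS.md to the system prompt when one exists."""
    found = find_agents_md(start)
    if found is None:
        return system_prompt
    return system_prompt + "\n\n" + found.read_text(encoding="utf-8")
```

A caller would typically pass `Path.cwd()` as `start`.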

Co-authored-by: anthropic/claude-opus-4.6, via tau
Detect image files (jpg, png, gif, webp) via magic bytes and return
them as base64-encoded image content parts instead of dumping binary
as garbled text. Uses the Vercel AI SDK multi-part content format
(text + image) so the gateway can pass images through to vision models.
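
Magic-byte detection for the four formats can be sketched with the well-known file signatures. `sniff_image_type` and `read_result` are hypothetical helpers, and the dictionary shape of the content parts is an assumption for illustration, not the SDK's actual types.

```python
import base64
from typing import Optional


def sniff_image_type(data: bytes) -> Optional[str]:
    """Return an image media type based on leading magic bytes, else None."""
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    # WebP is a RIFF container: 'RIFF', 4 size bytes, then 'WEBP'.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None


def read_result(data: bytes) -> list[dict]:
    """Build content parts: text for ordinary files, text + image for images."""
    media_type = sniff_image_type(data)
    if media_type is None:
        return [{"type": "text", "text": data.decode("utf-8", errors="replace")}]
    return [
        {"type": "text", "text": f"[{media_type}, {len(data)} bytes]"},
        {
            "type": "image",  # hypothetical part shape
            "mediaType": media_type,
            "data": base64.b64encode(data).decode("ascii"),
        },
    ]
```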

Co-authored-by: anthropic/claude-opus-4.6, via tau
The hand-rolled follow_scroll / at_bottom approach had a stale-state bug:
after an async gap (e.g. between stream end and error handler), Textual's
layout would update and at_bottom would become False even though the user
never scrolled.

Textual's built-in Widget.anchor() handles all of this correctly:
- New content auto-scrolls to the bottom
- User scrolls up → anchor releases, no forced scrolling
- User scrolls back to bottom → anchor restores

Removed: follow_scroll(), at_bottom property, auto_scroll parameter,
         _following flag, and all manual scroll_end calls.
Added: single transcript.anchor() call at mount time.

Co-authored-by: anthropic/claude-opus-4.6, via tau
Co-authored-by: anthropic/claude-opus-4.6, via tau
Switch from Textual's default dark theme (which forces #121212
background and #1e1e1e surface) to ansi-dark, which uses the
terminal's own background color. Removes explicit background
overrides from composer and hook prompt widgets.

Co-authored-by: anthropic/claude-opus-4.6, via tau
msullivan added 8 commits May 22, 2026 11:05
Tools can return [str, FilePart] (e.g. image reads) and all three
providers (gateway, anthropic, openai) now convert them to proper
image/document content blocks instead of stringifying them.

Provider changes:
- Gateway: _has_file_parts(), _file_part_to_v3_inline(),
  _multipart_to_v3_content(), _tool_result_output() route between
  json, content, and error-text output types
- Anthropic: _tool_result_to_anthropic() expands FileParts into
  image/document blocks via existing _file_part_to_anthropic()
- OpenAI: _tool_result_to_openai() converts images to image_url
  with data URLs for chat-completions

FilePart serialization fixes:
- Replace ser_json_bytes/val_json_bytes with a field_serializer that
  uses standard base-64 (+, /) instead of pydantic's URL-safe
  encoding (-, _) which LLM APIs reject
- Add model_validator on ToolResultPart that rehydrates FilePart
  dicts back into FilePart instances after JSON round-trip (the
  result: Any field loses type info during deserialization)
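
The alphabet difference behind the serialization fix is easy to see with the standard library alone. This sketch uses only `base64` and does not reproduce the pydantic serializer itself.

```python
import base64

# Two bytes chosen so the encoding hits the last two alphabet positions.
raw = b"\xfb\xff"

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")

print(standard)  # +/8=
print(url_safe)  # -_8=

# Both forms decode back to the same bytes with their matching decoder.
assert base64.b64decode(standard) == raw
assert base64.urlsafe_b64decode(url_safe) == raw
```

Only the characters for values 62 and 63 differ, which is why payloads without those values round-trip under either encoding and the mismatch can go unnoticed.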

Co-authored-by: anthropic/claude-opus-4.6, via tau
Ctrl+J inserts a newline (works on all terminals). Trailing
backslash before Enter also inserts a newline. Shift/Alt+Enter
work on terminals with Kitty keyboard protocol support.

Co-authored-by: anthropic/claude-opus-4.6, via tau
Detect the underlying provider from MODEL_ID and add the
appropriate server-side web search tool:
- anthropic -> anthropic.tools.web_search()
- openai/xai -> openai.tools.web_search()

Also adds Ctrl+J and backslash newline shortcuts to the
composer, and fixes a docstring escape warning.
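
The provider routing above might look roughly like this. The sketch assumes model IDs of the form `gateway:provider/model` (the `gateway:` prefix is mentioned elsewhere in this PR; the rest of the shape is inferred from the commit trailers), and it returns the name of the tool family rather than calling the real `web_search()` factories.

```python
from typing import Optional

# Which provider's server-side web search tool to use, per the commit message.
WEB_SEARCH_FAMILY = {
    "anthropic": "anthropic",
    "openai": "openai",
    "xai": "openai",
}


def provider_of(model_id: str) -> str:
    """Extract the provider from an ID assumed to look like 'gateway:provider/model'."""
    raw = model_id.removeprefix("gateway:")
    provider, _, _ = raw.partition("/")
    return provider


def web_search_family(model_id: str) -> Optional[str]:
    """Return the tool family to use, or None when no web search tool applies."""
    return WEB_SEARCH_FAMILY.get(provider_of(model_id))
```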

Co-authored-by: anthropic/claude-opus-4.6, via tau
The model wasn't using web_search because the system prompt
explicitly listed only the coding tools. Now it includes a
mention of web_search when provider tools are configured.

Co-authored-by: anthropic/claude-opus-4.6, via tau
Display BuiltinToolEnd and BuiltinToolResult events (from
provider-executed tools like web_search) in the transcript
alongside regular tool call/result output.

Also updates system prompt on session resume so new tools
and AGENTS.md changes take effect without a fresh session.

Co-authored-by: anthropic/claude-opus-4.6, via tau
MODEL_ID includes the 'gateway:' prefix, which shouldn't appear in
commit trailers. Use _raw_model instead to get the user-provided value.

Co-authored-by: anthropic/claude-opus-4.6, via tau
- Import detect_image_media_type from ai.types.media instead of
  ai.messages.media (not an explicit export).
- Rename shadowed variable to avoid ToolCallPart/BuiltinToolCallPart
  type conflict.
- Guard against None tool_call_id in PartialToolCallResult.
- Add mypy to dev dependencies.

Co-authored-by: anthropic/claude-opus-4.6, via tau