fix: harden daemon engine stability #4

Closed
zensh wants to merge 3 commits into main from claude/angry-black-aa524e
zensh commented Jun 11, 2026

What changed

Stability review of anda_bot/src/engine.rs, fixing four issues plus one simplification:

  • Atomic config writes: PUT /daemon/config previously overwrote config.toml in place with tokio::fs::write, which truncates before writing. A crash or power loss mid-write would corrupt the config and prevent the daemon (and the anda CLI, which parses the config on load) from starting. Writes now go through a same-directory temp file with sync_all followed by an atomic rename, with temp-file cleanup on failure.
  • Serialized config updates: concurrent PUT /daemon/config requests could interleave the backup check, backup copy, and file write. Updates are now serialized with a tokio::sync::Mutex in the route state. Reads need no lock since rename is atomic.
  • Non-fatal skill loading: skills_tool.load().await? aborted daemon startup on any skill-directory scan error. The scanned dirs include the shared, user-writable ~/.agents/skills, so a permission problem there could brick startup entirely. Failures are now logged and the daemon starts without skills.
  • Status errors surface their cause: /daemon/status failures (which depend on the brain service via brain_status()) returned a fixed "Failed to get status" with no logging. The handler now logs a warning and includes the error detail in the 500 response.
  • Simplification: removed verify_update_request, an empty pass-through wrapper around verify_authenticated_request.
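
The temp-file + sync_all + rename sequence from the first bullet can be sketched with the standard library alone. This is a synchronous analog, not the helper from this PR (which uses tokio::fs); the names write_atomic, write_then_rename, and temp_path_for, and the ".tmp.<pid>" suffix, are assumptions made for illustration.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical helper: builds a temp path in the same directory as `path`,
/// so the final rename never crosses a filesystem boundary.
fn temp_path_for(path: &Path) -> PathBuf {
    let mut name = path.file_name().unwrap_or_default().to_os_string();
    name.push(format!(".tmp.{}", std::process::id()));
    path.with_file_name(name)
}

/// Writes the full content to the temp file, flushes it to disk, then renames
/// it over the target.
fn write_then_rename(tmp_path: &Path, path: &Path, content: &[u8]) -> io::Result<()> {
    let mut file = File::create(tmp_path)?;
    file.write_all(content)?;
    // Flush data and metadata before the rename makes the new file visible.
    file.sync_all()?;
    drop(file);
    fs::rename(tmp_path, path)
}

/// Replaces (or creates) `path` so that readers see either the old content or
/// the new content, never a truncated file.
fn write_atomic(path: &Path, content: &[u8]) -> io::Result<()> {
    let tmp_path = temp_path_for(path);
    let result = write_then_rename(&tmp_path, path, content);
    if result.is_err() {
        // Best-effort cleanup; the original error is what the caller needs.
        let _ = fs::remove_file(&tmp_path);
    }
    result
}
```

This sketch does not fsync the parent directory after the rename, so it says nothing about whether the rename itself survives a power loss on every filesystem.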

Reviewed and intentionally unchanged

  • /daemon/status stays unauthenticated: it is the launcher's readiness probe (250ms polling), returns only three counters, and the gateway binds to 127.0.0.1:8042 by default.
  • The brain CWT token is signed once at startup without an exp claim; cwt_from only validates expiry when the claim is present, so long-running daemons cannot hit token expiry.
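
The "validate expiry only when the claim is present" rule can be illustrated with a small sketch. The Claims struct and is_expired function below are hypothetical stand-ins; the real cwt_from implementation is not shown in this PR and may differ.

```rust
/// Hypothetical claims type; only the optional expiry claim matters here.
struct Claims {
    /// Expiry as seconds since the Unix epoch, if the token carries one.
    exp: Option<u64>,
}

/// Returns true only when an `exp` claim exists and has been reached.
/// A token without `exp` is never treated as expired.
fn is_expired(claims: &Claims, now_secs: u64) -> bool {
    match claims.exp {
        Some(exp) => now_secs >= exp,
        None => false,
    }
}
```

Under this rule a token signed once at startup without exp stays valid for the lifetime of the daemon, which is the behavior described above.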

Testing

  • cargo test -p anda_bot --bin anda engine — 113 passed, including a new test covering the atomic write helper (replace + create paths, no leftover temp files).
  • cargo clippy and cargo fmt --check clean.

🤖 Generated with Claude Code

zensh and others added 3 commits June 11, 2026 09:34
Refine desktop launcher update handling so downloaded updates are
recorded in the menu instead of prompting from automatic polling.
Clear the pending downloaded-update state after a successful restart
handoff and update the menu label to describe the explicit action.

Co-Authored-By: Anda Bot <noreply@anda.bot>
…est progress

When a background shell task or subagent completes, the final output
prompt now checks against the latest intermediate progress content. If
identical, it sends a short completion notice instead of repeating the
full payload, reducing context window noise.

Extract helper functions for both subagent and shell background output
formatting, and maintain a per-session progress output cache that is
cleaned up on task start and completion.

Includes 4 unit tests covering dedup and non-dedup paths for both
subagent and shell background completion.

Co-Authored-By: Anda Bot <noreply@anda.bot>
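
The dedup rule in this commit can be sketched as a small cache. This is an illustration under assumptions, not the code in anda_bot/src/engine/agent.rs: the ProgressCache type, its method names, the session-id string key, and the wording of the completion notice are all hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical cache of the latest progress output, keyed by session id.
#[derive(Default)]
struct ProgressCache {
    latest: HashMap<String, String>,
}

impl ProgressCache {
    /// Clears stale progress when a task starts.
    fn on_start(&mut self, session: &str) {
        self.latest.remove(session);
    }

    /// Remembers the most recent intermediate output.
    fn on_progress(&mut self, session: &str, output: &str) {
        self.latest.insert(session.to_string(), output.to_string());
    }

    /// Returns the text to send on completion and drops the cache entry.
    /// If the final output equals the latest progress, a short notice is
    /// returned instead of repeating the full payload.
    fn on_complete(&mut self, session: &str, final_output: &str) -> String {
        let previous = self.latest.remove(session);
        if previous.as_deref() == Some(final_output) {
            "Task completed; output is unchanged since the last progress update.".to_string()
        } else {
            final_output.to_string()
        }
    }
}
```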
Write daemon config updates atomically via temp file + fsync + rename
and serialize concurrent updates with a mutex, so a crash mid-save can
no longer leave a truncated config that blocks the next daemon start.
Make skill directory scan failures non-fatal at startup, log and return
the underlying error from /daemon/status, and drop the redundant
verify_update_request wrapper.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
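
Serializing the backup check, backup copy, and write behind one lock can be sketched as follows. This is a synchronous analog using std::sync::Mutex rather than the tokio::sync::Mutex named in the PR; ConfigState, its fields, and the ".toml.bak" backup name are assumptions, and fs::write stands in for the atomic write helper.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::sync::Mutex;

/// Hypothetical route state holding the config path and an update lock.
struct ConfigState {
    path: PathBuf,
    update_lock: Mutex<()>,
}

impl ConfigState {
    fn new(path: PathBuf) -> Self {
        Self { path, update_lock: Mutex::new(()) }
    }

    /// Runs the check-backup, copy-backup, and write steps under one lock so
    /// two concurrent updates cannot interleave them.
    fn update(&self, content: &str) -> io::Result<()> {
        // The lock guards no data, so a poisoned lock can be reused safely.
        let _guard = self
            .update_lock
            .lock()
            .unwrap_or_else(|poisoned| poisoned.into_inner());

        // Assumed backup policy: keep one copy of the first original file.
        let backup = self.path.with_extension("toml.bak");
        if self.path.exists() && !backup.exists() {
            fs::copy(&self.path, &backup)?;
        }
        // Stand-in for the temp-file + fsync + rename write.
        fs::write(&self.path, content)
    }
}
```

Without the lock, a second update could run its backup check after the first update's write, or copy a file that is being rewritten at that moment.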
Copilot AI review requested due to automatic review settings June 11, 2026 15:17
zensh commented Jun 11, 2026


Superseded: rebasing locally onto main instead.

zensh closed this Jun 11, 2026
zensh deleted the claude/angry-black-aa524e branch June 11, 2026 15:20

Copilot AI left a comment

Pull request overview

This PR focuses on improving daemon stability and launcher update UX in the anda_bot crate, including safer daemon config writes and clearer launcher update behavior.

Changes:

  • Harden daemon config writes by serializing concurrent updates and writing via temp-file + fsync + rename.
  • Make daemon startup more resilient by logging (not failing) on skill-directory scan/load errors.
  • Adjust launcher update flow to only install/restart from explicit user action, with improved menu state handling and updated localized strings.
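
The "log, do not fail" startup behavior for skills can be sketched as a fallible loader wrapped by a non-fatal caller. The function names and the plain directory listing below are assumptions; the real skills_tool.load() is not shown in this PR, and eprintln! stands in for the daemon's logger.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical fallible scan: lists entries in each skill directory and
/// returns the first I/O error it meets.
fn load_skills(dirs: &[PathBuf]) -> io::Result<Vec<PathBuf>> {
    let mut skills = Vec::new();
    for dir in dirs {
        for entry in fs::read_dir(dir)? {
            skills.push(entry?.path());
        }
    }
    Ok(skills)
}

/// Non-fatal wrapper: on error, log a warning and start with no skills
/// instead of propagating the error and aborting startup.
fn load_skills_non_fatal(dirs: &[PathBuf]) -> Vec<PathBuf> {
    match load_skills(dirs) {
        Ok(skills) => skills,
        Err(err) => {
            eprintln!("warning: skill loading failed, starting without skills: {}", err);
            Vec::new()
        }
    }
}
```

The design trade-off is that a broken skill directory now degrades functionality silently unless someone reads the log, in exchange for a daemon that always starts.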

Reviewed changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| scripts/publish-homebrew.sh | Updates Homebrew formula test to assert the macOS launcher binary exists. |
| CHANGELOG.md | Adds 0.9.7 release notes describing daemon stability and launcher update UX changes. |
| Cargo.lock | Bumps anda_bot package version to 0.9.7. |
| anda_bot/Cargo.toml | Bumps crate version to 0.9.7. |
| anda_bot/src/engine/agent.rs | Avoids duplicating final background outputs when they match the latest progress output; adds tests. |
| anda_bot/src/engine.rs | Adds serialized + atomic config writes, makes skill loading non-fatal, improves /daemon/status error reporting. |
| anda_bot/src/bin/anda_launcher/windows.rs | Removes auto-prompting on downloaded updates; aligns restart-success UI state handling. |
| anda_bot/src/bin/anda_launcher/macos.rs | Same update-flow adjustments as Windows launcher. |
| anda_bot/src/bin/anda_launcher/core.rs | Adds finish_update_restart_success to clear “downloaded update” menu state after successful install/restart. |
| anda_bot/locales/app.yml | Updates EN/ZH menu text to “Install and restart …”. |

Comment thread anda_bot/src/engine.rs
Comment on lines +586 to +590
let mut file = tokio::fs::File::create(&tmp_path).await?;
file.write_all(content).await?;
file.sync_all().await?;
drop(file);
tokio::fs::rename(&tmp_path, path).await