feat: add agent catalog/auth API and safer orchestrator switching#2309

Open
nikhilachale wants to merge 27 commits into
AgentWrapper:mainfrom
nikhilachale:agent-switcher-updated
Conversation

@nikhilachale nikhilachale commented Jun 30, 2026

Collaborator

Summary
This PR adds a daemon-backed agent catalog, exposes installed/authorized agent state to the frontend, and uses that data in project settings so users can choose worker/orchestrator agents more safely.

It also adds orchestrator replacement handling: when the saved orchestrator agent changes, AO starts the replacement first and only retires the previous orchestrator after the new one is up, so a failed replacement does not cause downtime.
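The spawn-first ordering described above can be sketched roughly as follows. This is a minimal illustration, not the session service code: `Orchestrator`, `spawnFunc`, `replaceOrchestrator`, `okSpawn`, and `failingSpawn` are hypothetical names, and the sketch assumes a previous orchestrator always exists.

```go
package main

import (
	"errors"
	"fmt"
)

// Orchestrator is a hypothetical handle for a running orchestrator session.
type Orchestrator struct {
	Agent   string
	Running bool
}

// spawnFunc is a hypothetical stand-in for whatever starts an orchestrator.
type spawnFunc func(agent string) (*Orchestrator, error)

// replaceOrchestrator starts the replacement first and retires the previous
// orchestrator only once the replacement is up. If the spawn fails, the
// previous orchestrator is returned untouched. Assumes prev is non-nil.
func replaceOrchestrator(prev *Orchestrator, agent string, spawn spawnFunc) (*Orchestrator, error) {
	next, err := spawn(agent)
	if err != nil {
		return prev, fmt.Errorf("replacement %q failed, keeping %q: %w", agent, prev.Agent, err)
	}
	prev.Running = false // retire only after the replacement is running
	return next, nil
}

// okSpawn and failingSpawn are demo spawners for the two outcomes.
func okSpawn(agent string) (*Orchestrator, error) {
	return &Orchestrator{Agent: agent, Running: true}, nil
}

func failingSpawn(string) (*Orchestrator, error) {
	return nil, errors.New("startup failed")
}

func main() {
	prev := &Orchestrator{Agent: "old-agent", Running: true}

	// Failed replacement: the previous orchestrator stays active.
	active, err := replaceOrchestrator(prev, "new-agent", failingSpawn)
	fmt.Println(active.Agent, active.Running, err != nil) // old-agent true true

	// Successful replacement: the previous orchestrator is retired afterwards.
	active, err = replaceOrchestrator(prev, "new-agent", okSpawn)
	fmt.Println(active.Agent, prev.Running, err) // new-agent false <nil>
}
```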

A key caveat is that agent-auth/login flows can interfere with replacement startup. If switching agents triggers the agent’s own bootstrap path, the replacement may come up outside AO’s normal orchestrator initialization path and miss the AO orchestrator system prompt.

What Changed
Backend
Added agent inventory service for:

- supported agents
- installed agents
- authorized agents
- counts for each

Added optional AgentAuthChecker capability on adapters.

Added shared CLI auth probing helper for adapters with cheap local auth checks.
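For readers unfamiliar with optional capabilities in Go, the usual pattern is a type assertion at the call site. The sketch below is illustrative only: the `AuthStatus` signature, the `Adapter` interface, and both adapter types are assumptions, not the PR's actual definitions.

```go
package main

import (
	"context"
	"fmt"
)

// Adapter is a hypothetical minimal adapter interface.
type Adapter interface {
	ID() string
}

// AgentAuthChecker models the optional capability. The method signature
// here is an assumption made for this sketch.
type AgentAuthChecker interface {
	AuthStatus(ctx context.Context) (string, error)
}

// probedAdapter opts in to auth checking (hypothetical).
type probedAdapter struct{}

func (probedAdapter) ID() string { return "probed" }
func (probedAdapter) AuthStatus(context.Context) (string, error) {
	return "authorized", nil
}

// plainAdapter does not implement the capability (hypothetical).
type plainAdapter struct{}

func (plainAdapter) ID() string { return "plain" }

// authStatusFor probes only adapters that opt in. Adapters without the
// capability, or whose probe errors, are reported as "unknown".
func authStatusFor(ctx context.Context, a Adapter) string {
	checker, ok := a.(AgentAuthChecker)
	if !ok {
		return "unknown"
	}
	status, err := checker.AuthStatus(ctx)
	if err != nil {
		return "unknown"
	}
	return status
}

// statusOf is a convenience wrapper using a background context.
func statusOf(a Adapter) string {
	return authStatusFor(context.Background(), a)
}

func main() {
	for _, a := range []Adapter{probedAdapter{}, plainAdapter{}} {
		fmt.Printf("%s: %s\n", a.ID(), statusOf(a))
	}
}
```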

Added GET /api/v1/agents.

Extended registry inventory entries to carry adapter manifest metadata for user-facing labels.

Added orchestrator replacement flow in the session service:

- spawn replacement first
- retire previous orchestrator only after successful replacement
- preserve previous orchestrator when replacement startup fails

Added backend tests for agent catalog, controller responses, session replacement behavior, and related project/service wiring.

Frontend
Regenerated API types for the new agents endpoint/DTOs.

Updated ProjectSettingsForm to:

- load agent catalog from the daemon
- show authorized agent options
- handle installed-but-not-authorized states
- surface orchestrator replacement pending state
- allow retry once replacement is safe to perform

Added/updated tests for the new settings behavior.

See aoagents/ReverbCode#276 for an extended version of this PR.

closes #2310

nikhilachale and others added 15 commits June 30, 2026 12:53
- Implemented AgentsController to handle /agents endpoint, returning a list of supported and installed agents.
- Created agent inventory service to manage agent data and detect installed agents.
- Updated ProjectSettingsForm to fetch and display agent information, including installed and supported agents.
- Enhanced error handling for agent detection and orchestrator restarts.
- Added tests for agent catalog and service to ensure correct functionality and error handling.
…flect changes

- Added `AuthStatus` method to various agent plugins to check authorization status using CLI probes.
- Introduced `authprobe` package to handle common CLI command checks for agent authorization.
- Updated backend tests to include scenarios for authorized and unauthorized agents.
- Modified frontend API schema to include `authorized` counts and `authStatus` for agents.
- Enhanced `ProjectSettingsForm` to display authorized agents and their statuses, including prompts for login when necessary.
- Adjusted agent selection logic to prioritize authorized agents and provide feedback for unauthorized or uninstalled agents.
- Updated NewTaskDialog tests to increase timeout for async operations.
- Modified ProjectSettingsForm tests to improve agent handling and validation messages.
- Refactored ProjectSettingsForm component to streamline agent selection and validation logic.
- Introduced new agent service to manage agent inventory and authentication status.
- Improved Sidebar tests to ensure proper agent options are loaded and handled.
- Enhanced SessionsBoard component by removing unused imports and optimizing state management.
- Fixed Select component styling for better consistency in UI.
- Added error handling for AO daemon readiness in ShellLayout.
- Implement local authentication status checks for PI, Qwen, and Vibe agents.
- Introduce JSON-based authentication status retrieval for PI agents.
- Add environment variable checks for Qwen agents and improve settings file parsing.
- Enhance Vibe agent authentication with support for environment variables and session logs.
- Update agent service to handle asynchronous probing for installed and authorized agents.
- Modify session manager to support prompt delivery strategies based on agent capabilities.
- Improve frontend agent selection UI with loading states and error handling.
- Add tests for new authentication logic and session management features.
- Implement local authentication status checks for the Devin, Droid, Kiro, and other agents.
- Add support for reading credentials from specific configuration files and environment variables.
- Introduce new tests for various agents to ensure proper authentication status reporting.
- Refactor existing authentication logic to improve clarity and maintainability.
- Remove deprecated agent setup warnings from the SessionsBoard component in the frontend.
@nikhilachale

Collaborator Author

@neversettle17-101 @illegalcall please review this

nikhilachale and others added 7 commits July 1, 2026 10:25
- Deleted the RetireOrchestrator function and its associated error handling.
- Removed tests related to orchestrator retirement and state management.
- Simplified ProjectSettingsForm by eliminating orchestrator restart logic and related UI elements.
- Updated API client mocks to reflect the removal of orchestrator-related functionality.
- Added agent refresh functionality in ProjectSettingsForm with UI updates for agent availability.
- Implemented `refreshAgents` API call to fetch the latest agent catalog.
- Updated agent selection logic to disable unavailable agents and show appropriate error messages.
- Enhanced error handling in `apiErrorMessage` to include daemon error codes alongside messages.
- Created new test cases for agent availability and error handling in Sidebar component.
- Introduced `ResolveBinary` method for multiple agent adapters to standardize binary resolution.
- Added new agent adapter files for various agents (e.g., Aider, Claude Code, etc.) to support binary resolution.
@nikhilachale

Collaborator Author

Agent adapters gained binary-resolution behavior across Agy, Aider, Amp, Auggie, AutoHand, Claude Code, Cline, Codex, Continue, Copilot, Crush, Cursor, Devin, Droid, Goose, Grok, Kilocode, Kimi, Kiro, OpenCode, Pi, Qwen, and Vibe.
The agent service and ports were extended to support refreshed catalog status and availability checks.
HTTP agent controllers and DTOs were updated, with OpenAPI generation kept in sync.
Added refresh support for agent catalog data, exposed refresh in the API contract, and disabled unavailable agents in selection flows.
@illegalcall illegalcall left a comment

Contributor

I found a few issues that should be fixed before merging:

  1. P1: Preserve the no-resurrection marker cleanup

session_worktrees is used as the shutdown-restore marker. The intended flow is: shutdown writes the marker, next boot restores the session, then the marker is consumed/deleted.

In this PR, Kill marks a session terminated but no longer clears that stale marker. Example: AO shutdown saves proj-1, next boot restores it, the user later explicitly kills proj-1, and the next daemon boot sees the old marker and relaunches the killed session. Explicit kill intent must win.

Please restore marker cleanup on Kill and after RestoreAll consumes a marker, and keep the no-resurrection tests.
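A rough in-memory model of the intended marker lifecycle, to make the two cleanup points concrete. `Store` and its methods are hypothetical stand-ins for the session manager and the session_worktrees table, not the real code.

```go
package main

import (
	"fmt"
	"sort"
)

// Store is a hypothetical in-memory stand-in for session state plus the
// session_worktrees restore markers.
type Store struct {
	terminated map[string]bool
	markers    map[string]bool // session ID -> restore marker present
}

func NewStore() *Store {
	return &Store{terminated: map[string]bool{}, markers: map[string]bool{}}
}

// Shutdown writes a restore marker for each live session.
func (s *Store) Shutdown(ids ...string) {
	for _, id := range ids {
		s.markers[id] = true
	}
}

// Kill records explicit kill intent and clears any stale marker so the
// session cannot be resurrected on the next boot.
func (s *Store) Kill(id string) {
	s.terminated[id] = true
	delete(s.markers, id)
}

// RestoreAll returns the sessions to relaunch and consumes their markers.
func (s *Store) RestoreAll() []string {
	ids := make([]string, 0, len(s.markers))
	for id := range s.markers {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	for _, id := range ids {
		delete(s.markers, id) // a marker is single-use
	}
	return ids
}

func main() {
	s := NewStore()

	s.Shutdown("proj-1")
	fmt.Println(s.RestoreAll()) // [proj-1]
	fmt.Println(s.RestoreAll()) // [] (marker already consumed)

	s.Shutdown("proj-1")
	s.Kill("proj-1")
	fmt.Println(s.RestoreAll()) // [] (explicit kill wins)
}
```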

  2. P2: Return arrays, not null, for the initial agent cache

On a fresh daemon before POST /agents/refresh, Installed and Authorized are nil slices. JSON encodes those as null, but the OpenAPI/generated TS contract says these fields are arrays.

Example response today can be:

{ "supported": [...], "installed": null, "authorized": null }

Clients should receive:

{ "supported": [...], "installed": [], "authorized": [] }

Please initialize/clone nil slices as empty slices so first-load clients always get arrays.
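The null-versus-array difference comes from how encoding/json treats nil slices. A small reproduction follows, with a hypothetical `Inventory` struct mirroring the fields above and a hypothetical `nonNil` helper; the real types in the agent service may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Inventory is a hypothetical shape mirroring the response fields above.
type Inventory struct {
	Supported  []string `json:"supported"`
	Installed  []string `json:"installed"`
	Authorized []string `json:"authorized"`
}

// nonNil returns a copy that is never nil, so it encodes as [] rather
// than null even when the input is a nil slice.
func nonNil(in []string) []string {
	out := make([]string, len(in))
	copy(out, in)
	return out
}

// normalize clones every slice field through nonNil.
func normalize(inv Inventory) Inventory {
	return Inventory{
		Supported:  nonNil(inv.Supported),
		Installed:  nonNil(inv.Installed),
		Authorized: nonNil(inv.Authorized),
	}
}

// encode marshals the inventory, returning "" on error.
func encode(inv Inventory) string {
	b, err := json.Marshal(inv)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fresh := Inventory{Supported: []string{"codex"}}
	fmt.Println(encode(fresh))
	// {"supported":["codex"],"installed":null,"authorized":null}
	fmt.Println(encode(normalize(fresh)))
	// {"supported":["codex"],"installed":[],"authorized":[]}
}
```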

  3. P2: Make OpenCode binary resolution fail when absent

/agents/refresh treats a successful ResolveBinary as installed. OpenCode currently falls back to the bare string "opencode", so a machine without OpenCode can still report it as installed. Later, spawn does the real binary check and fails.

For catalog/install detection, ResolveBinary should only succeed when a real executable path is found; otherwise return ports.ErrAgentBinaryNotFound.
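One way to get the suggested behavior is to rely on exec.LookPath and never fall back to a bare name. In this sketch `ErrAgentBinaryNotFound` is a local stand-in for ports.ErrAgentBinaryNotFound, and `resolveBinary` and `isInstalled` are hypothetical helpers rather than the adapter's real functions.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// ErrAgentBinaryNotFound is a local stand-in for the sentinel named in
// the review (ports.ErrAgentBinaryNotFound).
var ErrAgentBinaryNotFound = errors.New("agent binary not found")

// resolveBinary returns the first candidate that resolves to a real
// executable on PATH. It never falls back to a bare command name.
func resolveBinary(names ...string) (string, error) {
	for _, name := range names {
		if path, err := exec.LookPath(name); err == nil {
			return path, nil
		}
	}
	return "", ErrAgentBinaryNotFound
}

// isInstalled is what a catalog refresh could use for install detection.
func isInstalled(names ...string) bool {
	_, err := resolveBinary(names...)
	return err == nil
}

func main() {
	path, err := resolveBinary("opencode")
	if errors.Is(err, ErrAgentBinaryNotFound) {
		fmt.Println("opencode: not installed")
		return
	}
	fmt.Println("opencode:", path)
}
```

With this shape, a machine without the binary reports the agent as not installed at refresh time instead of failing later at spawn.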

  4. P2: Avoid sending a real Continue prompt during auth refresh

Refreshing the catalog should be local/status-only. The Continue auth fallback runs cn -p hi, which is a prompt execution path, not an auth status check.

Example: opening project settings and refreshing agents could submit an actual model prompt, create local session/history, consume quota, or hang on provider/login state. If Continue has no safe local auth signal, return unknown rather than running a live prompt.

@nikhilachale nikhilachale requested a review from illegalcall July 1, 2026 22:23
@nikhilachale

Collaborator Author

- session_manager
  - Kill now deletes stale session_worktrees restore markers so explicitly killed sessions do not resurrect on the next boot.
  - RestoreAll now deletes the marker after a successful restore.
  - Added tests for both cleanup paths.
- service/agent
  - Initial agent inventory now returns empty arrays for installed and authorized, not null.
  - cloneInventory now preserves empty slices with make + copy.
  - Added test assertions for non-nil empty slices.
- opencode
  - ResolveOpenCodeBinary no longer falls back to "opencode" when absent.
  - It now returns ports.ErrAgentBinaryNotFound if no real executable path is found.
  - Added tests for missing binary behavior and canceled context.
- continueagent
  - Removed refresh-time auth probing.
  - Continue no longer implements ports.AgentAuthChecker, so catalog refresh will mark it installed if cn exists but leave authStatus as unknown.
  - Removed the cn -p hi probe and related tests.
  - Added a regression test ensuring Continue does not implement AgentAuthChecker.

Are these changes good, or is something else needed? @illegalcall

@illegalcall illegalcall left a comment

Contributor

Remaining issues on latest head abbd29c:

  1. P2: unknown auth agents are still unselectable

Catalog auth is advisory, and Continue now intentionally returns authStatus: "unknown" because there is no safe local auth probe. But RequiredAgentField still disables every agent not present in authorized (CreateProjectAgentSheet.tsx builds authorizedIds and sets disabled: !isAuthorized).

Example: Continue is installed, /agents/refresh returns it in installed with authStatus: "unknown", but not in authorized; the UI shows “Needs auth” and disables it. The user can never select Continue from the guided create/settings/task flows even though spawn is supposed to be the authoritative validation point. Installed + unknown should be selectable, probably with a warning/status label; spawn can surface the real auth failure.

Refs: backend/internal/service/agent/service.go, frontend/src/renderer/components/CreateProjectAgentSheet.tsx
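The selectability rule suggested here can be written as a small decision function. It is sketched in Go only to stay consistent with the backend; the real component is TypeScript. `Option` and `optionFor` are hypothetical, and the "Ready", "Not installed", and "Auth unknown" labels are placeholders ("Needs auth" is the label the UI shows today).

```go
package main

import "fmt"

// Option is a hypothetical dropdown entry for one agent.
type Option struct {
	Label    string
	Disabled bool
}

// optionFor decides whether an agent is selectable. Catalog auth is
// advisory, so only a missing install or an explicit non-authorized
// status disables the entry; "unknown" stays selectable with a warning.
func optionFor(installed bool, authStatus string) Option {
	switch {
	case !installed:
		return Option{Label: "Not installed", Disabled: true}
	case authStatus == "authorized":
		return Option{Label: "Ready", Disabled: false}
	case authStatus == "unknown":
		return Option{Label: "Auth unknown", Disabled: false}
	default:
		return Option{Label: "Needs auth", Disabled: true}
	}
}

func main() {
	fmt.Println(optionFor(true, "authorized"))   // {Ready false}
	fmt.Println(optionFor(true, "unknown"))      // {Auth unknown false}
	fmt.Println(optionFor(true, "unauthorized")) // {Needs auth true}
	fmt.Println(optionFor(false, "unknown"))     // {Not installed true}
}
```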

  2. P2: Fresh daemon still starts with an empty usable catalog

NewWithAgents initializes installed and authorized as empty arrays, and _shell preloads only GET /agents through ensureQueryData(agentsQueryOptions). Since GET /agents does not run probes, a fresh daemon presents installed=[] and authorized=[] until the user manually clicks Refresh.

Example: user chooses a project folder, the agent sheet opens, both dropdowns are required, but every supported agent is disabled because the cached catalog has no authorized entries yet. First load should trigger refresh, or the UI should distinguish “not refreshed yet” from “known unavailable.”

Refs: backend/internal/service/agent/service.go, frontend/src/renderer/routes/_shell.tsx, frontend/src/renderer/components/CreateProjectAgentSheet.tsx

  3. P2: Auth text parser can mark explicit false status as authorized

StatusFromText only has false patterns for loggedin. Outputs like { "authenticated": false } or { "authorized": false } miss the negative block, then match the positive substrings authenticated / authorized and return authorized.

That can put an unauthenticated agent into authorized, making the UI show it as ready until spawn fails later. Please handle structured/common false keys before broad positive substring matches, or parse JSON-ish status output more carefully.

Ref: backend/internal/adapters/agent/authprobe/authprobe.go
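A sketch of the ordering being asked for: explicit negatives are checked before any broad positive substring. `statusFromText`, the regular expression, and both phrase lists are illustrative assumptions, not the contents of authprobe.go.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// negativeKey matches explicit false values such as
// "authenticated": false, authorized=no, or logged_in=false.
var negativeKey = regexp.MustCompile(
	`(authenticated|authorized|logged[_ ]?in)"?\s*[:=]\s*"?(false|no|0)\b`)

// negativePhrases contain a positive keyword as a substring, so they
// must be ruled out before the positive scan.
var negativePhrases = []string{
	"unauthenticated", "unauthorized",
	"not authenticated", "not authorized", "not logged in",
}

var positivePhrases = []string{
	"authenticated", "authorized", "logged in", "logged_in", "loggedin",
}

// statusFromText classifies CLI output as "authorized", "unauthorized",
// or "unknown", checking negatives first.
func statusFromText(out string) string {
	text := strings.ToLower(out)
	if negativeKey.MatchString(text) {
		return "unauthorized"
	}
	for _, p := range negativePhrases {
		if strings.Contains(text, p) {
			return "unauthorized"
		}
	}
	for _, p := range positivePhrases {
		if strings.Contains(text, p) {
			return "authorized"
		}
	}
	return "unknown"
}

func main() {
	fmt.Println(statusFromText(`{ "authenticated": false }`)) // unauthorized
	fmt.Println(statusFromText(`{ "authenticated": true }`))  // authorized
	fmt.Println(statusFromText("logged_in=false"))            // unauthorized
	fmt.Println(statusFromText(""))                           // unknown
}
```

Parsing the output as JSON when it is valid JSON would be stricter still; the substring approach is kept here only because it mirrors the existing parser's style.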

  4. P3: Docs links are broken

The PR deletes docs/agent/README.md, but README.md and docs/README.md still link to it. docs/README.md also links agent/switching.md, but no docs/agent/* files exist in this branch.

Refs: README.md, docs/README.md

nikhilachale and others added 2 commits July 3, 2026 02:12
…puts and add tests for explicit false/true keys

chore(docs): update README to remove outdated agent adapter contract references
fix(components): improve agent selection logic to handle unknown auth status and update related tests
@nikhilachale

Collaborator Author
- Unknown installed agents: fixed in frontend/src/renderer/components/CreateProjectAgentSheet.tsx
  - Installed agents with authStatus: "unknown" are selectable.
  - They show Auth unknown with a warning icon.
  - Explicit unauthorized still stays disabled as Needs auth.
- Fresh daemon empty catalog: fixed in frontend/src/renderer/routes/_shell.tsx
  - First ready daemon load now uses refreshAgents under the existing agents query key, so /agents/refresh runs instead of only cached GET /agents.
- False auth parser: fixed in backend/internal/adapters/agent/authprobe/authprobe.go
  - Explicit false keys like "authenticated": false, "authorized": false, logged_in=false, etc. are checked before broad positive substring matches.
  - Covered by new tests in authprobe_test.go.
- Broken docs links: fixed in README.md and docs/README.md
  - Removed stale links to missing docs/agent/README.md and docs/agent/switching.md.
Screenshot 2026-07-03 at 02 02 48

Development

Successfully merging this pull request may close these issues.

Project Settings allows switching to unsupported orchestrator agents without preflight install/auth checks

2 participants