Drop-in UI validation for any CLI coding agent — iOS Simulator, Android emulator, and web in one kit.
Your agent installs the right MCPs, picks the right platform tools, drives the app like a real user, and ships an evidence-backed report. No hand-holding.
Works with Claude Code, Codex CLI, Cursor, and any AGENTS.md-compatible agent (Windsurf, Cline, Continue, Aider, Zed).
Agentic UI testing has two best-in-class primitives:
- Web: agent-browser (Vercel Labs), with token-efficient @eN accessibility refs, video recording, and React DevTools
- Mobile + TV + desktop: agent-device (Callstack), with the same @eN refs and one CLI for iOS / Android / tvOS / macOS / Linux
Both ship @eN refs that the kit's skill already speaks. Add Maestro for declarative cross-platform flows (with the brand-new Maestro Viewer putting a live device inside your coding agent), wrap them in a portable skill + QA sub-agent, and you have a complete validation rig.
This kit is the integration: skill + sub-agent + auto-install per harness + platform-aware tool selection.
One install. Your agent figures out the rest.
One-liner (humans):
bash <(curl -fsSL https://raw.githubusercontent.com/Dallionking/ui-validation-kit/main/install.sh)

Auto-detects which agent harness(es) you have (Claude Code, Codex, Cursor) and installs the skill, sub-agent, and MCPs into each. Project-scope by default; --global for user-scope.
Or paste this to your agent:
Read https://github.com/Dallionking/ui-validation-kit and install it. Follow GETTING-STARTED-PROMPT.md.
The agent will clone, detect the platform stack (iOS/Android/web/Expo), install the required MCPs, register the skill + QA sub-agent, and run a smoke validation on the current app.
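The harness auto-detection described above could look roughly like the sketch below. This is an assumption, not the actual logic in install.sh: it treats a harness as present when its project config directory exists or a matching CLI is on PATH. The function name `detect_harnesses` and the CLI names (`claude`, `codex`, `cursor`) are hypothetical.

```shell
# Hypothetical sketch, not the real install.sh logic.
# Prints one harness name per line for each harness that appears present.
detect_harnesses() {
  dir="${1:-.}"
  # Each entry is name:config-dir:cli. The cli names are assumptions.
  for entry in "claude-code:.claude:claude" "codex:.codex:codex" "cursor:.cursor:cursor"; do
    name="${entry%%:*}"
    rest="${entry#*:}"
    cfg="${rest%%:*}"
    cli="${rest#*:}"
    # Present if the project has the config dir, or the CLI is installed.
    if [ -d "$dir/$cfg" ] || command -v "$cli" >/dev/null 2>&1; then
      echo "$name"
    fi
  done
}
```

Calling `detect_harnesses .` from a project root would list every harness the sketch considers installed, one per line.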
npm install -g agent-device # mobile + TV + desktop
npm install -g agent-browser # web
brew install maestro # cross-platform declarative flows

These are CLIs that ANY agent can call. No MCP server registration required for the primary tools.
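Because the primary tools are plain CLIs, an agent (or a human) can check for them with nothing more than `command -v`. A minimal sketch, assuming each tool's binary has the same name as its package; `missing_tools` is a hypothetical helper, not part of the kit:

```shell
# Print each named command that is NOT on PATH, one per line.
# Prints nothing when every tool is available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

Running `missing_tools agent-device agent-browser maestro` prints whichever of the three still needs installing.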
Only needed if you specifically want the lower-level MCPs (the primary CLIs above are usually all you need):
| Platform | One-click install |
|---|---|
| iOS Simulator (fallback) | Install ios-simulator |
| Xcode build (fallback) | Install xcodebuild |
| Mobile unified (fallback) | Install mobile-mcp |
| Web — Playwright | Install playwright |
| Web — Chrome DevTools | Install chrome-devtools |
After installing the MCPs above, copy the rule:
mkdir -p .cursor/rules
curl -fsSL https://raw.githubusercontent.com/Dallionking/ui-validation-kit/main/adapters/cursor/rules/ui-validation.mdc > .cursor/rules/ui-validation.mdc

| Platform | Primary tool the agent installs | Optional / fallback |
|---|---|---|
| iOS / Android / tvOS / desktop | agent-device: one CLI, @eN refs, screenshots/video/logs/network/profile | xcrun simctl, adb, ios-simulator-mcp, XcodeBuildMCP, mobile-mcp |
| Web | agent-browser: @eN refs, video, React DevTools, Web Vitals | Playwright MCP, Chrome DevTools MCP |
| Cross-platform declarative flows | Maestro CLI (one YAML runs everywhere) + Maestro Viewer (CLI 2.6.0+) | none |
| Expo / React Native | agent-device (covers Expo, RN, Flutter, native) | Expo dev menu shortcuts |
| Visual diff (optional) | none | Argos, Lost Pixel |
The installer is platform-aware: only installs what your project needs. Mobile project → agent-device + Maestro. Web-only → agent-browser + Playwright. Both → both.
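The mapping from project type to tools can be written down as a small lookup. This is a sketch of the rule as stated above, not the installer's code; the function name `tools_for_stack` and the stack labels (`mobile`, `web`, `both`) are hypothetical.

```shell
# Hypothetical sketch: map a project type to the tools listed above.
# Prints a space-separated tool list; fails on an unrecognized type.
tools_for_stack() {
  case "$1" in
    mobile) echo "agent-device maestro" ;;
    web)    echo "agent-browser playwright" ;;
    both)   echo "agent-device maestro agent-browser playwright" ;;
    *)      echo "unknown stack: $1" >&2; return 1 ;;
  esac
}
```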
Why this stack? agent-device and agent-browser share the same @eN ref convention, so the skill speaks one language across all platforms.
- Skill: ui-validation skill that loads on UI-related triggers ("validate the app", "click through test", "screenshot the home screen")
- QA Sub-agent: qa-validator agent the orchestrator can dispatch for full validation runs
- Primary CLIs: agent-device (mobile/TV/desktop) + agent-browser (web) + Maestro (declarative flows) installed via npm/brew. No MCP registration required for the primary path.
- Optional MCPs: Playwright + Chrome DevTools MCPs registered for web by default (cross-browser + perf complements to agent-browser). Mobile fallback MCPs opt-in via --include-fallback-mcps.
- Playbooks: Golden path, visual regression, accessibility audit, flaky-debug
- Cost-tiered fix triage: Agent fixes cheap bugs in-flight (CSS, JSX, missing handlers), bails on expensive ones (native rebuilds, cross-file refactors), reports everything
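One way to picture the cost-tiered triage is as a rule over two inputs: how many files the fix touches and whether it forces a native rebuild. The sketch below is purely illustrative. The kit's skill decides this in prose, and both the function name `triage` and the one-file threshold are assumptions.

```shell
# Hypothetical triage rule, for illustration only.
# $1 = number of files the fix would touch
# $2 = "yes" if the fix needs a native rebuild, otherwise "no"
# Prints "fix" (repair in-flight) or "flag" (leave it for the report).
triage() {
  files_touched="$1"
  needs_native_rebuild="$2"
  if [ "$needs_native_rebuild" = "no" ] && [ "$files_touched" -le 1 ]; then
    echo "fix"
  else
    echo "flag"
  fi
}
```

Under this rule a single-file CSS tweak is fixed on the spot, while a cross-file refactor or anything needing a native rebuild is flagged. Either way the issue still appears in the report.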
1. Detect: Agent reads package.json, Package.swift, build.gradle, app.json to identify the stack
2. Install: Pulls the primary CLIs (agent-device, agent-browser, maestro) for detected platforms; optional MCPs only when relevant
3. Drive: Click-through navigation (never typing URLs / deep links) with @eN accessibility-tree refs
4. Evidence: Screenshots + video at every step, saved to recordings/
5. Fix-or-flag: Cost-tiered triage, with cheap fixes made in-flight and expensive ones reported
6. Report: Structured Markdown report with verdict, evidence, remaining concerns
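The Detect step can be approximated by checking for the four marker files named above. The sketch below is an assumption about how those files map to stacks (Package.swift to iOS, build.gradle to Android, app.json to Expo, a bare package.json to web). The agent's real heuristics may look deeper, and `detect_stack` is a hypothetical name.

```shell
# Hypothetical sketch of the Detect step. The file-to-stack mapping is an assumption.
# Prints a space-separated list of detected stacks, or "unknown".
detect_stack() {
  dir="${1:-.}"
  found=""
  if [ -f "$dir/Package.swift" ]; then found="$found ios"; fi
  if [ -f "$dir/build.gradle" ]; then found="$found android"; fi
  # app.json is treated as an Expo marker; package.json alone is treated as web.
  if [ -f "$dir/app.json" ]; then
    found="$found expo"
  elif [ -f "$dir/package.json" ]; then
    found="$found web"
  fi
  # Trim the leading space; fall back to "unknown" when nothing matched.
  found="${found# }"
  echo "${found:-unknown}"
}
```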
A self-contained demo runs the kit against a tiny static-HTML app — no real project needed.
bash demo/record/record-demo.sh

Produces demo/demo.gif (≈30s, ~720KB) showing click-through, modal interaction, settings toggles. See demo/README.md.
.maestro/templates/ ships 3 ready-to-customize flows:
- login.yaml: login form → home
- navigation.yaml: sweep every top-level tab
- settings.yaml: toggles + scroll + sign-out
Customize the appId + selectors and run with maestro test .maestro/login.yaml.
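The templates live in .maestro/templates/ while the run command points at .maestro/login.yaml, so a copy step is implied. A minimal sketch of that step, assuming you want editable copies next to the templates and never want to clobber a flow you already customized; `copy_flow_templates` is a hypothetical helper, not shipped with the kit:

```shell
# Hypothetical helper: copy starter flows from .maestro/templates/ into
# .maestro/ so they can be edited. Existing flows are never overwritten.
copy_flow_templates() {
  root="${1:-.}"
  for src in "$root"/.maestro/templates/*.yaml; do
    # Skip the literal glob pattern when no templates exist.
    [ -f "$src" ] || continue
    dest="$root/.maestro/$(basename "$src")"
    if [ ! -e "$dest" ]; then
      cp "$src" "$dest"
      echo "copied $(basename "$src")"
    fi
  done
}
```

After `copy_flow_templates .`, edit the appId and selectors in the copies, then run `maestro test .maestro/login.yaml` as described above.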
ui-validation-kit/
├── skill/SKILL.md # Canonical skill (Anthropic format)
├── agent/qa-validator.md # Sub-agent definition
├── adapters/ # Per-harness install shims
│ ├── claude-code/
│ ├── codex/
│ ├── cursor/
│ └── generic/ # AGENTS.md for Cline/Continue/Aider/Windsurf/Zed
├── platforms/ # ios.md, android.md, web.md, expo.md
├── mcps/manifest.json # Portable mcpServers JSON
├── playbooks/ # golden-path, visual-regression, a11y, flaky-debug
├── examples/ # ios-swiftui, expo-rn, nextjs-web
├── .maestro/templates/ # Cross-platform Maestro starter flows
└── demo/ # Self-contained sample app + recording script

| Harness | Skill format | Sub-agent | MCP install | Tested |
|---|---|---|---|---|
| Claude Code | .claude/skills/ui-validation/SKILL.md | .claude/agents/qa-validator.md | claude mcp add | ✅ |
| Codex CLI | .agents/skills/ui-validation/SKILL.md | .codex/agents/qa-validator.toml | ~/.codex/config.toml | ✅ |
| Cursor | .cursor/rules/ui-validation.mdc | (rules-only, no sub-agent) | .cursor/mcp.json | ✅ |
| Windsurf / Cline / Continue / Aider / Zed | AGENTS.md | (varies) | per-harness MCP file | |

bash <(curl -fsSL https://raw.githubusercontent.com/Dallionking/ui-validation-kit/main/install.sh) --uninstall

Removes skill files, sub-agent files, and MCP entries from any harness. Leaves your project's UI code alone.
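To preview what an uninstall would touch at project scope, you can list the kit files that currently exist. The sketch below only reads the filesystem and uses the project-scope paths from the harness table; it is not the uninstaller itself, it ignores MCP entries and user-scope installs, and `list_kit_files` is a hypothetical name.

```shell
# Hypothetical dry-run helper: print the kit's project-scope files that exist.
# Paths are taken from the harness table; nothing is deleted.
list_kit_files() {
  root="${1:-.}"
  for rel in \
    .claude/skills/ui-validation/SKILL.md \
    .claude/agents/qa-validator.md \
    .agents/skills/ui-validation/SKILL.md \
    .codex/agents/qa-validator.toml \
    .cursor/rules/ui-validation.mdc
  do
    if [ -e "$root/$rel" ]; then
      echo "$rel"
    fi
  done
}
```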
MIT — see LICENSE. Use it, fork it, ship it.
PRs welcome. New platform adapter? Add a file under platforms/. New playbook? Add to playbooks/. See docs/contributing.md.