2828 - [ Screenshot] ( #screenshot )
2929 - [ Action Recording & Playback] ( #action-recording--playback )
3030 - [ JSON Action Scripting] ( #json-action-scripting )
31+ - [ MCP Server (Use AutoControl from Claude)] ( #mcp-server-use-autocontrol-from-claude )
3132 - [ Scheduler (Interval & Cron)] ( #scheduler-interval--cron )
3233 - [ Global Hotkey Daemon] ( #global-hotkey-daemon )
3334 - [ Event Triggers] ( #event-triggers )
6667- ** Event Triggers** — fire scripts when an image appears, a window opens, a pixel changes, or a file is modified
6768- ** Run History** — SQLite-backed run log across scheduler / triggers / hotkeys / REST with auto error-screenshot artifacts
6869- ** Report Generation** — export test records as HTML, JSON, or XML reports with success/failure status
70+ - ** MCP Server** — JSON-RPC 2.0 Model Context Protocol server (stdio + HTTP/SSE) so Claude Desktop / Claude Code / custom tool-use loops can drive AutoControl. ~ 90 tools, full protocol coverage (resources, prompts, sampling, roots, logging, progress, cancellation, elicitation), bearer-token auth + TLS, audit log, rate limit, plugin hot-reload, CI fake backend
6971- ** Remote Automation** — TCP socket server ** and** REST API server to receive automation commands
7072- ** Plugin Loader** — drop ` .py ` files exposing ` AC_* ` callables into a directory and register them as executor commands at runtime
7173- ** Shell Integration** — execute shell commands within automation workflows with async output capture
8183
8284## Architecture
8385
86+ The runtime is layered: ** client surfaces** (CLI, GUI, MCP/REST/socket
87+ servers) sit on top of the ** headless API** (` wrapper/ ` + ` utils/ ` ),
88+ which resolves to a ** per-OS backend** chosen at import time by
89+ ` wrapper/platform_wrapper.py ` . The package façade
90+ (` je_auto_control/__init__.py ` ) re-exports every public name so users
91+ need only ` import je_auto_control ` regardless of which surface or
92+ backend they hit.
93+
94+ ``` mermaid
95+ flowchart LR
96+ subgraph Clients["Client Surfaces"]
97+ direction TB
98+ Claude[["Claude Desktop /<br/>Claude Code"]]
99+ APIUser[["Custom Anthropic /<br/>OpenAI tool loops"]]
100+ HTTPClient[["HTTP / SSE clients"]]
101+ TCPClient[["Socket / REST clients"]]
102+ GUIUser[["PySide6 GUI"]]
103+ CLIUser[["python -m<br/>je_auto_control[.cli]"]]
104+ Library[["Library users<br/>(import je_auto_control)"]]
105+ end
106+
107+ subgraph Transports["Transports & Servers"]
108+ direction TB
109+ Stdio["MCP stdio<br/>JSON-RPC 2.0"]
110+ HTTPMCP["MCP HTTP /<br/>SSE + auth + TLS"]
111+ REST["REST server<br/>:9939"]
112+ Socket["Socket server<br/>:9938"]
113+ end
114+
115+ subgraph MCP["mcp_server/"]
116+ direction TB
117+ Dispatcher["MCPServer<br/>(JSON-RPC dispatcher)"]
118+ Tools["tools/<br/>~90 ac_* + aliases"]
119+ Resources["resources/<br/>files · history ·<br/>commands · screen-live"]
120+ Prompts["prompts/<br/>built-in templates"]
121+ Context["context · audit ·<br/>rate-limit · log-bridge"]
122+ FakeBE["fake_backend<br/>(CI smoke)"]
123+ end
124+
125+ subgraph Core["Headless Core (wrapper/ + utils/)"]
126+ direction TB
127+ Wrapper["wrapper/<br/>mouse · keyboard · screen ·<br/>image · record · window"]
128+ Executor["executor/<br/>AC_* JSON action engine"]
129+ Vision["vision/ · ocr/ ·<br/>accessibility/"]
130+ Recorder["scheduler/ · triggers/ ·<br/>hotkey/ · plugin_loader/<br/>run_history/"]
131+ IOUtils["clipboard/ · cv2_utils/ ·<br/>shell_process/ · json/"]
132+ end
133+
134+ subgraph Backends["Per-OS Backends"]
135+ direction TB
136+ Win["windows/<br/>Win32 ctypes"]
137+ Mac["osx/<br/>pyobjc · Quartz"]
138+ X11["linux_with_x11/<br/>python-Xlib"]
139+ end
140+
141+ Claude --> Stdio
142+ APIUser --> Stdio
143+ HTTPClient --> HTTPMCP
144+ TCPClient --> Socket
145+ TCPClient --> REST
146+
147+ Stdio --> Dispatcher
148+ HTTPMCP --> Dispatcher
149+ Dispatcher --> Tools
150+ Dispatcher --> Resources
151+ Dispatcher --> Prompts
152+ Dispatcher -.- Context
153+ Tools -.optional.-> FakeBE
154+
155+ Tools --> Wrapper
156+ Tools --> Executor
157+ Tools --> Vision
158+ Tools --> Recorder
159+ Tools --> IOUtils
160+ Resources --> Recorder
161+ Resources --> Wrapper
162+
163+ REST --> Executor
164+ Socket --> Executor
165+
166+ GUIUser --> Wrapper
167+ GUIUser --> Recorder
168+ CLIUser --> Executor
169+ Library --> Wrapper
170+ Library --> Executor
171+
172+ Wrapper --> Backends
173+ Vision -.- Wrapper
174+ Recorder -.- Executor
175+ ```
176+
84177```
85178je_auto_control/
86179├── wrapper/ # Platform-agnostic API layer
@@ -89,19 +182,21 @@ je_auto_control/
89182│ ├── auto_control_keyboard.py# Keyboard operations
90183│ ├── auto_control_image.py # Image recognition (OpenCV template matching)
91184│ ├── auto_control_screen.py # Screenshot, screen size, pixel color
185+ │ ├── auto_control_window.py # Cross-platform window manager facade
92186│ └── auto_control_record.py # Action recording/playback
93187├── windows/ # Windows-specific backend (Win32 API / ctypes)
94188├── osx/ # macOS-specific backend (pyobjc / Quartz)
95189├── linux_with_x11/ # Linux-specific backend (python-Xlib)
96190├── gui/ # PySide6 GUI application
97191└── utils/
192+ ├── mcp_server/ # MCP server (stdio + HTTP/SSE) — server, tools/, resources, prompts, audit, rate_limit, fake_backend, plugin_watcher
98193 ├── executor/ # JSON action executor engine
99194 ├── callback/ # Callback function executor
100195 ├── cv2_utils/ # OpenCV screenshot, template matching, video recording
101196 ├── accessibility/ # UIA (Windows) / AX (macOS) element finder
102197 ├── vision/ # VLM-based locator (Anthropic / OpenAI backends)
103198 ├── ocr/ # Tesseract-backed text locator
104- ├── clipboard/ # Cross-platform clipboard
199+ ├── clipboard/ # Cross-platform clipboard (text + image)
105200 ├── scheduler/ # Interval + cron scheduler
106201 ├── hotkey/ # Global hotkey daemon
107202 ├── triggers/ # Image/window/pixel/file triggers
@@ -408,6 +503,77 @@ je_auto_control.execute_action([
408503| Process | ` AC_execute_process ` |
409504| Executor | ` AC_execute_action ` , ` AC_execute_files ` |
410505
506+ ### MCP Server (Use AutoControl from Claude)
507+
508+ Expose AutoControl as a Model Context Protocol server so any
509+ MCP-compatible client (Claude Desktop, Claude Code, custom Anthropic
510+ / OpenAI tool-use loops) can drive the host machine. Stdlib-only —
511+ JSON-RPC 2.0 over stdio or HTTP+SSE.
512+
513+ ** Register with Claude Code:**
514+
515+ ``` bash
516+ claude mcp add autocontrol -- python -m je_auto_control.utils.mcp_server
517+ ```
518+
519+ ** Register with Claude Desktop** (` claude_desktop_config.json ` ):
520+
521+ ``` json
522+ {
523+ "mcpServers" : {
524+ "autocontrol" : {
525+ "command" : " python" ,
526+ "args" : [" -m" , " je_auto_control.utils.mcp_server" ]
527+ }
528+ }
529+ }
530+ ```
531+
532+ ** Start programmatically:**
533+
534+ ``` python
535+ import je_auto_control as ac
536+
537+ # Stdio (blocks until stdin closes)
538+ ac.start_mcp_stdio_server()
539+
540+ # Or HTTP / SSE with bearer-token auth + optional TLS
541+ ac.start_mcp_http_server(host = " 127.0.0.1" , port = 9940 ,
542+ auth_token = " hunter2" )
543+ ```
544+
545+ ** Inspect the catalogue without starting the server:**
546+
547+ ``` bash
548+ je_auto_control_mcp --list-tools
549+ je_auto_control_mcp --list-tools --read-only
550+ je_auto_control_mcp --list-resources
551+ je_auto_control_mcp --list-prompts
552+ ```
553+
554+ ** What ships:**
555+
556+ | Surface | Coverage |
557+ | ---| ---|
558+ | Tools (~ 90) | mouse · keyboard · drag · screen / multi-monitor · screenshot-as-image · diff · OCR · image · windows (move/min/max/restore/...) · clipboard text+image · process / shell · recording · screen recording · scheduler / triggers / hotkeys · accessibility tree · VLM locator · executor · history |
559+ | Aliases | ` click ` , ` type ` , ` screenshot ` , ` find_image ` , ` drag ` , ` shell ` , ` wait_image ` , ... — toggle with ` JE_AUTOCONTROL_MCP_ALIASES=0 ` |
560+ | Resources | ` autocontrol://files/<name> ` , ` autocontrol://history ` , ` autocontrol://commands ` , ` autocontrol://screen/live ` (with ` resources/subscribe ` ) |
561+ | Prompts | ` automate_ui_task ` , ` record_and_generalize ` , ` compare_screenshots ` , ` find_widget ` , ` explain_action_file ` |
562+ | Protocol | tools / resources / prompts / sampling / roots / logging / progress / cancellation / list_changed / elicitation |
563+ | Transports | stdio, HTTP ` POST /mcp ` , SSE streaming when ` Accept: text/event-stream ` |
564+ | Safety | tool annotations · ` JE_AUTOCONTROL_MCP_READONLY ` · ` JE_AUTOCONTROL_MCP_CONFIRM_DESTRUCTIVE ` · audit log · token-bucket rate limiter · auto-screenshot on error |
565+ | Ops | bearer-token auth · TLS via ` ssl_context ` · ` PluginWatcher ` hot-reload · ` JE_AUTOCONTROL_FAKE_BACKEND=1 ` for CI |
566+
567+ See [ docs/source/Eng/doc/mcp_server/mcp_server_doc.rst] ( docs/source/Eng/doc/mcp_server/mcp_server_doc.rst )
568+ for the full reference (or the
569+ [ 繁體中文] ( docs/source/Zh/doc/mcp_server/mcp_server_doc.rst ) version).
570+
571+ > ⚠️ The MCP server can move the mouse, send keystrokes, capture the
572+ > screen, and execute arbitrary ` AC_* ` actions. Only register it with
573+ > MCP clients you trust. HTTP defaults to ` 127.0.0.1 ` ; binding to
574+ > ` 0.0.0.0 ` requires explicit reason and ** must** be paired with
575+ > ` auth_token ` plus ` ssl_context ` .
576+
411577### Scheduler (Interval & Cron)
412578
413579``` python
0 commit comments