Commit a1df760

Document host ID, TLS, WebSocket, audio, clipboard, file transfer for Remote Desktop
Adds a 'secure transports, audio, clipboard, file transfer' section to docs/source/{Eng,Zh}/doc/new_features/new_features_doc.rst with:

- Host ID handshake (persistent 9-digit ID, expected_host_id verify)
- TLS via ssl_context on host and viewer (HTTPS-grade encryption)
- WebSocketDesktopHost / WebSocketDesktopViewer (RFC 6455, in-tree, ssl_context doubles as wss://)
- AUDIO message + sounddevice integration (host capture, viewer AudioPlayer; bounded per-client deque so slow viewers drop frames instead of stalling capture)
- CLIPBOARD message with JSON envelope (text + image; explicit per-call sync; Windows CF_DIB via ctypes, Linux xclip image/png, macOS get via Pillow ImageGrab)
- FILE_BEGIN/CHUNK/END (chunked, bidirectional, arbitrary destination path, no aggregate size limit, progress via local callbacks; GUI drag-drop on the viewer's frame display)

README.md, README_zh-TW.md, README_zh-CN.md gain a code-sample-rich appendix under the existing Remote Desktop section, plus prominent warnings about the no-path-restriction / no-size-cap behaviour the file transfer ships with.
1 parent 91cba6e commit a1df760

5 files changed

Lines changed: 516 additions & 3 deletions

README.md

Lines changed: 70 additions & 1 deletion
@@ -63,7 +63,7 @@
6363
- **OCR** — extract text from screen regions using Tesseract; wait for, click, or locate rendered text; regex search and full-region dump
6464
- **LLM Action Planner** — translate a plain-language description into a validated `AC_*` action list using Claude
6565
- **Runtime Variables & Control Flow** — `${var}` substitution at execution time, plus `AC_set_var` / `AC_inc_var` / `AC_if_var` / `AC_for_each` / `AC_loop` / `AC_retry` for data-driven scripts
66-
- **Remote Desktop** — stream this machine's screen and accept remote input over a token-authenticated TCP protocol, *or* connect to another machine and view + control it (host + viewer GUIs included)
66+
- **Remote Desktop** — stream this machine's screen and accept remote input over a token-authenticated TCP protocol, *or* connect to another machine and view + control it (host + viewer GUIs included). Optional TLS (HTTPS-grade encryption), WebSocket transport (ws:// + wss:// for browser / firewall-friendly clients), persistent 9-digit Host ID, host→viewer audio streaming, bidirectional clipboard sync (text + image), and chunked file transfer (drag-drop + progress bar; arbitrary destination path; no size cap)
6767
- **Clipboard** — read/write system clipboard text on Windows, macOS, and Linux
6868
- **Screenshot & Screen Recording** — capture full screen or regions as images, record screen to video (AVI/MP4)
6969
- **Action Recording & Playback** — record mouse/keyboard events and replay them
@@ -540,6 +540,75 @@ GUI: **Remote Desktop** tab with two sub-tabs.
540540
> externally only via SSH tunnel or TLS front-end. The token is the
541541
> only line of defence — treat it like a password.
542542
543+
**Encrypted transports + alternate protocols.** Pass an `ssl_context`
544+
to either `RemoteDesktopHost` or `RemoteDesktopViewer` to wrap every
545+
connection in TLS. For firewall-friendly access, use the in-tree
546+
WebSocket variants (no extra deps) — same protocol, RFC 6455 framing,
547+
and `wss://` if you also pass `ssl_context`:
548+
549+
```python
550+
from je_auto_control import (
551+
WebSocketDesktopHost, WebSocketDesktopViewer,
552+
)
553+
host = WebSocketDesktopHost(token="hunter2", ssl_context=server_ctx)
554+
viewer = WebSocketDesktopViewer(
555+
host="example.com", port=443, token="hunter2",
556+
ssl_context=client_ctx, expected_host_id="123456789",
557+
)
558+
```
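The sample above leaves `server_ctx` and `client_ctx` undefined. A minimal sketch of how they might be built with the standard-library `ssl` module is shown here; the helper names and the certificate and key paths are hypothetical and are not part of `je_auto_control`.

```python
import ssl
from typing import Optional


def make_server_context(cert_file: str, key_file: str) -> ssl.SSLContext:
    """Build a server-side TLS context (hypothetical helper)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # cert_file / key_file are placeholders for your own PEM files.
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def make_client_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a client-side context that verifies certificate and hostname."""
    # create_default_context() turns on CERT_REQUIRED and hostname checking;
    # pass ca_file to trust a private CA instead of the system store.
    ctx = ssl.create_default_context(cafile=ca_file)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

With a self-signed host certificate, passing that certificate as `ca_file` keeps verification enabled rather than switching it off.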
559+
560+
**Persistent Host ID.** Every host owns a stable 9-digit numeric ID
561+
(persisted at `~/.je_auto_control/remote_host_id`), announced in
562+
`AUTH_OK` and verifiable via the viewer's `expected_host_id`:
563+
564+
```python
565+
print(host.host_id) # e.g. "123456789"
566+
viewer = RemoteDesktopViewer(
567+
host=..., port=..., token=...,
568+
expected_host_id="123456789", # AuthenticationError on mismatch
569+
)
570+
```
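To illustrate what "persistent" means here, the following sketch loads a 9-digit ID from a file and generates one on first use. It is an assumption about the general approach, not the library's code: the function name, the validation rule, and the non-zero leading digit are all hypothetical.

```python
import secrets
from pathlib import Path


def load_or_create_host_id(path: Path) -> str:
    """Return the stored 9-digit ID, creating and saving one if absent."""
    if path.exists():
        stored = path.read_text(encoding="utf-8").strip()
        if len(stored) == 9 and stored.isascii() and stored.isdigit():
            return stored
    # Assumption: nine decimal digits, first digit non-zero.
    host_id = str(secrets.randbelow(900_000_000) + 100_000_000)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(host_id, encoding="utf-8")
    return host_id
```

Calling it twice with the same path returns the same ID, which is the property a viewer relies on when it pins `expected_host_id`.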
571+
572+
**Audio streaming (host → viewer).** Optional `sounddevice` dep; opt
573+
in with `enable_audio=True` on the host, attach an `AudioPlayer` (or
574+
your own callback) on the viewer:
575+
576+
```python
577+
host = RemoteDesktopHost(token="tok", enable_audio=True)
578+
579+
from je_auto_control.utils.remote_desktop import AudioPlayer
580+
player = AudioPlayer(); player.start()
581+
viewer = RemoteDesktopViewer(host=..., on_audio=player.play)
582+
```
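The commit message notes that each client gets a bounded deque so that a slow viewer drops frames instead of stalling capture. A minimal sketch of that idea, assuming one queue per connected viewer; the class name and the default size are hypothetical:

```python
import threading
from collections import deque
from typing import Deque, Optional


class BoundedAudioQueue:
    """Per-client frame queue that discards the oldest frame once full."""

    def __init__(self, max_frames: int = 32) -> None:
        self._frames: Deque[bytes] = deque(maxlen=max_frames)
        self._lock = threading.Lock()

    def push(self, frame: bytes) -> None:
        # Capture side: appending to a full deque(maxlen=...) evicts the
        # oldest entry, so this does not wait for a slow viewer.
        with self._lock:
            self._frames.append(frame)

    def pop(self) -> Optional[bytes]:
        # Sender side: returns None when there is nothing to send.
        with self._lock:
            return self._frames.popleft() if self._frames else None
```

Dropping the oldest frame favours low latency over completeness, which suits live audio.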
583+
584+
**Clipboard sync (text + image, bidirectional).** Explicit per-call —
585+
no auto-poll loops. Image clipboard works on Windows (CF_DIB via
586+
ctypes) and Linux (`xclip -t image/png`); macOS get is supported via
587+
Pillow ImageGrab, set requires PyObjC.
588+
589+
```python
590+
viewer.send_clipboard_text("hello")
591+
viewer.send_clipboard_image(open("logo.png", "rb").read())
592+
host.broadcast_clipboard_text("greetings")
593+
```
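According to the commit message, the CLIPBOARD message carries a JSON envelope for both text and images. The wire format is not given, so the sketch below shows only one plausible shape: the field names `kind`, `text`, `format`, and `data_b64` are assumptions, as are the function names.

```python
import base64
import json
from typing import Tuple, Union


def encode_clipboard_text(text: str) -> bytes:
    """Wrap clipboard text in a JSON envelope (hypothetical field names)."""
    return json.dumps({"kind": "text", "text": text}).encode("utf-8")


def encode_clipboard_image(png_bytes: bytes) -> bytes:
    """Wrap PNG bytes in a JSON envelope; binary data is base64-encoded."""
    payload = {
        "kind": "image",
        "format": "png",
        "data_b64": base64.b64encode(png_bytes).decode("ascii"),
    }
    return json.dumps(payload).encode("utf-8")


def decode_clipboard(raw: bytes) -> Tuple[str, Union[str, bytes]]:
    """Return ("text", str) or ("image", bytes) from an envelope."""
    envelope = json.loads(raw.decode("utf-8"))
    if envelope["kind"] == "text":
        return "text", envelope["text"]
    if envelope["kind"] == "image":
        return "image", base64.b64decode(envelope["data_b64"])
    raise ValueError(f"unknown clipboard kind: {envelope['kind']!r}")
```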
594+
595+
**File transfer with progress.** Bidirectional, chunked, arbitrary
596+
destination path, no size cap; the GUI viewer also accepts drag-drop:
597+
598+
```python
599+
viewer.send_file(
600+
"local.bin", "/tmp/uploaded.bin",
601+
on_progress=lambda tid, done, total: print(done, total),
602+
)
603+
host.send_file_to_viewers("local.bin", "/tmp/from_host.bin")
604+
```
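The commit message names three message types for a transfer: FILE_BEGIN, FILE_CHUNK, and FILE_END. As a rough sketch of how a sender could split a file into that sequence, the generator below yields one tuple per message. The payload keys, the chunk size, and the function name are assumptions; only the `(transfer id, done, total)` callback shape comes from the sample above.

```python
import os
import uuid
from typing import Callable, Iterator, Optional, Tuple

CHUNK_SIZE = 64 * 1024  # assumed; the real chunk size is not stated


def iter_file_messages(
    src: str,
    dest: str,
    on_progress: Optional[Callable[[str, int, int], None]] = None,
    chunk_size: int = CHUNK_SIZE,
) -> Iterator[Tuple[str, dict]]:
    """Yield (message_type, payload) tuples for one file transfer."""
    transfer_id = uuid.uuid4().hex
    total = os.path.getsize(src)
    yield "FILE_BEGIN", {"id": transfer_id, "dest": dest, "size": total}
    done = 0
    with open(src, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            done += len(chunk)
            yield "FILE_CHUNK", {"id": transfer_id, "data": chunk}
            # Runs once the consumer has taken the chunk and asks for more.
            if on_progress is not None:
                on_progress(transfer_id, done, total)
    yield "FILE_END", {"id": transfer_id}
```

Reading one chunk at a time keeps memory use flat regardless of file size, which matters when there is no size cap.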
605+
606+
> ⚠️ Path is unrestricted and there is no aggregate size limit.
607+
> Anyone with the token can write any file to any location and can
608+
> fill the disk — keep "trusted token holders == trusted users" in
609+
> mind, or wrap with your own `FileReceiver` subclass that vets
610+
> destination paths.
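The kind of destination check such a subclass might perform can be sketched independently of the library. The helper below confines every write to one root directory; its name is hypothetical, and how it would be wired into `FileReceiver` is not shown because that interface is not described here.

```python
from pathlib import Path


def resolve_inside(root: Path, requested: str) -> Path:
    """Resolve `requested` under `root`, or raise ValueError if it escapes."""
    root = root.resolve()
    # Joining with an absolute `requested` discards `root`, and resolve()
    # collapses ".." segments, so both cases fail the containment test.
    candidate = (root / requested).resolve()
    if root not in candidate.parents:
        raise ValueError(f"destination escapes {root}: {requested!r}")
    return candidate
```

This addresses only the path half of the warning; guarding against a full disk would need a separate size check.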
611+
543612
### Clipboard
544613

545614
```python

README/README_zh-CN.md

Lines changed: 54 additions & 1 deletion
@@ -62,7 +62,7 @@
6262
- **OCR** — 使用 Tesseract 从屏幕提取文字,可搜索、点击或等待文字出现;支持 regex 搜索与整块区域 dump
6363
- **LLM 动作规划器** — 用 Claude 把自然语言描述翻译成验证过的 `AC_*` 动作清单
6464
- **运行期变量与流程控制** — 执行时 `${var}` 替换,加上 `AC_set_var` / `AC_inc_var` / `AC_if_var` / `AC_for_each` / `AC_loop` / `AC_retry` 让脚本数据驱动
65-
- **远程桌面** — 用 token 认证的 TCP 协议串流本机画面并接收输入,**或** 连接到他机观看与控制(host + viewer GUI 内置)
65+
- **远程桌面** — 用 token 认证的 TCP 协议串流本机画面并接收输入,**或** 连接到他机观看与控制(host + viewer GUI 内置)。可选 TLS(HTTPS 级加密)、WebSocket 传输(``ws://`` + ``wss://``,穿墙/浏览器友好)、持久化 9 位数 Host ID、host→viewer 音频串流、双向剪贴板同步(文字 + 图片)、分块文件传输(拖放 + 进度条;任意目的路径;无大小上限)
6666
- **剪贴板** — 于 Windows / macOS / Linux 读写系统剪贴板文本
6767
- **截图与屏幕录制** — 捕获全屏或指定区域为图片,录制屏幕为视频(AVI/MP4)
6868
- **动作录制与回放** — 录制鼠标/键盘事件并重新播放
@@ -504,6 +504,59 @@ GUI:**Remote Desktop** 分页,内含两个子分页。
504504

505505
> ⚠️ 取得 host:port 与 token 的人,等同拥有本机完整鼠标 / 键盘控制权。默认仅绑 `127.0.0.1`;要对外暴露请务必搭配 SSH tunnel 或 TLS 前端。Token 是唯一防线 — 请当作密码保管。
506506
507+
**加密传输与替代协议**:传 `ssl_context` 给 `RemoteDesktopHost` 或 `RemoteDesktopViewer` 即套上 TLS。要穿墙/给浏览器接,用内置的 WebSocket 版本(无额外依赖),加 `ssl_context` 就变 `wss://`:
508+
509+
```python
510+
from je_auto_control import (
511+
WebSocketDesktopHost, WebSocketDesktopViewer,
512+
)
513+
host = WebSocketDesktopHost(token="hunter2", ssl_context=server_ctx)
514+
viewer = WebSocketDesktopViewer(
515+
host="example.com", port=443, token="hunter2",
516+
ssl_context=client_ctx, expected_host_id="123456789",
517+
)
518+
```
519+
520+
**持久化 Host ID**:每台 host 有稳定的 9 位数字 ID(存在 `~/.je_auto_control/remote_host_id`),在 `AUTH_OK` 中声明,viewer 通过 `expected_host_id` 验证:
521+
522+
```python
523+
print(host.host_id) # 例如 "123456789"
524+
viewer = RemoteDesktopViewer(
525+
host=..., port=..., token=...,
526+
expected_host_id="123456789", # 不一致就抛 AuthenticationError
527+
)
528+
```
529+
530+
**音频串流(host → viewer)**:可选 `sounddevice` 依赖;host 端 `enable_audio=True` 开启,viewer 端接 `AudioPlayer`(或自己的 callback):
531+
532+
```python
533+
host = RemoteDesktopHost(token="tok", enable_audio=True)
534+
535+
from je_auto_control.utils.remote_desktop import AudioPlayer
536+
player = AudioPlayer(); player.start()
537+
viewer = RemoteDesktopViewer(host=..., on_audio=player.play)
538+
```
539+
540+
**剪贴板同步(文字 + 图片,双向)**:明确调用,没有自动 polling 循环。图片剪贴板在 Windows(CF_DIB via ctypes)和 Linux(`xclip -t image/png`)支持;macOS get 走 Pillow ImageGrab、set 暂时需要 PyObjC。
541+
542+
```python
543+
viewer.send_clipboard_text("hello")
544+
viewer.send_clipboard_image(open("logo.png", "rb").read())
545+
host.broadcast_clipboard_text("greetings")
546+
```
547+
548+
**文件传输 + 进度**:双向、分块、目的路径任意、无大小上限;GUI viewer 还可以拖放:
549+
550+
```python
551+
viewer.send_file(
552+
"local.bin", "/tmp/uploaded.bin",
553+
on_progress=lambda tid, done, total: print(done, total),
554+
)
555+
host.send_file_to_viewers("local.bin", "/tmp/from_host.bin")
556+
```
557+
558+
> ⚠️ 路径无限制、大小无上限。任何拿到 token 的人都能把任意文件写到任意位置,也能塞满磁盘 — 必须等同信任 token 持有者,或自己继承 `FileReceiver`,在 `handle_begin` 内验证 dest_path。
559+
507560
### 剪贴板
508561

509562
```python

README/README_zh-TW.md

Lines changed: 54 additions & 1 deletion
@@ -62,7 +62,7 @@
6262
- **OCR** — 使用 Tesseract 從螢幕擷取文字,可搜尋、點擊或等待文字出現;支援 regex 搜尋與整塊區域 dump
6363
- **LLM 動作規劃器** — 用 Claude 把自然語言描述翻譯成驗證過的 `AC_*` 動作清單
6464
- **執行期變數與流程控制** — 執行時 `${var}` 取代,加上 `AC_set_var` / `AC_inc_var` / `AC_if_var` / `AC_for_each` / `AC_loop` / `AC_retry` 讓腳本資料驅動
65-
- **遠端桌面** — 用 token 認證的 TCP 協定串流本機畫面並接收輸入,**或** 連線到他機觀看與控制(host + viewer GUI 皆內建)
65+
- **遠端桌面** — 用 token 認證的 TCP 協定串流本機畫面並接收輸入,**或** 連線到他機觀看與控制(host + viewer GUI 皆內建)。可選 TLS(HTTPS 級加密)、WebSocket 傳輸(``ws://`` + ``wss://``,穿牆/瀏覽器友善)、持久化 9 位數 Host ID、host→viewer 音訊串流、雙向剪貼簿同步(文字 + 圖片)、分塊檔案傳輸(拖放 + 進度條;任意目的路徑;無大小上限)
6666
- **剪貼簿** — 於 Windows / macOS / Linux 讀寫系統剪貼簿文字
6767
- **截圖與螢幕錄製** — 擷取全螢幕或指定區域為圖片,錄製螢幕為影片(AVI/MP4)
6868
- **動作錄製與回放** — 錄製滑鼠/鍵盤事件並重新播放
@@ -504,6 +504,59 @@ GUI:**Remote Desktop** 分頁,內含兩個子分頁。
504504

505505
> ⚠️ 取得 host:port 與 token 的人,等同擁有本機完整滑鼠 / 鍵盤控制權。預設只綁 `127.0.0.1`;要對外暴露請務必搭配 SSH tunnel 或 TLS 前端。Token 是唯一防線 — 請當作密碼來保管。
506506
507+
**加密傳輸與替代協定**:傳 `ssl_context` 給 `RemoteDesktopHost` 或 `RemoteDesktopViewer` 即套上 TLS。要穿牆/給瀏覽器接,用內建的 WebSocket 版本(無額外相依),加 `ssl_context` 就變 `wss://`:
508+
509+
```python
510+
from je_auto_control import (
511+
WebSocketDesktopHost, WebSocketDesktopViewer,
512+
)
513+
host = WebSocketDesktopHost(token="hunter2", ssl_context=server_ctx)
514+
viewer = WebSocketDesktopViewer(
515+
host="example.com", port=443, token="hunter2",
516+
ssl_context=client_ctx, expected_host_id="123456789",
517+
)
518+
```
519+
520+
**持久化 Host ID**:每台 host 有穩定的 9 位數字 ID(存在 `~/.je_auto_control/remote_host_id`),在 `AUTH_OK` 中宣告,viewer 透過 `expected_host_id` 驗證:
521+
522+
```python
523+
print(host.host_id) # 例如 "123456789"
524+
viewer = RemoteDesktopViewer(
525+
host=..., port=..., token=...,
526+
expected_host_id="123456789", # 不一致就拋 AuthenticationError
527+
)
528+
```
529+
530+
**音訊串流(host → viewer)**:選用 `sounddevice` 相依;host 端 `enable_audio=True` 開啟,viewer 端接 `AudioPlayer`(或自己的 callback):
531+
532+
```python
533+
host = RemoteDesktopHost(token="tok", enable_audio=True)
534+
535+
from je_auto_control.utils.remote_desktop import AudioPlayer
536+
player = AudioPlayer(); player.start()
537+
viewer = RemoteDesktopViewer(host=..., on_audio=player.play)
538+
```
539+
540+
**剪貼簿同步(文字 + 圖片,雙向)**:明確呼叫,沒有自動 polling 迴圈。圖片剪貼簿在 Windows(CF_DIB via ctypes)跟 Linux(`xclip -t image/png`)支援;macOS get 走 Pillow ImageGrab、set 暫時需要 PyObjC。
541+
542+
```python
543+
viewer.send_clipboard_text("hello")
544+
viewer.send_clipboard_image(open("logo.png", "rb").read())
545+
host.broadcast_clipboard_text("greetings")
546+
```
547+
548+
**檔案傳輸 + 進度**:雙向、分塊、目的路徑任意、無大小上限;GUI viewer 還可以拖放:
549+
550+
```python
551+
viewer.send_file(
552+
"local.bin", "/tmp/uploaded.bin",
553+
on_progress=lambda tid, done, total: print(done, total),
554+
)
555+
host.send_file_to_viewers("local.bin", "/tmp/from_host.bin")
556+
```
557+
558+
> ⚠️ 路徑無限制、大小無上限。任何拿到 token 的人都能把任意檔案寫到任意位置,也能塞滿磁碟 — 必須等同信任 token 持有者,或自己繼承 `FileReceiver`,在 `handle_begin` 內驗證 dest_path。
559+
507560
### 剪貼簿
508561

509562
```python
