feat(cua): add managed sandbox lifecycle dashboard#8083
feat(cua): add managed sandbox lifecycle dashboard#8083zouyonghe wants to merge 28 commits intoAstrBotDevs:masterfrom
Conversation
There was a problem hiding this comment.
Sorry @zouyonghe, your pull request is larger than the review limit of 150000 diff characters
There was a problem hiding this comment.
Code Review
This pull request introduces comprehensive CUA sandbox management features, including new tools for listing, creating, switching, releasing, taking over, destroying sandboxes, taking screenshots, and copying files between them. It also adds a CuaSandboxRegistry for persistent sandbox state management and integrates a new dashboard route for sandbox administration. Review feedback points out a resource leak with unclosed file descriptors, a potential startup crash due to unhandled JSON parsing errors in the registry, and a bug related to inconsistent use of time sources for timeout calculations. Additionally, there are suggestions for improving dictionary access patterns in the dashboard routes for better robustness.
| relay_file = Path( | ||
| tempfile.mkstemp(prefix="cua-relay-", suffix=".tmp", dir=relay_root)[1] | ||
| ) |
There was a problem hiding this comment.
The use of tempfile.mkstemp creates a low-level file handle and a file path, but the file handle (file descriptor) is not being closed. This will lead to a resource leak, which could eventually exhaust the available file descriptors for the process.
You should explicitly close the file descriptor after getting it from mkstemp.
| relay_file = Path( | |
| tempfile.mkstemp(prefix="cua-relay-", suffix=".tmp", dir=relay_root)[1] | |
| ) | |
| fd, relay_path = tempfile.mkstemp(prefix="cua-relay-", suffix=".tmp", dir=relay_root) | |
| os.close(fd) | |
| relay_file = Path(relay_path) |
| with self.storage_path.open("r", encoding="utf-8") as f: | ||
| payload = json.load(f) |
There was a problem hiding this comment.
The json.load(f) call is not wrapped in a try...except block. If the sandbox_registry.json file becomes corrupted or is malformed, this will raise a json.JSONDecodeError (or OSError for other read issues), which is unhandled. Since this load() method is called during application startup, a corrupted registry file will prevent the application from starting.
It would be more robust to catch potential exceptions here, log a warning, and proceed with a default empty payload, allowing the application to start and recover gracefully.
| with self.storage_path.open("r", encoding="utf-8") as f: | |
| payload = json.load(f) | |
| try: | |
| with self.storage_path.open("r", encoding="utf-8") as f: | |
| payload = json.load(f) | |
| except (json.JSONDecodeError, OSError): | |
| # Consider logging a warning here about the corrupted registry file. | |
| self._payload = _default_registry_payload() | |
| return |
| last_used_at = record.get("last_used_at") | ||
| if last_used_at is not None: | ||
| idle_remaining = timeout - (time.time() - float(last_used_at)) | ||
| if idle_remaining > 0: | ||
| current_expires_at = time.monotonic() + idle_remaining | ||
| cua_idle_state[sandbox_id] = _CUAIdleState( | ||
| expires_at=current_expires_at, task=state.task | ||
| ) | ||
| continue |
There was a problem hiding this comment.
There's a potential issue here with mixing time sources. last_used_at is based on wall-clock time (time.time()), while the idle cleanup task's expiration (current_expires_at) is based on a monotonic clock (time.monotonic()).
The calculation idle_remaining = timeout - (time.time() - float(last_used_at)) uses wall-clock time. Then, this duration is added to time.monotonic() to set a new monotonic expiration time.
If the system's wall-clock time is adjusted (e.g., by NTP), this can lead to incorrect timeout calculations. idle_remaining could become very large or small, causing the sandbox to either never expire or expire too soon.
It's recommended to consistently use time.monotonic() for scheduling and duration calculations within a single process lifetime to avoid this. Since last_used_at must be persisted as wall-clock time, you could calculate the sleep duration based on wall-clock time and then use that duration with asyncio.sleep.
| sandbox = release_current_cua_sandbox( | ||
| _session_id(data), | ||
| str(data["sandbox_id"]) if data.get("sandbox_id") else None, | ||
| ) |
There was a problem hiding this comment.
The expression str(data["sandbox_id"]) if data.get("sandbox_id") else None is a bit fragile. If data.get("sandbox_id") is truthy, it then accesses data["sandbox_id"] using bracket notation, which could raise a KeyError under unusual circumstances (though unlikely here). A simpler and safer approach is to just use data.get("sandbox_id"), as the function you are calling (release_current_cua_sandbox) already expects str | None.
_session_id(data),
data.get("sandbox_id"),| ) | ||
| result = await shell.exec( | ||
| _terminal_command(command), | ||
| cwd=str(data["cwd"]) if data.get("cwd") else None, |
There was a problem hiding this comment.
Similar to another comment, the expression str(data["cwd"]) if data.get("cwd") else None is unnecessarily complex and slightly fragile. Using data.get("cwd") is sufficient, cleaner, and safer as it will return None if the key is missing, which is the desired behavior.
| cwd=str(data["cwd"]) if data.get("cwd") else None, | |
| cwd=data.get("cwd"), |
There was a problem hiding this comment.
Sorry @zouyonghe, your pull request is larger than the review limit of 150000 diff characters
Summary
Verification
uv run pytest tests/unit/test_cua_registry.py tests/unit/test_cua_computer_use.py tests/unit/test_cua_sandbox_tools.py tests/unit/test_core_lifecycle.py tests/unit/test_computer.py tests/unit/test_func_tool_manager.py tests/unit/test_astr_main_agent.py tests/test_dashboard.py -quv run ruff check astrbot/core/astr_main_agent.py astrbot/core/computer/booters/cua.py astrbot/core/computer/computer_client.py astrbot/core/computer/cua_registry.py astrbot/core/core_lifecycle.py astrbot/core/tools/computer_tools/__init__.py astrbot/core/tools/computer_tools/cua.py astrbot/core/tools/computer_tools/cua_sandbox.py astrbot/core/tools/computer_tools/shell.py astrbot/dashboard/routes/sandbox.py tests/unit/test_cua_registry.py tests/unit/test_cua_computer_use.py tests/unit/test_cua_sandbox_tools.py tests/unit/test_core_lifecycle.py tests/unit/test_func_tool_manager.py tests/unit/test_astr_main_agent.py tests/test_dashboard.pycd dashboard && pnpm typecheckcd dashboard && pnpm build