
fix(sandbox): add host-mediated gateway restart#5874

Open
ericksoa wants to merge 55 commits into main from fix/issue-2426-gateway-restart

@ericksoa ericksoa commented Jun 26, 2026

Contributor

Summary

Adds a supported host-mediated gateway restart command for NemoClaw-managed OpenClaw and Hermes gateways so runtime configuration and plugin changes can be applied without weakening the sandbox/gateway user boundary.

This revision also hardens the supervised gateway lifecycle: restart and crash recovery act only on a proven process identity and listener, configuration/state changes are transactional, and ambiguous privileged execution or failed built-in health probes fail closed.
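
For readers unfamiliar with why a PID alone is not a "proven process identity", here is a minimal sketch of the idea, assuming the Linux `/proc/<pid>/stat` format. The names `ProcessIdentity`, `parseProcStat`, and `isSameProcess` are hypothetical and are not taken from this PR, whose supervisor lives in shell and Python helpers.

```typescript
// Hypothetical sketch: identify a process by (pid, starttime) rather than pid alone,
// so a recycled PID is never mistaken for the tracked gateway.
interface ProcessIdentity {
  pid: number;
  startTime: string; // clock ticks since boot, kept as text to avoid precision loss
}

// Parse one line in the Linux /proc/<pid>/stat format.
// The comm field is wrapped in parentheses and may itself contain spaces or
// parentheses, so split on the LAST ")" before reading the remaining fields.
function parseProcStat(statLine: string): ProcessIdentity | null {
  const open = statLine.indexOf("(");
  const close = statLine.lastIndexOf(")");
  if (open < 0 || close < open) return null;
  const pidText = statLine.slice(0, open).trim();
  const rest = statLine.slice(close + 1).trim().split(/\s+/);
  // rest[0] is field 3 (state), so field 22 (starttime) is rest[19].
  const startTime = rest[19];
  if (!/^[1-9]\d*$/.test(pidText)) return null;
  if (startTime === undefined || !/^\d+$/.test(startTime)) return null;
  return { pid: Number(pidText), startTime: startTime };
}

// A recorded identity matches only when both the PID and the start time agree.
function isSameProcess(recorded: ProcessIdentity, current: ProcessIdentity | null): boolean {
  return current !== null && current.pid === recorded.pid && current.startTime === recorded.startTime;
}
```

With a check of this shape, a signal or reap is skipped whenever the start time no longer matches, which is the PID-reuse case.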

Related Issue

Fixes #2426
Fixes #5253
Supersedes #5416

Changes

  • Added sandbox:gateway:restart plus public nemoclaw <name> gateway restart and nemohermes <name> gateway restart routing.
  • Added root-owned OpenClaw and Hermes PID 1 supervisors with a root-only control helper, exact PID/process-start identity checks, listener ownership validation, post-stop absence proof, exact reaping, health wait, and forward recovery.
  • Added descriptor-based, no-follow, atomic configuration/state guards, including bounded Hermes secret-boundary and configuration-hash validation.
  • Serialized shields mutations and recovery timers, and ordered snapshot, destroy, and inference transitions so stale callbacks cannot reapply state.
  • Made built-in OpenClaw/Hermes probes fail closed after trusted execution failure or timeout; rejected unsupported privileged-exec drivers and ambiguous container selection. Explicit custom gateway agents retain their compatibility path.
  • Wired all runtime helpers into optimized build contexts with root-owned, non-writable permissions, and preserved root PID 1 access to mutable sandbox-group state when hardened runtimes drop CAP_DAC_OVERRIDE.
  • Updated command, lifecycle, runtime-control, troubleshooting, and Hermes documentation and added focused unit, integration, source-shape, packaging, and live-scenario coverage.
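
The fail-closed rule for built-in probes in the list above can be illustrated with a small sketch. It assumes a probe yields one of three outcomes; the `ProbeOutcome` and `Health` types and the `classifyBuiltInProbe` function are hypothetical names for illustration, not the PR's API.

```typescript
// Hypothetical outcome of running a built-in health probe through a trusted exec path.
type ProbeOutcome =
  | { kind: "ok"; httpStatus: number }
  | { kind: "exec-failed"; reason: string }
  | { kind: "timeout"; afterMs: number };

type Health = { healthy: true } | { healthy: false; detail: string };

// Fail closed: only a completed probe with a 2xx status counts as healthy.
// Execution failure and timeout are never treated as "probably fine".
function classifyBuiltInProbe(outcome: ProbeOutcome): Health {
  switch (outcome.kind) {
    case "ok":
      return outcome.httpStatus >= 200 && outcome.httpStatus < 300
        ? { healthy: true }
        : { healthy: false, detail: "unexpected status " + outcome.httpStatus };
    case "exec-failed":
      return { healthy: false, detail: "probe execution failed: " + outcome.reason };
    case "timeout":
      return { healthy: false, detail: "probe timed out after " + outcome.afterMs + " ms" };
  }
}
```

The design point is that there is no default branch that reports healthy, so an unexpected probe result cannot be read as success.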

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Quality Gates

  • Tests added or updated for changed behavior
  • Existing tests cover changed behavior — justification:
  • Tests not applicable — justification:
  • Docs updated for user-facing behavior changes
  • Docs not applicable — justification:
  • Sensitive paths changed (security, policy, credentials, preflight, onboarding, inference, runner, sandbox, or messaging)
  • Sensitive-path review completed or maintainer-approved waiver recorded — reviewer/approval link/justification: Independent final review covered secrets, input/path handling, injection, privilege boundaries, integrity, PID/race safety, fail-closed behavior, build wiring, and tests; no blocking finding remained. CI and maintainer review remain authoritative.
  • Non-success, skipped, or missing CI check accepted by maintainer — check name, approval link, and follow-up issue:

Verification

  • PR description includes the DCO sign-off declaration and every commit appears as Verified in GitHub
  • Git hooks passed during commit and push, or npx prek run --from-ref main --to-ref HEAD passes
  • Targeted tests pass for changed behavior
  • Full npm test passes (broad runtime changes only)
  • Quality Gates section completed with required justifications or waivers
  • No secrets, API keys, or credentials committed
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Verification run locally against the current main merge:

  • npm run build:cli
  • npm run typecheck:cli
  • npm run typecheck
  • Plugin suite and typecheck: 17 files / 490 tests passed; tsc --noEmit passed.
  • Critical post-merge CLI slice: 45 files / 648 tests passed, with 2 platform skips.
  • Exact-head guardrail follow-up: all changed test files are at or below main's if count; all new test files are at zero. The affected suites passed 414 tests with 2 platform skips, and the final forward-listener consumers reran 19/19.
  • Full 99-step OpenClaw image build completed; test/e2e-non-root-smoke.sh passed 2/2 under --security-opt no-new-privileges.
  • Capability-dropped root-group proof passed with CAP_DAC_OVERRIDE absent; stale-base group repair is idempotent and target-user step-down still resets supplementary groups.
  • npm run checks
  • Biome over all changed TypeScript, JavaScript, and JSON files.
  • Bash syntax, configured shfmt, ShellCheck, Ruff, and Python compilation over changed helpers.
  • Test-size, source-shape, environment-variable documentation, Gitleaks, Hadolint, and git diff --check gates.
  • npm run docs completed with 0 errors and 2 pre-existing Fern warnings.
  • Push hooks passed CLI typechecking and version synchronization; earlier full-surface push hooks also passed plugin and JS-config typechecking.

The full local CLI hook was run before commit. It still encounters unchanged macOS/Node 22.16 harness failures (plain Node loading TypeScript diagnostic preloads and a GNU sed assumption) plus broad parallel timeout noise. Branch-related failures exposed by that run were fixed, then commits were retried with only test-cli skipped; all other commit hooks and all push hooks passed. Exact-head CI is the final broad-suite proof.


Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

Summary by CodeRabbit

  • New Features
    • Added gateway restart for supported sandboxes, including quiet mode and automatic gateway/port-forward recovery verification.
  • Bug Fixes
    • Improved recovery and forward restoration logic with stricter health/forward checks and safer fail-closed behavior when privileged control isn’t available.
    • Hardened gateway, Hermes, and state transition permissions/guarding to reduce mis-supervision and config/hash drift issues.
  • Documentation
    • Updated restart/recover, shields/timer windows, snapshot/rebuild/destroy, and troubleshooting references with clearer operator guidance.

Advisor dispositions (exact head c4223e3)

  • PRA-1 / PRA-2 — compatible endpoint policy: normalizeCustomEndpointUrl and its accepted bridge-host behavior have no diff from base SHA 8e4784e8; this PR only adds serialization and shields-posture checks around inference mutations. Changing the endpoint trust policy here would be an unrelated security-policy migration, so it is outside this PR.
  • PRA-3 / PRA-T6 / PRA-T7 — issue #2426 (Impossible to restart hermes gateway if it ever stops) reproduction: the hardened root supervisor launches Hermes as the separate gateway UID specifically so the sandbox user cannot kill or restart that child. The live scenario therefore uses trusted sandbox root to TERM/KILL the exact tracked gateway PID, which establishes the same stopped-process state without weakening the new privilege boundary. Recovery then runs from the host, modeling the post-exit step, and verifies a replacement gateway PID, unchanged PID 1 identity, both config hashes, API health on 8642, dashboard forward/HTML on 18789 when enabled, and Ready/healthy status.
  • PRA-T8 — clear recovery output: test/cli/connect-recovery.test.ts asserts Probe complete: recovered Hermes Agent gateway; the live Hermes scenario separately proves exit code 0 and the recovered runtime/forward state.
  • PRA-4 — shields module size: a behavior-neutral extraction of the remaining security-sensitive orchestration is deferred because it would materially widen this already large recovery change. The new transition, state-directory, and config-lock modules provide focused seams for later extraction without mixing that refactor into the bug fix.
  • Nemotron informational timeout: Nemotron timed out internally and produced no code finding. The final nine-file gate fix received independent correctness, security, and docs audits; the focused suite passed 184 tests with 2 platform skips both normally and under V8 coverage.
  • Exact-head runtime/security proof: all required checks on c4223e3fe passed, including five CLI shards and aggregate, both sandbox-image builds, non-root smoke, gateway isolation, port overrides, sandbox E2E, macOS/WSL, CodeRabbit, GPT-5.5, Nemotron workflow, and CodeQL with 0 annotations (down from 41). The broader cloud matrix remains an advisor recommendation; no unrun cloud job is claimed as proof.
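
The post-recovery checks described in the PRA-3 disposition can be summarized in a sketch like the following. It is illustrative only: the `GatewaySnapshot` shape and `verifyRecovery` function are hypothetical and do not mirror the live scenario's code.

```typescript
// Hypothetical before/after snapshot of the state the live scenario inspects.
interface GatewaySnapshot {
  pid1StartTime: string; // identity of the supervisor running as PID 1
  gatewayPid: number;
  gatewayStartTime: string;
  configHashes: string[]; // the config hashes that must not drift
  apiHealthy: boolean;
}

// Returns a list of problems; an empty list means the recovery looks correct.
function verifyRecovery(before: GatewaySnapshot, after: GatewaySnapshot): string[] {
  const problems: string[] = [];
  if (after.pid1StartTime !== before.pid1StartTime) {
    problems.push("PID 1 identity changed");
  }
  const sameGateway =
    after.gatewayPid === before.gatewayPid && after.gatewayStartTime === before.gatewayStartTime;
  if (sameGateway) {
    problems.push("gateway process was not replaced");
  }
  if (after.configHashes.join("\n") !== before.configHashes.join("\n")) {
    problems.push("config hash drift");
  }
  if (!after.apiHealthy) {
    problems.push("API health check failed");
  }
  return problems;
}
```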

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@ericksoa ericksoa self-assigned this Jun 26, 2026
@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds gateway restart control, PID 1 supervisor wiring, descriptor-safe state/config locking, identity-based process supervision, Hermes protocol handling, forward recovery, and timer-bound shields/mutation flows across CLI, scripts, docs, and tests.

Changes

Gateway restart, PID 1 supervisor, and shields hardening

| Layer | File(s) | Summary |
|---|---|---|
| Supervisor shell protocol and image wiring | scripts/gateway-control.sh, scripts/lib/gateway-supervisor.sh, Dockerfile, Dockerfile.base, agents/hermes/Dockerfile, src/lib/sandbox/build-context.ts, test/hermes-doctor-config-hash.test.ts | Adds the host control script and PID 1 supervisor library, stages them into the build contexts and images, hardens ownership and modes, broadens Hermes base-image allowlists, and updates the healthcheck fallback to PID/starttime validation. |
| Descriptor-safe state-dir and config guards | scripts/state-dir-guard.py, src/lib/shields/state-dir-lock.ts, src/lib/shields/openclaw-config-lock.ts, agents/hermes/validate-env-secret-boundary.py, src/lib/shields/state-dir-lock.test.ts, src/lib/shields/openclaw-config-lock.test.ts | Adds the Python state-dir guard, the OpenClaw config guard wrapper, and the Hermes secret-boundary validator changes; these modules parse structured output, enforce bounded traversal/read rules, and drive privileged guard execution paths for lock, unlock, and write-config flows. |
| Identity supervision in start scripts | scripts/nemoclaw-start.sh, agents/hermes/start.sh | Refactors OpenClaw and Hermes startup scripts to track process start identities, gate liveness and reaping on identity matches, supervise forwarders and log tails, and integrate authenticated control/restart handling into the PID 1 loop. |
| Timer control, restore timer, and timer-bound locks | src/lib/shields/timer-control.ts, src/lib/shields/timer.ts, src/lib/shields/timer-bound-lock.ts, src/lib/shields/timer.test.ts, src/lib/shields/timer-bound-lock.test.ts | Extends timer markers with lease-owner and legacy-protocol fields, adds process identity helpers, makes restore state updates atomic, and wraps shield mutation locks with retry-on-token-churn sync/async helpers. |
| Shields index refactor and Hermes protocol transitions | src/lib/shields/index.ts, src/lib/shields/legacy-hermes-compat.test.ts, src/lib/shields/legacy-hermes-transition.test.ts, src/lib/shields/openclaw-transition.test.ts, src/lib/shields/flow.test.ts, src/lib/shields/transition-lock.test.ts, src/lib/shields/index.test.ts | Reworks shields transitions around sealed vs legacy Hermes protocol support, transition locks, deferred exits, identity-pinned recovery, and protocol-aware lock/unlock flows; includes the legacy Hermes transition script and the OpenClaw/Hermes transition and flow tests. |
| Gateway restart, process recovery, and forward recovery | src/lib/agent/gateway-restart-markers.ts, src/lib/agent/gateway-restart-scripts.ts, src/lib/agent/gateway-script-shared.ts, src/lib/agent/runtime.ts, src/lib/actions/sandbox/gateway-restart.ts, src/lib/actions/sandbox/process-recovery.ts, src/lib/actions/sandbox/forward-health.ts, src/lib/actions/sandbox/forward-recovery.ts, src/lib/actions/sandbox/hermes-dashboard-recovery.ts, src/lib/actions/sandbox/connect.ts, test/process-recovery*.test.ts, test/cli/connect-recovery.test.ts, test/recover-port-forward.test.ts, src/lib/agent/runtime*.test.ts | Adds restart markers, gateway recovery script builders, gateway script helpers, forward-health classification, forward-recovery routines, the gateway restart action, and supervisor-mediated process recovery. Updates Hermes dashboard recovery, connect probe failure handling, and recovery tests. |
| CLI wiring, config safety, inference gating, and lifecycle wrappers | src/commands/sandbox/gateway/restart.ts, src/commands/sandbox/recover.ts, src/lib/cli/public-display-defaults.ts, src/lib/sandbox/config.ts, src/lib/sandbox/privileged-exec.ts, src/lib/actions/inference-set.ts, src/lib/actions/sandbox/destroy.ts, src/lib/actions/sandbox/snapshot.ts, src/lib/actions/sandbox/rebuild-shields.ts, src/lib/agent/defs.ts, src/lib/agent/onboard.ts, related tests | Adds the restart CLI entrypoint and public display entry, updates recover help text, hardens config reads and writes with digests and guarded restarts, rejects unsupported direct drivers, gates inference-set on shields mutability, and wraps destroy/snapshot/rebuild flows in timer-bound mutation locks. |
| Docs, live scenario, and static analysis | test/e2e-scenario/live/hermes-e2e.test.ts, docs/manage-sandboxes/*, docs/reference/*, docs/security/best-practices.mdx, ci/env-var-doc-allowlist.json, scripts/checks/no-test-dist-imports.ts | Updates live Hermes E2E coverage for host-mediated restart and locked-drift refusal, refreshes operator docs for restart, recovery, shields, snapshots, and inference-set behavior, adds the forward-recovery env allowlist entry, and upgrades the no-test-dist-imports checker. |
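
One common way to keep a late timer callback from reapplying state, as the timer-bound lock and transition work above aims to do, is a generation token. The sketch below is a generic illustration of that technique under that assumption; `TransitionGate` is a hypothetical class and is not claimed to match the PR's lock or marker format.

```typescript
// Hypothetical generation-token guard for serialized state transitions.
class TransitionGate {
  private generation = 0;
  private state = "locked";

  // Begin a transition and receive a token tied to the current generation.
  begin(): number {
    this.generation += 1;
    return this.generation;
  }

  // Apply a state change only if no newer transition began after `token` was issued.
  apply(token: number, nextState: string): boolean {
    if (token !== this.generation) {
      return false; // stale callback: a newer transition owns the state now
    }
    this.state = nextState;
    return true;
  }

  current(): string {
    return this.state;
  }
}
```

A restore timer that fires after a newer transition has begun would hold an old token, so its `apply` call is rejected instead of overwriting the newer state.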

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#5197: Both PRs modify gateway recovery paths in src/lib/actions/sandbox/process-recovery.ts and related runtime helpers.
  • NVIDIA/NemoClaw#5681: Both PRs touch src/lib/shields/index.ts and the shields auto-restore/config-lock flow.
  • NVIDIA/NemoClaw#5869: Both PRs modify src/lib/actions/inference-set.ts and its shield-gated mutation flow.

Suggested labels

area: sandbox, area: security, bug-fix

Suggested reviewers

  • cv
  • cjagwani
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 10.56% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title concisely summarizes the main change: adding a host-mediated gateway restart path. |
| Linked Issues check | ✅ Passed | The restart/recover changes and Hermes boundary validation address #2426 and #5253. |
| Out of Scope Changes check | ✅ Passed | The broader shields, state-dir, and config changes support the same restart/recovery hardening scope. |

Comment @coderabbitai help to get the list of available commands.

@github-code-quality

github-code-quality Bot commented Jun 26, 2026

Contributor

Code Coverage Overview

Languages: TypeScript

TypeScript / code-coverage/plugin

The overall coverage in the branch is 96%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
| File | 955c87e | +/- |
|---|---|---|
| nemoclaw/src/se...cret-scanner.ts | 100% | |
| nemoclaw/src/commands/slash.ts | 100% | |
| nemoclaw/src/li...bprocess-env.ts | 100% | |
| nemoclaw/src/bl...eprint/state.ts | 98% | |
| nemoclaw/src/onboard/config.ts | 98% | |
| nemoclaw/src/bl...int/snapshot.ts | 97% | |
| nemoclaw/src/bl...print/runner.ts | 95% | |
| nemoclaw/src/co...ration-state.ts | 94% | |
| nemoclaw/src/bl...ate-networks.ts | 94% | |
| nemoclaw/src/index.ts | 94% | |

TypeScript / code-coverage/cli

The overall coverage in the branch is 68%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
| File | 955c87e | +/- |
|---|---|---|
| src/lib/shields...nsition-lock.ts | 89% | |
| src/lib/actions...all/run-plan.ts | 80% | |
| src/lib/state/o...oard-session.ts | 79% | |
| src/lib/actions...dbox/rebuild.ts | 74% | |
| src/lib/state/sandbox.ts | 72% | |
| src/lib/shields/index.ts | 71% | |
| src/lib/onboard/preflight.ts | 69% | |
| src/lib/actions...licy-channel.ts | 59% | |
| src/lib/onboard...er-gpu-patch.ts | 59% | |
| src/lib/onboard.ts | 20% | |

Updated June 29, 2026 01:10 UTC
Code Coverage is in Public Preview. Learn more and provide us with your feedback.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@github-actions

github-actions Bot commented Jun 26, 2026

Contributor

E2E Advisor Recommendation

Required E2E: cloud-onboard-e2e, hermes-e2e, hermes-root-entrypoint-smoke-e2e, hermes-secret-boundary-e2e, hermes-inference-switch-e2e, openclaw-inference-switch-e2e, sandbox-operations-e2e, sandbox-survival-e2e, issue-2478-crash-loop-recovery-e2e, shields-config-e2e, runtime-overrides-e2e, state-backup-restore-e2e, rebuild-openclaw-e2e, rebuild-hermes-e2e, rebuild-hermes-stale-base-e2e, gateway-health-honest-e2e
Optional E2E: onboard-resume-e2e, onboard-repair-e2e, credential-sanitization-e2e, network-policy-e2e, gateway-drift-preflight-e2e

Dispatch hint: cloud-onboard-e2e,hermes-e2e,hermes-root-entrypoint-smoke-e2e,hermes-secret-boundary-e2e,hermes-inference-switch-e2e,openclaw-inference-switch-e2e,sandbox-operations-e2e,sandbox-survival-e2e,issue-2478-crash-loop-recovery-e2e,shields-config-e2e,runtime-overrides-e2e,state-backup-restore-e2e,rebuild-openclaw-e2e,rebuild-hermes-e2e,rebuild-hermes-stale-base-e2e

Auto-dispatched E2E: hermes-root-entrypoint-smoke-e2e, hermes-secret-boundary-e2e, issue-2478-crash-loop-recovery-e2e via nightly-e2e.yaml at 955c87e53eb5ef029a7ce244a7082e8fb60f2ab2 (nightly run)

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • cloud-onboard-e2e (high): Full hosted onboarding is required because Dockerfiles, sandbox build context, agent onboarding/defs, runtime startup, and hosted inference wiring changed.
  • hermes-e2e (high): Validates the real Hermes install/onboard/start/health/live-inference flow after extensive Hermes Dockerfile, start script, runtime guard, dashboard recovery, and gateway lifecycle changes.
  • hermes-root-entrypoint-smoke-e2e (medium): Directly exercises the Hermes root entrypoint, gateway-user execution, runtime layout repair, and PID/lifecycle handling touched by start.sh, managed-gateway-control, supervisor, Dockerfile permissions, and runtime-config-guard changes.
  • hermes-secret-boundary-e2e (medium): Required for changes to Hermes secret-boundary validator, Dockerfile runtime permissions/preloads, start path, and config guard behavior that must prevent raw secrets from entering sandbox env/config.
  • hermes-inference-switch-e2e (high): Required because Hermes inference switching, runtime config writes, config hashes/seals, guard transactions, and live routed inference paths changed.
  • openclaw-inference-switch-e2e (high): Required because shared inference-set logic and OpenClaw config guard/rebuild-shields paths changed, affecting live route swaps and OpenClaw agent traffic.
  • sandbox-operations-e2e (high): Required for changed sandbox connect, destroy, recover, gateway restart, dashboard recovery, snapshot, and operations scenario code that backs real sandbox management flows.
  • sandbox-survival-e2e (medium): Required for gateway restart/recovery and process-supervisor changes; validates an onboarded sandbox survives gateway stop/start while preserving workspace and inference.
  • issue-2478-crash-loop-recovery-e2e (medium): Required because process recovery, forward recovery, gateway supervisor, and startup recovery race code changed; this catches crash-loop/self-healing regressions in a real sandbox lifecycle.
  • shields-config-e2e (medium): Required for changes to shields transitions, state-dir/openclaw config locks, rebuild-shields, timer/verify locks, and Hermes legacy transition compatibility.
  • runtime-overrides-e2e (medium): Required because runtime startup, runtime controls, public display defaults, config set behavior, and entrypoint env/override handling changed.
  • state-backup-restore-e2e (high): Required because sandbox snapshot and backup/restore-adjacent command/docs paths changed; validates state preservation and restoration through real sandbox lifecycle operations.
  • rebuild-openclaw-e2e (high): Required because the OpenClaw production/base Dockerfiles, start script, gateway/control guards, image permissions, and rebuild shield handling changed.
  • rebuild-hermes-e2e (high): Required because Hermes Dockerfile, runtime guard, start script, stale layout handling, and rebuild config seal paths changed.
  • rebuild-hermes-stale-base-e2e (high): Required because Hermes stale-base cleanup and allowed local/stale base image cases changed in agents/hermes/Dockerfile and verifier scripts.
  • gateway-health-honest-e2e (medium): Required because Dockerfile healthcheck and gateway PID/starttime validation changed; this regression lane verifies gateway health is not falsely reported when the gateway is unhealthy.

Optional E2E

  • onboard-resume-e2e (medium): Adjacent confidence for onboarding state continuity because src/lib/agent/onboard.ts and image/build inputs changed, though the required onboarding-resume machine-slice rule is not directly triggered by the listed files.
  • onboard-repair-e2e (medium): Adjacent confidence for repair after onboarding/build/runtime changes; useful if the PR also changes persisted partial onboarding states not visible in the truncated diff.
  • credential-sanitization-e2e (medium): Optional defense-in-depth coverage for credential display/env sanitization after secret-boundary, runtime env, and provider placeholder changes.
  • network-policy-e2e (medium): Optional adjacent coverage for security boundary and config-set SSRF-related changes, especially if any network policy or egress proxy behavior changed outside the visible diff.
  • gateway-drift-preflight-e2e (low): Optional adjacent gateway lifecycle confidence because gateway supervision and health/recovery code changed, although the PR does not appear to target OpenShell gateway image drift specifically.

New E2E recommendations

  • OpenClaw managed gateway control and sealed config recovery (high): This PR adds/changes root-owned gateway control, managed-gateway-control.py, state-dir/openclaw config guards, PID starttime markers, and recovery locking for OpenClaw. Existing E2E lanes exercise broad sandbox operations and recovery, but there is no clearly dedicated live OpenClaw E2E that attacks the managed gateway-control boundary under shields/config-lock conditions and verifies PID-reuse-safe restart behavior end-to-end.
    • Suggested test: Add an OpenClaw live E2E that onboards a sandbox, enables shields/config locks, attempts an unprivileged gateway/config tamper, performs nemoclaw sandbox gateway restart and sandbox recover, verifies config hashes/guards, and confirms gateway PID/starttime identity and inference after recovery.

Dispatch hint

  • Workflow: .github/workflows/nightly-e2e.yaml
  • jobs input: cloud-onboard-e2e,hermes-e2e,hermes-root-entrypoint-smoke-e2e,hermes-secret-boundary-e2e,hermes-inference-switch-e2e,openclaw-inference-switch-e2e,sandbox-operations-e2e,sandbox-survival-e2e,issue-2478-crash-loop-recovery-e2e,shields-config-e2e,runtime-overrides-e2e,state-backup-restore-e2e,rebuild-openclaw-e2e,rebuild-hermes-e2e,rebuild-hermes-stale-base-e2e

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 4

🧹 Nitpick comments (2)
docs/manage-sandboxes/runtime-controls.mdx (1)

60-60: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Use $$nemoclaw on this shared page.

These Hermes-only references hard-code nemohermes for a host command that also exists on the shared alias surface. Use $$nemoclaw here so the generated OpenClaw and Hermes variants render the correct command name consistently.
As per coding guidelines, Use $$nemoclaw for host CLI command examples on shared OpenClaw and Hermes pages. As per path instructions, ask the author to use $$nemoclaw instead so generated OpenClaw and Hermes docs render the right command name. Based on learnings, use a concrete alias only when the command is intentionally OpenClaw-specific.

Also applies to: 70-70

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/manage-sandboxes/runtime-controls.mdx` at line 60, The runtime-controls
table currently hard-codes the host command as nemohermes on a shared
OpenClaw/Hermes page, which should instead use the shared alias surface. Update
the affected entry in the docs content that references the gateway restart
command to use $$nemoclaw so both generated variants render the correct host CLI
name consistently, and apply the same fix to the other referenced occurrence in
this section.

Sources: Coding guidelines, Path instructions, Learnings
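
To show why a shared placeholder matters here, a minimal sketch of per-variant rendering follows. It assumes the docs build substitutes the literal `$$nemoclaw` token with the variant's CLI name; `renderVariant` is a hypothetical helper, and the real docs pipeline may work differently.

```typescript
// Hypothetical variant renderer for shared docs pages.
const CLI_PLACEHOLDER = "$$nemoclaw";

function renderVariant(source: string, cliName: string): string {
  // split/join replaces every occurrence and avoids the special meaning
  // that "$" sequences have in String.prototype.replace replacement strings.
  return source.split(CLI_PLACEHOLDER).join(cliName);
}

// The same shared line yields the right command for each generated variant.
const sharedLine = "$$nemoclaw <name> gateway restart";
const openClawLine = renderVariant(sharedLine, "nemoclaw");
const hermesLine = renderVariant(sharedLine, "nemohermes");
```

A hard-coded `nemohermes` in the shared source would pass through unchanged, so the OpenClaw variant would show the wrong command name.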

docs/reference/commands.mdx (1)

1334-1334: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Use $$nemoclaw in the shared reference page.

This note is on a shared docs page, and gateway restart is not Hermes-only. Hard-coding nemohermes here breaks the variant-friendly command style used elsewhere in this page.
As per coding guidelines, Use $$nemoclaw for host CLI command examples on shared OpenClaw and Hermes pages. As per path instructions, ask the author to use $$nemoclaw instead so generated OpenClaw and Hermes docs render the right command name. Based on learnings, concrete aliases are fine when the command is intentionally agent-specific, which is not the case here.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/reference/commands.mdx` at line 1334, Replace the hard-coded Hermes CLI
name in the shared docs example with the shared host-command placeholder used
elsewhere on this page. Update the command example in the reference docs so it
uses $$nemoclaw instead of nemohermes, keeping the wording variant-friendly for
both OpenClaw and Hermes and preserving the existing gateway restart guidance.

Sources: Coding guidelines, Path instructions, Learnings

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/actions/sandbox/process-recovery.test.ts`:
- Around line 295-299: The test is reading failure-only fields from
`restartSandboxGateway` without narrowing the union first, so `result.detail` is
not type-safe after `toMatchObject`. Update the assertion in
`process-recovery.test.ts` to explicitly narrow on `result.ok` for the failure
branch before accessing `detail`, while keeping the existing
`restartSandboxGateway` and `deps.buildOpenClawGatewayRestartScript`
expectations intact.

In `@src/lib/actions/sandbox/process-recovery.ts`:
- Around line 617-624: The success log in process recovery is not honoring
quiet, so `process-recovery.ts` still prints the gateway restart message
unconditionally. Update the `processRecovery` restart path around the
`forwardRecovered` checks so the final `console.log` for “Gateway restarted;
health passed; forwards checked/recovered” only runs when `quiet` is false,
matching the existing quiet handling used for the earlier status messages.
- Around line 593-599: The wedge diagnostics path is bypassing the injected exec
behavior by passing the direct sandbox exec helper instead of the dependency
override. In the process-recovery flow that calls waitForRecoveredSandboxGateway
and printGatewayWedgeDiagnostics, make sure the diagnostics invocation uses
deps.executeSandboxExecCommand when present (falling back to the default helper
only if not injected) so callers can fully control exec behavior during
health-timeout recovery.

In `@src/lib/agent/runtime.ts`:
- Around line 282-291: In buildOpenClawGatewayRestartScript and the
gatewayRootGosuLaunchCommand flow, move the root check ahead of any
state-mutating restart steps so OpenClaw verifies root before log setup, lock
removal, or stale-process termination. Use the existing root-check logic from
gatewayRootGosuLaunchCommand and ensure the non-root path exits before
buildGatewayLogSetup, buildGatewayGuardRecoveryLines, rm -rf lock cleanup, and
buildGatewayStopLines are executed.

---

Nitpick comments:
In `@docs/manage-sandboxes/runtime-controls.mdx`:
- Line 60: The runtime-controls table currently hard-codes the host command as
nemohermes on a shared OpenClaw/Hermes page, which should instead use the shared
alias surface. Update the affected entry in the docs content that references the
gateway restart command to use $$nemoclaw so both generated variants render the
correct host CLI name consistently, and apply the same fix to the other
referenced occurrence in this section.

In `@docs/reference/commands.mdx`:
- Line 1334: Replace the hard-coded Hermes CLI name in the shared docs example
with the shared host-command placeholder used elsewhere on this page. Update the
command example in the reference docs so it uses $$nemoclaw instead of
nemohermes, keeping the wording variant-friendly for both OpenClaw and Hermes
and preserving the existing gateway restart guidance.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4573e314-45ce-43a2-b097-901ac1580774

📥 Commits

Reviewing files that changed from the base of the PR and between 84ffec1 and 13d407f.

📒 Files selected for processing (17)
  • agents/hermes/Dockerfile
  • docs/manage-sandboxes/install-plugins-hermes.mdx
  • docs/manage-sandboxes/lifecycle.mdx
  • docs/manage-sandboxes/runtime-controls.mdx
  • docs/reference/commands-nemohermes.mdx
  • docs/reference/commands.mdx
  • docs/reference/troubleshooting.mdx
  • src/commands/sandbox/gateway/restart.ts
  • src/commands/sandbox/oclif-command-adapters.test.ts
  • src/lib/actions/sandbox/process-recovery.test.ts
  • src/lib/actions/sandbox/process-recovery.ts
  • src/lib/agent/runtime.test.ts
  • src/lib/agent/runtime.ts
  • src/lib/cli/command-registry.test.ts
  • src/lib/cli/public-argv-translation.test.ts
  • src/lib/cli/public-display-defaults.ts
  • test/hermes-doctor-config-hash.test.ts

Comment thread src/lib/actions/sandbox/process-recovery.test.ts Outdated
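
The narrowing finding for process-recovery.test.ts can be illustrated with a short sketch. The `RestartResult` shape below is an assumption for illustration; the actual return type of `restartSandboxGateway` may carry different fields.

```typescript
// Hypothetical result union: only the failure branch carries `detail`.
type RestartResult =
  | { ok: true }
  | { ok: false; detail: string };

// Reading `detail` is only type-safe after narrowing on the `ok` discriminant.
function failureDetail(result: RestartResult): string | null {
  if (result.ok) {
    return null; // success branch: `detail` does not exist on this member
  }
  return result.detail; // narrowed to the failure member
}
```

In a test, the same pattern means asserting on `result.ok` first and reading `detail` only inside the failure branch, rather than relying on a structural matcher to narrow the type.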
Comment thread src/lib/actions/sandbox/process-recovery.ts Outdated
Comment thread src/lib/actions/sandbox/process-recovery.ts Outdated
Comment thread src/lib/agent/runtime.ts Outdated
@github-actions

github-actions Bot commented Jun 26, 2026

Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: gateway-guard-recovery, hermes-dashboard-vitest, hermes-sandbox-secret-boundary-vitest, sandbox-operations-vitest, hermes-inference-switch-vitest, hermes-root-entrypoint-smoke-vitest, shields-config-vitest, snapshot-commands-vitest, cloud-onboard-vitest
Optional Vitest E2E scenarios: openclaw-inference-switch-vitest, sandbox-survival-vitest, onboard-repair-vitest

Dispatch required Vitest E2E scenarios:

  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=gateway-guard-recovery
  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=hermes-dashboard-vitest
  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=hermes-sandbox-secret-boundary-vitest
  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=sandbox-operations-vitest
  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=hermes-inference-switch-vitest
  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=hermes-root-entrypoint-smoke-vitest
  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=shields-config-vitest
  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=snapshot-commands-vitest
  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=cloud-onboard-vitest

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

  • gateway-guard-recovery: Focused free-standing Vitest job wired for changed live test test/e2e-scenario/live/gateway-guard-recovery.test.ts.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=gateway-guard-recovery
  • hermes-dashboard-vitest: Focused free-standing Vitest job wired for changed live test test/e2e-scenario/live/hermes-e2e.test.ts.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=hermes-dashboard-vitest
  • hermes-sandbox-secret-boundary-vitest: Focused free-standing Vitest job wired for changed live test test/e2e-scenario/live/hermes-sandbox-secret-boundary.test.ts.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=hermes-sandbox-secret-boundary-vitest
  • sandbox-operations-vitest: Focused free-standing Vitest job wired for changed live test test/e2e-scenario/live/sandbox-operations.test.ts.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=sandbox-operations-vitest
  • hermes-inference-switch-vitest: The Hermes inference switch live helper and shared inference-set action changed, so run the dedicated Hermes inference-switch Vitest job.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=hermes-inference-switch-vitest
  • hermes-root-entrypoint-smoke-vitest: Hermes entrypoint, root-owned lifecycle controls, gateway supervisor wiring, and Dockerfile permissions changed; run the root entrypoint smoke job.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=hermes-root-entrypoint-smoke-vitest
  • shields-config-vitest: Shields lock/transition/timer/config code and Hermes/OpenClaw config guard behavior changed; run the dedicated shields config Vitest job.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=shields-config-vitest
  • snapshot-commands-vitest: Sandbox snapshot action code changed and should be covered by the dedicated snapshot commands live Vitest job.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=snapshot-commands-vitest
  • cloud-onboard-vitest: Dockerfile/base image, sandbox build context/config, agent onboard/runtime, and startup behavior changed, so the primary cloud onboard live Vitest job should run.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=cloud-onboard-vitest

Optional Vitest E2E scenarios

  • openclaw-inference-switch-vitest: Shared inference-set code changed; Hermes inference switch is the direct target, but the OpenClaw inference-switch job is adjacent coverage for the same command surface on the OpenClaw runtime.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=openclaw-inference-switch-vitest
  • sandbox-survival-vitest: Gateway/process recovery and startup health paths changed; sandbox operations and gateway guard recovery are the direct required coverage, while sandbox survival is adjacent lifecycle coverage.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=sandbox-survival-vitest
  • onboard-repair-vitest: Process recovery, recover command, runtime guard, and startup transaction code changed; run this if reviewers want extra persisted-session repair/backstop coverage beyond the required sandbox and cloud-onboard jobs.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=onboard-repair-vitest

Relevant changed files

  • Dockerfile
  • Dockerfile.base
  • agents/hermes/Dockerfile
  • agents/hermes/runtime-config-guard.py
  • agents/hermes/start.sh
  • agents/hermes/validate-env-secret-boundary.py
  • scripts/gateway-control.sh
  • scripts/lib/gateway-supervisor.sh
  • scripts/managed-gateway-control.py
  • scripts/nemoclaw-start.sh
  • scripts/openclaw-config-guard.py
  • scripts/state-dir-guard.py
  • scripts/verify-hermes-stale-openclaw-image.sh
  • src/commands/sandbox/config/set.ts
  • src/commands/sandbox/gateway/restart.ts
  • src/commands/sandbox/recover.ts
  • src/lib/actions/inference-set.ts
  • src/lib/actions/sandbox/connect.ts
  • src/lib/actions/sandbox/destroy.ts
  • src/lib/actions/sandbox/forward-health.ts
  • src/lib/actions/sandbox/forward-recovery.ts
  • src/lib/actions/sandbox/gateway-restart.ts
  • src/lib/actions/sandbox/hermes-dashboard-recovery.ts
  • src/lib/actions/sandbox/process-recovery.ts
  • src/lib/actions/sandbox/rebuild-shields.ts
  • src/lib/actions/sandbox/snapshot.ts
  • src/lib/agent/defs.ts
  • src/lib/agent/gateway-restart-markers.ts
  • src/lib/agent/gateway-restart-scripts.ts
  • src/lib/agent/gateway-script-shared.ts
  • src/lib/agent/hermes-recovery-boundary.ts
  • src/lib/agent/onboard.ts
  • src/lib/agent/runtime.ts
  • src/lib/sandbox/build-context.ts
  • src/lib/sandbox/config.ts
  • src/lib/sandbox/privileged-exec.ts
  • src/lib/shields/index.ts
  • src/lib/shields/openclaw-config-lock.ts
  • src/lib/shields/state-dir-lock.ts
  • src/lib/shields/timer-bound-lock.ts
  • src/lib/shields/timer-control.ts
  • src/lib/shields/timer.ts
  • src/lib/shields/transition-lock.ts
  • src/lib/shields/verify-lock.ts
  • test/e2e-scenario/live/gateway-guard-recovery.test.ts
  • test/e2e-scenario/live/hermes-e2e.test.ts
  • test/e2e-scenario/live/hermes-inference-switch-helpers.ts
  • test/e2e-scenario/live/hermes-sandbox-secret-boundary.test.ts
  • test/e2e-scenario/live/sandbox-operations.test.ts
  • test/e2e-scenario/support-tests/hermes-dashboard-workflow-boundary.test.ts
  • tools/e2e-scenarios/hermes-dashboard-workflow-boundary.mts

@github-actions

github-actions Bot commented Jun 26, 2026

Contributor

PR Review Advisor — Blocked

Merge posture: Do not merge until addressed
Primary next action: Fix PRA-4: Validate custom-compatible inference endpoint URLs; then add or justify PRA-T1.
Open items: 1 required · 7 warnings · 0 suggestions · 8 test follow-ups
Since last review: 0 prior items resolved · 6 still apply · 1 new item found

Action checklist

  • PRA-4 Fix: Validate custom-compatible inference endpoint URLs in src/lib/actions/inference-set.ts:405
  • PRA-1 Resolve or justify: Source-of-truth review needed: Custom-compatible inference endpoint metadata
  • PRA-2 Resolve or justify: Source-of-truth review needed: Stale-base gateway/root sandbox-group fallback
  • PRA-3 Resolve or justify: Source-of-truth review needed: Managed OpenShell gateway control mutable config branch
  • PRA-5 Resolve or justify: Cover or document the literal `hermes gateway stop` reproduction in test/e2e-scenario/live/hermes-e2e.test.ts:710
  • PRA-6 Resolve or justify: Identify runtime proof for root-only supervisor controls under capability drop in Dockerfile:925
  • PRA-7 Resolve or justify: Make managed non-root restart limitations explicit in behavior and tests in scripts/managed-gateway-control.py:17
  • PRA-8 Resolve or justify: Shrink the central shields orchestration hotspot in src/lib/shields/index.ts:1
  • PRA-T1 Add or justify test follow-up: Runtime validation
  • PRA-T2 Add or justify test follow-up: Runtime validation
  • PRA-T3 Add or justify test follow-up: Runtime validation
  • PRA-T4 Add or justify test follow-up: Runtime validation
  • PRA-T5 Add or justify test follow-up: Runtime validation
  • PRA-T6 Add or justify test follow-up: Identify runtime proof for root-only supervisor controls under capability drop
  • PRA-T7 Add or justify test follow-up: Acceptance clause
  • PRA-T8 Add or justify test follow-up: Acceptance clause

Findings index

| ID | Severity | Category | Location | Required action |
| --- | --- | --- | --- | --- |
| PRA-1 | Resolve/justify | architecture | not file-specific | Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior. |
| PRA-2 | Resolve/justify | architecture | not file-specific | Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior. |
| PRA-3 | Resolve/justify | architecture | not file-specific | Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior. |
| PRA-4 | Required | security | src/lib/actions/inference-set.ts:405 | Reuse or expose the public-host and DNS validation policy from `src/lib/sandbox/config.ts` for both custom-compatible providers, including HTTP DNS pinning where appropriate. If `host.openshell.internal` or another bridge host is intentionally supported, encode that as a narrow documented allowlist with tests instead of accepting arbitrary private/internal targets. |
| PRA-5 | Resolve/justify | acceptance | test/e2e-scenario/live/hermes-e2e.test.ts:710 | Add a focused live/integration step that invokes `hermes gateway stop` when available and then runs `nemohermes <name> recover`, or add an in-code test comment documenting why stopping the exact tracked PID is the supported equivalence under the new gateway-user boundary. |
| PRA-6 | Resolve/justify | tests | Dockerfile:925 | Add or identify targeted runtime validation for both OpenClaw and Hermes images proving the installed helper and control directory modes, non-root refusal, root access without `CAP_DAC_OVERRIDE`, and supplementary-group reset after gateway/sandbox step-down. |
| PRA-7 | Resolve/justify | security | scripts/managed-gateway-control.py:17 | Keep the code fail-closed where it cannot prove root-owned locked config, and add an explicit behavior test or user-facing diagnostic/docs note that managed mutable restarts are not equivalent to the root-PID-1 seal. If stronger guarantees are required for this PR, gate managed mutable restart behind a locked-config/root-anchor proof instead. |
| PRA-8 | Resolve/justify | architecture | src/lib/shields/index.ts:1 | Where local to this PR, extract cohesive Hermes/OpenClaw transition orchestration blocks from `index.ts` into narrow modules beside the new lock modules, or add focused in-code justification for why the remaining `index.ts` growth must stay coupled for this bug fix. Preserve trust-boundary validation at call sites and keep existing fail-closed tests intact. |

🚨 Required before merge

Address these before merging unless a maintainer explicitly overrides the advisor with rationale.

PRA-4 Required — Validate custom-compatible inference endpoint URLs

  • Location: src/lib/actions/inference-set.ts:405
  • Category: security
  • Problem: `normalizeCustomEndpointUrl()` still accepts any syntactically valid `http:` or `https:` URL without embedded credentials, strips query/hash, and persists it as durable metadata for `compatible-endpoint` and `compatible-anthropic-endpoint`. It does not reject loopback, link-local, RFC1918/private hosts, private/internal hostnames, DNS-private resolutions, or HTTP DNS-rebinding windows, while `config set` already has stricter private-host/private-IP/DNS validation and HTTP DNS pinning.
  • Impact: A user-controlled compatible inference endpoint can become trusted routing metadata through a path that bypasses NemoClaw's established SSRF and DNS-rebinding defenses. If consumed by gateway or sandbox routing, that metadata can target host bridge, metadata, or internal services outside the intended egress policy.
  • Required action: Reuse or expose the public-host and DNS validation policy from `src/lib/sandbox/config.ts` for both custom-compatible providers, including HTTP DNS pinning where appropriate. If `host.openshell.internal` or another bridge host is intentionally supported, encode that as a narrow documented allowlist with tests instead of accepting arbitrary private/internal targets.
  • Expected follow-up: Fix before merge or get explicit maintainer override.
  • Verification: Read `normalizeCustomEndpointUrl()` in `src/lib/actions/inference-set.ts` and compare it with `validateUrlValueWithDnsResult()` / `rewriteConfigUrlsWithDnsPinning()` in `src/lib/sandbox/config.ts`; then inspect `src/lib/actions/inference-set.test.ts` for negative private/DNS-private endpoint cases.
  • Missing regression test: Add `runInferenceSet rejects compatible-endpoint endpoint-url values for loopback, link-local, RFC1918, and DNS-private hosts` and `runInferenceSet rejects compatible-anthropic-endpoint endpoint-url values for loopback, link-local, RFC1918, and DNS-private hosts`, covering `http://127.0.0.1`, `http://169.254.169.254`, `http://10.0.0.1`, and a mocked hostname resolving to a private address. If a bridge host is allowed, add a positive test proving only that allowlisted host is accepted.
  • Done when: The required change is committed and verification passes: Read `normalizeCustomEndpointUrl()` in `src/lib/actions/inference-set.ts` and compare it with `validateUrlValueWithDnsResult()` / `rewriteConfigUrlsWithDnsPinning()` in `src/lib/sandbox/config.ts`; then inspect `src/lib/actions/inference-set.test.ts` for negative private/DNS-private endpoint cases.
  • Evidence: `src/lib/actions/inference-set.ts:405` validates only URL shape/protocol/embedded credentials. `src/lib/sandbox/config.ts` imports `isPrivateHostname`/`isPrivateIp` and implements DNS validation and HTTP pinning. The inspected inference tests include custom-compatible positive metadata and `http://host.openshell.internal:18767/`, but no private/DNS-private rejection cases.
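
The kind of literal-address rejection PRA-4 asks for can be sketched as below. This is a minimal illustration only: `isObviouslyPrivateHost` and `rejectPrivateEndpoint` are hypothetical names, not the project's `isPrivateHostname`/`isPrivateIp` helpers, and the range list is not exhaustive. It covers only literal loopback, link-local, and RFC1918 addresses; hostnames that resolve to private addresses, and any bridge-host allowlist, would still need the DNS validation and pinning policy described above.

```typescript
import { isIP } from "node:net";

// Hypothetical helper for illustration; the real policy lives in the project's
// own validators, which this sketch does not reproduce.
export function isObviouslyPrivateHost(hostname: string): boolean {
  // WHATWG URL keeps the brackets around IPv6 literals in `hostname`.
  const host = hostname.toLowerCase().replace(/^\[|\]$/g, "");
  if (host === "localhost" || host.endsWith(".localhost")) return true;

  const family = isIP(host);
  if (family === 4) {
    const [a, b] = host.split(".").map(Number);
    return (
      a === 0 ||
      a === 10 || // RFC1918 10.0.0.0/8
      a === 127 || // loopback
      (a === 169 && b === 254) || // link-local, includes 169.254.169.254
      (a === 172 && b >= 16 && b <= 31) || // RFC1918 172.16.0.0/12
      (a === 192 && b === 168) // RFC1918 192.168.0.0/16
    );
  }
  if (family === 6) {
    // Loopback, unspecified, link-local fe80::/10, unique-local fc00::/7.
    // IPv4-mapped IPv6 literals are NOT handled in this sketch.
    return host === "::1" || host === "::" || /^fe[89ab]/.test(host) || /^f[cd]/.test(host);
  }
  // Not an IP literal: a DNS-resolution check is still required elsewhere.
  return false;
}

// Hypothetical wrapper: rejects non-HTTP(S) URLs, embedded credentials,
// and literal private targets; returns the parsed URL otherwise.
export function rejectPrivateEndpoint(raw: string): URL {
  const url = new URL(raw);
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error("endpoint must use http or https");
  }
  if (url.username !== "" || url.password !== "") {
    throw new Error("endpoint must not embed credentials");
  }
  if (isObviouslyPrivateHost(url.hostname)) {
    throw new Error(`endpoint host ${url.hostname} is private or internal`);
  }
  return url;
}
```

A negative test along the lines the advisor requests could call `rejectPrivateEndpoint` with `http://127.0.0.1`, `http://169.254.169.254`, and `http://10.0.0.1` and expect each call to throw.
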
Review findings by urgency: 1 required fix, 7 items to resolve/justify, 0 in-scope improvements

⚠️ Resolve or justify before merge

Investigate these in the current review; either fix them, explain why they are not applicable, or document the accepted risk.

PRA-1 Resolve/justify — Source-of-truth review needed: Custom-compatible inference endpoint metadata

  • Location: not file-specific
  • Category: architecture
  • Problem: The advisor marked localized patch analysis as missing.
  • Impact: A localized workaround can preserve or hide an invalid state when the source boundary is unclear.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Missing regression test: Missing negative tests for loopback, link-local, RFC1918, and DNS-private hosts for both compatible providers.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Evidence: Covered by the security blocker at `src/lib/actions/inference-set.ts:405`.

PRA-2 Resolve/justify — Source-of-truth review needed: Stale-base gateway/root sandbox-group fallback

  • Location: not file-specific
  • Category: architecture
  • Problem: The advisor marked localized patch analysis as needs_followup.
  • Impact: A localized workaround can preserve or hide an invalid state when the source boundary is unclear.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Missing regression test: Partially covered by provisioning/helper-permission and group-writable tests; missing identified full runtime image validation for helper/control-dir ownership, non-root refusal, capability-drop access, and supplementary-group reset.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Evidence: Covered by the runtime-validation finding at `Dockerfile:925`.

PRA-3 Resolve/justify — Source-of-truth review needed: Managed OpenShell gateway control mutable config branch

  • Location: not file-specific
  • Category: architecture
  • Problem: The advisor marked localized patch analysis as needs_followup.
  • Impact: A localized workaround can preserve or hide an invalid state when the source boundary is unclear.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Missing regression test: Needs a test asserting the mutable branch is surfaced as degraded and never treats mutable Hermes hash as a root trust anchor.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Evidence: Covered by the managed non-root restart limitation finding at `scripts/managed-gateway-control.py:17`.

PRA-5 Resolve/justify — Cover or document the literal `hermes gateway stop` reproduction

  • Location: test/e2e-scenario/live/hermes-e2e.test.ts:710
  • Category: acceptance
  • Problem: The reproduction step in issue #2426 ("Impossible to restart hermes gateway if it ever stops") is literally `hermes gateway stop`, followed by exiting and reconnecting. The live Hermes scenario creates a stopped gateway by TERM/KILL of the exact tracked gateway PID, then runs `nemohermes <name> recover`; I did not find a changed test invoking the Hermes CLI stop subcommand itself or an in-code equivalence note at that boundary.
  • Impact: If `hermes gateway stop` updates pidfiles, lock files, supervisor-visible state, or cleanup paths differently from direct PID termination, the reported user path can still diverge from the covered recovery path.
  • Recommended action: Add a focused live/integration step that invokes `hermes gateway stop` when available and then runs `nemohermes <name> recover`, or add an in-code test comment documenting why stopping the exact tracked PID is the supported equivalence under the new gateway-user boundary.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Search changed tests for the literal string `hermes gateway stop`, then read Phase 5 of `test/e2e-scenario/live/hermes-e2e.test.ts` around the stopped-gateway recovery block.
  • Missing regression test: Add `Hermes live recover restores a gateway stopped via the Hermes CLI stop command` or add an explicit equivalence assertion/comment beside the existing TERM/KILL stopped-gateway setup.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Search changed tests for the literal string `hermes gateway stop`, then read Phase 5 of `test/e2e-scenario/live/hermes-e2e.test.ts` around the stopped-gateway recovery block.
  • Evidence: The inspected Phase 5 block sends TERM/KILL to `afterGateway.pid`, asserts `GATEWAY_STOPPED`, kills auxiliaries, and runs `nemohermes <name> recover`; no literal `hermes gateway stop` invocation was found in the inspected changed test.

PRA-6 Resolve/justify — Identify runtime proof for root-only supervisor controls under capability drop

  • Location: Dockerfile:925
  • Category: tests
  • Problem: This PR adds a root-only gateway control helper, root-owned control directory protocol, and a stale-base fallback that adds `root` to the `sandbox` group so PID 1 can access mutable state after `CAP_DAC_OVERRIDE` is dropped. The inspected tests cover many source snippets and harnessed process identities, but I did not identify one concise runtime-image proof spanning the full privilege contract for both OpenClaw and Hermes images.
  • Impact: A source-shape test can miss image layering, ownership, supplementary-group, or capability-drop differences. If the installed helper or control directory is writable/invocable by sandbox/gateway users, or if step-down retains unintended groups, host-mediated restart can become a privilege or policy-bypass boundary.
  • Recommended action: Add or identify targeted runtime validation for both OpenClaw and Hermes images proving the installed helper and control directory modes, non-root refusal, root access without `CAP_DAC_OVERRIDE`, and supplementary-group reset after gateway/sandbox step-down.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect `test/sandbox-provisioning-helper-permissions.test.ts`, `test/startup-process-identity.test.ts`, `test/repro-2681-group-writable.test.ts`, and live image tests for a single runtime proof covering `/usr/local/bin/nemoclaw-gateway-control`, `/run/nemoclaw/gateway-control`, non-root invocation/write refusal, `CAP_DAC_OVERRIDE` absence, and `id -G` after step-down.
  • Missing regression test: Add `OpenClaw and Hermes images keep gateway-control root-only under CAP_DAC_OVERRIDE drop`, asserting `/usr/local/bin/nemoclaw-gateway-control` is `root:root 0700`, `/run/nemoclaw/gateway-control` is `root:root 0700` after startup, sandbox/gateway users cannot invoke the helper or write request/status files, PID 1 can still read/write required mutable config paths without `CAP_DAC_OVERRIDE`, and stepped-down gateway/sandbox commands do not retain root-only supplementary groups.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect `test/sandbox-provisioning-helper-permissions.test.ts`, `test/startup-process-identity.test.ts`, `test/repro-2681-group-writable.test.ts`, and live image tests for a single runtime proof covering `/usr/local/bin/nemoclaw-gateway-control`, `/run/nemoclaw/gateway-control`, non-root invocation/write refusal, `CAP_DAC_OVERRIDE` absence, and `id -G` after step-down.
  • Evidence: `Dockerfile` installs `nemoclaw-gateway-control` as root:root 0700 and adds root to the sandbox group in the stale-base fallback. Existing inspected tests assert copied file modes and process-identity harnesses, but the static inventory did not establish the full runtime image contract across both agents.
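
The ownership and mode part of the proof requested in PRA-6 could be asserted with a check like the following. It is a sketch under assumptions: `isRootOnly0700` and `checkPathIsRootOnly` are hypothetical helpers, and it covers only the `root:root 0700` clause, not non-root refusal, capability drop, or supplementary-group reset, which need a running image.

```typescript
import { statSync } from "node:fs";

// Minimal shape shared by fs.Stats and hand-built fixtures.
interface OwnershipInfo {
  uid: number;
  gid: number;
  mode: number;
}

// Hypothetical helper: true only for root:root with permission bits exactly
// 0700. The 0o7777 mask also rejects setuid, setgid, and sticky bits.
export function isRootOnly0700(info: OwnershipInfo): boolean {
  const permissionBits = info.mode & 0o7777;
  return info.uid === 0 && info.gid === 0 && permissionBits === 0o700;
}

// Hypothetical wrapper for use inside a container-level test, for example
// against /usr/local/bin/nemoclaw-gateway-control or /run/nemoclaw/gateway-control.
export function checkPathIsRootOnly(path: string): boolean {
  return isRootOnly0700(statSync(path));
}
```

Keeping the predicate separate from `statSync` lets the same rule run against fixture objects in unit tests and against real paths inside the built OpenClaw and Hermes images.
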

PRA-7 Resolve/justify — Make managed non-root restart limitations explicit in behavior and tests

  • Location: scripts/managed-gateway-control.py:17
  • Category: security
  • Problem: The managed OpenShell topology branch documents that the host enters through `docker exec --user root`, but the managed supervisor, gateway, and agent all share the `sandbox` UID, so process-shape proof cannot establish provenance against a malicious same-UID agent. It also documents that mutable managed configuration has no durable root-owned config hash and therefore is not equivalent to the root-PID-1 restart seal.
  • Impact: Maintainers or callers can over-trust the managed restart path if tests and user-visible behavior only say restart succeeded. In mutable managed mode, the helper can prove host authorization, PID reuse safety, validators, and health, but not same-UID process provenance or root-anchored mutable config integrity.
  • Recommended action: Keep the code fail-closed where it cannot prove root-owned locked config, and add an explicit behavior test or user-facing diagnostic/docs note that managed mutable restarts are not equivalent to the root-PID-1 seal. If stronger guarantees are required for this PR, gate managed mutable restart behind a locked-config/root-anchor proof instead.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Read the limitation comments and `_verify_locked_hermes_hash()` in `scripts/managed-gateway-control.py`, then inspect `test/managed-gateway-control.test.ts` for assertions that the mutable branch does not claim strict hash trust or equivalence with the root supervisor seal.
  • Missing regression test: Add `managed gateway control reports mutable same-UID restart as degraded and never treats mutable Hermes hash as a root trust anchor`, asserting the mutable branch skips strict hash trust, still runs secret-boundary validators, and surfaces or documents the degraded provenance contract.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Read the limitation comments and `_verify_locked_hermes_hash()` in `scripts/managed-gateway-control.py`, then inspect `test/managed-gateway-control.test.ts` for assertions that the mutable branch does not claim strict hash trust or equivalence with the root supervisor seal.
  • Evidence: `scripts/managed-gateway-control.py:17-29` states the host request boundary and same-UID limitation; lines around `_verify_locked_hermes_hash()` state that mutable managed config cannot be treated as a root trust anchor.
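
One way to make the degraded contract in PRA-7 explicit is to have the restart result carry a trust level rather than a bare success flag. The sketch below is purely illustrative: the type names, field names, and the three-level split are assumptions, not the shape used by `scripts/managed-gateway-control.py` or its tests.

```typescript
// Hypothetical trust levels for a completed restart.
type RestartTrust = "root-sealed" | "managed-locked" | "managed-mutable-degraded";

// Hypothetical evidence gathered by the restart path.
interface RestartEvidence {
  rootPid1Supervisor: boolean; // restart ran under the root PID 1 supervisor
  lockedConfigHashVerified: boolean; // a root-owned locked config hash was verified
}

export function classifyRestartTrust(evidence: RestartEvidence): RestartTrust {
  if (evidence.rootPid1Supervisor) return "root-sealed";
  return evidence.lockedConfigHashVerified ? "managed-locked" : "managed-mutable-degraded";
}

// A mutable managed restart must never be treated as a root trust anchor.
export function mayTrustConfigHash(trust: RestartTrust): boolean {
  return trust !== "managed-mutable-degraded";
}
```

A behavior test could then assert that the mutable same-UID branch yields the degraded level and that no caller treats its hash as trusted.
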

PRA-8 Resolve/justify — Shrink the central shields orchestration hotspot

  • Location: src/lib/shields/index.ts:1
  • Category: architecture
  • Problem: `src/lib/shields/index.ts` remains a very large central orchestration file and grew substantially in this PR, even though new transition/config/state lock modules were also introduced. The changed code is security-sensitive timer/transition/shields behavior, where monolithic coupling makes branch coverage, invariant review, and future fixes harder.
  • Impact: Security-critical shields/timer transitions become harder to audit for TOCTOU, stale callback, and lock-order bugs. Future fixes are more likely to accidentally weaken validation hidden in the central file.
  • Recommended action: Where local to this PR, extract cohesive Hermes/OpenClaw transition orchestration blocks from `index.ts` into narrow modules beside the new lock modules, or add focused in-code justification for why the remaining `index.ts` growth must stay coupled for this bug fix. Preserve trust-boundary validation at call sites and keep existing fail-closed tests intact.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Compare `src/lib/shields/index.ts` against the new `openclaw-config-lock.ts`, `state-dir-lock.ts`, and `transition-lock.ts` modules, and identify transition-specific functions that can move without changing behavior.
  • Missing regression test: Existing shields transition and timer tests should continue to cover the extracted paths; if extraction changes seams, add `shields transition orchestration preserves stale timer fail-closed behavior after module extraction`.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Compare `src/lib/shields/index.ts` against the new `openclaw-config-lock.ts`, `state-dir-lock.ts`, and `transition-lock.ts` modules, and identify transition-specific functions that can move without changing behavior.
  • Evidence: The drift context reports `src/lib/shields/index.ts` growing from 1611 to 3236 lines while new lock modules also exist. Prior advisor review flagged the same central hotspot and the current diff still leaves it expanded.

💡 In-scope improvements

These are lower-risk, not throwaway. Prefer fixing them in this PR when they are local to changed code; defer only with rationale or a linked follow-up.

  • None.
Simplification opportunities: 1 possible cut

These are safe simplification checks only. Do not remove validation, security controls, data-loss prevention, or required tests.

  • PRA-8 shrink (src/lib/shields/index.ts:1): Move cohesive transition/timer orchestration blocks out of `src/lib/shields/index.ts` that are already conceptually backed by the new lock modules.
    • Replacement: Small modules with narrow exported functions near `transition-lock.ts`, `timer-bound-lock.ts`, and config/state lock modules; keep validation in the caller/callee contract and preserve tests.
    • Safety boundary: Do not remove or weaken shields lock verification, timer expiry fail-closed checks, config/state ownership checks, or recovery rollback tests.
Test follow-ups to resolve or justify

If these cover changed behavior, prefer adding them in this PR; otherwise state why existing coverage is enough or link the follow-up.

  • PRA-T1 Runtime validation — runInferenceSet rejects compatible-endpoint endpoint-url values for loopback, link-local, RFC1918, and DNS-private hosts. The PR changes Dockerfiles, root-owned helpers, PID 1 supervisors, capability-sensitive group access, Hermes/OpenClaw config guards, and live gateway recovery. Unit and harness coverage is broad, but key trust-boundary behavior depends on actual built image ownership, runtime users, control directory modes, and capability-drop behavior.
  • PRA-T2 Runtime validation — runInferenceSet rejects compatible-anthropic-endpoint endpoint-url values for loopback, link-local, RFC1918, and DNS-private hosts. Same rationale as PRA-T1.
  • PRA-T3 Runtime validation — Hermes live recover restores a gateway stopped via the Hermes CLI stop command. Same rationale as PRA-T1.
  • PRA-T4 Runtime validation — OpenClaw and Hermes images keep gateway-control root-only under CAP_DAC_OVERRIDE drop. Same rationale as PRA-T1.
  • PRA-T5 Runtime validation — managed gateway control reports mutable same-UID restart as degraded and never treats mutable Hermes hash as a root trust anchor. Same rationale as PRA-T1.
  • PRA-T6 Identify runtime proof for root-only supervisor controls under capability drop — Add or identify targeted runtime validation for both OpenClaw and Hermes images proving the installed helper and control directory modes, non-root refusal, root access without `CAP_DAC_OVERRIDE`, and supplementary-group reset after gateway/sandbox step-down.
  • PRA-T7 Acceptance clause — Issue #2426 ("Impossible to restart hermes gateway if it ever stops"): "If the hermes gateway is ever stopped, then NemoClaw will never be able to bring it back up." — add test evidence or identify existing coverage. The PR adds host-mediated restart/recover paths and the live Hermes scenario kills the tracked gateway PID, runs `nemohermes <name> recover`, and verifies replacement gateway/health/forwards. The exact `hermes gateway stop` reproduction is not directly covered.
  • PRA-T8 Acceptance clause — Issue #2426 ("Impossible to restart hermes gateway if it ever stops"), Reproduction Steps 1: "Connect to the sandbox" — add test evidence or identify existing coverage. The live scenario operates against a live Hermes sandbox and executes sandbox commands, but the exact interactive connect sequence is modeled through test helpers.
Since last review details

Current findings, using the urgency labels above:

PRA-1 Resolve/justify — Source-of-truth review needed: Custom-compatible inference endpoint metadata

  • Location: not file-specific
  • Category: architecture
  • Problem: The advisor marked localized patch analysis as missing.
  • Impact: A localized workaround can preserve or hide an invalid state when the source boundary is unclear.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Missing regression test: Missing negative tests for loopback, link-local, RFC1918, and DNS-private hosts for both compatible providers.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Evidence: Covered by the security blocker at `src/lib/actions/inference-set.ts:405`.

PRA-2 Resolve/justify — Source-of-truth review needed: Stale-base gateway/root sandbox-group fallback

  • Location: not file-specific
  • Category: architecture
  • Problem: The advisor marked localized patch analysis as needs_followup.
  • Impact: A localized workaround can preserve or hide an invalid state when the source boundary is unclear.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Missing regression test: Partially covered by provisioning/helper-permission and group-writable tests; missing identified full runtime image validation for helper/control-dir ownership, non-root refusal, capability-drop access, and supplementary-group reset.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Evidence: Covered by the runtime-validation finding at `Dockerfile:925`.

PRA-3 Resolve/justify — Source-of-truth review needed: Managed OpenShell gateway control mutable config branch

  • Location: not file-specific
  • Category: architecture
  • Problem: The advisor marked localized patch analysis as needs_followup.
  • Impact: A localized workaround can preserve or hide an invalid state when the source boundary is unclear.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Missing regression test: Needs a test asserting the mutable branch is surfaced as degraded and never treats mutable Hermes hash as a root trust anchor.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Evidence: Covered by the managed non-root restart limitation finding at `scripts/managed-gateway-control.py:17`.

PRA-4 Required — Validate custom-compatible inference endpoint URLs

  • Location: src/lib/actions/inference-set.ts:405
  • Category: security
  • Problem: `normalizeCustomEndpointUrl()` still accepts any syntactically valid `http:` or `https:` URL without embedded credentials, strips query/hash, and persists it as durable metadata for `compatible-endpoint` and `compatible-anthropic-endpoint`. It does not reject loopback, link-local, RFC1918/private hosts, private/internal hostnames, DNS-private resolutions, or HTTP DNS-rebinding windows, while `config set` already has stricter private-host/private-IP/DNS validation and HTTP DNS pinning.
  • Impact: A user-controlled compatible inference endpoint can become trusted routing metadata through a path that bypasses NemoClaw's established SSRF and DNS-rebinding defenses. If consumed by gateway or sandbox routing, that metadata can target host bridge, metadata, or internal services outside the intended egress policy.
  • Required action: Reuse or expose the public-host and DNS validation policy from `src/lib/sandbox/config.ts` for both custom-compatible providers, including HTTP DNS pinning where appropriate. If `host.openshell.internal` or another bridge host is intentionally supported, encode that as a narrow documented allowlist with tests instead of accepting arbitrary private/internal targets.
  • Expected follow-up: Fix before merge or get explicit maintainer override.
  • Verification: Read `normalizeCustomEndpointUrl()` in `src/lib/actions/inference-set.ts` and compare it with `validateUrlValueWithDnsResult()` / `rewriteConfigUrlsWithDnsPinning()` in `src/lib/sandbox/config.ts`; then inspect `src/lib/actions/inference-set.test.ts` for negative private/DNS-private endpoint cases.
  • Missing regression test: Add `runInferenceSet rejects compatible-endpoint endpoint-url values for loopback, link-local, RFC1918, and DNS-private hosts` and `runInferenceSet rejects compatible-anthropic-endpoint endpoint-url values for loopback, link-local, RFC1918, and DNS-private hosts`, covering `http://127.0.0.1`, `http://169.254.169.254`, `http://10.0.0.1`, and a mocked hostname resolving to a private address. If a bridge host is allowed, add a positive test proving only that allowlisted host is accepted.
  • Done when: The required change is committed and verification passes: Read `normalizeCustomEndpointUrl()` in `src/lib/actions/inference-set.ts` and compare it with `validateUrlValueWithDnsResult()` / `rewriteConfigUrlsWithDnsPinning()` in `src/lib/sandbox/config.ts`; then inspect `src/lib/actions/inference-set.test.ts` for negative private/DNS-private endpoint cases.
  • Evidence: `src/lib/actions/inference-set.ts:405` validates only URL shape/protocol/embedded credentials. `src/lib/sandbox/config.ts` imports `isPrivateHostname`/`isPrivateIp` and implements DNS validation and HTTP pinning. The inspected inference tests include custom-compatible positive metadata and `http://host.openshell.internal:18767/`, but no private/DNS-private rejection cases.
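
For illustration only, the kind of rejection PRA-4 asks for could look roughly like the sketch below. It is not the PR's code and does not reuse the real helpers in `src/lib/sandbox/config.ts`; `isNonPublicIpv4` and `assertPublicEndpointUrl` are hypothetical names. It covers literal hosts only, so the DNS-private and DNS-pinning cases named in the finding would still need the existing resolver-based policy.

```typescript
import { isIP } from "node:net";

// Hypothetical helper (not from the PR): true when a literal IPv4 address is
// "this network", loopback, link-local, or RFC1918 private space.
function isNonPublicIpv4(host: string): boolean {
  const octets = host.split(".").map(Number);
  const a = octets[0] ?? -1;
  const b = octets[1] ?? -1;
  return (
    a === 0 || // 0.0.0.0/8
    a === 10 || // RFC1918 10.0.0.0/8
    a === 127 || // loopback 127.0.0.0/8
    (a === 169 && b === 254) || // link-local 169.254.0.0/16
    (a === 172 && b >= 16 && b <= 31) || // RFC1918 172.16.0.0/12
    (a === 192 && b === 168) // RFC1918 192.168.0.0/16
  );
}

// Hypothetical validator: throws instead of returning a URL it cannot vouch for.
function assertPublicEndpointUrl(raw: string): URL {
  const url = new URL(raw); // throws TypeError on malformed input
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error(`unsupported protocol: ${url.protocol}`);
  }
  if (url.username !== "" || url.password !== "") {
    throw new Error("embedded credentials are not allowed");
  }
  // WHATWG URL keeps the brackets around IPv6 literals; strip them for isIP().
  const host = url.hostname.replace(/^\[|\]$/g, "");
  if (host === "localhost" || host.endsWith(".localhost")) {
    throw new Error(`loopback hostname rejected: ${host}`);
  }
  const family = isIP(host);
  if (family === 4 && isNonPublicIpv4(host)) {
    throw new Error(`non-public IPv4 address rejected: ${host}`);
  }
  if (family === 6) {
    // Simplification for this sketch: a real policy would classify ::1,
    // fe80::/10, fc00::/7, and IPv4-mapped addresses individually.
    throw new Error(`IPv6 literal rejected in this sketch: ${host}`);
  }
  // Mirror the described behavior of dropping query and fragment.
  url.search = "";
  url.hash = "";
  return url;
}
```

If a bridge host such as `host.openshell.internal` is intentionally supported, an explicit allowlist check placed before these rejections would keep that exception narrow and testable.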

PRA-5 Resolve/justify — Cover or document the literal `hermes gateway stop` reproduction

  • Location: test/e2e-scenario/live/hermes-e2e.test.ts:710
  • Category: acceptance
  • Problem: The reproduction step in issue #2426 ("Impossible to restart hermes gateway if it ever stops") is literally `$ hermes gateway stop`, followed by exiting and reconnecting. The live Hermes scenario creates a stopped gateway by TERM/KILL of the exact tracked gateway PID, then runs `nemohermes <name> recover`; I did not find a changed test invoking the Hermes CLI stop subcommand itself or an in-code equivalence note at that boundary.
  • Impact: If `hermes gateway stop` updates pidfiles, lock files, supervisor-visible state, or cleanup paths differently from direct PID termination, the reported user path can still diverge from the covered recovery path.
  • Recommended action: Add a focused live/integration step that invokes `hermes gateway stop` when available and then runs `nemohermes <name> recover`, or add an in-code test comment documenting why stopping the exact tracked PID is the supported equivalence under the new gateway-user boundary.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Search changed tests for the literal string `hermes gateway stop`, then read Phase 5 of `test/e2e-scenario/live/hermes-e2e.test.ts` around the stopped-gateway recovery block.
  • Missing regression test: Add `Hermes live recover restores a gateway stopped via the Hermes CLI stop command` or add an explicit equivalence assertion/comment beside the existing TERM/KILL stopped-gateway setup.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Search changed tests for the literal string `hermes gateway stop`, then read Phase 5 of `test/e2e-scenario/live/hermes-e2e.test.ts` around the stopped-gateway recovery block.
  • Evidence: The inspected Phase 5 block sends TERM/KILL to `afterGateway.pid`, asserts `GATEWAY_STOPPED`, kills auxiliaries, and runs `nemohermes <name> recover`; no literal `hermes gateway stop` invocation was found in the inspected changed test.

PRA-6 Resolve/justify — Identify runtime proof for root-only supervisor controls under capability drop

  • Location: Dockerfile:925
  • Category: tests
  • Problem: This PR adds a root-only gateway control helper, root-owned control directory protocol, and a stale-base fallback that adds `root` to the `sandbox` group so PID 1 can access mutable state after `CAP_DAC_OVERRIDE` is dropped. The inspected tests cover many source snippets and harnessed process identities, but I did not identify one concise runtime-image proof spanning the full privilege contract for both OpenClaw and Hermes images.
  • Impact: A source-shape test can miss image layering, ownership, supplementary-group, or capability-drop differences. If the installed helper or control directory is writable/invocable by sandbox/gateway users, or if step-down retains unintended groups, host-mediated restart can become a privilege or policy-bypass boundary.
  • Recommended action: Add or identify targeted runtime validation for both OpenClaw and Hermes images proving the installed helper and control directory modes, non-root refusal, root access without `CAP_DAC_OVERRIDE`, and supplementary-group reset after gateway/sandbox step-down.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect `test/sandbox-provisioning-helper-permissions.test.ts`, `test/startup-process-identity.test.ts`, `test/repro-2681-group-writable.test.ts`, and live image tests for a single runtime proof covering `/usr/local/bin/nemoclaw-gateway-control`, `/run/nemoclaw/gateway-control`, non-root invocation/write refusal, `CAP_DAC_OVERRIDE` absence, and `id -G` after step-down.
  • Missing regression test: Add `OpenClaw and Hermes images keep gateway-control root-only under CAP_DAC_OVERRIDE drop`, asserting `/usr/local/bin/nemoclaw-gateway-control` is `root:root 0700`, `/run/nemoclaw/gateway-control` is `root:root 0700` after startup, sandbox/gateway users cannot invoke the helper or write request/status files, PID 1 can still read/write required mutable config paths without `CAP_DAC_OVERRIDE`, and stepped-down gateway/sandbox commands do not retain root-only supplementary groups.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect `test/sandbox-provisioning-helper-permissions.test.ts`, `test/startup-process-identity.test.ts`, `test/repro-2681-group-writable.test.ts`, and live image tests for a single runtime proof covering `/usr/local/bin/nemoclaw-gateway-control`, `/run/nemoclaw/gateway-control`, non-root invocation/write refusal, `CAP_DAC_OVERRIDE` absence, and `id -G` after step-down.
  • Evidence: `Dockerfile` installs `nemoclaw-gateway-control` as root:root 0700 and adds root to the sandbox group in the stale-base fallback. Existing inspected tests assert copied file modes and process-identity harnesses, but the static inventory did not establish the full runtime image contract across both agents.
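
As a rough sketch of the mode and ownership half of the runtime proof PRA-6 describes, a test running as root inside the built image might assert something like the following. `isRootOnly0700` and `pathIsRootOnly` are hypothetical names, and the sketch does not cover non-root refusal, capability drop, or supplementary groups.

```typescript
import { lstatSync } from "node:fs";

// The subset of fs.Stats this check needs; a plain object works in unit tests.
interface OwnershipFacts {
  uid: number;
  gid: number;
  mode: number;
}

// Hypothetical predicate: owned by root:root with permission bits exactly 0700.
// Masking with 0o7777 also rejects setuid, setgid, and sticky bits.
function isRootOnly0700(facts: OwnershipFacts): boolean {
  return facts.uid === 0 && facts.gid === 0 && (facts.mode & 0o7777) === 0o700;
}

// Hypothetical wrapper: lstatSync does not follow symlinks, so a symlink at
// the path is judged on its own mode instead of silently passing through.
function pathIsRootOnly(path: string): boolean {
  return isRootOnly0700(lstatSync(path));
}
```

Inside the image, this could be pointed at `/usr/local/bin/nemoclaw-gateway-control` and `/run/nemoclaw/gateway-control`, the two paths the finding names.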

PRA-7 Resolve/justify — Make managed non-root restart limitations explicit in behavior and tests

  • Location: scripts/managed-gateway-control.py:17
  • Category: security
  • Problem: The managed OpenShell topology branch documents that the host enters through `docker exec --user root`, but the managed supervisor, gateway, and agent all share the `sandbox` UID, so process-shape proof cannot establish provenance against a malicious same-UID agent. It also documents that mutable managed configuration has no durable root-owned config hash and therefore is not equivalent to the root-PID-1 restart seal.
  • Impact: Maintainers or callers can over-trust the managed restart path if tests and user-visible behavior only say restart succeeded. In mutable managed mode, the helper can prove host authorization, PID reuse safety, validators, and health, but not same-UID process provenance or root-anchored mutable config integrity.
  • Recommended action: Keep the code fail-closed where it cannot prove root-owned locked config, and add an explicit behavior test or user-facing diagnostic/docs note that managed mutable restarts are not equivalent to the root-PID-1 seal. If stronger guarantees are required for this PR, gate managed mutable restart behind a locked-config/root-anchor proof instead.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Read the limitation comments and `_verify_locked_hermes_hash()` in `scripts/managed-gateway-control.py`, then inspect `test/managed-gateway-control.test.ts` for assertions that the mutable branch does not claim strict hash trust or equivalence with the root supervisor seal.
  • Missing regression test: Add `managed gateway control reports mutable same-UID restart as degraded and never treats mutable Hermes hash as a root trust anchor`, asserting the mutable branch skips strict hash trust, still runs secret-boundary validators, and surfaces or documents the degraded provenance contract.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Read the limitation comments and `_verify_locked_hermes_hash()` in `scripts/managed-gateway-control.py`, then inspect `test/managed-gateway-control.test.ts` for assertions that the mutable branch does not claim strict hash trust or equivalence with the root supervisor seal.
  • Evidence: `scripts/managed-gateway-control.py:17-29` states the host request boundary and same-UID limitation; lines around `_verify_locked_hermes_hash()` state that mutable managed config cannot be treated as a root trust anchor.
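
One way to make the degraded contract in PRA-7 visible to callers is to return an explicit trust level rather than a bare success flag. The sketch below only illustrates that idea; `RestartTrust`, `RestartFacts`, and `classifyRestartTrust` are hypothetical and do not describe what `scripts/managed-gateway-control.py` actually returns.

```typescript
// Hypothetical trust levels for a completed restart.
type RestartTrust = "root-sealed" | "degraded";

// Hypothetical inputs the helper is assumed to know at restart time.
interface RestartFacts {
  supervisorUid: number; // UID of the supervising process
  agentUid: number; // UID the agent runs as
  configHashRootOwned: boolean; // is the config hash anchored in root-owned, locked state?
}

function classifyRestartTrust(facts: RestartFacts): RestartTrust {
  // A same-UID agent can imitate process shape, so provenance needs a root
  // supervisor that is distinct from the agent's UID.
  const distinctRootSupervisor = facts.supervisorUid === 0 && facts.agentUid !== 0;
  // A mutable, non-root config hash must never act as a trust anchor.
  return distinctRootSupervisor && facts.configHashRootOwned ? "root-sealed" : "degraded";
}
```

A behavior test could then assert that the managed mutable branch yields `"degraded"` even when the restart itself succeeds.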

PRA-8 Resolve/justify — Shrink the central shields orchestration hotspot

  • Location: src/lib/shields/index.ts:1
  • Category: architecture
  • Problem: `src/lib/shields/index.ts` remains a very large central orchestration file and grew substantially in this PR, even though new transition/config/state lock modules were also introduced. The changed code is security-sensitive timer/transition/shields behavior, where monolithic coupling makes branch coverage, invariant review, and future fixes harder.
  • Impact: Security-critical shields/timer transitions become harder to audit for TOCTOU, stale callback, and lock-order bugs. Future fixes are more likely to accidentally weaken validation hidden in the central file.
  • Recommended action: Where local to this PR, extract cohesive Hermes/OpenClaw transition orchestration blocks from `index.ts` into narrow modules beside the new lock modules, or add focused in-code justification for why the remaining `index.ts` growth must stay coupled for this bug fix. Preserve trust-boundary validation at call sites and keep existing fail-closed tests intact.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Compare `src/lib/shields/index.ts` against the new `openclaw-config-lock.ts`, `state-dir-lock.ts`, and `transition-lock.ts` modules, and identify transition-specific functions that can move without changing behavior.
  • Missing regression test: Existing shields transition and timer tests should continue to cover the extracted paths; if extraction changes seams, add `shields transition orchestration preserves stale timer fail-closed behavior after module extraction`.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Compare `src/lib/shields/index.ts` against the new `openclaw-config-lock.ts`, `state-dir-lock.ts`, and `transition-lock.ts` modules, and identify transition-specific functions that can move without changing behavior.
  • Evidence: The drift context reports `src/lib/shields/index.ts` growing from 1611 to 3236 lines while new lock modules also exist. Prior advisor review flagged the same central hotspot and the current diff still leaves it expanded.

Workflow run details

This is an automated, non-binding review; it still expects maintainers and agents to respond to each required or warning item. Treat suggestions as current-PR improvements when they touch changed code; defer only with maintainer rationale or a linked follow-up. A human maintainer must make the final merge decision.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@github-actions

Selective E2E Results — ✅ All requested jobs passed

Run: 28253422289
Target ref: 2bfd6d931b1c81e2668ed9957454f02eb03b1157
Workflow ref: main
Requested jobs: hermes-root-entrypoint-smoke-e2e,hermes-secret-boundary-e2e
Summary: 2 passed, 0 failed, 0 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| hermes-root-entrypoint-smoke-e2e | ✅ success |
| hermes-secret-boundary-e2e | ✅ success |

@wscurran added the `integration: dcode` label (LangChain Deep Code integration behavior) on Jun 26, 2026
ericksoa added 2 commits June 26, 2026 10:26
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/actions/sandbox/process-recovery.ts`:
- Line 401: The restart script is being built with the wrong port value:
`buildHermesGatewayRestartScript` expects the Hermes health/API port for
`_NEMOCLAW_RESTART_HEALTH_PORT`, but `process-recovery.ts` is currently passing
`dashboardPort`. Update the call site to pass the actual Hermes health port from
the recovery context instead of the dashboard port, and make sure the value used
here matches the fixture’s health probe port so
`buildHermesGatewayRestartScript` launches socat/health recovery against the
correct endpoint.
- Around line 395-405: The Hermes recovery path in recoverSandboxProcesses still
prints failure details even when callers request quiet mode, so thread the quiet
flag into this helper and suppress the printGatewayRestartFailure calls when
quiet is true. Update the existing checkAndRecoverSandboxProcesses call site to
pass { quiet }, and use the recoverSandboxProcesses and
printGatewayRestartFailure symbols to locate the affected Hermes branch.

In `@src/lib/agent/runtime.ts`:
- Around line 364-367: The restart validation in runtime.ts should fail closed
instead of falling back to resolving Hermes from PATH. Update the logic in the
validationSteps construction so that if AGENT_BIN at binaryPath is missing or
not executable, it exits with AGENT_MISSING rather than calling command -v on
binaryName. Keep the check within the restart flow that relaunches as gateway,
and ensure the code only accepts the trusted binaryPath for locating the
executable.
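
The fail-closed lookup requested for `src/lib/agent/runtime.ts` can be sketched at the Node level as below. This is an assumption-laden illustration: the comment describes `validationSteps` and a `command -v` fallback, which suggests the real check is generated shell, and `AgentMissingError` and `requireTrustedBinary` are hypothetical names. The point is only that a missing or non-executable trusted path ends in `AGENT_MISSING` with no PATH search.

```typescript
import { accessSync, constants, statSync } from "node:fs";
import { isAbsolute } from "node:path";

// Hypothetical error type carrying the AGENT_MISSING code from the comment.
class AgentMissingError extends Error {
  readonly code = "AGENT_MISSING";
  constructor(binaryPath: string) {
    super(`agent binary missing or not executable: ${binaryPath}`);
    this.name = "AgentMissingError";
  }
}

// Accept only the trusted absolute path; never fall back to a PATH lookup.
function requireTrustedBinary(binaryPath: string): string {
  if (!isAbsolute(binaryPath)) {
    throw new AgentMissingError(binaryPath); // bare names would imply a PATH search
  }
  try {
    // Directories can also pass X_OK, so require a regular file first.
    if (!statSync(binaryPath).isFile()) {
      throw new Error("not a regular file");
    }
    accessSync(binaryPath, constants.X_OK);
  } catch {
    throw new AgentMissingError(binaryPath);
  }
  return binaryPath;
}
```

A caller would pass the configured binary path and treat any `AgentMissingError` as a hard stop for the restart flow.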

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e725fd75-9d5c-4add-9d95-ab5fbe819868

📥 Commits

Reviewing files that changed from the base of the PR and between 36c0bca and 2dfcdb8.

📒 Files selected for processing (4)
  • src/lib/actions/sandbox/process-recovery.ts
  • src/lib/agent/runtime.test.ts
  • src/lib/agent/runtime.ts
  • test/process-recovery.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/lib/agent/runtime.test.ts

Comment thread src/lib/actions/sandbox/process-recovery.ts Outdated
Comment thread src/lib/actions/sandbox/process-recovery.ts Outdated
Comment thread src/lib/agent/runtime.ts Outdated
@github-actions

Selective E2E Results — ✅ All requested jobs passed

Run: 28254560749
Target ref: 2dfcdb8a8dd9436465ca7a029f4a8a3f47beacf2
Workflow ref: main
Requested jobs: hermes-secret-boundary-e2e,hermes-root-entrypoint-smoke-e2e
Summary: 2 passed, 0 failed, 0 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| hermes-root-entrypoint-smoke-e2e | ✅ success |
| hermes-secret-boundary-e2e | ✅ success |

@github-actions

Selective E2E Results — ✅ All requested jobs passed

Run: 28254817942
Target ref: 5648819da358c58d53a46393a31d23713426ff45
Workflow ref: main
Requested jobs: hermes-root-entrypoint-smoke-e2e,hermes-secret-boundary-e2e
Summary: 2 passed, 0 failed, 0 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| hermes-root-entrypoint-smoke-e2e | ✅ success |
| hermes-secret-boundary-e2e | ✅ success |

ericksoa added 2 commits June 26, 2026 10:40
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@github-actions

Selective E2E Results — ✅ All requested jobs passed

Run: 28255266980
Target ref: d3431fd6d815259c99f98c9a0df647839b03e95a
Workflow ref: main
Requested jobs: hermes-secret-boundary-e2e
Summary: 1 passed, 0 failed, 0 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| hermes-secret-boundary-e2e | ✅ success |

@github-actions

Selective E2E Results — ✅ All requested jobs passed

Run: 28255490504
Target ref: b1b7451cce9dcbd9635fbf06b4cca2f7ef93ee48
Workflow ref: main
Requested jobs: hermes-secret-boundary-e2e,hermes-root-entrypoint-smoke-e2e
Summary: 2 passed, 0 failed, 0 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| hermes-root-entrypoint-smoke-e2e | ✅ success |
| hermes-secret-boundary-e2e | ✅ success |

@github-actions

Selective E2E Results — ✅ All requested jobs passed

Run: 28257003744
Target ref: 7fb4a89396383961879a2735d69c7530a3e6c8ab
Workflow ref: main
Requested jobs: hermes-secret-boundary-e2e,issue-2478-crash-loop-recovery-e2e
Summary: 2 passed, 0 failed, 0 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| hermes-secret-boundary-e2e | ✅ success |
| issue-2478-crash-loop-recovery-e2e | ✅ success |

@github-actions

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 28337415605
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: (default — all supported)
Requested jobs: cloud-onboard-vitest,hermes-e2e-vitest,hermes-root-entrypoint-smoke-vitest,hermes-sandbox-secret-boundary-vitest
Summary: 1 passed, 3 failed, 0 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| cloud-onboard-vitest | ✅ success |
| hermes-e2e-vitest | ❌ failure |
| hermes-root-entrypoint-smoke-vitest | ❌ failure |
| hermes-sandbox-secret-boundary-vitest | ❌ failure |

Failed jobs: hermes-e2e-vitest, hermes-root-entrypoint-smoke-vitest, hermes-sandbox-secret-boundary-vitest. Check run artifacts for logs.

@github-actions

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 28337418297
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: (default — all supported)
Requested jobs: openclaw-inference-switch-vitest,hermes-inference-switch-vitest,credential-sanitization-vitest,hermes-dashboard-vitest
Summary: 1 passed, 3 failed, 0 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| credential-sanitization-vitest | ✅ success |
| hermes-dashboard-vitest | ❌ failure |
| hermes-inference-switch-vitest | ❌ failure |
| openclaw-inference-switch-vitest | ❌ failure |

Failed jobs: hermes-dashboard-vitest, hermes-inference-switch-vitest, openclaw-inference-switch-vitest. Check run artifacts for logs.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@github-actions

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 28337416532
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: (default — all supported)
Requested jobs: gateway-guard-recovery,shields-config-vitest,rebuild-hermes-vitest,rebuild-openclaw-vitest
Summary: 1 passed, 3 failed, 0 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| gateway-guard-recovery | ❌ failure |
| rebuild-hermes-vitest | ❌ failure |
| rebuild-openclaw-vitest | ✅ success |
| shields-config-vitest | ❌ failure |

Failed jobs: gateway-guard-recovery, rebuild-hermes-vitest, shields-config-vitest. Check run artifacts for logs.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@github-actions

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 28338427259
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: (selector rejected by workflow validation)
Requested jobs: (default — all default-enabled free-standing jobs; explicit-only jobs such as jetson-nvmap-gpu-vitest and sandbox-rlimits-connect-vitest are skipped unless selected)
Summary: 0 passed, 1 failed, 0 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| generate-matrix | ❌ failure |

Failed jobs: generate-matrix. Check run artifacts for logs.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@github-actions

Selective E2E Results — ✅ All requested jobs passed

Run: 28338521527
Target ref: 99a6193aaaa38286cfb134b0fea560fb122fe333
Workflow ref: main
Requested jobs: hermes-root-entrypoint-smoke-e2e,hermes-secret-boundary-e2e
Summary: 2 passed, 0 failed, 0 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| hermes-root-entrypoint-smoke-e2e | ✅ success |
| hermes-secret-boundary-e2e | ✅ success |

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@github-actions

Vitest E2E Scenario Results — ⚠️ Some jobs cancelled — partial pass

Run: 28338428181
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: sandbox-rebuild,state-backup-restore,sandbox-survival,snapshot-commands
Requested jobs: (default — all default-enabled free-standing jobs; explicit-only jobs such as jetson-nvmap-gpu-vitest and sandbox-rlimits-connect-vitest are skipped unless selected)
Summary: 3 passed, 0 failed, 1 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| sandbox-rebuild-vitest | ⚠️ cancelled |
| sandbox-survival-vitest | ✅ success |
| snapshot-commands-vitest | ✅ success |
| state-backup-restore-vitest | ✅ success |

@github-actions

Vitest E2E Scenario Results — ⚠️ Some jobs cancelled — partial pass

Run: 28338429021
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: credential-sanitization,hermes-dashboard,hermes-inference-switch,openclaw-inference-switch
Requested jobs: (default — all default-enabled free-standing jobs; explicit-only jobs such as jetson-nvmap-gpu-vitest and sandbox-rlimits-connect-vitest are skipped unless selected)
Summary: 2 passed, 0 failed, 2 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| credential-sanitization-vitest | ✅ success |
| hermes-dashboard-vitest | ⚠️ cancelled |
| hermes-inference-switch-vitest | ⚠️ cancelled |
| openclaw-inference-switch-vitest | ✅ success |

@github-actions

Vitest E2E Scenario Results — ⚠️ Some jobs cancelled — partial pass

Run: 28338426375
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: cloud-onboard,hermes-root-entrypoint-smoke,hermes-sandbox-secret-boundary,hermes-e2e
Requested jobs: (default — all default-enabled free-standing jobs; explicit-only jobs such as jetson-nvmap-gpu-vitest and sandbox-rlimits-connect-vitest are skipped unless selected)
Summary: 3 passed, 0 failed, 1 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| cloud-onboard-vitest | ✅ success |
| hermes-e2e-vitest | ⚠️ cancelled |
| hermes-root-entrypoint-smoke-vitest | ✅ success |
| hermes-sandbox-secret-boundary-vitest | ✅ success |

@github-actions

Vitest E2E Scenario Results — ⚠️ Some jobs cancelled — partial pass

Run: 28338449956
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: (default — all supported)
Requested jobs: rebuild-openclaw-vitest,gateway-guard-recovery,shields-config-vitest,rebuild-hermes-vitest
Summary: 2 passed, 0 failed, 2 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| gateway-guard-recovery | ⚠️ cancelled |
| rebuild-hermes-vitest | ⚠️ cancelled |
| rebuild-openclaw-vitest | ✅ success |
| shields-config-vitest | ✅ success |

@github-actions

Selective E2E Results — ✅ All requested jobs passed

Run: 28338817268
Target ref: 21aba470668eeeef19191e819586594de63d3210
Workflow ref: main
Requested jobs: hermes-root-entrypoint-smoke-e2e,hermes-secret-boundary-e2e
Summary: 2 passed, 0 failed, 0 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| hermes-root-entrypoint-smoke-e2e | ✅ success |
| hermes-secret-boundary-e2e | ✅ success |

@github-actions

Vitest E2E Scenario Results — ✅ All selected jobs passed

Run: 28338697335
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: sandbox-rebuild,state-backup-restore,sandbox-survival,snapshot-commands
Requested jobs: (default — all default-enabled free-standing jobs; explicit-only jobs such as jetson-nvmap-gpu-vitest and sandbox-rlimits-connect-vitest are skipped unless selected)
Summary: 4 passed, 0 failed, 0 cancelled, 0 skipped

| Job | Result |
| --- | --- |
| sandbox-rebuild-vitest | ✅ success |
| sandbox-survival-vitest | ✅ success |
| snapshot-commands-vitest | ✅ success |
| state-backup-restore-vitest | ✅ success |

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
@github-actions

Vitest E2E Scenario Results — ⚠️ Some jobs cancelled — partial pass

Run: 28338698274
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: credential-sanitization,hermes-dashboard,hermes-inference-switch,openclaw-inference-switch
Requested jobs: (default — all default-enabled free-standing jobs; explicit-only jobs such as jetson-nvmap-gpu-vitest and sandbox-rlimits-connect-vitest are skipped unless selected)
Summary: 2 passed, 0 failed, 2 cancelled, 0 skipped

Job Result
credential-sanitization-vitest ✅ success
hermes-dashboard-vitest ⚠️ cancelled
hermes-inference-switch-vitest ⚠️ cancelled
openclaw-inference-switch-vitest ✅ success

@github-actions

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 28338696474
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: (default — all supported)
Requested jobs: rebuild-openclaw-vitest,gateway-guard-recovery,shields-config-vitest,rebuild-hermes-vitest
Summary: 2 passed, 1 failed, 1 cancelled, 0 skipped

Job Result
gateway-guard-recovery ❌ failure
rebuild-hermes-vitest ⚠️ cancelled
rebuild-openclaw-vitest ✅ success
shields-config-vitest ✅ success

Failed jobs: gateway-guard-recovery. Check run artifacts for logs.

@github-actions

Vitest E2E Scenario Results — ⚠️ Some jobs cancelled — partial pass

Run: 28338695721
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: cloud-onboard,hermes-root-entrypoint-smoke,hermes-sandbox-secret-boundary,hermes-e2e
Requested jobs: (default — all default-enabled free-standing jobs; explicit-only jobs such as jetson-nvmap-gpu-vitest and sandbox-rlimits-connect-vitest are skipped unless selected)
Summary: 3 passed, 0 failed, 1 cancelled, 0 skipped

Job Result
cloud-onboard-vitest ✅ success
hermes-e2e-vitest ⚠️ cancelled
hermes-root-entrypoint-smoke-vitest ✅ success
hermes-sandbox-secret-boundary-vitest ✅ success

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

@github-actions

Vitest E2E Scenario Results — ⚠️ Some jobs cancelled — partial pass

Run: 28339206619
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: cloud-onboard,hermes-root-entrypoint-smoke,hermes-sandbox-secret-boundary,hermes-e2e
Requested jobs: (default — all default-enabled free-standing jobs; explicit-only jobs such as jetson-nvmap-gpu-vitest and sandbox-rlimits-connect-vitest are skipped unless selected)
Summary: 3 passed, 0 failed, 1 cancelled, 0 skipped

Job Result
cloud-onboard-vitest ✅ success
hermes-e2e-vitest ⚠️ cancelled
hermes-root-entrypoint-smoke-vitest ✅ success
hermes-sandbox-secret-boundary-vitest ✅ success

@github-actions

Vitest E2E Scenario Results — ⚠️ Some jobs cancelled — partial pass

Run: 28339206659
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: (default — all supported)
Requested jobs: rebuild-openclaw-vitest,gateway-guard-recovery,shields-config-vitest,rebuild-hermes-vitest
Summary: 1 passed, 0 failed, 3 cancelled, 0 skipped

Job Result
gateway-guard-recovery ⚠️ cancelled
rebuild-hermes-vitest ⚠️ cancelled
rebuild-openclaw-vitest ⚠️ cancelled
shields-config-vitest ✅ success

@github-actions

Vitest E2E Scenario Results — ⚠️ Some jobs cancelled — partial pass

Run: 28339206629
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: credential-sanitization,hermes-dashboard,hermes-inference-switch,openclaw-inference-switch
Requested jobs: (default — all default-enabled free-standing jobs; explicit-only jobs such as jetson-nvmap-gpu-vitest and sandbox-rlimits-connect-vitest are skipped unless selected)
Summary: 2 passed, 0 failed, 2 cancelled, 0 skipped

Job Result
credential-sanitization-vitest ✅ success
hermes-dashboard-vitest ⚠️ cancelled
hermes-inference-switch-vitest ⚠️ cancelled
openclaw-inference-switch-vitest ✅ success

@github-actions

Vitest E2E Scenario Results — ⚠️ Some jobs cancelled — partial pass

Run: 28339206603
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: sandbox-rebuild,state-backup-restore,sandbox-survival,snapshot-commands
Requested jobs: (default — all default-enabled free-standing jobs; explicit-only jobs such as jetson-nvmap-gpu-vitest and sandbox-rlimits-connect-vitest are skipped unless selected)
Summary: 3 passed, 0 failed, 1 cancelled, 0 skipped

Job Result
sandbox-rebuild-vitest ⚠️ cancelled
sandbox-survival-vitest ✅ success
snapshot-commands-vitest ✅ success
state-backup-restore-vitest ✅ success

@github-actions

Vitest E2E Scenario Results — ✅ All selected jobs passed

Run: 28339472376
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: sandbox-rebuild,state-backup-restore,sandbox-survival,snapshot-commands
Requested jobs: (default — all default-enabled free-standing jobs; explicit-only jobs such as jetson-nvmap-gpu-vitest and sandbox-rlimits-connect-vitest are skipped unless selected)
Summary: 4 passed, 0 failed, 0 cancelled, 0 skipped

Job Result
sandbox-rebuild-vitest ✅ success
sandbox-survival-vitest ✅ success
snapshot-commands-vitest ✅ success
state-backup-restore-vitest ✅ success

@github-actions

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 28339472357
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: cloud-onboard,hermes-root-entrypoint-smoke,hermes-sandbox-secret-boundary,hermes-e2e
Requested jobs: (default — all default-enabled free-standing jobs; explicit-only jobs such as jetson-nvmap-gpu-vitest and sandbox-rlimits-connect-vitest are skipped unless selected)
Summary: 3 passed, 1 failed, 0 cancelled, 0 skipped

Job Result
cloud-onboard-vitest ✅ success
hermes-e2e-vitest ❌ failure
hermes-root-entrypoint-smoke-vitest ✅ success
hermes-sandbox-secret-boundary-vitest ✅ success

Failed jobs: hermes-e2e-vitest. Check run artifacts for logs.

@github-actions

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 28339472359
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: (default — all supported)
Requested jobs: rebuild-openclaw-vitest,gateway-guard-recovery,shields-config-vitest,rebuild-hermes-vitest
Summary: 2 passed, 2 failed, 0 cancelled, 0 skipped

Job Result
gateway-guard-recovery ❌ failure
rebuild-hermes-vitest ❌ failure
rebuild-openclaw-vitest ✅ success
shields-config-vitest ✅ success

Failed jobs: gateway-guard-recovery, rebuild-hermes-vitest. Check run artifacts for logs.

@github-actions

Vitest E2E Scenario Results — ❌ Some jobs failed

Run: 28339472411
Workflow ref: fix/issue-2426-gateway-restart
Requested scenarios: credential-sanitization,hermes-dashboard,hermes-inference-switch,openclaw-inference-switch
Requested jobs: (default — all default-enabled free-standing jobs; explicit-only jobs such as jetson-nvmap-gpu-vitest and sandbox-rlimits-connect-vitest are skipped unless selected)
Summary: 2 passed, 2 failed, 0 cancelled, 0 skipped

Job Result
credential-sanitization-vitest ✅ success
hermes-dashboard-vitest ❌ failure
hermes-inference-switch-vitest ❌ failure
openclaw-inference-switch-vitest ✅ success

Failed jobs: hermes-dashboard-vitest, hermes-inference-switch-vitest. Check run artifacts for logs.

ericksoa added 2 commits June 28, 2026 18:02
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Labels

area: cli (Command line interface, flags, terminal UX, or output)
area: docs (Documentation, examples, guides, or docs build)
area: packaging (Packages, images, registries, installers, or distribution)
area: sandbox (OpenShell sandbox lifecycle, runtime, config, or recovery)
area: security (Security controls, permissions, secrets, or hardening)
feature (PR adds or expands user-visible functionality)
integration: dcode (LangChain Deep Code integration behavior)
integration: hermes (Hermes integration behavior)
integration: openclaw (OpenClaw integration behavior)
v0.0.71 (Release target)

4 participants