Skip to content

fix(onboard): surface network-policy hint when OpenClaw plugin install fails during sandbox build#5835

Open
Dongni-Yang wants to merge 6 commits into
mainfrom
fix/4127-plugin-install-network-policy-hint
Open

fix(onboard): surface network-policy hint when OpenClaw plugin install fails during sandbox build#5835
Dongni-Yang wants to merge 6 commits into
mainfrom
fix/4127-plugin-install-network-policy-hint

Conversation

@Dongni-Yang

@Dongni-Yang Dongni-Yang commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Summary

When `openshell sandbox create` runs and the Docker build fails at the `openclaw plugins install` RUN step due to a network/egress reachability error, NemoClaw previously fell through to the generic "Recovery: nemoclaw onboard --resume" message with no indication of why the build failed.

This PR adds a new `plugin_install_network_denied` failure kind to `classifySandboxCreateFailure()` and a targeted hint in `printSandboxCreateRecoveryHints()` that:

  • Names the failing step (OpenClaw plugin-install)
  • Points to the most likely cause (sandbox network policy blocking npm/ClawHub egress)
  • Suggests two remedies: relax the preset that requires egress, or disable the feature that triggers the install (e.g. `NEMOCLAW_WEB_SEARCH_ENABLED=0`)
  • Continues to surface `nemoclaw onboard --resume` as the recovery path

The classifier requires both the Docker failed-command block embedding the plugin-install step and a network/egress reachability error (`ENOTFOUND`, `EAI_AGAIN`, `ECONNREFUSED`, `ETIMEDOUT`, etc.) within the text up to and including the Docker error boundary. Scoping the network-evidence search to that window prevents a network error from an unrelated later RUN block — or from an npm script that runs after a successful plugin install in the same block — from producing a misleading hint. The `npm error` prefix further anchors evidence to npm's own output format.

Changes

  • `src/lib/validation.ts`: add `"plugin_install_network_denied"` to the `SandboxCreateFailure.kind` union; add classifier case requiring plugin-install failed-command block + network-error evidence within the output prefix up to that block
  • `src/lib/build-context.ts`: add targeted hint handler in `printSandboxCreateRecoveryHints()`
  • `src/lib/validation.test.ts`: positive tests for ENOTFOUND (npm registry) and ECONNREFUSED (ClawHub) paths; negative tests for non-network E404 failure, step-header false positive, doctor-fix error format, and post-boundary npm error
  • `src/lib/build-context.test.ts`: direct coverage for the new hint branch asserting key user-visible strings

Test plan

  • `npx vitest run --project cli src/lib/validation.test.ts src/lib/build-context.test.ts` — 85 tests pass
  • `npm run typecheck:cli` — clean
  • `npx @biomejs/biome lint src/lib/validation.ts src/lib/build-context.ts` — clean
  • New tests were RED on unmodified `main` (stash confirmed) and GREEN after the fix
  • Non-network plugin-install failure (E404 package-not-found) correctly returns `"unknown"` — regression guard for the advisor's false-positive concern

Fixes #4127

Signed-off-by: Dongni Yang dongniy@nvidia.com

…the openclaw plugins install step

When `openshell sandbox create` fails because the Dockerfile's
`openclaw plugins install` RUN step exits non-zero, the build output
contains the distinctive strings "openclaw plugins install" or
"npm:@openclaw/", but `classifySandboxCreateFailure` had no branch for
this pattern and fell through to `kind="unknown"`. The resulting hint
was the generic "nemoclaw onboard --resume" line with no indication of
the likely cause.

Add a `"plugin_install_network_denied"` kind to `SandboxCreateFailure`
and a classifier that matches those strings in the openshell create
output. `printSandboxCreateRecoveryHints` now emits a targeted message
noting that network policy may be blocking outbound access to ClawHub
or the npm registry, and suggests disabling the feature (e.g.
`NEMOCLAW_WEB_SEARCH_ENABLED=0`) as an alternative if a preset isn't
available.

Fixes #4127

Signed-off-by: Dongni Yang <dongniy@nvidia.com>
The previous regex matched `openclaw plugins install|npm:@openclaw/`
anywhere in the captured output, which would mis-classify a build where
that step succeeded and a later step failed — the Docker step header
(`Step N: RUN openclaw plugins install ...`) would match even though the
failure came from an unrelated subsequent RUN step.

Tighten to the Docker error block format:
  The command '...<plugin text>...' returned a non-zero code

`[^']*` is used (not `[^\n]*`) because the embedded shell command often
spans multiple lines (chained commands joined with semicolons), and JS
character classes match newlines. Add a regression test that verifies
the step-header false-positive is rejected.

Refs #4127

Signed-off-by: Dongni Yang <dongniy@nvidia.com>
…n-install failure as network-denied

The previous classifier fired on any failed `openclaw plugins install`
Docker command, including package-not-found (HTTP 404 from the registry),
version conflicts, and auth errors — all of which would receive a
misleading "network policy blocking egress" hint.

Narrow the match to require both:
1. The Docker error block anchored to the failed plugin-install step
2. A network/egress reachability error (ENOTFOUND, EAI_AGAIN,
   ECONNREFUSED, ETIMEDOUT, fetch failed, etc.) in the captured output

This ensures the hint "your sandbox network policy may be blocking
outbound plugin-install access" is only shown when the underlying
failure is actually a network reachability problem.

Test changes:
- Updated positive tests to include ENOTFOUND / ECONNREFUSED output
  matching real npm network error messages (registry.npmjs.org and
  ClawHub paths respectively)
- Added negative test: same Docker failed-command block but E404
  package-not-found → classifies as "unknown"
- Added direct build-context.test.ts coverage for the new hint branch,
  asserting the key user-visible strings (plugin-install step, ClawHub,
  npm registry, network policy, NEMOCLAW_WEB_SEARCH_ENABLED=0,
  onboard --resume)

Refs #4127

Signed-off-by: Dongni Yang <dongniy@nvidia.com>
@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

Adds a new sandbox-create failure kind for plugin-install network denial, classifies matching OpenClaw install errors, and prints recovery hints that mention the plugin-install step, network policy, and onboard --resume.

Changes

OpenClaw plugin install network denial

Layer / File(s) Summary
Failure kind and classifier
src/lib/validation.ts, src/lib/validation.test.ts
Adds plugin_install_network_denied to sandbox-create failures and classifies plugin-install failures when network/DNS-style error text is present; tests cover matching and non-matching cases.
Recovery hints and resume path
src/lib/build-context.ts, src/lib/build-context.test.ts
Prints targeted recovery guidance for the new failure kind, including the plugin-install step context and onboard --resume, with a test for the emitted text.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Suggested labels

integration: openclaw, area: onboarding, area: sandbox, bug-fix

Suggested reviewers

  • cv
  • ericksoa

Poem

A rabbit hopped through logs so bright,
and found a network-denied night.
“ClawHub’s far,” said I with flair,
“but onboard --resume gets us there!”
🐇✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: a network-policy hint for OpenClaw plugin-install sandbox build failures.
Linked Issues check ✅ Passed The PR adds targeted diagnostics for plugin-install egress denials and covers both npm registry and ClawHub network failures as requested.
Out of Scope Changes check ✅ Passed No unrelated functionality is introduced beyond the plugin-install failure diagnostics, classifier, and tests.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/4127-plugin-install-network-policy-hint

Comment @coderabbitai help to get the list of available commands.

@github-code-quality

github-code-quality Bot commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Code Coverage Overview

Languages: TypeScript

TypeScript / code-coverage/plugin

The overall coverage in the branch is 96%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
File 9d8df94 +/-
nemoclaw/src/se...cret-scanner.ts 100%
nemoclaw/src/commands/slash.ts 100%
nemoclaw/src/li...bprocess-env.ts 100%
nemoclaw/src/bl...eprint/state.ts 98%
nemoclaw/src/onboard/config.ts 98%
nemoclaw/src/bl...int/snapshot.ts 97%
nemoclaw/src/bl...print/runner.ts 95%
nemoclaw/src/co...ration-state.ts 94%
nemoclaw/src/bl...ate-networks.ts 94%
nemoclaw/src/index.ts 94%

TypeScript / code-coverage/cli

The overall coverage in the branch is 47%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
File 9d8df94 +/-
src/lib/state/o...oard-session.ts 91%
src/lib/actions...dbox/rebuild.ts 72%
src/lib/sandbox/config.ts 72%
src/lib/onboard/preflight.ts 62%
src/lib/shields/index.ts 62%
src/lib/actions...licy-channel.ts 60%
src/lib/state/sandbox.ts 56%
src/lib/policy/index.ts 48%
src/lib/onboard...er-gpu-patch.ts 47%
src/lib/onboard.ts 19%

Updated June 26, 2026 07:00 UTC
Code Coverage is in Public Preview. Learn more and provide us with your feedback.

@github-actions

github-actions Bot commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor — Changes requested

Merge posture: Do not merge yet
Primary next action: Resolve or justify PRA-1: Source-of-truth review needed: Plugin-install network-denial classifier and recovery hint in `classifySandboxCreateFailure()` / `printSandboxCreateRecoveryHints()`.
Open items: 0 required · 2 warnings · 0 suggestions · 6 test follow-ups
Since last review: 0 prior items resolved · 2 still apply · 0 new items found

Action checklist

  • PRA-1 Resolve or justify: Source-of-truth review needed: Plugin-install network-denial classifier and recovery hint in `classifySandboxCreateFailure()` / `printSandboxCreateRecoveryHints()`
  • PRA-2 Resolve or justify: Bind npm network evidence to the plugin-install failure, not the whole Docker-log prefix in src/lib/validation.ts:151
  • PRA-T1 Add or justify test follow-up: Runtime validation
  • PRA-T2 Add or justify test follow-up: Runtime validation
  • PRA-T3 Add or justify test follow-up: Acceptance clause
  • PRA-T4 Add or justify test follow-up: Acceptance clause
  • PRA-T5 Add or justify test follow-up: Acceptance clause
  • PRA-T6 Add or justify test follow-up: Plugin-install network-denial classifier and recovery hint in `classifySandboxCreateFailure()` / `printSandboxCreateRecoveryHints()`

Findings index

ID Severity Category Location Required action
PRA-1 Resolve/justify architecture Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
PRA-2 Resolve/justify security src/lib/validation.ts:151 Scope the network predicate to evidence associated with the plugin-install failure itself. For example, extract the npm error block immediately adjacent to the failed plugin-install invocation, require the failed command to be the plugin-install command rather than any multi-command RUN block containing it, or correlate the npm error package URL with the `npm:@openclaw/...` package in the failed command.
Review findings by urgency: 0 required fixes, 2 items to resolve/justify, 0 in-scope improvements

⚠️ Resolve or justify before merge

Investigate these in the current review; either fix them, explain why they are not applicable, or document the accepted risk.

PRA-1 Resolve/justify — Source-of-truth review needed: Plugin-install network-denial classifier and recovery hint in `classifySandboxCreateFailure()` / `printSandboxCreateRecoveryHints()`

  • Location: not file-specific
  • Category: architecture
  • Problem: The advisor marked localized patch analysis as needs_followup.
  • Impact: A localized workaround can preserve or hide an invalid state when the source boundary is unclear.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Missing regression test: Existing tests cover npm-registry ENOTFOUND, ClawHub ECONNREFUSED, non-network E404, later Docker step failure, non-npm network text after a same-RUN block, post-boundary npm network text, and the user-visible hint. Missing coverage is pre-boundary unrelated npm network output in the same RUN block.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Evidence: The code comment claims the prefix search prevents a different npm script in the same RUN block from triggering the hint, but `segment` includes all process output before Docker's final summary, where that later npm script's stderr would normally appear.

PRA-2 Resolve/justify — Bind npm network evidence to the plugin-install failure, not the whole Docker-log prefix

  • Location: src/lib/validation.ts:151
  • Category: security
  • Problem: The updated classifier no longer scans the entire output, but it still builds `segment` from the beginning of the Docker/OpenShell log through the failed command summary. That segment can include unrelated `npm error ... ENOTFOUND` output emitted before Docker prints `The command ... returned a non-zero code`, including from a later npm command in the same RUN block after `openclaw plugins install` succeeded. In that case the code would still return `plugin_install_network_denied` for a non-plugin-install network failure.
  • Impact: A false positive can print guidance to allow ClawHub/npm egress even when plugin installation did not fail because of sandbox network policy. Because sandbox network policy is a security boundary, misleading recovery text can push users toward broader egress than necessary.
  • Recommended action: Scope the network predicate to evidence associated with the plugin-install failure itself. For example, extract the npm error block immediately adjacent to the failed plugin-install invocation, require the failed command to be the plugin-install command rather than any multi-command RUN block containing it, or correlate the npm error package URL with the `npm:@openclaw/...` package in the failed command.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Read `classifySandboxCreateFailure()` at `src/lib/validation.ts:146-158` and confirm the npm network-error regex no longer runs against `text.slice(0, failedCommandEnd)`. It should operate on a bounded plugin-install/npm-error block or an equivalent correlation.
  • Missing regression test: Add `classifySandboxCreateFailure does NOT classify a later npm command in the same RUN block when its npm ENOTFOUND output appears before Docker's failed-command summary`, with output ordered as npm ENOTFOUND lines first, then `The command '/bin/bash ... openclaw plugins install ...; npm run doctor-fix;' returned a non-zero code: 1`.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Read `classifySandboxCreateFailure()` at `src/lib/validation.ts:146-158` and confirm the npm network-error regex no longer runs against `text.slice(0, failedCommandEnd)`. It should operate on a bounded plugin-install/npm-error block or an equivalent correlation.
  • Evidence: `validation.ts` computes `segment = text.slice(0, pluginInstallErrorMatch.index + pluginInstallErrorMatch[0].length)` and then tests `/npm error.*(?:ENOTFOUND|...)/i` against that whole prefix. The added negative test for the later npm command places npm error lines after the Docker failed-command boundary, so it does not cover the common log ordering where process stderr precedes Docker's final failed-command summary.

💡 In-scope improvements

These are lower-risk, not throwaway. Prefer fixing them in this PR when they are local to changed code; defer only with rationale or a linked follow-up.

  • None.
Test follow-ups to resolve or justify

If these cover changed behavior, prefer adding them in this PR; otherwise state why existing coverage is enough or link the follow-up.

  • PRA-T1 Runtime validation — Add `classifySandboxCreateFailure does NOT classify a later npm command in the same RUN block when its npm ENOTFOUND output appears before Docker's failed-command summary`.. The changed source is pure and has direct unit coverage, but it classifies infrastructure-facing Docker/OpenShell output and prints sandbox network-policy guidance. The unit tests are good for common paths, yet a real or realistically ordered log fixture is important because output ordering determines false positives.
  • PRA-T2 Runtime validation — Add or identify a captured real Docker/OpenShell plugin-install network-denial fixture showing the actual ordering of npm error lines and Docker's `The command ... returned a non-zero code` summary.. The changed source is pure and has direct unit coverage, but it classifies infrastructure-facing Docker/OpenShell output and prints sandbox network-policy guidance. The unit tests are good for common paths, yet a real or realistically ordered log fixture is important because output ordering determines false positives.
  • PRA-T3 Acceptance clausefix(policy): allow OpenClaw plugin npm registry access #4125 fixes the default sandbox policy so `openclaw plugins install` can reach the intended ClawHub/npm registry paths from the default restricted baseline. The original issue also suggested improving the user-facing error when network policy blocks plugin-install egress. — add test evidence or identify existing coverage. `build-context.ts` adds a clearer error path naming ClawHub/npm registry and sandbox network policy. The remaining partial gap is classifier precision: npm network evidence is still accepted from the whole pre-summary prefix, so some non-plugin-install failures can receive the network-policy hint.
  • PRA-T4 Acceptance clause — When `openclaw plugins install <package>` cannot reach ClawHub or the npm registry because sandbox network policy denies egress, surface a clearer diagnostic instead of only the lower-level fetch/socket failure. — add test evidence or identify existing coverage. `validation.ts` adds `plugin_install_network_denied`; `build-context.ts` prints `The sandbox Docker build failed at the OpenClaw plugin-install step`, `Could not reach ClawHub or the npm registry`, and `sandbox network policy`. Tests cover npm-registry ENOTFOUND, ClawHub ECONNREFUSED, and user-visible hint text. The false-positive path described in the finding keeps this clause partial because the clearer diagnostic is not yet reliably limited to this condition.
  • PRA-T5 Acceptance clause — Related open issues: - [[NemoClaw][All Platforms] Default sandbox network policy blocks clawhub, preventing openclaw plugins install from succeeding #4104 [NemoClaw][All Platforms] Default sandbox network policy blocks clawhub, preventing openclaw plugins install from succeeding](https://github.com/NVIDIA/NemoClaw/issues/4104\) — add test evidence or identify existing coverage. The new classifier and hint target the [NemoClaw][All Platforms] Default sandbox network policy blocks clawhub, preventing openclaw plugins install from succeeding #4104 failure mode, with tests for npm registry and ClawHub paths. Precision remains partial until the network evidence is correlated to the plugin-install failure rather than the whole pre-summary output prefix.
  • PRA-T6 Plugin-install network-denial classifier and recovery hint in `classifySandboxCreateFailure()` / `printSandboxCreateRecoveryHints()` — Existing tests cover npm-registry ENOTFOUND, ClawHub ECONNREFUSED, non-network E404, later Docker step failure, non-npm network text after a same-RUN block, post-boundary npm network text, and the user-visible hint. Missing coverage is pre-boundary unrelated npm network output in the same RUN block.. The code comment claims the prefix search prevents a different npm script in the same RUN block from triggering the hint, but `segment` includes all process output before Docker's final summary, where that later npm script's stderr would normally appear.
Since last review details

Current findings, using the urgency labels above:

PRA-1 Resolve/justify — Source-of-truth review needed: Plugin-install network-denial classifier and recovery hint in `classifySandboxCreateFailure()` / `printSandboxCreateRecoveryHints()`

  • Location: not file-specific
  • Category: architecture
  • Problem: The advisor marked localized patch analysis as needs_followup.
  • Impact: A localized workaround can preserve or hide an invalid state when the source boundary is unclear.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Missing regression test: Existing tests cover npm-registry ENOTFOUND, ClawHub ECONNREFUSED, non-network E404, later Docker step failure, non-npm network text after a same-RUN block, post-boundary npm network text, and the user-visible hint. Missing coverage is pre-boundary unrelated npm network output in the same RUN block.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Evidence: The code comment claims the prefix search prevents a different npm script in the same RUN block from triggering the hint, but `segment` includes all process output before Docker's final summary, where that later npm script's stderr would normally appear.

PRA-2 Resolve/justify — Bind npm network evidence to the plugin-install failure, not the whole Docker-log prefix

  • Location: src/lib/validation.ts:151
  • Category: security
  • Problem: The updated classifier no longer scans the entire output, but it still builds `segment` from the beginning of the Docker/OpenShell log through the failed command summary. That segment can include unrelated `npm error ... ENOTFOUND` output emitted before Docker prints `The command ... returned a non-zero code`, including from a later npm command in the same RUN block after `openclaw plugins install` succeeded. In that case the code would still return `plugin_install_network_denied` for a non-plugin-install network failure.
  • Impact: A false positive can print guidance to allow ClawHub/npm egress even when plugin installation did not fail because of sandbox network policy. Because sandbox network policy is a security boundary, misleading recovery text can push users toward broader egress than necessary.
  • Recommended action: Scope the network predicate to evidence associated with the plugin-install failure itself. For example, extract the npm error block immediately adjacent to the failed plugin-install invocation, require the failed command to be the plugin-install command rather than any multi-command RUN block containing it, or correlate the npm error package URL with the `npm:@openclaw/...` package in the failed command.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Read `classifySandboxCreateFailure()` at `src/lib/validation.ts:146-158` and confirm the npm network-error regex no longer runs against `text.slice(0, failedCommandEnd)`. It should operate on a bounded plugin-install/npm-error block or an equivalent correlation.
  • Missing regression test: Add `classifySandboxCreateFailure does NOT classify a later npm command in the same RUN block when its npm ENOTFOUND output appears before Docker's failed-command summary`, with output ordered as npm ENOTFOUND lines first, then `The command '/bin/bash ... openclaw plugins install ...; npm run doctor-fix;' returned a non-zero code: 1`.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Read `classifySandboxCreateFailure()` at `src/lib/validation.ts:146-158` and confirm the npm network-error regex no longer runs against `text.slice(0, failedCommandEnd)`. It should operate on a bounded plugin-install/npm-error block or an equivalent correlation.
  • Evidence: `validation.ts` computes `segment = text.slice(0, pluginInstallErrorMatch.index + pluginInstallErrorMatch[0].length)` and then tests `/npm error.*(?:ENOTFOUND|...)/i` against that whole prefix. The added negative test for the later npm command places npm error lines after the Docker failed-command boundary, so it does not cover the common log ordering where process stderr precedes Docker's final failed-command summary.

Workflow run details

This is an automated, non-binding review; it still expects maintainers and agents to respond to each required or warning item. Treat suggestions as current-PR improvements when they touch changed code; defer only with maintainer rationale or a linked follow-up. A human maintainer must make the final merge decision.

@github-actions

github-actions Bot commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: onboard-negative-paths-e2e, cloud-onboard-e2e
Optional E2E: network-policy-e2e

Dispatch hint: onboard-negative-paths-e2e,cloud-onboard-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • onboard-negative-paths-e2e: This change affects onboarding negative/failure-path behavior and operator recovery output. Run the existing onboarding negative-path E2E to validate failure handling still exits cleanly and remains actionable.
  • cloud-onboard-e2e: The touched code is in the sandbox build/create recovery path used by full hosted onboarding. This E2E validates install-to-onboard integration with custom npm/pypi policy presets and a real sandbox build/create path.

Optional E2E

  • network-policy-e2e: Useful adjacent confidence because the new diagnostic is about plugin-install egress being blocked by network policy; this E2E exercises restricted policy behavior and allowed package-registry egress.

New E2E recommendations

  • plugin-install-network-denied-onboard-recovery (medium): Existing E2Es do not appear to intentionally deny npm/ClawHub during the OpenClaw plugin-install Docker build step and assert that onboarding prints the new plugin-install network-policy recovery guidance.
    • Suggested test: Add a targeted onboarding negative-path E2E that runs sandbox creation with web-search/plugin install enabled under a policy or network shim that blocks registry.npmjs.org/ClawHub, then asserts the classifier emits the plugin-install network-policy hint and recommends nemoclaw onboard --resume.

Dispatch hint

  • Workflow: .github/workflows/nightly-e2e.yaml
  • jobs input: onboard-negative-paths-e2e,cloud-onboard-e2e

@github-actions

github-actions Bot commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: None
Optional Vitest E2E scenarios: None

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

  • None. Changes are limited to sandbox-create failure classification and user-facing recovery hints for a plugin-install network-denial error, with accompanying unit tests. The live Vitest scenario workflow does not have a deterministic typed scenario that forces the OpenClaw plugin-install network-policy failure path, so a scenario dispatch would not prove this changed surface.

Optional Vitest E2E scenarios

  • None.

Relevant changed files

  • src/lib/build-context.ts
  • src/lib/validation.ts

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/validation.ts`:
- Around line 144-149: The classifier in validation.ts is too broad because it
matches any Docker RUN block containing openclaw plugins install, even when a
later command in the same block fails. Tighten the logic in the network-failure
branch so it only classifies plugin_install_network_denied when the failing
subcommand itself shows plugin-fetch/install evidence, using the existing
validation.ts matching and the relevant openclaw plugins install / npm:`@openclaw`
patterns as anchors. Update validation.test.ts with a regression case where
plugin install succeeds but a later command in the same shell block fails, and
verify it no longer returns the plugin install recovery hint.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b142153c-913c-41f6-bb87-87ba3729f360

📥 Commits

Reviewing files that changed from the base of the PR and between 8e047a6 and 563708c.

📒 Files selected for processing (4)
  • src/lib/build-context.test.ts
  • src/lib/build-context.ts
  • src/lib/validation.test.ts
  • src/lib/validation.ts

Comment thread src/lib/validation.ts Outdated
Comment on lines +144 to +149
if (
/The command '[^']*(?:openclaw plugins install|npm:@openclaw\/)[^']*'\s*returned a non-zero code/i.test(
text,
) &&
/ENOTFOUND|EAI_AGAIN|ECONNREFUSED|ETIMEDOUT|ESOCKETTIMEDOUT|network request.*failed|getaddrinfo|fetch failed|socket hang up|network timeout/i.test(
text,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Anchor this classifier to the failing subcommand, not just the enclosing Docker block.

Line 145 only proves the quoted RUN block contains openclaw plugins install; it does not prove that plugin install was the step that failed. The fixture in src/lib/validation.test.ts already shows the same block also runs openclaw doctor --fix, so a later network failure inside that shell block would still classify as plugin_install_network_denied and emit the wrong recovery hint. Please narrow this to plugin-fetch evidence or otherwise isolate the failing subcommand, and add a regression test for “plugin install succeeded, later command in the same block failed”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/validation.ts` around lines 144 - 149, The classifier in
validation.ts is too broad because it matches any Docker RUN block containing
openclaw plugins install, even when a later command in the same block fails.
Tighten the logic in the network-failure branch so it only classifies
plugin_install_network_denied when the failing subcommand itself shows
plugin-fetch/install evidence, using the existing validation.ts matching and the
relevant openclaw plugins install / npm:`@openclaw` patterns as anchors. Update
validation.test.ts with a regression case where plugin install succeeds but a
later command in the same shell block fails, and verify it no longer returns the
plugin install recovery hint.

…false positives from later RUN block commands

The previous network-error predicate matched any ENOTFOUND/ECONNREFUSED/etc.
in the captured output, which could fire the plugin-install hint when the
plugin install itself succeeded but a later command in the same RUN block
(e.g. `openclaw doctor --fix`) failed with a network error. npm's error
output consistently prefixes every line with "npm error", whereas commands
run after a successful install produce different error formats. Requiring
"npm error" + network pattern anchors the evidence to the npm installer
specifically, ruling out later-command false positives.

Add a regression test: the same RUN block runs plugin install (succeeds)
then openclaw doctor (fails with ENOTFOUND but no "npm error" prefix) →
correctly returns "unknown".

Refs #4127

Signed-off-by: Dongni Yang <dongniy@nvidia.com>
…get divergence

Refs #4127

Signed-off-by: Dongni Yang <dongniy@nvidia.com>
Bound the network-evidence regex to the text up to and including the
failed-plugin-install Docker error block rather than scanning the entire
output. This prevents an npm script that runs after a successful plugin
install in the same RUN block from producing a false
plugin_install_network_denied classification when that later script emits
an npm-prefixed network error.

Adds a regression test for this case (npm script post-install in the
same RUN block emits npm error ENOTFOUND after the Docker boundary).

Refs #4127

Signed-off-by: Dongni Yang <dongniy@nvidia.com>
@Dongni-Yang

Copy link
Copy Markdown
Contributor Author

Addressing PRA-2 / CodeRabbit (windowed npm-error search):

Both PRA-2 and the CodeRabbit comment correctly identified that scanning the entire output for npm error network evidence could misfire if an unrelated npm script in the same RUN block emits a network error after the plugin-install step has already succeeded.

Fixed in the latest commit (9d8df9496): the network-evidence regex now searches only the text up to and including the Docker The command '...' returned a non-zero code boundary, not the full output. This means npm errors that appear after the Docker error line (i.e. from a later command whose errors are streamed after Docker reports the block failure) are excluded from the classification window.

A regression test was added for this case:

"The command '...openclaw plugins install...' returned a non-zero code: 1"
"npm error code ENOTFOUND"       ← appears after the Docker boundary → excluded
"npm error network request ..."  ← same
→ kind: "unknown"  ✓

Addressing PRA-1 (source-of-truth review):

The classification is intentionally localized to NemoClaw's classifier layer. The source-of-truth facts:

  • Invalid state: Docker build fails at the plugin-install RUN block with both (a) a The command '...' returned a non-zero code boundary containing openclaw plugins install or npm:@openclaw/, and (b) an npm error network error in the prefix up to that boundary.
  • Source boundary: classifySandboxCreateFailure() in src/lib/validation.ts — a pure function with no I/O, so the behavior is fully unit-testable without Docker.
  • Source-fix constraint: The upstream fix would be OpenShell exposing the failing RUN-block step name in structured error output so the classifier doesn't need regex parsing. Until that lands, the localized pattern match is the best available signal.
  • Regression tests: 7 tests total covering the positive path (ENOTFOUND, ECONNREFUSED against npm/ClawHub registries), 5 negative paths (unrelated failures, non-network E404, step-header false positive, doctor-fix format, post-boundary npm error).
  • Removal condition: when OpenShell exposes a structured failedStep field in sandbox create failure output, this classifier branch can be replaced with a direct field check.

@Dongni-Yang Dongni-Yang added the v0.0.69 Release target label Jun 26, 2026
@wscurran wscurran added area: onboarding Onboarding FSM, provider setup, sandbox launch, or first-run flow bug-fix PR fixes a bug or regression integration: openclaw OpenClaw integration behavior labels Jun 26, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 28261980087
Target ref: fix/4127-plugin-install-network-policy-hint
Requested jobs: onboard-negative-paths-e2e,cloud-onboard-e2e
Summary: 2 passed, 0 failed, 0 cancelled, 0 skipped

Job Result
cloud-onboard-e2e ✅ success
onboard-negative-paths-e2e ✅ success

@cv cv added v0.0.71 Release target and removed v0.0.69 Release target labels Jun 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: onboarding Onboarding FSM, provider setup, sandbox launch, or first-run flow bug-fix PR fixes a bug or regression integration: openclaw OpenClaw integration behavior v0.0.71 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve OpenClaw plugin install network-policy denial diagnostics

4 participants