Skip to content

fix(rebuild): clear stale policy presets from registry after rebuild#5856

Open
laitingsheng wants to merge 3 commits into
mainfrom
fix/5847-channels-stop-preset-registry-cleanup
Open

fix(rebuild): clear stale policy presets from registry after rebuild#5856
laitingsheng wants to merge 3 commits into
mainfrom
fix/5847-channels-stop-preset-registry-cleanup

Conversation

@laitingsheng

@laitingsheng laitingsheng commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Summary

registry.policies was append-only via applyPreset, so disabling a messaging channel and rebuilding left the disabled channel's preset stale in the registry even though the gateway state was correctly pruned. Reapply loop now records its effective output and Step 7 writes it back, so policy-list reflects the live applied set.

Related Issue

Fixes #5847

Changes

  • rebuild.ts: hoist restoredPresets from the inner reapply block and fold policies: restoredPresets into the Step 7 registry.updateSandbox call so the registry always mirrors what was just reapplied to the gateway. Empty savedPresets correctly clears stale entries.
  • test/e2e-scenario/live/channels-stop-start-helpers.ts: assert policyPresetActive(channel) === false after stop+rebuild and === true after start+rebuild — closes the assertion gap that allowed the regression to ship.
  • test/e2e/test-channels-stop-start.sh: mirror the post-stop / post-start preset assertions in the legacy bash E2E.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Quality Gates

  • Tests added or updated for changed behavior
  • Existing tests cover changed behavior — justification:
  • Tests not applicable — justification:
  • Docs updated for user-facing behavior changes
  • Docs not applicable — justification: defect fix, user-visible flag/command surface unchanged
  • Sensitive paths changed (security, policy, credentials, preflight, onboarding, inference, runner, sandbox, or messaging)
  • Sensitive-path review completed or maintainer-approved waiver recorded — reviewer/approval link/justification:
  • Non-success, skipped, or missing CI check accepted by maintainer — check name, approval link, and follow-up issue:

Verification

  • PR description includes the DCO sign-off declaration and every commit appears as Verified in GitHub
  • Git hooks passed during commit and push, or npx prek run --from-ref main --to-ref HEAD passes
  • Targeted tests pass for changed behavior
  • Full npm test passes (broad runtime changes only)
  • Quality Gates section completed with required justifications or waivers
  • No secrets, API keys, or credentials committed
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Tinson Lai tinsonl@nvidia.com

Summary by CodeRabbit

  • Bug Fixes
    • Improved policy preset restoration during sandbox rebuilds, including clearer reporting when some presets fail to reapply.
    • Registry updates now reflect only successfully restored policies, and rebuild success is blocked when preset restore is incomplete.
    • Messaging channel policy presets now correctly reflect channel state changes (inactive after stop+rebuild, active after start+rebuild), and disabled-channel preset pruning matches the full applicable set.
  • Tests
    • Expanded end-to-end and unit coverage to verify per-channel preset activity and registry policy payloads.

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: aeb35a8e-1d55-4683-aeba-d179fd80566e

📥 Commits

Reviewing files that changed from the base of the PR and between 8e69e33 and 9166c87.

📒 Files selected for processing (2)
  • src/lib/actions/sandbox/rebuild-flow.test.ts
  • src/lib/onboard/messaging-policy-presets.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/lib/onboard/messaging-policy-presets.ts

📝 Walkthrough

Walkthrough

Messaging channel preset pruning now uses all channel presets, sandbox rebuild writes restored policies back to the registry, and the stop/start flows verify per-channel preset state after rebuild.

Changes

Messaging policy preset pruning and rebuild sync

Layer / File(s) Summary
Preset mapping and pruning
src/lib/messaging/channels/metadata.ts, src/lib/onboard/messaging-policy-presets.ts, src/lib/onboard/messaging-policy-presets.test.ts, test/rebuild-policy-presets.test.ts
Per-channel preset lookup is added, onboarding helpers aggregate all presets for a channel, and disabled-channel pruning is updated and tested against that broader preset set.
Restore and persist presets
src/lib/actions/sandbox/rebuild.ts, src/lib/actions/sandbox/rebuild-flow.test.ts
rebuildSandbox tracks restored and failed presets, updates the registry with restored policies, changes success gating, and logs incomplete preset restoration; the rebuild-flow tests cover the new registry and failure logging behavior.
Stop/start policy checks
test/e2e-scenario/live/channels-stop-start-helpers.ts, test/e2e/test-channels-stop-start.sh
The live stop/start scenario helper and shell test now assert each channel’s policy preset is inactive after stop+rebuild and active after start+rebuild.

Estimated review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

area: messaging, area: e2e

Suggested reviewers

  • cv
  • ericksoa

Poem

🐇 I hopped through presets, one by one,
Disabled channels, work well done.
Rebuild now sings with tidy light,
Stop and start look clean and right.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 9.09% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: stale policy presets are cleared from the registry after rebuild.
Linked Issues check ✅ Passed The rebuild flow now removes disabled channel presets from the registry and tests cover Teams stop+rebuild, matching issue #5847.
Out of Scope Changes check ✅ Passed The added helper and test updates stay focused on messaging policy preset pruning and rebuild correctness.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/5847-channels-stop-preset-registry-cleanup

Comment @coderabbitai help to get the list of available commands.

@github-code-quality

github-code-quality Bot commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Code Coverage Overview

Languages: TypeScript

TypeScript / code-coverage/plugin

The overall coverage in the branch is 96%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
File 9166c87 +/-
nemoclaw/src/se...cret-scanner.ts 100%
nemoclaw/src/commands/slash.ts 100%
nemoclaw/src/li...bprocess-env.ts 100%
nemoclaw/src/bl...eprint/state.ts 98%
nemoclaw/src/onboard/config.ts 98%
nemoclaw/src/bl...int/snapshot.ts 97%
nemoclaw/src/bl...print/runner.ts 95%
nemoclaw/src/co...ration-state.ts 94%
nemoclaw/src/bl...ate-networks.ts 94%
nemoclaw/src/index.ts 94%

TypeScript / code-coverage/cli

The overall coverage in the branch is 47%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
File 9166c87 +/-
src/lib/state/o...oard-session.ts 91%
src/lib/actions...dbox/rebuild.ts 73%
src/lib/sandbox/config.ts 72%
src/lib/onboard/preflight.ts 62%
src/lib/shields/index.ts 62%
src/lib/actions...licy-channel.ts 60%
src/lib/state/sandbox.ts 56%
src/lib/policy/index.ts 48%
src/lib/onboard...er-gpu-patch.ts 47%
src/lib/onboard.ts 19%

Updated June 26, 2026 16:15 UTC
Code Coverage is in Public Preview. Learn more and provide us with your feedback.

@laitingsheng laitingsheng added NV QA Bugs found by the NVIDIA QA Team bug-fix PR fixes a bug or regression area: integrations Third-party service integration behavior labels Jun 26, 2026
@github-actions

github-actions Bot commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: channels-stop-start-vitest, sandbox-rebuild-vitest, network-policy-vitest
Optional E2E: rebuild-openclaw-vitest

Dispatch hint: channels-stop-start-vitest,sandbox-rebuild-vitest,network-policy-vitest

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • channels-stop-start-vitest (high; matrix runs openclaw and hermes): Primary coverage for this PR: exercises messaging channel stop/start across rebuild for both OpenClaw and Hermes, including the newly changed policy active/inactive assertions for disabled and restarted channels.
  • sandbox-rebuild-vitest (high): Required because src/lib/actions/sandbox/rebuild.ts changes core rebuild lifecycle and registry metadata reconciliation. This live scenario validates rebuild preservation and registry refresh behavior in a real sandbox.
  • network-policy-vitest (high): Required because the PR changes policy preset selection/pruning and registry policy state, which can affect network policy/security boundary behavior. This verifies live network policy allow/deny behavior still matches expected presets.

Optional E2E

  • rebuild-openclaw-vitest (very high): Useful adjacent confidence for the OpenClaw-specific rebuild path, especially because the changed rebuild flow runs OpenClaw post-restore steps before final registry/status reporting.

New E2E recommendations

  • None.

Dispatch hint

  • Workflow: .github/workflows/e2e-vitest-scenarios.yaml
  • jobs input: channels-stop-start-vitest,sandbox-rebuild-vitest,network-policy-vitest

@github-actions

github-actions Bot commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: channels-stop-start-vitest
Optional Vitest E2E scenarios: channels-add-remove-vitest

Dispatch required Vitest E2E scenarios:

  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=channels-stop-start-vitest

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

  • channels-stop-start-vitest: The PR changes messaging-channel policy preset cleanup and rebuild reconciliation, and directly updates the Vitest live channels stop/start helper to assert policy presets are inactive after stop+rebuild and active after start+rebuild. The wired free-standing Vitest job exercises the changed OpenClaw/Hermes stop/start rebuild contract.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=channels-stop-start-vitest

Optional Vitest E2E scenarios

  • channels-add-remove-vitest: Adjacent free-standing messaging-channel scenario that covers channel add/remove, rebuild, and policy-list behavior, but the PR's primary changed live helper and assertions target stop/start rebuild cleanup.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=channels-add-remove-vitest

Relevant changed files

  • src/lib/actions/sandbox/rebuild.ts
  • src/lib/messaging/channels/metadata.ts
  • src/lib/onboard/messaging-policy-presets.ts
  • test/e2e-scenario/live/channels-stop-start-helpers.ts

@github-actions

github-actions Bot commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor — No blocking findings

Merge posture: No blocking advisor findings
Primary next action: Add or justify PRA-T1 and any related test follow-ups.
Open items: 0 required · 0 warnings · 0 suggestions · 3 test follow-ups
Since last review: 1 prior item resolved · 0 still apply · 0 new items found

Action checklist

  • PRA-T1 Add or justify test follow-up: Runtime validation
  • PRA-T2 Add or justify test follow-up: Runtime validation
  • PRA-T3 Add or justify test follow-up: Runtime validation
Test follow-ups to resolve or justify

If these cover changed behavior, prefer adding them in this PR; otherwise state why existing coverage is enough or link the follow-up.

  • PRA-T1 Runtime validation — `channels stop teams followed by rebuild reports teams inactive in policy-list`: onboard or stage a Teams-enabled sandbox, stop Teams, rebuild, and assert policy-list shows `○ teams` or no active Teams preset.. Changed code touches sandbox rebuild and network-policy state, so the focused unit coverage is strong enough for the current fix, but a runtime/integration check would further reduce risk around live gateway policy-list output without relying on external job status.
  • PRA-T2 Runtime validation — `rebuild policy output omits network_policies.teams when disabled Teams preset was present before rebuild`: inspect the gateway policy representation after stop+rebuild and assert the `teams` policy key is absent rather than checking only host substrings.. Changed code touches sandbox rebuild and network-policy state, so the focused unit coverage is strong enough for the current fix, but a runtime/integration check would further reduce risk around live gateway policy-list output without relying on external job status.
  • PRA-T3 Runtime validation — `active Teams rebuild keeps teams preset recorded and applied when disabledChannels is empty`: start with saved `teams` and an active Teams rebuild plan, then assert `applyPreset("teams")` runs and final registry policies include `teams`.. Changed code touches sandbox rebuild and network-policy state, so the focused unit coverage is strong enough for the current fix, but a runtime/integration check would further reduce risk around live gateway policy-list output without relying on external job status.

Workflow run details

This is an automated, non-binding review; it still expects maintainers and agents to respond to each required or warning item. Treat suggestions as current-PR improvements when they touch changed code; defer only with maintainer rationale or a linked follow-up. A human maintainer must make the final merge decision.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/lib/actions/sandbox/rebuild.ts (1)

1137-1143: 🎯 Functional Correctness | 🔵 Trivial

Run the stop/start E2E for this rebuild path.

This code is the persistence point for disabled-channel preset state after rebuild, so please run the recommended channels-stop-start-e2e job before merge. As per path instructions, src/lib/actions/sandbox/rebuild.ts "controls disabled channel resolution used during onboard and rebuild" and recommends channels-stop-start-e2e.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/actions/sandbox/rebuild.ts` around lines 1137 - 1143, The rebuild
persistence path in restore logic needs validation by running the
disabled-channel end-to-end coverage before merge. After the changes around
registry.updateSandbox and the restoredPresets handling in rebuild.ts, run the
recommended channels-stop-start-e2e job to verify rebuild correctly preserves
disabled-channel preset state and channel resolution behavior.

Source: Path instructions

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/lib/actions/sandbox/rebuild.ts`:
- Around line 1137-1143: The rebuild persistence path in restore logic needs
validation by running the disabled-channel end-to-end coverage before merge.
After the changes around registry.updateSandbox and the restoredPresets handling
in rebuild.ts, run the recommended channels-stop-start-e2e job to verify rebuild
correctly preserves disabled-channel preset state and channel resolution
behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 38ed9bd5-f551-42da-9747-cc0f2fece1c1

📥 Commits

Reviewing files that changed from the base of the PR and between 7ab94fe and 86066e2.

📒 Files selected for processing (3)
  • src/lib/actions/sandbox/rebuild.ts
  • test/e2e-scenario/live/channels-stop-start-helpers.ts
  • test/e2e/test-channels-stop-start.sh

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
@laitingsheng laitingsheng added the v0.0.69 Release target label Jun 26, 2026
@cv cv added v0.0.70 Release target and removed v0.0.69 Release target labels Jun 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: integrations Third-party service integration behavior bug-fix PR fixes a bug or regression NV QA Bugs found by the NVIDIA QA Team v0.0.70 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Ubuntu 24.04][Policy&Network] nemoclaw channels stop teams does not remove teams policy preset after rebuild

3 participants