Skip to content

refactor(onboard): share provider sandbox phase handoff#5941

Merged
cv merged 5 commits into
mainfrom
codex/5938-provider-sandbox-handoff
Jun 28, 2026
Merged

refactor(onboard): share provider sandbox phase handoff#5941
cv merged 5 commits into
mainfrom
codex/5938-provider-sandbox-handoff

Conversation

@cv

@cv cv commented Jun 28, 2026

Copy link
Copy Markdown
Collaborator

Summary

Routes both onboarding provider/sandbox assembly paths through the existing shared phase factories and carries the fully narrowed context across that boundary. The refactor deletes duplicate field projection and assertion code while consolidating boundary coverage, finishing with 6 fewer lines overall and 42 fewer production lines.

Related Issue

Closes #5938

Changes

  • Make the live core flow reuse createProviderInferencePhase() and createSandboxPhase().
  • Pass complete narrowed contexts through the shared factories instead of rebuilding them field by field.
  • Chain production provider and sandbox handlers in one behavior test, including completed-provider resume.
  • Consolidate the shared-factory mapping tests into one full-context handoff test.
  • Cover provider, model, and GPU-config narrowing failures with one parameterized test.
  • Prove full-context preservation through the generic full-sequence path.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Quality Gates

  • Tests added or updated for changed behavior
  • Existing tests cover changed behavior — justification:
  • Tests not applicable — justification:
  • Docs updated for user-facing behavior changes
  • Docs not applicable — justification: Internal type and phase-assembly refactor with no prompt, state, persistence, or user-facing change.
  • Sensitive paths changed (security, policy, credentials, preflight, onboarding, inference, runner, sandbox, or messaging)
  • Sensitive-path review completed or maintainer-approved waiver recorded — reviewer/approval link/justification: Fresh, completed-provider resume, retry/state-transition, credential, Dockerfile, GPU, Hermes gateway, and messaging context behavior remain covered by 30 targeted state-machine tests and CLI typecheck.
  • Non-success, skipped, or missing CI check accepted by maintainer — check name, approval link, and follow-up issue:

Verification

  • PR description includes the DCO sign-off declaration and every commit appears as Verified in GitHub
  • Git hooks passed during commit and push, or npx prek run --from-ref main --to-ref HEAD passes
  • Targeted tests pass for changed behavior
  • Full npm test passes (broad runtime changes only)
  • Quality Gates section completed with required justifications or waivers
  • No secrets, API keys, or credentials committed
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Carlos Villela cvillela@nvidia.com

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@cv cv added the v0.0.70 Release target label Jun 28, 2026
@cv cv self-assigned this Jun 28, 2026
@coderabbitai

coderabbitai Bot commented Jun 28, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Removes duplicate provider-to-sandbox phase assembly by delegating core onboarding flow setup to shared phase factories. The provider and sandbox transitions now pass handler-produced context through directly, and tests were updated to cover the propagated context across the handoff.

Changes

Provider-to-sandbox phase consolidation

Layer / File(s) Summary
Phase factory types and return path
src/lib/onboard/machine/flow-phases/provider-sandbox.ts
Handler types change to full OnboardFlowContext inputs, merge helpers are removed, and both phase run methods return onboardFlowPhaseResult(result.context, result.result) directly.
flow-sequence context passthrough
src/lib/onboard/machine/flow-sequence.ts, src/lib/onboard/machine/flow-sequence.test.ts
Provider-inference and sandbox phases return the handler-provided context directly after assertion, and the sequence test now seeds and checks the propagated GPU and messaging fields.
core-flow-phases delegation to factories
src/lib/onboard/machine/core-flow-phases.ts, src/lib/onboard/machine/core-flow-phases.test.ts
createCoreOnboardFlowPhases now uses createProviderInferencePhase and createSandboxPhase; inline phase objects and the sandbox precondition assertion are removed, and core flow tests run provider before sandbox and assert the resulting sandbox context.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly matches the main refactor: sharing the provider-to-sandbox handoff in onboarding.
Linked Issues check ✅ Passed The refactor consolidates the provider-to-sandbox handoff, removes duplicate copying/assertions, and preserves behavior through shared phase tests.
Out of Scope Changes check ✅ Passed The changed files stay focused on onboarding phase handoff refactoring and related tests, with no unrelated code paths introduced.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/5938-provider-sandbox-handoff

Comment @coderabbitai help to get the list of available commands.

@github-code-quality

github-code-quality Bot commented Jun 28, 2026

Copy link
Copy Markdown
Contributor

Code Coverage Overview

Languages: TypeScript

TypeScript / code-coverage/plugin

The overall coverage in the branch is 96%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
File 2a3cbf1 +/-
nemoclaw/src/se...cret-scanner.ts 100%
nemoclaw/src/commands/slash.ts 100%
nemoclaw/src/li...bprocess-env.ts 100%
nemoclaw/src/bl...eprint/state.ts 98%
nemoclaw/src/onboard/config.ts 98%
nemoclaw/src/bl...int/snapshot.ts 97%
nemoclaw/src/bl...print/runner.ts 95%
nemoclaw/src/co...ration-state.ts 94%
nemoclaw/src/bl...ate-networks.ts 94%
nemoclaw/src/index.ts 94%

TypeScript / code-coverage/cli

The overall coverage in the branch is 68%. Coverage data for the branch is not yet available.

Show a code coverage summary of the most covered files.
File 2a3cbf1 +/-
src/lib/actions...all/run-plan.ts 80%
src/lib/state/o...oard-session.ts 79%
src/lib/actions...dbox/rebuild.ts 74%
src/lib/state/sandbox.ts 72%
src/lib/shields/index.ts 70%
src/lib/onboard/preflight.ts 69%
src/lib/actions...licy-channel.ts 59%
src/lib/onboard...er-gpu-patch.ts 59%
src/lib/policy/index.ts 52%
src/lib/onboard.ts 20%

Updated June 28, 2026 19:18 UTC
Code Coverage is in Public Preview. Learn more and provide us with your feedback.

@github-actions

github-actions Bot commented Jun 28, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor (Nemotron Ultra) — Changes requested

Merge posture: Do not merge yet
Primary next action: Fix PRA-2: Compatibility bridge workaround lacks tracking issue and regression test; then add or justify PRA-T1.
Open items: 1 required · 5 warnings · 5 suggestions · 8 test follow-ups
Since last review: 1 prior item resolved · 5 still apply · 3 new items found

Action checklist

  • PRA-2 Fix: Compatibility bridge workaround lacks tracking issue and regression test in src/lib/onboard/machine/core-flow-phases.ts:146
  • PRA-1 Resolve or justify: Source-of-truth review needed: src/lib/onboard/machine/core-flow-phases.ts:146-175 (compatibility bridge workaround in runCoreOnboardFlowSlice)
  • PRA-3 Resolve or justify: Source-of-truth review needed for compatibility bridge workaround in src/lib/onboard/machine/core-flow-phases.ts:146
  • PRA-4 Resolve or justify: Missing integration test for resume path with pre-completed provider_selection through shared factories in src/lib/onboard/machine/flow-phases/provider-sandbox.test.ts:1
  • PRA-5 Resolve or justify: Test consolidation between core-flow-phases.test.ts and provider-sandbox.test.ts incomplete in src/lib/onboard/machine/core-flow-phases.test.ts:1
  • PRA-7 Resolve or justify: Acceptance criterion fix: update all repo references to NVIDIA/NemoClaw #7 (test consolidation) not fully met in src/lib/onboard/machine/core-flow-phases.test.ts:1
  • PRA-T1 Add or justify test follow-up: Runtime validation
  • PRA-T2 Add or justify test follow-up: Runtime validation
  • PRA-T3 Add or justify test follow-up: Runtime validation
  • PRA-T4 Add or justify test follow-up: Runtime validation
  • PRA-T5 Add or justify test follow-up: Missing integration test for resume path with pre-completed provider_selection through shared factories
  • PRA-T6 Add or justify test follow-up: Parameterized test validates sandbox phase rejects execution when required fields missing
  • PRA-T7 Add or justify test follow-up: Missing integration test for resume path through shared provider/sandbox factories in flow sequence
  • PRA-T8 Add or justify test follow-up: Acceptance clause
  • PRA-6 In-scope improvement: PR fix: support reasoning compatible endpoints (Fixes #3279) #3286 overlaps same files - verify no semantic conflict on merge in src/lib/onboard/machine/core-flow-phases.ts:1
  • PRA-8 In-scope improvement: Shared factory abstraction improves type safety at trust boundary in src/lib/onboard/machine/flow-phases/provider-sandbox.ts:1
  • PRA-9 In-scope improvement: Parameterized test validates sandbox phase rejects execution when required fields missing in src/lib/onboard/machine/flow-phases/provider-sandbox.test.ts:45
  • PRA-10 In-scope improvement: Missing integration test for resume path through shared provider/sandbox factories in flow sequence in src/lib/onboard/machine/flow-sequence.test.ts:1
  • PRA-11 In-scope improvement: Monolith growth: core-flow-phases.test.ts approaching size where shared fixtures would improve maintainability in src/lib/onboard/machine/core-flow-phases.test.ts:1

Findings index

ID Severity Category Location Required action
PRA-1 Resolve/justify architecture Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
PRA-2 Required architecture src/lib/onboard/machine/core-flow-phases.ts:146 Either (a) link a tracking issue with acceptance criteria for removing this fallback, or (b) add a regression test in core-flow-phases.test.ts that would fail when compatibilityWhenState no longer requires the extended downstream state list for resume=true, indicating FSM recovery states now handle ahead-state resume correctly. Resolve in this PR or explicitly justify the accepted risk.
PRA-3 Resolve/justify architecture src/lib/onboard/machine/core-flow-phases.ts:146 Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior. Either fix the source (convert repair/backstop checks to strict FSM recovery states) or document accepted risk with tracking issue.
PRA-4 Resolve/justify tests src/lib/onboard/machine/flow-phases/provider-sandbox.test.ts:1 Add integration test in provider-sandbox.test.ts for resume path with pre-completed provider_selection step, verifying sandbox phase receives correct narrowed context with all required fields (model, provider, sandboxGpuConfig, hermesToolGateways, preferredInferenceApi, nimContainer, webSearchConfig) through shared factories.
PRA-5 Resolve/justify correctness src/lib/onboard/machine/core-flow-phases.test.ts:1 Extract shared test helpers/fixtures for provider/sandbox handoff scenarios into a common test utility module (e.g., provider-sandbox-test-fixtures.ts), or consolidate handoff-specific tests into provider-sandbox.test.ts and keep core-flow-phases.test.ts focused on full sequence orchestration.
PRA-6 Improvement correctness src/lib/onboard/machine/core-flow-phases.ts:1 Coordinate with PR #3286 author or rebase sequentially. Verify type signatures and handler contracts align before merging.
PRA-7 Resolve/justify acceptance src/lib/onboard/machine/core-flow-phases.test.ts:1 Complete test consolidation as part of this PR per acceptance criteria, or explicitly justify why consolidation is deferred.
PRA-8 Improvement security src/lib/onboard/machine/flow-phases/provider-sandbox.ts:1 No action needed. The abstraction is justified and well-scoped.
PRA-9 Improvement tests src/lib/onboard/machine/flow-phases/provider-sandbox.test.ts:45 No action needed. Good security regression coverage.
PRA-10 Improvement tests src/lib/onboard/machine/flow-sequence.test.ts:1 Add integration test for resume path through shared factories in flow-sequence.test.ts or provider-sandbox.test.ts.
PRA-11 Improvement correctness src/lib/onboard/machine/core-flow-phases.test.ts:1 Consider extracting shared test fixtures for provider/sandbox handoff scenarios as part of test consolidation (PRA-5).

🚨 Required before merge

Address these before merging unless a maintainer explicitly overrides the advisor with rationale.

PRA-2 Required — Compatibility bridge workaround lacks tracking issue and regression test

  • Location: src/lib/onboard/machine/core-flow-phases.ts:146
  • Category: architecture
  • Problem: runCoreOnboardFlowSlice contains a documented compatibility bridge (lines 146-175) that tolerates ahead-state machine snapshots from legacy step mutation (updateMachine === true) and repaired-resume replay. The comment documents the invalid state, source boundaries, source-fix constraint, and intended removal condition but provides no tracking issue, no regression test that would fail when the workaround becomes unnecessary, and no timeline.
  • Impact: Localized workaround preserves invalid state (ahead-state snapshots downstream of provider/sandbox repair checks) with no mechanism ensuring temporariness. If the source boundary (imperative resume work vs strict FSM recovery states) is never fixed, this fallback becomes permanent technical debt that masks upstream issues. Sessions in downstream states (openclaw, agent_setup, policies, finalizing, post_verify) can re-enter the core slice on resume without re-running repair/backstop checks.
  • Required action: Either (a) link a tracking issue with acceptance criteria for removing this fallback, or (b) add a regression test in core-flow-phases.test.ts that would fail when compatibilityWhenState no longer requires the extended downstream state list for resume=true, indicating FSM recovery states now handle ahead-state resume correctly. Resolve in this PR or explicitly justify the accepted risk.
  • Expected follow-up: Fix before merge or get explicit maintainer override.
  • Verification: Inspect core-flow-phases.ts:146-175 comment block and search for any linked issue, tracking label, or test asserting the compatibility bridge is no longer needed.
  • Missing regression test: Test that fails when compatibilityWhenState no longer requires the extended downstream state list ["provider_selection", "inference", "sandbox", "openclaw", "agent_setup", "policies", "finalizing", "post_verify"] for resume=true, indicating FSM recovery states now handle ahead-state resume correctly.
  • Done when: The required change is committed and verification passes: Inspect core-flow-phases.ts:146-175 comment block and search for any linked issue, tracking label, or test asserting the compatibility bridge is no longer needed.
  • Evidence: Comment block at core-flow-phases.ts:146-175 explicitly documents workaround, tolerated downstream family, and intended removal condition but provides no tracking mechanism. Previous review PRA-1/PRA-6 identified this gap; unchanged in current diff.
Review findings by urgency: 1 required fix, 5 items to resolve/justify, 5 in-scope improvements

⚠️ Resolve or justify before merge

Investigate these in the current review; either fix them, explain why they are not applicable, or document the accepted risk.

PRA-1 Resolve/justify — Source-of-truth review needed: src/lib/onboard/machine/core-flow-phases.ts:146-175 (compatibility bridge workaround in runCoreOnboardFlowSlice)

  • Location: not file-specific
  • Category: architecture
  • Problem: The advisor marked localized patch analysis as needs_followup.
  • Impact: A localized workaround can preserve or hide an invalid state when the source boundary is unclear.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Missing regression test: MISSING — comment claims 'Phase tests cover ahead-state resume' but no test would fail when compatibilityWhenState no longer needs extended downstream list for resume=true
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Evidence: Comment block at core-flow-phases.ts:146-175 documents workaround, tolerated downstream family, and intended removal condition but provides no tracking issue, no regression test, no timeline. Unchanged from prior review (PRA-1/PRA-2/PRA-10).

PRA-3 Resolve/justify — Source-of-truth review needed for compatibility bridge workaround

  • Location: src/lib/onboard/machine/core-flow-phases.ts:146
  • Category: architecture
  • Problem: The localized patch tolerates an invalid state (ahead-state snapshots) but the source boundary (imperative resume work vs strict FSM recovery states) is not fixed in this PR. The comment claims 'Phase tests cover ahead-state resume' but no test would fail when compatibilityWhenState no longer needs extended downstream list.
  • Impact: A localized workaround can preserve or hide an invalid state when the source boundary is unclear. Without a regression test that fails when the workaround becomes unnecessary, there is no automated signal that the source fix is complete.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior. Either fix the source (convert repair/backstop checks to strict FSM recovery states) or document accepted risk with tracking issue.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Missing regression test: MISSING — comment claims 'Phase tests cover ahead-state resume' but no test would fail when compatibilityWhenState no longer needs extended downstream list
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Evidence: Identified in prior review PRA-1/PRA-6. Unchanged in current diff. No tracking issue linked.

PRA-4 Resolve/justify — Missing integration test for resume path with pre-completed provider_selection through shared factories

  • Location: src/lib/onboard/machine/flow-phases/provider-sandbox.test.ts:1
  • Category: tests
  • Problem: No integration test exercises the resume path where provider_selection step is already complete and the sandbox phase receives the narrowed context through shared factories. The provider→sandbox handoff crosses a trust boundary (GPU config, credentials, sandbox config) that benefits from runtime validation.
  • Impact: A type-only refactor could pass unit tests while a real handler contract mismatch (e.g., missing field in narrowed context during resume) surfaces only in integration or E2E runs. Validation context flags runtime_validation_recommended for changed sandbox/infrastructure paths.
  • Recommended action: Add integration test in provider-sandbox.test.ts for resume path with pre-completed provider_selection step, verifying sandbox phase receives correct narrowed context with all required fields (model, provider, sandboxGpuConfig, hermesToolGateways, preferredInferenceApi, nimContainer, webSearchConfig) through shared factories.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Check provider-sandbox.test.ts for a test that creates a session with provider_selection step complete, runs provider phase (which should skip/return existing context), then runs sandbox phase with resume=true.
  • Missing regression test: Integration test for resume path with pre-completed provider_selection through shared factories, verifying narrowed context flows end-to-end without field loss.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Check provider-sandbox.test.ts for a test that creates a session with provider_selection step complete, runs provider phase (which should skip/return existing context), then runs sandbox phase with resume=true.
  • Evidence: Previous review PRA-4 identified this gap. Current diff adds parameterized rejection test but no resume integration test.

PRA-5 Resolve/justify — Test consolidation between core-flow-phases.test.ts and provider-sandbox.test.ts incomplete

  • Location: src/lib/onboard/machine/core-flow-phases.test.ts:1
  • Category: correctness
  • Problem: Both files test provider→sandbox handoff with overlapping scenarios. provider-sandbox.test.ts has new consolidated test 'passes full context and results through the shared handoff' but core-flow-phases.test.ts retains 'carries provider selection output into sandbox setup' using real deps.
  • Impact: Duplicated test maintenance burden and potential drift between test expectations. Acceptance criterion fix: update all repo references to NVIDIA/NemoClaw #7 (test consolidation) from issue Delete duplicate provider-to-sandbox phase assembly #5938 not fully satisfied.
  • Recommended action: Extract shared test helpers/fixtures for provider/sandbox handoff scenarios into a common test utility module (e.g., provider-sandbox-test-fixtures.ts), or consolidate handoff-specific tests into provider-sandbox.test.ts and keep core-flow-phases.test.ts focused on full sequence orchestration.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Compare test scenarios in core-flow-phases.test.ts (carries provider selection output into sandbox setup) and provider-sandbox.test.ts (passes full context and results through the shared handoff) for overlapping coverage.
  • Missing regression test: Shared test fixtures for provider/sandbox handoff scenarios used by both test files.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Compare test scenarios in core-flow-phases.test.ts (carries provider selection output into sandbox setup) and provider-sandbox.test.ts (passes full context and results through the shared handoff) for overlapping coverage.
  • Evidence: PR body claims 'Consolidate the shared-factory mapping tests into one full-context handoff test' but both tests exist.

PRA-7 Resolve/justify — Acceptance criterion #7 (test consolidation) not fully met

  • Location: src/lib/onboard/machine/core-flow-phases.test.ts:1
  • Category: acceptance
  • Problem: Issue Delete duplicate provider-to-sandbox phase assembly #5938 acceptance criteria requires 'Existing phase tests are consolidated around the shared handoff rather than copied.' The PR adds a consolidated test in provider-sandbox.test.ts but retains overlapping handoff test in core-flow-phases.test.ts.
  • Impact: Acceptance criteria from linked issue not fully satisfied. Test consolidation was expected as part of this refactor.
  • Recommended action: Complete test consolidation as part of this PR per acceptance criteria, or explicitly justify why consolidation is deferred.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Verify whether the linked issue/acceptance criteria required test consolidation and whether the current state meets that requirement.
  • Missing regression test: N/A - acceptance criteria tracking
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Verify whether the linked issue/acceptance criteria required test consolidation and whether the current state meets that requirement.
  • Evidence: Issue Delete duplicate provider-to-sandbox phase assembly #5938 acceptance criteria fix: update all repo references to NVIDIA/NemoClaw #7 vs current test file state.

💡 In-scope improvements

These are lower-risk, not throwaway. Prefer fixing them in this PR when they are local to changed code; defer only with rationale or a linked follow-up.

PRA-6 Improvement — PR #3286 overlaps same files - verify no semantic conflict on merge

PRA-8 Improvement — Shared factory abstraction improves type safety at trust boundary

  • Location: src/lib/onboard/machine/flow-phases/provider-sandbox.ts:1
  • Category: security
  • Problem: The sandbox phase handler now requires ProviderSelectedOnboardFlowContext which enforces NonNullable<sandboxGpuConfig>. Parameterized test validates rejection when model, provider, or sandboxGpuConfig missing.
  • Impact: Positive security improvement: type system now prevents sandbox phase from executing without required GPU config fields. Good security regression coverage via parameterized test.
  • Suggested action: No action needed. The abstraction is justified and well-scoped.
  • Expected follow-up: Prefer a current-PR fix when local to changed code; defer only with rationale or linked follow-up.
  • Verification: Verify ProviderSelectedOnboardFlowContext type definition in flow-context.ts enforces sandboxGpuConfig non-null.
  • Missing regression test: N/A - already covered by parameterized test in provider-sandbox.test.ts
  • Done when: The local improvement is applied, or the PR notes why it should be deferred.
  • Evidence: flow-context.ts:35-38 defines ProviderSelectedOnboardFlowContext with sandboxGpuConfig: NonNullable<Context["sandboxGpuConfig"]>. provider-sandbox.test.ts:45 parameterized test covers 3 missing-field cases.

PRA-9 Improvement — Parameterized test validates sandbox phase rejects execution when required fields missing

  • Location: src/lib/onboard/machine/flow-phases/provider-sandbox.test.ts:45
  • Category: tests
  • Problem: Parameterized test (it.each) validates sandbox phase rejects execution before model, provider, or sandboxGpuConfig is selected, referencing issue Delete duplicate provider-to-sandbox phase assembly #5938.
  • Impact: Good security regression coverage for trust boundary validation. Ensures sandbox phase cannot execute with incomplete provider-selected context.
  • Suggested action: No action needed. Good security regression coverage.
  • Expected follow-up: Prefer a current-PR fix when local to changed code; defer only with rationale or linked follow-up.
  • Verification: Run test suite and verify parameterized test passes for all three missing field cases.
  • Missing regression test: N/A - already implemented
  • Done when: The local improvement is applied, or the PR notes why it should be deferred.
  • Evidence: provider-sandbox.test.ts:45-70 parameterized test for missing model/provider/sandboxGpuConfig.

PRA-10 Improvement — Missing integration test for resume path through shared provider/sandbox factories in flow sequence

  • Location: src/lib/onboard/machine/flow-sequence.test.ts:1
  • Category: tests
  • Problem: flow-sequence.test.ts integration test updated with more context fields (credentialEnv, fromDockerfile, hermesToolGateways, sandboxGpuConfig sentinel, selectedMessagingChannels) but no test for resume path through shared factories.
  • Impact: Integration test covers fresh flow with expanded context but not resume path through shared provider/sandbox factories.
  • Suggested action: Add integration test for resume path through shared factories in flow-sequence.test.ts or provider-sandbox.test.ts.
  • Expected follow-up: Prefer a current-PR fix when local to changed code; defer only with rationale or linked follow-up.
  • Verification: Check flow-sequence.test.ts for any test with resume=true that exercises providerInference and sandbox phases through buildOnboardFlowPhaseSequence.
  • Missing regression test: Integration test for resume path through shared provider/sandbox factories in flow sequence.
  • Done when: The local improvement is applied, or the PR notes why it should be deferred.
  • Evidence: flow-sequence.test.ts only test with runtime is 'runs ordered provider results through runtime transition validation' which uses fresh context.

PRA-11 Improvement — Monolith growth: core-flow-phases.test.ts approaching size where shared fixtures would improve maintainability

  • Location: src/lib/onboard/machine/core-flow-phases.test.ts:1
  • Category: correctness
  • Problem: core-flow-phases.test.ts grew by 6 lines (from 565 to 571). While not a hotspot, continued growth should be monitored. Extraction of shared fixtures would support test consolidation (PRA-5).
  • Impact: Test file approaching size where extraction of shared fixtures would improve maintainability and reduce duplication.
  • Suggested action: Consider extracting shared test fixtures for provider/sandbox handoff scenarios as part of test consolidation (PRA-5).
  • Expected follow-up: Prefer a current-PR fix when local to changed code; defer only with rationale or linked follow-up.
  • Verification: Monitor file size in future PRs.
  • Missing regression test: N/A
  • Done when: The local improvement is applied, or the PR notes why it should be deferred.
  • Evidence: Drift context shows monolith delta +6 lines, severity warning. Simplification signal flags large_file_hotspot.
Simplification opportunities: 1 possible cut, net -50 lines possible

These are safe simplification checks only. Do not remove validation, security controls, data-loss prevention, or required tests.

  • PRA-11 shrink (src/lib/onboard/machine/core-flow-phases.test.ts:1): Duplicate provider/sandbox handoff test setup code in core-flow-phases.test.ts and provider-sandbox.test.ts
    • Replacement: Shared test fixture module (e.g., provider-sandbox-test-fixtures.ts) with reusable context builders and phase runner helpers
    • Net: -50 lines
    • Safety boundary: Must not weaken security assertions (assertProviderSelectedContext, assertSandboxCreatedContext) or credential/GPU config validation coverage
Test follow-ups to resolve or justify

If these cover changed behavior, prefer adding them in this PR; otherwise state why existing coverage is enough or link the follow-up.

  • PRA-T1 Runtime validation — provider-sandbox.test.ts: resume path with pre-completed provider_selection through shared factories - create session with provider_selection step complete, run provider phase (should return existing context), then sandbox phase with resume=true, verify narrowed context flows end-to-end. Runtime/sandbox/infrastructure paths need behavioral runtime validation: src/lib/onboard/machine/core-flow-phases.ts, src/lib/onboard/machine/flow-phases/provider-sandbox.ts, src/lib/onboard/machine/flow-sequence.ts. Provider→sandbox handoff crosses trust boundary (GPU config, credentials, sandbox config). Current tests use mocked handler callbacks, not real handler chain.
  • PRA-T2 Runtime validation — provider-sandbox.test.ts: real handler chain through shared factories - wire real handleProviderInferenceState → handleSandboxState stubs through createProviderInferencePhase/createSandboxPhase, verify context (model, provider, sandboxGpuConfig, hermesToolGateways, preferredInferenceApi, nimContainer, webSearchConfig) flows without field loss. Runtime/sandbox/infrastructure paths need behavioral runtime validation: src/lib/onboard/machine/core-flow-phases.ts, src/lib/onboard/machine/flow-phases/provider-sandbox.ts, src/lib/onboard/machine/flow-sequence.ts. Provider→sandbox handoff crosses trust boundary (GPU config, credentials, sandbox config). Current tests use mocked handler callbacks, not real handler chain.
  • PRA-T3 Runtime validation — flow-sequence.test.ts: resume path through shared provider/sandbox factories in full sequence - test with resume=true exercising providerInference and sandbox phases through buildOnboardFlowPhaseSequence. Runtime/sandbox/infrastructure paths need behavioral runtime validation: src/lib/onboard/machine/core-flow-phases.ts, src/lib/onboard/machine/flow-phases/provider-sandbox.ts, src/lib/onboard/machine/flow-sequence.ts. Provider→sandbox handoff crosses trust boundary (GPU config, credentials, sandbox config). Current tests use mocked handler callbacks, not real handler chain.
  • PRA-T4 Runtime validation — core-flow-phases.test.ts: compatibility bridge removal detection - test that fails when compatibilityWhenState no longer requires extended downstream list for resume=true, indicating FSM recovery states handle ahead-state resume correctly. Runtime/sandbox/infrastructure paths need behavioral runtime validation: src/lib/onboard/machine/core-flow-phases.ts, src/lib/onboard/machine/flow-phases/provider-sandbox.ts, src/lib/onboard/machine/flow-sequence.ts. Provider→sandbox handoff crosses trust boundary (GPU config, credentials, sandbox config). Current tests use mocked handler callbacks, not real handler chain.
  • PRA-T5 Missing integration test for resume path with pre-completed provider_selection through shared factories — Add integration test in provider-sandbox.test.ts for resume path with pre-completed provider_selection step, verifying sandbox phase receives correct narrowed context with all required fields (model, provider, sandboxGpuConfig, hermesToolGateways, preferredInferenceApi, nimContainer, webSearchConfig) through shared factories.
  • PRA-T6 Parameterized test validates sandbox phase rejects execution when required fields missing — No action needed. Good security regression coverage.
  • PRA-T7 Missing integration test for resume path through shared provider/sandbox factories in flow sequence — Add integration test for resume path through shared factories in flow-sequence.test.ts or provider-sandbox.test.ts.
  • PRA-T8 Acceptance clause — Existing phase tests are consolidated around the shared handoff rather than copied — add test evidence or identify existing coverage. New consolidated test in provider-sandbox.test.ts ('passes full context and results through the shared handoff') but core-flow-phases.test.ts retains overlapping handoff test ('carries provider selection output into sandbox setup').
Since last review details

Current findings, using the urgency labels above:

PRA-1 Resolve/justify — Source-of-truth review needed: src/lib/onboard/machine/core-flow-phases.ts:146-175 (compatibility bridge workaround in runCoreOnboardFlowSlice)

  • Location: not file-specific
  • Category: architecture
  • Problem: The advisor marked localized patch analysis as needs_followup.
  • Impact: A localized workaround can preserve or hide an invalid state when the source boundary is unclear.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Missing regression test: MISSING — comment claims 'Phase tests cover ahead-state resume' but no test would fail when compatibilityWhenState no longer needs extended downstream list for resume=true
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Evidence: Comment block at core-flow-phases.ts:146-175 documents workaround, tolerated downstream family, and intended removal condition but provides no tracking issue, no regression test, no timeline. Unchanged from prior review (PRA-1/PRA-2/PRA-10).

PRA-2 Required — Compatibility bridge workaround lacks tracking issue and regression test

  • Location: src/lib/onboard/machine/core-flow-phases.ts:146
  • Category: architecture
  • Problem: runCoreOnboardFlowSlice contains a documented compatibility bridge (lines 146-175) that tolerates ahead-state machine snapshots from legacy step mutation (updateMachine === true) and repaired-resume replay. The comment documents the invalid state, source boundaries, source-fix constraint, and intended removal condition but provides no tracking issue, no regression test that would fail when the workaround becomes unnecessary, and no timeline.
  • Impact: Localized workaround preserves invalid state (ahead-state snapshots downstream of provider/sandbox repair checks) with no mechanism ensuring temporariness. If the source boundary (imperative resume work vs strict FSM recovery states) is never fixed, this fallback becomes permanent technical debt that masks upstream issues. Sessions in downstream states (openclaw, agent_setup, policies, finalizing, post_verify) can re-enter the core slice on resume without re-running repair/backstop checks.
  • Required action: Either (a) link a tracking issue with acceptance criteria for removing this fallback, or (b) add a regression test in core-flow-phases.test.ts that would fail when compatibilityWhenState no longer requires the extended downstream state list for resume=true, indicating FSM recovery states now handle ahead-state resume correctly. Resolve in this PR or explicitly justify the accepted risk.
  • Expected follow-up: Fix before merge or get explicit maintainer override.
  • Verification: Inspect core-flow-phases.ts:146-175 comment block and search for any linked issue, tracking label, or test asserting the compatibility bridge is no longer needed.
  • Missing regression test: Test that fails when compatibilityWhenState no longer requires the extended downstream state list ["provider_selection", "inference", "sandbox", "openclaw", "agent_setup", "policies", "finalizing", "post_verify"] for resume=true, indicating FSM recovery states now handle ahead-state resume correctly.
  • Done when: The required change is committed and verification passes: Inspect core-flow-phases.ts:146-175 comment block and search for any linked issue, tracking label, or test asserting the compatibility bridge is no longer needed.
  • Evidence: Comment block at core-flow-phases.ts:146-175 explicitly documents workaround, tolerated downstream family, and intended removal condition but provides no tracking mechanism. Previous review PRA-1/PRA-6 identified this gap; unchanged in current diff.

PRA-3 Resolve/justify — Source-of-truth review needed for compatibility bridge workaround

  • Location: src/lib/onboard/machine/core-flow-phases.ts:146
  • Category: architecture
  • Problem: The localized patch tolerates an invalid state (ahead-state snapshots) but the source boundary (imperative resume work vs strict FSM recovery states) is not fixed in this PR. The comment claims 'Phase tests cover ahead-state resume' but no test would fail when compatibilityWhenState no longer needs extended downstream list.
  • Impact: A localized workaround can preserve or hide an invalid state when the source boundary is unclear. Without a regression test that fails when the workaround becomes unnecessary, there is no automated signal that the source fix is complete.
  • Recommended action: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior. Either fix the source (convert repair/backstop checks to strict FSM recovery states) or document accepted risk with tracking issue.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Missing regression test: MISSING — comment claims 'Phase tests cover ahead-state resume' but no test would fail when compatibilityWhenState no longer needs extended downstream list
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Inspect the localized patch and source-of-truth review fields for a concrete invalid state, source boundary, source-fix constraint, regression test, and removal condition.
  • Evidence: Identified in prior review PRA-1/PRA-6. Unchanged in current diff. No tracking issue linked.

PRA-4 Resolve/justify — Missing integration test for resume path with pre-completed provider_selection through shared factories

  • Location: src/lib/onboard/machine/flow-phases/provider-sandbox.test.ts:1
  • Category: tests
  • Problem: No integration test exercises the resume path where provider_selection step is already complete and the sandbox phase receives the narrowed context through shared factories. The provider→sandbox handoff crosses a trust boundary (GPU config, credentials, sandbox config) that benefits from runtime validation.
  • Impact: A type-only refactor could pass unit tests while a real handler contract mismatch (e.g., missing field in narrowed context during resume) surfaces only in integration or E2E runs. Validation context flags runtime_validation_recommended for changed sandbox/infrastructure paths.
  • Recommended action: Add integration test in provider-sandbox.test.ts for resume path with pre-completed provider_selection step, verifying sandbox phase receives correct narrowed context with all required fields (model, provider, sandboxGpuConfig, hermesToolGateways, preferredInferenceApi, nimContainer, webSearchConfig) through shared factories.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Check provider-sandbox.test.ts for a test that creates a session with provider_selection step complete, runs provider phase (which should skip/return existing context), then runs sandbox phase with resume=true.
  • Missing regression test: Integration test for resume path with pre-completed provider_selection through shared factories, verifying narrowed context flows end-to-end without field loss.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Check provider-sandbox.test.ts for a test that creates a session with provider_selection step complete, runs provider phase (which should skip/return existing context), then runs sandbox phase with resume=true.
  • Evidence: Previous review PRA-4 identified this gap. Current diff adds parameterized rejection test but no resume integration test.

PRA-5 Resolve/justify — Test consolidation between core-flow-phases.test.ts and provider-sandbox.test.ts incomplete

  • Location: src/lib/onboard/machine/core-flow-phases.test.ts:1
  • Category: correctness
  • Problem: Both files test provider→sandbox handoff with overlapping scenarios. provider-sandbox.test.ts has new consolidated test 'passes full context and results through the shared handoff' but core-flow-phases.test.ts retains 'carries provider selection output into sandbox setup' using real deps.
  • Impact: Duplicated test maintenance burden and potential drift between test expectations. Acceptance criterion fix: update all repo references to NVIDIA/NemoClaw #7 (test consolidation) from issue Delete duplicate provider-to-sandbox phase assembly #5938 not fully satisfied.
  • Recommended action: Extract shared test helpers/fixtures for provider/sandbox handoff scenarios into a common test utility module (e.g., provider-sandbox-test-fixtures.ts), or consolidate handoff-specific tests into provider-sandbox.test.ts and keep core-flow-phases.test.ts focused on full sequence orchestration.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Compare test scenarios in core-flow-phases.test.ts (carries provider selection output into sandbox setup) and provider-sandbox.test.ts (passes full context and results through the shared handoff) for overlapping coverage.
  • Missing regression test: Shared test fixtures for provider/sandbox handoff scenarios used by both test files.
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Compare test scenarios in core-flow-phases.test.ts (carries provider selection output into sandbox setup) and provider-sandbox.test.ts (passes full context and results through the shared handoff) for overlapping coverage.
  • Evidence: PR body claims 'Consolidate the shared-factory mapping tests into one full-context handoff test' but both tests exist.

PRA-6 Improvement — PR #3286 overlaps same files - verify no semantic conflict on merge

PRA-7 Resolve/justify — Acceptance criterion #7 (test consolidation) not fully met

  • Location: src/lib/onboard/machine/core-flow-phases.test.ts:1
  • Category: acceptance
  • Problem: Issue Delete duplicate provider-to-sandbox phase assembly #5938 acceptance criteria requires 'Existing phase tests are consolidated around the shared handoff rather than copied.' The PR adds a consolidated test in provider-sandbox.test.ts but retains overlapping handoff test in core-flow-phases.test.ts.
  • Impact: Acceptance criteria from linked issue not fully satisfied. Test consolidation was expected as part of this refactor.
  • Recommended action: Complete test consolidation as part of this PR per acceptance criteria, or explicitly justify why consolidation is deferred.
  • Expected follow-up: Resolve in this PR or explain why the risk is acceptable.
  • Verification: Verify whether the linked issue/acceptance criteria required test consolidation and whether the current state meets that requirement.
  • Missing regression test: N/A - acceptance criteria tracking
  • Done when: The risk is fixed or explicitly justified in the PR. Verification: Verify whether the linked issue/acceptance criteria required test consolidation and whether the current state meets that requirement.
  • Evidence: Issue Delete duplicate provider-to-sandbox phase assembly #5938 acceptance criteria fix: update all repo references to NVIDIA/NemoClaw #7 vs current test file state.

PRA-8 Improvement — Shared factory abstraction improves type safety at trust boundary

  • Location: src/lib/onboard/machine/flow-phases/provider-sandbox.ts:1
  • Category: security
  • Problem: The sandbox phase handler now requires ProviderSelectedOnboardFlowContext which enforces NonNullable<sandboxGpuConfig>. Parameterized test validates rejection when model, provider, or sandboxGpuConfig missing.
  • Impact: Positive security improvement: type system now prevents sandbox phase from executing without required GPU config fields. Good security regression coverage via parameterized test.
  • Suggested action: No action needed. The abstraction is justified and well-scoped.
  • Expected follow-up: Prefer a current-PR fix when local to changed code; defer only with rationale or linked follow-up.
  • Verification: Verify ProviderSelectedOnboardFlowContext type definition in flow-context.ts enforces sandboxGpuConfig non-null.
  • Missing regression test: N/A - already covered by parameterized test in provider-sandbox.test.ts
  • Done when: The local improvement is applied, or the PR notes why it should be deferred.
  • Evidence: flow-context.ts:35-38 defines ProviderSelectedOnboardFlowContext with sandboxGpuConfig: NonNullable<Context["sandboxGpuConfig"]>. provider-sandbox.test.ts:45 parameterized test covers 3 missing-field cases.

PRA-9 Improvement — Parameterized test validates sandbox phase rejects execution when required fields missing

  • Location: src/lib/onboard/machine/flow-phases/provider-sandbox.test.ts:45
  • Category: tests
  • Problem: Parameterized test (it.each) validates sandbox phase rejects execution before model, provider, or sandboxGpuConfig is selected, referencing issue Delete duplicate provider-to-sandbox phase assembly #5938.
  • Impact: Good security regression coverage for trust boundary validation. Ensures sandbox phase cannot execute with incomplete provider-selected context.
  • Suggested action: No action needed. Good security regression coverage.
  • Expected follow-up: Prefer a current-PR fix when local to changed code; defer only with rationale or linked follow-up.
  • Verification: Run test suite and verify parameterized test passes for all three missing field cases.
  • Missing regression test: N/A - already implemented
  • Done when: The local improvement is applied, or the PR notes why it should be deferred.
  • Evidence: provider-sandbox.test.ts:45-70 parameterized test for missing model/provider/sandboxGpuConfig.

PRA-10 Improvement — Missing integration test for resume path through shared provider/sandbox factories in flow sequence

  • Location: src/lib/onboard/machine/flow-sequence.test.ts:1
  • Category: tests
  • Problem: flow-sequence.test.ts integration test updated with more context fields (credentialEnv, fromDockerfile, hermesToolGateways, sandboxGpuConfig sentinel, selectedMessagingChannels) but no test for resume path through shared factories.
  • Impact: Integration test covers fresh flow with expanded context but not resume path through shared provider/sandbox factories.
  • Suggested action: Add integration test for resume path through shared factories in flow-sequence.test.ts or provider-sandbox.test.ts.
  • Expected follow-up: Prefer a current-PR fix when local to changed code; defer only with rationale or linked follow-up.
  • Verification: Check flow-sequence.test.ts for any test with resume=true that exercises providerInference and sandbox phases through buildOnboardFlowPhaseSequence.
  • Missing regression test: Integration test for resume path through shared provider/sandbox factories in flow sequence.
  • Done when: The local improvement is applied, or the PR notes why it should be deferred.
  • Evidence: flow-sequence.test.ts only test with runtime is 'runs ordered provider results through runtime transition validation' which uses fresh context.

PRA-11 Improvement — Monolith growth: core-flow-phases.test.ts approaching size where shared fixtures would improve maintainability

  • Location: src/lib/onboard/machine/core-flow-phases.test.ts:1
  • Category: correctness
  • Problem: core-flow-phases.test.ts grew by 6 lines (from 565 to 571). While not a hotspot, continued growth should be monitored. Extraction of shared fixtures would support test consolidation (PRA-5).
  • Impact: Test file approaching size where extraction of shared fixtures would improve maintainability and reduce duplication.
  • Suggested action: Consider extracting shared test fixtures for provider/sandbox handoff scenarios as part of test consolidation (PRA-5).
  • Expected follow-up: Prefer a current-PR fix when local to changed code; defer only with rationale or linked follow-up.
  • Verification: Monitor file size in future PRs.
  • Missing regression test: N/A
  • Done when: The local improvement is applied, or the PR notes why it should be deferred.
  • Evidence: Drift context shows monolith delta +6 lines, severity warning. Simplification signal flags large_file_hotspot.

Workflow run details

This is an automated, non-binding review; it still expects maintainers and agents to respond to each required or warning item. Treat suggestions as current-PR improvements when they touch changed code; defer only with maintainer rationale or a linked follow-up. A human maintainer must make the final merge decision.

@github-actions

github-actions Bot commented Jun 28, 2026

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: onboard-resume-e2e, onboard-repair-e2e, cloud-onboard-e2e
Optional E2E: double-onboard-e2e

Dispatch hint: onboard-resume-e2e,onboard-repair-e2e,cloud-onboard-e2e

Auto-dispatched E2E: onboard-resume-e2e via nightly-e2e.yaml at 2a3cbf1fd31867f7d9a8353a10fdc8030c425c4cnightly run

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • onboard-resume-e2e (medium): Required by the onboarding resume compatibility rule because non-test source changes touch src/lib/onboard/machine flow orchestration and provider/sandbox state transitions. This validates resuming an interrupted onboarding session through the live flow.
  • onboard-repair-e2e (medium): Required by the onboarding resume compatibility rule because the changed core flow slice can affect resume repair/backstop behavior around provider inference and sandbox setup.
  • cloud-onboard-e2e (high): Required because these provider/sandbox phase and flow-sequence changes can affect the full hosted onboarding user flow, including hosted inference selection, credential propagation, sandbox creation, policy setup, and post-onboard verification.

Optional E2E

  • double-onboard-e2e (medium): Useful adjacent confidence for context preservation and sandbox reuse/recreate behavior across repeated onboarding runs, but the targeted resume, repair, and full hosted onboard lanes are the merge-blocking coverage.

New E2E recommendations

  • None.

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: onboard-resume-e2e,onboard-repair-e2e,cloud-onboard-e2e

@github-actions

github-actions Bot commented Jun 28, 2026

Copy link
Copy Markdown
Contributor

Vitest E2E Scenario Recommendation

Required Vitest E2E scenarios: onboard-resume-vitest, onboard-repair-vitest
Optional Vitest E2E scenarios: None

Dispatch required Vitest E2E scenarios:

  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=onboard-resume-vitest
  • gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=onboard-repair-vitest

Workflow run

Full Vitest E2E advisor summary

Vitest E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required Vitest E2E scenarios

  • onboard-resume-vitest: Changes touch src/lib/onboard/machine provider/sandbox phase handoff and ordered onboarding flow sequencing. The onboarding resume compatibility rule requires the wired onboard-resume Vitest job for state-machine transition and resume-path changes.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=onboard-resume-vitest
  • onboard-repair-vitest: The provider/sandbox phase and core flow sequence changes can affect persisted-session repair/backstop execution during resume, so the onboarding resume compatibility rule requires the wired onboard-repair Vitest job as well.
    • Dispatch: gh workflow run e2e-vitest-scenarios.yaml --ref <pr-head-ref> --field jobs=onboard-repair-vitest

Optional Vitest E2E scenarios

  • None.

Relevant changed files

  • src/lib/onboard/machine/core-flow-phases.test.ts
  • src/lib/onboard/machine/core-flow-phases.ts
  • src/lib/onboard/machine/flow-phases/provider-sandbox.test.ts
  • src/lib/onboard/machine/flow-phases/provider-sandbox.ts
  • src/lib/onboard/machine/flow-sequence.test.ts
  • src/lib/onboard/machine/flow-sequence.ts

@github-actions

github-actions Bot commented Jun 28, 2026

Copy link
Copy Markdown
Contributor

PR Review Advisor — No blocking findings

Merge posture: No blocking advisor findings
Primary next action: Add or justify PRA-T1 and any related test follow-ups.
Open items: 0 required · 0 warnings · 0 suggestions · 6 test follow-ups
Since last review: 1 prior item resolved · 0 still apply · 0 new items found

Action checklist

  • PRA-T1 Add or justify test follow-up: Runtime validation
  • PRA-T2 Add or justify test follow-up: Runtime validation
  • PRA-T3 Add or justify test follow-up: Runtime validation
  • PRA-T4 Add or justify test follow-up: Runtime validation
  • PRA-T5 Add or justify test follow-up: Acceptance clause
  • PRA-T6 Add or justify test follow-up: Acceptance clause
Test follow-ups to resolve or justify

If these cover changed behavior, prefer adding them in this PR; otherwise state why existing coverage is enough or link the follow-up.

  • PRA-T1 Runtime validation — Run or identify the targeted unit coverage for `core-flow-phases.test.ts` proving provider output flows into sandbox setup for fresh and completed-provider resume paths.. The changed source is security-adjacent onboarding/inference/sandbox phase assembly. Static coverage is targeted and behavior-specific, but this read-only review did not execute the tests, CLI typecheck, or hooks.
  • PRA-T2 Runtime validation — Run or identify the targeted unit coverage for `flow-phases/provider-sandbox.test.ts` proving full-context handoff preserves Dockerfile path, credential env name, messaging channels, and sandbox GPU config while rejecting missing model/provider/GPU config before sandbox setup.. The changed source is security-adjacent onboarding/inference/sandbox phase assembly. Static coverage is targeted and behavior-specific, but this read-only review did not execute the tests, CLI typecheck, or hooks.
  • PRA-T3 Runtime validation — Run or identify the targeted unit coverage for `flow-sequence.test.ts` proving ordered provider transition validation preserves sentinel context through the generic full-sequence path.. The changed source is security-adjacent onboarding/inference/sandbox phase assembly. Static coverage is targeted and behavior-specific, but this read-only review did not execute the tests, CLI typecheck, or hooks.
  • PRA-T4 Runtime validation — Run or identify CLI typecheck and hook evidence for the changed TypeScript type contracts.. The changed source is security-adjacent onboarding/inference/sandbox phase assembly. Static coverage is targeted and behavior-specific, but this read-only review did not execute the tests, CLI typecheck, or hooks.
  • PRA-T5 Acceptance clause — Targeted tests, CLI typecheck, and hooks pass. — add test evidence or identify existing coverage. Targeted changed tests are present and behavior-specific, but this read-only review did not execute tests, CLI typecheck, or hooks and does not report external CI status.
  • PRA-T6 Acceptance clause — I searched existing issues and this is not a duplicate — add test evidence or identify existing coverage. This checklist item is an author assertion in issue text; the read-only code review did not independently verify issue-search history.

Workflow run details

This is an automated, non-binding review; it still expects maintainers and agents to respond to each required or warning item. Treat suggestions as current-PR improvements when they touch changed code; defer only with maintainer rationale or a linked follow-up. A human maintainer must make the final merge decision.

@github-actions

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 28332122143
Target ref: c9599aad27dee7fe0d406c10b1e99bbfa07a1ea2
Workflow ref: main
Requested jobs: onboard-resume-e2e
Summary: 1 passed, 0 failed, 0 cancelled, 0 skipped

Job Result
onboard-resume-e2e ✅ success

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@cv

cv commented Jun 28, 2026

Copy link
Copy Markdown
Collaborator Author

Review follow-up for the Nemotron findings:

  • PRA-1 / PRA-T1 / PRA-T3: addressed in 5b75f5573. The core phase test now runs the production createCoreOnboardFlowPhases provider handler and passes its complete result context directly into the production sandbox handler, asserting that provider/model, Hermes gateways, inference API, NIM container, messaging channels, and sandbox output survive the handoff. This uses the real handler signatures with injected external dependencies.
  • PRA-2 / PRA-T4: addressed without adding fixture bookkeeping: I merged the formerly separate provider-output and sandbox-input tests into that single chained behavior test. The follow-up removes 7 net test lines and one overlapping test case.
  • PRA-T2: the existing resume test still exercises a completed provider_selection step through the real provider handler and verifies normalized resume context. The advisor also auto-dispatched the focused resume E2E; I am leaving broader cloud/repair dispatch to the normal maintainer E2E gate because those jobs mutate external infrastructure.

Verification after the follow-up: 4 targeted files / 29 tests passed, npm run typecheck:cli passed, and the complete CLI pre-commit coverage hook plus pre-push typecheck passed. The PR is now 50 lines smaller overall.

@github-actions

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 28332394125
Target ref: 5b75f55737427482a3c91ecc64e08c74827ddfdc
Workflow ref: main
Requested jobs: onboard-resume-e2e
Summary: 1 passed, 0 failed, 0 cancelled, 0 skipped

Job Result
onboard-resume-e2e ✅ success

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@cv

cv commented Jun 28, 2026

Copy link
Copy Markdown
Collaborator Author

Final review follow-up:

  • PRA-2/PRA-3/PRA-5 and GPT PRA-1: resolved in 39db4bdae. The chained production-handler test now explicitly proves credential env-name, endpoint, Dockerfile, host GPU, sandbox GPU config, GPU passthrough, Hermes gateways, inference API, NIM container, and messaging context survive provider→sandbox handoff. The completed-provider_selection resume test now continues through the sandbox handler. One parameterized shared-factory test rejects missing provider, model, and sandbox GPU config before invoking the handler.
  • PRA-4: the two files test different layers: provider-sandbox.test.ts owns the generic shared-factory contract; core-flow-phases.test.ts owns live composition of the real provider and sandbox handlers with injected external dependencies. The previously overlapping core tests were merged into one chained test. Extracting another shared fixture would add the bookkeeping this refactor is intended to remove.
  • PRA-1/PRA-6: runCoreOnboardFlowSlice and its compatibility bridge are unchanged from main; this PR only replaces the two phase objects constructed above it. Existing tests cover resume compatibility at provider_selection, downstream policies/finalizing/post_verify, terminal rejection, and non-resume ahead-state behavior. Redesigning that pre-existing bridge is explicitly outside Delete duplicate provider-to-sandbox phase assembly #5938's bounded handoff slice.
  • PRA-7: I inspected open PR fix: support reasoning compatible endpoints (Fixes #3279) #3286. Its overlap adds compatibleEndpointReasoning to the old explicit provider-result copies. With this refactor, the generic sequence carries that full context automatically; the core adapter will retain one mechanical merge-field addition when fix: support reasoning compatible endpoints (Fixes #3279) #3286 rebases. There is no provider/model/GPU contract conflict.

Post-follow-up verification: 4 files / 31 tests passed, CLI typecheck passed, and complete commit/push hooks passed. The PR remains 23 lines smaller overall and 42 production lines smaller.

Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@cv

cv commented Jun 28, 2026

Copy link
Copy Markdown
Collaborator Author

Additional consolidation follow-up in 7292a518b:

  • The generic buildOnboardFlowPhaseSequence path now proves credential, Dockerfile, GPU, sandbox GPU, GPU passthrough, Hermes gateway, and messaging context survive the complete provider→sandbox→finalization sequence.
  • The two positive shared-factory mapping tests are now one chained full-context handoff test. It asserts the sandbox callback receives the narrowed provider/model/GPU context and prior credential, Dockerfile, and messaging fields.
  • The production-core test remains intentionally separate because it exercises the real provider and sandbox handlers; the shared-factory test exercises the generic phase contract. No fixture/helper layer was introduced.

The compatibility bridge finding remains unrelated to this diff: runCoreOnboardFlowSlice is byte-for-byte behaviorally unchanged from main, its existing ahead-state/terminal tests passed, and #5938 explicitly excludes redesigning the full state machine. Creating a tracking abstraction or test that predicts when unrelated legacy recovery debt becomes removable would expand this bounded deletion slice and add the bookkeeping it is meant to avoid.

Verification: 4 files / 30 tests passed, CLI typecheck passed, and complete commit/push hooks passed. The final diff remains 7 lines smaller overall and 42 production lines smaller.

@github-actions

Copy link
Copy Markdown
Contributor

Selective E2E Results — ⚠️ Run cancelled — no signal

Run: 28332687622
Target ref: 39db4bdae2d82524f9466c5f8bef7da42a57e670
Workflow ref: main
Requested jobs: onboard-resume-e2e
Summary: 0 passed, 0 failed, 1 cancelled, 0 skipped

Job Result
onboard-resume-e2e ⚠️ cancelled

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/onboard/machine/flow-sequence.test.ts`:
- Line 254: The test is asserting the default sandboxGpuConfig value already
seeded by context(), so it does not verify the provider-to-sandbox handoff in
flow-sequence.test.ts. Update the relevant test setup in the flow sequence
scenario to override the fixture with a distinct sentinel sandboxGpuConfig
value, then assert that exact value through the public boundary. Use the
existing context() helper and the provider/sandbox handoff path in this test to
ensure the observed config comes from the handler output, not the initial
fixture.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f921774a-f0ee-4272-8628-d4859f03c7a5

📥 Commits

Reviewing files that changed from the base of the PR and between 39db4bd and 7292a51.

📒 Files selected for processing (2)
  • src/lib/onboard/machine/flow-phases/provider-sandbox.test.ts
  • src/lib/onboard/machine/flow-sequence.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/lib/onboard/machine/flow-phases/provider-sandbox.test.ts

Comment thread src/lib/onboard/machine/flow-sequence.test.ts Outdated
Signed-off-by: Carlos Villela <cvillela@nvidia.com>
@github-actions

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 28332841685
Target ref: 7292a518b90fe3db461938e963ce393602e6d394
Workflow ref: main
Requested jobs: onboard-resume-e2e
Summary: 1 passed, 0 failed, 0 cancelled, 0 skipped

Job Result
onboard-resume-e2e ✅ success

@github-actions

Copy link
Copy Markdown
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 28333089681
Target ref: 2a3cbf1fd31867f7d9a8353a10fdc8030c425c4c
Workflow ref: main
Requested jobs: onboard-resume-e2e
Summary: 1 passed, 0 failed, 0 cancelled, 0 skipped

Job Result
onboard-resume-e2e ✅ success

@cv cv merged commit a65850a into main Jun 28, 2026
50 checks passed
@cv cv deleted the codex/5938-provider-sandbox-handoff branch June 28, 2026 20:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

v0.0.70 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Delete duplicate provider-to-sandbox phase assembly

1 participant