Skip to content

fix: resolve follow-up issues from CodeQL sandbox recipe refactor (#50)#51

Merged
pruiz merged 2 commits into
masterfrom
fix/issue-50-codeql-sandbox-recipe-follow-ups
Jun 9, 2026
Merged

fix: resolve follow-up issues from CodeQL sandbox recipe refactor (#50)#51
pruiz merged 2 commits into
masterfrom
fix/issue-50-codeql-sandbox-recipe-follow-ups

Conversation

@pruiz

@pruiz pruiz commented Jun 9, 2026

Copy link
Copy Markdown
Owner

Summary

Resolves the 19 follow-up issues identified in #50 after the CodeQL sandbox recipe refactor (PR #45), plus addresses all 8 PR review comments from Copilot, CodeRabbit, and Greptile.


Original fixes (commit 107e1b2)

Phase ownership fixes:

  • Swapped 1b/1c graceful completion artifact sets in completion.py
  • Same swap fixed in codecome.py phase status reporting
  • Renamed REQUIRED_NOTES_1BPHASE_1C_REQUIRED_NOTES

Prompt fixes:

  • phase-1b-sandbox.md: removed Phase 1c artifacts from required reading
  • phase-1c-recon.md: corrected run summary path and old CodeQL path

CodeQL health model fixes:

  • Expanded COMPILED_LANGUAGES, injected total_alerts, made completed-partial reachable, switched unknown sentinel to None

Crash fixes:

  • Added missing _validate_codeql_plan_for_repair function
  • Fixed undefined progress reference in write_summary

Sandbox recipe validation:

  • Added sandbox-level docker checks + build target environment/service checks
  • Wired recipe validation into sandbox-validate

Stale docs: Makefile, docstring


Review feedback fixes (commit eee7c6d)

Reviewer Comment Resolution
Greptile _inject_total_alerts silently discards exceptions ✅ Now appends warning to manifest["warnings"]
Copilot build_command treated as filesystem path (rejects shell commands) ✅ Added _command_looks_path_like() helper; only path-validates when first token starts with ./ or /
Copilot Phase 1b progress checks only sandbox-plan.md, missing sandbox-recipe.yml ✅ Now requires both files
Copilot + CodeRabbit _collect_plan_errors() hard-fails unsupported languages even under fail_policy=soft ✅ Now reads fail_policy; only errors under hard
Copilot _collect_plan_errors() hard-fails empty languages:[] under soft ✅ Gated behind fail_policy=hard
Copilot _collect_plan_errors() treats unexpected confidence as error (gate only warns) ✅ Removed confidence check entirely
CodeRabbit Recipe validation doesn't affect sandbox-validate exit status recipe_errors now sets overall_outcome = "failed"
CodeRabbit nitpick write_summary re-parses alerts.yml instead of using manifest ✅ Now reads manifest["total_alerts"] first
CodeRabbit Move recipe validation to package helper ⏸️ Deferred — architectural cleanup, out of scope for bug-fix PR
CodeRabbit Move phase progress to package helper ⏸️ Deferred — tools/codecome.py already has many non-thin functions

Tests

775/775 pass.

- Fix Phase 1b/1c graceful completion artifact set swap
- Rename REQUIRED_NOTES_1B to PHASE_1C_REQUIRED_NOTES
- Fix prompts: 1b reading list, 1c summary path, old CodeQL paths
- Fix Makefile and docstring stale phase references
- Add missing _validate_codeql_plan_for_repair function
- Expand COMPILED_LANGUAGES to include csharp/java-kotlin
- Inject total_alerts into manifest before health computation
- Fix write_summary undefined progress reference
- Strengthen sandbox-recipe.yml validation with service/environment checks
- Wire recipe validation into sandbox-validate
- Use None instead of -1 for unknown extractor sentinel
- Make completed-partial health classification reachable
@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

This PR restructures Phase 1 of a CodeCome workflow by redefining subphase artifact ownership: Phase 1b now gates on sandbox bootstrap artifacts, Phase 1c gates on reconnaissance notes. It introduces CodeQL plan repair validation, improves CodeQL health classification with optional integer handling, expands sandbox recipe validation with docker-specific rules, and updates all related tests and documentation to reflect the new structure.

Changes

Phase 1 Subphase Ownership and CodeQL Validation

Layer / File(s) Summary
Phase 1 subphase artifact ownership definition
tools/phases/phase_1_gates.py, tools/phases/completion.py
Introduces PHASE_1C_REQUIRED_NOTES constant and restructures check_phase_graceful_completion() to clearly separate artifact ownership: Phase 1b freshness gates on sandbox-plan/sandbox-recipe/CODECOME-GENERATED; Phase 1c freshness gates on recon notes (attack-surface, threat-model, execution-model, validation-model).
Phase 1c required notes constant and validation
tools/phases/phase_1_gates.py, tests/test_phase_1_gates_threat_model.py
Adds PHASE_1C_REQUIRED_NOTES constant defining reconnaissance note filenames and updates check_phase_1c() to validate missing notes against this constant; test fixtures now iterate over the constant to create required note files.
Phase 1 completion and freshness tests
tests/test_phase_graceful_completion_subphases.py, tests/test_phases_completion.py
Updates graceful-completion tests to verify new artifact ownership: Phase 1b success with sandbox-plan/CODECOME-GENERATED, Phase 1c success with recon notes; failure tests now expect appropriate missing-artifact diagnostics per subphase; acceptance tests patch paths to use itemdb/notes directory structure.
Phase 1 progress reporting and orchestration updates
tools/codecome.py, tools/codecome/phase_1.py
Updates check_phase_progress() to report Phase 1b status from sandbox-plan.md presence and Phase 1c status from missing recon notes count; updates module docstring to reflect CodeQL analysis occurring between 1b and 1c.
CodeQL plan repair validation helpers
tools/codecome/phase_1.py
Adds _validate_codeql_plan_for_repair() and _collect_plan_errors() to validate Phase 1c's codeql-plan.yml under repair mode: enforces unit structure, ID uniqueness, path constraints (under src/), file existence, language/build-mode support, confidence values, and manual-build requirements.
CodeQL health classification with optional integer handling
tools/codeql/health.py
Refactors health classification to handle unknown numeric counts as None instead of sentinels; expands COMPILED_LANGUAGES to include csharp and java-kotlin; distinguishes unknown vs. zero extractor success counts in failure classification; treats unresolvable alert counts as completed-partial.
CodeQL pipeline total alerts injection
tools/codeql/pipeline.py, tools/codeql/runner.py
Adds _inject_total_alerts() helper to read and inject alerts.yml alert count into manifest; wires injection into run_full_pipeline() before health computation; updates write_summary() to report alerts.yml read failures explicitly.
Sandbox recipe docker-specific validation
tools/sandbox/recipe.py
Extends validate_recipe() with docker-model rules requiring compose_file/default_service; updates _validate_build_targets() to accept validation_model and enforce per-model service/environment requirements; adds _validate_target_environment() for type/compose_file/service validation.
Sandbox recipe validation integration and test fixtures
tools/sandbox-bootstrap.py, tests/test_sandbox_recipe.py
Integrates recipe validation into cmd_validate() to capture recipe errors and report in output/text mode; expands test fixture to create sandbox/scripts, docker-compose.yml, and script files for recipe validation tests.
Documentation and reference updates
Makefile, prompts/phase-1b-sandbox.md, prompts/phase-1c-recon.md
Updates Makefile sandbox hints to reference Phase 1b; updates phase-1b sandbox prompts to require sandbox-recipe.yml.example; updates phase-1c prompts to resolve CodeQL run_id from last-run-manifest.yml and use per-run signal paths.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

  • pruiz/CodeCome#50: This PR directly implements the Phase 1 gate name changes, artifact ownership clarification, and sandbox-recipe validation improvements requested in that issue.

Possibly related PRs

  • pruiz/CodeCome#40: Both PRs update Phase 1b/1c subphase freshness gating in tools/phases/completion.py and corresponding test expectations.
  • pruiz/CodeCome#42: Both PRs expand CodeQL plan repair validation in tools/codecome/phase_1.py with schema constraints and per-language build_mode semantics.
  • pruiz/CodeCome#34: Both PRs restructure Phase 1 reconnaissance notes requirements and update corresponding gate/test validation in tools/phases/phase_1_gates.py.

Poem

🐰 Hopping through phases with a tidy spin,
1b readies sandboxes, scripts tucked within,
1c gathers recon notes, neat and clear,
CodeQL counts signal when numbers aren't near,
Docker recipes checked — the path is akin.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 38.24% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: resolving follow-up issues from a CodeQL sandbox recipe refactor, which is exactly what the changeset demonstrates across 13 files.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/issue-50-codeql-sandbox-recipe-follow-ups

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Jun 9, 2026

Copy link
Copy Markdown

Coverage Report

Metric Value
Line Coverage 75.6%
Lines Covered 0 / 0

Download detailed HTML coverage reports per OS/Python from the workflow artifacts.

Generated by pytest-cov on 2026-06-09T23:01:46.932Z

@greptile-apps

greptile-apps Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR resolves 19 follow-up issues from the CodeQL sandbox recipe refactor (#50), primarily correcting swapped phase-ownership assignments and a number of related crash and classification bugs.

  • Phase ownership fix: Phase 1b graceful-completion checks now gate on sandbox bootstrap artifacts (sandbox-plan.md, sandbox-recipe.yml, CODECOME-GENERATED.md), and Phase 1c gates on recon notes. The rename REQUIRED_NOTES_1BPHASE_1C_REQUIRED_NOTES propagates consistently across phase_1_gates.py, completion.py, codecome.py, and 8 test files.
  • CodeQL health model: Unknown extractor counts now return None instead of -1, making the completed-partial classification reachable; COMPILED_LANGUAGES gains csharp and java-kotlin; total_alerts is injected into the manifest before compute_health() runs.
  • New validation logic: sandbox/recipe.py adds docker-specific checks for compose_file, default_service, service, environment, and build_command path existence; sandbox-bootstrap.py wires this into sandbox-validate output.

Confidence Score: 4/5

Safe to merge after fixing the build_command path check in recipe validation.

The newly added build_command path existence check in _validate_build_targets passes the full command string (including arguments) to Path(root) / build_command instead of extracting just the first token the way _command_looks_path_like does. Any path-like build command that includes arguments will be resolved to a path that can never exist, causing false-positive validation errors. All other changes in the PR are well-tested and look correct.

tools/sandbox/recipe.py — the build_command path resolution in _validate_build_targets

Important Files Changed

Filename Overview
tools/sandbox/recipe.py Adds docker-specific sandbox and build-target validation; contains a bug where path existence for path-like build_command strings uses the full argument string instead of just the first token, generating false-positive errors.
tools/phases/completion.py Core phase-ownership swap: Phase 1b now checks sandbox bootstrap artifacts, Phase 1c checks recon notes. Logic and test coverage look correct.
tools/codeql/health.py Unknown extractor sentinel changed from -1 to None, completed-partial path now reachable, COMPILED_LANGUAGES expanded; all changes look consistent.
tools/codeql/pipeline.py Adds _inject_total_alerts to populate manifest before health computation; exception now surfaced as a warning in manifest rather than being silently discarded.
tools/codecome/phase_1.py Adds the previously-missing _validate_codeql_plan_for_repair and _collect_plan_errors functions; notes_dir parameter in _collect_plan_errors is accepted but unused internally.
tools/sandbox-bootstrap.py Wires recipe validation into sandbox-validate; overall_outcome initialized to 'passed' before the loop, so the new recipe_errors override path is safe.
tools/codeql/runner.py write_summary updated to prefer manifest-injected total_alerts and fixes undefined progress reference in error handler.
tools/phases/phase_1_gates.py Renames REQUIRED_NOTES_1B to PHASE_1C_REQUIRED_NOTES and updates all internal usages; straightforward rename, no logic changes.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Phase 1a] -->|target-profile.md\nbuild-model.md\ncodeql-plan.yml| B[CodeQL Analysis]
    B --> C[Phase 1b\nSandbox Bootstrap]
    C -->|sandbox-plan.md\nsandbox-recipe.yml\nCODECOME-GENERATED.md| D[Phase 1c\nDetailed Recon]
    D -->|attack-surface.md\nthreat-model.md\nfile-risk-index.yml\n...| E[Phase 2]
    C --> V[sandbox-validate]
    V --> RV{Recipe Valid?}
    RV -->|Yes| VS[overall: passed]
    RV -->|No| VF[overall: failed]
    B --> H[compute_health]
    H --> H1{total_alerts in manifest?}
    H1 -->|None| CP[completed-partial]
    H1 -->|0| CEV[completed-empty-valid]
    H1 -->|gt 0| CC[completed]
Loading

Reviews (2): Last reviewed commit: "fix: address PR review feedback — plan v..." | Re-trigger Greptile

Comment thread tools/codeql/pipeline.py Outdated

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses follow-up correctness gaps after the CodeQL + sandbox-recipe refactor by realigning Phase 1b/1c artifact ownership, improving sandbox recipe validation and reporting, and tightening CodeQL run-health classification and manifest contents so downstream consumers can reliably interpret results.

Changes:

  • Correct Phase 1b vs 1c artifact ownership in gates, graceful-completion logic, phase progress reporting, and prompts.
  • Expand sandbox recipe validation (docker compose/service checks, build-target validation) and surface validation results in sandbox-bootstrap validate.
  • Improve CodeQL health/summary correctness by injecting total_alerts, making completed-partial reachable, and refining compiled-language extraction checks.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tools/sandbox/recipe.py Adds docker-specific recipe validation and deeper build target checks.
tools/sandbox-bootstrap.py Wires recipe validation into validate output and JSON payload.
tools/phases/phase_1_gates.py Renames Phase 1c required recon notes constant and updates gate logic usage.
tools/phases/completion.py Realigns Phase 1 artifact ownership and graceful-completion checks per subphase.
tools/codeql/runner.py Fixes write_summary error handling for alerts count rendering.
tools/codeql/pipeline.py Injects total_alerts from normalized output before health computation.
tools/codeql/health.py Expands compiled language set and improves classification for unknown extractor/alert counts.
tools/codecome/phase_1.py Adds missing check-codeql-plan validation implementation and plan error collection helper.
tools/codecome.py Updates phase progress reporting for new Phase 1b/1c semantics.
tests/test_sandbox_recipe.py Updates sandbox recipe tests to satisfy new validation expectations.
tests/test_phases_completion.py Updates Phase 1 completion tests for new notes root layout and ownership.
tests/test_phase_graceful_completion_subphases.py Updates subphase graceful-completion tests for new artifact ownership.
tests/test_phase_1_gates_threat_model.py Updates tests to use renamed Phase 1c notes constant.
prompts/phase-1c-recon.md Fixes CodeQL signals path resolution via last-run-manifest.yml + run id.
prompts/phase-1b-sandbox.md Removes Phase 1c artifacts from Phase 1b required reading; points to recipe template.
Makefile Updates sandbox bootstrap phase references from 1c → 1b.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tools/sandbox/recipe.py Outdated
Comment thread tools/codecome.py
Comment thread tools/codecome/phase_1.py
Comment thread tools/codecome/phase_1.py Outdated
Comment thread tools/codecome/phase_1.py

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
tools/codecome.py (1)

356-395: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Move phase progress implementation out of tools/codecome.py and delegate from the wrapper.

This changed logic keeps business implementation in a root tools/*.py script, which should remain a thin delegator to a package module.

♻️ Suggested direction
- def check_phase_progress() -> None:
-     ...
+ from codecome.status import check_phase_progress
# tools/codecome/status.py
def check_phase_progress(root: Path, yaml_module) -> None:
    ...

As per coding guidelines, “Standalone scripts at the tools/ root ... should be thin wrappers that delegate to their respective packages. Implementation must live in the package, not the script.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/codecome.py` around lines 356 - 395, The phase-progress logic in
tools/codecome.py (the block using ROOT, notes_dir, evidence_root,
count_findings(), PHASE_1C_REQUIRED_NOTES, manifest_path and constructing rows
for "Phase 1a"/"CodeQL"/"Phase 1b"/"Phase 1c") should be moved into a package
helper function (e.g. tools/codecome/status.py:def check_phase_progress(root:
Path, yaml_module) -> list[tuple[str,str,str]]). Implement check_phase_progress
to accept the root Path and yaml module, perform the same checks
(count_findings(), presence checks for
target-profile.md/build-model.md/codeql-plan.yml, sandbox-plan.md, read
run-manifest.yml with safe_load and status handling, evaluate
PHASE_1C_REQUIRED_NOTES) and return the rows list; then replace the moved code
in tools/codecome.py with a thin delegator that calls check_phase_progress(ROOT,
yaml) and uses its returned rows. Ensure you reference PHASE_1C_REQUIRED_NOTES
inside the new module by importing it and keep the same status/level mapping
logic.

Source: Coding guidelines

tools/sandbox-bootstrap.py (1)

1521-1571: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Recipe validation failures currently do not fail validate exit status.

recipe_errors only affects output fields; if tiers pass, the command still returns success. That can greenlight an invalid recipe even though downstream CodeQL flow consumes recipe service/compose settings.

Suggested fix
     recipe_errors: list[str] = []
     if SANDBOX_RECIPE_PATH.exists():
         try:
             from sandbox.recipe import load_recipe, validate_recipe
             recipe = load_recipe(SANDBOX_RECIPE_PATH)
             recipe_errors = validate_recipe(recipe, root=ROOT)
         except Exception as exc:
             recipe_errors = [f"Failed to load recipe: {exc}"]
     elif not args.scripts_only:
         recipe_errors = ["sandbox-recipe.yml not found; run Phase 1b to generate it"]
+
+    if recipe_errors:
+        overall_outcome = "failed"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/sandbox-bootstrap.py` around lines 1521 - 1571, The recipe validation
errors are only used for output but don't affect the exit status; after
validating and building recipe_errors (the block that computes recipe_errors
from SANDBOX_RECIPE_PATH / args.scripts_only), if recipe_errors is non-empty set
overall_outcome = "failed" (or set an explicit exit_code variable to non-zero)
and ensure payload["recipe_valid"] is False before emitting/printing; keep the
final return logic (return 0 if overall_outcome == "passed" else 1) so the
validate command fails when recipe_errors exist.
🧹 Nitpick comments (1)
tools/codeql/runner.py (1)

660-671: 💤 Low value

Error handling added for alerts.yml read failures.

The broad exception handler is acceptable here for summary generation resilience. The user gets explicit feedback rather than a crash or missing information.

Optional: Consider using manifest["total_alerts"] instead of re-reading.

write_summary() currently re-parses alerts.yml to count alerts, duplicating work already done by _inject_total_alerts() in the pipeline. Using the manifest field would be more efficient and ensure consistency with health classification:

total_alerts = manifest.get("total_alerts")
if total_alerts is not None:
    lines.append(f"- **Total alerts**: {total_alerts}")
    lines.append("")
elif (normalized_dir / "alerts.yml").is_file():
    lines.append("- **Total alerts**: (error reading alerts.yml)")
    lines.append("")

This change is out of scope for the current PR but could be considered for future cleanup.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/codeql/runner.py` around lines 660 - 671, The current write_summary()
block re-parses alerts.yml using alerts_path and load_yaml_mapping and catches
Exception; instead, use the already-populated manifest entry populated by
_inject_total_alerts: read total_alerts = manifest.get("total_alerts") inside
write_summary(), and if total_alerts is not None append the lines with that
value; only fall back to checking alerts_path.is_file() to append the “(error
reading alerts.yml)” message if manifest lacks the key. Update references to
normalized_dir/alerts.yml and remove the duplicated load_yaml_mapping path to
avoid re-parsing and keep behavior identical when manifest has the count.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tools/codecome/phase_1.py`:
- Around line 224-228: _summary: _collect_plan_errors() currently hard-fails
unsupported languages and empty language lists causing check-codeql-plan to
report failures that check_phase_1a() only blocks under fail_policy=hard._
Modify _collect_plan_errors to accept a fail_policy (or soft_fail boolean)
parameter and stop treating unsupported languages / empty analysis_units as hard
errors; instead emit non-blocking notices or warnings when fail_policy is soft,
and preserve hard errors when fail_policy is hard; update callers (e.g.,
check_phase_1a and any invocation of _collect_plan_errors) to pass the correct
fail_policy so behavior matches Gate 1a semantics and ensure the same change is
applied to the other similar blocks referenced (the checks around analysis_units
and language lists).

In `@tools/sandbox-bootstrap.py`:
- Around line 1521-1531: The recipe-loading and validation logic in
tools/sandbox-bootstrap.py (the block using SANDBOX_RECIPE_PATH, load_recipe and
validate_recipe and assigning recipe_errors) should be moved into a package
helper (e.g., a new function like
sandbox.recipe_helper.load_and_validate_recipe) and the script should call that
helper instead; implement a function in the sandbox package that accepts the
path, ROOT and args.scripts_only (or equivalent) and returns the recipe_errors
list (including catching exceptions and returning the failure message), then
replace the inline logic in tools/sandbox-bootstrap.py with a single call to
that helper and use its returned recipe_errors.

---

Outside diff comments:
In `@tools/codecome.py`:
- Around line 356-395: The phase-progress logic in tools/codecome.py (the block
using ROOT, notes_dir, evidence_root, count_findings(), PHASE_1C_REQUIRED_NOTES,
manifest_path and constructing rows for "Phase 1a"/"CodeQL"/"Phase 1b"/"Phase
1c") should be moved into a package helper function (e.g.
tools/codecome/status.py:def check_phase_progress(root: Path, yaml_module) ->
list[tuple[str,str,str]]). Implement check_phase_progress to accept the root
Path and yaml module, perform the same checks (count_findings(), presence checks
for target-profile.md/build-model.md/codeql-plan.yml, sandbox-plan.md, read
run-manifest.yml with safe_load and status handling, evaluate
PHASE_1C_REQUIRED_NOTES) and return the rows list; then replace the moved code
in tools/codecome.py with a thin delegator that calls check_phase_progress(ROOT,
yaml) and uses its returned rows. Ensure you reference PHASE_1C_REQUIRED_NOTES
inside the new module by importing it and keep the same status/level mapping
logic.

In `@tools/sandbox-bootstrap.py`:
- Around line 1521-1571: The recipe validation errors are only used for output
but don't affect the exit status; after validating and building recipe_errors
(the block that computes recipe_errors from SANDBOX_RECIPE_PATH /
args.scripts_only), if recipe_errors is non-empty set overall_outcome = "failed"
(or set an explicit exit_code variable to non-zero) and ensure
payload["recipe_valid"] is False before emitting/printing; keep the final return
logic (return 0 if overall_outcome == "passed" else 1) so the validate command
fails when recipe_errors exist.

---

Nitpick comments:
In `@tools/codeql/runner.py`:
- Around line 660-671: The current write_summary() block re-parses alerts.yml
using alerts_path and load_yaml_mapping and catches Exception; instead, use the
already-populated manifest entry populated by _inject_total_alerts: read
total_alerts = manifest.get("total_alerts") inside write_summary(), and if
total_alerts is not None append the lines with that value; only fall back to
checking alerts_path.is_file() to append the “(error reading alerts.yml)”
message if manifest lacks the key. Update references to
normalized_dir/alerts.yml and remove the duplicated load_yaml_mapping path to
avoid re-parsing and keep behavior identical when manifest has the count.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: b08ac110-238c-4f4f-8284-72b0e9d967c6

📥 Commits

Reviewing files that changed from the base of the PR and between 66654f1 and 107e1b2.

📒 Files selected for processing (16)
  • Makefile
  • prompts/phase-1b-sandbox.md
  • prompts/phase-1c-recon.md
  • tests/test_phase_1_gates_threat_model.py
  • tests/test_phase_graceful_completion_subphases.py
  • tests/test_phases_completion.py
  • tests/test_sandbox_recipe.py
  • tools/codecome.py
  • tools/codecome/phase_1.py
  • tools/codeql/health.py
  • tools/codeql/pipeline.py
  • tools/codeql/runner.py
  • tools/phases/completion.py
  • tools/phases/phase_1_gates.py
  • tools/sandbox-bootstrap.py
  • tools/sandbox/recipe.py

Comment thread tools/codecome/phase_1.py
Comment thread tools/sandbox-bootstrap.py
…, alerts injection

- _collect_plan_errors() now respects fail_policy (soft vs hard)
  for unsupported languages and empty language lists, matching Gate 1a
- Removed confidence check from _collect_plan_errors (gate warns only)
- Recipe validation failures now fail sandbox-validate exit status
- _inject_total_alerts() appends warning instead of silently passing
- build_command validation only path-checks when first token looks
  like a filesystem path (./ or /), allowing shell commands
- Phase 1b progress check now requires both sandbox-plan.md and recipe
- write_summary() uses manifest total_alerts instead of re-parsing

@pruiz pruiz left a comment

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review follow-up on the three out-of-diff items from coderabbit:

1. tools/codecome.py phase progress → Deferred

Moving check_phase_progress() into a package module is an architectural cleanup (thin-wrapper guideline). tools/codecome.py already contains many non-thin functions (check_codeql_status, check_sandbox_status, etc.). Extracting just this one function is a good idea but out of scope for this bug-fix PR. Will address in a future refactor.

2. tools/sandbox-bootstrap.py recipe validation exit status → ✅ Addressed in eee7c6d

Recipe validation errors now set overall_outcome = "failed", which flows through the existing exit gate: return 0 if overall_outcome == "passed" else 1. The JSON payload already exposes recipe_valid: false and recipe_errors.

3. tools/codeql/runner.py write_summary use manifest total_alerts → ✅ Addressed in eee7c6d

write_summary() now reads manifest.get("total_alerts") first, and only falls back to re-parsing alerts.yml when the manifest lacks the key. This avoids the duplicated parse and keeps the summary consistent with the health classification.

@pruiz pruiz merged commit bc37225 into master Jun 9, 2026
7 of 8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants