fix: resolve follow-up issues from CodeQL sandbox recipe refactor (#50) by pruiz · Pull Request #51 · pruiz/CodeCome

pruiz · 2026-06-09T22:40:03Z

Summary

Resolves the 19 follow-up issues identified in #50 after the CodeQL sandbox recipe refactor (PR #45), plus addresses all 8 PR review comments from Copilot, CodeRabbit, and Greptile.

Original fixes (commit `107e1b2`)

Phase ownership fixes:

Swapped 1b/1c graceful completion artifact sets in completion.py
Same swap fixed in codecome.py phase status reporting
Renamed REQUIRED_NOTES_1B → PHASE_1C_REQUIRED_NOTES

Prompt fixes:

phase-1b-sandbox.md: removed Phase 1c artifacts from required reading
phase-1c-recon.md: corrected run summary path and old CodeQL path

CodeQL health model fixes:

Expanded COMPILED_LANGUAGES, injected total_alerts, made completed-partial reachable, switched unknown sentinel to None

Crash fixes:

Added missing _validate_codeql_plan_for_repair function
Fixed undefined progress reference in write_summary

Sandbox recipe validation:

Added sandbox-level docker checks + build target environment/service checks
Wired recipe validation into sandbox-validate

Stale docs: Makefile, docstring

Review feedback fixes (commit `eee7c6d`)

Reviewer	Comment	Resolution
Greptile	`_inject_total_alerts` silently discards exceptions	✅ Now appends warning to `manifest["warnings"]`
Copilot	`build_command` treated as filesystem path (rejects shell commands)	✅ Added `_command_looks_path_like()` helper; only path-validates when first token starts with `./` or `/`
Copilot	Phase 1b progress checks only `sandbox-plan.md`, missing `sandbox-recipe.yml`	✅ Now requires both files
Copilot + CodeRabbit	`_collect_plan_errors()` hard-fails unsupported languages even under `fail_policy=soft`	✅ Now reads `fail_policy`; only errors under `hard`
Copilot	`_collect_plan_errors()` hard-fails empty `languages:[]` under soft	✅ Gated behind `fail_policy=hard`
Copilot	`_collect_plan_errors()` treats unexpected confidence as error (gate only warns)	✅ Removed confidence check entirely
CodeRabbit	Recipe validation doesn't affect `sandbox-validate` exit status	✅ `recipe_errors` now sets `overall_outcome = "failed"`
CodeRabbit nitpick	`write_summary` re-parses `alerts.yml` instead of using manifest	✅ Now reads `manifest["total_alerts"]` first
CodeRabbit	Move recipe validation to package helper	⏸️ Deferred — architectural cleanup, out of scope for bug-fix PR
CodeRabbit	Move phase progress to package helper	⏸️ Deferred — `tools/codecome.py` already has many non-thin functions

Tests

775/775 pass.

- Fix Phase 1b/1c graceful completion artifact set swap - Rename REQUIRED_NOTES_1B to PHASE_1C_REQUIRED_NOTES - Fix prompts: 1b reading list, 1c summary path, old CodeQL paths - Fix Makefile and docstring stale phase references - Add missing _validate_codeql_plan_for_repair function - Expand COMPILED_LANGUAGES to include csharp/java-kotlin - Inject total_alerts into manifest before health computation - Fix write_summary undefined progress reference - Strengthen sandbox-recipe.yml validation with service/environment checks - Wire recipe validation into sandbox-validate - Use None instead of -1 for unknown extractor sentinel - Make completed-partial health classification reachable

coderabbitai · 2026-06-09T22:40:14Z

📝 Walkthrough

Walkthrough

This PR restructures Phase 1 of a CodeCome workflow by redefining subphase artifact ownership: Phase 1b now gates on sandbox bootstrap artifacts, Phase 1c gates on reconnaissance notes. It introduces CodeQL plan repair validation, improves CodeQL health classification with optional integer handling, expands sandbox recipe validation with docker-specific rules, and updates all related tests and documentation to reflect the new structure.

Changes

Phase 1 Subphase Ownership and CodeQL Validation

Layer / File(s)	Summary
Phase 1 subphase artifact ownership definition `tools/phases/phase_1_gates.py`, `tools/phases/completion.py`	Introduces `PHASE_1C_REQUIRED_NOTES` constant and restructures `check_phase_graceful_completion()` to clearly separate artifact ownership: Phase 1b freshness gates on sandbox-plan/sandbox-recipe/CODECOME-GENERATED; Phase 1c freshness gates on recon notes (attack-surface, threat-model, execution-model, validation-model).
Phase 1c required notes constant and validation `tools/phases/phase_1_gates.py`, `tests/test_phase_1_gates_threat_model.py`	Adds `PHASE_1C_REQUIRED_NOTES` constant defining reconnaissance note filenames and updates `check_phase_1c()` to validate missing notes against this constant; test fixtures now iterate over the constant to create required note files.
Phase 1 completion and freshness tests `tests/test_phase_graceful_completion_subphases.py`, `tests/test_phases_completion.py`	Updates graceful-completion tests to verify new artifact ownership: Phase 1b success with sandbox-plan/CODECOME-GENERATED, Phase 1c success with recon notes; failure tests now expect appropriate missing-artifact diagnostics per subphase; acceptance tests patch paths to use `itemdb/notes` directory structure.
Phase 1 progress reporting and orchestration updates `tools/codecome.py`, `tools/codecome/phase_1.py`	Updates `check_phase_progress()` to report Phase 1b status from sandbox-plan.md presence and Phase 1c status from missing recon notes count; updates module docstring to reflect CodeQL analysis occurring between 1b and 1c.
CodeQL plan repair validation helpers `tools/codecome/phase_1.py`	Adds `_validate_codeql_plan_for_repair()` and `_collect_plan_errors()` to validate Phase 1c's `codeql-plan.yml` under repair mode: enforces unit structure, ID uniqueness, path constraints (under `src/`), file existence, language/build-mode support, confidence values, and manual-build requirements.
CodeQL health classification with optional integer handling `tools/codeql/health.py`	Refactors health classification to handle unknown numeric counts as `None` instead of sentinels; expands `COMPILED_LANGUAGES` to include `csharp` and `java-kotlin`; distinguishes unknown vs. zero extractor success counts in failure classification; treats unresolvable alert counts as `completed-partial`.
CodeQL pipeline total alerts injection `tools/codeql/pipeline.py`, `tools/codeql/runner.py`	Adds `_inject_total_alerts()` helper to read and inject `alerts.yml` alert count into manifest; wires injection into `run_full_pipeline()` before health computation; updates `write_summary()` to report alerts.yml read failures explicitly.
Sandbox recipe docker-specific validation `tools/sandbox/recipe.py`	Extends `validate_recipe()` with docker-model rules requiring `compose_file`/`default_service`; updates `_validate_build_targets()` to accept `validation_model` and enforce per-model service/environment requirements; adds `_validate_target_environment()` for type/compose_file/service validation.
Sandbox recipe validation integration and test fixtures `tools/sandbox-bootstrap.py`, `tests/test_sandbox_recipe.py`	Integrates recipe validation into `cmd_validate()` to capture recipe errors and report in output/text mode; expands test fixture to create `sandbox/scripts`, `docker-compose.yml`, and script files for recipe validation tests.
Documentation and reference updates `Makefile`, `prompts/phase-1b-sandbox.md`, `prompts/phase-1c-recon.md`	Updates Makefile sandbox hints to reference Phase 1b; updates phase-1b sandbox prompts to require `sandbox-recipe.yml.example`; updates phase-1c prompts to resolve CodeQL run_id from `last-run-manifest.yml` and use per-run signal paths.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

pruiz/CodeCome#50: This PR directly implements the Phase 1 gate name changes, artifact ownership clarification, and sandbox-recipe validation improvements requested in that issue.

Possibly related PRs

pruiz/CodeCome#40: Both PRs update Phase 1b/1c subphase freshness gating in tools/phases/completion.py and corresponding test expectations.
pruiz/CodeCome#42: Both PRs expand CodeQL plan repair validation in tools/codecome/phase_1.py with schema constraints and per-language build_mode semantics.
pruiz/CodeCome#34: Both PRs restructure Phase 1 reconnaissance notes requirements and update corresponding gate/test validation in tools/phases/phase_1_gates.py.

Poem

🐰 Hopping through phases with a tidy spin,
1b readies sandboxes, scripts tucked within,
1c gathers recon notes, neat and clear,
CodeQL counts signal when numbers aren't near,
Docker recipes checked — the path is akin.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 38.24% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change: resolving follow-up issues from a CodeQL sandbox recipe refactor, which is exactly what the changeset demonstrates across 13 files.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/issue-50-codeql-sandbox-recipe-follow-ups

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-06-09T22:41:48Z

Coverage Report

Metric	Value
Line Coverage	75.6%
Lines Covered	0 / 0

Download detailed HTML coverage reports per OS/Python from the workflow artifacts.

Generated by pytest-cov on 2026-06-09T23:01:46.932Z

greptile-apps · 2026-06-09T22:45:04Z

Greptile Summary

This PR resolves 19 follow-up issues from the CodeQL sandbox recipe refactor (#50), primarily correcting swapped phase-ownership assignments and a number of related crash and classification bugs.

Phase ownership fix: Phase 1b graceful-completion checks now gate on sandbox bootstrap artifacts (sandbox-plan.md, sandbox-recipe.yml, CODECOME-GENERATED.md), and Phase 1c gates on recon notes. The rename REQUIRED_NOTES_1B → PHASE_1C_REQUIRED_NOTES propagates consistently across phase_1_gates.py, completion.py, codecome.py, and 8 test files.
CodeQL health model: Unknown extractor counts now return None instead of -1, making the completed-partial classification reachable; COMPILED_LANGUAGES gains csharp and java-kotlin; total_alerts is injected into the manifest before compute_health() runs.
New validation logic: sandbox/recipe.py adds docker-specific checks for compose_file, default_service, service, environment, and build_command path existence; sandbox-bootstrap.py wires this into sandbox-validate output.

Confidence Score: 4/5

Safe to merge after fixing the build_command path check in recipe validation.

The newly added build_command path existence check in _validate_build_targets passes the full command string (including arguments) to Path(root) / build_command instead of extracting just the first token the way _command_looks_path_like does. Any path-like build command that includes arguments will be resolved to a path that can never exist, causing false-positive validation errors. All other changes in the PR are well-tested and look correct.

tools/sandbox/recipe.py — the build_command path resolution in _validate_build_targets

Important Files Changed

Filename	Overview
tools/sandbox/recipe.py	Adds docker-specific sandbox and build-target validation; contains a bug where path existence for path-like build_command strings uses the full argument string instead of just the first token, generating false-positive errors.
tools/phases/completion.py	Core phase-ownership swap: Phase 1b now checks sandbox bootstrap artifacts, Phase 1c checks recon notes. Logic and test coverage look correct.
tools/codeql/health.py	Unknown extractor sentinel changed from -1 to None, completed-partial path now reachable, COMPILED_LANGUAGES expanded; all changes look consistent.
tools/codeql/pipeline.py	Adds _inject_total_alerts to populate manifest before health computation; exception now surfaced as a warning in manifest rather than being silently discarded.
tools/codecome/phase_1.py	Adds the previously-missing _validate_codeql_plan_for_repair and _collect_plan_errors functions; notes_dir parameter in _collect_plan_errors is accepted but unused internally.
tools/sandbox-bootstrap.py	Wires recipe validation into sandbox-validate; overall_outcome initialized to 'passed' before the loop, so the new recipe_errors override path is safe.
tools/codeql/runner.py	write_summary updated to prefer manifest-injected total_alerts and fixes undefined progress reference in error handler.
tools/phases/phase_1_gates.py	Renames REQUIRED_NOTES_1B to PHASE_1C_REQUIRED_NOTES and updates all internal usages; straightforward rename, no logic changes.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Phase 1a] -->|target-profile.md\nbuild-model.md\ncodeql-plan.yml| B[CodeQL Analysis]
    B --> C[Phase 1b\nSandbox Bootstrap]
    C -->|sandbox-plan.md\nsandbox-recipe.yml\nCODECOME-GENERATED.md| D[Phase 1c\nDetailed Recon]
    D -->|attack-surface.md\nthreat-model.md\nfile-risk-index.yml\n...| E[Phase 2]
    C --> V[sandbox-validate]
    V --> RV{Recipe Valid?}
    RV -->|Yes| VS[overall: passed]
    RV -->|No| VF[overall: failed]
    B --> H[compute_health]
    H --> H1{total_alerts in manifest?}
    H1 -->|None| CP[completed-partial]
    H1 -->|0| CEV[completed-empty-valid]
    H1 -->|gt 0| CC[completed]

_{Reviews (2): Last reviewed commit: "fix: address PR review feedback — plan v..." | Re-trigger Greptile}

Copilot

Pull request overview

This PR addresses follow-up correctness gaps after the CodeQL + sandbox-recipe refactor by realigning Phase 1b/1c artifact ownership, improving sandbox recipe validation and reporting, and tightening CodeQL run-health classification and manifest contents so downstream consumers can reliably interpret results.

Changes:

Correct Phase 1b vs 1c artifact ownership in gates, graceful-completion logic, phase progress reporting, and prompts.
Expand sandbox recipe validation (docker compose/service checks, build-target validation) and surface validation results in sandbox-bootstrap validate.
Improve CodeQL health/summary correctness by injecting total_alerts, making completed-partial reachable, and refining compiled-language extraction checks.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
tools/sandbox/recipe.py	Adds docker-specific recipe validation and deeper build target checks.
tools/sandbox-bootstrap.py	Wires recipe validation into `validate` output and JSON payload.
tools/phases/phase_1_gates.py	Renames Phase 1c required recon notes constant and updates gate logic usage.
tools/phases/completion.py	Realigns Phase 1 artifact ownership and graceful-completion checks per subphase.
tools/codeql/runner.py	Fixes `write_summary` error handling for alerts count rendering.
tools/codeql/pipeline.py	Injects `total_alerts` from normalized output before health computation.
tools/codeql/health.py	Expands compiled language set and improves classification for unknown extractor/alert counts.
tools/codecome/phase_1.py	Adds missing `check-codeql-plan` validation implementation and plan error collection helper.
tools/codecome.py	Updates phase progress reporting for new Phase 1b/1c semantics.
tests/test_sandbox_recipe.py	Updates sandbox recipe tests to satisfy new validation expectations.
tests/test_phases_completion.py	Updates Phase 1 completion tests for new notes root layout and ownership.
tests/test_phase_graceful_completion_subphases.py	Updates subphase graceful-completion tests for new artifact ownership.
tests/test_phase_1_gates_threat_model.py	Updates tests to use renamed Phase 1c notes constant.
prompts/phase-1c-recon.md	Fixes CodeQL signals path resolution via `last-run-manifest.yml` + run id.
prompts/phase-1b-sandbox.md	Removes Phase 1c artifacts from Phase 1b required reading; points to recipe template.
Makefile	Updates sandbox bootstrap phase references from 1c → 1b.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

tools/codecome.py (1)

356-395: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Move phase progress implementation out of tools/codecome.py and delegate from the wrapper.

This changed logic keeps business implementation in a root tools/*.py script, which should remain a thin delegator to a package module.

♻️ Suggested direction

- def check_phase_progress() -> None:
-     ...
+ from codecome.status import check_phase_progress

# tools/codecome/status.py
def check_phase_progress(root: Path, yaml_module) -> None:
    ...

As per coding guidelines, “Standalone scripts at the tools/ root ... should be thin wrappers that delegate to their respective packages. Implementation must live in the package, not the script.”

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/codecome.py` around lines 356 - 395, The phase-progress logic in
tools/codecome.py (the block using ROOT, notes_dir, evidence_root,
count_findings(), PHASE_1C_REQUIRED_NOTES, manifest_path and constructing rows
for "Phase 1a"/"CodeQL"/"Phase 1b"/"Phase 1c") should be moved into a package
helper function (e.g. tools/codecome/status.py:def check_phase_progress(root:
Path, yaml_module) -> list[tuple[str,str,str]]). Implement check_phase_progress
to accept the root Path and yaml module, perform the same checks
(count_findings(), presence checks for
target-profile.md/build-model.md/codeql-plan.yml, sandbox-plan.md, read
run-manifest.yml with safe_load and status handling, evaluate
PHASE_1C_REQUIRED_NOTES) and return the rows list; then replace the moved code
in tools/codecome.py with a thin delegator that calls check_phase_progress(ROOT,
yaml) and uses its returned rows. Ensure you reference PHASE_1C_REQUIRED_NOTES
inside the new module by importing it and keep the same status/level mapping
logic.

Source: Coding guidelines

tools/sandbox-bootstrap.py (1)

1521-1571: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Recipe validation failures currently do not fail validate exit status.

recipe_errors only affects output fields; if tiers pass, the command still returns success. That can greenlight an invalid recipe even though downstream CodeQL flow consumes recipe service/compose settings.

Suggested fix

     recipe_errors: list[str] = []
     if SANDBOX_RECIPE_PATH.exists():
         try:
             from sandbox.recipe import load_recipe, validate_recipe
             recipe = load_recipe(SANDBOX_RECIPE_PATH)
             recipe_errors = validate_recipe(recipe, root=ROOT)
         except Exception as exc:
             recipe_errors = [f"Failed to load recipe: {exc}"]
     elif not args.scripts_only:
         recipe_errors = ["sandbox-recipe.yml not found; run Phase 1b to generate it"]
+
+    if recipe_errors:
+        overall_outcome = "failed"

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/sandbox-bootstrap.py` around lines 1521 - 1571, The recipe validation
errors are only used for output but don't affect the exit status; after
validating and building recipe_errors (the block that computes recipe_errors
from SANDBOX_RECIPE_PATH / args.scripts_only), if recipe_errors is non-empty set
overall_outcome = "failed" (or set an explicit exit_code variable to non-zero)
and ensure payload["recipe_valid"] is False before emitting/printing; keep the
final return logic (return 0 if overall_outcome == "passed" else 1) so the
validate command fails when recipe_errors exist.

🧹 Nitpick comments (1)

tools/codeql/runner.py (1)
660-671: 💤 Low value

Error handling added for alerts.yml read failures.

The broad exception handler is acceptable here for summary generation resilience. The user gets explicit feedback rather than a crash or missing information.

Optional: Consider using manifest["total_alerts"] instead of re-reading.

write_summary() currently re-parses alerts.yml to count alerts, duplicating work already done by _inject_total_alerts() in the pipeline. Using the manifest field would be more efficient and ensure consistency with health classification:
total_alerts = manifest.get("total_alerts")
if total_alerts is not None:
    lines.append(f"- **Total alerts**: {total_alerts}")
    lines.append("")
elif (normalized_dir / "alerts.yml").is_file():
    lines.append("- **Total alerts**: (error reading alerts.yml)")
    lines.append("")
This change is out of scope for the current PR but could be considered for future cleanup.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/codeql/runner.py` around lines 660 - 671, The current write_summary()
block re-parses alerts.yml using alerts_path and load_yaml_mapping and catches
Exception; instead, use the already-populated manifest entry populated by
_inject_total_alerts: read total_alerts = manifest.get("total_alerts") inside
write_summary(), and if total_alerts is not None append the lines with that
value; only fall back to checking alerts_path.is_file() to append the “(error
reading alerts.yml)” message if manifest lacks the key. Update references to
normalized_dir/alerts.yml and remove the duplicated load_yaml_mapping path to
avoid re-parsing and keep behavior identical when manifest has the count.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tools/codecome/phase_1.py`:
- Around line 224-228: _summary: _collect_plan_errors() currently hard-fails
unsupported languages and empty language lists causing check-codeql-plan to
report failures that check_phase_1a() only blocks under fail_policy=hard._
Modify _collect_plan_errors to accept a fail_policy (or soft_fail boolean)
parameter and stop treating unsupported languages / empty analysis_units as hard
errors; instead emit non-blocking notices or warnings when fail_policy is soft,
and preserve hard errors when fail_policy is hard; update callers (e.g.,
check_phase_1a and any invocation of _collect_plan_errors) to pass the correct
fail_policy so behavior matches Gate 1a semantics and ensure the same change is
applied to the other similar blocks referenced (the checks around analysis_units
and language lists).

In `@tools/sandbox-bootstrap.py`:
- Around line 1521-1531: The recipe-loading and validation logic in
tools/sandbox-bootstrap.py (the block using SANDBOX_RECIPE_PATH, load_recipe and
validate_recipe and assigning recipe_errors) should be moved into a package
helper (e.g., a new function like
sandbox.recipe_helper.load_and_validate_recipe) and the script should call that
helper instead; implement a function in the sandbox package that accepts the
path, ROOT and args.scripts_only (or equivalent) and returns the recipe_errors
list (including catching exceptions and returning the failure message), then
replace the inline logic in tools/sandbox-bootstrap.py with a single call to
that helper and use its returned recipe_errors.

---

Outside diff comments:
In `@tools/codecome.py`:
- Around line 356-395: The phase-progress logic in tools/codecome.py (the block
using ROOT, notes_dir, evidence_root, count_findings(), PHASE_1C_REQUIRED_NOTES,
manifest_path and constructing rows for "Phase 1a"/"CodeQL"/"Phase 1b"/"Phase
1c") should be moved into a package helper function (e.g.
tools/codecome/status.py:def check_phase_progress(root: Path, yaml_module) ->
list[tuple[str,str,str]]). Implement check_phase_progress to accept the root
Path and yaml module, perform the same checks (count_findings(), presence checks
for target-profile.md/build-model.md/codeql-plan.yml, sandbox-plan.md, read
run-manifest.yml with safe_load and status handling, evaluate
PHASE_1C_REQUIRED_NOTES) and return the rows list; then replace the moved code
in tools/codecome.py with a thin delegator that calls check_phase_progress(ROOT,
yaml) and uses its returned rows. Ensure you reference PHASE_1C_REQUIRED_NOTES
inside the new module by importing it and keep the same status/level mapping
logic.

In `@tools/sandbox-bootstrap.py`:
- Around line 1521-1571: The recipe validation errors are only used for output
but don't affect the exit status; after validating and building recipe_errors
(the block that computes recipe_errors from SANDBOX_RECIPE_PATH /
args.scripts_only), if recipe_errors is non-empty set overall_outcome = "failed"
(or set an explicit exit_code variable to non-zero) and ensure
payload["recipe_valid"] is False before emitting/printing; keep the final return
logic (return 0 if overall_outcome == "passed" else 1) so the validate command
fails when recipe_errors exist.

---

Nitpick comments:
In `@tools/codeql/runner.py`:
- Around line 660-671: The current write_summary() block re-parses alerts.yml
using alerts_path and load_yaml_mapping and catches Exception; instead, use the
already-populated manifest entry populated by _inject_total_alerts: read
total_alerts = manifest.get("total_alerts") inside write_summary(), and if
total_alerts is not None append the lines with that value; only fall back to
checking alerts_path.is_file() to append the “(error reading alerts.yml)”
message if manifest lacks the key. Update references to
normalized_dir/alerts.yml and remove the duplicated load_yaml_mapping path to
avoid re-parsing and keep behavior identical when manifest has the count.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: b08ac110-238c-4f4f-8284-72b0e9d967c6

📥 Commits

Reviewing files that changed from the base of the PR and between 66654f1 and 107e1b2.

📒 Files selected for processing (16)

Makefile
prompts/phase-1b-sandbox.md
prompts/phase-1c-recon.md
tests/test_phase_1_gates_threat_model.py
tests/test_phase_graceful_completion_subphases.py
tests/test_phases_completion.py
tests/test_sandbox_recipe.py
tools/codecome.py
tools/codecome/phase_1.py
tools/codeql/health.py
tools/codeql/pipeline.py
tools/codeql/runner.py
tools/phases/completion.py
tools/phases/phase_1_gates.py
tools/sandbox-bootstrap.py
tools/sandbox/recipe.py

…, alerts injection - _collect_plan_errors() now respects fail_policy (soft vs hard) for unsupported languages and empty language lists, matching Gate 1a - Removed confidence check from _collect_plan_errors (gate warns only) - Recipe validation failures now fail sandbox-validate exit status - _inject_total_alerts() appends warning instead of silently passing - build_command validation only path-checks when first token looks like a filesystem path (./ or /), allowing shell commands - Phase 1b progress check now requires both sandbox-plan.md and recipe - write_summary() uses manifest total_alerts instead of re-parsing

pruiz

Review follow-up on the three out-of-diff items from coderabbit:

1. `tools/codecome.py` phase progress → Deferred

Moving check_phase_progress() into a package module is an architectural cleanup (thin-wrapper guideline). tools/codecome.py already contains many non-thin functions (check_codeql_status, check_sandbox_status, etc.). Extracting just this one function is a good idea but out of scope for this bug-fix PR. Will address in a future refactor.

2. `tools/sandbox-bootstrap.py` recipe validation exit status → ✅ Addressed in `eee7c6d`

Recipe validation errors now set overall_outcome = "failed", which flows through the existing exit gate: return 0 if overall_outcome == "passed" else 1. The JSON payload already exposes recipe_valid: false and recipe_errors.

3. `tools/codeql/runner.py` write_summary use manifest total_alerts → ✅ Addressed in `eee7c6d`

write_summary() now reads manifest.get("total_alerts") first, and only falls back to re-parsing alerts.yml when the manifest lacks the key. This avoids the duplicated parse and keeps the summary consistent with the health classification.

pruiz requested a review from Copilot June 9, 2026 22:40

Copilot started reviewing on behalf of pruiz June 9, 2026 22:40 View session

greptile-apps Bot reviewed Jun 9, 2026

View reviewed changes

Comment thread tools/codeql/pipeline.py Outdated

Copilot AI reviewed Jun 9, 2026

View reviewed changes

Comment thread tools/sandbox/recipe.py Outdated

Comment thread tools/codecome.py

Comment thread tools/codecome/phase_1.py

Comment thread tools/codecome/phase_1.py Outdated

Comment thread tools/codecome/phase_1.py

coderabbitai Bot reviewed Jun 9, 2026

View reviewed changes

Comment thread tools/codecome/phase_1.py

Comment thread tools/sandbox-bootstrap.py

pruiz commented Jun 9, 2026

View reviewed changes

pruiz merged commit bc37225 into master Jun 9, 2026
7 of 8 checks passed

Conversation

pruiz commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Original fixes (commit 107e1b2)

Review feedback fixes (commit eee7c6d)

Tests

Uh oh!

coderabbitai Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

github-actions Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Coverage Report

Uh oh!

greptile-apps Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Flowchart

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

pruiz left a comment

Choose a reason for hiding this comment

1. tools/codecome.py phase progress → Deferred

2. tools/sandbox-bootstrap.py recipe validation exit status → ✅ Addressed in eee7c6d

3. tools/codeql/runner.py write_summary use manifest total_alerts → ✅ Addressed in eee7c6d

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

pruiz commented Jun 9, 2026 •

edited

Loading

Original fixes (commit `107e1b2`)

Review feedback fixes (commit `eee7c6d`)

coderabbitai Bot commented Jun 9, 2026 •

edited

Loading

github-actions Bot commented Jun 9, 2026 •

edited

Loading

greptile-apps Bot commented Jun 9, 2026 •

edited

Loading

1. `tools/codecome.py` phase progress → Deferred

2. `tools/sandbox-bootstrap.py` recipe validation exit status → ✅ Addressed in `eee7c6d`

3. `tools/codeql/runner.py` write_summary use manifest total_alerts → ✅ Addressed in `eee7c6d`