CNTRLPLANE-3339: add eval-agents job for openshift/hypershift#78630
CNTRLPLANE-3339: add eval-agents job for openshift/hypershift#78630enxebre wants to merge 1 commit into
Conversation
|
@enxebre: This pull request references CNTRLPLANE-3339 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds a new optional HyperShift CI workflow Changeseval-agents workflow + step-refs
sequenceDiagram
participant Prow as Prow CI
participant Job as Job Container
participant Repo as Hypershift Repo
participant Vertex as Vertex AI / Claude Service
Prow->>Job: schedule `hypershift-eval-agents`
Job->>Repo: run setup script (verify `claude`, install node/npm)
Job->>Vertex: authenticate using mounted credentials
Job->>Repo: run eval (make eval-agents, write ${ARTIFACT_DIR}/results.xml)
Job->>Prow: upload artifacts/results
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 12✅ Passed checks (12 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: enxebre The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/hold |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@ci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-commands.sh`:
- Around line 16-24: Add a precheck before invoking "make eval-agents" that when
Vertex mode is enabled (e.g., VERTEX_MODE=true) verifies the
GOOGLE_APPLICATION_CREDENTIALS env var is set and the referenced file is
readable; if the var is empty or the file is not readable, print an error and
exit 1 so the script fails fast instead of letting make eval-agents (and targets
driven by EVAL_FOCUS/EVAL_VERBOSE) fail later.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: dee9528e-8165-4b24-9e92-faed98366fee
📒 Files selected for processing (10)
ci-operator/config/openshift/hypershift/openshift-hypershift-main.yamlci-operator/step-registry/hypershift/eval-agents/OWNERSci-operator/step-registry/hypershift/eval-agents/hypershift-eval-agents-workflow.metadata.jsonci-operator/step-registry/hypershift/eval-agents/hypershift-eval-agents-workflow.yamlci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-commands.shci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-ref.metadata.jsonci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-ref.yamlci-operator/step-registry/hypershift/eval-agents/setup/hypershift-eval-agents-setup-commands.shci-operator/step-registry/hypershift/eval-agents/setup/hypershift-eval-agents-setup-ref.metadata.jsonci-operator/step-registry/hypershift/eval-agents/setup/hypershift-eval-agents-setup-ref.yaml
| cd /go/src/github.com/openshift/hypershift | ||
|
|
||
| if [ -n "${EVAL_FOCUS:-}" ]; then | ||
| echo "Focus: ${EVAL_FOCUS}" | ||
| make eval-agents EVAL_FOCUS="${EVAL_FOCUS}" EVAL_VERBOSE=1 | ||
| else | ||
| echo "Running all eval scenarios..." | ||
| make eval-agents EVAL_VERBOSE=1 | ||
| fi |
There was a problem hiding this comment.
Fail fast on missing Vertex credentials before running evaluations.
When Vertex mode is enabled, missing/unreadable GOOGLE_APPLICATION_CREDENTIALS will fail later with less actionable output. Add an explicit precheck here.
Suggested patch
cd /go/src/github.com/openshift/hypershift
+if [ "${CLAUDE_CODE_USE_VERTEX:-1}" = "1" ]; then
+ if [ -z "${GOOGLE_APPLICATION_CREDENTIALS:-}" ] || [ ! -r "${GOOGLE_APPLICATION_CREDENTIALS}" ]; then
+ echo "ERROR: GOOGLE_APPLICATION_CREDENTIALS is unset or unreadable: ${GOOGLE_APPLICATION_CREDENTIALS:-<unset>}"
+ exit 1
+ fi
+fi
+
if [ -n "${EVAL_FOCUS:-}" ]; then
echo "Focus: ${EVAL_FOCUS}"
make eval-agents EVAL_FOCUS="${EVAL_FOCUS}" EVAL_VERBOSE=1📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| cd /go/src/github.com/openshift/hypershift | |
| if [ -n "${EVAL_FOCUS:-}" ]; then | |
| echo "Focus: ${EVAL_FOCUS}" | |
| make eval-agents EVAL_FOCUS="${EVAL_FOCUS}" EVAL_VERBOSE=1 | |
| else | |
| echo "Running all eval scenarios..." | |
| make eval-agents EVAL_VERBOSE=1 | |
| fi | |
| cd /go/src/github.com/openshift/hypershift | |
| if [ "${CLAUDE_CODE_USE_VERTEX:-1}" = "1" ]; then | |
| if [ -z "${GOOGLE_APPLICATION_CREDENTIALS:-}" ] || [ ! -r "${GOOGLE_APPLICATION_CREDENTIALS}" ]; then | |
| echo "ERROR: GOOGLE_APPLICATION_CREDENTIALS is unset or unreadable: ${GOOGLE_APPLICATION_CREDENTIALS:-<unset>}" | |
| exit 1 | |
| fi | |
| fi | |
| if [ -n "${EVAL_FOCUS:-}" ]; then | |
| echo "Focus: ${EVAL_FOCUS}" | |
| make eval-agents EVAL_FOCUS="${EVAL_FOCUS}" EVAL_VERBOSE=1 | |
| else | |
| echo "Running all eval scenarios..." | |
| make eval-agents EVAL_VERBOSE=1 | |
| fi |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@ci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-commands.sh`
around lines 16 - 24, Add a precheck before invoking "make eval-agents" that
when Vertex mode is enabled (e.g., VERTEX_MODE=true) verifies the
GOOGLE_APPLICATION_CREDENTIALS env var is set and the referenced file is
readable; if the var is empty or the file is not readable, print an error and
exit 1 so the script fails fast instead of letting make eval-agents (and targets
driven by EVAL_FOCUS/EVAL_VERBOSE) fail later.
b174a9d to
f8ff049
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@ci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-commands.sh`:
- Around line 8-16: The script builds MAKE_ARGS as a plain string so expanding
${MAKE_ARGS} unquoted when calling make can split values with spaces (e.g.,
EVAL_FOCUS) into multiple args; change MAKE_ARGS to an array: initialize
MAKE_ARGS=( "EVAL_OUTPUT=${ARTIFACT_DIR}/results.xml" ), when EVAL_FOCUS is set
append MAKE_ARGS+=( "EVAL_FILTER=${EVAL_FOCUS}" ), and invoke make as: make
eval-agents "${MAKE_ARGS[@]}" so EVAL_FILTER is passed as a single argument;
update references to MAKE_ARGS, EVAL_FOCUS, EVAL_FILTER and the make eval-agents
call accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 15cbe2ee-52dd-49a8-b6dc-c9ad0fd4b8bd
📒 Files selected for processing (10)
ci-operator/step-registry/hypershift/eval-agents/hypershift-eval-agents-workflow.metadata.jsonci-operator/step-registry/hypershift/eval-agents/hypershift-eval-agents-workflow.yamlci-operator/step-registry/hypershift/eval-agents/run/OWNERSci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-commands.shci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-ref.metadata.jsonci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-ref.yamlci-operator/step-registry/hypershift/eval-agents/setup/OWNERSci-operator/step-registry/hypershift/eval-agents/setup/hypershift-eval-agents-setup-commands.shci-operator/step-registry/hypershift/eval-agents/setup/hypershift-eval-agents-setup-ref.metadata.jsonci-operator/step-registry/hypershift/eval-agents/setup/hypershift-eval-agents-setup-ref.yaml
✅ Files skipped from review due to trivial changes (2)
- ci-operator/step-registry/hypershift/eval-agents/run/OWNERS
- ci-operator/step-registry/hypershift/eval-agents/setup/OWNERS
🚧 Files skipped from review as they are similar to previous changes (1)
- ci-operator/step-registry/hypershift/eval-agents/setup/hypershift-eval-agents-setup-ref.yaml
| MAKE_ARGS="EVAL_OUTPUT=${ARTIFACT_DIR}/results.xml" | ||
|
|
||
| if [ -n "${EVAL_FOCUS:-}" ]; then | ||
| echo "Filter: ${EVAL_FOCUS}" | ||
| MAKE_ARGS="${MAKE_ARGS} EVAL_FILTER=${EVAL_FOCUS}" | ||
| fi | ||
|
|
||
| echo "Running eval-agents with promptfoo..." | ||
| make eval-agents ${MAKE_ARGS} |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verifies shell splitting/globbing risk from unquoted scalar expansion.
set -euo pipefail
tmp="$(mktemp -d)"
cd "$tmp"
touch a b
EVAL_FOCUS='*'
MAKE_ARGS="EVAL_OUTPUT=/tmp/results.xml EVAL_FILTER=${EVAL_FOCUS}"
echo "Args produced by unquoted expansion:"
for arg in ${MAKE_ARGS}; do printf ' [%s]\n' "$arg"; doneRepository: openshift/release
Length of output: 149
🏁 Script executed:
# Check if file exists and read the relevant section
git ls-files | grep -i "hypershift-eval-agents-run-commands.sh"Repository: openshift/release
Length of output: 153
🏁 Script executed:
# Read the file and verify lines 8-16
file_path="ci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-commands.sh"
if [ -f "$file_path" ]; then
echo "=== File found. Lines 8-16: ==="
sed -n '8,16p' "$file_path"
else
echo "File not found at: $file_path"
fiRepository: openshift/release
Length of output: 347
🏁 Script executed:
# More comprehensive test with spaces and special characters
#!/bin/bash
set -euo pipefail
tmp="$(mktemp -d)"
cd "$tmp"
touch "file with spaces" "another*file"
# Test 1: EVAL_FOCUS with spaces
echo "=== Test 1: EVAL_FOCUS with spaces ==="
EVAL_FOCUS='value with spaces'
MAKE_ARGS="EVAL_OUTPUT=/tmp/results.xml EVAL_FILTER=${EVAL_FOCUS}"
echo "Unquoted expansion (current approach):"
for arg in ${MAKE_ARGS}; do printf ' [%s]\n' "$arg"; done
echo "Expected: 2 args, Got: $(echo ${MAKE_ARGS} | wc -w) args"
# Test 2: Array-based approach (proposed fix)
echo ""
echo "=== Test 2: Array-based approach (proposed fix) ==="
MAKE_ARGS=("EVAL_OUTPUT=/tmp/results.xml")
EVAL_FOCUS='value with spaces'
MAKE_ARGS+=("EVAL_FILTER=${EVAL_FOCUS}")
echo "Quoted array expansion:"
for arg in "${MAKE_ARGS[@]}"; do printf ' [%s]\n' "$arg"; done
echo "Expected: 2 args, Got: ${`#MAKE_ARGS`[@]} args"Repository: openshift/release
Length of output: 418
Use an argument array for make vars to avoid shell word splitting.
On line 16, ${MAKE_ARGS} is expanded unquoted. If EVAL_FOCUS contains spaces, EVAL_FILTER gets split into multiple arguments before reaching make. For example, EVAL_FOCUS='value with spaces' produces EVAL_FILTER=value with spaces as three separate arguments instead of one.
Proposed patch
-MAKE_ARGS="EVAL_OUTPUT=${ARTIFACT_DIR}/results.xml"
+MAKE_ARGS=("EVAL_OUTPUT=${ARTIFACT_DIR}/results.xml")
if [ -n "${EVAL_FOCUS:-}" ]; then
echo "Filter: ${EVAL_FOCUS}"
- MAKE_ARGS="${MAKE_ARGS} EVAL_FILTER=${EVAL_FOCUS}"
+ MAKE_ARGS+=("EVAL_FILTER=${EVAL_FOCUS}")
fi
echo "Running eval-agents with promptfoo..."
-make eval-agents ${MAKE_ARGS}
+make eval-agents "${MAKE_ARGS[@]}"🧰 Tools
🪛 Shellcheck (0.11.0)
[info] 16-16: Double quote to prevent globbing and word splitting.
(SC2086)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-commands.sh`
around lines 8 - 16, The script builds MAKE_ARGS as a plain string so expanding
${MAKE_ARGS} unquoted when calling make can split values with spaces (e.g.,
EVAL_FOCUS) into multiple args; change MAKE_ARGS to an array: initialize
MAKE_ARGS=( "EVAL_OUTPUT=${ARTIFACT_DIR}/results.xml" ), when EVAL_FOCUS is set
append MAKE_ARGS+=( "EVAL_FILTER=${EVAL_FOCUS}" ), and invoke make as: make
eval-agents "${MAKE_ARGS[@]}" so EVAL_FILTER is passed as a single argument;
update references to MAKE_ARGS, EVAL_FOCUS, EVAL_FILTER and the make eval-agents
call accordingly.
There was a problem hiding this comment.
♻️ Duplicate comments (2)
ci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-commands.sh (2)
8-16:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUse an argument array for
makevariables to avoid shell word splitting.This issue was previously identified. The unquoted expansion
${MAKE_ARGS}on line 16 will splitEVAL_FOCUSvalues containing spaces into multiple arguments. For example,EVAL_FOCUS="agent validation"produces three separate arguments (EVAL_FILTER=agent,validation) instead ofEVAL_FILTER="agent validation".🔧 Array-based fix
-MAKE_ARGS="EVAL_OUTPUT=${ARTIFACT_DIR}/results.xml" +MAKE_ARGS=("EVAL_OUTPUT=${ARTIFACT_DIR}/results.xml") if [ -n "${EVAL_FOCUS:-}" ]; then echo "Filter: ${EVAL_FOCUS}" - MAKE_ARGS="${MAKE_ARGS} EVAL_FILTER=${EVAL_FOCUS}" + MAKE_ARGS+=("EVAL_FILTER=${EVAL_FOCUS}") fi echo "Running eval-agents with promptfoo..." -make eval-agents ${MAKE_ARGS} +make eval-agents "${MAKE_ARGS[@]}"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-commands.sh` around lines 8 - 16, The make invocation uses an unquoted string variable MAKE_ARGS causing word-splitting of values like EVAL_FOCUS; replace the string accumulation with an array (e.g., MAKE_ARGS_ARRAY) and push elements such as "EVAL_OUTPUT=${ARTIFACT_DIR}/results.xml" and, when EVAL_FOCUS is set, "EVAL_FILTER=${EVAL_FOCUS}" into that array, then invoke make with make eval-agents "${MAKE_ARGS_ARRAY[@]}" so space-containing values in EVAL_FOCUS are passed as a single argument; update references to MAKE_ARGS to use the new array name and ensure expansions are quoted where used.
6-7:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winFail fast on missing Vertex credentials before running evaluations.
This issue was previously identified. When Vertex mode is enabled (likely the default), missing or unreadable
GOOGLE_APPLICATION_CREDENTIALSwill causemake eval-agentsto fail later with less actionable output. Adding an explicit precheck after line 6 would improve the debugging experience.🛡️ Suggested precheck
cd /go/src/github.com/openshift/hypershift +if [ "${CLAUDE_CODE_USE_VERTEX:-1}" = "1" ]; then + if [ -z "${GOOGLE_APPLICATION_CREDENTIALS:-}" ] || [ ! -r "${GOOGLE_APPLICATION_CREDENTIALS}" ]; then + echo "ERROR: GOOGLE_APPLICATION_CREDENTIALS is unset or unreadable: ${GOOGLE_APPLICATION_CREDENTIALS:-<unset>}" + exit 1 + fi +fi + MAKE_ARGS="EVAL_OUTPUT=${ARTIFACT_DIR}/results.xml"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-commands.sh` around lines 6 - 7, Add a fail-fast precheck in hypershift-eval-agents-run-commands.sh to validate GOOGLE_APPLICATION_CREDENTIALS before invoking make eval-agents: detect if Vertex mode is enabled (e.g., via the existing env flag or by checking $VERTEX_MODE/$USE_VERTEX variable) and then verify the GOOGLE_APPLICATION_CREDENTIALS env var is set and points to a readable file; if not, print a clear error to stderr and exit non‑zero so the job fails early with actionable output.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In
`@ci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-commands.sh`:
- Around line 8-16: The make invocation uses an unquoted string variable
MAKE_ARGS causing word-splitting of values like EVAL_FOCUS; replace the string
accumulation with an array (e.g., MAKE_ARGS_ARRAY) and push elements such as
"EVAL_OUTPUT=${ARTIFACT_DIR}/results.xml" and, when EVAL_FOCUS is set,
"EVAL_FILTER=${EVAL_FOCUS}" into that array, then invoke make with make
eval-agents "${MAKE_ARGS_ARRAY[@]}" so space-containing values in EVAL_FOCUS are
passed as a single argument; update references to MAKE_ARGS to use the new array
name and ensure expansions are quoted where used.
- Around line 6-7: Add a fail-fast precheck in
hypershift-eval-agents-run-commands.sh to validate
GOOGLE_APPLICATION_CREDENTIALS before invoking make eval-agents: detect if
Vertex mode is enabled (e.g., via the existing env flag or by checking
$VERTEX_MODE/$USE_VERTEX variable) and then verify the
GOOGLE_APPLICATION_CREDENTIALS env var is set and points to a readable file; if
not, print a clear error to stderr and exit non‑zero so the job fails early with
actionable output.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: c07bd22c-54f6-4702-9903-34a9d785dc0a
📒 Files selected for processing (12)
ci-operator/config/openshift/hypershift/openshift-hypershift-main.yamlci-operator/step-registry/hypershift/eval-agents/OWNERSci-operator/step-registry/hypershift/eval-agents/hypershift-eval-agents-workflow.metadata.jsonci-operator/step-registry/hypershift/eval-agents/hypershift-eval-agents-workflow.yamlci-operator/step-registry/hypershift/eval-agents/run/OWNERSci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-commands.shci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-ref.metadata.jsonci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-ref.yamlci-operator/step-registry/hypershift/eval-agents/setup/OWNERSci-operator/step-registry/hypershift/eval-agents/setup/hypershift-eval-agents-setup-commands.shci-operator/step-registry/hypershift/eval-agents/setup/hypershift-eval-agents-setup-ref.metadata.jsonci-operator/step-registry/hypershift/eval-agents/setup/hypershift-eval-agents-setup-ref.yaml
✅ Files skipped from review due to trivial changes (10)
- ci-operator/step-registry/hypershift/eval-agents/OWNERS
- ci-operator/step-registry/hypershift/eval-agents/run/OWNERS
- ci-operator/step-registry/hypershift/eval-agents/setup/OWNERS
- ci-operator/step-registry/hypershift/eval-agents/setup/hypershift-eval-agents-setup-commands.sh
- ci-operator/step-registry/hypershift/eval-agents/setup/hypershift-eval-agents-setup-ref.metadata.json
- ci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-ref.metadata.json
- ci-operator/step-registry/hypershift/eval-agents/hypershift-eval-agents-workflow.metadata.json
- ci-operator/step-registry/hypershift/eval-agents/hypershift-eval-agents-workflow.yaml
- ci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-ref.yaml
- ci-operator/step-registry/hypershift/eval-agents/setup/hypershift-eval-agents-setup-ref.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- ci-operator/config/openshift/hypershift/openshift-hypershift-main.yaml
5147370 to
b100381
Compare
Add a promptfoo-based eval workflow that tests Claude Code agent definitions using LLM judge assertions. The setup step installs Node.js/npm and the run step invokes make eval-agents via npx promptfoo. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
b100381 to
52bfa65
Compare
|
[REHEARSALNOTIFIER] Note: If this PR includes changes to step registry files ( Interacting with pj-rehearseComment: Once you are satisfied with the results of the rehearsals, comment: |
|
/hold |
There was a problem hiding this comment.
♻️ Duplicate comments (1)
ci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-commands.sh (1)
8-16:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUse an argument array to prevent
makearg splitting.On Line 16, unquoted
${MAKE_ARGS}can split/glob values introduced on Line 12 (for example,EVAL_FOCUS='foo bar'), soEVAL_FILTERmay be passed incorrectly.Proposed patch
-MAKE_ARGS="EVAL_OUTPUT=${ARTIFACT_DIR}/results.xml" +MAKE_ARGS=("EVAL_OUTPUT=${ARTIFACT_DIR}/results.xml") if [ -n "${EVAL_FOCUS:-}" ]; then echo "Filter: ${EVAL_FOCUS}" - MAKE_ARGS="${MAKE_ARGS} EVAL_FILTER=${EVAL_FOCUS}" + MAKE_ARGS+=("EVAL_FILTER=${EVAL_FOCUS}") fi echo "Running eval-agents with promptfoo..." -make eval-agents ${MAKE_ARGS} +make eval-agents "${MAKE_ARGS[@]}"#!/bin/bash # Read-only verification for arg-splitting risk in current implementation. set -euo pipefail file="$(fd -p 'hypershift-eval-agents-run-commands.sh' | head -n1)" echo "Inspecting: ${file}" sed -n '8,18p' "${file}" tmp="$(mktemp -d)" cd "${tmp}" touch a b EVAL_FOCUS='value with spaces' MAKE_ARGS="EVAL_OUTPUT=/tmp/results.xml EVAL_FILTER=${EVAL_FOCUS}" echo "Unquoted expansion arguments:" for arg in ${MAKE_ARGS}; do printf ' [%s]\n' "${arg}" done echo "Expected behavior after array fix: EVAL_FILTER stays one argument."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-commands.sh` around lines 8 - 16, The current script builds a space-joined string in MAKE_ARGS and then expands it unquoted in the make call, which will split/glob values like EVAL_FOCUS='foo bar'; change to build an array (e.g., MAKE_ARGS_ARR=() ; MAKE_ARGS_ARR+=( "EVAL_OUTPUT=${ARTIFACT_DIR}/results.xml" ) and if EVAL_FOCUS is set then MAKE_ARGS_ARR+=( "EVAL_FILTER=${EVAL_FOCUS}" ) ) and pass it to make as: make eval-agents "${MAKE_ARGS_ARR[@]}", updating the MAKE_ARGS variable usage and the final make invocation to use the array form to preserve arguments with spaces.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In
`@ci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-commands.sh`:
- Around line 8-16: The current script builds a space-joined string in MAKE_ARGS
and then expands it unquoted in the make call, which will split/glob values like
EVAL_FOCUS='foo bar'; change to build an array (e.g., MAKE_ARGS_ARR=() ;
MAKE_ARGS_ARR+=( "EVAL_OUTPUT=${ARTIFACT_DIR}/results.xml" ) and if EVAL_FOCUS
is set then MAKE_ARGS_ARR+=( "EVAL_FILTER=${EVAL_FOCUS}" ) ) and pass it to make
as: make eval-agents "${MAKE_ARGS_ARR[@]}", updating the MAKE_ARGS variable
usage and the final make invocation to use the array form to preserve arguments
with spaces.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: cbf05b7f-1299-4de9-93c7-f01bf4c9396b
📒 Files selected for processing (12)
ci-operator/config/openshift/hypershift/openshift-hypershift-main.yamlci-operator/step-registry/hypershift/eval-agents/OWNERSci-operator/step-registry/hypershift/eval-agents/hypershift-eval-agents-workflow.metadata.jsonci-operator/step-registry/hypershift/eval-agents/hypershift-eval-agents-workflow.yamlci-operator/step-registry/hypershift/eval-agents/run/OWNERSci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-commands.shci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-ref.metadata.jsonci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-ref.yamlci-operator/step-registry/hypershift/eval-agents/setup/OWNERSci-operator/step-registry/hypershift/eval-agents/setup/hypershift-eval-agents-setup-commands.shci-operator/step-registry/hypershift/eval-agents/setup/hypershift-eval-agents-setup-ref.metadata.jsonci-operator/step-registry/hypershift/eval-agents/setup/hypershift-eval-agents-setup-ref.yaml
✅ Files skipped from review due to trivial changes (6)
- ci-operator/step-registry/hypershift/eval-agents/setup/OWNERS
- ci-operator/step-registry/hypershift/eval-agents/hypershift-eval-agents-workflow.yaml
- ci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-ref.metadata.json
- ci-operator/step-registry/hypershift/eval-agents/setup/hypershift-eval-agents-setup-ref.metadata.json
- ci-operator/step-registry/hypershift/eval-agents/run/hypershift-eval-agents-run-ref.yaml
- ci-operator/step-registry/hypershift/eval-agents/hypershift-eval-agents-workflow.metadata.json
🚧 Files skipped from review as they are similar to previous changes (2)
- ci-operator/step-registry/hypershift/eval-agents/setup/hypershift-eval-agents-setup-commands.sh
- ci-operator/step-registry/hypershift/eval-agents/setup/hypershift-eval-agents-setup-ref.yaml
|
[REHEARSALNOTIFIER] Note: If this PR includes changes to step registry files ( Interacting with pj-rehearseComment: Once you are satisfied with the results of the rehearsals, comment: |
|
@enxebre: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
hypershift-eval-agentsstep-registry workflow that runs promptfoo-based LLM judge evaluations for Claude Code agent definitionsclaude-ai-helpersimage, run step invokesmake eval-agentswhich usesnpx promptfooto execute eval scenariosTest plan
eval-agentspresubmit via gangway or/test eval-agentsto verify Node.js installs and promptfoo runs successfully${ARTIFACT_DIR}/results.xml🤖 Generated with Claude Code