Kestrel Operator Onprem#17
Conversation
…erify for on-prem Server changes: - Add GetHelmChartVersionForChart() to fetch versions for specific charts - handleOnboardCluster returns chart_name based on ONPREM_MODE - Remove dead code: fetchLatestHelmVersion, helmCache, ServerCertificateBundle Operator changes: - Add TLS_INSECURE_SKIP_VERIFY env var support in deployment.yaml - Add server.tls.insecureSkipVerify option in values.yaml (default: false) - stream_client.go reads TLS_INSECURE_SKIP_VERIFY and applies to TLS config Frontend changes: - Add chartName to TokenModalData interface and API response type - Use dynamic chart name in helm install commands - Generate port: 443 and insecureSkipVerify: true for on-prem operators This enables on-prem operators to connect via NLB without server cert hostname verification, while keeping TLS encryption enabled.
📝 WalkthroughWalkthroughAdds a Helm value and Deployment env var to control TLS insecure-skip, wires that through the StreamClient tls.Config, increases gRPC default message sizes to 10MB, and skips JSON unmarshalling for log/exec/proxy k8s executor endpoints. Changes
Sequence Diagram(s)mermaid Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@pkg/k8s_api/k8s_api_executor.go`:
- Around line 99-118: The current substring checks
(isLogEndpoint/isExecEndpoint/isProxyEndpoint) on apiPath are too broad and can
skip JSON validation erroneously; replace them with a helper (e.g.,
isNonJSONEndpoint(apiPath, suffix)) that strips query params, checks path
segments and suffixes (use strings.Split(apiPath, "?")[0], strings.HasSuffix for
exact suffixes, and for proxy allow "/proxy/" or suffix "/proxy" matches) and
then use that helper for "/log", "/exec", and "/proxy" when deciding whether to
run json.Unmarshal on resultBytes; update the condition around json.Unmarshal,
keep setting result fields and e.Logger.Error behavior unchanged.
🧹 Nitpick comments (3)
pkg/k8s_api/k8s_api_executor.go (1)
125-128: Logging is incomplete for debugging non-JSON endpoints.Only
is_log_endpointis logged, butis_exec_endpointandis_proxy_endpointare also checked. For debugging purposes, consider logging which specific non-JSON endpoint type was matched.♻️ Suggested improvement
e.Logger.Debug("API call completed successfully", zap.String("api_path", apiPath), - zap.Int("response_size", len(resultBytes)), - zap.Bool("is_log_endpoint", isLogEndpoint)) + zap.Int("response_size", len(resultBytes)), + zap.Bool("is_non_json_endpoint", isLogEndpoint || isExecEndpoint || isProxyEndpoint))Or, if more granularity is needed:
e.Logger.Debug("API call completed successfully", zap.String("api_path", apiPath), - zap.Int("response_size", len(resultBytes)), - zap.Bool("is_log_endpoint", isLogEndpoint)) + zap.Int("response_size", len(resultBytes)), + zap.Bool("skip_json_validation", isLogEndpoint || isExecEndpoint || isProxyEndpoint))pkg/client/stream_client.go (1)
364-368: Consider extracting the message size as a named constant.The 10MB limit is reasonable for streaming large payloads like pod logs. For better maintainability and discoverability, consider defining this as a named constant alongside other constants in this file.
♻️ Suggested refactor
const ( runtimeSecretName = "kestrel-operator-jwt-runtime" tokenRenewalInterval = 3 * time.Hour // Based on 24 hour JWT TTL defaultHubbleRelayNamespace = "kube-system" dataplaneV2HubbleRelayNamespace = "gke-managed-dpv2-observability" // Constants for liveness probe minEOFErrorsForLiveness = 3 // Minimum EOF errors for liveness failure minUnhealthyForLiveness = 1 * time.Minute // Minimum unhealthy duration for liveness failure maxHealthyGapForLiveness = 30 * time.Second // Max gap between EOFs for liveness failure + + // gRPC message size limit (10MB for large payloads like pod logs) + maxGRPCMessageSize = 10 * 1024 * 1024 )Then use
maxGRPCMessageSizein the dial options.helm/kestrel-operator/templates/deployment.yaml (1)
35-38: Add nil-safety for nested value access.If a user has an older
values.yamlwithout thetlssection, accessing.Values.server.tls.insecureSkipVerifycould fail. Consider using a more defensive template pattern.♻️ Suggested fix for nil-safety
# TLS configuration - name: TLS_INSECURE_SKIP_VERIFY - value: "{{ .Values.server.tls.insecureSkipVerify | default false }}" + value: "{{ ((.Values.server).tls).insecureSkipVerify | default false }}"This uses Helm's parenthetical nil-safe access pattern to prevent template errors when intermediate keys are missing.
| // Check if this is a log endpoint or other non-JSON endpoint | ||
| // Pod logs return plain text, not JSON | ||
| isLogEndpoint := strings.Contains(apiPath, "/log") | ||
| isExecEndpoint := strings.Contains(apiPath, "/exec") | ||
| isProxyEndpoint := strings.Contains(apiPath, "/proxy") | ||
|
|
||
| // Only validate JSON for non-log/exec/proxy endpoints | ||
| if !isLogEndpoint && !isExecEndpoint && !isProxyEndpoint { | ||
| var jsonValidation interface{} | ||
| if err := json.Unmarshal(resultBytes, &jsonValidation); err != nil { | ||
| result.Success = false | ||
| result.ErrorMessage = fmt.Sprintf("Invalid JSON response: %v", err) | ||
| result.StatusCode = http.StatusInternalServerError | ||
|
|
||
| e.Logger.Error("Invalid JSON response from Kubernetes API", | ||
| zap.String("api_path", apiPath), | ||
| zap.Error(err)) | ||
| return result | ||
| } | ||
| } |
There was a problem hiding this comment.
Substring matching for endpoint detection is too broad and may cause false positives.
Using strings.Contains(apiPath, "/log") will match any path containing that substring, including resources with "log" in their name (e.g., /api/v1/namespaces/logging/pods/my-pod, /apis/apps/v1/deployments/my-logger-app). This could incorrectly skip JSON validation for standard JSON-returning endpoints.
Consider using more precise path matching that checks for these as terminal path segments:
🔧 Proposed fix using suffix/segment matching
- // Check if this is a log endpoint or other non-JSON endpoint
- // Pod logs return plain text, not JSON
- isLogEndpoint := strings.Contains(apiPath, "/log")
- isExecEndpoint := strings.Contains(apiPath, "/exec")
- isProxyEndpoint := strings.Contains(apiPath, "/proxy")
+ // Check if this is a log endpoint or other non-JSON endpoint
+ // Pod logs return plain text, not JSON
+ // Match paths ending with /log, /exec, /proxy or having them as segments before query params
+ isLogEndpoint := isNonJSONEndpoint(apiPath, "/log")
+ isExecEndpoint := isNonJSONEndpoint(apiPath, "/exec")
+ isProxyEndpoint := isNonJSONEndpoint(apiPath, "/proxy")Add a helper function:
// isNonJSONEndpoint checks if the API path ends with the given suffix
// or has it as a path segment (possibly followed by query params or sub-paths for proxy)
func isNonJSONEndpoint(apiPath, suffix string) bool {
// Remove query params for checking
path := strings.Split(apiPath, "?")[0]
// Check if path ends with the suffix
if strings.HasSuffix(path, suffix) {
return true
}
// For proxy endpoints, check if suffix appears as a path segment (proxy can have sub-paths)
if suffix == "/proxy" && strings.Contains(path, "/proxy/") {
return true
}
return false
}🤖 Prompt for AI Agents
In `@pkg/k8s_api/k8s_api_executor.go` around lines 99 - 118, The current substring
checks (isLogEndpoint/isExecEndpoint/isProxyEndpoint) on apiPath are too broad
and can skip JSON validation erroneously; replace them with a helper (e.g.,
isNonJSONEndpoint(apiPath, suffix)) that strips query params, checks path
segments and suffixes (use strings.Split(apiPath, "?")[0], strings.HasSuffix for
exact suffixes, and for proxy allow "/proxy/" or suffix "/proxy" matches) and
then use that helper for "/log", "/exec", and "/proxy" when deciding whether to
run json.Unmarshal on resultBytes; update the condition around json.Unmarshal,
keep setting result fields and e.Logger.Error behavior unchanged.
Allows building kestrel-operator-onprem image and chart by specifying: - chart_name: kestrel-operator-onprem (for push-helm.yml) - image_name: kestrel-operator-onprem (for build-image.yml) Default values remain kestrel-operator for backward compatibility.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @.github/workflows/build-image.yml:
- Line 50: The images path uses github.repository_owner raw which can contain
uppercase letters; change it to a lowercase owner to satisfy GHCR. Either inline
lowercase the owner in the images expression (use the toLower(...) expression
around github.repository_owner) or add a prior step that sets an env var
REPO_OWNER to a lowercased github.repository_owner (e.g., using tr) and then use
${ env.REPO_OWNER } in the images value; update the images line that references
env.REGISTRY and github.event.inputs.image_name || 'kestrel-operator' to use the
lowercased owner reference instead.
| uses: docker/metadata-action@v5 | ||
| with: | ||
| images: ${{ env.REGISTRY }}/${{ github.repository }} | ||
| images: ${{ env.REGISTRY }}/${{ github.repository_owner }}/${{ github.event.inputs.image_name || 'kestrel-operator' }} |
There was a problem hiding this comment.
Lowercase github.repository_owner for GHCR compatibility.
GHCR requires lowercase image paths. Unlike push-helm.yml (line 61) which explicitly lowercases REPO_OWNER, this workflow uses github.repository_owner directly. If the owner contains uppercase characters (e.g., "KestrelAI"), this could cause push failures.
🔧 Proposed fix
You can lowercase the owner inline. One approach:
- images: ${{ env.REGISTRY }}/${{ github.repository_owner }}/${{ github.event.inputs.image_name || 'kestrel-operator' }}
+ images: ${{ env.REGISTRY }}/${{ format('{0}', github.repository_owner) | lower }}/${{ github.event.inputs.image_name || 'kestrel-operator' }}Alternatively, add a step to set the lowercased owner as an environment variable (similar to push-helm.yml):
- name: Set lowercase owner
run: echo "REPO_OWNER=$(echo '${{ github.repository_owner }}' | tr '[:upper:]' '[:lower:]')" >> $GITHUB_ENV
- name: Docker meta for operator
id: meta-operator
uses: docker/metadata-action@v5
with:
images: ${{ env.REGISTRY }}/${{ env.REPO_OWNER }}/${{ github.event.inputs.image_name || 'kestrel-operator' }}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| images: ${{ env.REGISTRY }}/${{ github.repository_owner }}/${{ github.event.inputs.image_name || 'kestrel-operator' }} | |
| images: ${{ env.REGISTRY }}/${{ format('{0}', github.repository_owner) | lower }}/${{ github.event.inputs.image_name || 'kestrel-operator' }} |
🤖 Prompt for AI Agents
In @.github/workflows/build-image.yml at line 50, The images path uses
github.repository_owner raw which can contain uppercase letters; change it to a
lowercase owner to satisfy GHCR. Either inline lowercase the owner in the images
expression (use the toLower(...) expression around github.repository_owner) or
add a prior step that sets an env var REPO_OWNER to a lowercased
github.repository_owner (e.g., using tr) and then use ${ env.REPO_OWNER } in the
images value; update the images line that references env.REGISTRY and
github.event.inputs.image_name || 'kestrel-operator' to use the lowercased owner
reference instead.
The nested curl command to fetch the latest stable version was failing in GitHub Actions. Pinning to a specific version for reliability.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@Dockerfile`:
- Around line 25-30: The RUN block that downloads the pinned kubectl binary
(uses ARCH variable, curl to
"https://dl.k8s.io/release/v1.31.0/bin/linux/${ARCH}/kubectl" and moves kubectl
to /usr/local/bin/) lacks checksum verification; update the RUN sequence to also
download the matching .sha256 (e.g., curl -LO
"https://dl.k8s.io/release/v1.31.0/bin/linux/${ARCH}/kubectl.sha256"), verify
the binary with sha256sum -c or by comparing the file's sha256 to the downloaded
value before chmod +x and mv, and fail the build if the check does not pass so
kubectl is only installed when the checksum matches.
| # Install kubectl (multi-arch) - pinned to v1.31.0 for reliability | ||
| RUN ARCH=$(uname -m) && \ | ||
| if [ "$ARCH" = "x86_64" ]; then ARCH="amd64"; elif [ "$ARCH" = "aarch64" ]; then ARCH="arm64"; fi && \ | ||
| curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/${ARCH}/kubectl" && \ | ||
| curl -LO "https://dl.k8s.io/release/v1.31.0/bin/linux/${ARCH}/kubectl" && \ | ||
| chmod +x kubectl && \ | ||
| mv kubectl /usr/local/bin/ |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Kubernetes kubectl official installation checksum verification v1.31.0 site:kubernetes.io
💡 Result:
Official Kubernetes docs show verifying a kubectl download by fetching the matching .sha256 file and checking it locally. Here are the equivalent commands for kubectl v1.31.0 (adjust amd64/arm64 as needed):
Linux (example: amd64)
curl -LO "https://dl.k8s.io/release/v1.31.0/bin/linux/amd64/kubectl"
curl -LO "https://dl.k8s.io/release/v1.31.0/bin/linux/amd64/kubectl.sha256"
echo "$(cat kubectl.sha256) kubectl" | sha256sum --check[1]
macOS (example: Intel amd64)
curl -LO "https://dl.k8s.io/release/v1.31.0/bin/darwin/amd64/kubectl"
curl -LO "https://dl.k8s.io/release/v1.31.0/bin/darwin/amd64/kubectl.sha256"
echo "$(cat kubectl.sha256) kubectl" | shasum -a 256 --check[2]
Windows (example: amd64)
curl.exe -LO "https://dl.k8s.io/release/v1.31.0/bin/windows/amd64/kubectl.exe"
curl.exe -LO "https://dl.k8s.io/v1.31.0/bin/windows/amd64/kubectl.exe.sha256"
# Command Prompt manual compare:
CertUtil -hashfile kubectl.exe SHA256
type kubectl.exe.sha256
# PowerShell boolean check:
$(Get-FileHash -Algorithm SHA256 .\kubectl.exe).Hash -eq $(Get-Content .\kubectl.exe.sha256)[3]
Sources: [1] Linux install / checksum verification docs, [2] macOS install / checksum verification docs, [3] Windows install / checksum verification docs.
🌐 Web query:
dl.k8s.io release v1.31.0 kubectl.sha256 checksum download
💡 Result:
kubectl.sha256 is published per OS/arch under the dl.k8s.io/release/v1.31.0/bin/... path. Download it like this:
# Linux x86_64
curl -LO "https://dl.k8s.io/release/v1.31.0/bin/linux/amd64/kubectl.sha256"
# macOS Intel
curl -LO "https://dl.k8s.io/release/v1.31.0/bin/darwin/amd64/kubectl.sha256"
# macOS Apple Silicon
curl -LO "https://dl.k8s.io/release/v1.31.0/bin/darwin/arm64/kubectl.sha256"(These URLs follow the official Kubernetes install docs’ download pattern for kubectl.sha256.) [1], [2]
Sources: [1] [2]
Add checksum verification for the pinned kubectl binary.
Pinning helps reproducibility, but downloading without integrity verification leaves a supply-chain gap. Fetch and verify the official .sha256 before installing—this follows Kubernetes' official installation guidance and is supported for v1.31.0.
🔐 Suggested patch (checksum verification)
-RUN ARCH=$(uname -m) && \
- if [ "$ARCH" = "x86_64" ]; then ARCH="amd64"; elif [ "$ARCH" = "aarch64" ]; then ARCH="arm64"; fi && \
- curl -LO "https://dl.k8s.io/release/v1.31.0/bin/linux/${ARCH}/kubectl" && \
- chmod +x kubectl && \
- mv kubectl /usr/local/bin/
+RUN ARCH=$(uname -m) && \
+ if [ "$ARCH" = "x86_64" ]; then ARCH="amd64"; elif [ "$ARCH" = "aarch64" ]; then ARCH="arm64"; fi && \
+ KUBECTL_VERSION="v1.31.0" && \
+ curl -fL -O "https://dl.k8s.io/release/${KUBECTL_VERSION}/bin/linux/${ARCH}/kubectl" && \
+ curl -fL -O "https://dl.k8s.io/release/${KUBECTL_VERSION}/bin/linux/${ARCH}/kubectl.sha256" && \
+ echo "$(cat kubectl.sha256) kubectl" | sha256sum -c - && \
+ chmod +x kubectl && \
+ mv kubectl /usr/local/bin/ && \
+ rm kubectl.sha256🤖 Prompt for AI Agents
In `@Dockerfile` around lines 25 - 30, The RUN block that downloads the pinned
kubectl binary (uses ARCH variable, curl to
"https://dl.k8s.io/release/v1.31.0/bin/linux/${ARCH}/kubectl" and moves kubectl
to /usr/local/bin/) lacks checksum verification; update the RUN sequence to also
download the matching .sha256 (e.g., curl -LO
"https://dl.k8s.io/release/v1.31.0/bin/linux/${ARCH}/kubectl.sha256"), verify
the binary with sha256sum -c or by comparing the file's sha256 to the downloaded
value before chmod +x and mv, and fail the build if the check does not pass so
kubectl is only installed when the checksum matches.
Implement support for Kestrel Operators deployed on-prem.
Summary by CodeRabbit
New Features
Bug Fixes
Chores
✏️ Tip: You can customize this high-level summary in your review settings.