Skip to content

Bug: Apply MetricsSet relabel configs to KAS ServiceMonitor#8715

Open
wanghaoran1988 wants to merge 1 commit into
openshift:mainfrom
wanghaoran1988:fix-kas-servicemonitor-metrics-set
Open

Bug: Apply MetricsSet relabel configs to KAS ServiceMonitor#8715
wanghaoran1988 wants to merge 1 commit into
openshift:mainfrom
wanghaoran1988:fix-kas-servicemonitor-metrics-set

Conversation

@wanghaoran1988

@wanghaoran1988 wanghaoran1988 commented Jun 11, 2026

Copy link
Copy Markdown
Member

Summary

  • Fix KAS adaptServiceMonitor() to apply MetricRelabelConfigs from the configured MetricsSet, matching the pattern used by every other component (etcd, KCM, CVO, scheduler, OAPI, OCM, routecm, NTO, OLM, catalog-operator)
  • Add unit test for TestAdaptServiceMonitor following the same pattern as kcm/servicemonitor_test.go

Problem

The KAS ServiceMonitor always uses the hardcoded Telemetry default from the YAML asset file (3 metrics only: apiserver_storage_objects, apiserver_request_total, apiserver_current_inflight_requests), regardless of the METRICS_SET environment variable.

When METRICS_SET=SRE is set and the sre-metric-set ConfigMap is configured with additional metrics (e.g., apiserver_envelope_encryption_* for KMS monitoring), the KAS ServiceMonitor ignores them because adaptServiceMonitor() never calls metrics.KASRelabelConfigs(cpContext.MetricsSet).

Every other component correctly applies its metrics set:

Component Has MetricRelabelConfigs line
etcd metrics.EtcdRelabelConfigs(cpContext.MetricsSet)
kcm metrics.KCMRelabelConfigs(cpContext.MetricsSet)
cvo metrics.CVORelabelConfigs(cpContext.MetricsSet)
scheduler metrics.SchedulerRelabelConfigs(cpContext.MetricsSet)
oapi metrics.OpenShiftAPIServerRelabelConfigs(cpContext.MetricsSet)
ocm metrics.OpenShiftControllerManagerRelabelConfigs(cpContext.MetricsSet)
routecm metrics.OpenShiftRouteControllerManagerRelabelConfigs(cpContext.MetricsSet)
nto metrics.NTORelabelConfigs(cpContext.MetricsSet)
olm metrics.OLMRelabelConfigs(cpContext.MetricsSet)
catalog metrics.CatalogOperatorRelabelConfigs(cpContext.MetricsSet)
kas missing

Fix

Add the missing line to v2/kas/servicemonitor.go:

sm.Spec.Endpoints[0].MetricRelabelConfigs = metrics.KASRelabelConfigs(cpContext.MetricsSet)

Test plan

  • Unit test TestAdaptServiceMonitor added (mirrors kcm/servicemonitor_test.go)
  • Tests pass: go test ./control-plane-operator/controllers/hostedcontrolplane/v2/kas/ -run TestAdaptServiceMonitor -v

Summary by CodeRabbit

  • New Features

    • Service monitor endpoints now apply metric relabeling according to the selected metrics set and inject the cluster identifier into endpoint labels when applicable.
  • Tests

    • Added tests validating service monitor behavior across metrics configurations, ensuring metric relabeling rules and cluster-ID labeling are applied correctly.

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 7cc1a0a2-e861-468c-be1d-3cbcb1f3e944

📥 Commits

Reviewing files that changed from the base of the PR and between 8147217 and 707e294.

⛔ Files ignored due to path filters (5)
  • control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/AROSwift/zz_fixture_TestControlPlaneComponents_kube_apiserver_servicemonitor.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/GCP/zz_fixture_TestControlPlaneComponents_kube_apiserver_servicemonitor.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/IBMCloud/zz_fixture_TestControlPlaneComponents_kube_apiserver_servicemonitor.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_kube_apiserver_servicemonitor.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/zz_fixture_TestControlPlaneComponents_kube_apiserver_servicemonitor.yaml is excluded by !**/testdata/**
📒 Files selected for processing (2)
  • control-plane-operator/controllers/hostedcontrolplane/v2/kas/servicemonitor.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/kas/servicemonitor_test.go
✅ Files skipped from review due to trivial changes (1)
  • control-plane-operator/controllers/hostedcontrolplane/v2/kas/servicemonitor.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • control-plane-operator/controllers/hostedcontrolplane/v2/kas/servicemonitor_test.go

📝 Walkthrough

Walkthrough

This PR imports the metrics package and updates adaptServiceMonitor to populate the ServiceMonitor's first endpoint MetricRelabelConfigs via metrics.KASRelabelConfigs(cpContext.MetricsSet) before injecting the cluster ID label. It also adds TestAdaptServiceMonitor, a table-driven test that runs parallel subtests to verify namespace selector assignment, metric relabel config application for different MetricsSet values, and cluster ID relabel injection when MetricsSetAll is used.


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (2 inconclusive)

Check name Status Explanation Resolution
Stable And Deterministic Test Names ❓ Inconclusive This check targets Ginkgo test names (It(), Describe(), Context(), When()), but the PR contains only standard Go tests using testing.T with t.Run(). The check is not applicable to standard Go tests. If the intent is to verify all test naming patterns in this codebase, clarify the scope of test frameworks covered.
Test Structure And Quality ❓ Inconclusive The custom check is specifically for Ginkgo BDD test code (Describe/It blocks with BeforeEach/AfterEach), but the test added uses standard Go table-driven testing with Gomega assertions. This testi... Clarify whether the check applies to standard Go testing patterns or only Ginkgo BDD tests. If applied to standard Go tests: the test follows good patterns (table-driven, parallel, proper setup isolation, 2 assertions with messages).
✅ Passed checks (9 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Bug: Apply MetricsSet relabel configs to KAS ServiceMonitor' accurately describes the main change: adding missing metric relabeling configuration to the KAS ServiceMonitor based on the configured MetricsSet.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Topology-Aware Scheduling Compatibility ✅ Passed PR only modifies Prometheus ServiceMonitor metric relabeling configuration; no scheduling constraints, pod anti-affinity, topology constraints, or node selectors are introduced or modified.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed The PR adds only a standard Go unit test (TestAdaptServiceMonitor), not a Ginkgo e2e test. The custom check applies exclusively to Ginkgo e2e tests (It(), Describe(), Context(), When()). This check...
No-Weak-Crypto ✅ Passed The PR only modifies metrics configuration in servicemonitor.go/.test.go files. No weak cryptographic algorithms (MD5, SHA1, DES, RC4, 3DES, Blowfish, ECB), custom crypto implementations, or non-co...
Container-Privileges ✅ Passed PR modifies only Go source code (servicemonitor.go and servicemonitor_test.go) for metrics relabeling logic; contains no Kubernetes manifests or container security configurations to evaluate for pr...
No-Sensitive-Data-In-Logs ✅ Passed No logging statements found in the modified files. Changes only configure metric relabeling in-memory without exposing passwords, tokens, API keys, PII, session IDs, hostnames, or customer data.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot requested review from devguyio and enxebre June 11, 2026 09:18
@openshift-ci openshift-ci Bot added area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release and removed do-not-merge/needs-area labels Jun 11, 2026
@wanghaoran1988 wanghaoran1988 force-pushed the fix-kas-servicemonitor-metrics-set branch from 63aef41 to 7e6810a Compare June 11, 2026 09:21

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
control-plane-operator/controllers/hostedcontrolplane/v2/kas/servicemonitor_test.go (1)

36-47: ⚡ Quick win

Consider validating metric relabel config content, not just existence.

The test checks that MetricRelabelConfigs is not nil, but doesn't validate the actual configs returned by metrics.KASRelabelConfigs. For MetricsSetTelemetry, you could verify the length is 1 and spot-check the Action or Regex field to ensure the correct config was applied.

💡 Example assertion strengthening
 g.Expect(sm.Spec.Endpoints[0].MetricRelabelConfigs).ToNot(BeNil())
+g.Expect(sm.Spec.Endpoints[0].MetricRelabelConfigs).To(HaveLen(1))
+g.Expect(sm.Spec.Endpoints[0].MetricRelabelConfigs[0].Action).To(Equal("keep"))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@control-plane-operator/controllers/hostedcontrolplane/v2/kas/servicemonitor_test.go`
around lines 36 - 47, The test case "When service monitor is adapted, it should
apply metric relabel configs" currently only asserts that
sm.Spec.Endpoints[0].MetricRelabelConfigs is not nil; update the validate
closure to compare the actual relabel configs to the expected output from
metrics.KASRelabelConfigs by asserting the slice length equals 1 and
spot-checking one or two key fields (e.g., MetricRelabelConfigs[0].Action and
MetricRelabelConfigs[0].Regex) against the known values returned by
metrics.KASRelabelConfigs (or compare the first element directly) to ensure the
correct config was applied.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@control-plane-operator/controllers/hostedcontrolplane/v2/kas/servicemonitor_test.go`:
- Around line 48-67: Update the test's validate closure to also assert the
cluster ID relabel config is present in MetricRelabelConfigs on the
ServiceMonitor endpoints (not just RelabelConfigs): after locating
sm.Spec.Endpoints[0].RelabelConfigs, iterate
sm.Spec.Endpoints[0].MetricRelabelConfigs and verify there is a config with
TargetLabel == "_id" and Replacement == "cluster-abc-123"; reference
util.ApplyClusterIDLabel and the ServiceMonitor endpoint slice
(sm.Spec.Endpoints) when adding this assertion so both RelabelConfigs and
MetricRelabelConfigs are validated.

---

Nitpick comments:
In
`@control-plane-operator/controllers/hostedcontrolplane/v2/kas/servicemonitor_test.go`:
- Around line 36-47: The test case "When service monitor is adapted, it should
apply metric relabel configs" currently only asserts that
sm.Spec.Endpoints[0].MetricRelabelConfigs is not nil; update the validate
closure to compare the actual relabel configs to the expected output from
metrics.KASRelabelConfigs by asserting the slice length equals 1 and
spot-checking one or two key fields (e.g., MetricRelabelConfigs[0].Action and
MetricRelabelConfigs[0].Regex) against the known values returned by
metrics.KASRelabelConfigs (or compare the first element directly) to ensure the
correct config was applied.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: d73fe37b-a2da-42f3-af73-b1908d4d4dae

📥 Commits

Reviewing files that changed from the base of the PR and between 4755e9c and 63aef41.

📒 Files selected for processing (2)
  • control-plane-operator/controllers/hostedcontrolplane/v2/kas/servicemonitor.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/kas/servicemonitor_test.go

@muraee muraee left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch — the production fix is correct and minimal. KAS was the only component missing the KASRelabelConfigs() call.

A few issues with the test, and a cleanup suggestion — see inline comments.

@codecov

codecov Bot commented Jun 11, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 41.55%. Comparing base (4755e9c) to head (707e294).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #8715   +/-   ##
=======================================
  Coverage   41.54%   41.55%           
=======================================
  Files         758      758           
  Lines       93838    93839    +1     
=======================================
+ Hits        38986    38994    +8     
+ Misses      52107    52100    -7     
  Partials     2745     2745           
Files with missing lines Coverage Δ
...ollers/hostedcontrolplane/v2/kas/servicemonitor.go 53.33% <100.00%> (+53.33%) ⬆️
Flag Coverage Δ
cmd-support 34.96% <ø> (ø)
cpo-hostedcontrolplane 43.64% <100.00%> (+0.04%) ⬆️
cpo-other 43.17% <ø> (ø)
hypershift-operator 51.62% <ø> (ø)
other 31.56% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@muraee

muraee commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

/approve

@openshift-ci

openshift-ci Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: muraee, wanghaoran1988

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 11, 2026
@wanghaoran1988 wanghaoran1988 force-pushed the fix-kas-servicemonitor-metrics-set branch 2 times, most recently from 81a8677 to 8147217 Compare June 11, 2026 11:39
@wanghaoran1988

Copy link
Copy Markdown
Member Author

Thanks for the review @muraee! Addressed all feedback:

  1. SRE assertion (line 82): Fixed — KASRelabelConfigs(SRE) returns nil when no ConfigMap is loaded in unit tests. Changed to assert MetricRelabelConfigs has length 1 (only the cluster ID entry from ApplyClusterIDLabel), which accurately reflects the unit test behavior.

  2. Weak Telemetry assertion (line 45): Strengthened — now asserts on the actual Telemetry relabel config content (action=keep, exact regex string), proving the configs came from KASRelabelConfigs() rather than YAML defaults.

  3. Cluster ID in MetricRelabelConfigs (@coderabbitai): Added — the cluster ID label test now verifies it appears in both RelabelConfigs and MetricRelabelConfigs, matching the behavior of ApplyClusterIDLabel().

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
cat control-plane-operator/controllers/hostedcontrolplane/v2/kas/servicemonitor_test.go

Length of output: 4825



@wanghaoran1988 Thanks for the update! I've reviewed the current state of servicemonitor_test.go and all three points have been well addressed:

  1. SRE assertion — correctly asserts HaveLen(1) and verifies the single entry has TargetLabel == "_id", accurately reflecting the nil KASRelabelConfigs(SRE) + single cluster ID entry from ApplyClusterIDLabel.

  2. Telemetry assertion — now strongly validates the actual config content: Action == "keep" and the exact regex (apiserver_storage_objects|apiserver_request_total|apiserver_current_inflight_requests), proving the configs originate from KASRelabelConfigs().

  3. Cluster ID in MetricRelabelConfigs — the test correctly iterates both RelabelConfigs and MetricRelabelConfigs to verify the _id label with the expected replacement value appears in both, matching ApplyClusterIDLabel() behavior.

The test suite looks solid. Thanks for the thorough follow-up!

@hypershift-jira-solve-ci

Copy link
Copy Markdown

I have all the evidence needed. Let me compile the final analysis report.

Test Failure Analysis Complete

Job Information

Test Failure Analysis

Error

task fetch-product-metadata has the status "TaskRunImagePullFailed":
the step "parse-product-metadata" in TaskRun "hypershicceea78afae7730597173fa75d5b0e60-fetch-product-metadata"
failed to pull the image "". The pod errored with the message:
"Back-off pulling image "quay.io/konflux-ci/task-runner:1.3.0@sha256:3f007bf58821885f8aa30d72c84fcbfcb14babc6521eaf6ac1bc4f8c078d9e58"."

Summary

This is a Konflux CI infrastructure failure, not a code defect in PR #8715. The pipeline's fetch-product-metadata task failed because Kubernetes could not pull the container image quay.io/konflux-ci/task-runner:1.3.0 within the back-off window. The image exists and is valid on quay.io (verified independently), and the same hypershift-cli-mce-50-on-pull-request check passed on the 10 most recent PRs (#8701#8713). The pipeline ran for only 45 seconds total, never reaching the build or test stages — the PR's code changes (MetricsSet relabel configs for KAS ServiceMonitor) were never compiled or tested. A re-trigger of the pipeline should resolve the failure.

Root Cause

The Konflux pipeline task fetch-product-metadata uses a Tekton TaskRun that runs the step parse-product-metadata inside the container image quay.io/konflux-ci/task-runner:1.3.0@sha256:3f007bf58821885f8aa30d72c84fcbfcb14babc6521eaf6ac1bc4f8c078d9e58. The Kubernetes pod backing this TaskRun entered an ImagePullBackOff state, meaning the kubelet attempted to pull the image, failed (likely due to a transient network issue, registry throttling, or node-level image pull timeout on the Konflux build cluster stone-prd-rh01), and the back-off retry window expired before a successful pull.

Key evidence that this is transient infrastructure, not a code problem:

  1. The image is validskopeo inspect confirms the digest exists on quay.io with manifests for amd64, arm64, s390x, and ppc64le.
  2. Other PRs pass — PRs NO-JIRA: Add gitlint to pre-commit hooks #8701 through CNTRLPLANE-3611: propagate tls profile to aws-pod-identity-webhook #8713 all show SUCCESS for this exact check name, meaning the pipeline definition and image reference are correct in the mainline.
  3. Extremely short runtime — The entire pipeline ran for 45 seconds. Only init (succeeded, 4s) and clone-repository (succeeded, 31s) completed. The failure happened at the fetch-product-metadata task (7s), which is the second task in the pipeline — no code from the PR was ever built or tested.
  4. No code relation — The PR modifies ServiceMonitor YAML test fixtures and Go controller logic for KAS metric relabeling. The failing task (fetch-product-metadata) is a Konflux infrastructure task that fetches product metadata before any build occurs.
Recommendations
  1. Re-trigger the pipeline — Post a /retest comment or push an empty commit to re-run the Konflux checks. The transient image pull failure should not recur.
  2. No code changes needed — The PR's changes to MetricsSet relabel configs are unrelated to this infrastructure failure.
  3. If the failure persists on retry — Escalate to the Konflux/RHTAP team, as it may indicate a broader issue with the stone-prd-rh01 build cluster or quay.io registry connectivity from that cluster. Reference the image quay.io/konflux-ci/task-runner:1.3.0 and the pipeline run ID hypershift-cli-mce-50-on-pull-request-nj7z8 in the report.
Evidence
Evidence Detail
Failed task fetch-product-metadata (TaskRun status: TaskRunImagePullFailed)
Failed step parse-product-metadata
Image reference quay.io/konflux-ci/task-runner:1.3.0@sha256:3f007bf58821885f8aa30d72c84fcbfcb14babc6521eaf6ac1bc4f8c078d9e58
Image exists on registry ✅ Yes — verified via skopeo inspect, multi-arch manifest (amd64, arm64, s390x, ppc64le)
Pipeline duration 45 seconds (11:40:08Z → 11:40:53Z)
Tasks completed init ✅ (4s), clone-repository ✅ (31s)
Tasks failed fetch-product-metadata ❌ (7s)
Same check on PR #8713 ✅ SUCCESS
Same check on PR #8712 ✅ SUCCESS
Same check on PR #8711 ✅ SUCCESS
Same check on PRs #8701#8710 ✅ All SUCCESS
PR code relation to failure None — PR modifies KAS ServiceMonitor relabel configs; failure is in Konflux infra task

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@control-plane-operator/controllers/hostedcontrolplane/v2/kas/servicemonitor_test.go`:
- Around line 45-48: The test comment says there should be two
MetricRelabelConfigs (a "keep" rule plus a cluster ID relabel) but the assertion
currently only checks >= 1; update the assertions in servicemonitor_test.go to
assert the expected count and presence of both rules by checking
len(sm.Spec.Endpoints[0].MetricRelabelConfigs) is >= 2 (or == 2 if exact) and
then verify the second config corresponds to the cluster ID relabel (inspect
sm.Spec.Endpoints[0].MetricRelabelConfigs[1].Action/Regex or the relabel target)
in addition to the existing check of MetricRelabelConfigs[0] being the "keep"
rule.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: b925ef48-0a99-4744-b22a-f9d4e63d7bc2

📥 Commits

Reviewing files that changed from the base of the PR and between 81a8677 and 8147217.

⛔ Files ignored due to path filters (5)
  • control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/AROSwift/zz_fixture_TestControlPlaneComponents_kube_apiserver_servicemonitor.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/GCP/zz_fixture_TestControlPlaneComponents_kube_apiserver_servicemonitor.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/IBMCloud/zz_fixture_TestControlPlaneComponents_kube_apiserver_servicemonitor.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/TechPreviewNoUpgrade/zz_fixture_TestControlPlaneComponents_kube_apiserver_servicemonitor.yaml is excluded by !**/testdata/**
  • control-plane-operator/controllers/hostedcontrolplane/testdata/kube-apiserver/zz_fixture_TestControlPlaneComponents_kube_apiserver_servicemonitor.yaml is excluded by !**/testdata/**
📒 Files selected for processing (2)
  • control-plane-operator/controllers/hostedcontrolplane/v2/kas/servicemonitor.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/kas/servicemonitor_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • control-plane-operator/controllers/hostedcontrolplane/v2/kas/servicemonitor.go

The KAS adaptServiceMonitor() function was missing the line that applies
MetricRelabelConfigs from the configured MetricsSet (Telemetry/SRE/All).
Every other component (etcd, KCM, CVO, scheduler, OAPI, OCM, routecm,
NTO, OLM, catalog-operator) correctly calls its corresponding
metrics.XxxRelabelConfigs(cpContext.MetricsSet), but KAS did not.

This caused the KAS ServiceMonitor to always use the hardcoded Telemetry
default from the YAML asset (3 metrics: apiserver_storage_objects,
apiserver_request_total, apiserver_current_inflight_requests), ignoring
the sre-metric-set ConfigMap entirely when METRICS_SET=SRE.

The fix adds the missing call to metrics.KASRelabelConfigs() and a unit
test matching the pattern used by KCM (kcm/servicemonitor_test.go).
@wanghaoran1988 wanghaoran1988 force-pushed the fix-kas-servicemonitor-metrics-set branch from 8147217 to 707e294 Compare June 11, 2026 11:48
@wanghaoran1988

wanghaoran1988 commented Jun 11, 2026

Copy link
Copy Markdown
Member Author

/jira ARO-25913

@openshift-ci

openshift-ci Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

@wanghaoran1988: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants