
fix: reconcileVersionUpgrade creates UpgradePolicy in seam-tenant-{cluster} #28

Merged
ontave merged 11 commits into main from session/25e-upgrade-policy-tenant-ns
May 17, 2026

Conversation


@ontave ontave commented May 7, 2026

Summary

  • reconcileVersionUpgrade was creating the UpgradePolicy in tc.Namespace (seam-system for imported clusters)
  • Conductor's stackUpgradeHandler reads the UpgradePolicy from tenantNamespace(clusterRef) = seam-tenant-{cluster}, so the policy was never found and the executor Job failed
  • Fix: create UpgradePolicy in "seam-tenant-" + tc.Name where the platform-executor SA, talosconfig Secret, and all conductor executor infrastructure already live
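
The namespace rule in the summary can be sketched as below. This is a minimal illustration, not the reconciler's actual code: only the "seam-tenant-" prefix and the tenantNamespace name come from this PR; policyRef, newPolicyRef, and the policy name "example-policy" are hypothetical.

```go
package main

import "fmt"

// tenantNamespacePrefix is the "seam-tenant-" prefix named in the summary.
const tenantNamespacePrefix = "seam-tenant-"

// tenantNamespace is a stand-in for the helper Conductor uses. Only its
// result (seam-tenant-{cluster}) is taken from the PR description; the
// signature here is an assumption.
func tenantNamespace(clusterName string) string {
	return tenantNamespacePrefix + clusterName
}

// policyRef is an illustrative type, not the real UpgradePolicy object.
type policyRef struct {
	Namespace string
	Name      string
}

// newPolicyRef places the policy in the tenant namespace instead of the
// TalosCluster's own namespace, so the writer and the reader agree.
func newPolicyRef(clusterName, policyName string) policyRef {
	return policyRef{Namespace: tenantNamespace(clusterName), Name: policyName}
}

func main() {
	ref := newPolicyRef("ccs-dev", "example-policy")
	fmt.Println(ref.Namespace) // prints "seam-tenant-ccs-dev"
}
```

Deriving the namespace from the cluster name on both sides removes the dependency on tc.Namespace, which is seam-system for imported clusters.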

Closes

  • STACK-UPGRADE-UP-NAMESPACE (backlog)
  • STACK-UPGRADE-MGMT-SA (superseded -- SA already exists in tenant ns)
  • STACK-UPGRADE-TALOSCONFIG-SCOPE (superseded -- talosconfig already there)

Test plan

  • All 8 version upgrade unit tests pass (updated namespace assertions)
  • Verified live on ccs-dev (session/25e): after fix, UpgradePolicy created in seam-tenant-ccs-dev, executor Job runs in that namespace, conductor reads policy successfully

ontave added 11 commits May 7, 2026 07:33
When a drift-k8s-version-{cluster} DriftSignal arrives (emitted by
KubernetesVersionDriftLoop on the tenant conductor), create a corrective
UpgradePolicy (type=kubernetes, targetKubernetesVersion=spec.kubernetesVersion)
in seam-tenant-{cluster}. UpgradePolicyReconciler picks it up and submits
a kube-upgrade executor Job to bring the cluster back to declared state.
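
As a rough sketch of the step described above, the corrective policy could be assembled from the signal name and the declared version. The upgradePolicy struct, the correctivePolicy function, and the idea of recovering the cluster name by trimming the signal-name prefix are all assumptions made for illustration; the real reconciler may obtain the cluster differently.

```go
package main

import (
	"fmt"
	"strings"
)

// upgradePolicy is a hypothetical, flattened stand-in for the UpgradePolicy CR.
type upgradePolicy struct {
	Namespace               string
	Type                    string
	TargetKubernetesVersion string
}

// correctivePolicy builds the policy for a drift-k8s-version-{cluster} signal.
// Assumption: the cluster name is the suffix of the signal name.
func correctivePolicy(signalName, declaredK8sVersion string) (upgradePolicy, bool) {
	const prefix = "drift-k8s-version-"
	if !strings.HasPrefix(signalName, prefix) {
		return upgradePolicy{}, false
	}
	cluster := strings.TrimPrefix(signalName, prefix)
	if cluster == "" {
		return upgradePolicy{}, false
	}
	return upgradePolicy{
		Namespace:               "seam-tenant-" + cluster,
		Type:                    "kubernetes",
		TargetKubernetesVersion: declaredK8sVersion, // spec.kubernetesVersion
	}, true
}

func main() {
	// "1.30.0" is a placeholder version, not one taken from the PR.
	p, ok := correctivePolicy("drift-k8s-version-ccs-dev", "1.30.0")
	fmt.Println(ok, p.Namespace, p.Type, p.TargetKubernetesVersion)
}
```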

Routing: InfrastructureTalosCluster signals are now distinguished by name
prefix -- drift-k8s-version-* routes to handleKubernetesVersionDrift,
all others continue to handleTalosVersionDrift.
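
The prefix-based routing might look roughly like this. The handler names and the drift-k8s-version- prefix come from the commit message; routeDriftSignal, its string return value, and the second sample signal name are hypothetical and exist only to keep the sketch self-contained.

```go
package main

import (
	"fmt"
	"strings"
)

// k8sVersionDriftPrefix is the name prefix called out in the commit message.
const k8sVersionDriftPrefix = "drift-k8s-version-"

// routeDriftSignal returns the name of the handler that would process a
// signal. The real reconciler presumably calls the handler directly;
// returning its name is a simplification for this example.
func routeDriftSignal(signalName string) string {
	if strings.HasPrefix(signalName, k8sVersionDriftPrefix) {
		return "handleKubernetesVersionDrift"
	}
	// Every other signal keeps the existing Talos path.
	return "handleTalosVersionDrift"
}

func main() {
	// The second name is an invented example of a non-matching signal.
	for _, name := range []string{"drift-k8s-version-ccs-dev", "drift-talos-version-ccs-dev"} {
		fmt.Printf("%s -> %s\n", name, routeDriftSignal(name))
	}
}
```

Because the fallback is the Talos handler, any signal name that does not match the new prefix behaves exactly as before this change.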

1 unit test: TestDriftSignalReconciler_K8sVersionDrift_CreatesUpgradePolicy.
All 7 DriftSignal unit tests pass.
…paths

reconcileVersionUpgrade now derives UpgradePolicy type from which version
fields are set: talosVersion only -> UpgradeTypeTalos (existing), kubernetesVersion
only -> UpgradeTypeKubernetes, both -> UpgradeTypeStack. Two new unit tests:
TestTalosCluster_VersionUpgrade_KubernetesOnly_CreatesKubePolicy and
TestTalosCluster_VersionUpgrade_Stack_CreatesBothVersions. All 8 version
upgrade tests pass.
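
The type derivation described above can be sketched as a small switch. The UpgradeType constant names are from the commit message; the string values "talos" and "stack", the deriveUpgradeType function, and the error returned when neither field is set are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// UpgradeType mirrors the constant names in the commit message. Only the
// "kubernetes" value appears in this PR; the other two values are assumed.
type UpgradeType string

const (
	UpgradeTypeTalos      UpgradeType = "talos"
	UpgradeTypeKubernetes UpgradeType = "kubernetes"
	UpgradeTypeStack      UpgradeType = "stack"
)

// deriveUpgradeType picks the policy type from which version fields are set.
// Returning an error when neither is set is an assumption; the PR does not
// say how that case is handled.
func deriveUpgradeType(talosVersion, kubernetesVersion string) (UpgradeType, error) {
	switch {
	case talosVersion != "" && kubernetesVersion != "":
		return UpgradeTypeStack, nil
	case kubernetesVersion != "":
		return UpgradeTypeKubernetes, nil
	case talosVersion != "":
		return UpgradeTypeTalos, nil
	default:
		return "", errors.New("no version field set")
	}
}

func main() {
	// Version strings here are placeholders.
	t, err := deriveUpgradeType("v1.7.0", "1.30.0")
	fmt.Println(t, err)
}
```

Checking the "both set" case first keeps the three branches mutually exclusive without nested conditionals.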
…uster}

The UpgradePolicy was created in tc.Namespace (seam-system for imported
clusters). Conductor's stackUpgradeHandler reads the UpgradePolicy from
tenantNamespace(clusterRef) = seam-tenant-{cluster}, so the executor Job
looked in a namespace where the policy did not exist and failed.

Fix: create UpgradePolicy in seam-tenant-{tc.Name} where the
platform-executor SA, talosconfig Secret, and Conductor executor all
already live. Closes STACK-UPGRADE-UP-NAMESPACE; STACK-UPGRADE-MGMT-SA
and STACK-UPGRADE-TALOSCONFIG-SCOPE are superseded.

Tests: update all UpgradePolicy namespace lookups to seam-tenant-ccs-mgmt.
… under seam.ontai.dev

Defines TalosCluster under api/seam/v1alpha1 (seam.ontai.dev/v1alpha1).
Removes the dead InfrastructureTalosCluster stub from api/v1alpha1.
Adds seam.ontai.dev_talosclusters.yaml CRD manifest.
Updates main.go, reconciler, and all consumer tests to the new type.
Also adds CRD manifests for day-2 types produced during session 25.

Replace seam-core -> seam in go.mod replace/require. Update all Go
import paths from github.com/ontai-dev/seam-core/ to
github.com/ontai-dev/seam/. Add seam-sdk replace + require. Update
runnerconfig_cr.go type aliases to use post-MIGRATION-3.8 names
(RunnerConfig, RunnerConfigSpec, RunnerConfigStatus).

Replace ../seam-core with ../seam following the seam-core -> seam
filesystem rename. Module path github.com/ontai-dev/seam was already
updated in Phase 4; this aligns the local path pointer.
…ontai.dev

Update rbacPolicyGVK, rbacProfileGVK and APIGroups arrays from security.ontai.dev
to guardian.ontai.dev in taloscluster_helpers.go and associated tests.
…names

- Replace testdata/crds/infrastructure.ontai.dev_infrastructurerunnerconfigs.yaml
  with seam.ontai.dev_runnerconfigs.yaml (current seam CRD, same group/kind)
- Comments: InfrastructureTalosCluster -> TalosCluster,
  InfrastructureTalosClusterOperationResult -> ClusterLog,
  seam-core -> seam (module/repo references, not schema doc names)
All 3 platform test packages pass (unit, integration/capi, integration/day2).

Fresh documentation from current codebase. seam-core references replaced
with seam. wrapper references replaced with dispatcher. TalosCluster and
ClusterLog ownership under seam.ontai.dev clarified. platform.ontai.dev
day-2 CRD catalog updated to match current Go types.
@ontave ontave merged commit 93542ef into main May 17, 2026
1 of 3 checks passed
@ontave ontave deleted the session/25e-upgrade-policy-tenant-ns branch May 17, 2026 21:48