Kubeconfig hotreload#25
Conversation
|
Warning Review limit reached
More reviews will be available in 55 minutes and 57 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (6)
📝 WalkthroughWalkthroughThis PR adds a filewatcher package that debounces fsnotify events and reruns a provided init function on file updates. Tests and a go.mod change add fsnotify as a direct dependency. main.go is updated to use RerunOnFileUpdate to reload the manager when the watch target changes. ChangesKubeconfig Hot-Reload Feature
Sequence DiagramsequenceDiagram
participant Process as Process Startup
participant SignalHandler as Signal Handler
participant Watcher as RerunOnFileUpdate
participant FSN as fsnotify
participant Manager as Manager + Reconcilers
participant Kubeconfig as Kubeconfig / CA file
Process->>SignalHandler: SetupSignalHandler()
Process->>Watcher: RerunOnFileUpdate(ctx, watch_path, initFunc)
Watcher->>Manager: execute initFunc (1st invocation)
Manager->>Kubeconfig: load kubeconfig / read cert
Manager->>Manager: create manager & register reconcilers
Manager->>Manager: start manager
Kubeconfig->>FSN: file modified (write/rename)
FSN->>Watcher: event observed
Watcher->>Watcher: debounce (200ms)
Watcher->>Manager: cancel child context
Manager->>Manager: stop (context done)
Watcher->>Manager: execute initFunc (2nd invocation)
Manager->>Kubeconfig: load updated kubeconfig / cert
Manager->>Manager: recreate manager & reconcilers
Manager->>Manager: start manager
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (1)
filewatcher/watcher_test.go (1)
18-56: ⚡ Quick winAdd a spec for the Kubernetes
..dataupdate path.This test only exercises direct writes to the file. The watcher’s Kubernetes-specific branch is the
base == "..data"case, and that path is never hit here, so the behavior this package was added for can still regress unnoticed.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@filewatcher/watcher_test.go` around lines 18 - 56, Add a new spec in watcher_test.go that simulates the Kubernetes "..data" update path: create a temp directory, make a subdir "..data/v1" with the watched file (e.g. "watched.yaml") inside, create a symlink in the temp dir so the visible file path points to "..data/v1/watched.yaml", then call filewatcher.RerunOnFileUpdate with that visible path and the existing runFunc/startCount; to trigger an update create "..data/v2/watched.yaml" and atomically replace the "..data" symlink to point to "v2" (remove+symlink), and assert startCount increments just like the direct-write test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@filewatcher/watcher.go`:
- Around line 40-58: The loop currently restarts unconditionally after runFunc
returns; change it to only restart when the reloadCh actually fired. Inside the
loop, create a fresh one-shot channel or flag (e.g., reloadFired channel) and
have the goroutine that listens to reloadCh close that channel (or set the flag)
just before calling runCancel(); after runFunc(runCtx) returns, check the
reloadFired channel non-blockingly (or the flag) and only continue the loop and
log the "restarting after file change" when that reload indicator was triggered;
otherwise return to avoid relaunching on normal exit or startup failure. Use the
existing symbols runFunc, runCtx, reloadCh, runCancel, ctx, and path to
implement this.
In `@main.go`:
- Around line 74-77: Replace the hard process exit with a controlled return in
main: when evaluating the error from mgr.Start(ctx) (the call to mgr.Start and
the accompanying setupLog.Error invocation), remove the os.Exit(1) and instead
return from main so the program can follow the hot-reload/manager lifecycle and
potentially recover from transient kubeconfig/connectivity issues; keep the
existing setupLog.Error(err, "problem running manager") logging intact.
- Around line 60-63: The code currently calls os.Exit(1) after logging the
manager start error which kills the process; change that to return so the
enclosing runFunc can exit gracefully and the hot-reload watcher can restart on
next change. Specifically, in the block that logs via setupLog.Error(err,
"unable to start manager") (inside runFunc where the manager Start/StartManager
is invoked), remove the os.Exit(1) and return from the function (propagate the
error or a nil as appropriate to the runFunc signature) instead of terminating
the process.
- Line 78: The log message calling setupLog.Info("received SIGTERM or SIGINT.
See you later.") is misleading because it runs on every file-change-triggered
context cancel inside the RerunOnFileUpdate callback; remove or relocate this
log so it only runs on real process termination: delete the message from inside
the RerunOnFileUpdate callback and instead emit a shutdown log from the actual
signal-handling/termination path (the code that handles SIGTERM/SIGINT or where
the main context is canceled by OS signals), ensuring you keep setupLog.Info
only in the signal-handling function and not in the file-reload callback.
- Around line 68-71: The exit on setup failure prevents hot-reload recovery;
replace the os.Exit(1) call in the reconciler.SetupWithManager error branch with
a plain return so the process can exit gracefully and let the hot-reload loop
restart it; update the block around reconciler.SetupWithManager(mgr) and
setupLog.Error(err, ...) to return after logging (propagating error if this code
is in a function that returns error).
- Line 51: The code reads GetLoadingPrecedence()[0] into kubeconfigPath without
a bounds check and only watches that single path; change the logic to first call
clientcmd.NewDefaultClientConfigLoadingRules().GetLoadingPrecedence(), verify
its length before indexing, and if empty return a clear error; then pass the
full precedence slice (or at least all non-empty entries) to
filewatcher.RerunOnFileUpdate instead of only using kubeconfigPath so changes to
any precedence entry (including merged KUBECONFIG entries) trigger reloads;
update references around kubeconfigPath, GetLoadingPrecedence(),
filewatcher.RerunOnFileUpdate, and ctrlconfig.GetConfigWithContext accordingly.
---
Nitpick comments:
In `@filewatcher/watcher_test.go`:
- Around line 18-56: Add a new spec in watcher_test.go that simulates the
Kubernetes "..data" update path: create a temp directory, make a subdir
"..data/v1" with the watched file (e.g. "watched.yaml") inside, create a symlink
in the temp dir so the visible file path points to "..data/v1/watched.yaml",
then call filewatcher.RerunOnFileUpdate with that visible path and the existing
runFunc/startCount; to trigger an update create "..data/v2/watched.yaml" and
atomically replace the "..data" symlink to point to "v2" (remove+symlink), and
assert startCount increments just like the direct-write test.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b28aa5ae-2070-4fb9-ab83-787dfbda99bf
📒 Files selected for processing (5)
filewatcher/suite_test.gofilewatcher/watcher.gofilewatcher/watcher_test.gogo.modmain.go
| for { | ||
| runCtx, runCancel := context.WithCancel(ctx) | ||
| go func() { | ||
| select { | ||
| case <-runCtx.Done(): | ||
| case <-reloadCh: | ||
| log.Info("file changed, restarting") | ||
| runCancel() | ||
| } | ||
| }() | ||
|
|
||
| runFunc(runCtx) | ||
| runCancel() | ||
|
|
||
| if ctx.Err() != nil { | ||
| return | ||
| } | ||
| log.Info("restarting after file change", "path", path) | ||
| } |
There was a problem hiding this comment.
Only restart when a reload actually fired.
Lines 51-58 currently loop on any runFunc return, not just file-change cancellations. If runFunc exits because startup failed or it shut down normally, this code immediately relaunches it and logs it as a file-change restart, which can turn a real failure into a tight restart loop.
Suggested fix
for {
runCtx, runCancel := context.WithCancel(ctx)
+ restartedByReload := make(chan struct{}, 1)
go func() {
select {
case <-runCtx.Done():
case <-reloadCh:
log.Info("file changed, restarting")
+ restartedByReload <- struct{}{}
runCancel()
}
}()
runFunc(runCtx)
runCancel()
if ctx.Err() != nil {
return
}
+
+ select {
+ case <-restartedByReload:
+ default:
+ return
+ }
+
log.Info("restarting after file change", "path", path)
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| for { | |
| runCtx, runCancel := context.WithCancel(ctx) | |
| go func() { | |
| select { | |
| case <-runCtx.Done(): | |
| case <-reloadCh: | |
| log.Info("file changed, restarting") | |
| runCancel() | |
| } | |
| }() | |
| runFunc(runCtx) | |
| runCancel() | |
| if ctx.Err() != nil { | |
| return | |
| } | |
| log.Info("restarting after file change", "path", path) | |
| } | |
| for { | |
| runCtx, runCancel := context.WithCancel(ctx) | |
| restartedByReload := make(chan struct{}, 1) | |
| go func() { | |
| select { | |
| case <-runCtx.Done(): | |
| case <-reloadCh: | |
| log.Info("file changed, restarting") | |
| restartedByReload <- struct{}{} | |
| runCancel() | |
| } | |
| }() | |
| runFunc(runCtx) | |
| runCancel() | |
| if ctx.Err() != nil { | |
| return | |
| } | |
| select { | |
| case <-restartedByReload: | |
| default: | |
| return | |
| } | |
| log.Info("restarting after file change", "path", path) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@filewatcher/watcher.go` around lines 40 - 58, The loop currently restarts
unconditionally after runFunc returns; change it to only restart when the
reloadCh actually fired. Inside the loop, create a fresh one-shot channel or
flag (e.g., reloadFired channel) and have the goroutine that listens to reloadCh
close that channel (or set the flag) just before calling runCancel(); after
runFunc(runCtx) returns, check the reloadFired channel non-blockingly (or the
flag) and only continue the loop and log the "restarting after file change" when
that reload indicator was triggered; otherwise return to avoid relaunching on
normal exit or startup failure. Use the existing symbols runFunc, runCtx,
reloadCh, runCancel, ctx, and path to implement this.
| setupLog.Error(err, "unable to start manager") | ||
| os.Exit(1) | ||
| } | ||
| kubeconfigPath := clientcmd.NewDefaultClientConfigLoadingRules().GetLoadingPrecedence()[0] |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify that GetLoadingPrecedence returns the paths and check for typical values
rg -n "GetLoadingPrecedence|GetConfigWithContext" --type=go -C 3Repository: cobaltcore-dev/cloud-profile-sync
Length of output: 841
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Inspect main.go around the reported line and where getKubeconfigOrDie is defined/used
echo "=== main.go context around kubeconfigPath and watcher ==="
nl -ba main.go | sed -n '1,140p'
# 2) Find getKubeconfigOrDie definition
echo "=== getKubeconfigOrDie references/definition ==="
rg -n "func getKubeconfigOrDie|getKubeconfigOrDie\(" --type=go
# 3) Find filewatcher.RerunOnFileUpdate implementation
echo "=== filewatcher.RerunOnFileUpdate implementation ==="
rg -n "RerunOnFileUpdate" --type=go -SRepository: cobaltcore-dev/cloud-profile-sync
Length of output: 196
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== main.go context around kubeconfigPath and watcher ==="
cat -n main.go | sed -n '1,160p'
echo "=== getKubeconfigOrDie references/definition ==="
rg -n "func getKubeconfigOrDie|getKubeconfigOrDie\(" --type=go
echo "=== filewatcher.RerunOnFileUpdate implementation ==="
rg -n "RerunOnFileUpdate" --type=go -SRepository: cobaltcore-dev/cloud-profile-sync
Length of output: 4083
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== filewatcher/watcher.go ==="
cat -n filewatcher/watcher.go | sed -n '1,200p'
echo "=== Any other GetLoadingPrecedence usage ==="
rg -n "GetLoadingPrecedence\(\)" --type=go
echo "=== Any other GetConfigWithContext usage ==="
rg -n "GetConfigWithContext\(" --type=goRepository: cobaltcore-dev/cloud-profile-sync
Length of output: 3756
🌐 Web query:
client-go clientcmd NewDefaultClientConfigLoadingRules GetLoadingPrecedence implementation
💡 Result:
In the Kubernetes client-go library, GetLoadingPrecedence is a method defined on the ConfigAccess interface and implemented by the ClientConfigLoadingRules struct [1][2]. It determines the ordered slice of file paths that the client-go configuration loader will check when attempting to load and merge kubeconfig files [1][2]. The implementation follows this logic [2]: 1. Explicit Path Check: If the ClientConfigLoadingRules.ExplicitPath field is non-empty, the method returns a slice containing only that path [2]. This allows users or tools to force the loader to use a specific file [3][4]. 2. Default Precedence: If no explicit path is set, the method returns a copy of the Precedence slice defined within the ClientConfigLoadingRules object [2]. This slice typically contains a list of default locations, such as paths derived from the KUBECONFIG environment variable or standard home directory locations (e.g., ~/.kube/config), depending on how the rules were initialized [5][4]. By returning a copy of the precedence list, the implementation ensures that external callers cannot inadvertently modify the internal state of the loading rules by sorting or mutating the returned slice [2]. NewDefaultClientConfigLoadingRules is the standard factory function used to initialize these rules with sensible defaults, ensuring that standard Kubernetes configuration file discovery patterns are applied unless explicitly overridden [6][4].
Citations:
- 1: https://github.com/kubernetes/client-go/blob/master/tools/clientcmd/config.go
- 2: https://github.com/kubernetes/client-go/blob/master/tools/clientcmd/loader.go
- 3: https://github.com/kubernetes/client-go/blob/e478dd3a68a8ebf260beb7b9522bcc87cf4c971f/tools/clientcmd/loader.go
- 4: https://kubernetes.io/blog/2026/01/19/clientcmd-apiserver-access/
- 5: https://pkg.go.dev/k8s.io/Client-go/1.5/tools/clientcmd
- 6: https://github.com/kubernetes/client-go/blob/master/tools/clientcmd/client_config.go
🌐 Web query:
sigs.k8s.io controller-runtime pkg client config GetConfigWithContext implementation kubeconfig path selection
💡 Result:
In sigs.k8s.io/controller-runtime, the GetConfigWithContext function follows a specific precedence order to locate and load a Kubernetes configuration [1]. It first checks for an explicit file path provided via a flag (if configured), then checks the KUBECONFIG environment variable, then attempts an in-cluster configuration, and finally defaults to the user's home directory [2][1]. The implementation details and precedence are as follows: 1. Flag-provided path: If a kubeconfig flag has been set (typically via RegisterFlags), GetConfigWithContext uses that specific file path [2][1]. 2. KUBECONFIG environment variable: If no flag is set, it checks the KUBECONFIG environment variable [2][1]. This implementation supports multi-value paths as defined by standard Kubernetes conventions [3][4]. 3. In-cluster configuration: If neither of the above is provided, the function attempts to load an in-cluster configuration (typically found at /var/run/secrets/kubernetes.io/serviceaccount/token) [2][1]. 4. Default location: If in-cluster config is unavailable, it defaults to the standard location, which is $HOME/.kube/config [2][1]. The implementation specifically handles cases where the $HOME environment variable might be unset by attempting to resolve the user's home directory via os/user.Current [2]. The underlying loading logic utilizes standard client-go clientcmd libraries to handle the rules for these loading paths, ensuring consistency with standard Kubernetes tooling [2][4].
Citations:
- 1: https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/client/config
- 2: https://github.com/kubernetes-sigs/controller-runtime/blob/master/pkg/client/config/config.go
- 3: ✨ Add support for multi-value KUBECONFIG kubernetes-sigs/controller-runtime#642
- 4: GetConfig does not respect multi-value KUBECONFIG kubernetes-sigs/controller-runtime#564
🌐 Web query:
kubernetes client-go NewDefaultClientConfigLoadingRules Precedence initialization empty slice
💡 Result:
In Kubernetes client-go, the NewDefaultClientConfigLoadingRules function initializes a ClientConfigLoadingRules object with a default Precedence slice [1]. This slice determines the order in which kubeconfig files are searched and merged [1]. When NewDefaultClientConfigLoadingRules is called, it does not initialize the Precedence slice as an empty slice for final use; rather, it populates it based on the following logic [1]: 1. Environment Variables: It checks the KUBECONFIG environment variable (RecommendedConfigPathEnvVar) [1]. 2. Population: If KUBECONFIG is set, it splits the value (using filepath.SplitList) and populates the Precedence slice with those paths [1]. 3. Default Fallback: If the KUBECONFIG environment variable is NOT set (or is empty), it appends the RecommendedHomeFile (typically ~/.kube/config) to the Precedence slice [1]. If you manually create a ClientConfigLoadingRules struct and leave the Precedence field as an empty slice (nil or []string{}), the loader will behave as follows during configuration loading [1]: - Merging Behavior: If no ExplicitPath is provided, the loading process uses the Precedence slice to merge configuration files [1]. If Precedence is empty, there are no files to merge, which typically results in an empty or default configuration object (depending on other factors like overrides) [1]. - ExplicitPath Precedence: If you provide an ExplicitPath, the loader ignores the Precedence slice entirely and attempts to load only that specific file [1]. In summary, NewDefaultClientConfigLoadingRules never leaves the slice empty in its returned object [1]. If you are seeing an empty slice, it is because it was initialized manually without populating the Precedence field [1].
Citations:
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "KUBECONFIG|KUBECONTEXT" --type=goRepository: cobaltcore-dev/cloud-profile-sync
Length of output: 129
Fix kubeconfig hot-reload path derivation (bounds check + watch the right precedence path(s))
main.go:51:GetLoadingPrecedence()[0]is indexed without a bounds check; default loading rules usually populate precedence, but adding a guard prevents a potential startup panic if it ever ends up empty.filewatcher.RerunOnFileUpdatewatches onlyprecedence[0], butctrlconfig.GetConfigWithContextcan load/merge from later precedence entries (e.g.,KUBECONFIGcontains multiple paths or the first path is missing), so kubeconfig changes may not trigger reloads.
🛡️ Suggested defensive bounds check
- kubeconfigPath := clientcmd.NewDefaultClientConfigLoadingRules().GetLoadingPrecedence()[0]
+ loadingRules := clientcmd.NewDefaultClientConfigLoadingRules()
+ precedence := loadingRules.GetLoadingPrecedence()
+ if len(precedence) == 0 {
+ setupLog.Error(nil, "no kubeconfig paths in precedence; hot-reload disabled")
+ os.Exit(1)
+ }
+ kubeconfigPath := precedence[0]
+ setupLog.Info("watching kubeconfig", "path", kubeconfigPath)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| kubeconfigPath := clientcmd.NewDefaultClientConfigLoadingRules().GetLoadingPrecedence()[0] | |
| loadingRules := clientcmd.NewDefaultClientConfigLoadingRules() | |
| precedence := loadingRules.GetLoadingPrecedence() | |
| if len(precedence) == 0 { | |
| setupLog.Error(nil, "no kubeconfig paths in precedence; hot-reload disabled") | |
| os.Exit(1) | |
| } | |
| kubeconfigPath := precedence[0] | |
| setupLog.Info("watching kubeconfig", "path", kubeconfigPath) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@main.go` at line 51, The code reads GetLoadingPrecedence()[0] into
kubeconfigPath without a bounds check and only watches that single path; change
the logic to first call
clientcmd.NewDefaultClientConfigLoadingRules().GetLoadingPrecedence(), verify
its length before indexing, and if empty return a clear error; then pass the
full precedence slice (or at least all non-empty entries) to
filewatcher.RerunOnFileUpdate instead of only using kubeconfigPath so changes to
any precedence entry (including merged KUBECONFIG entries) trigger reloads;
update references around kubeconfigPath, GetLoadingPrecedence(),
filewatcher.RerunOnFileUpdate, and ctrlconfig.GetConfigWithContext accordingly.
898587a to
835d4d6
Compare
835d4d6 to
b33c210
Compare
b33c210 to
a1e97c5
Compare
Add kubeconfig hotreload to make rotation easier
Summary by CodeRabbit
New Features
Tests
Chores