etcd(ticdc): update etcd cluster member correctly (#12646)#12650
Conversation
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Code Review
This pull request updates the etcd client configuration by removing the AutoSyncInterval, making the isHealthy function private, and refining the healthyChecker.update logic to properly close and remove stale client connections. Feedback was provided regarding the update function, specifically pointing out that an empty endpoint list could lead to unnecessary connection churn and suggesting a guard clause to prevent clearing all clients during transient failures.
| } | ||
|
|
||
| func (checker *healthyChecker) update(eps []string) { | ||
| updateEps := make(map[string]struct{}, len(eps)) |
There was a problem hiding this comment.
If syncUrls fails to retrieve the etcd member list (e.g., due to a transient network error or leader election), it returns an empty slice. In the current implementation, this causes the update function to clear all existing clients from the healthyChecker and close their connections. When syncUrls succeeds in a subsequent iteration, all clients will be recreated, leading to unnecessary connection churn and a temporary loss of health monitoring for existing members. Adding a guard to skip the update when eps is empty ensures continuity during transient etcd control plane issues.
if len(eps) == 0 {
return
}
updateEps := make(map[string]struct{}, len(eps))
Codecov Report❌ Patch coverage is Additional details and impacted files
Flags with carried forward coverage won't be shown. Click here to find out more. @@ Coverage Diff @@
## release-8.5 #12650 +/- ##
================================================
Coverage ? 53.4621%
================================================
Files ? 1026
Lines ? 137935
Branches ? 0
================================================
Hits ? 73743
Misses ? 58681
Partials ? 5511 🚀 New features to boost your workflow:
|
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: wk989898 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest |
1 similar comment
|
/retest |
This is an automated cherry-pick of #12646
What problem does this PR solve?
Issue Number: close #12368
What is changed and how it works?
When updating etcd client URLs, remove the client that is not a member of the etcd cluster.
Check List
Tests
Print only once
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note