Commit 3aa8923
fix(breaker): prevent latent busy-spin in closed-state drain loop (#4481)
* fix(breaker): prevent latent busy-spin in closed-state drain loop
A latent busy-spin exists in the closed-state backoff drain loop:
`_ = self.rsps.recv() => continue` discards the recv() result, so when
the response channel closes, recv() returns None immediately on every
iteration and the loop spins, saturating a CPU core.
Currently this is unreachable because gate::Rx and mpsc::Sender live in
the same Gate<BroadcastClassification<...>> service struct and drop
together in a single synchronous destructor chain, so gate.lost() always
fires when the channel is closing. However, gate::Rx is a
watch::Receiver, which implements Clone. If a future change cloned it for
a legitimate purpose (monitoring gate state, exposing it to an admin
endpoint, observability, etc.), the clone would keep gate.lost() from
resolving after the endpoint service is dropped, while the mpsc channel
would close normally once in-flight requests drain. The breaker task
would then busy-spin instead of terminating. [NB: a clone extends the
lifetime of the receiver, and gate.lost() relies internally on
Sender::closed(), which resolves only once all receivers are gone.]
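The receiver-lifetime point in the NB can be illustrated with a toy model.
This is only a sketch using std reference counting, not tokio's watch
implementation; `Rx`, `Tx`, `channel` and `is_closed` here are hypothetical
names chosen for the illustration.

```rust
use std::sync::{Arc, Weak};

/// Toy receiver handle (hypothetical): cloning it bumps a shared count,
/// loosely modelling how a cloned watch::Receiver keeps the channel open.
#[derive(Clone)]
struct Rx(Arc<()>);

/// Toy sender (hypothetical): it only observes how many receiver handles
/// are still alive.
struct Tx(Weak<()>);

impl Tx {
    /// True once every receiver handle, clones included, has been dropped.
    fn is_closed(&self) -> bool {
        self.0.strong_count() == 0
    }
}

/// Builds a connected toy sender/receiver pair.
fn channel() -> (Tx, Rx) {
    let shared = Arc::new(());
    (Tx(Arc::downgrade(&shared)), Rx(shared))
}
```

In this model, dropping the original `Rx` while a clone survives leaves
`is_closed()` false, which is the situation in which gate.lost() would
never resolve.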
Check the recv() result and propagate None as Err(()) to terminate the
task cleanly, mirroring the pattern already used in open() and
probation(). This makes the breaker self-contained, terminating on its
own channel state rather than depending on the fragility of the
ownership structure, where currently gate::Rx and mpsc::Sender happen to
share a destructor chain.
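A minimal sketch of the two loop shapes, assuming std::sync::mpsc as a
stand-in for the tokio mpsc channel (so recv() yields Err rather than None
on close). The function names and the `budget` parameter, which stands in
for the backoff timer, are hypothetical and not taken from the patch.

```rust
use std::sync::mpsc::Receiver;

/// Buggy shape: the recv() result is discarded, so a closed channel looks
/// the same as a received message. `budget` caps the loop so this sketch
/// terminates; it plays the role of the backoff expiring.
fn drain_discarding(rx: &Receiver<u32>, budget: usize) -> usize {
    let mut wakeups = 0;
    while wakeups < budget {
        // Once all senders are gone this returns Err immediately,
        // so every remaining iteration is a wasted wakeup.
        let _ = rx.recv();
        wakeups += 1;
    }
    wakeups
}

/// Fixed shape: a closed channel is propagated as Err(()) so the caller
/// can stop. Ok(budget) means the budget ran out with the channel open.
fn drain_checked(rx: &Receiver<u32>, budget: usize) -> Result<usize, ()> {
    for _ in 0..budget {
        match rx.recv() {
            Ok(_rsp) => continue,     // response ignored while backing off
            Err(_) => return Err(()), // channel closed: terminate
        }
    }
    Ok(budget)
}
```

With a closed channel, `drain_discarding` burns its whole budget, while
`drain_checked` returns Err(()) as soon as the buffered messages are
consumed.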
Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
* test(breaker): regression test for channel close with surviving gate::Rx
Verify that the breaker task terminates when the response channel closes
while a clone of gate::Rx outlives the endpoint service.
We build a minimal Gate, clone gate::Rx, trip the breaker into the closed
state with a status 500 response, then let the service drop. The
surviving gate::Rx clone should prevent gate.lost() from firing, so the
breaker must terminate via recv() returning None in the closed-state
drain loop.
Without the channel close check in closed() this test would hang
(spinning on recv() returning None indefinitely).
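The shape of such a test can be sketched with std threads and channels.
This is not the test added by the commit: `worker`, the Arc used as a
stand-in for gate::Rx, and `terminates_with_surviving_gate` are all
hypothetical, and the timeout replaces what would otherwise be a hang.

```rust
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::Duration;

/// Hypothetical worker standing in for the breaker task: it drains
/// responses and stops with Err(()) once the channel closes.
fn worker(rsps: mpsc::Receiver<u16>) -> Result<(), ()> {
    loop {
        match rsps.recv() {
            Ok(_status) => continue,
            Err(_) => return Err(()),
        }
    }
}

/// Returns true if the worker exits within `timeout` after the response
/// sender is dropped while a clone of the gate stand-in is still alive.
fn terminates_with_surviving_gate(timeout: Duration) -> bool {
    let gate = Arc::new(()); // stand-in for gate::Rx
    let surviving_clone = Arc::clone(&gate);
    let (rsps_tx, rsps_rx) = mpsc::channel::<u16>();
    let (done_tx, done_rx) = mpsc::channel::<Result<(), ()>>();

    thread::spawn(move || {
        // Report the worker's result; ignore a send error if the
        // test side already gave up waiting.
        let _ = done_tx.send(worker(rsps_rx));
    });

    rsps_tx.send(500).unwrap(); // a failing response
    drop(rsps_tx); // the "service" drops: response channel closes
    drop(gate); // the original gate handle goes with it

    // Bounded wait: a spinning or blocked worker shows up as a timeout.
    let finished = matches!(done_rx.recv_timeout(timeout), Ok(Err(())));
    drop(surviving_clone); // the clone outlived the service until here
    finished
}
```

Bounding the wait with recv_timeout keeps a regression from hanging the
test run; a worker that never exits makes the function return false.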
Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
---------
Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
1 parent 6a89cd7 commit 3aa8923
1 file changed
Lines changed: 120 additions & 3 deletions
[Diff: 3 hunks; line content not preserved]
@@ -76,8 +76,11 @@ (2 lines removed, 5 added)
@@ -105,7 +108,12 @@ (1 line removed, 6 added)
@@ -208,4 +216,113 @@ (109 lines added)