feat: load-adaptive per-repo rate limiting for shared public bots #1423
Open
thomasrockhu-codecov wants to merge 2 commits into
Stop any single public repo from monopolizing the shared dedicated-app
tokens (commit_dedicated_app / pull_dedicated_app) while letting repos burst
freely when the pool is idle.
Each shared bot's live GitHub utilization (U = 1 - remaining/limit, read from
X-RateLimit-* headers) drives a per-repo cap that slides from a generous burst
share when the pool is quiet down to a small guaranteed share when the pool is
saturated. A repo whose rolling usage in the trailing window exceeds its cap is
over-limit; over-limit requests are silently dropped (synthetic 204 -> None, no
exception) and emitted to Prometheus.
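The sliding cap described above can be sketched as follows. This is a minimal illustration, not the code in shared/rate_limits/public_bot.py: the function names, signatures, and all numeric thresholds are assumptions chosen for the example, and the sketch returns the share as a fraction rather than a request count.

```python
def pool_utilization(remaining: int, limit: int) -> float:
    """U = 1 - remaining/limit, from X-RateLimit-Remaining / X-RateLimit-Limit."""
    if limit <= 0:
        # Assumption: with no usable limit header, treat the pool as idle.
        return 0.0
    return 1.0 - remaining / limit


def sliding_repo_share_sketch(
    utilization: float,
    util_low: float,
    util_high: float,
    max_share: float,
    guaranteed_share: float,
) -> float:
    """Per-repo share of the pool for a given pool utilization.

    Quiet pool (U <= util_low) gets max_share, saturated pool
    (U >= util_high) gets guaranteed_share, and the share is
    linearly interpolated in between.
    """
    if utilization <= util_low:
        return max_share
    if utilization >= util_high:
        return guaranteed_share
    t = (utilization - util_low) / (util_high - util_low)
    return max_share + t * (guaranteed_share - max_share)


# Hypothetical thresholds, for illustration only.
share = sliding_repo_share_sketch(
    pool_utilization(remaining=1250, limit=5000),
    util_low=0.5,
    util_high=0.9,
    max_share=0.5,
    guaranteed_share=0.05,
)
```

How the share is converted into a request budget for the trailing window (for example, by multiplying it by the bot's rate limit) is not stated here, so the sketch stops at the fraction.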
- New shared/rate_limits/public_bot.py: config loader, sliding_repo_cap(),
minute-bucket per-repo usage counters, pool-utilization read/write, and the
fail-open evaluate_public_bot_request() decision helper.
- Enforce in torngit github make_http_call for repo calls whose token
entity_name is a configured bot; record pool utilization after each response.
- Prometheus instruments: git_provider_public_bot_over_cap{bot,repo_slug,mode},
git_provider_public_bot_requests{bot}, git_provider_public_bot_utilization{bot}.
- Ships inert (enabled=false); observe-first (enforce=false) before dropping.
Co-authored-by: Cursor <cursoragent@cursor.com>
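For readers unfamiliar with minute-bucket counters, the rolling per-repo usage could look roughly like the sketch below. It uses an in-memory dict as a stand-in for Redis, so key layout, expiry, and the class and method names are all assumptions rather than what the PR implements.

```python
class MinuteBucketUsageSketch:
    """Rolling per-(bot, repo) request count built from one counter per minute."""

    def __init__(self, window_minutes: int = 60) -> None:
        self.window_minutes = window_minutes
        # (bot, repo, minute index) -> request count; Redis keys in a real setup.
        self._buckets: dict[tuple[str, str, int], int] = {}

    def record(self, bot: str, repo: str, now: float) -> None:
        """Count one request in the bucket for the minute containing `now`."""
        key = (bot, repo, int(now // 60))
        self._buckets[key] = self._buckets.get(key, 0) + 1

    def usage(self, bot: str, repo: str, now: float) -> int:
        """Sum the buckets covering the trailing window, current minute included."""
        current = int(now // 60)
        first = current - self.window_minutes + 1
        return sum(
            self._buckets.get((bot, repo, minute), 0)
            for minute in range(first, current + 1)
        )


counters = MinuteBucketUsageSketch(window_minutes=2)
for ts in (0.0, 30.0, 61.0):
    counters.record("commit_dedicated_app", "org/repo", now=ts)
```

Old buckets are never deleted in this sketch; a Redis-backed version would typically let them expire once they fall outside the window.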
Codecov Report ❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1423      +/-   ##
==========================================
+ Coverage   91.89%   91.90%   +0.01%
==========================================
  Files        1325     1326       +1
  Lines       50868    51011     +143
  Branches     1626     1637      +11
==========================================
+ Hits        46744    46884     +140
  Misses       3818     3818
- Partials      306      309       +3
…code
- Break the monolithic make_http_call into small, well-named helpers (header building, host header, public-bot cap, response logging, token refresh, rate-limit fallback, terminal-status raising) so the retry loop reads top-to-bottom.
- Drop unnecessary input/Redis protection in shared/rate_limits/public_bot.py: remove bytes-decode helpers (int/float already accept bytes), per-function fail-open try/except, redundant clamping, config type coercion, and an unreachable degenerate-config branch.
- Centralize the one needed safety (a Redis outage must not block provider requests) into single fail-open guards at the call sites in github.py.
Co-authored-by: Cursor <cursoragent@cursor.com>
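A single fail-open guard at a call site, as this commit describes, might look like the sketch below. The helper name, the logger, and the callable-based shape are assumptions for illustration; the guards in github.py may be plain try/except blocks.

```python
import logging
from typing import Callable

log = logging.getLogger(__name__)


def is_limited_fail_open(check: Callable[[], bool]) -> bool:
    """Run a rate-limit check; on any error, report 'not limited'.

    A Redis outage inside `check` must not block the provider request,
    so every exception is logged and swallowed here.
    """
    try:
        return check()
    except Exception:
        log.warning("public-bot cap check failed; allowing request", exc_info=True)
        return False


def broken_check() -> bool:
    # Stand-in for a cap evaluation whose Redis call fails.
    raise ConnectionError("redis unavailable")
```

Keeping the guard at the call site means the helpers underneath can raise freely, which is what allows the per-function try/except blocks to be removed.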
Summary
Stops any single public repo from monopolizing the shared dedicated-app tokens (commit_dedicated_app / pull_dedicated_app) while still letting repos burst freely when the pool is idle. Enforcement is per-request, keyed per (shared bot, repo).
- Each shared bot's live utilization U = 1 - remaining/limit (read from X-RateLimit-* headers we already log) drives a per-repo cap. Quiet pool (U <= util_low) → generous max_share; saturated pool (U >= util_high) → small guaranteed_share; linear interpolation between. A repo whose rolling usage in the trailing window exceeds its cap is over-limit.
- When enforcing, make_http_call returns a synthetic 204 for an over-limit request (parsed to None, pagination stops); no TorngitRateLimitError is raised. Callers get an empty result as if GitHub returned nothing.
- Ships inert (enabled=false); with enabled=true, enforce=false it emits metrics on every over-cap request but still sends it, so thresholds can be validated against live traffic before any drops.
- Applies only to repo calls made with a shared bot token (matched by checking token_to_use["entity_name"] against the configured bots list).
What's added
- shared/rate_limits/public_bot.py: config loader, sliding_repo_cap(), minute-bucket per-repo usage counters, pool-utilization read/write, and a fail-open evaluate_public_bot_request() decision helper (any Redis error → not limited).
- Enforcement in torngit/github.py make_http_call.
- Prometheus instruments: git_provider_public_bot_over_cap{bot,repo_slug,mode}, git_provider_public_bot_requests{bot}, git_provider_public_bot_utilization{bot}.
Configuration
Test plan
- make test.path TEST_PATH="tests/unit/rate_limits/test_public_bot.py tests/unit/torngit/test_github_public_bot.py" from libs/shared
- Covers sliding_repo_cap boundaries/interpolation, config loader defaults+overrides, Redis read/write fail-open paths, evaluate_public_bot_request under/over cap, and make_http_call observe (sends, mode=observe) vs enforce (drops silently → None, not counted).
- Deploy with enabled: true, enforce: false; watch git_provider_public_bot_over_cap and tune thresholds; then flip enforce: true.
Rollout
- Deploy with enabled: true, enforce: false (metrics only).
- Set enforce: true to begin dropping.
Made with Cursor
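The observe-first rollout above comes down to a three-way decision per request. The sketch below illustrates that decision only; the dataclass, the function name, and the string outcomes are hypothetical and are not the PR's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BotLimitFlags:
    """The two rollout switches named in the description; both default to off."""

    enabled: bool = False
    enforce: bool = False


def decide(flags: BotLimitFlags, usage: float, cap: float) -> str:
    """Return 'send', 'observe', or 'drop' for one request.

    'send'    : feature is off, or the repo is within its cap.
    'observe' : over cap, metrics emitted, request still sent.
    'drop'    : over cap and enforcing, request silently dropped.
    """
    if not flags.enabled or usage <= cap:
        return "send"
    return "drop" if flags.enforce else "observe"


# The rollout stages, applied to the same over-cap repo (numbers are made up).
stages = [
    decide(BotLimitFlags(), usage=120, cap=100),
    decide(BotLimitFlags(enabled=True), usage=120, cap=100),
    decide(BotLimitFlags(enabled=True, enforce=True), usage=120, cap=100),
]
```

Because a repo is over-limit only when its usage exceeds the cap, a request with usage equal to the cap is still sent in this sketch.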