Skip to content

util/stmtsummary: add tidb_stmt_summary_persist_evicted#68513

Merged
ti-chi-bot[bot] merged 14 commits into
pingcap:masterfrom
nolouch:stmt_summary_log_evicted
Jun 3, 2026
Merged

util/stmtsummary: add tidb_stmt_summary_persist_evicted#68513
ti-chi-bot[bot] merged 14 commits into
pingcap:masterfrom
nolouch:stmt_summary_log_evicted

Conversation

@nolouch
Copy link
Copy Markdown
Member

@nolouch nolouch commented May 20, 2026

What problem does this PR solve?

Issue Number: ref #67924

Problem Summary:

The v2 (persistent) statement summary maintains an LRU per window; when an entry is evicted before the window is persisted, downstream consumers of the stmt log have no record that the digest ever existed. Today the only signal is the aggregate "other" bucket appended at rotation time, which loses per-record identity.

What changed and how does it work?

Introduce a new global system variable tidb_stmt_summary_persist_evicted (default OFF).

  • When ON, every LRU eviction in the v2 backend persists a JSON entry to the stmt summary log with an "evicted":true marker, embedding the full StmtRecord snapshot, so downstream consumers can track which records were dropped from memory between rotations.
  • Persistence is asynchronous: evictions are cloned and enqueued on a buffered channel, so the Add() hot path never blocks on log I/O.
  • The async logger batches evicted records before writing them, flushing when the batch reaches 64 records, after 100 ms, or during shutdown. This reduces logger calls under high-frequency eviction while preserving one JSON line per evicted record.
  • The channel is bounded (1024 records). When full, evictions are dropped and a periodic warn log reports the dropped count so the operator can size accordingly.

The window callback was extended to optionally fire a per-eviction hook (onEvictFn). When the flag is OFF, the hook short-circuits immediately and no record cloning is performed.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

New test:

  • pkg/util/stmtsummary/v2/stmtsummary_test.go::TestStmtSummaryPersistEvicted

Verified locally:

  • GOCACHE=/private/tmp/tidb-go-build go test -run 'TestStmtSummaryPersistEvicted|TestWindowEvictedCountResetOnRotate|TestDefaultConfig' -tags=intest,deadlock -count=1 ./pkg/util/stmtsummary/v2
  • GOCACHE=/private/tmp/tidb-go-build go test -tags=intest,deadlock -count=1 ./pkg/util/stmtsummary/v2
  • make bazel_prepare
  • GOCACHE=/private/tmp/tidb-go-build make -o tools/bin/revive lint

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

When ON, stmt summary log volume grows in proportion to the eviction rate. The feature is default OFF, uses bounded buffering, performs non-blocking enqueue on the hot path, batches writes in the background, and may drop evicted records when the queue is full.

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Add a new system variable `tidb_stmt_summary_persist_evicted` (default OFF). When enabled, the v2 (persistent) statement summary persists JSON entries for LRU-evicted records to the stmt summary log with an `"evicted":true` marker, allowing downstream consumers to track records that were dropped from memory.

Summary by CodeRabbit

Release Notes

  • New Features
    • Added a new system variable to optionally persist evicted statement summary records, providing improved visibility into statement summary operations. Default is disabled, maintaining backward compatibility.

Review Change Stack

Introduce a new global system variable tidb_stmt_summary_log_evicted:
when ON, the v2 (persistent) backend writes a JSON entry (with
"evicted":true) to the stmt log for each LRU eviction, allowing
downstream consumers to track which records were dropped from memory.

Uses a buffered async channel so the Add() hot path never blocks; when
the channel is full, evictions are dropped and a periodic warn log
reports the dropped count. Off by default because it adds log volume
proportional to eviction rate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ti-chi-bot ti-chi-bot Bot added do-not-merge/needs-linked-issue release-note Denotes a PR that will be considered when it comes time to generate release notes. labels May 20, 2026
@pantheon-ai
Copy link
Copy Markdown

pantheon-ai Bot commented May 20, 2026

@nolouch I've received your pull request and will start the review. I'll conduct a thorough review covering code quality, potential issues, and implementation details.

⏳ This process typically takes 10-30 minutes depending on the complexity of the changes.

ℹ️ Learn more details on Pantheon AI.

@ti-chi-bot ti-chi-bot Bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 20, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 20, 2026

📝 Walkthrough

Walkthrough

This PR adds optional per-record LRU eviction logging to TiDB's v2 persistent statement summary. When the tidb_stmt_summary_log_evicted system variable is enabled, evicted statement records are asynchronously logged to persistent storage via a buffered queue and background goroutine, with explicit "evicted" JSON markers.

Changes

Statement Summary Eviction Logging

Layer / File(s) Summary
System variable configuration and wiring
pkg/sessionctx/vardef/tidb_vars.go, pkg/sessionctx/variable/sysvar.go
Introduces tidb_stmt_summary_log_evicted global boolean system variable with default value false; wires SetGlobal handler to invoke stmtsummaryv2.SetLogEvicted().
Evicted record logging infrastructure
pkg/util/stmtsummary/v2/logger.go
Adds logEvicted() method and evictedStmtRecord struct to serialize and persist individual evicted records with explicit "evicted" JSON marker.
Core eviction logging state and callbacks
pkg/util/stmtsummary/v2/stmtsummary.go
Extends StmtSummary with eviction buffering state (optLogEvicted, evictedCh, evictedDropped); implements onEvict callback to enqueue cloned records non-blockingly, and evictedLogLoop background goroutine to drain queue, invoke stmtStorage.logEvicted(), and perform graceful shutdown flushing.
Window construction and LRU eviction hook integration
pkg/util/stmtsummary/v2/stmtsummary.go
Extends stmtStorage interface with logEvicted(*StmtRecord) method; updates newStmtWindow to accept optional onEvictFn callback and modifies LRU eviction hook to invoke callback with locked records.
Constructor and lifecycle initialization
pkg/util/stmtsummary/v2/stmtsummary.go
Wires eviction logging into persistent and test constructors with callback initialization; updates window rotation/flush paths to recreate windows with eviction callback preserved across window cycles; adds package-level SetLogEvicted() proxy for system integration.
Mock storage implementation and test suite
pkg/util/stmtsummary/v2/BUILD.bazel, pkg/util/stmtsummary/v2/stmtsummary_test.go
Extends mockStmtStorage to track evicted records and implement logEvicted() with safe map cloning; increases test shard count to 14; adds TestStmtSummaryLogEvicted to validate async eviction logging with enable/disable toggling and digest verification.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • pingcap/tidb#66700: Adds eviction counting and metrics to the same v2 statement summary LRU eviction path; complementary to this PR's per-record logging feature.

Suggested labels

size/M, ok-to-test, lgtm, stmt-summary

Suggested reviewers

  • guo-shaoge
  • xzhangxian1008

Poem

🐰 When statements grow too full, the LRU must evict,
Now whispers of their passing to the logger are script!
Async queues and callbacks dance without a hitch,
Each evicted record logged—no statement left in a ditch! ✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 55.56% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title mentions 'tidb_stmt_summary_persist_evicted' but the actual variable name in the PR description is 'tidb_stmt_summary_log_evicted', creating ambiguity about which variable was implemented. Clarify the PR title to match the actual variable name: either rename to 'util/stmtsummary: add tidb_stmt_summary_log_evicted' or confirm if the variable name was changed.
✅ Passed checks (3 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The PR description comprehensively covers the problem statement, implementation details, testing approach, side effects, documentation impact, and release notes as required by the template.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
pkg/util/stmtsummary/v2/stmtsummary_test.go (1)

99-105: ⚡ Quick win

Replace time.Sleep with require.Never for deterministic negative assertion.

Using time.Sleep to verify that no further evictions are logged is not fully deterministic. Consider replacing it with require.Never, which polls periodically and provides a clearer intent that the condition should never become true.

♻️ Proposed refactor using require.Never
 	// Disable and verify no further log writes.
 	require.NoError(t, ss.SetLogEvicted(false))
 	ss.Add(GenerateStmtExecInfo4Test("digest5")) // evicts digest3
-	time.Sleep(50 * time.Millisecond)
-	storage.Lock()
-	require.Equal(t, 2, len(storage.evicted))
-	storage.Unlock()
+	require.Never(t, func() bool {
+		storage.Lock()
+		defer storage.Unlock()
+		return len(storage.evicted) != 2
+	}, 100*time.Millisecond, 10*time.Millisecond, "evicted count should remain 2 after disabling")
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/util/stmtsummary/v2/stmtsummary_test.go` around lines 99 - 105, Replace
the non-deterministic time.Sleep assertion with a polling-based negative
assertion using require.Never: after calling ss.SetLogEvicted(false) and
ss.Add(...), assert that storage.evicted never changes from the expected length
by polling the condition (use a closure that locks storage, checks
len(storage.evicted) == 2, and unlocks) for a short timeout and interval; target
the existing symbols ss.SetLogEvicted, ss.Add, and storage.evicted and use
require.Never(t, ...) from testify to reliably assert the condition instead of
sleeping.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/util/stmtsummary/v2/stmtsummary.go`:
- Around line 425-460: evictedLogLoop currently returns on s.ctx.Done() and
drains evictedCh immediately, which can lose records if Add() races during
shutdown; modify evictedLogLoop so that on ctx.Done() it first waits for the
producer shutdown gate (call the same wait used by Close(), e.g. closeWg.Wait()
or wait until s.closed is set) before doing the final drain and return, then
drain evictedCh completely calling s.storage.logEvicted(r), call report(), and
return; reference the symbols evictedLogLoop, s.ctx.Done(), evictedCh, Add(),
Close(), s.closed, closeWg.Wait(), and s.storage.logEvicted when making the
change.

---

Nitpick comments:
In `@pkg/util/stmtsummary/v2/stmtsummary_test.go`:
- Around line 99-105: Replace the non-deterministic time.Sleep assertion with a
polling-based negative assertion using require.Never: after calling
ss.SetLogEvicted(false) and ss.Add(...), assert that storage.evicted never
changes from the expected length by polling the condition (use a closure that
locks storage, checks len(storage.evicted) == 2, and unlocks) for a short
timeout and interval; target the existing symbols ss.SetLogEvicted, ss.Add, and
storage.evicted and use require.Never(t, ...) from testify to reliably assert
the condition instead of sleeping.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 95ec43e1-52e2-4ce6-bea3-fd35936e0f07

📥 Commits

Reviewing files that changed from the base of the PR and between 548ad7d and 43e173e.

📒 Files selected for processing (6)
  • pkg/sessionctx/vardef/tidb_vars.go
  • pkg/sessionctx/variable/sysvar.go
  • pkg/util/stmtsummary/v2/BUILD.bazel
  • pkg/util/stmtsummary/v2/logger.go
  • pkg/util/stmtsummary/v2/stmtsummary.go
  • pkg/util/stmtsummary/v2/stmtsummary_test.go

Comment thread pkg/util/stmtsummary/v2/stmtsummary.go
…icted

# Conflicts:
#	pkg/util/stmtsummary/v2/BUILD.bazel
#	pkg/util/stmtsummary/v2/stmtsummary_test.go
@codecov
Copy link
Copy Markdown

codecov Bot commented May 22, 2026

Codecov Report

❌ Patch coverage is 2.23214% with 219 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.1507%. Comparing base (2c307d6) to head (f4d4e21).

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #68513        +/-   ##
================================================
- Coverage   76.3149%   75.1507%   -1.1642%     
================================================
  Files          2041       2026        -15     
  Lines        562966     568648      +5682     
================================================
- Hits         429627     427343      -2284     
- Misses       132425     141252      +8827     
+ Partials        914         53       -861     
Flag Coverage Δ
integration 41.2516% <2.2321%> (+1.5128%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 60.4610% <ø> (ø)
parser ∅ <ø> (∅)
br 49.5727% <ø> (-13.2241%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ti-chi-bot ti-chi-bot Bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 22, 2026
@nolouch nolouch changed the title util/stmtsummary: add tidb_stmt_summary_log_evicted util/stmtsummary: add tidb_stmt_summary_persist_evicted May 22, 2026
@yibin87
Copy link
Copy Markdown
Contributor

yibin87 commented May 25, 2026

minor: cloneRecordForLog currently relies on the invariant that r is never nil (from the current construction path). Could we add a defensive nil guard (e.g. early return) to avoid panic if future refactors accidentally break that invariant?


// evictedCh carries per-record evictions to the async logger.
// A nil channel means evicted-logging is disabled. Sends are non-blocking.
evictedCh chan *StmtRecord
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually,even if evicted-logging is disabled, the channel seems non-nil, the comment is a little miss-leading.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in the current version: the comment now says eviction persistence is controlled by optPersistEvicted, and evictedCh only carries records to the async logger with non-blocking sends.

Copy link
Copy Markdown
Contributor

@yibin87 yibin87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Others LGTM

@ti-chi-bot ti-chi-bot Bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label May 25, 2026
Comment thread pkg/util/stmtsummary/v2/stmtsummary.go Outdated
@nolouch
Copy link
Copy Markdown
Member Author

nolouch commented May 29, 2026

/retest

Comment thread pkg/util/stmtsummary/v2/logger.go
@ti-chi-bot ti-chi-bot Bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jun 2, 2026
@ti-chi-bot ti-chi-bot Bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Jun 3, 2026
@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented Jun 3, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-05-25 05:49:31.834134603 +0000 UTC m=+244841.804299934: ☑️ agreed by yibin87.
  • 2026-06-03 15:34:42.548184482 +0000 UTC m=+369383.618501873: ☑️ agreed by zimulala.

Copy link
Copy Markdown

@yudongusa yudongusa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please open document PR on this

@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented Jun 3, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: yibin87, yudongusa, zimulala

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@nolouch
Copy link
Copy Markdown
Member Author

nolouch commented Jun 3, 2026

…icted

# Conflicts:
#	pkg/sessionctx/vardef/tidb_vars.go
#	pkg/sessionctx/variable/sysvar.go
#	pkg/util/stmtsummary/v2/BUILD.bazel
#	pkg/util/stmtsummary/v2/stmtsummary.go
#	pkg/util/stmtsummary/v2/stmtsummary_test.go
@ti-chi-bot ti-chi-bot Bot merged commit 6c43104 into pingcap:master Jun 3, 2026
36 checks passed
@nolouch nolouch deleted the stmt_summary_log_evicted branch June 3, 2026 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved lgtm release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants