From cf297024ae7f5ac49d386e25bf52a2f7605dce71 Mon Sep 17 00:00:00 2001
From: nolouch
Date: Wed, 3 Jun 2026 13:17:23 -0700
Subject: [PATCH 1/2] docs: add statement summary system variables

---
 statement-summary-tables.md  |  4 ++++
 system-variable-reference.md | 16 ++++++++++++++++
 system-variables.md          | 22 ++++++++++++++++++++++
 3 files changed, 42 insertions(+)

diff --git a/statement-summary-tables.md b/statement-summary-tables.md
index db190136c129c..ab203ac9f8e12 100644
--- a/statement-summary-tables.md
+++ b/statement-summary-tables.md
@@ -88,6 +88,7 @@ The following is a sample output of querying `statements_summary`:
 >
 > - In TiDB, the time unit of fields in statement summary tables is nanosecond (ns), whereas in MySQL the time unit is picosecond (ps).
 > - Starting from v7.5.1 and v7.6.0, for clusters with [resource control](/tidb-resource-control-ru-groups.md) enabled, `statements_summary` will be aggregated by resource group, for example, the same statements executed in different resource groups will be collected as different records.
+> - Starting from v8.5.7, you can use [`tidb_stmt_summary_group_by_user`](/system-variables.md#tidb_stmt_summary_group_by_user-new-in-v857) to control whether to further aggregate statement summaries by execution user. When the variable value is `ON`, the same SQL digest executed by different users is collected as different records, and the `SAMPLE_USER` field of each record indicates the execution user corresponding to the record.
 
 ## `statements_summary_history`
 
@@ -145,6 +146,8 @@ The following system variables are used to control the statement summary:
 
 - `tidb_stmt_summary_max_sql_length`: Specifies the longest display length of `DIGEST_TEXT` and `QUERY_SAMPLE_TEXT`. The default value is `4096`.
 - `tidb_stmt_summary_internal_query`: Determines whether to count the TiDB SQL statements. `1` means to count, and `0` means not to count. The default value is `0`.
+- `tidb_stmt_summary_group_by_user`: Determines whether to further aggregate statement summaries by execution user. `1` means to aggregate by user, and `0` means not to aggregate by user. The default value is `0`. After this variable is enabled, the same SQL digest executed by different users is aggregated into different rows, which might increase the number of statement summary records and memory usage. Modifying this variable clears the current in-memory statement summary data.
+- `tidb_stmt_summary_persist_evicted`: Determines whether to write statement summary records that are evicted by LRU to the statement summary log after [statements summary persistence](#persist-statements-summary) is enabled. `1` means to write, and `0` means not to write. The default value is `0`. After this variable is enabled, the log contains JSON records marked with `"evicted": true`, and the amount of logs increases as LRU evictions become more frequent.
 
 An example of the statement summary configuration is shown as follows:
 
@@ -262,6 +265,7 @@ After statements summary persistence is enabled, the memory keeps only the curre
 >
 > - When statements summary persistence is enabled, the `tidb_stmt_summary_history_size` configuration described in the [Parameter configuration](#parameter-configuration) section will no longer take effect because the memory does not keep the history data. Instead, the following three configurations will be used to control the retention period and size of history data for persistence: [`tidb_stmt_summary_file_max_days`](/tidb-configuration-file.md#tidb_stmt_summary_file_max_days-new-in-v660), [`tidb_stmt_summary_file_max_size`](/tidb-configuration-file.md#tidb_stmt_summary_file_max_size-new-in-v660), and [`tidb_stmt_summary_file_max_backups`](/tidb-configuration-file.md#tidb_stmt_summary_file_max_backups-new-in-v660).
 > - The smaller the value of `tidb_stmt_summary_refresh_interval`, the more immediate data is written to the disk. However, this also means more redundant data is written to the disk.
+> - Starting from v8.5.7, you can enable [`tidb_stmt_summary_persist_evicted`](/system-variables.md#tidb_stmt_summary_persist_evicted-new-in-v857) to write records evicted by LRU to the statement summary log. The written JSON records are marked with `"evicted": true` for downstream log consumers to identify and are not returned as query results of `statements_summary_history` or `cluster_statements_summary_history`.
diff --git a/system-variable-reference.md b/system-variable-reference.md
index 76a846ad2b039..9dd862baee3c4 100644
--- a/system-variable-reference.md
+++ b/system-variable-reference.md
@@ -3949,6 +3949,14 @@ Referenced in:
 - [TiDB Configuration File](/tidb-configuration-file.md)
 - [TiDB 6.6.0 Release Notes](/releases/release-6.6.0.md)
 
+### tidb_stmt_summary_group_by_user
+
+Referenced in:
+
+- [SHOW [GLOBAL|SESSION] VARIABLES](/sql-statements/sql-statement-show-variables.md)
+- [Statement Summary Tables](/statement-summary-tables.md)
+- [System Variables](/system-variables.md#tidb_stmt_summary_group_by_user-new-in-v857)
+
 ### tidb_stmt_summary_history_size
 
 Referenced in:
@@ -3992,6 +4000,14 @@ Referenced in:
 - [TiDB 5.0.4 Release Notes](/releases/release-5.0.4.md)
 - [TiDB 4.0.14 Release Notes](/releases/release-4.0.14.md)
 
+### tidb_stmt_summary_persist_evicted
+
+Referenced in:
+
+- [SHOW [GLOBAL|SESSION] VARIABLES](/sql-statements/sql-statement-show-variables.md)
+- [Statement Summary Tables](/statement-summary-tables.md)
+- [System Variables](/system-variables.md#tidb_stmt_summary_persist_evicted-new-in-v857)
+
 ### tidb_stmt_summary_refresh_interval
 
 Referenced in:
diff --git a/system-variables.md b/system-variables.md
index c16e0dc2ee434..22e9e628866ac 100644
--- a/system-variables.md
+++ b/system-variables.md
@@ -6330,6 +6330,17 @@ For details, see [Identify Slow Queries](/identify-slow-queries.md).
+### tidb_stmt_summary_group_by_user New in v8.5.7
+
+- Scope: GLOBAL
+- Persists to cluster: Yes
+- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No
+- Type: Boolean
+- Default value: `OFF`
+- This variable controls whether to include the user that executes SQL statements as an aggregation dimension in [statement summary tables](/statement-summary-tables.md). When the variable value is `OFF`, the same SQL digest executed by different users is aggregated into the same row, and the `SAMPLE_USER` field displays one sampled user. When the variable value is `ON`, the same SQL digest executed by different users is aggregated into different rows, and the `SAMPLE_USER` field of each row indicates the execution user corresponding to the row.
+- Modifying this variable clears the current in-memory statement summary data because data before and after the modification uses different aggregation dimensions. Historical data that has been persisted to the disk is not affected.
+- After this variable is enabled, the number of statement summary records might increase with the number of different execution users for the same SQL digest, which increases memory usage.
+
 ### tidb_stmt_summary_history_size New in v4.0
 
 > **Note:**
@@ -6405,6 +6416,17 @@ For details, see [Identify Slow Queries](/identify-slow-queries.md).
+### tidb_stmt_summary_persist_evicted New in v8.5.7
+
+- Scope: GLOBAL
+- Persists to cluster: Yes
+- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No
+- Type: Boolean
+- Default value: `OFF`
+- This variable controls whether to write statement summary records that are evicted by LRU to the statement summary log after [statements summary persistence](/statement-summary-tables.md#persist-statements-summary) is enabled. The written JSON records are marked with `"evicted": true` for downstream log consumers to identify.
+- This variable takes effect only for the persistent implementation of statement summary. Records marked with `"evicted": true` are not returned as query results of `statements_summary_history` or `cluster_statements_summary_history`.
+- After this variable is enabled, the amount of statement summary logs increases as LRU evictions become more frequent. Evicted records are written by using an asynchronous buffer mechanism. When the buffer queue is full, new evicted records might be dropped.
+
 ### tidb_stmt_summary_refresh_interval New in v4.0
 
 > **Note:**

From b8ce27055ea3f9833f02fbd5f0cb1804781d6599 Mon Sep 17 00:00:00 2001
From: nolouch
Date: Wed, 3 Jun 2026 13:23:28 -0700
Subject: [PATCH 2/2] docs: address statement summary review comments

---
 statement-summary-tables.md |  8 ++++----
 system-variables.md         | 12 ++++++------
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/statement-summary-tables.md b/statement-summary-tables.md
index ab203ac9f8e12..b2d09628b0ee8 100644
--- a/statement-summary-tables.md
+++ b/statement-summary-tables.md
@@ -88,7 +88,7 @@ The following is a sample output of querying `statements_summary`:
 >
 > - In TiDB, the time unit of fields in statement summary tables is nanosecond (ns), whereas in MySQL the time unit is picosecond (ps).
 > - Starting from v7.5.1 and v7.6.0, for clusters with [resource control](/tidb-resource-control-ru-groups.md) enabled, `statements_summary` will be aggregated by resource group, for example, the same statements executed in different resource groups will be collected as different records.
-> - Starting from v8.5.7, you can use [`tidb_stmt_summary_group_by_user`](/system-variables.md#tidb_stmt_summary_group_by_user-new-in-v857) to control whether to further aggregate statement summaries by execution user. When the variable value is `ON`, the same SQL digest executed by different users is collected as different records, and the `SAMPLE_USER` field of each record indicates the execution user corresponding to the record.
+> - Starting from v8.5.7, you can use [`tidb_stmt_summary_group_by_user`](/system-variables.md#tidb_stmt_summary_group_by_user-new-in-v857) to control whether to aggregate statement summaries by execution user. When this variable is set to `ON`, TiDB aggregates the same SQL digest executed by different users into separate records, and the `SAMPLE_USER` field of each record indicates the user who executed the statement.
 
 ## `statements_summary_history`
 
@@ -146,8 +146,8 @@ The following system variables are used to control the statement summary:
 
 - `tidb_stmt_summary_max_sql_length`: Specifies the longest display length of `DIGEST_TEXT` and `QUERY_SAMPLE_TEXT`. The default value is `4096`.
 - `tidb_stmt_summary_internal_query`: Determines whether to count the TiDB SQL statements. `1` means to count, and `0` means not to count. The default value is `0`.
-- `tidb_stmt_summary_group_by_user`: Determines whether to further aggregate statement summaries by execution user. `1` means to aggregate by user, and `0` means not to aggregate by user. The default value is `0`. After this variable is enabled, the same SQL digest executed by different users is aggregated into different rows, which might increase the number of statement summary records and memory usage. Modifying this variable clears the current in-memory statement summary data.
-- `tidb_stmt_summary_persist_evicted`: Determines whether to write statement summary records that are evicted by LRU to the statement summary log after [statements summary persistence](#persist-statements-summary) is enabled. `1` means to write, and `0` means not to write. The default value is `0`. After this variable is enabled, the log contains JSON records marked with `"evicted": true`, and the amount of logs increases as LRU evictions become more frequent.
+- `tidb_stmt_summary_group_by_user`: Determines whether to aggregate statement summaries by execution user. `1` means to aggregate by user, and `0` means not to aggregate by user. The default value is `0`. After you enable this variable, TiDB aggregates the same SQL digest executed by different users into separate rows, which might increase the number of statement summary records and memory usage. Modifying this variable clears the current in-memory statement summary data.
+- `tidb_stmt_summary_persist_evicted`: Determines whether to write statement summary records evicted by LRU to the statement summary log after you enable [statements summary persistence](#persist-statements-summary). `1` means to write, and `0` means not to write. The default value is `0`. After you enable this variable, the log contains JSON records marked with `"evicted": true`, and the log volume increases as LRU evictions become more frequent.
 
 An example of the statement summary configuration is shown as follows:
 
@@ -265,7 +265,7 @@ After statements summary persistence is enabled, the memory keeps only the curre
 >
 > - When statements summary persistence is enabled, the `tidb_stmt_summary_history_size` configuration described in the [Parameter configuration](#parameter-configuration) section will no longer take effect because the memory does not keep the history data. Instead, the following three configurations will be used to control the retention period and size of history data for persistence: [`tidb_stmt_summary_file_max_days`](/tidb-configuration-file.md#tidb_stmt_summary_file_max_days-new-in-v660), [`tidb_stmt_summary_file_max_size`](/tidb-configuration-file.md#tidb_stmt_summary_file_max_size-new-in-v660), and [`tidb_stmt_summary_file_max_backups`](/tidb-configuration-file.md#tidb_stmt_summary_file_max_backups-new-in-v660).
 > - The smaller the value of `tidb_stmt_summary_refresh_interval`, the more immediate data is written to the disk. However, this also means more redundant data is written to the disk.
-> - Starting from v8.5.7, you can enable [`tidb_stmt_summary_persist_evicted`](/system-variables.md#tidb_stmt_summary_persist_evicted-new-in-v857) to write records evicted by LRU to the statement summary log. The written JSON records are marked with `"evicted": true` for downstream log consumers to identify and are not returned as query results of `statements_summary_history` or `cluster_statements_summary_history`.
+> - Starting from v8.5.7, you can enable [`tidb_stmt_summary_persist_evicted`](/system-variables.md#tidb_stmt_summary_persist_evicted-new-in-v857) to write records evicted by LRU to the statement summary log. TiDB marks these JSON records with `"evicted": true` for downstream log consumers to identify. TiDB does not return these records as query results of `statements_summary_history` or `cluster_statements_summary_history`.
diff --git a/system-variables.md b/system-variables.md
index 22e9e628866ac..a346c2adff68a 100644
--- a/system-variables.md
+++ b/system-variables.md
@@ -6337,9 +6337,9 @@ For details, see [Identify Slow Queries](/identify-slow-queries.md).
 - Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No
 - Type: Boolean
 - Default value: `OFF`
-- This variable controls whether to include the user that executes SQL statements as an aggregation dimension in [statement summary tables](/statement-summary-tables.md). When the variable value is `OFF`, the same SQL digest executed by different users is aggregated into the same row, and the `SAMPLE_USER` field displays one sampled user. When the variable value is `ON`, the same SQL digest executed by different users is aggregated into different rows, and the `SAMPLE_USER` field of each row indicates the execution user corresponding to the row.
-- Modifying this variable clears the current in-memory statement summary data because data before and after the modification uses different aggregation dimensions. Historical data that has been persisted to the disk is not affected.
-- After this variable is enabled, the number of statement summary records might increase with the number of different execution users for the same SQL digest, which increases memory usage.
+- This variable controls whether to include the user who executes SQL statements as an aggregation dimension in [statement summary tables](/statement-summary-tables.md). When this variable is set to `OFF`, TiDB aggregates the same SQL digest executed by different users into the same row, and the `SAMPLE_USER` field displays one sampled user. When this variable is set to `ON`, TiDB aggregates the same SQL digest executed by different users into separate rows, and the `SAMPLE_USER` field of each row indicates the user who executed the statement.
+- Modifying this variable clears the current in-memory statement summary data because data before and after the modification uses different aggregation dimensions. This does not affect historical data persisted to the disk.
+- After you enable this variable, the number of statement summary records might increase with the number of different execution users for the same SQL digest, which increases memory usage.
 
 ### tidb_stmt_summary_history_size New in v4.0
 
@@ -6423,9 +6423,9 @@ For details, see [Identify Slow Queries](/identify-slow-queries.md).
 - Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No
 - Type: Boolean
 - Default value: `OFF`
-- This variable controls whether to write statement summary records that are evicted by LRU to the statement summary log after [statements summary persistence](/statement-summary-tables.md#persist-statements-summary) is enabled. The written JSON records are marked with `"evicted": true` for downstream log consumers to identify.
-- This variable takes effect only for the persistent implementation of statement summary. Records marked with `"evicted": true` are not returned as query results of `statements_summary_history` or `cluster_statements_summary_history`.
-- After this variable is enabled, the amount of statement summary logs increases as LRU evictions become more frequent. Evicted records are written by using an asynchronous buffer mechanism. When the buffer queue is full, new evicted records might be dropped.
+- This variable controls whether to write statement summary records evicted by LRU to the statement summary log after you enable [statements summary persistence](/statement-summary-tables.md#persist-statements-summary). TiDB marks these JSON records with `"evicted": true` for downstream log consumers to identify.
+- This variable takes effect only for the persistent implementation of statement summary. TiDB does not return records marked with `"evicted": true` as query results of `statements_summary_history` or `cluster_statements_summary_history`.
+- After you enable this variable, the log volume increases as LRU evictions become more frequent. TiDB writes evicted records using an asynchronous buffer mechanism. When the buffer queue is full, TiDB might drop new evicted records.
 
 ### tidb_stmt_summary_refresh_interval New in v4.0