From 0f756f9bb089b1afed6c8b387e93605fc897abc7 Mon Sep 17 00:00:00 2001
From: Ashwini Ahire
Date: Thu, 7 May 2026 16:07:51 +0800
Subject: [PATCH] Enhance who-ate-my-memory with intro, section headers, and cross-links

- Add introduction listing major ClickHouse memory consumers and explaining the role of SYSTEM JEMALLOC PURGE.
- Add section headers and short context for each existing query block: all-in-one breakdown, drill-down by subsystem, live polling, retrospective query_log/part_log analysis, trace_log analysis, server text logs.
- Fix typo: ThreadStacksVirual -> ThreadStacksVirtual.
- Add a Next steps section linking to memory-configuration-settings, memory-overcommit, configure_clickhouse_for_low_mem_envs, and system-tables-eat-my-disk.
- No changes to any SQL or shell logic.
---
 .../altinity-kb-who-ate-my-memory.md | 120 +++++++++++++++---
 1 file changed, 101 insertions(+), 19 deletions(-)

diff --git a/content/en/altinity-kb-setup-and-maintenance/altinity-kb-who-ate-my-memory.md b/content/en/altinity-kb-setup-and-maintenance/altinity-kb-who-ate-my-memory.md
index 6c7ff553a3..4630f51d61 100644
--- a/content/en/altinity-kb-setup-and-maintenance/altinity-kb-who-ate-my-memory.md
+++ b/content/en/altinity-kb-setup-and-maintenance/altinity-kb-who-ate-my-memory.md
@@ -1,13 +1,46 @@
 ---
 title: "Who ate my ClickHouse® memory?"
 linkTitle: "Who ate my memory?"
+weight: 100
 description: >
-    *"It was here a few minutes ago..."*
-keywords:
+    *"It was here a few minutes ago..."* — finding what is using ClickHouse® RAM.
+keywords:
   - clickhouse memory
   - clickhouse memory usage
+  - MemoryTracking
+  - jemalloc
 ---
 
+When `MemoryTracking` is high or the OS shows ClickHouse® RSS far above what
+queries account for, the goal is to attribute the memory to one of its
+consumers. The major ones are:
+
+- **Caches** — mark cache, uncompressed cache, mmap cache, query cache,
+  filesystem cache, compiled-expression cache, primary-index cache.
+- **Primary keys** loaded in memory for every active part (`system.parts.primary_key_bytes_in_memory_allocated`).
+- **Dictionaries** (`system.dictionaries.bytes_allocated`).
+- **Running queries** (`system.processes.memory_usage`) and **merges/mutations**
+  (`system.merges.memory_usage`).
+- **In-memory engines** — `Memory`, `Set`, `Join`, `Buffer` tables.
+- **Async inserts** waiting to flush (`system.asynchronous_inserts`).
+- **In-memory parts** (`part_type = 'InMemory'`).
+- **Virtual / accounting overhead** — file read buffers, thread stacks, mmaps.
+- **jemalloc retained pages** — memory the allocator has freed but not yet
+  returned to the OS. This inflates RSS without being attributable to any
+  ClickHouse subsystem; running `SYSTEM JEMALLOC PURGE` first makes the rest
+  of the numbers comparable.
+
+For the related limits and overcommit behavior, see
+[memory configuration settings](/altinity-kb-setup-and-maintenance/altinity-kb-memory-configuration-settings/),
+[Memory Overcommiter](/altinity-kb-setup-and-maintenance/altinity-kb-memory-overcommit/), and
+[Configure ClickHouse for low memory environments](/altinity-kb-setup-and-maintenance/configure_clickhouse_for_low_mem_envs/).
+
+## All-in-one breakdown
+
+Run this first. It purges jemalloc retained pages, then unions every memory
+consumer into one `(group, name, val)` result so you can compare them at a
+glance.
+
 ```sql
 SYSTEM JEMALLOC PURGE;
 
@@ -39,7 +72,7 @@ SELECT 'AsyncInserts' as group, 'db:'||database as name, toInt64(sum(total_bytes
 UNION ALL
 SELECT 'FileBuffersVirtual' as group, metric as name, toInt64(value * 2*1024*1024) FROM system.metrics WHERE metric like 'OpenFileFor%'
 UNION ALL
-SELECT 'ThreadStacksVirual' as group, metric as name, toInt64(value * 8*1024*1024) FROM system.metrics WHERE metric = 'GlobalThread'
+SELECT 'ThreadStacksVirtual' as group, metric as name, toInt64(value * 8*1024*1024) FROM system.metrics WHERE metric = 'GlobalThread'
 UNION ALL
 SELECT 'UserMemoryTracking' as group, user as name, toInt64(memory_usage) FROM system.user_processes
 UNION ALL
@@ -48,24 +81,36 @@ select 'QueryCacheBytes' as group, '', toInt64(sum(result_size)) FROM system.que
 SELECT 'MemoryTracking' as group, 'total' as name, toInt64(value) FROM system.metrics WHERE metric = 'MemoryTracking'
 ```
 
+> Note: `FileBuffersVirtual` and `ThreadStacksVirtual` are *upper bounds* on
+> reservation, not committed RSS — useful for spotting runaway thread/file-handle
+> counts but not directly comparable to `MemoryTracking`.
+
+## Drill down by subsystem
+
+Once the all-in-one query points at a suspect group, use these to inspect it.
+
 ```sql
-SELECT *, formatReadableSize(value)
-FROM system.metrics
+-- Live memory/cache metrics
+SELECT *, formatReadableSize(value)
+FROM system.metrics
 WHERE (metric ilike '%Cach%' or metric ilike '%Mem%') and value != 0
 order by metric format PrettyCompactMonoBlock;
 
-SELECT *, formatReadableSize(value)
-FROM system.asynchronous_metrics
-WHERE metric like '%Cach%' or metric like '%Mem%'
+SELECT *, formatReadableSize(value)
+FROM system.asynchronous_metrics
+WHERE metric like '%Cach%' or metric like '%Mem%'
 order by metric format PrettyCompactMonoBlock;
 
-SELECT event_time, metric, value, formatReadableSize(value)
-FROM system.asynchronous_metric_log
-WHERE event_time > now() - 600 and (metric like '%Cach%' or metric like '%Mem%') and value <> 0
+-- Last 10 minutes of memory/cache metrics from the async log
+SELECT event_time, metric, value, formatReadableSize(value)
+FROM system.asynchronous_metric_log
+WHERE event_time > now() - 600 and (metric like '%Cach%' or metric like '%Mem%') and value <> 0
 order by metric, event_time format PrettyCompactMonoBlock;
 
+-- Dictionaries
 SELECT formatReadableSize(sum(bytes_allocated)) FROM system.dictionaries;
 
+-- In-memory engines
 SELECT
     database,
     name,
@@ -73,18 +118,19 @@ SELECT
 FROM system.tables
 WHERE engine IN ('Memory','Set','Join');
 
+-- Primary keys + in-memory parts
 SELECT
     sumIf(data_uncompressed_bytes, part_type = 'InMemory') as memory_parts,
     formatReadableSize(sum(primary_key_bytes_in_memory)) AS primary_key_bytes_in_memory,
     formatReadableSize(sum(primary_key_bytes_in_memory_allocated)) AS primary_key_bytes_in_memory_allocated
 FROM system.parts;
 
+-- Merges, in-flight queries, query cache
 SELECT formatReadableSize(sum(memory_usage)) FROM system.merges;
-
 SELECT formatReadableSize(sum(memory_usage)) FROM system.processes;
-
 select formatReadableSize(sum(result_size)) FROM system.query_cache;
 
+-- Top current queries by peak memory
 SELECT
     initial_query_id,
     elapsed,
@@ -95,6 +141,7 @@ FROM system.processes
 ORDER BY peak_memory_usage DESC
 LIMIT 10;
 
+-- Top recent (last 2h) finished queries by memory
 SELECT
     type,
     event_time,
@@ -105,14 +152,20 @@ FROM system.query_log
 WHERE (event_date >= today()) AND (event_time >= (now() - 7200))
 ORDER BY memory_usage DESC
 LIMIT 10;
-
 ```
 
+## Polling memory live
+
+When the suspect is moving (an active merge or query), poll the relevant
+tables in a loop. Run from the host shell.
+
+Just merges + processes:
+
 ```bash
 for i in `seq 1 600`; do clickhouse-client --empty_result_for_aggregation_by_empty_set=0 -q "select (select 'Merges: \
 '||formatReadableSize(sum(memory_usage)) from system.merges), (select \
 'Processes: '||formatReadableSize(sum(memory_usage)) from system.processes)";\
-sleep 3; done
+sleep 3; done
 
 Merges: 96.57 MiB Processes: 41.98 MiB
 Merges: 82.24 MiB Processes: 41.91 MiB
@@ -121,6 +174,8 @@ Merges: 66.49 MiB Processes: 37.13 MiB
 Merges: 67.78 MiB Processes: 37.13 MiB
 ```
 
+Wider view — merges, queries, primary keys, in-memory tables, dictionaries:
+
 ```bash
 echo "         Merges      Processes       PrimaryK       TempTabs          Dicts"; \
 for i in `seq 1 600`; do clickhouse-client --empty_result_for_aggregation_by_empty_set=0 -q "select \
@@ -130,7 +185,7 @@ for i in `seq 1 600`; do clickhouse-client --empty_result_for_aggregation_by_emp
 (select leftPad(formatReadableSize(sum(total_bytes)),15, ' ') from system.tables \
 WHERE engine IN ('Memory','Set','Join'))||
 (select leftPad(formatReadableSize(sum(bytes_allocated)),15, ' ') FROM system.dictionaries)
-"; sleep 3; done
+"; sleep 3; done
 
          Merges      Processes       PrimaryK       TempTabs          Dicts
          0.00 B         0.00 B      21.36 MiB       1.58 GiB     911.07 MiB
@@ -140,7 +195,13 @@ for i in `seq 1 600`; do clickhouse-client --empty_result_for_aggregation_by_emp
 
 ```
 
-## retrospection analysis of the RAM usage based on query_log and part_log (shows peaks)
+## Retrospective: peaks from query_log + part_log
+
+If the spike already happened, reconstruct the timeline by replaying
+allocations and releases from `query_log`, `part_log`, and `query_views_log`.
+Each row contributes `+peak_memory_usage` at start and `-peak_memory_usage`
+at end; the running sum approximates concurrent RAM use, broken out by event
+type.
 
 ```sql
 WITH
@@ -201,7 +262,11 @@ ORDER BY timeframe
 FORMAT PrettyCompactMonoBlock;
 ```
 
-## retrospection analysis of trace_log
+## Retrospective: trace_log
+
+`trace_log` records `MemoryPeak` samples per query. This bucketizes them in
+5-minute windows and surfaces the worst offender per bucket — useful when
+the query is gone but the trace is still on disk.
 
 ```sql
 WITH
@@ -239,9 +304,26 @@ ORDER BY t ASC;
 -- later on you can check particular query_ids in query_log
 ```
 
-## analysis of the server text logs
+## Server text logs
+
+`MemoryTracker` lines in the server log show every time a query, merge, or
+global tracker hits a peak or a limit — handy when system tables have already
+rolled over.
 
 ```
 grep MemoryTracker /var/log/clickhouse-server.log
 zgrep MemoryTracker /var/log/clickhouse-server.log.*.gz
 ```
+
+## Next steps
+
+Once you've identified the culprit:
+
+- Hitting `Memory limit (for query) exceeded` — see
+  [memory configuration settings](/altinity-kb-setup-and-maintenance/altinity-kb-memory-configuration-settings/)
+  and consider enabling the
+  [Memory Overcommiter](/altinity-kb-setup-and-maintenance/altinity-kb-memory-overcommit/).
+- Caches/dictionaries dominating on a small box — see
+  [Configure ClickHouse for low memory environments](/altinity-kb-setup-and-maintenance/configure_clickhouse_for_low_mem_envs/).
+- System tables themselves are large — see
+  [system tables eat my disk](/altinity-kb-setup-and-maintenance/altinity-kb-system-tables-eat-my-disk/).
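
The retrospective section describes a running sum in which each row adds `+peak_memory_usage` at its start and `-peak_memory_usage` at its end. A minimal sketch of that idea outside ClickHouse, assuming a hypothetical three-column input (`start_ts end_ts peak_bytes`) with made-up values; it is not the SQL touched by this patch, and the `events`, `timeline`, and `peak` names exist only for illustration:

```bash
#!/usr/bin/env bash
# Hypothetical sample rows: start_ts end_ts peak_bytes (not real query_log data).
events='100 160 500
120 130 200
150 200 300'

# Turn every row into two deltas (+peak at start, -peak at end),
# order them by time (releases first on ties), then accumulate.
timeline=$(printf '%s\n' "$events" \
  | awk '{ print $1, $3; print $2, -$3 }' \
  | sort -k1,1n -k2,2n \
  | awk '{ total += $2; print $1, total }')

# One "timestamp running_total" pair per event boundary.
printf '%s\n' "$timeline"

# The largest running total approximates the concurrent peak.
peak=$(printf '%s\n' "$timeline" | awk '$2 > max { max = $2 } END { print max }')
echo "peak=$peak"
```

With the sample rows the running total reaches 800 at timestamp 150, when the first and third rows overlap. Because each row is charged its full peak for its whole lifetime, the result is an upper-bound estimate of concurrent usage rather than an exact figure, which matches the "approximates" wording in the page text.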