fix: bound retained AI file-outline memory per repo (APP-4794) #13141

Draft
warp-dev-github-integration[bot] wants to merge 1 commit into master from fix/ai-outline-memory-budget-app-4794

Description

The AI repo symbol-outline index retains gigabytes of owned Symbol strings in memory.

ai::index::file_outline::build_outline parses every parsable file in a repository in parallel, and parse_file_outline allocates owned Strings for each symbol (name, type_prefix, comment). The resulting FileOutlines are collected into Outline.file_id_to_outline and held per repo by RepoOutlines.outlines (app/src/ai/outline/native.rs).

Per-file size is capped (MAX_FILE_SIZE = 3 MB) and per-repo file count is capped (MAX_REPO_FILE_SIZE_LIMIT = 5000), but there is no cap on the cumulative bytes of the retained outline. One outline is kept per detected repo across all working directories, so large repos — and especially multiple indexed repos — grow the index to multiple GB.
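For a sense of scale, the two existing caps multiply out to roughly 14.6 GiB of admitted source text per repo. The sketch below only does that arithmetic. It assumes "3 MB" means 3 * 1024 * 1024 bytes (the real constant may be defined differently), and `max_indexed_input_bytes` is a hypothetical helper, not something in the codebase. This bounds the input, not the retained outline, which is the gap described above.

```rust
// Assumption: "3 MB" is read as 3 * 1024 * 1024 bytes.
const MAX_FILE_SIZE: u64 = 3 * 1024 * 1024;
const MAX_REPO_FILE_SIZE_LIMIT: u64 = 5000;

/// Hypothetical helper: worst-case bytes of source text that the two
/// existing caps admit into indexing for a single repo.
const fn max_indexed_input_bytes() -> u64 {
    MAX_FILE_SIZE * MAX_REPO_FILE_SIZE_LIMIT
}
```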

Heap profile evidence

A heap profile from Sentry event 981dd4e8127345738eac30598946cff9 (group 7259255054, "Excessive memory usage detected") was unsymbolized on upload. Symbolized offline against the build dSYM, it showed:

  • Process footprint ≈ 10 GB; sampled live heap ≈ 7.09 GiB
  • ai::index::file_outline::native::parse_file_outline (cumulative): ~6.0 GiB / 84%, on rayon workers via build_outline
    • native.rs:250 name.to_owned(): 53%
    • native.rs:259 collect_vec() (building Vec<Symbol>): 30%

This is a distinct facet from the other open outline PRs (#11982 batches parsing to reduce peak memory; #11984 reuses the tree-sitter Parser to reduce fragmentation) — neither bounds the retained outline.

Fix

Track the cumulative approximate heap size of the outlines built so far and stop building new outlines once a generous 256 MiB per-repo budget is exceeded. The index then degrades to a bounded partial outline instead of growing without limit. Typical repos are far below the budget and are unaffected.

  • Symbol::approx_heap_size / FileOutline::approx_heap_size estimate retained bytes.
  • build_outline accumulates this across rayon workers (AtomicUsize) and skips remaining files once over budget, logging once.
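The approach can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Symbol` and `FileOutline` layouts are simplified stand-ins, `parse_file_outline` here just copies names, `build_outline_bounded` is a hypothetical name, and scoped `std::thread` workers replace rayon so the sketch has no dependencies.

```rust
use std::mem::size_of;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::thread;

/// Simplified stand-in for the real `Symbol` (field names from the description).
struct Symbol {
    name: String,
    type_prefix: Option<String>,
    comment: Option<String>,
}

impl Symbol {
    /// Approximate heap bytes owned by this symbol: its string capacities.
    fn approx_heap_size(&self) -> usize {
        self.name.capacity()
            + self.type_prefix.as_ref().map_or(0, String::capacity)
            + self.comment.as_ref().map_or(0, String::capacity)
    }
}

/// Simplified stand-in for the real `FileOutline`.
struct FileOutline {
    symbols: Vec<Symbol>,
}

impl FileOutline {
    /// Vec backing storage plus the heap owned by each symbol.
    fn approx_heap_size(&self) -> usize {
        self.symbols.capacity() * size_of::<Symbol>()
            + self.symbols.iter().map(Symbol::approx_heap_size).sum::<usize>()
    }
}

/// Stand-in parser: one symbol per name, each with an owned copy of the name.
fn parse_file_outline(names: &[&str]) -> FileOutline {
    let symbols = names
        .iter()
        .map(|n| Symbol { name: (*n).to_owned(), type_prefix: None, comment: None })
        .collect();
    FileOutline { symbols }
}

/// Hypothetical bounded build: workers share one byte counter and stop
/// starting new files once it exceeds `budget`. The check runs before each
/// parse, so files already in flight can overshoot the budget slightly.
fn build_outline_bounded(files: &[Vec<&str>], budget: usize, workers: usize) -> Vec<FileOutline> {
    let retained = AtomicUsize::new(0);
    let logged = AtomicBool::new(false);
    let workers = workers.max(1);
    let chunk_len = ((files.len() + workers - 1) / workers).max(1);

    thread::scope(|s| {
        let handles: Vec<_> = files
            .chunks(chunk_len)
            .map(|chunk| {
                let (retained, logged) = (&retained, &logged);
                s.spawn(move || {
                    let mut out = Vec::new();
                    for names in chunk {
                        if retained.load(Ordering::Relaxed) > budget {
                            // `swap` returns the previous value, so only one worker logs.
                            if !logged.swap(true, Ordering::Relaxed) {
                                eprintln!("outline budget exceeded; skipping remaining files");
                            }
                            break;
                        }
                        let outline = parse_file_outline(names);
                        retained.fetch_add(outline.approx_heap_size(), Ordering::Relaxed);
                        out.push(outline);
                    }
                    out
                })
            })
            .collect();
        handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
    })
}
```

Relaxed ordering is enough in this sketch because the counter is only a soft limit: no other memory is synchronized through it, and a small overshoot is acceptable.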

Follow-up (noted in APP-4794): optionally extend the same budget to the incremental Outline::update path.

Linked Issue

Testing

  • cargo check -p ai --features local_fs
  • cargo clippy -p ai --features local_fs --all-targets -- -D warnings
  • cargo test -p ai --features local_fs -- index::file_outline — 5 passed (3 new approx_heap_size tests + 2 existing)

Agent Mode

  • Warp Agent Mode - This PR was created via Warp's AI Agent Mode

CHANGELOG-BUG-FIX: Bound the memory used by AI codebase symbol indexing so very large repositories no longer cause excessive memory usage.

Conversation: https://staging.warp.dev/conversation/d4d630aa-3fbf-42bc-aec0-194a8a1de4f7
Run: https://oz.staging.warp.dev/runs/019f0e38-2fc5-73f7-91a7-7c7af781fbec

This PR was generated with Oz.

`ai::index::file_outline::build_outline` -> `parse_file_outline` allocates
owned `Symbol` strings for every parsable file and retains them per-repo in
`RepoOutlines`. Per-file size (3MB) and file count (5000) are capped, but the
cumulative outline size is unbounded, so large or numerous repositories grow
the in-memory index to multiple GB.

A symbolized heap profile from Sentry group 7259255054 showed ~6.0 GiB (84%) of
sampled live heap retained here (`name.to_owned()` 53%, `collect_vec()` 30%).

Track the cumulative heap size of the retained outlines while parsing and stop
building new outlines once a generous 256 MiB per-repo budget is exceeded, so
the index degrades to a bounded partial outline instead of growing without
limit. Typical repos are well under the limit and unaffected.

Co-Authored-By: Oz <oz-agent@warp.dev>