fix: bound retained AI file-outline memory per repo (APP-4794) #13141
Draft
warp-dev-github-integration[bot] wants to merge 1 commit into
`ai::index::file_outline::build_outline` -> `parse_file_outline` allocates owned `Symbol` strings for every parsable file and retains them per-repo in `RepoOutlines`. Per-file size (3 MB) and file count (5000) are capped, but the cumulative outline size is unbounded, so large or numerous repositories grow the in-memory index to multiple GB. A symbolized heap profile from Sentry 7259255054 showed ~6 GB (84%) of live heap retained here (`name.to_owned()` 53%, `collect_vec()` 30%).

Track the cumulative heap size of the retained outlines while parsing and stop building new outlines once a generous 256 MiB per-repo budget is exceeded, so the index degrades to a bounded partial outline instead of growing without limit. Typical repos are well under the limit and unaffected.

Co-Authored-By: Oz <oz-agent@warp.dev>
Description
The AI repo symbol-outline index retains gigabytes of owned `Symbol` strings in memory. `ai::index::file_outline::build_outline` parses every parsable file in a repository in parallel, and `parse_file_outline` allocates owned `String`s for each symbol (`name`, `type_prefix`, `comment`). The resulting `FileOutline`s are collected into `Outline.file_id_to_outline` and held per repo by `RepoOutlines.outlines` (`app/src/ai/outline/native.rs`).

Per-file size is capped (`MAX_FILE_SIZE = 3 MB`) and per-repo file count is capped (`MAX_REPO_FILE_SIZE_LIMIT = 5000`), but there is no cap on the cumulative bytes of the retained outline. One outline is kept per detected repo across all working directories, so large repos, and especially multiple indexed repos, grow the index to multiple GB.

Heap profile evidence
A heap profile from Sentry event `981dd4e8127345738eac30598946cff9` (group 7259255054, "Excessive memory usage detected") was unsymbolized on upload; symbolized offline against the build dSYM it showed:

- `ai::index::file_outline::native::parse_file_outline` (cumulative): ~6.0 GiB / 84%, on rayon workers via `build_outline`
  - `native.rs:250` `name.to_owned()`: 53%
  - `native.rs:259` `collect_vec()` (building `Vec<Symbol>`): 30%

This is a distinct facet from the other open outline PRs (#11982 batches parsing to reduce peak memory; #11984 reuses the tree-sitter `Parser` to reduce fragmentation); neither bounds the retained outline.

Fix
Track the cumulative approximate heap size of the outlines built so far and stop building new outlines once a generous 256 MiB per-repo budget is exceeded. The index then degrades to a bounded partial outline instead of growing without limit. Typical repos are far below the budget and are unaffected.
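As a rough illustration of the budget described above, here is a minimal, dependency-free sketch; it is not the code in this PR. `OutlineBudget`, `build_outlines_bounded`, and `MAX_OUTLINE_BYTES_PER_REPO` are hypothetical names. The `Symbol` and `FileOutline` definitions are stand-ins whose field types are assumptions, and `std::thread::scope` is used in place of rayon so the example runs without external crates.

```rust
use std::mem::size_of;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::thread;

/// 256 MiB per-repo budget from the description; the constant name is hypothetical.
pub const MAX_OUTLINE_BYTES_PER_REPO: usize = 256 * 1024 * 1024;

/// Stand-in for the real `Symbol`; the `Option` field types are assumptions.
#[derive(Debug, Clone)]
pub struct Symbol {
    pub name: String,
    pub type_prefix: Option<String>,
    pub comment: Option<String>,
}

impl Symbol {
    /// Heap bytes owned by the strings (allocated capacity, not length).
    pub fn approx_heap_size(&self) -> usize {
        self.name.capacity()
            + self.type_prefix.as_ref().map_or(0, String::capacity)
            + self.comment.as_ref().map_or(0, String::capacity)
    }
}

/// Stand-in for the real `FileOutline`; assumed to hold a `Vec<Symbol>`.
#[derive(Debug, Clone)]
pub struct FileOutline {
    pub symbols: Vec<Symbol>,
}

impl FileOutline {
    /// The vector's own buffer plus everything its symbols own.
    pub fn approx_heap_size(&self) -> usize {
        self.symbols.capacity() * size_of::<Symbol>()
            + self.symbols.iter().map(Symbol::approx_heap_size).sum::<usize>()
    }
}

/// Hypothetical shared counter for one repo's retained outline bytes.
pub struct OutlineBudget {
    limit: usize,
    used: AtomicUsize,
    warned: AtomicBool,
}

impl OutlineBudget {
    pub const fn new(limit: usize) -> Self {
        Self { limit, used: AtomicUsize::new(0), warned: AtomicBool::new(false) }
    }

    pub fn is_exhausted(&self) -> bool {
        self.used.load(Ordering::Relaxed) > self.limit
    }

    /// Adds `bytes`; returns true only for the first call that sees the
    /// total over the limit, so the caller can log exactly once.
    pub fn charge(&self, bytes: usize) -> bool {
        let total = self.used.fetch_add(bytes, Ordering::Relaxed).saturating_add(bytes);
        total > self.limit && !self.warned.swap(true, Ordering::Relaxed)
    }
}

/// Parses `files` on a few worker threads and skips files once the budget
/// is spent. Skipped files yield `None`; results keep the input order.
pub fn build_outlines_bounded<F>(
    files: &[&str],
    budget: &OutlineBudget,
    parse: F,
) -> Vec<Option<FileOutline>>
where
    F: Fn(&str) -> FileOutline + Sync,
{
    const WORKERS: usize = 4;
    let chunk_len = ((files.len() + WORKERS - 1) / WORKERS).max(1);
    let parse = &parse;
    thread::scope(|scope| {
        let handles: Vec<_> = files
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .map(|&src| {
                            if budget.is_exhausted() {
                                return None;
                            }
                            let outline = parse(src);
                            if budget.charge(outline.approx_heap_size()) {
                                eprintln!("outline budget exceeded; skipping remaining files");
                            }
                            Some(outline)
                        })
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("outline worker panicked"))
            .collect::<Vec<_>>()
    })
}
```

In this sketch the bound is soft: workers check the counter before parsing and charge it afterwards, so the retained total can overshoot the limit by the outlines that were in flight when it was crossed. A caller would construct the budget once per repo, for example `OutlineBudget::new(MAX_OUTLINE_BYTES_PER_REPO)`.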
- `Symbol::approx_heap_size` / `FileOutline::approx_heap_size` estimate retained bytes.
- `build_outline` accumulates this across rayon workers (`AtomicUsize`) and skips remaining files once over budget, logging once.

Follow-up (noted in APP-4794): optionally extend the same budget to the incremental `Outline::update` path.

Linked Issue
Testing
- `cargo check -p ai --features local_fs`
- `cargo clippy -p ai --features local_fs --all-targets -- -D warnings`
- `cargo test -p ai --features local_fs -- index::file_outline`: 5 passed (3 new `approx_heap_size` tests + 2 existing)

Agent Mode
CHANGELOG-BUG-FIX: Bound the memory used by AI codebase symbol indexing so very large repositories no longer cause excessive memory usage.
Conversation: https://staging.warp.dev/conversation/d4d630aa-3fbf-42bc-aec0-194a8a1de4f7
Run: https://oz.staging.warp.dev/runs/019f0e38-2fc5-73f7-91a7-7c7af781fbec
This PR was generated with Oz.