[NFC] cache repeated tree walks to avoid O(N^2) in optimizeTerminatingTails in CodeFolding #8602
Open
Changqing-JING wants to merge 6 commits into WebAssembly:main
@kripken
Cache the result of getBranchTargets(getFunction()->body) in optimizeTerminatingTails so that recursive calls share the same computed set rather than each re-walking the entire function body. This avoids O(N²) behavior where N is the size of the function body, since the recursive calls previously each performed an O(N) tree walk. The cached targets are computed lazily on first need and passed through to the canMove overload that accepts pre-computed branch targets.
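The caching pattern described above can be illustrated with a small standalone sketch. This is not the code from the PR: `Node`, `Folder`, `collectBranchTargets`, and `walkCount` are hypothetical stand-ins rather than Binaryen types, and the sketch assumes the tree is not changed in a way that would alter the set of targets while the cache is live. It only shows the general idea of computing the set lazily once and reusing it across recursive calls.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for an expression tree node (not Binaryen's Expression).
struct Node {
  std::string branchTarget; // empty if this node does not branch anywhere
  std::vector<std::unique_ptr<Node>> children;
};

// Counts full-tree walks so the effect of caching can be observed.
static int walkCount = 0;

static void collect(const Node& node, std::set<std::string>& out) {
  if (!node.branchTarget.empty()) {
    out.insert(node.branchTarget);
  }
  for (const auto& child : node.children) {
    assert(child && "children are assumed to be non-null");
    collect(*child, out);
  }
}

// One O(N) walk over the whole body, similar in spirit to getBranchTargets.
std::set<std::string> collectBranchTargets(const Node& body) {
  ++walkCount;
  std::set<std::string> targets;
  collect(body, targets);
  return targets;
}

// Hypothetical pass object that owns the lazily computed cache.
struct Folder {
  const Node& body;
  std::optional<std::set<std::string>> cachedTargets;

  explicit Folder(const Node& b) : body(b) {}

  // Walks the body on first use only; later calls return the cached set.
  const std::set<std::string>& targets() {
    if (!cachedTargets) {
      cachedTargets = collectBranchTargets(body);
    }
    return *cachedTargets;
  }

  // Recursive routine in which every level needs the branch targets.
  // Without the cache, each level would pay for a full walk of the body.
  int optimize(int depth) {
    if (depth == 0) {
      return 0;
    }
    int hit = targets().count("exit") ? 1 : 0;
    return hit + optimize(depth - 1);
  }
};
```

In this sketch, calling `optimize` with any depth triggers at most one walk of the body, so the total cost is one O(N) walk plus the recursion itself instead of one walk per recursive call.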
Benchmark data
For the test case in #7319 (comment)
Main head:
time ./build/bin/wasm-opt --code-folding --enable-bulk-memory --enable-multivalue --enable-reference-types --enable-gc --enable-tail-call --enable-exception-handling -o /dev/null ./test3.wasm
real 5m45.996s
user 6m6.267s
sys 0m3.798s
This PR:
time ./build/bin/wasm-opt --code-folding --enable-bulk-memory --enable-multivalue --enable-reference-types --enable-gc --enable-tail-call --enable-exception-handling -o /dev/null ./test3.wasm
real 2m2.380s
user 2m25.700s
sys 0m2.449s
Benchmark regression test
Test case: https://jetbrains.github.io/kotlinconf-app/73cbe24d7cf5a54d37ad.wasm
On main
On current PR