Preserve child wrapper identity in the native AST facade#391
Preserve child wrapper identity in the native AST facade#391adamziel wants to merge 8 commits intocodex/native-lazy-ast-facadefrom
Conversation
## Summary - reuse one `WP_MySQL_Parser` instance inside the SQLite driver and reset its token stream per query - add `reset_tokens()` to the PHP parser polyfill and the Rust native parser - restore native parser-node accessor fast paths in `WP_MySQL_Native_Parser_Node`, while keeping PHP child materialization for mutation - fix the local native extension build helper for Nix/libclang bindgen by undefining `__SSE2__` during binding generation ## Stack This is the top PR in the native MySQL lexer/parser stack. The stack is split so each GitHub diff shows one reviewable concern: 1. [#384 Extract MySQL lexer and parser polyfills](#384) - `trunk` -> `codex/native-parser-php-facade` - extraction-only PHP refactor - moves the existing PHP lexer/parser implementations into polyfill classes - keeps public `WP_MySQL_Lexer` and `WP_MySQL_Parser` as thin PHP subclasses 2. [#385 Add optional native parser routing](#385) - `codex/native-parser-php-facade` -> `codex/native-parser-class-routing` - adds fallback `WP_MySQL_Native_*` PHP classes - routes the public lexer/parser classes through native classes when the Rust extension provides them - adds the minimal PHP grammar-export bridge for the native parser 3. [#386 Add lazy native parser node facade](#386) - `codex/native-parser-class-routing` -> `codex/native-parser-node-facade` - keeps `WP_Parser_Node` as the plain PHP tree node - adds `WP_MySQL_Native_Parser_Node extends WP_Parser_Node` for native-backed lazy AST nodes - keeps native AST handles and native accessor delegation out of the base node class 4. [#381 Add lazy native AST facade](#381) - `codex/native-parser-node-facade` -> `codex/native-lazy-ast-facade` - implements the Rust lexer/parser extension and lazy native AST facade - makes the Rust extension instantiate `WP_MySQL_Native_Parser_Node` - adds native-extension CI coverage for the SQLite driver and WordPress PHPUnit tests - includes the local SQLite facade smoke benchmark 5. [#387 Cache native grammar on parser grammar object](#387) - `codex/native-lazy-ast-facade` -> `codex/native-parser-object-grammar-cache` - restores the object-attached native grammar cache - adds only `WP_Parser_Grammar::$native_grammar` on the PHP side - removes the Rust content-hash cache that walked the whole exported grammar on every parser construction 6. This PR, [#388 Speed up native AST materialization](#388) - `codex/native-parser-object-grammar-cache` -> `codex/native-parser-bulk-materialization` - optimizes native-to-PHP AST access after the grammar-cache performance restoration - reuses the SQLite driver's parser instance instead of constructing it per query ## Why The native lexer/parser itself is fast, but the PHP-facing path can lose that benefit if each query repeatedly rebuilds native parser state or forces full PHP AST materialization. On the current stack, #387 already removes the large grammar export/hash cost. This PR removes the remaining per-query parser construction churn and restores the native AST accessor path for descendant-heavy SQLite driver workloads. ## Measurements Environment: local PHP 8.2 via the native build helper, release Rust extension, current top of this PR. Focused constructor/reset benchmark over 5000 unique SELECT queries: | Phase | Time | | --- | ---: | | native tokenize | 22.62 us/query | | fresh native parser constructor only | 2.31 us/query | | reusable parser `reset_tokens()` only | 0.32 us/query | | reusable parser reset + parse + `get_descendants()` | 157.06 us/query | | constructor/reset ratio | 7.3x | The previously reported ~622 us/query constructor cost does not reproduce on this stack because #387 already caches the native grammar on the PHP grammar object. Parser reuse still removes most of the remaining constructor overhead. SQLite facade smoke workload: Command: ```bash TMP_TEST_NATIVE_QUERY_COUNT=250 ./tmp-test-native/run.sh ``` | Workload | PHP fallback | Native extension | Speedup | | --- | ---: | ---: | ---: | | 250 generated queries, including 1 x 2000-row insert | 4.060s | 0.525s | 7.73x | ## Testing - `cargo fmt --check` - `git diff --check` - `composer run check-cs` - `composer run test` from `packages/mysql-on-sqlite` - `php -d extension=packages/mysql-on-sqlite/ext/wp-mysql-parser/target/release/libwp_mysql_parser.so packages/mysql-on-sqlite/vendor/bin/phpunit -c packages/mysql-on-sqlite/phpunit.xml.dist` - `TMP_TEST_NATIVE_QUERY_COUNT=250 ./tmp-test-native/run.sh`
The native parser extension constructs a fresh WP_MySQL_Native_Parser_Node on every accessor call, so two reads of the same logical node returned different PHP objects and any state a caller attached to the first wrapper was invisible through the second. WP_Parser_Node has always given callers stable child identity, and the lazy native facade is meant to keep that contract intact — this restores it. A per-AST identity map is created lazily on the root and shared by every interned wrapper. Each accessor that returns a node looks the index up in the map and returns the canonical instance, discarding the freshly constructed one. Materialization pulls children through the same map so mutations a caller made through get_first_child_node() before triggering append_child() survive into $this->children. Adds a regression test that exercises identity across child, descendant, and post-materialization reads, plus a walk benchmark and a CI workflow that reports parse + walk time and peak memory for the PHP and native paths so the cache cost is measurable on every PR.
Perf measurement (CI run 25179776990)Benchmarked on the same runner: parse the full MySQL server-suite corpus (69,567 queries) and walk every AST exhaustively via
Identity-cache cost on the walk path
The baseline identity check fails ( Net: the native walk is still 4× faster than pure-PHP (3.18s vs 12.84s) and uses less memory, so the cache fits comfortably under the perf budget #381 set out to deliver while restoring |
intern_all hoists the cache lookup out of the loop and inlines what was a per-item method call to intern(). For accessors whose Rust bridge returns only nodes — get_child_nodes / get_descendant_nodes — a typed intern_nodes() variant skips the instanceof check entirely. The walk benchmark exercises the descendants accessors over ~4.8M nodes per run, so even small per-item savings add up.
Perf after the hot-path optimization (CI run 25181172304, commit 6c94572)Same corpus (69,567 queries, 4.8M walked nodes), same workflow, baseline rebuilt from
Identity-cache cost — before vs after the hot-path fix
*on a different (slower) runner; both numbers are vs the same-run baseline. What changed
Net: most of the cache cost on time is gone — the walk is now within ~5% of the no-identity baseline while still preserving |
The walk benchmark we already had is cache-miss heavy (one walk per AST, every node visited once), so the identity cache shows up there as a small overhead rather than a win. The cache is supposed to pay back in hit-heavy patterns: re-walks of the same tree, repeated child reads at the root, and translator-style passes that re-enter visited subtrees. Adds three modes (--mode=rewalk|reread|subtree) and runs each on both the PR and the baseline so the comparison is apples-to-apples on the same runner, same corpus.
Hit-heavy benchmarks (CI run 25189505049, commit 2c07dac)Same corpus (69,567 queries), same runner, baseline =
Reading the numbersThe cache wins where it's supposed to — The cache loses on This is exactly the workload shape a translator/rewriter pass produces — walk once to find candidates, then re-enter subtrees by index. So while the correctness fix is right, the PHP-side cache adds 27–31% to the realistic re-entry workload. Would a Rust-side cache fix this?Yes, materially. The two cost centers a Rust cache eliminates that PHP can't:
Estimated impact, holding everything else constant:
Memory delta is structural and unchanged — one retained wrapper per visited node, regardless of where the cache lives. RecommendationThe PHP-side cache is correctness-first and ships today, but it has a real performance regression on translator-style re-entry workloads (the same workloads this codebase actually runs). The Rust-side cache is the right architecture; the numbers above quantify the headroom (10–30% recoverable on hit-heavy scenarios). Two paths:
|
Stacked on #381.
#381's review surfaced a real semantics regression in the lazy native AST facade: every accessor on
WP_MySQL_Native_Parser_Nodecalls into Rust and returns a freshly constructed PHP wrapper, soget_first_child_node()returns a different object every time.WP_Parser_Nodehas always given callers stable child identity — attach state to a child once, walk past it, walk back, the state is still there — and the lazy native facade is meant to keep that contract intact. This restores it.A per-AST
WP_MySQL_Native_AST_Cacheis created lazily on the root and shared by reference with every wrapper that gets interned through it. Each accessor looks the returned wrapper'snative_node_indexup in the cache and either returns the canonical instance or registers the new one.materialize_native_children()pulls children through the same cache so any mutation a caller made throughget_first_child_node()before the parent went throughappend_child()survives into$this->children— same instance, same mutations.Tokens are unchanged. The public token API has no mutators and no callers in this repo rely on
WP_MySQL_Tokenidentity; if that becomes a need we can extend the cache.What's in here
class-wp-mysql-native-ast-cache.php— small holder, one per AST.intern()/intern_all()/intern_nodes().materialize_native_children()reuses the interned wrappers so prior mutations don't get lost.run-native-ast-walk-benchmark.phpparses the MySQL server suite, walks each AST, and reportsparsed,walked_nodes,duration,peak_mem, and an identity-stability flag.Native AST Walk Perfworkflow runs the benchmark on this PR and on the PR base (codex/native-lazy-ast-facade) on the same runner, so the identity-cache cost is measured apples-to-apples on every push.Performance
Benchmarked on CI against the no-cache baseline, same runner, same corpus (69,567 queries, 4.8M walked nodes). Full numbers in PR comments — links below.
Hot-path optimization in
intern_all()(cache reference hoisted out of the loop, per-item logic inlined) plus a typedintern_nodes()fast path for accessors that return only nodes brought the time penalty down from an initial +17% to +5%. Memory delta is structural — one retained PHP wrapper per visited node — and is the price of stable identity.Native walk is still ~5× faster than the pure-PHP path while now preserving
WP_Parser_Nodesemantics.Test plan
Native AST Walk Perfworkflow reports the cache cost vs. the PR base on every push.