
Commit 076b7e5

Speed up native AST materialization (#388)
## Summary

- reuse one `WP_MySQL_Parser` instance inside the SQLite driver and reset its token stream per query
- add `reset_tokens()` to the PHP parser polyfill and the Rust native parser
- restore native parser-node accessor fast paths in `WP_MySQL_Native_Parser_Node`, while keeping PHP child materialization for mutation
- fix the local native extension build helper for Nix/libclang bindgen by undefining `__SSE2__` during binding generation

## Stack

This is the top PR in the native MySQL lexer/parser stack. The stack is split so each GitHub diff shows one reviewable concern:

1. [#384 Extract MySQL lexer and parser polyfills](#384)
   - `trunk` -> `codex/native-parser-php-facade`
   - extraction-only PHP refactor
   - moves the existing PHP lexer/parser implementations into polyfill classes
   - keeps public `WP_MySQL_Lexer` and `WP_MySQL_Parser` as thin PHP subclasses
2. [#385 Add optional native parser routing](#385)
   - `codex/native-parser-php-facade` -> `codex/native-parser-class-routing`
   - adds fallback `WP_MySQL_Native_*` PHP classes
   - routes the public lexer/parser classes through native classes when the Rust extension provides them
   - adds the minimal PHP grammar-export bridge for the native parser
3. [#386 Add lazy native parser node facade](#386)
   - `codex/native-parser-class-routing` -> `codex/native-parser-node-facade`
   - keeps `WP_Parser_Node` as the plain PHP tree node
   - adds `WP_MySQL_Native_Parser_Node extends WP_Parser_Node` for native-backed lazy AST nodes
   - keeps native AST handles and native accessor delegation out of the base node class
4. [#381 Add lazy native AST facade](#381)
   - `codex/native-parser-node-facade` -> `codex/native-lazy-ast-facade`
   - implements the Rust lexer/parser extension and lazy native AST facade
   - makes the Rust extension instantiate `WP_MySQL_Native_Parser_Node`
   - adds native-extension CI coverage for the SQLite driver and WordPress PHPUnit tests
   - includes the local SQLite facade smoke benchmark
5. [#387 Cache native grammar on parser grammar object](#387)
   - `codex/native-lazy-ast-facade` -> `codex/native-parser-object-grammar-cache`
   - restores the object-attached native grammar cache
   - adds only `WP_Parser_Grammar::$native_grammar` on the PHP side
   - removes the Rust content-hash cache that walked the whole exported grammar on every parser construction
6. This PR, [#388 Speed up native AST materialization](#388)
   - `codex/native-parser-object-grammar-cache` -> `codex/native-parser-bulk-materialization`
   - optimizes native-to-PHP AST access after the grammar-cache performance restoration
   - reuses the SQLite driver's parser instance instead of constructing it per query

## Why

The native lexer/parser itself is fast, but the PHP-facing path can lose that benefit if each query repeatedly rebuilds native parser state or forces full PHP AST materialization.

On the current stack, #387 already removes the large grammar export/hash cost. This PR removes the remaining per-query parser construction churn and restores the native AST accessor path for descendant-heavy SQLite driver workloads.

## Measurements

Environment: local PHP 8.2 via the native build helper, release Rust extension, current top of this PR.

Focused constructor/reset benchmark over 5000 unique SELECT queries:

| Phase | Time |
| --- | ---: |
| native tokenize | 22.62 us/query |
| fresh native parser constructor only | 2.31 us/query |
| reusable parser `reset_tokens()` only | 0.32 us/query |
| reusable parser reset + parse + `get_descendants()` | 157.06 us/query |
| constructor/reset ratio | 7.3x |

The previously reported ~622 us/query constructor cost does not reproduce on this stack because #387 already caches the native grammar on the PHP grammar object. Parser reuse still removes most of the remaining constructor overhead.

SQLite facade smoke workload:

Command:

```bash
TMP_TEST_NATIVE_QUERY_COUNT=250 ./tmp-test-native/run.sh
```

| Workload | PHP fallback | Native extension | Speedup |
| --- | ---: | ---: | ---: |
| 250 generated queries, including 1 x 2000-row insert | 4.060s | 0.525s | 7.73x |

## Testing

- `cargo fmt --check`
- `git diff --check`
- `composer run check-cs`
- `composer run test` from `packages/mysql-on-sqlite`
- `php -d extension=packages/mysql-on-sqlite/ext/wp-mysql-parser/target/release/libwp_mysql_parser.so packages/mysql-on-sqlite/vendor/bin/phpunit -c packages/mysql-on-sqlite/phpunit.xml.dist`
- `TMP_TEST_NATIVE_QUERY_COUNT=250 ./tmp-test-native/run.sh`
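The constructor-versus-reset comparison in the Measurements table can be approximated with a small timing harness. The following is only a sketch: `Sketch_Parser` and `time_per_call_us()` are hypothetical stand-ins invented for illustration, not the benchmark used for the numbers above, and the absolute timings it prints will differ from the reported ones.

```php
<?php
// Hypothetical stand-in for a parser; NOT the real WP_MySQL_Parser.
final class Sketch_Parser {
	/** Counts constructor calls so the two code paths can be told apart. */
	public static $constructed = 0;

	private $grammar;
	private $tokens;
	private $position = 0;

	public function __construct( array $grammar, array $tokens ) {
		++self::$constructed;
		$this->grammar = $grammar;
		$this->tokens  = $tokens;
	}

	/** Swap in a new token stream without rebuilding the object. */
	public function reset_tokens( array $tokens ): void {
		$this->tokens   = $tokens;
		$this->position = 0;
	}

	public function token_count(): int {
		return count( $this->tokens );
	}
}

/** Average wall-clock time per call in microseconds (hypothetical helper). */
function time_per_call_us( callable $fn, int $iterations ): float {
	$start = hrtime( true ); // Monotonic clock, nanoseconds.
	for ( $i = 0; $i < $iterations; $i++ ) {
		$fn( $i );
	}
	return ( hrtime( true ) - $start ) / 1000 / $iterations;
}

// Assumed workload: 5000 tiny, unique token streams.
$grammar = array( 'rules' => array() );
$streams = array();
for ( $i = 0; $i < 5000; $i++ ) {
	$streams[] = array( 'SELECT', (string) $i );
}

// Path 1: construct a fresh parser for every query.
$fresh_us = time_per_call_us(
	function ( int $i ) use ( $grammar, $streams ) {
		new Sketch_Parser( $grammar, $streams[ $i ] );
	},
	count( $streams )
);

// Path 2: construct once, then only reset the token stream.
$reusable = new Sketch_Parser( $grammar, array() );
$reset_us = time_per_call_us(
	function ( int $i ) use ( $reusable, $streams ) {
		$reusable->reset_tokens( $streams[ $i ] );
	},
	count( $streams )
);

printf( "fresh: %.2f us/query, reset: %.2f us/query\n", $fresh_us, $reset_us );
```

With a stand-in this small the two paths may be close in cost; the gap reported in the table comes from the setup work the real native parser constructor performs.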
1 parent 09b9c1a commit 076b7e5

5 files changed

Lines changed: 164 additions & 29 deletions


.github/workflows/mysql-parser-extension-tests.yml

Lines changed: 8 additions & 1 deletion
@@ -81,7 +81,10 @@ jobs:
             exit( 1 );
           }
           '
-          ./vendor/bin/phpunit -c ./phpunit.xml.dist tests/mysql/WP_MySQL_Lexer_Tests.php tests/parser/WP_Parser_Node_Tests.php
+        working-directory: packages/mysql-on-sqlite
+
+      - name: Run PHPUnit tests with parser extension
+        run: php -d extension="$GITHUB_WORKSPACE/packages/php-ext-wp-mysql-parser/target/debug/libwp_mysql_parser.so" ./vendor/bin/phpunit -c ./phpunit.xml.dist
         working-directory: packages/mysql-on-sqlite
 
   sqlite-driver-extension-tests:
@@ -149,3 +152,7 @@ jobs:
             exit( 1 );
           }
           '
+
+      - name: Run PHPUnit tests with SQLite driver using parser extension
+        run: php -d extension="$GITHUB_WORKSPACE/packages/php-ext-wp-mysql-parser/target/debug/libwp_mysql_parser.so" ./vendor/bin/phpunit -c ./phpunit.xml.dist
+        working-directory: packages/mysql-on-sqlite

.github/workflows/wp-tests-phpunit.yml

Lines changed: 2 additions & 2 deletions
@@ -54,8 +54,8 @@ jobs:
       - name: Build and load parser extension in WordPress PHP containers
         run: bash .github/workflows/wp-tests-phpunit-native-extension-setup.sh
 
-      - name: Verify WordPress uses parser extension
-        run: cd wordpress && node tools/local-env/scripts/docker.js run --rm php php /var/www/native-verify-extension.php
+      - name: Run WordPress PHPUnit tests with parser extension
+        run: node .github/workflows/wp-tests-phpunit-run.js
 
       - name: Stop Docker containers
         if: always()

packages/mysql-on-sqlite/src/mysql/class-wp-mysql-parser.php

Lines changed: 11 additions & 0 deletions
@@ -8,6 +8,17 @@ class WP_MySQL_Parser extends WP_Parser {
 	 */
 	private $current_ast;
 
+	/**
+	 * Reset this parser with a new token stream.
+	 *
+	 * @param array<WP_Parser_Token> $tokens The parser tokens.
+	 */
+	public function reset_tokens( array $tokens ): void {
+		$this->tokens = $tokens;
+		$this->position = 0;
+		$this->current_ast = null;
+	}
+
 	/**
 	 * Parse the next query from the input SQL string.
 	 *

packages/mysql-on-sqlite/src/sqlite/class-wp-pdo-mysql-on-sqlite.php

Lines changed: 25 additions & 2 deletions
@@ -410,6 +410,13 @@ class WP_PDO_MySQL_On_SQLite extends PDO {
 	 */
 	private static $mysql_grammar;
 
+	/**
+	 * A reusable parser instance for MySQL queries.
+	 *
+	 * @var WP_MySQL_Parser|null
+	 */
+	private $mysql_parser = null;
+
 	/**
 	 * The main database name.
 	 *
@@ -1160,11 +1167,27 @@ public function create_parser( string $query ): WP_MySQL_Parser {
 		);
 		if ( $lexer instanceof WP_MySQL_Native_Lexer ) {
 			$tokens = $lexer->native_token_stream();
-			return new WP_MySQL_Parser( self::$mysql_grammar, $tokens );
+			return $this->reset_or_create_parser( $tokens );
 		}
 
 		$tokens = $lexer->remaining_tokens();
-		return new WP_MySQL_Parser( self::$mysql_grammar, $tokens );
+		return $this->reset_or_create_parser( $tokens );
+	}
+
+	/**
+	 * Reset the reusable parser with new tokens or create it on first use.
+	 *
+	 * @param array<WP_Parser_Token>|object $tokens Parser tokens.
+	 * @return WP_MySQL_Parser A parser initialized for the token stream.
+	 */
+	private function reset_or_create_parser( $tokens ): WP_MySQL_Parser {
+		if ( null === $this->mysql_parser || ! method_exists( $this->mysql_parser, 'reset_tokens' ) ) {
+			$this->mysql_parser = new WP_MySQL_Parser( self::$mysql_grammar, $tokens );
+		} else {
+			$this->mysql_parser->reset_tokens( $tokens );
+		}
+
+		return $this->mysql_parser;
 	}
 
 	/**
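The reset-or-create logic in `reset_or_create_parser()` can be looked at in isolation. The sketch below uses hypothetical stand-in classes (`Sketch_Token_Parser`, `Sketch_Driver`) rather than the real driver, and it omits the `method_exists()` guard. It illustrates one consequence that follows from returning the same object each time: a parser handed out earlier is rewound as soon as the next query is prepared.

```php
<?php
// Hypothetical stand-in for a resettable parser; NOT the real WP_MySQL_Parser.
final class Sketch_Token_Parser {
	private $tokens;
	private $position = 0;

	public function __construct( array $tokens ) {
		$this->tokens = $tokens;
	}

	/** Replace the token stream and rewind to its start. */
	public function reset_tokens( array $tokens ): void {
		$this->tokens   = $tokens;
		$this->position = 0;
	}

	/** Return the next token, or null once the stream is exhausted. */
	public function next_token(): ?string {
		return $this->tokens[ $this->position++ ] ?? null;
	}
}

// Hypothetical stand-in for the driver that owns the reusable parser.
final class Sketch_Driver {
	/** @var Sketch_Token_Parser|null Created lazily on first use. */
	private $parser = null;

	public function create_parser( array $tokens ): Sketch_Token_Parser {
		if ( null === $this->parser ) {
			$this->parser = new Sketch_Token_Parser( $tokens );
		} else {
			$this->parser->reset_tokens( $tokens );
		}
		return $this->parser;
	}
}

$driver = new Sketch_Driver();

$first = $driver->create_parser( array( 'SELECT', '1' ) );
$first->next_token(); // Consumes 'SELECT' from the first stream.

// Preparing the second query rewinds the shared instance.
$second = $driver->create_parser( array( 'SELECT', '2' ) );
$same   = ( $first === $second );
```

In this sketch `$first` and `$second` are the same object, so any caller that still holds `$first` now reads from the second token stream. Whether the real driver's callers ever hold a parser across queries is not shown in this diff.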
