Commit 104a872

Support MySQL BINARY operator (#369)
## Summary

MySQL supports `BINARY expr` as a shorthand for `CAST(expr AS BINARY)`, a type cast that forces byte-by-byte string comparison. The translator previously stripped the `BINARY` keyword entirely (`translate_token()` returned `null`), which silently produced wrong results in two cases:

1. **Columns with explicit `COLLATE NOCASE`** (notably `information_schema` tables, where `TABLE_NAME` is `NOCASE`): `WHERE TABLE_NAME = BINARY 'Foo'` became case-insensitive instead of case-sensitive.
2. **`CAST(x AS BINARY) = y`**: translated to `CAST(x AS BLOB) = y`, which always returned `FALSE` due to SQLite's storage-class ordering (`BLOB > TEXT`), even when the bytes were equal.

## Translation

`BINARY expr` is now emitted as `expr COLLATE BINARY`:

```sql
-- MySQL
SELECT * FROM t WHERE a = BINARY b
SELECT a FROM t ORDER BY BINARY a
SELECT BINARY 'abc'

-- Translated SQLite
SELECT * FROM `t` WHERE `a` = `b` COLLATE BINARY
SELECT `a` FROM `t` ORDER BY `a` COLLATE BINARY
SELECT 'abc' COLLATE BINARY AS `BINARY 'abc'`
```

`CAST(x AS BINARY)` and `CONVERT(x, BINARY)` now emit `CAST(x AS TEXT) COLLATE BINARY` via a shared helper. The value keeps `TEXT` storage class (so equality against `TEXT` works) while the explicit BINARY collation preserves byte-by-byte comparison semantics:

```sql
-- MySQL
SELECT CAST('abc' AS BINARY)
SELECT CONVERT('abc', BINARY)

-- Translated SQLite
SELECT CAST('abc' AS TEXT) COLLATE BINARY AS `CAST('abc' AS BINARY)`
SELECT CAST('abc' AS TEXT) COLLATE BINARY AS `CONVERT('abc', BINARY)`
```

SQLite's `COLLATE BINARY` overrides an operand's declared collation, so the `NOCASE`-column case in `information_schema` now works correctly.
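The SQLite behaviors this translation relies on can be checked outside the driver. The following is a minimal sketch using Python's standard `sqlite3` module rather than the PHP translator; the table `t` mirrors the one used in the tests, and the helper `scalar()` is a hypothetical convenience, not part of the project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT COLLATE NOCASE NOT NULL)")
conn.execute("INSERT INTO t (name) VALUES ('abc'), ('ABC')")


def scalar(sql):
    """Hypothetical helper: return the first column of the first row."""
    return conn.execute(sql).fetchone()[0]


# Declared NOCASE collation: plain "=" matches both rows.
nocase_count = scalar("SELECT COUNT(*) FROM t WHERE name = 'abc'")

# An explicit COLLATE BINARY overrides the column's declared collation.
binary_count = scalar("SELECT COUNT(*) FROM t WHERE name = 'abc' COLLATE BINARY")

# A BLOB never compares equal to a TEXT value, even with identical bytes.
blob_equal = scalar("SELECT CAST('abc' AS BLOB) = 'abc'")

# Keeping the TEXT storage class makes the equality work again.
text_equal = scalar("SELECT CAST('abc' AS TEXT) COLLATE BINARY = 'abc'")

print(nocase_count, binary_count, blob_equal, text_equal)  # prints: 2 1 0 1
```

The third query is the old `CAST(x AS BLOB) = y` output and the fourth is the new one, which is why the helper emits `TEXT` plus an explicit collation.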
## Additional fix

The last commit fixes a pre-existing bug in `translate_select_item()` surfaced during this work: its alias-inference heuristic (`$item === translate($textStringLiteral)`) misfired for `CONVERT(expr USING charset)`, producing alias `` `Customer` `` instead of the original expression text for `SELECT CONVERT('Customer' USING utf8mb4)`. Replaced with a structural walk that stops at a `textStringLiteral` only when each intermediate level has exactly one child node.

Fixes #31
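The structural walk can be illustrated on a toy tree. This is only a sketch in Python, assuming simplified stand-ins for the parser classes: `Node`, `Token`, `bare_text_literal()`, and the rule names other than `textStringLiteral` are hypothetical and do not reflect the real AST shape.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Token:
    value: str


@dataclass
class Node:
    rule_name: str
    children: List[Union["Node", Token]] = field(default_factory=list)


def bare_text_literal(node: Node) -> Optional[Node]:
    """Descend while each level has exactly one child node.

    Returns the textStringLiteral node if the item is a bare literal,
    or None as soon as a level has siblings or a token child.
    """
    current = node
    while current.rule_name != "textStringLiteral":
        children = current.children
        if len(children) != 1 or not isinstance(children[0], Node):
            return None
        current = children[0]
    return current


def literal(text: str) -> Node:
    return Node("textStringLiteral", [Token(text)])


# SELECT 'Customer': a single-child chain down to the literal.
bare = Node("selectItem", [Node("expr", [Node("simpleExpr", [literal("Customer")])])])

# SELECT CONVERT('Customer' USING utf8mb4): the literal has sibling tokens.
convert = Node("selectItem", [Node("expr", [Node("simpleExpr", [
    Token("CONVERT"), Token("("), Node("expr", [literal("Customer")]),
    Token("USING"), Node("charsetName", [Token("utf8mb4")]), Token(")"),
])])])

bare_result = bare_text_literal(bare)
convert_result = bare_text_literal(convert)
print(bare_result.children[0].value, convert_result)  # prints: Customer None
```

In the `CONVERT(... USING ...)` case the walk bails out at the level that has several children, so the select item keeps the original expression text as its alias.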
1 parent c214cda commit 104a872

3 files changed

Lines changed: 160 additions & 17 deletions

packages/mysql-on-sqlite/src/sqlite/class-wp-pdo-mysql-on-sqlite.php

Lines changed: 58 additions & 10 deletions
@@ -3960,9 +3960,9 @@ private function translate_token( WP_MySQL_Token $token ): ?string {
 				return 'AUTOINCREMENT';
 			case WP_MySQL_Lexer::BINARY_SYMBOL:
 				/*
-				 * There is no "BINARY expr" equivalent in SQLite. We look for the
-				 * keyword from a higher level to respect it in particular cases
-				 * (REGEXP, LIKE, etc.) and then remove it from the output here.
+				 * "BINARY expr" is translated in "translate_simple_expr_body()".
+				 * Returning null here is a safety net for any unhandled context
+				 * where a bare BINARY token would otherwise leak into the output.
 				 */
 				return null;
 			case WP_MySQL_Lexer::SQL_CALC_FOUND_ROWS_SYMBOL:
@@ -4310,6 +4310,24 @@ private function translate_simple_expr_body( WP_Parser_Node $node ): string {
 			);
 		}

+		/*
+		 * Translate "BINARY expr" to "expr COLLATE BINARY".
+		 *
+		 * The MySQL BINARY operator enforces byte-by-byte string comparison.
+		 * In SQLite, COLLATE BINARY is equivalent in comparison contexts.
+		 */
+		if ( null !== $token && WP_MySQL_Lexer::BINARY_SYMBOL === $token->id ) {
+			$expr = $node->get_first_child_node( 'simpleExpr' );
+			return sprintf( '%s COLLATE BINARY', $this->translate( $expr ) );
+		}
+
+		// Translate "CAST(expr AS type)" to its SQLite equivalent.
+		if ( null !== $token && WP_MySQL_Lexer::CAST_SYMBOL === $token->id ) {
+			$expr = $node->get_first_child_node( 'expr' );
+			$cast_type = $node->get_first_child_node( 'castType' );
+			return $this->translate_cast_expr( $expr, $cast_type );
+		}
+
 		/**
 		 * Translate MySQL CONVERT() expression.
 		 *
*
@@ -4318,23 +4336,44 @@ private function translate_simple_expr_body( WP_Parser_Node $node ): string {
 		 * 2. CONVERT(expr USING charset): Converts the character set.
 		 */
 		if ( null !== $token && WP_MySQL_Lexer::CONVERT_SYMBOL === $token->id ) {
-			$expr = $this->translate( $node->get_first_child_node( 'expr' ) );
+			$expr = $node->get_first_child_node( 'expr' );
 			$cast_type = $node->get_first_child_node( 'castType' );

 			if ( null !== $cast_type ) {
 				// CONVERT(expr, type): Translate to cast expression.
 				// TODO: Emulate UNSIGNED cast. SQLite has no unsigned integer type.
-				return sprintf( 'CAST(%s AS %s)', $expr, $this->translate( $cast_type ) );
+				return $this->translate_cast_expr( $expr, $cast_type );
 			} else {
 				// CONVERT(expr USING charset): Keep "expr" as is (no SQLite support).
 				// TODO: Consider rejecting UTF-8-incompatible charasets.
-				return $expr;
+				return $this->translate( $expr );
 			}
 		}

 		return $this->translate_sequence( $node->get_children() );
 	}

+	/**
+	 * Translate a MySQL CAST expression to SQLite.
+	 *
+	 * Shared by the CAST(expr AS type) and CONVERT(expr, type) forms.
+	 *
+	 * @param WP_Parser_Node $expr The "expr" AST node.
+	 * @param WP_Parser_Node $cast_type The "castType" AST node.
+	 * @return string The translated SQLite expression.
+	 */
+	private function translate_cast_expr( WP_Parser_Node $expr, WP_Parser_Node $cast_type ): string {
+		/*
+		 * Translate "CAST(expr AS BINARY)" to "CAST(expr AS TEXT) COLLATE BINARY".
+		 * Emitting "CAST(expr AS BLOB)" would break equality against TEXT values
+		 * due to SQLite's storage-class ordering (BLOB > TEXT).
+		 */
+		if ( $cast_type->has_child_token( WP_MySQL_Lexer::BINARY_SYMBOL ) ) {
+			return sprintf( 'CAST(%s AS TEXT) COLLATE BINARY', $this->translate( $expr ) );
+		}
+		return sprintf( 'CAST(%s AS %s)', $this->translate( $expr ), $this->translate( $cast_type ) );
+	}
+
 	/**
 	 * Translate a MySQL LIKE expression to SQLite.
 	 *
@@ -4693,11 +4732,20 @@ public function translate_select_item( WP_Parser_Node $node ): string {
 		 *
 		 * For example, for "SELECT 'abc'", the resulting column name is "abc"
 		 * in MySQL, but would be "'abc'" in SQLite if an alias was not used.
+		 *
+		 * Descend the AST until we reach a textStringLiteral. If at any level
+		 * we don't have a single child node, bail out; it's not a bare literal.
 		 */
-		$text_string_literal = $node->get_first_descendant_node( 'textStringLiteral' );
-		$is_text_string_literal = $text_string_literal && $item === $this->translate( $text_string_literal );
-		if ( $is_text_string_literal ) {
-			$alias = $text_string_literal->get_first_child_token()->get_value();
+		$current = $node;
+		while ( 'textStringLiteral' !== $current->rule_name ) {
+			$children = $current->get_children();
+			if ( 1 !== count( $children ) || ! $children[0] instanceof WP_Parser_Node ) {
+				break;
+			}
+			$current = $children[0];
+		}
+		if ( 'textStringLiteral' === $current->rule_name ) {
+			$alias = $current->get_first_child_token()->get_value();

 			// When the literal value contains a NULL byte, MySQL truncates the
 			// resulting identifier at the position of the first one of them.

packages/mysql-on-sqlite/tests/WP_SQLite_Driver_Tests.php

Lines changed: 49 additions & 4 deletions
@@ -281,15 +281,60 @@ public function testDeleteWithAliasAndLimit() {
 	}

 	public function testCastAsBinary() {
-		$this->assertQuery(
-			// Use a confusing alias to make sure it replaces only the correct token
-			"SELECT CAST('ABC' AS BINARY) as `binary`;"
-		);
+		// Use a confusing alias to make sure it replaces only the correct token
+		$this->assertQuery( "SELECT CAST('ABC' AS BINARY) as `binary`" );
 		$results = $this->engine->get_query_results();
 		$this->assertCount( 1, $results );
 		$this->assertEquals( 'ABC', $results[0]->binary );
 	}

+	public function testBinaryOperator() {
+		// Use a NOCASE column so plain "=" is case-insensitive. This makes the
+		// BINARY operator's effect (forcing byte-by-byte comparison) visible.
+		$this->assertQuery( 'CREATE TABLE t (name TEXT COLLATE NOCASE NOT NULL)' );
+		$this->assertQuery( "INSERT INTO t (name) VALUES ('abc'), ('ABC')" );
+
+		// Sanity: with NOCASE, plain "=" matches both rows.
+		$result = $this->assertQuery( "SELECT name FROM t WHERE name = 'abc' ORDER BY name" );
+		$this->assertCount( 2, $result );
+
+		// "= BINARY 'x'" forces byte-by-byte equality.
+		$result = $this->assertQuery( "SELECT name FROM t WHERE name = BINARY 'abc'" );
+		$this->assertCount( 1, $result );
+		$this->assertEquals( 'abc', $result[0]->name );
+
+		// "BINARY x = 'X'" is symmetric.
+		$result = $this->assertQuery( "SELECT name FROM t WHERE BINARY name = 'ABC'" );
+		$this->assertCount( 1, $result );
+		$this->assertEquals( 'ABC', $result[0]->name );
+
+		// "ORDER BY BINARY" sorts by byte value ('ABC' before 'abc').
+		$result = $this->assertQuery( 'SELECT name FROM t ORDER BY BINARY name' );
+		$this->assertEquals( 'ABC', $result[0]->name );
+		$this->assertEquals( 'abc', $result[1]->name );
+
+		// CAST(expr AS BINARY) = expr
+		$result = $this->assertQuery( "SELECT CAST('abc' AS BINARY) = 'abc' AS r" );
+		$this->assertEquals( 1, $result[0]->r );
+
+		// CONVERT(expr, BINARY) = expr
+		$result = $this->assertQuery( "SELECT CONVERT('abc', BINARY) = 'abc' AS r" );
+		$this->assertEquals( 1, $result[0]->r );
+
+		// The "information_schema.TABLES.TABLE_NAME" column uses COLLATE NOCASE.
+		// Let's verify that the BINARY override works for this scenario as well.
+		$this->assertQuery( 'CREATE TABLE CaseSensitive (id INT)' );
+
+		$result = $this->assertQuery( "SELECT TABLE_NAME FROM information_schema.TABLES WHERE TABLE_NAME = 'casesensitive'" );
+		$this->assertCount( 1, $result );
+
+		$result = $this->assertQuery( "SELECT TABLE_NAME FROM information_schema.TABLES WHERE TABLE_NAME = BINARY 'casesensitive'" );
+		$this->assertCount( 0, $result );
+
+		$result = $this->assertQuery( "SELECT TABLE_NAME FROM information_schema.TABLES WHERE TABLE_NAME = BINARY 'CaseSensitive'" );
+		$this->assertCount( 1, $result );
+	}
+
 	public function testSelectFromDual() {
 		$result = $this->assertQuery(
 			'SELECT 1 as output FROM DUAL'

packages/mysql-on-sqlite/tests/WP_SQLite_Driver_Translation_Tests.php

Lines changed: 53 additions & 3 deletions
@@ -104,7 +104,7 @@ public function testSelect(): void {
 	public function testConvert(): void {
 		// CONVERT(expr, type) → CAST(expr AS type)
 		$this->assertQuery(
-			"SELECT CAST('abc' AS BLOB) AS `CONVERT('abc', BINARY)`",
+			"SELECT CAST('abc' AS TEXT) COLLATE BINARY AS `CONVERT('abc', BINARY)`",
 			"SELECT CONVERT('abc', BINARY)"
 		);

@@ -120,12 +120,12 @@ public function testConvert(): void {

 		// CONVERT(expr USING charset) → expr
 		$this->assertQuery(
-			"SELECT 'Customer' AS `Customer`",
+			"SELECT 'Customer' AS `CONVERT('Customer' USING utf8mb4)`",
 			"SELECT CONVERT('Customer' USING utf8mb4)"
 		);

 		$this->assertQuery(
-			"SELECT 'test' AS `test`",
+			"SELECT 'test' AS `CONVERT('test' USING utf8)`",
 			"SELECT CONVERT('test' USING utf8)"
 		);

@@ -136,6 +136,56 @@ public function testConvert(): void {
 		);
 	}

+	public function testBinary(): void {
+		// "BINARY expr" on the left side of comparison
+		$this->assertQuery(
+			'SELECT `a` COLLATE BINARY = `b` AS `BINARY a = b` FROM `t`',
+			'SELECT BINARY a = b FROM t'
+		);
+
+		// "BINARY expr" on the right side of comparison
+		$this->assertQuery(
+			'SELECT `a` = `b` COLLATE BINARY AS `a = BINARY b` FROM `t`',
+			'SELECT a = BINARY b FROM t'
+		);
+
+		// "BINARY literal"
+		$this->assertQuery(
+			"SELECT 'abc' COLLATE BINARY AS `BINARY 'abc'`",
+			"SELECT BINARY 'abc'"
+		);
+
+		// "BINARY expr" in ORDER BY
+		$this->assertQuery(
+			'SELECT `a` FROM `t` ORDER BY `a` COLLATE BINARY',
+			'SELECT a FROM t ORDER BY BINARY a'
+		);
+
+		// "BINARY expr" in GROUP BY
+		$this->assertQuery(
+			'SELECT `a` FROM `t` GROUP BY `a` COLLATE BINARY',
+			'SELECT a FROM t GROUP BY BINARY a'
+		);
+
+		// "BINARY expr" wrapping a parenthesized expression
+		$this->assertQuery(
+			"SELECT ( `a` || `b` ) COLLATE BINARY = 'x' AS `BINARY (a || b) = 'x'` FROM `t`",
+			"SELECT BINARY (a || b) = 'x' FROM t"
+		);
+
+		// "CAST(expr AS BINARY)" → "CAST(expr AS TEXT) COLLATE BINARY"
+		$this->assertQuery(
+			"SELECT CAST('abc' AS TEXT) COLLATE BINARY AS `CAST('abc' AS BINARY)`",
+			"SELECT CAST('abc' AS BINARY)"
+		);
+
+		// "CAST(expr AS BINARY) = expr" → "CAST(expr AS TEXT) COLLATE BINARY = expr"
+		$this->assertQuery(
+			"SELECT CAST('abc' AS TEXT) COLLATE BINARY = 'abc' AS `CAST('abc' AS BINARY) = 'abc'`",
+			"SELECT CAST('abc' AS BINARY) = 'abc'"
+		);
+	}
+
 	public function testInsert(): void {
 		$this->driver->query( 'CREATE TABLE t (c INT, c1 INT, c2 INT)' );
 		$this->driver->query( 'CREATE TABLE t1 (c1 INT, c2 INT)' );
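
The `ORDER BY` and `GROUP BY` translations covered by these tests depend on `COLLATE BINARY` also overriding a declared collation in sorting and grouping. A minimal sketch of that behavior, again using Python's standard `sqlite3` module rather than the PHP driver, with a `NOCASE` table assumed for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT COLLATE NOCASE NOT NULL)")
conn.execute("INSERT INTO t (name) VALUES ('abc'), ('ABC')")

# ORDER BY ... COLLATE BINARY sorts by byte value: 'A' (0x41) before 'a' (0x61).
ordered = [row[0] for row in conn.execute(
    "SELECT name FROM t ORDER BY name COLLATE BINARY"
)]

# With the declared NOCASE collation, both rows fall into one group.
nocase_groups = len(conn.execute("SELECT name FROM t GROUP BY name").fetchall())

# With an explicit BINARY collation, they are grouped separately.
binary_groups = len(conn.execute(
    "SELECT name FROM t GROUP BY name COLLATE BINARY"
).fetchall())

print(ordered, nocase_groups, binary_groups)  # prints: ['ABC', 'abc'] 1 2
```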
