Skip to content

[#600] prevent case-sensitive testing with BETWEEN where infeasible#818

Merged
alanking merged 1 commit intoirods:mainfrom
d-w-moore:600.m
May 6, 2026
Merged

[#600] prevent case-sensitive testing with BETWEEN where infeasible#818
alanking merged 1 commit intoirods:mainfrom
d-w-moore:600.m

Conversation

@d-w-moore
Copy link
Copy Markdown
Collaborator

@d-w-moore d-w-moore commented Apr 27, 2026

Tests using the BETWEEN operator are now skipped if they are predicated on being case sensitive and if the installed object catalog ignores alphabetic case for query conditions dependent on lexical order (as tested by BETWEEN, >, >=, <, and <=.)

Comment thread irods/test/query_test.py Outdated
@korydraughn
Copy link
Copy Markdown
Contributor

It just occurred to me. Why does the PRC have tests which care about the database in use at all? Tests which care about that sort of thing should be part of the server.

@d-w-moore
Copy link
Copy Markdown
Collaborator Author

It just occurred to me. Why does the PRC have tests which care about the database in use at all? Tests which care about that sort of thing should be part of the server.

This doesn't care about which DB, just whether the server is evincing default case-insensitive behavior at the outset. If it is, there's no point in running the BETWEEN tests that will predictably error out.

If you agree more with Alan's temporary solution, in which we just comment those tests out so that they never run, I'm fine with that too. In that case, we just close this PR and forget about it.

I rather prefer a situation in which these tests run by default. Unless they shouldn't. Since Postsgres is the rule for the usual test scenario (i.e. our current set of PRC-test Github Actions) I prefer to keep the changes of this PR. But that is only my preference.

@alanking
Copy link
Copy Markdown
Contributor

It just occurred to me. Why does the PRC have tests which care about the database in use at all? Tests which care about that sort of thing should be part of the server.

Are you suggesting that this might be a bug in the server? It does seem odd that GenQuery would yield different results depending on database flavor. Kind of defeats the purpose a little bit.

@d-w-moore
Copy link
Copy Markdown
Collaborator Author

d-w-moore commented Apr 28, 2026

This doesn't care about which DB, just whether the server is evincing default case-insensitive behavior at the outset. If it is, there's no point in running the BETWEEN tests that will predictably error out.

I think what I've done in this PR is simply allow the BETWEEN operator to be tested independently of the LT, GT type comparisons that many on the Internet have asserted are equivalent (but see my caveat at the end of this comment.) So, we end up doing what I think was intended : make sure the operator is actually doing as-intended in interactions between client and server.

Of course, it will be moot if we find out that sometimes x BETWEEN a b is not semantically equivalent to

  a <= x <= b || b <= x <= a

in GenQuery/SQL.

@d-w-moore
Copy link
Copy Markdown
Collaborator Author

It just occurred to me. Why does the PRC have tests which care about the database in use at all? Tests which care about that sort of thing should be part of the server.

Are you suggesting that this might be a bug in the server? It does seem odd that GenQuery would yield different results depending on database flavor. Kind of defeats the purpose a little bit.

Not a server bug. Kory's original point was that BETWEEN is passed right to the database. aka, it's not the server's job to mediate between client expectations and various flavors of DB. Only to interface. True, @korydraughn ?

@trel
Copy link
Copy Markdown
Member

trel commented Apr 28, 2026

Looks like postgres and mysql will both return False if a is not less than b.

https://dev.mysql.com/doc/refman/8.4/en/comparison-operators.html#operator_between :

mysql> SELECT 2 BETWEEN 1 AND 3, 2 BETWEEN 3 and 1;
        -> 1, 0

https://www.postgresql.org/docs/current/functions-comparison.html#FUNCTIONS-COMPARISON-PRED-TABLE :

datatype BETWEEN datatype AND datatype → boolean
Between (inclusive of the range endpoints).
2 BETWEEN 1 AND 3 → t
2 BETWEEN 3 AND 1 → f

postgres offers BETWEEN SYMMETRIC to handle that case

https://www.postgresql.org/docs/current/functions-comparison.html#FUNCTIONS-COMPARISON-PRED-TABLE :

datatype BETWEEN SYMMETRIC datatype AND datatype → boolean
Between, after sorting the two endpoint values.
2 BETWEEN SYMMETRIC 3 AND 1 → t

So, i'm not sure the point of discussion about case-insensitivity... do we want to offer that? feels confusing to me.

@korydraughn
Copy link
Copy Markdown
Contributor

Are you suggesting that this might be a bug in the server? It does seem odd that GenQuery would yield different results depending on database flavor. Kind of defeats the purpose a little bit.

@alanking I don't necessarily think it's a bug. I'm pointing out that addressing behavioral differences for the BETWEEN keyword across databases feels like something that should be part of the core server tests rather than the PRC.

I think what I've done in this PR is simply allow the BETWEEN operator to be tested independently of the LT, GT type comparisons that many on the Internet have asserted are equivalent (but see my caveat at the end of this comment.) So, we end up doing what I think was intended : make sure the operator is actually doing as-intended in interactions between client and server.

@d-w-moore I need to hear more. Does this make the PRC special?

Not a server bug. Kory's original point was that BETWEEN is passed right to the database. aka, it's not the server's job to mediate between client expectations and various flavors of DB. Only to interface. True, @korydraughn ?

True. Now, I'm wondering if we should enforce our own rules around BETWEEN.

That said, let's discuss offline.

@d-w-moore
Copy link
Copy Markdown
Collaborator Author

Looks like postgres and mysql will both return False if a is not less than b.

https://dev.mysql.com/doc/refman/8.4/en/comparison-operators.html#operator_between :

mysql> SELECT 2 BETWEEN 1 AND 3, 2 BETWEEN 3 and 1;
        -> 1, 0

https://www.postgresql.org/docs/current/functions-comparison.html#FUNCTIONS-COMPARISON-PRED-TABLE :

datatype BETWEEN datatype AND datatype → boolean
Between (inclusive of the range endpoints).
2 BETWEEN 1 AND 3 → t
2 BETWEEN 3 AND 1 → f

postgres offers BETWEEN SYMMETRIC to handle that case

https://www.postgresql.org/docs/current/functions-comparison.html#FUNCTIONS-COMPARISON-PRED-TABLE :

datatype BETWEEN SYMMETRIC datatype AND datatype → boolean
Between, after sorting the two endpoint values.
2 BETWEEN SYMMETRIC 3 AND 1 → t

So, i'm not sure the point of discussion about case-insensitivity... do we want to offer that? feels confusing to me.

I wasn't aware of SYMMETRIC being available, but it makes sense. Wrt the discussion, it turns out to be apropos where strings are concerned: see in the discussion here the results of 'a' to 'A' comparisons in various default DB setups. They will surprise you.

Oh - final point - BETWEEN is polymorphic, as in having different behavior based on the data type to which it is applied. So case folding in MySQL is absolutely something that bears warning against in the docs.

@d-w-moore d-w-moore self-assigned this Apr 30, 2026
@trel
Copy link
Copy Markdown
Member

trel commented Apr 30, 2026

okay, so what is left to do in THIS PR? are we just deleting the tests?

@korydraughn
Copy link
Copy Markdown
Contributor

The PR now reflects the removal of those tests and we have an issue for documenting how users should approach the BETWEEN keyword.

When the tests pass, I think we'll be ready to squash.

@d-w-moore You agree?

@d-w-moore
Copy link
Copy Markdown
Collaborator Author

yep . will squash

@korydraughn
Copy link
Copy Markdown
Contributor

There's one unresolved (outdated) review comment too.

Copy link
Copy Markdown
Contributor

@korydraughn korydraughn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is ready?

Need an issue number in the commit message.

@d-w-moore
Copy link
Copy Markdown
Collaborator Author

Added issue number to commit message

Copy link
Copy Markdown
Contributor

@korydraughn korydraughn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pound it.

Not all databases are predictable as lexicographical order mixes up case in
arbitrarily varying ways.  'A' can be less than, greater than, or equal to 'a'
depending on the DB and/or its configuration.
@d-w-moore
Copy link
Copy Markdown
Collaborator Author

Pounded

@alanking alanking merged commit c19dd38 into irods:main May 6, 2026
16 checks passed
@d-w-moore
Copy link
Copy Markdown
Collaborator Author

okay, so what is left to do in THIS PR? are we just deleting the tests?

Just deletion, here. Notating dangers of dependending on DB case sensitivity in the iRODS docs proper with this PR irods/irods_docs#420

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

4 participants