Skip to content

feat(sql): add SQLDialect for PostgreSQL setll/setgt optimisation#126

Draft
lanarimarco wants to merge 12 commits into
developfrom
feat/add-postgressqldialect
Draft

feat(sql): add SQLDialect for PostgreSQL setll/setgt optimisation#126
lanarimarco wants to merge 12 commits into
developfrom
feat/add-postgressqldialect

Conversation

@lanarimarco

Copy link
Copy Markdown
Collaborator

Summary

  • Introduces a SQLDialect interface so each database can generate optimal positioning SQL for setll/setgt operations
  • DefaultSQLDialect uses a UNION-based decomposition that correctly handles composite keys with lexicographic ordering (fixes a bug where getReadCoherentSql() applied AND >= per key independently instead of the correct UNION form)
  • PostgreSQLDialect uses row value constructors ((k1, k2) >= (?, ?)) — semantically correct and index-efficient; wraps each query execution with a scoped autoCommit=false toggle required for server-side cursors, with fetchSize=100 for streaming
  • reload.dialect.enabled property allows opting out of dialect selection per connection
  • SQLDBMManager derives the dialect from the JDBC URL via SQLDialect.forUrl() and passes it down to SQLDBFileNative2SQLAdapter
  • SQLDBFile applies the fetchSize hint and issues an explicit commit() after write/update/delete when autoCommit is temporarily disabled
  • Unit tests cover all four SETLL/SETGT × forward/backward combinations and pin the exact WHERE clause shape for strict-exclusion cases

Test plan

  • PostgreSQLDialectTest — unit tests for buildPositioningConditions (all four positioning combinations)
  • SQLReadEqualTest — verify corrected row count for setllReadeWithLessKeyEof (409 instead of 155, which was a composite-key ordering bug)
  • Run SQL module tests: mvn test -pl sql
  • For a PostgreSQL instance: set PG_USER, PG_PASSWORD, PG_DIALECT_ENABLED=true and confirm setll/setgt/reade/readp behave correctly with composite keys

claude and others added 12 commits June 29, 2026 16:11
Introduces a SQLDialect interface so each database can generate
optimal positioning SQL for setll/setgt operations:

- DefaultSQLDialect: UNION-based approach that correctly handles
  composite keys with lexicographic ordering. Fixes a bug where
  getReadCoherentSql() generated AND >= for each key independently
  instead of the correct UNION decomposition.

- PostgreSQLDialect: uses row value constructors
  e.g. ("k1","k2","k3") >= (?,?,?) — correct and index-efficient.
  Also sets autoCommit=false (required for server-side cursors)
  and fetchSize=100 for efficient streaming.

Other changes:
- SQLDBMManager: derives dialect from JDBC URL via SQLDialect.forUrl();
  calls configureConnection() after opening the connection; passes
  dialect to SQLDBFile.
- SQLDBFile: accepts dialect; passes it to Native2SQL; applies
  fetchSize hint in executeQuery(); adds explicit commit() after
  write/update/delete when autoCommit=false.
- pom.xml: removes unused testcontainers:junit-4 dependency (not
  resolvable; not needed since lifecycle is managed via @BeforeClass).
- SQLReadEqualTest: updates expected count for setllReadeWithLessKeyEof
  from 155 to 409 — the old value reflected the composite-key bug
  (only CS province), the new value correctly covers all Calabrian
  provinces from CS onwards (CS, CZ, KR, RC, VV).
- SQLPostgreSQLDialectTest: new Testcontainers-based test that spins
  up a real PostgreSQL Docker container and verifies setll/setgt/
  reade/readp behavior with 4-key composite municipality data.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Pg4KVk3EVDhG8CCkqU261D
… dialect opt-out

- Replace configureConnection with withQueryExecution so PostgreSQLDialect
  saves and restores autoCommit around each query instead of setting it
  permanently on connection open
- Remove dialect.configureConnection call from SQLDBMManager connection setup
- Add reload.dialect.enabled property to allow disabling dialect per connection
- Remove testcontainers-based SQLPostgreSQLDialectTest and its dependency
  in favour of env-var driven connection config (PG_USER, PG_PASSWORD,
  PG_DIALECT_ENABLED)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Cover all four SETLL/SETGT × forward/backward combinations:
- SETLL forward  → >= without NOT exclusion
- SETGT forward  → >= with NOT exclusion (params duplicated)
- SETLL backward → <= with NOT exclusion (params duplicated)
- SETGT backward → <= without NOT exclusion

Also include two SQL-shape assertions that pin the exact WHERE clause
produced for the strict-exclusion cases.
- Add destroyDatabase lambda to POSTGRESQL config to drop/recreate schema
- Fix destroyView and destroyIndex to execute the prepared statement (were no-ops)
- Drop before create in createFile and createAndPopulateEmployeeView to allow re-runs
- Add PostgreSQL CLUSTER step after employee view creation for stable ORDER BY on ties
- Quote table and column identifiers in SQLExecuteQueryTest for PG compatibility
- Replace hardcoded TestSQLDBType.HSQLDB with dbManagerForTest() in SQLJAsinchLoggingTest
Set idle_in_transaction_session_timeout on first query and tolerate a
dead connection when restoring autoCommit, so an abandoned ResultSet
can't hold locks indefinitely. Route executeQuery's reset through the
existing closeResultSet() helper so the dialect hook fires.
createDatabase now (re)creates the public schema for the PostgreSQL
test dialect, guarding against a prior test run's destroyDatabase step
having dropped it. This makes the test suite robust when run
repeatedly against a real PostgreSQL instance in CI, instead of
assuming the schema is always left in place between runs.
closeFile() and the close() cleanup loop were no-ops (commented out
since Dec 2023), so opened SQLDBFile instances were never told to
close their ResultSet/PreparedStatements. On PostgreSQL this left
transactions dangling until idle_in_transaction_session_timeout
killed them, blocking subsequent DDL (e.g. test schema teardown) for
up to 30s. openFile now tracks the returned SQLDBFile so closeFile/
close can release it, closing any stale handle for the same name
first.
Replace PostgreSQLDialect's per-query beforeQuery/afterResultSetClose
reference counting with onConnectionOpened/onConnectionClosing, invoked
exactly once per connection instead of once per query/ResultSet.

Add DBMManager.abort() (defaults to close()) so a unit of work can be
rolled back instead of committed. ConnectionProvider.withScope now
calls abort() on every manager it created when the scoped block
throws, and close() on normal completion — needed because other
components can write directly against the same shared connection via
ConnectionProvider.currentDataSource()/requireDataSource(), and a
failure partway through a scope must not silently commit their
half-finished work.

SQLPooledDBMManager now also closes its openedFiles before returning
the connection to the pool, matching the equivalent fix already made
to the base SQLDBMManager.
SQLDBMManager connections now stay in one transaction for their whole
lifetime instead of per query, so a test class that never closed its
dbManager left the connection idle-in-transaction indefinitely. The
next test class's DDL (DROP/CREATE TABLE) would then block on that
held lock until PostgreSQL's idle_in_transaction_session_timeout
(30s) killed the abandoned connection — a real, systematic ~30s stall
per test-class boundary, confirmed via a real PostgreSQL run.

Close dbManager before destroyDatabase() in @afterclass for every
affected test class.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants