[SPARK-57738][CONNECT] Restore fast-fail guard for nanosecond timestamp types in ArrowVectorReader by jubins · Pull Request #56849 · apache/spark

jubins · 2026-06-28T15:29:25Z

What is the purpose of the change

Fixes SPARK-57738 — restores the fast-fail guard for nanosecond-precision timestamp types in ArrowVectorReader, which was silently broken by SPARK-57303.

SPARK-57303 updated UpCastRule.canUpCast to return true for lossless widening within the timestamp family (e.g. TimestampType -> TimestampLTZNanosType(p)). As a side effect, the existing unsupported-type guard in ArrowVectorReader.applyDefault no longer rejects nanosecond timestamp targets — the SPARK-57303 commit message explicitly flagged this as a known follow-up item.

Without this fix, a request to read a TIMESTAMP_LTZ(p) or TIMESTAMP_NTZ(p) (p in [7, 9]) column over Spark Connect silently passes the guard and then crashes with a confusing "Unsupported Vector Type" error from the catch-all branch of the vector match. With this fix it fails fast with a clear "not yet supported" message.

Brief change log

sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowVectorReader.scala: added AnyTimestampNanoType to the import and inserted an explicit rejection guard between the canUpCast check and the vector match block
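
The ordering described above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions, not the code from the PR: the `DataType` hierarchy, `ToyVector`, `canUpCast`, `readerFor`, and the `guardEnabled` flag are all simplified, hypothetical stand-ins that borrow the names used in this description. It shows why a widened `canUpCast` lets a nanosecond target reach the catch-all branch, and how an explicit check placed before the match turns that into a clear failure.

```scala
// Toy stand-ins for Spark's data types; these are NOT the real Spark classes.
sealed trait DataType
case object TimestampType extends DataType
case object StringType extends DataType
sealed trait AnyTimestampNanoType extends DataType
final case class TimestampLTZNanosType(precision: Int) extends AnyTimestampNanoType
final case class TimestampNTZNanosType(precision: Int) extends AnyTimestampNanoType

// Toy stand-ins for Arrow vectors, each knowing the type it carries.
sealed trait ToyVector { def dataType: DataType }
case object MicrosTimestampVector extends ToyVector { val dataType: DataType = TimestampType }
case object Utf8Vector extends ToyVector { val dataType: DataType = StringType }

object GuardSketch {
  // Toy version of the widened rule: timestamp -> nanos timestamp is treated as lossless.
  def canUpCast(from: DataType, to: DataType): Boolean = (from, to) match {
    case (a, b) if a == b => true
    case (TimestampType, _: AnyTimestampNanoType) => true
    case _ => false
  }

  // Hypothetical reader factory; returns a label instead of a real reader.
  // guardEnabled exists only so the sketch can show behaviour with and without the guard.
  def readerFor(vector: ToyVector, target: DataType, guardEnabled: Boolean = true): String = {
    // Step 1: the pre-existing check, which nanos targets now pass.
    if (!canUpCast(vector.dataType, target)) {
      throw new RuntimeException(s"Cannot read ${vector.dataType} as $target")
    }
    // Step 2: the explicit rejection, placed before the match.
    if (guardEnabled) {
      target match {
        case _: AnyTimestampNanoType =>
          throw new RuntimeException(s"$target is not yet supported")
        case _ => ()
      }
    }
    // Step 3: the match with a catch-all branch.
    (vector, target) match {
      case (MicrosTimestampVector, TimestampType) => "timestamp reader"
      case (Utf8Vector, StringType) => "string reader"
      case _ => throw new RuntimeException("Unsupported Vector Type")
    }
  }
}
```

In this toy, `GuardSketch.readerFor(MicrosTimestampVector, TimestampLTZNanosType(9))` fails with a "not yet supported" message, while the same call with `guardEnabled = false` falls through to "Unsupported Vector Type".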

Verifying this change

No pre-existing unit tests covered ArrowVectorReader directly. This PR adds ArrowVectorReaderSuite with three cases:

- "ArrowVectorReader rejects TimestampLTZNanosType with a clear error": asserts that passing a TimestampLTZNanosType(9) target throws a RuntimeException with "not yet supported" in the message, rather than falling through to the generic "Unsupported Vector Type" crash
- "ArrowVectorReader rejects TimestampNTZNanosType with a clear error": same check for TimestampNTZNanosType(7)
- "ArrowVectorReader still succeeds for plain TimestampType": sanity-checks that the guard does not regress the existing supported path
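
The shape of those three checks can be sketched in plain Scala without any test framework. This is an illustration only: `openReader` is a hypothetical stand-in keyed on type names, since the real entry point and its signature are not shown in this description, and the actual suite may be structured quite differently.

```scala
import scala.util.{Failure, Try}

object ReaderChecksSketch {
  // Hypothetical stand-in for constructing a reader for a target type name.
  def openReader(targetTypeName: String): String = targetTypeName match {
    case n if n.startsWith("TIMESTAMP_LTZ(") || n.startsWith("TIMESTAMP_NTZ(") =>
      throw new RuntimeException(s"$n is not yet supported")
    case "TIMESTAMP" => "timestamp reader"
    case other => throw new RuntimeException(s"Unsupported Vector Type: $other")
  }

  // True when opening a reader fails with a RuntimeException
  // whose message mentions "not yet supported".
  def failsFast(targetTypeName: String): Boolean =
    Try(openReader(targetTypeName)) match {
      case Failure(e: RuntimeException) => e.getMessage.contains("not yet supported")
      case _ => false
    }

  def main(args: Array[String]): Unit = {
    assert(failsFast("TIMESTAMP_LTZ(9)"))          // case 1: LTZ nanos rejected clearly
    assert(failsFast("TIMESTAMP_NTZ(7)"))          // case 2: NTZ nanos rejected clearly
    assert(Try(openReader("TIMESTAMP")).isSuccess) // case 3: supported path unaffected
  }
}
```

Checking the message text, and not just the exception type, is what separates the fast-fail path from the generic catch-all, because in this sketch both raise a RuntimeException.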

Does this pull request potentially affect one of the following parts

Dependencies (does it add or upgrade a dependency): no
The public API, i.e., is any changed class annotated with @Public/@Evolving: no — ArrowVectorReader is private[connect]
The serializers: no
The runtime per-record code paths (performance sensitive): no — the guard only fires for an unsupported type that cannot currently be produced
Anything that affects deployment or recovery: no
The S3 file system connector: no

Documentation

Does this pull request introduce a new feature? No — this is a bug fix restoring a guard that was inadvertently disabled by SPARK-57303.

Was generative AI tooling used to co-author this PR?

Yes — Claude Code was used as a pair-programming assistant. All code was written, understood, and verified by the author.
Generated-by: Claude Opus 4.8


Run with: build/sbt 'connect-client-jvm/testOnly *ArrowVectorReaderSuite'

jubins added 2 commits June 28, 2026 08:28

Added tests

e17ac75


jubins wants to merge 2 commits into apache:master from jubins:j-SPARK-57738-arrow-vector-reader
