[SPARK-57738][CONNECT] Restore fast-fail guard for nanosecond timestamp types in ArrowVectorReader #56849
Open
jubins wants to merge 2 commits into
…mp types in ArrowVectorReader

### What is the purpose of the change

Fixes SPARK-57738: restores the fast-fail guard for nanosecond-precision timestamp types in `ArrowVectorReader`, which was silently broken by SPARK-57303.

SPARK-57303 updated `UpCastRule.canUpCast` to return `true` for lossless widening within the timestamp family (e.g. `TimestampType -> TimestampLTZNanosType(p)`). As a side effect, the existing unsupported-type guard in `ArrowVectorReader.applyDefault` no longer rejects nanosecond timestamp targets. The SPARK-57303 commit message explicitly flagged this as a known follow-up item.

Without this fix, a request to read a `TIMESTAMP_LTZ(p)` or `TIMESTAMP_NTZ(p)` (`p` in `[7, 9]`) column over Spark Connect silently passes the guard and then crashes with a confusing `"Unsupported Vector Type"` error from the catch-all branch of the `vector match`. With this fix it fails fast with a clear `"not yet supported"` message.

### Brief change log

- `sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowVectorReader.scala`: added `AnyTimestampNanoType` to the import and inserted an explicit rejection guard between the `canUpCast` check and the `vector match` block

### Verifying this change

No existing unit tests cover `ArrowVectorReader` directly. The fix is a defensive guard on an unsupported code path (nanosecond-precision timestamps are not yet reachable over Connect in any supported workflow), so the primary verification is:

- Manual inspection: the guard fires before the `vector match`, so no nanosecond type can reach the `"Unsupported Vector Type"` catch-all
- The fix will be superseded and removed when Connect nanos support is implemented (the comment in the code points to this)

### Does this pull request potentially affect one of the following parts

- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public`/`@Evolving`: no, `ArrowVectorReader` is `private[connect]`
- The serializers: no
- The runtime per-record code paths (performance sensitive): no, the guard only fires for an unsupported type that cannot currently be produced
- Anything that affects deployment or recovery: no
- The S3 file system connector: no

### Documentation

Does this pull request introduce a new feature? No. This is a bug fix restoring a guard that was inadvertently disabled by SPARK-57303.

### Was generative AI tooling used to co-author this PR?

Yes. Claude Code was used as a pair-programming assistant. All code was written, understood, and verified by the author.

Generated-by: Claude Sonnet 4.6
Run with: build/sbt 'connect-client-jvm/testOnly *ArrowVectorReaderSuite'
What is the purpose of the change
Fixes SPARK-57738: restores the fast-fail guard for nanosecond-precision timestamp types in `ArrowVectorReader`, which was silently broken by SPARK-57303.
SPARK-57303 updated `UpCastRule.canUpCast` to return `true` for lossless widening within the timestamp family (e.g. `TimestampType -> TimestampLTZNanosType(p)`). As a side effect, the existing unsupported-type guard in `ArrowVectorReader.applyDefault` no longer rejects nanosecond timestamp targets. The SPARK-57303 commit message explicitly flagged this as a known follow-up item.
Without this fix, a request to read a `TIMESTAMP_LTZ(p)` or `TIMESTAMP_NTZ(p)` (`p` in `[7, 9]`) column over Spark Connect silently passes the guard and then crashes with a confusing `"Unsupported Vector Type"` error from the catch-all branch of the `vector match`. With this fix it fails fast with a clear `"not yet supported"` message.
Brief change log
- `sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow/ArrowVectorReader.scala`: added `AnyTimestampNanoType` to the import and inserted an explicit rejection guard between the `canUpCast` check and the `vector match` block
Verifying this change
No pre-existing unit tests covered `ArrowVectorReader` directly. This PR adds `ArrowVectorReaderSuite` with three cases:
- `ArrowVectorReader rejects TimestampLTZNanosType with a clear error`: asserts that passing a `TimestampLTZNanosType(9)` target throws a `RuntimeException` with `"not yet supported"` in the message, rather than falling through to the generic `"Unsupported Vector Type"` crash
- `ArrowVectorReader rejects TimestampNTZNanosType with a clear error`: same check for `TimestampNTZNanosType(7)`
- `ArrowVectorReader still succeeds for plain TimestampType`: sanity-checks that the guard does not regress the existing supported path
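The check ordering and the three cases can be illustrated with a toy Scala model. This is only a sketch under stated assumptions: every name in it (`GuardSketch`, `SketchType`, `MicrosTimestamp`, `LtzNanosTimestamp`, `NtzNanosTimestamp`, `openReader`, `messageOf`) is a hypothetical stand-in, not Spark's `ArrowVectorReader`, `UpCastRule`, or type classes, and the exception type and message wording are assumed. It only shows why a rejection placed between the up-cast check and the vector match turns the generic failure into a clear one.

```scala
// Toy model only: all types and functions here are hypothetical stand-ins,
// not Spark's actual classes or the code changed in this PR.
object GuardSketch {
  sealed trait SketchType
  case object MicrosTimestamp extends SketchType
  final case class LtzNanosTimestamp(precision: Int) extends SketchType
  final case class NtzNanosTimestamp(precision: Int) extends SketchType

  sealed trait SketchVector
  case object MicrosVector extends SketchVector

  // True for the nanosecond-precision stand-in types.
  def isNanos(t: SketchType): Boolean = t match {
    case LtzNanosTimestamp(_) | NtzNanosTimestamp(_) => true
    case _ => false
  }

  // Models the widened rule: identical types, or micros widening to nanos.
  def canUpCast(from: SketchType, to: SketchType): Boolean =
    from == to || (from == MicrosTimestamp && isNanos(to))

  def openReader(vector: SketchVector, vectorType: SketchType, target: SketchType): String = {
    // 1. Up-cast check: no longer rejects nanosecond targets on its own.
    if (!canUpCast(vectorType, target)) {
      throw new RuntimeException(s"Cannot up cast $vectorType to $target")
    }
    // 2. Explicit guard: reject nanosecond targets before the match below.
    if (isNanos(target)) {
      throw new RuntimeException(s"$target is not yet supported")
    }
    // 3. Vector match with a catch-all for everything unhandled.
    (vector, target) match {
      case (MicrosVector, MicrosTimestamp) => "micros-reader"
      case _ => throw new RuntimeException("Unsupported Vector Type")
    }
  }

  // Returns the reader name on success, or the error message on failure.
  def messageOf(target: SketchType): String =
    try openReader(MicrosVector, MicrosTimestamp, target)
    catch { case e: RuntimeException => e.getMessage }
}
```

In this model, removing step 2 makes both nanosecond targets fall through to the catch-all in step 3, which corresponds to the confusing failure described above.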
Does this pull request potentially affect one of the following parts
- The public API, i.e., is any changed class annotated with `@Public`/`@Evolving`: no, `ArrowVectorReader` is `private[connect]`
Documentation
Does this pull request introduce a new feature? No — this is a bug fix restoring a guard that was inadvertently disabled by SPARK-57303.
Was generative AI tooling used to co-author this PR?
Generated-by: Claude Opus 4.8