[SPARK-57735][SQL] Support nanosecond-precision timestamp types in the in-memory columnar cache by viirya · Pull Request #56842 · apache/spark

viirya · 2026-06-28T03:59:03Z

What changes were proposed in this pull request?

The default in-memory columnar cache serializer (DefaultCachedBatchSerializer) did not support TimestampNTZNanosType / TimestampLTZNanosType. Caching a DataFrame with such a column failed at materialization with "not support type: TimestampNTZNanosType(9)", because none of the cache's type-dispatch sites had a case for them.

This adds full support, following the fixed-width multi-field pattern already used by CalendarInterval. The physical value TimestampNanosVal is a fixed 16-byte payload (an 8-byte epochMicros plus an 8-byte word holding nanosWithinMicro), so it maps cleanly onto that pattern:

ColumnType: a TIMESTAMP_NANOS column type (with TIMESTAMP_NTZ_NANOS / TIMESTAMP_LTZ_NANOS singletons) whose append/extract read and write the 16-byte payload, with a MutableUnsafeRow direct-copy fast path.
ColumnBuilder, ColumnAccessor: builder and accessor classes plus dispatch cases.
ColumnStats: a TimestampNanosColumnStats collector (fixed size, no min/max bounds).
GenerateColumnAccessor: the codegen accessor-class selection and initialization branch.

TIMESTAMP_NTZ and TIMESTAMP_LTZ nanos types share the same storage and differ only by physical type and row getter/setter, so the encode/decode logic is shared between them.
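
For readers unfamiliar with the layout, the 16-byte payload described above can be sketched roughly as follows. This is only an illustration using `java.nio.ByteBuffer`: `NanosTs` and `NanosTsCodec` are hypothetical names, not the classes added by this PR, and the real code goes through Spark's own buffer and row helpers.

```scala
import java.nio.{ByteBuffer, ByteOrder}

// Hypothetical stand-in for the physical value (not Spark's TimestampNanosVal).
final case class NanosTs(epochMicros: Long, nanosWithinMicro: Short)

object NanosTsCodec {
  // Fixed width: one 8-byte word for epochMicros, one for nanosWithinMicro.
  val Size: Int = 16

  // Append writes [epochMicros: 8 bytes][nanosWithinMicro widened to long: 8 bytes].
  def append(v: NanosTs, buf: ByteBuffer): Unit = {
    buf.putLong(v.epochMicros)
    buf.putLong(v.nanosWithinMicro.toLong)
  }

  // Extract reads both words and narrows the second one back to a short.
  def extract(buf: ByteBuffer): NanosTs = {
    val micros = buf.getLong()
    val nanos = buf.getLong().toShort
    NanosTs(micros, nanos)
  }

  // Writes all values into one buffer, then reads them back in order.
  def roundTrip(values: Seq[NanosTs]): Seq[NanosTs] = {
    val buf = ByteBuffer.allocate(values.size * Size).order(ByteOrder.nativeOrder())
    values.foreach(append(_, buf))
    buf.flip()
    values.map(_ => extract(buf))
  }
}
```

Because both fields occupy full 8-byte words, the NTZ and LTZ variants can share this encode/decode logic and differ only in how the value is read from or written to a row.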

Why are the changes needed?

Nanosecond-precision timestamp types are otherwise unsupported by the cache, so df.cache() on a column of these types throws. With this change such DataFrames cache and read back correctly, consistent with the microsecond TIMESTAMP_NTZ / TIMESTAMP types which the cache already supports.

Does this PR introduce any user-facing change?

Yes. Previously, caching a DataFrame containing a TIMESTAMP_NTZ(p) / TIMESTAMP_LTZ(p) column with p in the nanosecond range threw "not support type". Now it caches and reads back the values, including sub-microsecond precision.

How was this patch tested?

ColumnTypeSuite: append/extract round-trip for TIMESTAMP_NTZ_NANOS and TIMESTAMP_LTZ_NANOS (random values), plus defaultSize checks.
InMemoryColumnarQuerySuite: an end-to-end cache roundtrip for both nanos types, with the vectorized reader both on and off, covering sub-microsecond precision and null values.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code


dongjoon-hyun · 2026-06-28T05:38:52Z

+    withSQLConf(SQLConf.TIMESTAMP_NANOS_TYPES_ENABLED.key -> "true") {
+      Seq("TIMESTAMP_NTZ(9)", "TIMESTAMP_LTZ(9)").foreach { typeName =>
+        Seq("false", "true").foreach { vectorized =>
+          withSQLConf(SQLConf.CACHE_VECTORIZED_READER_ENABLED.key -> vectorized) {


SQLConf.CACHE_VECTORIZED_READER_ENABLED.key=true seems to be dead test coverage because of the following. Could you double-check, @viirya?

spark/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala

Lines 175 to 182 in 64fb953

    override def supportsColumnarOutput(schema: StructType): Boolean = schema.fields.forall(f =>
      f.dataType match {
        // More types can be supported, but this is to match the original implementation that
        // only supported primitive types "for ease of review"
        case BooleanType | ByteType | ShortType | IntegerType | LongType |
             FloatType | DoubleType => true
        case _ => false
      })

You're right, thanks. Nanosecond timestamps are non-primitive for the default cache, and DefaultCachedBatchSerializer.supportsColumnarOutput returns true only for the primitive types, so they always read back through the row path -- the CACHE_VECTORIZED_READER_ENABLED=true case exercised the same path as false. I've dropped the loop and test the single (row) path, with a comment noting why (same as CalendarInterval/Variant/Decimal).
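
To make the point concrete, here is a toy model of why the flag cannot change the read path for a non-primitive column. The `Toy*` types and `ReadPath` are hypothetical stand-ins, and the rule in `choose` (columnar only when the flag is on and every field is primitive) is an assumption for illustration, not a copy of Spark's planner logic.

```scala
// Hypothetical type tags standing in for Spark's DataType hierarchy.
sealed trait ToyType
case object ToyInt extends ToyType
case object ToyLong extends ToyType
case object ToyDouble extends ToyType
case object ToyTimestampNanos extends ToyType

object ReadPath {
  // Same shape as the quoted check: only primitive types qualify.
  def supportsColumnarOutput(schema: Seq[ToyType]): Boolean = schema.forall {
    case ToyInt | ToyLong | ToyDouble => true
    case _ => false
  }

  // Assumed rule: the columnar path needs both the flag and a qualifying schema.
  def choose(schema: Seq[ToyType], vectorizedEnabled: Boolean): String =
    if (vectorizedEnabled && supportsColumnarOutput(schema)) "columnar" else "row"
}
```

Under that assumption, `ReadPath.choose(Seq(ToyTimestampNanos), vectorizedEnabled = true)` and the same call with `false` both select the row path, which is why looping over the flag added no coverage.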

…the nanos cache test

The cache test looped over CACHE_VECTORIZED_READER_ENABLED true/false, but nanosecond timestamps are non-primitive for the default cache (DefaultCachedBatchSerializer.supportsColumnarOutput returns true only for primitive types), so they always read back through the row path regardless of that flag -- the two cases exercised the same path. Test the single (row) path and document why, matching CalendarInterval/Variant/Decimal.

Co-authored-by: Claude Code

MaxGekk

0 blocking, 1 non-blocking, 0 nits.
Correct, complete addition that follows the fixed-width CalendarInterval cache pattern. I verified that the on-buffer 16-byte layout ([epochMicros][nanosWithinMicro→long]) is byte-identical to UnsafeRow's TimestampNanosRowValues payload, so the MutableUnsafeRow direct-copy fast path matches the slow path. Every row-path dispatch site is wired; correctly, no columnar/vectorized path is added (the type is non-primitive); and the tests are meaningful.

Design / architecture (1)

ColumnStats.scala:329: TimestampNanosColumnStats collects no min/max bounds — see inline.

Verification

Traced the MutableUnsafeRow fast path: append writes [epochMicros:8][nanosWithinMicro.toLong:8], which equals TimestampNanosRowValues.writePayload (Platform.putLong(epochMicros) then Platform.putLong(nanosWithinMicro)), so the direct 16-byte copy and the slow path (fromTrustedRowBytes / setTimestampNanosPayload) produce identical rows; the (short) narrowing on read matches readNanosWithinMicro. Endianness is consistent (both go through Platform; codegen wraps with .order(nativeOrder)).

MaxGekk · 2026-06-28T07:01:01Z

    Array[Any](null, null, nullCount, count, sizeInBytes)
 }

+private[columnar] final class TimestampNanosColumnStats extends ColumnStats {


TimestampNanosColumnStats emits null/null for lower/upper (the CalendarInterval / IntervalColumnStats pattern), so cached nanosecond-timestamp columns get no batch-level partition pruning.

The same logical type at micro precision takes a different path: TimestampType/TimestampNTZType -> LongColumnBuilder -> LongColumnStats, which collects min/max. So a range filter (WHERE ts > '...') over a cached TIMESTAMP_NTZ(6) column skips non-matching batches, while the same filter over a cached TIMESTAMP_NTZ(9) column scans every batch.

TimestampNanosVal is Comparable (its total order is calendar order), and ordered non-primitive cache types already keep bounds — DecimalColumnStats collects Decimal min/max. So tracking upper/lower as TimestampNanosVal here (modeled on DecimalColumnStats rather than IntervalColumnStats) would preserve the pruning the micro path provides.

Not a correctness issue — the feature works. Is the bounds-less choice intentional (follow CalendarInterval), or worth collecting min/max so cached nanos timestamps prune like micro timestamps?

Good point -- collecting min/max is the right call, thanks. You're right that the bounds-less version was a regression from the micro path: TIMESTAMP_NTZ(6) prunes via LongColumnStats while TIMESTAMP_NTZ(9) scanned every batch.

Following your suggestion, TimestampNanosColumnStats now collects upper/lower as TimestampNanosVal (modeled on DecimalColumnStats rather than IntervalColumnStats), using its compareTo (which is calendar order). The pruning path is already wired for it -- TimestampNTZNanosType is an AtomicType so ExtractableLiteral extracts the literal, and PhysicalTimestampNTZNanosType defines an ordering, so the bound comparisons buildFilter generates are valid -- so cached nanos timestamps now prune like micro timestamps.

Added coverage: ColumnStatsSuite asserts the min/max bounds for both NTZ and LTZ, and PartitionBatchPruningSuite verifies a range filter over a cached nanos column reads fewer batches with in-memory partition pruning on than off (and returns the same rows as a pre-cache evaluation).
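
As a rough illustration of the idea discussed in this thread, the sketch below collects per-batch lower/upper bounds for an ordered nanosecond timestamp and uses them to skip batches for a `ts > literal` filter. `TsNanos`, `BoundsCollector`, and `Pruning` are hypothetical names; the actual change uses TimestampNanosVal, TimestampNanosColumnStats, and the existing buildFilter machinery.

```scala
// Hypothetical ordered value: compare epochMicros first, then nanosWithinMicro.
final case class TsNanos(epochMicros: Long, nanosWithinMicro: Short) extends Ordered[TsNanos] {
  def compare(that: TsNanos): Int = {
    val c = java.lang.Long.compare(epochMicros, that.epochMicros)
    if (c != 0) c else java.lang.Short.compare(nanosWithinMicro, that.nanosWithinMicro)
  }
}

// Per-batch collector that tracks min/max over non-null values.
final class BoundsCollector {
  private var lower: Option[TsNanos] = None
  private var upper: Option[TsNanos] = None

  def gather(v: Option[TsNanos]): Unit = v.foreach { x =>
    if (lower.forall(x < _)) lower = Some(x)
    if (upper.forall(x > _)) upper = Some(x)
  }

  // None when the batch held no non-null value.
  def bounds: Option[(TsNanos, TsNanos)] = for (l <- lower; u <- upper) yield (l, u)
}

object Pruning {
  // For `ts > lit`, a batch can be skipped when its upper bound is <= lit.
  // Missing bounds are treated conservatively: the batch is scanned.
  def mayContainGreaterThan(bounds: Option[(TsNanos, TsNanos)], lit: TsNanos): Boolean =
    bounds.forall { case (_, upper) => upper > lit }

  // Number of batches that still have to be read for `ts > lit`.
  def batchesToScan(batches: Seq[Seq[Option[TsNanos]]], lit: TsNanos): Int =
    batches.count { batch =>
      val stats = new BoundsCollector
      batch.foreach(stats.gather)
      mayContainGreaterThan(stats.bounds, lit)
    }
}
```

Note that the second comparison field matters: two batches whose values share the same epochMicros can still be told apart by nanosWithinMicro, which a micro-only bound would not capture.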

…timestamps

TimestampNanosColumnStats followed the IntervalColumnStats pattern (no min/max bounds), so cached nanosecond-timestamp columns got no batch-level partition pruning -- a regression from the micro-precision path (TimestampType / TimestampNTZType -> LongColumnStats), where a range filter skips non-matching batches.

TimestampNanosVal has a total order matching calendar order, and the pruning machinery is already wired for it (the type is an AtomicType so ExtractableLiteral extracts its literals, and PhysicalTimestampNTZNanosType defines an ordering, so the bound comparisons buildFilter generates are valid). Collect upper/lower as TimestampNanosVal (modeled on DecimalColumnStats), so cached nanos timestamps prune like micro timestamps.

Tests:
- ColumnStatsSuite: min/max bound collection for both NTZ and LTZ nanos stats.
- PartitionBatchPruningSuite: a range filter over a cached nanos column reads fewer batches with in-memory partition pruning on than off, and returns the same rows as an uncached evaluation.

Co-authored-by: Claude Code

dongjoon-hyun

+1, it looks good to me.

…n the pruning test

The correctness check compared the cached + pruned read against an equivalent query built after the table was cached. The CacheManager matches by logical plan rather than DataFrame identity, so that "uncached" evaluation could itself be served from the InMemoryRelation, making the assertion compare the cache with itself. Compute the expected result before cacheTable so it cannot hit the cache.

Co-authored-by: Claude Code
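
The pitfall described in this commit message can be illustrated with a deliberately simplified model. `ToyCacheManager` and `PruningTestPattern` are hypothetical, and keying the cache by a plan string is an assumption that only mimics "match by logical plan, not by DataFrame identity"; it is not how Spark's CacheManager is implemented.

```scala
import scala.collection.mutable

// Toy cache keyed by a canonical plan string.
final class ToyCacheManager {
  private val cached = mutable.Map.empty[String, Seq[Int]]
  var cacheHits: Int = 0

  def cache(plan: String, compute: () => Seq[Int]): Unit = {
    cached(plan) = compute()
  }

  // Any query with an equal plan is served from the cache, however it was built.
  def execute(plan: String, compute: () => Seq[Int]): Seq[Int] =
    cached.get(plan) match {
      case Some(rows) =>
        cacheHits += 1
        rows
      case None =>
        compute()
    }
}

object PruningTestPattern {
  // Returns true when the "expected" result was itself served from the cache.
  def expectedServedFromCache(computeExpectedBeforeCaching: Boolean): Boolean = {
    val mgr = new ToyCacheManager
    val plan = "SELECT ts FROM t WHERE ts > lit"
    val source = () => Seq(1, 2, 3)
    if (computeExpectedBeforeCaching) {
      mgr.execute(plan, source) // evaluated against the source, nothing cached yet
      mgr.cache(plan, source)
    } else {
      mgr.cache(plan, source)
      mgr.execute(plan, source) // same plan, so this hits the cache
    }
    mgr.cacheHits > 0
  }
}
```

In this model, computing the expected rows after caching yields a comparison of the cache with itself, while computing them first keeps the reference result independent of the cache.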

viirya · 2026-06-28T17:20:35Z

Thanks @dongjoon-hyun @MaxGekk.

dongjoon-hyun reviewed Jun 28, 2026

MaxGekk reviewed Jun 28, 2026

dongjoon-hyun approved these changes Jun 28, 2026

viirya wants to merge 4 commits into apache:master from viirya:nanos-timestamp-default-cache
