 |`pa.null()`|`UnknownType`(format version 3 only) [[2]](#notes)|

 ---

-***Notes***
+#### Notes

-- PyIceberg `GeometryType` and `GeographyType` types are mapped to a GeoArrow WKB extension type.
-Otherwise, falls back to `pa.large_binary()` which stores WKB bytes.
-- For timestamp types (`TimestampNanoType`, `TimestamptzType`, `TimestamptzNanoType`), writing in format version 3 (which supports the `ns` unit) is not yet implemented
-(see [Github issue](https://github.com/apache/iceberg-python/issues/1551)). Only the `UTC` timezone and its aliases are supported.
+[1] Only the `UTC` timezone and its aliases are supported for PyArrow-to-PyIceberg timestamp-with-timezone conversion.
+
+[2] The PyArrow-to-PyIceberg mappings for `pa.timestamp("ns")`, `pa.timestamp("ns", tz="UTC")`, and `pa.null()` require Iceberg format version 3. By default, `pyarrow_to_schema()` uses format version 2. `TimestampNanoType`, `TimestamptzNanoType`, and `UnknownType` are likewise format-version-3-only Iceberg types.
+
+[3] For nanosecond Iceberg timestamp types (`TimestampNanoType` and `TimestamptzNanoType`), writing in format version 3 is not yet implemented (see [GitHub issue #1551](https://github.com/apache/iceberg-python/issues/1551)).
+
+[4] The mappings are not fully symmetric. On read, PyArrow normalizes some families of types into a single Iceberg type, and on write PyIceberg emits a canonical PyArrow type: for example, `pa.int8()` and `pa.int16()` read as `IntegerType` and write back as `pa.int32()`, `pa.string()` reads as `StringType` and writes back as `pa.large_string()`, `pa.binary()` reads as `BinaryType` and writes back as `pa.large_binary()`, `pa.list_(...)` writes back as `pa.large_list(...)`, and `pa.timestamp("s")` / `pa.timestamp("ms")` read as `TimestampType` and write back as `pa.timestamp("us")`.
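
The asymmetry described in note [4] can be modelled with two plain lookup tables. This is only an illustrative sketch: it uses strings rather than real PyArrow or PyIceberg objects, and the names `ARROW_TO_ICEBERG`, `ICEBERG_TO_ARROW`, `round_trip`, and `is_symmetric` are hypothetical, not part of either library. The entries for the canonical types (`pa.int32()`, `pa.large_string()`, `pa.large_binary()`, `pa.timestamp("us")`) are an assumption added so that a canonical type round-trips to itself.

```python
# Hypothetical tables restating note [4]; this is not PyIceberg's implementation.
# Read direction: several PyArrow types collapse into one Iceberg type.
ARROW_TO_ICEBERG = {
    "pa.int8()": "IntegerType",
    "pa.int16()": "IntegerType",
    "pa.int32()": "IntegerType",          # assumed: canonical type maps to itself
    "pa.string()": "StringType",
    "pa.large_string()": "StringType",    # assumed
    "pa.binary()": "BinaryType",
    "pa.large_binary()": "BinaryType",    # assumed
    'pa.timestamp("s")': "TimestampType",
    'pa.timestamp("ms")': "TimestampType",
    'pa.timestamp("us")': "TimestampType",  # assumed
}

# Write direction: each Iceberg type is emitted as one canonical PyArrow type.
ICEBERG_TO_ARROW = {
    "IntegerType": "pa.int32()",
    "StringType": "pa.large_string()",
    "BinaryType": "pa.large_binary()",
    "TimestampType": 'pa.timestamp("us")',
}


def round_trip(arrow_type: str) -> str:
    """Map a PyArrow type name to Iceberg and back to PyArrow."""
    return ICEBERG_TO_ARROW[ARROW_TO_ICEBERG[arrow_type]]


def is_symmetric(arrow_type: str) -> bool:
    """True when the type survives a read-then-write round trip unchanged."""
    return round_trip(arrow_type) == arrow_type


if __name__ == "__main__":
    for name in ARROW_TO_ICEBERG:
        print(f"{name} -> {ARROW_TO_ICEBERG[name]} -> {round_trip(name)}")
```

In this model only the canonical types are fixed points of the round trip; every other entry comes back as a wider or "large" variant, which is the behaviour the note warns about.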