Commit b4da27d

Merge branch 'main' of github.com:apache/iceberg-python into fd-fix-double-commit
2 parents 915b85f + 34c8949 commit b4da27d

63 files changed

Lines changed: 1966 additions & 803 deletions

.github/workflows/pypi-build-artifacts.yml

Lines changed: 1 addition & 1 deletion
@@ -62,7 +62,7 @@ jobs:
         if: startsWith(matrix.os, 'ubuntu')

       - name: Build wheels
-        uses: pypa/cibuildwheel@v2.23.2
+        uses: pypa/cibuildwheel@v2.23.3
         with:
           output-dir: wheelhouse
           config-file: "pyproject.toml"

.github/workflows/python-ci.yml

Lines changed: 2 additions & 0 deletions
@@ -58,6 +58,8 @@ jobs:
           python-version: ${{ matrix.python }}
           cache: poetry
           cache-dependency-path: ./poetry.lock
+      - name: Install system dependencies
+        run: sudo apt-get update && sudo apt-get install -y libkrb5-dev # for kerberos
       - name: Install
         run: make install-dependencies
       - name: Linters

.github/workflows/python-integration.yml

Lines changed: 2 additions & 0 deletions
@@ -50,6 +50,8 @@ jobs:
       - uses: actions/checkout@v4
         with:
           fetch-depth: 2
+      - name: Install system dependencies
+        run: sudo apt-get update && sudo apt-get install -y libkrb5-dev # for kerberos
       - name: Install
         run: make install
       - name: Run integration tests

.github/workflows/svn-build-artifacts.yml

Lines changed: 1 addition & 1 deletion
@@ -57,7 +57,7 @@ jobs:
         if: startsWith(matrix.os, 'ubuntu')

       - name: Build wheels
-        uses: pypa/cibuildwheel@v2.23.2
+        uses: pypa/cibuildwheel@v2.23.3
         with:
           output-dir: wheelhouse
           config-file: "pyproject.toml"

Makefile

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@
 help: ## Display this help
 	@awk 'BEGIN {FS = ":.*##"; printf "\nUsage:\n make \033[36m\033[0m\n"} /^[a-zA-Z_-]+:.*?##/ { printf " \033[36m%-20s\033[0m %s\n", $$1, $$2 } /^##@/ { printf "\n\033[1m%s\033[0m\n", substr($$0, 5) } ' $(MAKEFILE_LIST)

-POETRY_VERSION = 2.0.1
+POETRY_VERSION = 2.1.1
 install-poetry: ## Ensure Poetry is installed and the correct version is being used.
 	@if ! command -v poetry &> /dev/null; then \
 		echo "Poetry could not be found. Installing..."; \

dev/Dockerfile

Lines changed: 3 additions & 3 deletions
@@ -39,20 +39,20 @@ WORKDIR ${SPARK_HOME}
 # Remember to also update `tests/conftest`'s spark setting
 ENV SPARK_VERSION=3.5.4
 ENV ICEBERG_SPARK_RUNTIME_VERSION=3.5_2.12
-ENV ICEBERG_VERSION=1.9.0-SNAPSHOT
+ENV ICEBERG_VERSION=1.9.0
 ENV PYICEBERG_VERSION=0.9.0

 RUN curl --retry 5 -s -C - https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop3.tgz -o spark-${SPARK_VERSION}-bin-hadoop3.tgz \
  && tar xzf spark-${SPARK_VERSION}-bin-hadoop3.tgz --directory /opt/spark --strip-components 1 \
  && rm -rf spark-${SPARK_VERSION}-bin-hadoop3.tgz

 # Download iceberg spark runtime
-RUN curl --retry 5 -s https://repository.apache.org/content/groups/snapshots/org/apache/iceberg/iceberg-spark-runtime-3.5_2.12/1.9.0-SNAPSHOT/iceberg-spark-runtime-3.5_2.12-1.9.0-20250409.001855-44.jar \
+RUN curl --retry 5 -s https://repo1.maven.org/maven2/org/apache/iceberg/iceberg-spark-runtime-${ICEBERG_SPARK_RUNTIME_VERSION}/${ICEBERG_VERSION}/iceberg-spark-runtime-${ICEBERG_SPARK_RUNTIME_VERSION}-${ICEBERG_VERSION}.jar \
  -Lo /opt/spark/jars/iceberg-spark-runtime-${ICEBERG_SPARK_RUNTIME_VERSION}-${ICEBERG_VERSION}.jar


 # Download AWS bundle
-RUN curl --retry 5 -s https://repository.apache.org/content/groups/snapshots/org/apache/iceberg/iceberg-aws-bundle/1.9.0-SNAPSHOT/iceberg-aws-bundle-1.9.0-20250409.002731-88.jar \
+RUN curl --retry 5 -s https://repo1.maven.org/maven2/org/apache/iceberg/iceberg-aws-bundle/${ICEBERG_VERSION}/iceberg-aws-bundle-${ICEBERG_VERSION}.jar \
  -Lo /opt/spark/jars/iceberg-aws-bundle-${ICEBERG_VERSION}.jar

 COPY spark-defaults.conf /opt/spark/conf
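
Review note: the Dockerfile change above moves from timestamped snapshot jars, whose file names cannot be derived from the version alone, to release jars whose URLs follow from the artifact name and version. The sketch below shows that URL construction in Python. The helper `maven_jar_url` is hypothetical and used only for illustration; it assumes the standard Maven repository layout that the new `curl` lines already rely on.

```python
# Hypothetical helper: builds a release-jar URL in the standard Maven layout.
MAVEN_CENTRAL = "https://repo1.maven.org/maven2"


def maven_jar_url(group_id: str, artifact_id: str, version: str, base: str = MAVEN_CENTRAL) -> str:
    """Return the URL of a release jar: <base>/<group path>/<artifact>/<version>/<artifact>-<version>.jar."""
    group_path = group_id.replace(".", "/")  # org.apache.iceberg -> org/apache/iceberg
    return f"{base}/{group_path}/{artifact_id}/{version}/{artifact_id}-{version}.jar"


# Values taken from the ENV lines in the Dockerfile hunk.
iceberg_version = "1.9.0"
runtime_artifact = "iceberg-spark-runtime-3.5_2.12"

runtime_url = maven_jar_url("org.apache.iceberg", runtime_artifact, iceberg_version)
aws_bundle_url = maven_jar_url("org.apache.iceberg", "iceberg-aws-bundle", iceberg_version)
print(runtime_url)
print(aws_bundle_url)
```

With this layout, bumping `ICEBERG_VERSION` is enough to point both downloads at a new release.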

mkdocs/docs/api.md

Lines changed: 11 additions & 0 deletions
@@ -215,6 +215,17 @@ static_table = StaticTable.from_metadata(

 The static-table is considered read-only.

+Alternatively, if your table metadata directory contains a `version-hint.text` file, you can just specify
+the table root path, and the latest metadata file will be picked automatically.
+
+```python
+from pyiceberg.table import StaticTable
+
+static_table = StaticTable.from_metadata(
+    "s3://warehouse/wh/nyc.db/taxis"
+)
+```
+
 ## Check if a table exists

 To check whether the `bids` table exists:
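
Review note: the `api.md` hunk above documents resolving the latest metadata file through `version-hint.text`. The sketch below illustrates how such a lookup could work on a local filesystem. It is an assumption about the convention, not PyIceberg's confirmed implementation: the function name `metadata_file_from_version_hint`, the `metadata/` subdirectory, and the `v<N>.metadata.json` naming are all hypothetical choices made for this example.

```python
import tempfile
from pathlib import Path


def metadata_file_from_version_hint(table_root: Path) -> Path:
    """Resolve the current metadata file from <table_root>/metadata/version-hint.text.

    Assumed convention (not confirmed by the source):
    - a hint that already names a metadata file is used as-is,
    - a numeric hint N maps to v<N>.metadata.json,
    - any other hint maps to <hint>.metadata.json.
    """
    metadata_dir = table_root / "metadata"
    hint = (metadata_dir / "version-hint.text").read_text().strip()
    if hint.endswith(".metadata.json"):
        name = hint
    elif hint.isdigit():
        name = f"v{hint}.metadata.json"
    else:
        name = f"{hint}.metadata.json"
    return metadata_dir / name


# Usage: build a throwaway table layout and resolve its hint.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "taxis"
    (root / "metadata").mkdir(parents=True)
    (root / "metadata" / "version-hint.text").write_text("3\n")
    resolved_name = metadata_file_from_version_hint(root).name
    print(resolved_name)
```

A real implementation would read the hint through the table's FileIO rather than `pathlib`, so that object stores such as S3 work as well.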

mkdocs/docs/configuration.md

Lines changed: 1 addition & 1 deletion
@@ -189,7 +189,7 @@ PyIceberg uses [S3FileSystem](https://arrow.apache.org/docs/python/generated/pya
 | s3.access-key-id | admin | Configure the static access key id used to access the FileIO. |
 | s3.secret-access-key | password | Configure the static secret access key used to access the FileIO. |
 | s3.session-token | AQoDYXdzEJr... | Configure the static session token used to access the FileIO. |
-| s3.force-virtual-addressing | True | Whether to use virtual addressing of buckets. This must be set to True as OSS can only be accessed with virtual hosted style address. |
+| s3.force-virtual-addressing | True | Whether to use virtual addressing of buckets. This is set to `True` by default as OSS can only be accessed with virtual hosted style address. |

 <!-- markdown-link-check-enable-->

poetry.lock

Lines changed: 38 additions & 23 deletions
Some generated files are not rendered by default.

pyiceberg/avro/file.py

Lines changed: 12 additions & 5 deletions
@@ -74,10 +74,17 @@


 class AvroFileHeader(Record):
-    __slots__ = ("magic", "meta", "sync")
-    magic: bytes
-    meta: Dict[str, str]
-    sync: bytes
+    @property
+    def magic(self) -> bytes:
+        return self._data[0]
+
+    @property
+    def meta(self) -> Dict[str, str]:
+        return self._data[1]
+
+    @property
+    def sync(self) -> bytes:
+        return self._data[2]

     def compression_codec(self) -> Optional[Type[Codec]]:
         """Get the file's compression codec algorithm from the file's metadata.
@@ -271,7 +278,7 @@ def __exit__(
     def _write_header(self) -> None:
         json_schema = json.dumps(AvroSchemaConversion().iceberg_to_avro(self.file_schema, schema_name=self.schema_name))
         meta = {**self.metadata, _SCHEMA_KEY: json_schema, _CODEC_KEY: "null"}
-        header = AvroFileHeader(magic=MAGIC, meta=meta, sync=self.sync_bytes)
+        header = AvroFileHeader(MAGIC, meta, self.sync_bytes)
         construct_writer(META_SCHEMA).write(self.encoder, header)

     def write_block(self, objects: List[D]) -> None:
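
Review note: the `AvroFileHeader` change above replaces named slots with read-only properties over a positional `_data` list, and the call site switches from keyword to positional arguments accordingly. The sketch below shows the same pattern in isolation. `PositionalRecord` and `Header` are hypothetical stand-ins written for this example; they are not PyIceberg's `Record` or `AvroFileHeader`, whose actual base-class behaviour is not shown in this diff.

```python
from typing import Any, Dict, List


class PositionalRecord:
    """Hypothetical base class that stores field values by position only."""

    __slots__ = ("_data",)

    def __init__(self, *data: Any) -> None:
        self._data: List[Any] = list(data)

    def __getitem__(self, pos: int) -> Any:
        return self._data[pos]

    def __setitem__(self, pos: int, value: Any) -> None:
        self._data[pos] = value


class Header(PositionalRecord):
    """Named, read-only views onto the positional storage."""

    __slots__ = ()

    @property
    def magic(self) -> bytes:
        return self._data[0]

    @property
    def meta(self) -> Dict[str, str]:
        return self._data[1]

    @property
    def sync(self) -> bytes:
        return self._data[2]


# Fields are passed positionally, mirroring the updated call site in the diff.
header = Header(b"Obj\x01", {"avro.codec": "null"}, b"\x00" * 16)
print(header.magic, header.meta, len(header.sync))
```

One consequence of this design is that positional readers and writers can fill or read `_data` by index, while the rest of the code keeps using `header.magic`, `header.meta`, and `header.sync`.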
