
Write Iceberg table schema into manifest header#801

Open
jvansanten wants to merge 10 commits into duckdb:main from jvansanten:manifest-schema-header

Conversation

@jvansanten
Contributor

DuckDB currently writes the Iceberg schema of the manifest itself at the header key where org.apache.iceberg.ManifestReader expects to find the table schema, mixing Avro and Iceberg schema syntax in the process.

This PR writes the Iceberg table schema to the key "schema" in the manifest header as required by the spec, and the Iceberg manifest schema to "iceberg.schema" as observed in Spark-created manifests.

Fixes #799.


Likely implements the intent of 9c8b1fd
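To illustrate the distinction the PR draws, here is a minimal sketch of the key-value metadata an Iceberg manifest's Avro header is expected to carry. The key names "schema", "schema-id", "format-version", and "content" come from the Iceberg table spec; the table schema itself is a hypothetical one-column example, not taken from this repository.

```python
import json

# Hypothetical example: the Iceberg *table* schema that belongs under
# the "schema" key of a manifest's Avro file metadata. An Iceberg
# schema is a JSON struct, not an Avro record.
table_schema = {
    "type": "struct",
    "schema-id": 0,
    "fields": [
        {"id": 1, "name": "id", "required": True, "type": "long"},
    ],
}

# Sketch of the manifest header metadata (all values are strings in
# Avro file metadata). Spark-written manifests additionally carry the
# manifest's own Iceberg schema under "iceberg.schema".
manifest_header = {
    "schema": json.dumps(table_schema),  # Iceberg table schema, per spec
    "schema-id": "0",
    "format-version": "2",
    "content": "data",
}

# A reader like org.apache.iceberg.ManifestReader parses "schema" as an
# Iceberg struct schema:
assert json.loads(manifest_header["schema"])["type"] == "struct"
```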
Review comment on src/metadata/iceberg_column_definition.cpp (outdated)
Member

@Tishj Tishj left a comment


I believe we have all this logic already, in IcebergCreateTableRequest::PopulateSchema
Can we please use that instead of duplicating all this logic?

@jvansanten
Contributor Author

jvansanten commented Mar 17, 2026

I believe we have all this logic already, in IcebergCreateTableRequest::PopulateSchema. Can we please use that instead of duplicating all this logic?

Thanks, that's a much better idea; I had missed where this happens. Will do.

@jvansanten jvansanten requested a review from Tishj March 17, 2026 10:42
@Tishj
Member

Tishj commented Mar 18, 2026

It looks fine to me, but I would really like a test that fails on main and now passes with this.
I couldn't manage to make it fail on main:

    def test_duckdb_drop_table_nested_types(self, spark_con):
        from datetime import datetime, timezone

        ts = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

        spark_con.sql(
            """
            DELETE FROM nested_types_2;
            """
        )
        spark_con.sql(f"""
        CALL system.expire_snapshots(
        table => 'default.nested_types_2',
        older_than => TIMESTAMP '{ts}'
        )
        """)

I tried DROP, I tried DELETE + refresh, and I tried DELETE + expire snapshots; none of them failed.
I'm also hesitant about adding iceberg.schema, as the spec doesn't mention it anywhere.

@jvansanten
Contributor Author

I tried DROP, I tried DELETE + refresh and DELETE + expire snapshots, none of them failed

I agree that it's really nice to have tests to show that the change does something, and that it doesn't regress.

The symptom I observed is that iceberg-rest-fixture silently fails to delete the data files of tables created by DuckDB when PURGE_REQUESTED is true. From the perspective of the catalog, though, the drop succeeds; the data files are simply orphaned in storage. I don't know how to test this in the context of sqllogictest, however.

Since the current header raises an exception inside of org.apache.iceberg.ManifestReader, I think it's likely that attempts to scan the table with Spark would fail as well. One option could be to add a new set of tests that read duckdb-written tables back with Spark. Unless, of course, CatalogUtils is the only thing that actually uses ManifestReader.

I'm also hesitant about adding iceberg.schema, as the spec doesn't mention this anywhere

This also bothers me, but I figured the manifest schema was there for a reason, and the thing 2b97d7f seems to have been intended to write exists in Spark-created manifests as "iceberg.schema". I don't know the history behind that change, though.
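One cheap check a Spark-read test could make on a duckdb-written manifest header: the value under "schema" must parse as an Iceberg struct schema, not an Avro record schema; the two are easy to tell apart by their top-level "type" field. This is a hypothetical sketch with illustrative schema strings, not code from the PR, and the helper name is made up.

```python
import json

# Illustrative examples of the two schema dialects that were being
# mixed: an Iceberg table schema is a JSON "struct", while an Avro
# schema for manifest entries is a JSON "record".
iceberg_table_schema = '{"type": "struct", "schema-id": 0, "fields": []}'
avro_manifest_schema = '{"type": "record", "name": "manifest_entry", "fields": []}'

def looks_like_iceberg_schema(raw: str) -> bool:
    """True if the JSON under the manifest's "schema" key is an
    Iceberg struct schema rather than an Avro record schema."""
    return json.loads(raw).get("type") == "struct"

assert looks_like_iceberg_schema(iceberg_table_schema)
assert not looks_like_iceberg_schema(avro_manifest_schema)
```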

@Tishj
Member

Tishj commented Mar 18, 2026

As far as I know we have tests that read duckdb-created tables with Spark, see test_spark_read.py

@Tmonster
Member

If you search for default.lower_upper_bounds_test, you can see a duckdb test writes it, and later, both pyiceberg and spark read it back. It's not very well documented, but we have been running these kinds of tests for a while now

@jvansanten
Contributor Author

Okay, I will try to think harder about how to reproduce the failures I observed.

@jvansanten
Contributor Author

jvansanten commented Mar 19, 2026

I haven't been able to find a way to make Spark choke on the malformed schema, so it would seem that the failure I saw is a fairly rare path. That would go some way towards explaining why it went unnoticed, but does raise the question of what the schema is supposed to be used for.

For want of a better idea, I switched on purge in the fixture catalog config and added a test that reproduces the failure I saw (dropping a duckdb-written table from duckdb, with purge enabled). It fails on main but passes on this PR.

Interestingly, Spark's DROP TABLE PURGE on a duckdb-written table succeeds on main; it does not seem to rely on the catalog to purge files.

@jvansanten jvansanten force-pushed the manifest-schema-header branch 2 times, most recently from b06a141 to aeab59f on April 17, 2026 13:03
These catalogs do not purge synchronously on drop
@jvansanten jvansanten force-pushed the manifest-schema-header branch from aeab59f to 6bcc258 on April 17, 2026 13:15
@jvansanten jvansanten force-pushed the manifest-schema-header branch from 6bcc258 to 200a020 on April 20, 2026 07:27
@jvansanten
Contributor Author

Tests now pass for all catalogs. The drop test is disabled for all but the REST fixture catalog, as all others seem to purge asynchronously, making it difficult to verify whether they have actually done so.
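The synchronous-purge check the new test relies on can be sketched as follows: after DROP TABLE with purge against the REST fixture catalog, the table's data directory should hold no data files. The paths, the directory layout, and the drop step itself are placeholders here, not the PR's actual test code.

```python
import pathlib
import tempfile

def orphaned_data_files(table_dir: pathlib.Path) -> list:
    """Return any data files the catalog failed to purge on drop."""
    return sorted(table_dir.rglob("*.parquet"))

with tempfile.TemporaryDirectory() as d:
    # Placeholder warehouse layout for a table named like the one in
    # the thread; a real test would point at the fixture's storage.
    table_dir = pathlib.Path(d) / "default" / "nested_types_2" / "data"
    table_dir.mkdir(parents=True)
    # ...DROP TABLE with purge would run here against the catalog...
    leftovers = orphaned_data_files(table_dir)

# A failed purge (the bug on main) would leave parquet files behind;
# with this PR the directory should be empty.
assert leftovers == []
```

Catalogs that purge asynchronously make this assertion racy, which is why the thread disables the test for everything except the REST fixture.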


Development

Successfully merging this pull request may close these issues:

Malformed schema in manifest header (#799)
