feat(table): project reserved row-lineage fields as null when the file lacks them #1010

@laskoviymishka

Description

Parent: #589

When a v3 data file pre-dates row lineage (the file has no _row_id / _last_updated_sequence_number columns), the scanner should project those reserved fields as all-null columns rather than erroring. Java fixed this in apache/iceberg#15187 and #15508.

iceberg-go has a partial path today: synthesizeRowLineageColumns fills nulls with first_row_id + row_position (for _row_id) and data_sequence_number (for _last_updated_sequence_number) when the columns are already present in the batch. The gap is the projection layer above it: if a scan requests these reserved fields against a file that lacks them entirely, the projection needs to add the columns as null arrays instead of returning an error.

Detect the reserved field IDs (2147483540 for _row_id, 2147483539 for _last_updated_sequence_number, per the reserved field ID table in the Iceberg spec) in the projection path; emit null columns of the right type when the underlying file lacks them. The existing synthesis path then handles the "file has them but the rows are null" case as before.

Test: scan a v3 table whose metadata has next-row-id set but whose data file was written before row lineage existed, and assert that the projected columns come back all-null without error.
