RFC: File Format API for Apache Iceberg Rust #2382

@Kurtiscwright

Is your feature request related to a problem or challenge?

This RFC proposes a File Format API for the iceberg Rust crate that decouples Iceberg's read and write paths from any single file format. Today, iceberg-rust can only read and write Parquet data files: the format is hard-wired into ArrowReader, ParquetWriter, and every layer that touches them. The Java project shipped an analogous abstraction (FormatModel) in February 2026 via PR #12774, and PyIceberg has an open proposal (apache/iceberg-python#3100) for the same concept.

Describe the solution you'd like

The RFC doc linked below describes how iceberg-rust should implement an equivalent capability: a FormatModel trait, a registry, and format-agnostic scan and write paths. It uses idiomatic Rust constructs: traits, with trait objects at the registry boundary; feature flags for compile-time format composition; and RecordBatch as the canonical data type. Because iceberg-rust is pre-1.0, the design can make cleaner tradeoffs than were available to the Java community: it avoids the wrapper pattern and the generic-parameter compromises that backward-compatibility constraints forced on Java. The scope is the abstraction layer and its Parquet implementation. ORC, Avro data-file, Vortex, and Lance support are explicitly follow-up work that will validate the API's extensibility.

RFC doc PR: #2384
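
To make the shape of the proposal concrete, here is a minimal sketch of a format trait behind a registry of trait objects. It is not taken from the RFC doc: the method names (`format`, `read`, `write`), the `FileFormat` and `FormatError` enums, `ParquetModel`, and `FormatRegistry` are all hypothetical, and `RecordBatch` is a local stand-in for the Arrow type so that the example compiles with the standard library only. The Parquet implementation does no real I/O.

```rust
use std::collections::HashMap;
use std::fmt;
use std::sync::Arc;

/// Local stand-in for Arrow's `RecordBatch` (assumption: the real design
/// would use the Arrow type directly as the canonical in-memory data type).
#[derive(Debug, Clone, PartialEq)]
pub struct RecordBatch {
    pub num_rows: usize,
}

/// Hypothetical key used to look up a format implementation.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum FileFormat {
    Parquet,
    Orc,
    Avro,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum FormatError {
    Unsupported(FileFormat),
}

impl fmt::Display for FormatError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FormatError::Unsupported(format) => {
                write!(f, "no format model registered for {:?}", format)
            }
        }
    }
}

impl std::error::Error for FormatError {}

/// Hypothetical trait: one implementation per file format. It is object-safe,
/// so the registry can hold implementations as `Arc<dyn FormatModel>`.
pub trait FormatModel: Send + Sync {
    fn format(&self) -> FileFormat;
    /// Reads a data file into record batches.
    fn read(&self, path: &str) -> Result<Vec<RecordBatch>, FormatError>;
    /// Writes record batches and returns the number of rows written.
    fn write(&self, path: &str, batches: &[RecordBatch]) -> Result<usize, FormatError>;
}

/// Placeholder Parquet model: it performs no file I/O in this sketch.
pub struct ParquetModel;

impl FormatModel for ParquetModel {
    fn format(&self) -> FileFormat {
        FileFormat::Parquet
    }

    fn read(&self, _path: &str) -> Result<Vec<RecordBatch>, FormatError> {
        // A real implementation would decode the file; this returns one empty batch.
        Ok(vec![RecordBatch { num_rows: 0 }])
    }

    fn write(&self, _path: &str, batches: &[RecordBatch]) -> Result<usize, FormatError> {
        // A real implementation would encode the batches; this only counts rows.
        Ok(batches.iter().map(|batch| batch.num_rows).sum())
    }
}

/// Registry mapping each format to its model, stored as trait objects.
#[derive(Default)]
pub struct FormatRegistry {
    models: HashMap<FileFormat, Arc<dyn FormatModel>>,
}

impl FormatRegistry {
    /// Registers the formats compiled into this build. In a real crate each
    /// registration could sit behind a Cargo feature flag (assumption).
    pub fn with_defaults() -> Self {
        let mut registry = Self::default();
        registry.register(Arc::new(ParquetModel));
        registry
    }

    pub fn register(&mut self, model: Arc<dyn FormatModel>) {
        self.models.insert(model.format(), model);
    }

    pub fn get(&self, format: FileFormat) -> Result<Arc<dyn FormatModel>, FormatError> {
        self.models
            .get(&format)
            .cloned()
            .ok_or(FormatError::Unsupported(format))
    }
}
```

Under these assumptions, a scan or write path would call `registry.get(format)` and work only against `dyn FormatModel`. An unregistered format such as ORC then surfaces as an `Unsupported` error rather than a compile-time dependency on that format.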

Willingness to contribute

I can contribute to this feature independently
