feat(arrow-ipc): add sans-IO stream encoder #10277

Open
Phoenix500526 wants to merge 1 commit into apache:main from Phoenix500526:issue/7812

@Phoenix500526

Which issue does this PR close?

Closes #7812.

Rationale for this change

StreamWriter currently requires a std::io::Write sink, which is awkward for async or chunk-oriented destinations such as object stores. This PR adds a sans-IO IPC stream encoder so callers can encode Arrow IPC stream data into ordered Buffer chunks and send those chunks through their own IO layer.

What changes are included in this PR?

This PR adds StreamEncoder, a stateful IPC stream encoder that:

  • emits ordered arrow_buffer::Buffer chunks
  • hides stream lifecycle details such as schema emission and EOS markers
  • owns stream state such as dictionary tracking and IPC write context
  • preserves the low-copy path for uncompressed record batch body buffers
  • supports try_new, try_new_with_options, encode, and finish
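
To make the sans-IO shape concrete, here is a minimal, self-contained sketch of the pattern. It is not the `StreamEncoder` from this PR: `ToyStreamEncoder`, `EncodeError`, and `drain_into` are hypothetical names, schemas and batches are treated as opaque byte slices, and chunks are plain `Vec<u8>` rather than `arrow_buffer::Buffer`. The framing (continuation marker, little-endian length, padding to 8 bytes, and an 8-byte EOS marker) follows the Arrow IPC stream layout in simplified form.

```rust
use std::io::{self, Write};

/// Marker that precedes every encapsulated IPC message.
const CONTINUATION: [u8; 4] = [0xFF; 4];

#[derive(Debug, PartialEq)]
pub enum EncodeError {
    /// `finish` was already called, so the stream is closed.
    Finished,
}

/// Hypothetical toy encoder illustrating the sans-IO pattern.
/// It performs no IO; it only returns ordered chunks to the caller.
pub struct ToyStreamEncoder {
    schema: Vec<u8>,
    schema_emitted: bool,
    finished: bool,
}

impl ToyStreamEncoder {
    pub fn new(schema: &[u8]) -> Self {
        Self { schema: schema.to_vec(), schema_emitted: false, finished: false }
    }

    /// Frame a payload: continuation marker, i32 length (little-endian,
    /// counting padding), then the payload zero-padded to a multiple of 8.
    fn frame(payload: &[u8]) -> Vec<u8> {
        let padded = (payload.len() + 7) / 8 * 8;
        let mut out = Vec::with_capacity(8 + padded);
        out.extend_from_slice(&CONTINUATION);
        out.extend_from_slice(&(padded as i32).to_le_bytes());
        out.extend_from_slice(payload);
        out.resize(8 + padded, 0);
        out
    }

    /// Emit the schema message exactly once, before anything else.
    fn take_schema(&mut self, chunks: &mut Vec<Vec<u8>>) {
        if !self.schema_emitted {
            chunks.push(Self::frame(&self.schema));
            self.schema_emitted = true;
        }
    }

    /// Encode one batch; the first call also yields the schema chunk.
    pub fn encode(&mut self, batch: &[u8]) -> Result<Vec<Vec<u8>>, EncodeError> {
        if self.finished {
            return Err(EncodeError::Finished);
        }
        let mut chunks = Vec::new();
        self.take_schema(&mut chunks);
        chunks.push(Self::frame(batch));
        Ok(chunks)
    }

    /// Close the stream: schema (if nothing was encoded yet), then EOS.
    pub fn finish(&mut self) -> Result<Vec<Vec<u8>>, EncodeError> {
        if self.finished {
            return Err(EncodeError::Finished);
        }
        let mut chunks = Vec::new();
        self.take_schema(&mut chunks);
        chunks.push(vec![0xFF, 0xFF, 0xFF, 0xFF, 0, 0, 0, 0]);
        self.finished = true;
        Ok(chunks)
    }
}

/// The caller owns the IO: here a blocking `Write`, but the same chunks
/// could equally be handed to an async sink or a multipart upload.
pub fn drain_into<W: Write>(sink: &mut W, chunks: &[Vec<u8>]) -> io::Result<()> {
    for chunk in chunks {
        sink.write_all(chunk)?;
    }
    Ok(())
}
```

A caller would invoke `encode` once per batch, forward each returned chunk in order, and call `finish` once at the end. Because the encoder only returns bytes, the lifecycle rules (schema first, EOS last, no writes after close) stay in one place regardless of how the bytes are shipped.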

Are these changes tested?

Yes. Added tests compare StreamEncoder output byte-for-byte with StreamWriter output for:

  • a normal record batch stream
  • an empty stream
  • a stream containing dictionary batches

Are there any user-facing changes?

Yes. This adds a new public arrow_ipc::writer::StreamEncoder API.
There are no breaking changes. Existing StreamWriter behavior is unchanged.

Introduce StreamEncoder for IPC streaming without requiring a
std::io::Write sink. The encoder owns stream state, emits ordered Buffer
chunks, and preserves the low-copy path for uncompressed record batch
body buffers.

Add byte-for-byte compatibility tests against StreamWriter for normal
batches, empty streams, and dictionary batches.

Closes apache#7812
Signed-off-by: Jiawei Zhao <Phoenix500526@163.com>
@github-actions (bot) added the arrow label (Changes to the arrow crate) on Jul 3, 2026
Development

Successfully merging this pull request may close these issues.

arrow-ipc StreamWriter better ergonomics with async
