Commit 2ddd2be

Tom McCormick committed
Add comprehensive ORC batching tests demonstrating stripe size, batch size, and compression interactions.
Tests show that ORC batching is stripe-based (like Parquet row groups). A near-perfect 1:1 mapping is achievable with large stripe sizes (2-5 MB) and hard-to-compress data, giving ratios of 0.91-0.97 between stripe size and actual file size.
1 parent 71143f6 commit 2ddd2be

1 file changed

Lines changed: 1257 additions & 63 deletions