Skip to content

test: fix test_flushable_buckets_carry_all_batches assertion drift#46

Merged
tonyalaribe merged 1 commit into
masterfrom
fix/test-flushable-buckets-coalesce
Jun 9, 2026
Merged

test: fix test_flushable_buckets_carry_all_batches assertion drift#46
tonyalaribe merged 1 commit into
masterfrom
fix/test-flushable-buckets-coalesce

Conversation

@tonyalaribe

Copy link
Copy Markdown
Contributor

Summary

PR #45 was squash-merged but left CI red on master: test_flushable_buckets_carry_all_batches was pre-existing failing on master before #45 and is still red now. This is the test fix that was on the original PR branch but dropped during squash.

The test asserted one RecordBatch per insert, but inserts crossing MAX_BATCH_COUNT_PER_BUCKET trigger insert-time coalesce under the bucket lock — so 10 single-row inserts come back as 2 RecordBatches, not 10. The flush path actually only cares that every row makes it through; the batch-count breakdown is an implementation detail of the in-bucket coalesce gate.

Rewrites the assertion to check rows (no loss) plus a bounded-batch-count sanity guard, and explains the coalesce trade-off in the test comment so the next reader doesn't bring the loose assertion back.

Test plan

  • cargo test --lib mem_buffer::tests::test_flushable_buckets_carry_all_batches — passes locally.
  • CI green on this PR.

The test asserted one RecordBatch per insert, but inserts cross
MAX_BATCH_COUNT_PER_BUCKET and trigger insert-time coalesce under the
bucket lock — so 10 single-row inserts come back as 2 RecordBatches,
not 10.

The flush path actually only cares that every row makes it through; the
batch-count breakdown is an implementation detail of the in-bucket
coalesce gate. Rewrite the assertion to check rows (no loss) plus a
bounded-batch-count sanity guard, and explain the coalesce trade-off
in the test comment so the next reader doesn't bring the loose
assertion back.

Was failing on master before this PR (verified via git stash); this
removes a CI gate that's been red on every PR.
@claude

claude Bot commented Jun 9, 2026

Copy link
Copy Markdown

Code Review

Overview

This PR fixes a broken test (test_flushable_buckets_carry_all_batches) introduced when PR #45 was squash-merged. The old assertion expected one RecordBatch per insert (10 inserts → 10 batches), but insert-time coalescing under MAX_BATCH_COUNT_PER_BUCKET reduces the batch count before flush. The fix correctly shifts the assertion from an exact batch-count to a row-count check plus a bounded-batch-count guard.

The change is well-reasoned and the PR description explains the root cause clearly.


Correctness

The new assertions are logically sound:

  • row_count check – validates no data is silently dropped.
  • summed check – cross-validates the cached counter matches the actual batch contents.
  • Batch-count bound – confirms coalescing keeps the count within the expected ceiling.

The bound analysis in the comment is correct: coalesce fires when len > MAX_BATCH_COUNT_PER_BUCKET (i.e., at 9), collapses to 1, so the high-water mark before the next coalesce is MAX_BATCH_COUNT_PER_BUCKET + 1 = 9.


Suggestions

1. total_rows should be derived, not hardcoded

let total_rows = MAX_BATCH_COUNT_PER_BUCKET + 2; // ensure coalesce fires at least once

Currently total_rows = 10 and MAX_BATCH_COUNT_PER_BUCKET = 8. If the constant is later bumped to >= 10, the test silently stops exercising coalescing — batches.len() would equal total_rows again, and the bounds assertion would pass trivially without the coalesce path ever running. Deriving total_rows from the constant keeps the test self-documenting and future-proof.

2. The bound assertion implicitly assumes bucket_bytes <= MAX_BATCH_BYTES_FOR_COALESCE

The coalesce gate at line 1673:

if g.len() > MAX_BATCH_COUNT_PER_BUCKET && bucket_bytes <= MAX_BATCH_BYTES_FOR_COALESCE {

The batch-count bound only holds when the byte-size guard also passes. For tiny single-row inserts in a test this is fine, but the comment presents the bound as unconditional. A one-liner noting the dependency would prevent a future reader from assuming this bound applies under memory pressure:

// Bound holds here because single-row test batches stay well under MAX_BATCH_BYTES_FOR_COALESCE.

3. Minor: assertion message for row_count

The raw assert_eq! on line 2340 has no failure message:

assert_eq!(flushable[0].row_count, total_rows);

Consider adding one for consistency with the surrounding assertions:

assert_eq!(flushable[0].row_count, total_rows, "cached row_count diverged from total inserts");

Test Coverage

The existing test_coalesce_keeps_batch_count_bounded test directly validates the coalesce invariant. This PR's test is complementary — it validates the flush path end-to-end. The two together give good coverage; no additional tests needed.


Summary

Clean, well-explained fix. The main actionable item is making total_rows derive from MAX_BATCH_COUNT_PER_BUCKET so the test stays meaningful if the constant changes. The other two points are minor polish. No correctness concerns, no security or performance implications.

Verdict: Approve with minor suggestions. The core logic is correct; the derived-constant suggestion is worth addressing before merge to avoid a future silent test regression.

@tonyalaribe tonyalaribe merged commit 2228ac7 into master Jun 9, 2026
7 of 8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant