Delta wire-format: pack HLL delta + add CountMin hh_keys #61
Merged
Two delta encoding updates (cross-language byte parity verified via golden tests against the Go reference implementation):

- HLL: replace HLLDelta's repeated per-register sub-messages with a packed varint (index_delta, value) blob, the layout HLLSparseRegisters.packed already uses for full sparse state. ~62% smaller deltas; a single-emit delta is about the size of the full sparse frame.
- CountMin: add an optional hh_keys (repeated string, field 6) to CountMinDelta so it matches CountSketchDelta. Population is control-plane-gated (empty unless a heavy-hitter source is supplied); no byte change when unset.

cargo test --lib: 426 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
29a0868 to 9289505
Two sketch delta encoding updates. Both are verified for cross-language byte parity via golden tests, and neither changes default behavior (delta transmission is opt-in; an unset hh_keys is byte-identical to before).

HLL: pack delta as a varint blob

HLLDelta { repeated HLLRegisterUpdate } → HLLDelta { bytes packed_updates }: the increased registers are varint-packed as (index_delta, value) pairs in ascending index order, the same layout HLLSparseRegisters.packed already uses for full sparse state. This removes the per-register sub-message tag/length overhead: a 50-register delta drops from 350 B to 132 B (~62%), and a single-emit delta is about the size of the full sparse frame.

CountMin: add optional hh_keys

CountMinDelta gains hh_keys (repeated string, field 6), matching CountSketchDelta's layout so the two delta encodings are structurally identical. Population is control-plane-gated: it stays empty unless a heavy-hitter source is supplied, so the bytes are unchanged when unset. (CountMin has no Space-Saving tracker, so this adds the field + the decode path, not a forced tracker.)

Tests

cargo test --lib: 426 passed, 0 failed. New test_hh_keys_matches_go_golden_bytes parity guards in both the CountMin and CountSketch modules; existing golden/round-trip tests unchanged.

🤖 Generated with Claude Code
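For reviewers who want a concrete picture of the packed (index_delta, value) layout described under the HLL heading, here is a minimal sketch. It is not the code from this PR: the function names (`put_varint`, `get_varint`, `pack_updates`, `unpack_updates`) are hypothetical, and it assumes LEB128-style unsigned varints with each index delta taken relative to the previous index (the first relative to 0). The real HLLSparseRegisters.packed convention may differ in those details.

```rust
// Illustrative only: names and the exact delta convention are assumptions.

/// Append `v` as an unsigned LEB128 varint (7 bits per byte, low group first).
fn put_varint(buf: &mut Vec<u8>, mut v: u32) {
    while v >= 0x80 {
        buf.push((v as u8 & 0x7f) | 0x80); // continuation bit set
        v >>= 7;
    }
    buf.push(v as u8);
}

/// Read one varint at `*pos`, advancing `*pos`.
/// Returns None on truncated or overlong input.
fn get_varint(buf: &[u8], pos: &mut usize) -> Option<u32> {
    let mut v: u32 = 0;
    let mut shift: u32 = 0;
    loop {
        let b = *buf.get(*pos)?;
        *pos += 1;
        if shift >= 32 {
            return None; // more groups than a u32 can hold
        }
        v |= ((b & 0x7f) as u32) << shift;
        if b & 0x80 == 0 {
            return Some(v);
        }
        shift += 7;
    }
}

/// Pack (register index, new value) updates.
/// `updates` must be sorted by ascending index.
fn pack_updates(updates: &[(u32, u8)]) -> Vec<u8> {
    let mut out = Vec::with_capacity(updates.len() * 3);
    let mut prev = 0u32;
    for &(index, value) in updates {
        debug_assert!(index >= prev, "updates must be in ascending index order");
        put_varint(&mut out, index - prev); // assumed: delta from previous index
        put_varint(&mut out, value as u32);
        prev = index;
    }
    out
}

/// Inverse of `pack_updates`. Returns None if the blob is malformed.
fn unpack_updates(packed: &[u8]) -> Option<Vec<(u32, u8)>> {
    let mut out = Vec::new();
    let mut pos = 0usize;
    let mut index = 0u32;
    while pos < packed.len() {
        index = index.checked_add(get_varint(packed, &mut pos)?)?;
        let value = get_varint(packed, &mut pos)?;
        if value > u8::MAX as u32 {
            return None; // register values fit in one byte in this sketch
        }
        out.push((index, value as u8));
    }
    Some(out)
}
```

Under these assumptions, `pack_updates(&[(3, 5), (4, 7), (300, 2)])` yields the 7 bytes `03 05 01 07 A8 02 02`: small index gaps cost one byte, and only the jump of 296 needs a two-byte varint. There is no per-entry tag or length prefix, which is where the savings over per-register sub-messages come from.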
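The "bytes are unchanged when unset" property of hh_keys can be illustrated the same way. The sketch below assumes the deltas use the standard protobuf wire format, where a repeated string field emits one tag + length + bytes record per element and nothing at all when the list is empty. `put_hh_keys` and `put_len` are hypothetical helper names, not functions from this PR; only the field number (6) comes from the description above.

```rust
// Illustrative only: assumes protobuf wire format; helper names are hypothetical.

/// Append `n` as an unsigned LEB128 varint (used here for string lengths).
fn put_len(buf: &mut Vec<u8>, mut n: usize) {
    while n >= 0x80 {
        buf.push((n as u8 & 0x7f) | 0x80);
        n >>= 7;
    }
    buf.push(n as u8);
}

/// Append `hh_keys` as field 6 (repeated string) to an already-encoded delta.
/// An empty slice appends nothing, so the output is byte-identical to the
/// encoding without the field.
fn put_hh_keys(buf: &mut Vec<u8>, hh_keys: &[&str]) {
    const TAG: u8 = (6 << 3) | 2; // field 6, wire type 2 (length-delimited) = 0x32
    for key in hh_keys {
        buf.push(TAG);
        put_len(buf, key.len());
        buf.extend_from_slice(key.as_bytes());
    }
}
```

With no heavy-hitter source supplied the loop never runs, which is consistent with the existing golden tests passing unchanged.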