feat: add delta support (proto + compute_delta/apply_delta_bytes) for CountMin, CountSketch, and HLL #59
Merged
…a/apply_delta_bytes

Extends the delta-encoding support previously added for DDSketch to three more sketch families so the runtime can emit and apply sparse deltas for them.

Per family:
- CountMinSketch: additive cell deltas (signed sint64 on the wire; CMS counters only grow) carried as packed cell_rows/cell_cols/d_counts arrays plus full per-row L1/L2 norm deltas.
- CountSketch: signed cell deltas carried as packed arrays plus per-row L2 norm deltas; heavy-hitter candidate keys (hh_keys) are forwarded when an upstream tracker provides them.
- HyperLogLog: lossless register max-updates, i.e. a register's new value when it increased, applied via max-merge on the receiver. No update is ever dropped (there is no threshold to apply).

Each family gains a *Delta proto message (mirroring the Go reference implementation's field numbers and wire encoding exactly), regenerated prost bindings, and compute_delta(snapshot, threshold) + apply_delta_bytes(&[u8]) on its portable wrapper.

The emitted delta bytes are byte-identical to the Go reference implementation's delta output for identical input (cross-language byte parity), pinned by per-family golden tests.

These deltas carry only sketch-internal state (cells / register updates), so there are no DataPoint-level metric scalars to drop.

Tests: per-family round-trip (build -> compute_delta against a snapshot -> apply_delta_bytes -> reconstructs) and a byte-parity assertion against a golden captured from the Go reference implementation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Extends the delta-encoding support previously added for DDSketch to three more sketch families, so the runtime can emit and apply sparse deltas for them. Each family gets a `*Delta` proto message, regenerated prost bindings, and `compute_delta(snapshot, threshold)` + `apply_delta_bytes(&[u8])` on its portable wrapper.

These deltas carry only sketch-internal state (cells / register updates). They never had DataPoint-level metric scalars, so there is nothing to drop.
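To make the snapshot/threshold pairing concrete, here is a minimal in-memory sketch of a sparse cell delta for a count-min style table. It is not the code in this PR: `Table`, `CellDelta`, and `apply_delta` are hypothetical names, the protobuf encoding and norm deltas are left out, and the threshold is assumed to suppress cells whose absolute change is smaller than it.

```rust
/// Hypothetical in-memory stand-in for a CountMin cell delta: three parallel
/// arrays with one entry per changed cell (no wire encoding, no norm deltas).
#[derive(Debug, Default, PartialEq)]
struct CellDelta {
    cell_rows: Vec<u32>,
    cell_cols: Vec<u32>,
    d_counts: Vec<i64>,
}

/// Toy counter table stored row-major: `depth` rows of `width` counters.
#[derive(Debug, Clone, PartialEq)]
struct Table {
    width: usize,
    cells: Vec<i64>,
}

impl Table {
    fn new(depth: usize, width: usize) -> Self {
        Table { width, cells: vec![0; depth * width] }
    }

    /// Emit one entry per cell whose change since `snapshot` is non-zero and
    /// at least `threshold` in absolute value (assumed threshold semantics).
    fn compute_delta(&self, snapshot: &Table, threshold: i64) -> CellDelta {
        assert_eq!(self.width, snapshot.width);
        assert_eq!(self.cells.len(), snapshot.cells.len());
        let mut delta = CellDelta::default();
        for (i, (now, before)) in self.cells.iter().zip(&snapshot.cells).enumerate() {
            let d = now - before;
            if d != 0 && d.abs() >= threshold {
                delta.cell_rows.push((i / self.width) as u32);
                delta.cell_cols.push((i % self.width) as u32);
                delta.d_counts.push(d);
            }
        }
        delta
    }

    /// Add each signed cell delta onto the matching counter.
    fn apply_delta(&mut self, delta: &CellDelta) {
        let cells = delta.cell_rows.iter().zip(&delta.cell_cols).zip(&delta.d_counts);
        for ((&row, &col), &d) in cells {
            self.cells[row as usize * self.width + col as usize] += d;
        }
    }
}
```

With a threshold of 1 every changed cell is emitted, so applying the delta to the snapshot reproduces the current table; a larger threshold trades exactness for a smaller delta.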
Per family
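- CountMinSketch: additive cell deltas carried as packed `cell_rows`/`cell_cols`/`d_counts` arrays (signed `sint64`; CMS counters only grow, but the signed form covers weighted/decay variants) plus full per-row `l1`/`l2` norm deltas.
- CountSketch: signed cell deltas carried as packed arrays plus per-row `l2` norm deltas. Heavy-hitter candidate keys (`hh_keys`) are forwarded when an upstream tracker provides them; the minimal wrapper here leaves them empty.
- HyperLogLog: lossless register max-updates. Each entry carries a register's new value when it increased, and the receiver applies it as `register[i] = max(register[i], value)`. No register update is ever dropped (there is no threshold to apply).

Byte parity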
The emitted delta bytes are byte-identical to the Go reference implementation's delta output for identical input (cross-language byte parity). Each `*Delta` proto matches the Go reference's field numbers and wire encoding exactly, and each family has a golden test asserting the produced bytes match a fixture captured from the Go reference implementation.

Tests
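For each family:
- round-trip: build the sketch, `compute_delta` against a snapshot, `apply_delta_bytes` onto the snapshot, and assert it reconstructs the sketch (including a delta-against-empty case that reconstructs the full window state);
- byte parity: assert the produced bytes match the golden fixture captured from the Go reference implementation.

`cargo build` and `cargo test` are green (422 lib tests passing).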
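The round-trip shape described above can be sketched for the HyperLogLog case with plain register arrays. This is an illustration only, not the tests in this PR: `register_delta` and `apply_register_delta` are hypothetical helpers, and the delta is an in-memory list of (index, new value) pairs rather than encoded proto bytes.

```rust
/// Hypothetical stand-in for an HLL register delta: (index, new value) pairs
/// for every register that increased since the snapshot.
type RegisterDelta = Vec<(u32, u8)>;

/// Collect the registers whose value grew relative to `snapshot`.
/// Nothing is thresholded away, so the delta is lossless.
fn register_delta(current: &[u8], snapshot: &[u8]) -> RegisterDelta {
    assert_eq!(current.len(), snapshot.len());
    current
        .iter()
        .zip(snapshot)
        .enumerate()
        .filter(|(_, (now, before))| now > before)
        .map(|(i, (now, _))| (i as u32, *now))
        .collect()
}

/// Max-merge each update into the receiver's registers:
/// register[i] = max(register[i], value).
fn apply_register_delta(registers: &mut [u8], delta: &RegisterDelta) {
    for &(i, value) in delta {
        let slot = &mut registers[i as usize];
        *slot = (*slot).max(value);
    }
}
```

Because the merge is a max, applying the same delta twice leaves the registers unchanged, and a delta computed against all-zero registers carries the full register state.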