feat: add delta support (proto + compute_delta/apply_delta_bytes) for CountMin, CountSketch, and HLL #59

Merged
zzylol merged 2 commits into main from feat/cms-cs-hll-delta-support on May 26, 2026

zzylol commented on May 26, 2026

Summary

Extends the delta-encoding support previously added for DDSketch to three more sketch families, so the runtime can emit and apply sparse deltas for them. Each family gets a *Delta proto message, regenerated prost bindings, and compute_delta(snapshot, threshold) + apply_delta_bytes(&[u8]) on its portable wrapper.

These deltas carry only sketch-internal state (cell and register updates). They never contained DataPoint-level metric scalars, so there is nothing to drop.

Per family

  • CountMinSketch -- additive cell deltas. A cell is carried when its absolute count change meets the threshold. The wire form is the packed cell_rows/cell_cols/d_counts arrays (signed sint64; CMS counters only grow but the signed form covers weighted/decay variants) plus full per-row l1/l2 norm deltas.
  • CountSketch -- signed cell deltas in the same packed-array form, plus per-row l2 norm deltas. Heavy-hitter candidate keys (hh_keys) are forwarded when an upstream tracker provides them; the minimal wrapper here leaves them empty.
  • HyperLogLog -- lossless register max-updates: the delta carries a register's new value only when it increased, and the receiver applies it via register[i] = max(register[i], value). No register update is ever dropped (there is no threshold to apply).
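
To make the thresholded cell-delta idea concrete, here is a minimal sketch of the CountMinSketch case. It is not the code in this PR: `Grid`, `CellDelta`, `apply_delta`, and the row-major layout are hypothetical stand-ins, the per-row norm deltas are left out, and skipping unchanged cells when the threshold is 0 is an assumption. Only the packed `cell_rows`/`cell_cols`/`d_counts` shape and the "absolute change meets the threshold" rule come from the description above.

```rust
/// Packed sparse delta: three parallel arrays, one entry per carried cell.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct CellDelta {
    pub cell_rows: Vec<u32>,
    pub cell_cols: Vec<u32>,
    pub d_counts: Vec<i64>, // signed, so decrements can be expressed too
}

/// Hypothetical dense counter grid (rows x cols, row-major).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Grid {
    pub rows: usize,
    pub cols: usize,
    pub cells: Vec<i64>,
}

impl Grid {
    pub fn new(rows: usize, cols: usize) -> Self {
        Grid { rows, cols, cells: vec![0; rows * cols] }
    }

    /// Carry a cell when |current - snapshot| >= threshold.
    /// Assumption: unchanged cells are never carried, even with threshold 0.
    pub fn compute_delta(&self, snapshot: &Grid, threshold: u64) -> CellDelta {
        assert_eq!((self.rows, self.cols), (snapshot.rows, snapshot.cols));
        let mut delta = CellDelta::default();
        for (i, (&now, &before)) in self.cells.iter().zip(&snapshot.cells).enumerate() {
            let d = now - before;
            if d != 0 && d.unsigned_abs() >= threshold {
                delta.cell_rows.push((i / self.cols) as u32);
                delta.cell_cols.push((i % self.cols) as u32);
                delta.d_counts.push(d);
            }
        }
        delta
    }

    /// Additive apply: each carried change is added onto the receiver's cell.
    pub fn apply_delta(&mut self, delta: &CellDelta) {
        for ((&r, &c), &d) in delta
            .cell_rows
            .iter()
            .zip(&delta.cell_cols)
            .zip(&delta.d_counts)
        {
            self.cells[r as usize * self.cols + c as usize] += d;
        }
    }
}
```

In this sketch a threshold of 1 carries every changed cell, so applying the delta onto the snapshot reproduces the current grid exactly. A larger threshold trades that exactness for a smaller delta.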

Byte parity

The emitted delta bytes are byte-identical to the Go reference implementation's delta output for identical input (cross-language byte parity). Each *Delta proto matches the Go reference's field numbers and wire encoding exactly, and each family has a golden test asserting the produced bytes match a fixture captured from the Go reference implementation.
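
Byte parity depends on both sides producing the same protobuf wire bytes for the same values. As an illustration of what that involves for a packed sint64 array such as d_counts, the sketch below hand-encodes one following the protobuf encoding rules (ZigZag mapping, base-128 varints, length-delimited packing). It is illustrative only: the PR uses prost-generated bindings, not this code, and the field number 3 in the usage note is an assumption, since the real field numbers are not stated here.

```rust
/// ZigZag-map a signed value so small magnitudes become small unsigned values:
/// 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
pub fn zigzag(n: i64) -> u64 {
    ((n << 1) ^ (n >> 63)) as u64
}

/// Append `v` as a base-128 varint (7 payload bits per byte, MSB = "more").
pub fn put_varint(mut v: u64, out: &mut Vec<u8>) {
    while v >= 0x80 {
        out.push((v as u8 & 0x7f) | 0x80);
        v >>= 7;
    }
    out.push(v as u8);
}

/// Encode a packed `repeated sint64` field: tag, payload length, payload.
/// `field_number` is whatever the .proto assigns (hypothetical in this sketch).
pub fn encode_packed_sint64(field_number: u32, values: &[i64]) -> Vec<u8> {
    let mut out = Vec::new();
    if values.is_empty() {
        return out; // an empty repeated field produces no bytes
    }
    let mut payload = Vec::new();
    for &v in values {
        put_varint(zigzag(v), &mut payload);
    }
    // Wire type 2 = length-delimited.
    put_varint(((field_number as u64) << 3) | 2, &mut out);
    put_varint(payload.len() as u64, &mut out);
    out.extend_from_slice(&payload);
    out
}
```

For example, `encode_packed_sint64(3, &[5, -1, 150])` yields the bytes `1a 04 0a 01 ac 02`. A golden test then reduces to comparing such output against the fixture bytes captured from the other implementation.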

Tests

For each family:

  • a round-trip test: build a sketch, compute_delta against a snapshot, apply_delta_bytes onto the snapshot, and assert it reconstructs the sketch (including a delta-against-empty case that reconstructs the full window state);
  • a byte-parity assertion against a Go-produced golden.
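
The round-trip property can be sketched for the HyperLogLog case as follows. This is a simplified stand-in rather than the PR's test code: `Registers`, `hll_compute_delta`, and `hll_apply_delta` are hypothetical names, and the delta is kept as in-memory (index, value) pairs instead of encoded proto bytes.

```rust
/// Hypothetical bare register array standing in for an HLL sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Registers(pub Vec<u8>);

/// Emit (index, new value) for every register that increased since `snapshot`.
/// Nothing is filtered by magnitude, so the delta is lossless.
pub fn hll_compute_delta(current: &Registers, snapshot: &Registers) -> Vec<(u32, u8)> {
    assert_eq!(current.0.len(), snapshot.0.len());
    current
        .0
        .iter()
        .zip(&snapshot.0)
        .enumerate()
        .filter(|(_, (now, before))| now > before)
        .map(|(i, (&now, _))| (i as u32, now))
        .collect()
}

/// Max-merge each update: register[i] = max(register[i], value).
pub fn hll_apply_delta(target: &mut Registers, delta: &[(u32, u8)]) {
    for &(i, value) in delta {
        let reg = &mut target.0[i as usize];
        *reg = (*reg).max(value);
    }
}

/// Round-trip check: snapshot + delta(current, snapshot) == current.
pub fn round_trips(current: &Registers, snapshot: &Registers) -> bool {
    let delta = hll_compute_delta(current, snapshot);
    let mut rebuilt = snapshot.clone();
    hll_apply_delta(&mut rebuilt, &delta);
    rebuilt == *current
}
```

Passing an all-zero snapshot exercises the delta-against-empty case, where the delta carries every non-zero register. Because the apply step is a max-merge, applying the same delta twice leaves the registers unchanged.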

cargo build and cargo test are green (422 lib tests passing).

zzylol and others added 2 commits May 26, 2026 10:08
…a/apply_delta_bytes

Extends the delta-encoding support previously added for DDSketch to three
more sketch families so the runtime can emit and apply sparse deltas for them.

Per family:
- CountMinSketch: additive cell deltas (signed sint64 on the wire; CMS
  counters only grow) carried as packed cell_rows/cell_cols/d_counts arrays
  plus full per-row L1/L2 norm deltas.
- CountSketch: signed cell deltas carried as packed arrays plus per-row L2
  norm deltas; heavy-hitter candidate keys (hh_keys) are forwarded when an
  upstream tracker provides them.
- HyperLogLog: lossless register max-updates -- a register's new value when
  it increased, applied via max-merge on the receiver. No update is ever
  dropped (there is no threshold to apply).

Each family gains a *Delta proto message (mirroring the Go reference
implementation's field numbers and wire encoding exactly), regenerated prost
bindings, and compute_delta(snapshot, threshold) + apply_delta_bytes(&[u8])
on its portable wrapper. The emitted delta bytes are byte-identical to the
Go reference implementation's delta output for identical input
(cross-language byte parity), pinned by per-family golden tests. These deltas
carry only sketch-internal state (cells / register updates), so there are no
DataPoint-level metric scalars to drop.

Tests: per-family round-trip (build -> compute_delta against a snapshot ->
apply_delta_bytes -> reconstructs) and a byte-parity assertion against a
golden captured from the Go reference implementation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@zzylol zzylol merged commit 7ee2829 into main May 26, 2026
3 checks passed