Unified ingestion workflow: streaming daemon — Slice 3 (tx-hash)#804

Closed
chowbao wants to merge 2 commits into streaming-slice2-events from streaming-slice3-txhash

@chowbao chowbao commented Jun 23, 2026

Contributor

Slice 3 of the unified ingestion workflow (design #722 / roadmap #777), stacked on slice 2 (#803 — ledgers + events). Base is streaming-slice2-events; the diff is the tx-hash subsystem on top. This is the largest slice — the design's hardest-to-review component — and it completes the daemon: with slices 1–3 applied, the fullhistory tree is byte-identical to the full 3-type implementation.

What this adds (the tx-hash data type + rolling index)

  • The txhash column family in the per-chunk hot DB — one atomic synced WriteBatch per ledger now carries ledgers + events + tx-hashes.
  • The per-chunk sorted .bin run + the per-window streamhash .idx, with the rolling rebuild on each chunk boundary, coverage [lo,hi], and the atomic promote/demote commit batch.
  • The resolver's per-window IndexBuild + the executor's index-build stratum (the chunk→index done-channel dependency); index-aware discard (a hot DB survives until its window index covers the chunk), prune's redundant-.bin branch, surgical recovery of index keys, and the audit INV-2 (single frozen coverage / no leftover .bin) + INV-3/INV-4 index walks.
  • The chunks_per_txhash_index config pin, rebuild observability, and the multi-window tx-hash lookup E2E (cross-window false-positive rejection).

Because tx-hash is the per-window index subsystem (coverage, commit-batch atomicity, the chunk→index DAG), this slice necessarily touches more of the orchestration than slices 1–2 — which is exactly why it was sequenced last.
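The index-aware discard rule above (a hot DB survives until its window index covers the chunk) can be sketched as a small predicate over the window's coverage [lo,hi]. This is only an illustration: the `Coverage` type, `windowOf`, `canDiscardHotDB`, and the assumption that windows are fixed-size runs of `chunks_per_txhash_index` chunks starting at chunk 0 are hypothetical names and geometry, not taken from the PR's code.

```go
package main

import "fmt"

// Coverage is the inclusive chunk range [Lo, Hi] that a window's index
// currently covers. Hypothetical type, for illustration only.
type Coverage struct {
	Lo, Hi uint32
}

// Covers reports whether chunk falls inside the inclusive range.
func (c Coverage) Covers(chunk uint32) bool {
	return chunk >= c.Lo && chunk <= c.Hi
}

// windowOf maps a chunk to its window, ASSUMING fixed-size windows of
// chunksPerIndex chunks that start at chunk 0.
func windowOf(chunk, chunksPerIndex uint32) uint32 {
	return chunk / chunksPerIndex
}

// canDiscardHotDB returns true only once the window's index coverage
// includes the chunk. A nil coverage means no index has been committed
// for the window yet, so the hot DB must be kept.
func canDiscardHotDB(chunk uint32, cov *Coverage) bool {
	return cov != nil && cov.Covers(chunk)
}

func main() {
	const chunksPerIndex = 4 // hypothetical value of chunks_per_txhash_index

	// The rolling rebuild has indexed chunks 4 and 5 of window 1 so far.
	cov := &Coverage{Lo: 4, Hi: 5}

	for _, chunk := range []uint32{4, 5, 6} {
		fmt.Printf("chunk %d (window %d): discardable=%v\n",
			chunk, windowOf(chunk, chunksPerIndex), canDiscardHotDB(chunk, cov))
	}
}
```

In this sketch chunk 6 belongs to the same window as chunks 4 and 5 but is not yet covered, so its hot DB is the only copy that can serve tx-hash lookups and has to stay until the next rebuild extends the coverage.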

Composes (already on feature/full-history)

The txhash store + single-index build (#728/#729) and the read path (#794), plus the streamhash dependency.

The stack is now complete

Slices 1 (ledgers) → 2 (events) → 3 (tx-hash) together reconstruct the full daemon.

Testing

Built against RocksDB 10.9.1 (grocksdb 1.10.7). Full fullhistory tree green on the non-short suite incl. the multi-window lookup E2E: go test ./cmd/stellar-rpc/internal/fullhistory/....

@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@chowbao chowbao force-pushed the streaming-slice2-events branch 2 times, most recently from cb5c1e3 to a0935aa Compare June 23, 2026 16:35
@chowbao chowbao force-pushed the streaming-slice3-txhash branch from b0eddbc to 007d123 Compare June 23, 2026 16:35
@chowbao chowbao force-pushed the streaming-slice2-events branch from a0935aa to 9c92206 Compare June 23, 2026 17:35
@chowbao chowbao force-pushed the streaming-slice3-txhash branch from 007d123 to 8b558b8 Compare June 23, 2026 17:35
@chowbao chowbao force-pushed the streaming-slice2-events branch from 9c92206 to de84d1d Compare June 23, 2026 22:33
@chowbao chowbao force-pushed the streaming-slice3-txhash branch from 8b558b8 to dba85ae Compare June 23, 2026 22:33
@chowbao chowbao force-pushed the streaming-slice2-events branch from de84d1d to e823bc7 Compare June 24, 2026 02:54
@chowbao chowbao force-pushed the streaming-slice3-txhash branch from dba85ae to e0b2efa Compare June 24, 2026 02:54
@chowbao chowbao force-pushed the streaming-slice2-events branch from e823bc7 to a5d2543 Compare June 24, 2026 03:40
@chowbao chowbao force-pushed the streaming-slice3-txhash branch from e0b2efa to 606fa0a Compare June 24, 2026 03:40
chowbao added 2 commits June 23, 2026 23:45
Stacked on slice 2 (ledgers + events); this commit's diff is the tx-hash
subsystem on top. Completes the daemon — with slices 1-3 applied the tree is
byte-identical to the full 3-type implementation.

Adds the TX-HASH data type and its per-window rolling-index subsystem:
- the txhash column family in the per-chunk hot DB (one atomic synced
  WriteBatch per ledger now carries ledgers + events + tx-hashes);
- the per-chunk sorted .bin run + the per-window streamhash .idx, with the
  rolling rebuild on each chunk boundary, coverage [lo,hi], and the atomic
  promote/demote commit batch;
- the resolver's per-window IndexBuild + the executor's index-build stratum
  (chunk->index done-channel dependency); index-aware discard (a hot DB lives
  until its window index covers the chunk), prune's redundant-.bin branch,
  surgical recovery of index keys, and the audit INV-2 (single frozen coverage /
  no leftover .bin) + INV-3/INV-4 index walks;
- the chunks_per_txhash_index config pin, rebuild observability, and the
  multi-window tx-hash lookup E2E (cross-window false-positive rejection).

Composes the txhash store + single-index build (#728/#729) and the read path
(#794), plus the streamhash dependency, already on feature/full-history.

Built against RocksDB 10.9.1 (grocksdb 1.10.7); fullhistory tree green on the
non-short suite incl. the multi-window lookup E2E.
- doc.go (resolved in the rebase): scope the file map to the complete
  daemon, add window.go (geometry) + an Index group (txindex.go), drop the
  forbidden design-docs/full-history-implementation-status.md reference,
  prefer 'catalog' over 'meta-store'.
- Remove PERF.md + perf_test.go: the bench-format-alignment material
  belongs with the bench harness, not the daemon PR.
- Add TestBuildTxhashIndex_SameWindowKeyCollisionFailsLoud: a same-window
  16-byte-prefix collision must fail loudly with streamhash.ErrDuplicateKey
  (issue #814 acceptance), never silently drop — previously uncovered.
- golangci-lint (this slice's own new findings): gci/misspell/modernize/
  unconvert via --fix; //nolint:cyclop on buildTxhashIndex; revive unused
  pendingArtifacts cfg -> _; lll wrap; //nolint:unparam on the general test
  helpers; //nolint:funlen,cyclop,maintidx on the lookup E2E.

chowbao commented Jun 24, 2026

Contributor Author

Superseded by a new 2-phase stacked series that re-slices this work by phase (backfill → live ingestion + lifecycle), with the MVP scope cuts (recovery / audit / convergence / retention-reconfiguration dropped) and the folded-in fixes (cold+hot ingest service + NewPrometheusSink metrics wiring, exponential withRetries backoff, deletion of the dead RunHot/RunCold stream-drain orchestration):

Each layer builds and is green under go vet and go test -short; the capstone (#821) also passes the lifecycle E2E. Leaving this open for now; it can be closed once the new stack is reviewed.

@chowbao chowbao closed this Jun 24, 2026