Conversation
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
51bc19c to
a79bbb0
Compare
doc.go: add the Storage group (process.go, hotsource.go) to the file map and switch the map header to 'by concern' now that the package spans two layers; shrink the 'Later layers' note accordingly. golangci-lint (this layer's own new findings): - hotchunk.DB.MaxCommittedSeq: drop named returns (nonamedreturns) - process.go / ingest driver.go: wrap two >120-char lines (lll)
Comment-only (behavior byte-identical; gofmt/vet/-short green). Same moderate trim applied to #805: consolidate the repeated "decision (a)" explanation (single shared DB, one atomic synced WriteBatch, single watermark) into its authoritative home — the pkg/stores/hotchunk package doc — and have process.go / hotsource.go / service.go / driver.go / hot_store.go reference it instead of re-stating it. Collapse prose dividers, cut restatement/over-justification. Kept all load-bearing rationale: loss-vs-staleness, mark-then-write ordering, the borrowed-buffer contract, the shared-vs-standalone close-once ownership model, and the live-chunk RocksDB-LOCK caller contract. (This branch was also rebased onto the updated slice1a, so it inherits the L1 comment polish from #805; the diff vs slice1a is now L2-only.)
| @@ -0,0 +1,138 @@ | |||
| // Package hotchunk implements decision (a): the per-chunk hot tier is | |||
There was a problem hiding this comment.
Design → The chunk hot DB (decision (a))
The per-chunk hot tier as ONE multi-CF RocksDB instance with a SINGLE authoritative watermark — the max committed ledger seq, read from the ledgers CF's last key — and no per-store frontier reconciliation. Each ledger commits as one atomic, synced WriteBatch, so every CF advances together.
| // Cold ingestion is ingest.RunColdChunk over the same ingester set RunCold uses; | ||
| // processChunk only chooses the LCM source (backfillSource) and drives the | ||
| // protocol around the freeze. | ||
| func processChunk(ctx context.Context, chunkID chunk.ID, artifacts ArtifactSet, cfg ProcessConfig) error { |
There was a problem hiding this comment.
Design → One write protocol (rule 1)
processChunk applies the per-kind freeze: rule-1 per-kind idempotency (a frozen kind self-skips; a freezing/pruning/absent key re-materializes), then mark-then-write — every kind goes freezing before any I/O, RunColdChunk writes the files, barrierNewFile fsyncs file+dirents, and only then do the keys flip to frozen.
| // 2. The frozen local .pack, when ledgers is NOT a requested output (re-derive | ||
| // without a download). | ||
| // 3. The configured bulk backend, gated by a bounded WaitForCoverage. | ||
| func backfillSource( |
There was a problem hiding this comment.
Design → Backfill → The primitives (rule 2 source preference)
backfillSource picks the cheapest LCM source for one chunk: (1) a ready, COMPLETE hot tier, (2) the frozen local .pack (when ledgers is not a requested output), (3) the bulk backend behind a bounded coverage wait. The hot branch fatals only on LOSS (ErrHotVolumeLost); an incomplete hot DB is staleness and falls through, because re-derivation IS its recovery.
| "github.com/stellar/stellar-rpc/cmd/stellar-rpc/internal/fullhistory/pkg/stores/ledger" | ||
| ) | ||
|
|
||
| // rocksHotProbe is the production HotProbe: it opens the chunk's SINGLE shared |
There was a problem hiding this comment.
Design → Hot DB helpers
The production HotProbe: opens a chunk's shared hot DB and exposes it as a freeze source, so a just-closed chunk freezes straight from its hot tier without a refetch. Caller contract: never probe the chunk captive-core is actively ingesting — it holds the RocksDB LOCK.
| // ledger commits as one atomic synced WriteBatch (decision (a)); the view is | ||
| // consumed synchronously before the next pull. The hot DB is NOT closed here — | ||
| // the caller owns its lifecycle. | ||
| func RunHot( |
There was a problem hiding this comment.
Design → Hot DB Ingestion
RunHot drives per-ledger ingestion into the injected, caller-owned per-chunk hot DB: each ledger is one atomic synced WriteBatch (decision (a)), consumed from a borrowed view synchronously. The DB is chunk-bound; a mismatch is rejected up front.
| // paths, so RunColdChunk is the re-materialization primitive the streaming | ||
| // freeze protocol drives: a partial file from a crashed attempt is inert scratch | ||
| // the next call overwrites. | ||
| func RunColdChunk( |
There was a problem hiding this comment.
Design → Filesystem artifacts
RunColdChunk is the single-chunk freeze primitive processChunk drives: it streams one chunk into its cold .pack artifact, overwriting any partial from a crashed attempt — the re-materialization the one-write protocol relies on.
| // LedgersCF is the column family the hot ledger data lives in (decision (a): | ||
| // one multi-CF RocksDB per chunk). The standalone OpenHotStore path registers | ||
| // the same CF name, so the on-disk layout is identical shared or standalone. | ||
| const LedgersCF = "ledgers" |
There was a problem hiding this comment.
Design → The chunk hot DB
LedgersCF is the ledger column family inside the shared multi-CF hot DB. The same CF name is registered on the standalone OpenHotStore path, so the on-disk layout is identical whether the store is shared (decision (a)) or standalone.
… fixes (1) Durability fix (real, latent): the post-write barrier gated the grandparent (ledgersRoot) dirent fsync on chunkID % ChunksPerBucket == 0 — a proxy for "first chunk in this bucket" that only holds when buckets are entered in order at their aligned chunk. The bucket dir is created by NewLedgerColdIngester's MkdirAll for whichever chunk freezes FIRST in that bucket, so a frontfill-at-tip or out-of-order backfill/recovery (e.g. chunk 1500 first into bucket 00001) created the dir without fsyncing its dirent — a crash in that window orphans a "frozen" pack, violating "frozen => file durably exists". Replace the arithmetic with the real predicate: snapshot which artifact parent dirs are absent BEFORE the write, and fsync the grandparent for exactly those. Tested via createdParentDirs (TestCreatedParentDirs_FlagsNewBucketRegardlessOfChunkNumber). (2) ingest.Config doc: Txhash/Events are NOT removed — config.go is inherited byte-identical by slices 2/3, which wire those fields additively in driver.go; removing or rejecting them here would break those slices. Instead, document that they are forward-declarations honored only where a slice wires them (selecting an unwired type is accepted but writes nothing). Also fix the stale "per-ledger hot fan-out is always parallel" line — under decision (a) each ledger is one atomic WriteBatch, there is no fan-out.
|
Superseded by a new 2-phase stacked series that re-slices this work by phase (backfill → live ingestion + lifecycle), with the MVP scope cuts (recovery / audit / convergence / retention-reconfiguration dropped) and the folded-in fixes (cold+hot ingest service +
Each layer builds + |
Slice 1, Layer 2 of 4 — Storage primitives. Stacked on Layer 1 (foundations). Diff is storage-only.
Producing and freezing a chunk:
pkg/stores/hotchunk, ledger CF) — one atomic syncedWriteBatchper ledger;processChunk+backfillSource(materialize a chunk's cold.packvia the one-write protocol; source preference hot→local→backend);ingest/driver,service).Compiles +
-short-green on top of Layer 1.