Streaming daemon: Phase 1 layer 1 - foundations (#815) #817
Catalog + one-write protocol + key schema, config loader/locking, chunk/window geometry, the retention floor/gate, the crash-injection hook seam, and the artifact/path layout for the full-history streaming daemon. Cold substrate only; the primitives (processChunk, buildTxhashIndex) and orchestration land in the stacked layers above. Part of #815.
chowbao
left a comment
Two kinds of inline notes below:
- 📍 design/issue pointers — each maps a component to its design-doc section and the Phase 1 issue (#815, epic #808), so reviewers have an exact what/where anchor right at the code. They mirror the PR description's file map.
- 🔎 flagged for review — items a cleanup pass surfaced but left unchanged because they're design/behavior calls, not mechanical fixes.
| // Windows is window arithmetic bound to one chunks_per_txhash_index value. The | ||
| // value is immutable for a deployment (pinned in config:chunks_per_txhash_index | ||
| // on first start), so a Windows is constructed once and shared. | ||
| type Windows struct { |
📍 Phase 1 foundations · Geometry & window arithmetic — #815.
Windows is the chunk↔window↔index layer over the merged pkg/chunk, bound to one immutable chunks_per_txhash_index. MaxChunksPerTxhashIndex is the cpi bound; IsTerminalCoverage is the finalized-window predicate.
All the math in windows.go comes from the design docs referenced above.
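The chunk-to-window arithmetic described in this thread can be sketched as follows. This is a minimal illustration, not the PR's code: the method names other than IsTerminalCoverage, the zero check, and the rule that a terminal coverage spans exactly one whole window are assumptions.

```go
package main

import "fmt"

// Windows is a hypothetical stand-in for the PR's type: window arithmetic
// bound to one chunks-per-index (cpi) value that never changes afterwards.
type Windows struct {
	cpi uint32
}

// NewWindows rejects a zero cpi so the divisions below cannot panic.
func NewWindows(cpi uint32) (Windows, error) {
	if cpi == 0 {
		return Windows{}, fmt.Errorf("chunks per index must be positive")
	}
	return Windows{cpi: cpi}, nil
}

// WindowOf returns the index window that contains the chunk.
func (w Windows) WindowOf(chunk uint32) uint32 { return chunk / w.cpi }

// FirstChunk returns the first chunk of a window. Overflow for very large
// window numbers is ignored in this sketch.
func (w Windows) FirstChunk(window uint32) uint32 { return window * w.cpi }

// LastChunk returns the last chunk of a window (inclusive).
func (w Windows) LastChunk(window uint32) uint32 {
	return w.FirstChunk(window) + w.cpi - 1
}

// IsTerminalCoverage reports whether [first, last] spans one whole window,
// which this sketch assumes is what "finalized window" means.
func (w Windows) IsTerminalCoverage(first, last uint32) bool {
	win := w.WindowOf(first)
	return first == w.FirstChunk(win) && last == w.LastChunk(win)
}

func main() {
	w, err := NewWindows(1000)
	if err != nil {
		panic(err)
	}
	fmt.Println(w.WindowOf(2345), w.FirstChunk(2), w.LastChunk(2))
	fmt.Println(w.IsTerminalCoverage(2000, 2999), w.IsTerminalCoverage(2000, 2500))
}
```

Binding the value at construction time is what lets a single Windows be shared: every caller derives boundaries from the same divisor.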
| // State is an artifact key's lifecycle value. Per-chunk artifacts and index | ||
| // coverages share the same three states with the same meanings; the empty | ||
| // State (key absent) means "neither file nor in-progress write exists". | ||
| type State string |
📍 Phase 1 foundations · catalog key schema + lifecycle states — #815.
The freezing → frozen → pruning lifecycle and the chunk: / index: / config: key families (chunkKey, indexKey below). These keys are the single source the path bijection (paths.go) and the sweeps walk instead of listing directories.
Design: Catalog keys · Index keys.
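A small sketch of the lifecycle and key families described above may help. The three state names come from the comment; everything else is an assumption for illustration: the exact key encoding ("chunk:<kind>:<zero-padded id>"), the helper names other than chunkKey, and treating key deletion as the step after pruning.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// State mirrors the three lifecycle values named in the review; the empty
// State stands for "key absent".
type State string

const (
	StateAbsent   State = ""
	StateFreezing State = "freezing"
	StateFrozen   State = "frozen"
	StatePruning  State = "pruning"
)

// canAdvance allows only the forward steps absent -> freezing -> frozen ->
// pruning -> absent. The final step (key deletion) is an assumption.
func canAdvance(from, to State) bool {
	switch from {
	case StateAbsent:
		return to == StateFreezing
	case StateFreezing:
		return to == StateFrozen
	case StateFrozen:
		return to == StatePruning
	case StatePruning:
		return to == StateAbsent
	}
	return false
}

// chunkKey uses a hypothetical encoding. Zero padding makes lexicographic
// key order match numeric chunk order, which keeps prefix scans ordered.
func chunkKey(kind string, id uint32) string {
	return fmt.Sprintf("chunk:%s:%010d", kind, id)
}

// parseChunkKey inverts chunkKey and rejects keys outside the chunk: family
// or with a malformed id.
func parseChunkKey(key string) (string, uint32, error) {
	parts := strings.Split(key, ":")
	if len(parts) != 3 || parts[0] != "chunk" || len(parts[2]) != 10 {
		return "", 0, fmt.Errorf("not a chunk key: %q", key)
	}
	n, err := strconv.ParseUint(parts[2], 10, 32)
	if err != nil {
		return "", 0, fmt.Errorf("bad chunk id in %q: %w", key, err)
	}
	return parts[1], uint32(n), nil
}

func main() {
	key := chunkKey("ledger", 42)
	kind, id, err := parseChunkKey(key)
	fmt.Println(key, kind, id, err)
	fmt.Println(canAdvance(StateFreezing, StateFrozen), canAdvance(StateFrozen, StateFreezing))
}
```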
| // data path reads/writes. Below each per-tree root the bucket/window structure | ||
| // is fixed (a bucket is a filesystem concern only; bucket ids never appear in | ||
| // meta-store keys). | ||
| type Layout struct { |
📍 Phase 1 foundations · key↔path bijection — #815.
Layout is the single source of the strict key→file mapping (one root per artifact tree) that lets cleanup walk keys rather than readdir. The same roots flocked by LockRoots are the ones read/written here.
Design: Filesystem artifacts · Directory layout.
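To illustrate why a strict key-to-path mapping lets cleanup walk keys instead of listing directories, here is a hedged sketch. The bucket width, the zero-padded names, and the ".pack" suffix are all hypothetical; the PR delegates real leaf filenames to the owning store packages.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strconv"
	"strings"
)

// chunksPerBucket is a hypothetical bucket width. Buckets exist only on the
// filesystem; the id alone determines the bucket, so keys never carry it.
const chunksPerBucket = 1000

// Layout is a simplified stand-in with a single artifact root.
type Layout struct {
	LedgersRoot string
}

// LedgerPath maps a chunk id to exactly one file path.
func (l Layout) LedgerPath(id uint32) string {
	bucket := id / chunksPerBucket
	return filepath.Join(l.LedgersRoot,
		fmt.Sprintf("%05d", bucket),
		fmt.Sprintf("%010d.pack", id))
}

// ChunkOfPath inverts LedgerPath. It accepts a path only if mapping the
// parsed id forward again reproduces the same path, which keeps the
// mapping one-to-one in both directions.
func (l Layout) ChunkOfPath(p string) (uint32, error) {
	name := filepath.Base(p)
	if !strings.HasSuffix(name, ".pack") {
		return 0, fmt.Errorf("not a pack file: %s", p)
	}
	n, err := strconv.ParseUint(strings.TrimSuffix(name, ".pack"), 10, 32)
	if err != nil {
		return 0, fmt.Errorf("bad chunk id in %s: %w", p, err)
	}
	id := uint32(n)
	if l.LedgerPath(id) != filepath.Clean(p) {
		return 0, fmt.Errorf("%s is not the canonical path of chunk %d", p, id)
	}
	return id, nil
}

func main() {
	l := Layout{LedgersRoot: "/data/ledgers"}
	p := l.LedgerPath(2345)
	id, err := l.ChunkOfPath(p)
	fmt.Println(p, id, err)
}
```

Because every key names exactly one path, a sweep can delete by key and never needs readdir to discover what exists.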
| // layout. The catalog stays a *pure* catalog — every key names a file/dir | ||
| // state or a config pin; progress is derived, never stored (see the data | ||
| // model's "Progress is derived, never stored"). | ||
| type Catalog struct { |
📍 Phase 1 foundations · catalog reader — #815.
A pure wrapper over metastore.Store — progress is derived, never stored. FrozenCoverage asserts the single-canonical-state invariant; PinLayout is the atomic first-start commit primitive (its caller validateConfig lands in a later layer).
Design: Data model · Invariants.
| // The caller MUST have fsynced the .idx file and its dir first. CommitIndex | ||
| // re-reads the predecessor inside the batch-composition phase from durable | ||
| // state, so it is safe to call after a crash without external bookkeeping. | ||
| func (c *Catalog) CommitIndex(cov IndexCoverage) error { |
📍 Phase 1 foundations · the one-write protocol — #815.
CommitIndex is the full index-finalization batch: promote coverage → demote predecessor → on a terminal build demote the in-window txhash keys, all in one synced atomic batch (so there's never an instant with two frozen coverages).
Design: One write protocol · gettx §7.3 Finalization.
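The promote-and-demote batch can be sketched with an in-memory store. This is an illustration under stated assumptions, not the PR's CommitIndex: the key strings are made up, the sketch models a single window, and a mutex stands in for the synced atomic write batch of the real metastore.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// memCatalog is a hypothetical in-memory stand-in for the catalog.
type memCatalog struct {
	mu sync.Mutex
	kv map[string]string
}

func newMemCatalog() *memCatalog { return &memCatalog{kv: map[string]string{}} }

// frozenLocked lists index: keys in state "frozen". The caller holds mu.
func (c *memCatalog) frozenLocked() []string {
	var keys []string
	for k, v := range c.kv {
		if strings.HasPrefix(k, "index:") && v == "frozen" {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys
}

// commitIndex promotes newKey to frozen, demotes the previous frozen
// coverage, and (for a terminal build) demotes the given txhash keys.
// All writes are composed into one batch and applied together.
func (c *memCatalog) commitIndex(newKey string, terminalTxhashKeys []string) error {
	c.mu.Lock()
	defer c.mu.Unlock()

	prev := c.frozenLocked() // re-read the predecessor from current state
	if len(prev) > 1 {
		return fmt.Errorf("invariant violated: %d frozen coverages", len(prev))
	}
	batch := map[string]string{newKey: "frozen"}
	for _, k := range prev {
		if k != newKey {
			batch[k] = "pruning"
		}
	}
	for _, k := range terminalTxhashKeys {
		batch[k] = "pruning"
	}
	// A real store would commit this as one synced atomic write batch.
	for k, v := range batch {
		c.kv[k] = v
	}
	return nil
}

// frozen returns the frozen index keys; state returns one key's value.
func (c *memCatalog) frozen() []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.frozenLocked()
}

func (c *memCatalog) state(key string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.kv[key]
}

func main() {
	c := newMemCatalog()
	_ = c.commitIndex("index:w0:0-499", nil)
	_ = c.commitIndex("index:w0:0-999", []string{"chunk:txhash:0000000000"})
	fmt.Println(c.frozen(), c.state("index:w0:0-499"))
}
```

Because the promotion and the demotion land in the same batch, a reader sees either the old coverage or the new one, never both.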
| // coverage refresh) — how it obtains completeThrough is the query-routing | ||
| // design's concern; this type only fixes the contract's arithmetic so the read | ||
| // path and the prune stage cannot drift. | ||
| type RetentionGate struct { |
📍 Phase 1 foundations · reader retention gate — #815.
RetentionGate makes a below-floor read not-found regardless of on-disk state — the contract that lets prune/sweep unlink without coordinating with the index lifecycle. Floor/gate only; retention reconfiguration is dropped for the MVP.
Design: Reader contract · gettx §8.2 Cold lookup.
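The gate's arithmetic can be sketched as below. The names effectiveRetentionFloor, NewRetentionGate, and Admits appear in this PR's commit messages, but their signatures here are assumptions, and the sketch measures retention in ledgers for simplicity where the PR's tests speak of retention_chunks.

```go
package main

import "fmt"

// effectiveRetentionFloor returns the lowest ledger sequence still served.
// Assumed rules: retention 0 means full history, and the floor never drops
// below the earliest ledger.
func effectiveRetentionFloor(earliest, completeThrough, retention uint32) uint32 {
	if retention == 0 {
		return earliest // full history: pin at the earliest ledger
	}
	if retention > completeThrough {
		return earliest // young store or oversized retention: clamp
	}
	floor := completeThrough - retention + 1
	if floor < earliest {
		return earliest
	}
	return floor
}

// RetentionGate fixes the floor once so the read path and the prune stage
// evaluate the same comparison.
type RetentionGate struct {
	floor uint32
}

func NewRetentionGate(earliest, completeThrough, retention uint32) RetentionGate {
	return RetentionGate{floor: effectiveRetentionFloor(earliest, completeThrough, retention)}
}

// Admits reports whether a read for seq may be served. A below-floor read
// is "not found" regardless of what is still on disk.
func (g RetentionGate) Admits(seq uint32) bool { return seq >= g.floor }

func main() {
	g := NewRetentionGate(2, 1000, 100)
	fmt.Println(g.floor, g.Admits(900), g.Admits(901))
}
```

Since readers refuse anything below the floor, prune and sweep can unlink below-floor files without first checking whether an index still points at them.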
| // The representation is a fixed-width bitmask over allKinds' canonical order, so | ||
| // Kinds() yields kinds in that order (the same order buildColdIngesters uses) | ||
| // and membership tests are allocation-free. | ||
| type ArtifactSet struct { |
📍 Phase 1 foundations · seam for the layers above — #815.
ArtifactSet is the Kind set processChunk (a later layer) consumes, narrowed by per-kind idempotency. It's not dead in the stack — process.go / resolve.go / eligibility.go in the layers above this one already consume it.
Design: Invariants.
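A fixed-width bitmask set of this shape might look like the sketch below. The three kinds and the method names other than Kinds are assumptions chosen for illustration; only the bitmask representation and the canonical iteration order come from the doc comment.

```go
package main

import "fmt"

// Kind identifies an artifact type. The concrete kinds are hypothetical;
// their declaration order is the canonical order.
type Kind uint8

const (
	KindLedger Kind = iota
	KindEvents
	KindTxhash
	numKinds
)

// ArtifactSet stores one bit per Kind, so copying and membership tests
// need no allocation.
type ArtifactSet struct {
	bits uint8
}

// NewArtifactSet builds a set from the given kinds, ignoring unknown ones.
func NewArtifactSet(kinds ...Kind) ArtifactSet {
	var s ArtifactSet
	for _, k := range kinds {
		if k < numKinds {
			s.bits |= uint8(1) << k
		}
	}
	return s
}

// Has reports membership without allocating.
func (s ArtifactSet) Has(k Kind) bool {
	return k < numKinds && s.bits&(uint8(1)<<k) != 0
}

// Without returns a copy with k removed, for example once that kind is
// already durable and can be skipped.
func (s ArtifactSet) Without(k Kind) ArtifactSet {
	if k < numKinds {
		s.bits &^= uint8(1) << k
	}
	return s
}

// Kinds lists members in canonical order, whatever the insertion order was.
func (s ArtifactSet) Kinds() []Kind {
	out := make([]Kind, 0, numKinds)
	for k := Kind(0); k < numKinds; k++ {
		if s.Has(k) {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	s := NewArtifactSet(KindTxhash, KindLedger)
	fmt.Println(s.Kinds(), s.Has(KindEvents), s.Without(KindLedger).Kinds())
}
```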
| // | ||
| // The primitives layer (processChunk, buildTxhashIndex) adds its own hooks to | ||
| // this struct. | ||
| type crashHooks struct { |
📍 Phase 1 foundations · test-only fault-injection seam — #815.
crashHooks is nil in production; tests set it to assert the mark→fsync→flip ordering and batch atomicity from inside the real protocol methods — it's how the crash-safety invariants are exercised.
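The nil-in-production seam can be illustrated as follows. The hook names afterMarkFreezing and afterBarrier are mentioned elsewhere in this conversation; the freezer type, the fire helper, and the step log are assumptions made up for this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// errInjected marks a simulated crash.
var errInjected = errors.New("injected crash")

// crashHooks holds optional callbacks fired between protocol steps.
type crashHooks struct {
	afterMarkFreezing func() error
	afterBarrier      func() error
}

// freezer is a hypothetical protocol object; hooks stays nil in production.
type freezer struct {
	hooks *crashHooks
	steps []string // records which steps ran, for the demonstration only
}

// fire runs the selected hook if both the hook set and the hook are set.
func (f *freezer) fire(pick func(*crashHooks) func() error) error {
	if f.hooks == nil {
		return nil
	}
	if h := pick(f.hooks); h != nil {
		return h()
	}
	return nil
}

// freeze walks mark -> fsync -> flip and stops at the first hook error,
// which is how a test simulates power loss between two steps.
func (f *freezer) freeze() error {
	f.steps = append(f.steps, "mark")
	if err := f.fire(func(h *crashHooks) func() error { return h.afterMarkFreezing }); err != nil {
		return err
	}
	f.steps = append(f.steps, "fsync")
	if err := f.fire(func(h *crashHooks) func() error { return h.afterBarrier }); err != nil {
		return err
	}
	f.steps = append(f.steps, "flip")
	return nil
}

func main() {
	prod := &freezer{}
	fmt.Println(prod.freeze(), prod.steps)

	crashed := &freezer{hooks: &crashHooks{
		afterBarrier: func() error { return errInjected },
	}}
	fmt.Println(crashed.freeze(), crashed.steps)
}
```

A test can then assert on the durable state left behind after the injected stop, for example that the key was never flipped.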
Non-behavioral cleanup of the Phase 1 foundations package.
Code:
- Delegate cold-artifact leaf filenames to the owning store packages (ledger.PackName, eventstore.*, txhash.ColdBinName) instead of re-deriving them in Layout: one source of truth, matching ingest.
- Alias DefaultChunksPerTxhashIndex to txhash.DefaultChunksPerIndex and DefaultEarliestLedger to EarliestGenesis.
- Extract syncAndClose, the sync-then-close epilogue shared by fsyncFile/fsyncDir.
- Compute chunkKey once per iteration in windowTxhashKeysPresent.
- Drop the duplicate Windows.ChunksIn (identical to ChunksPerIndex).
Docs:
- Condense doc comments across the package (~26% fewer comment words; the high-volume files 20–43%), cutting repetition while preserving invariants, rationale, design-doc citations, and //nolint directives.
No behavior change; gofmt clean, build/vet/tests green.
| } | ||
| // --------------------------------------------------------------------------- | ||
| // fsync barriers — the os-level durability primitives the one-write protocol and |
📍 Phase 1 foundations · one-write-protocol durability primitives — #815.
The os-level fsync barriers (barrierNewFile, fsyncParentDirs) the mark→fsync→flip protocol and the sweeps depend on: a creation is durable only once both the file's data and its dirent are fsynced.
Design: One write protocol. (Re-anchored to the current commit; the original went stale after the cleanup push.)
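The "data plus dirent" rule can be sketched with the standard library alone. The helper names mirror ones mentioned in this PR (syncAndClose, fsyncFile, fsyncDir, barrierNewFile), but the bodies are assumptions; this sketch syncs only the file and its parent directory and relies on Unix semantics, where a directory can be opened and fsynced.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// syncAndClose fsyncs f and always closes it, reporting the first error.
func syncAndClose(f *os.File) error {
	syncErr := f.Sync()
	closeErr := f.Close()
	if syncErr != nil {
		return syncErr
	}
	return closeErr
}

// fsyncFile flushes a file's data to stable storage.
func fsyncFile(path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	return syncAndClose(f)
}

// fsyncDir flushes a directory, making the entries inside it durable.
func fsyncDir(dir string) error {
	d, err := os.Open(dir)
	if err != nil {
		return err
	}
	return syncAndClose(d)
}

// barrierNewFile makes a newly created file durable: first its data, then
// the directory entry that names it.
func barrierNewFile(path string) error {
	if err := fsyncFile(path); err != nil {
		return err
	}
	return fsyncDir(filepath.Dir(path))
}

// demoBarrier writes a file in a temp dir and runs the barrier on it.
func demoBarrier() error {
	dir, err := os.MkdirTemp("", "barrier-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "chunk.pack")
	if err := os.WriteFile(path, []byte("payload"), 0o644); err != nil {
		return err
	}
	return barrierNewFile(path)
}

func main() {
	fmt.Println("barrier error:", demoBarrier())
}
```

Only after the barrier returns would a caller flip the catalog key to "frozen", so the catalog never advertises a file that a crash could still lose.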
| type Paths struct { | ||
| DataDir string // the data root | ||
| Catalog string // catalog RocksDB dir | ||
| Ledgers string // immutable ledger packs root | ||
| Events string // immutable events segments root | ||
| TxhashRaw string // transient txhash .bin root | ||
| TxhashIndex string // frozen txhash .idx root | ||
| HotStorage string // per-chunk hot RocksDB root | ||
| } |
Different components can live on different storage (NVMe vs. EBS).
| "os" | ||
| "path/filepath" | ||
| "golang.org/x/sys/unix" |
This only works on Unix for now. Not sure whether making it compile and work on Windows is necessary.
Worth doing — the build CI matrix includes windows-latest running go build ./cmd/stellar-rpc, so this will fail to compile there once the daemon is wired into the binary, not just lose functionality.
github.com/gofrs/flock is probably the cleanest fix: it handles flock/LockFileEx internally (no build tags), and its non-blocking TryLock() matches the current LOCK_NB fail-fast behavior, so LockRoots/RootLocks stay as-is.
Addressed in 0953299 — split the lock primitive behind acquireLock/releaseLock with build tags: config_lock_unix.go keeps the flock (LOCK_EX|LOCK_NB, EWOULDBLOCK), and config_lock_windows.go adds the equivalent LockFileEx (LOCKFILE_EXCLUSIVE_LOCK|LOCKFILE_FAIL_IMMEDIATELY, ERROR_LOCK_VIOLATION) over a one-byte range. Both release on handle close and on any process exit, so the crash-release contract (and TestLockRoots_ClosingFdReleasesLock) holds on both. The shared config_lock.go is now platform-neutral; LockRoots/Release signatures are unchanged.
Note: the unix path is runtime-tested; the Windows path is compile-verified against golang.org/x/sys/windows (no Windows host to run it on here).
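For the Unix side, the fail-fast lock and its release-on-close behavior can be sketched with the standard syscall package. This is a Unix-only illustration, not the PR's acquireLock: tryLock, errLocked, and demoLock are hypothetical names, and the lock file name is made up.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

var errLocked = errors.New("root is locked by another holder")

// tryLock opens path and takes a non-blocking exclusive flock on it.
// The lock lives as long as the returned file stays open.
func tryLock(path string) (*os.File, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0o644)
	if err != nil {
		return nil, err
	}
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB); err != nil {
		f.Close()
		if errors.Is(err, syscall.EWOULDBLOCK) {
			return nil, errLocked
		}
		return nil, err
	}
	return f, nil
}

// demoLock reports whether a second open was refused while the first held
// the lock, and whether the lock could be retaken after the first holder
// closed its descriptor without an explicit unlock.
func demoLock() (bool, bool, error) {
	dir, err := os.MkdirTemp("", "lock-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "LOCK")

	first, err := tryLock(path)
	if err != nil {
		return false, false, err
	}
	probe, err := tryLock(path)
	blocked := errors.Is(err, errLocked)
	if probe != nil {
		probe.Close()
	}

	// Closing the descriptor, with no LOCK_UN, is what releases the lock.
	if err := first.Close(); err != nil {
		return blocked, false, err
	}
	second, err := tryLock(path)
	if err != nil {
		return blocked, false, err
	}
	defer second.Close()
	return blocked, true, nil
}

func main() {
	blocked, reacquired, err := demoLock()
	fmt.Println(blocked, reacquired, err)
}
```

The same kernel behavior covers a killed process: its descriptors are closed on exit, so the next start can take the lock without any cleanup step.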
Review-time cleanups, no behavior change:
- Rename the Paths.LockRoots() accessor to RootsToLock() so it no longer echoes the package-level LockRoots() that actually acquires the flocks. The call site now reads LockRoots(paths.RootsToLock()...): the method is the noun (roots to lock), the func is the verb (lock them).
- Fix stale filename references in the Catalog doc comment (protocol.go -> catalog_protocol.go, sweep.go -> catalog_sweep.go).
gofmt clean; build/vet/tests green.
No behavior change:
- Remove the unexported retentionFloorFor (a one-line pass-through to effectiveRetentionFloor) and seqWithinRetention (test-only). NewRetentionGate now calls effectiveRetentionFloor directly.
- Drop the tautological test assertion that the free function and the gate agree: both were seq >= effectiveRetentionFloor(...), so gate.Admits already covers it.
- Fold the free-floating contract header onto the RetentionGate type and condense the docs (retention.go 95 -> 59 lines).
gofmt clean; vet 0; tests green.
The ArtifactSet doc cited "design-docs rule 2" / "rule 1's per-kind idempotency", but the design's only numbered rules are the reader contract's (a different concept). Describe the concepts directly instead. Comment-only; no behavior change.
Pure test reorganization, no logic change: the 894-line omnibus is split by concern, with function bodies moved verbatim:
- window_test.go: geometry / window arithmetic
- keys_test.go: key schema + key<->path bijection
- catalog_test.go: catalog reader (pins, scans, FrozenCoverage)
- catalog_protocol_test.go: mark/flip + CommitIndex
- catalog_sweep_test.go: the two sweeps
- crashsafety_test.go: power-loss-between-steps + never-unlink-under-frozen
- helpers_test.go: shared test helpers + vars
Same 33 tests (51 package-wide); gofmt clean, vet/test green.
Deferral map: what this foundations layer leaves to later layers
Walked the full PR (all 11 source files + the test suite) against the design docs and #815. It's a faithful, well-tested realization of the foundations layer; no correctness defects found. This note catalogs what's intentionally deferred, so reviewers know what is not proven here and where each piece lands.
Deferred to the remaining Phase 1 layers (the stack above this one)
Deferred to Phase 2 (#816 — hot tier + lifecycle)
Dropped for the MVP (per #815 "out of scope")
Present + tested here, but not yet wired (forward scaffolding)
These primitives are built and unit-tested in this PR but have no production caller yet; their callers are the layers above. Expected for the bottom of the stack; listing so it reads as "tested, awaiting caller," not an oversight:
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0b6e71d556
Comment-only pass over the primitives layer (process.go, txindex.go, hooks.go, ingest/driver.go), applying #817's condensing rubric: keep each canonical explanation once, shrink per-function docs to what's unique, and collapse multi-line inline re-explanations to short markers — without rewording. Notably: buildTxhashIndex's four-step protocol is explained once in its docstring, so the inline "Step N" comments shrink to markers carrying only the per-step "why" (skip-precedes-precondition, the done-channel backstop, create-or-truncate). No code, no directive comments, and no behavior change (verified by diffing the comment-stripped source against the base).
Foundations (#817) renamed Paths.LockRoots() -> RootsToLock() to disambiguate from the package-level LockRoots() that acquires the flocks. This adapts the daemon's lock-acquisition call after rebasing onto that foundations; the package func call LockRoots(...) is unchanged. Surfaced by the rebase as a build break (paths.LockRoots undefined); no behavior change. build/vet/test -short green.
Review follow-ups to the Phase 1 foundations, the first being a crash-safety
fix:
- Durability: MkdirAll creates the storage roots but fsyncs neither the new
dirs nor the direntries naming them, and the one-write protocol's
grandparent fsync (barrierNewFile) only reaches a root's CONTENTS, never the
root's own link in its parent. On a fresh deployment a crash just after the
first freeze could lose a whole storage subtree while the synced catalog
still advertised a "frozen" artifact under it. lockOne now records the
deepest pre-existing ancestor before MkdirAll and fsyncs the created chain
(deepestExistingDir + fsyncNewDirs in paths.go) — one extra dir fsync per
root at startup.
- Catalog doc: warn that PutEarliestLedger/PutChunksPerTxhashIndex must not be
used to pin the layout on first start (that is PinLayout's atomic
both-or-neither job); they exist only for isolated pin writes.
- windowTxhashKeysPresent: drop the redundant cid <= last condition — the
explicit `if cid == last { break }` is the real inclusive-bound /
chunk.ID-wraparound guard.
- Tests: cover retention_chunks=0 (full history pins at earliest_ledger, the
earliest-wins branch the existing tests miss) and the young-store /
oversized-retention clamp to genesis; and prove the flock is released when
the lock fd is closed without LOCK_UN — the kernel guarantee kill -9 relies
on.
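The inclusive-bound guard from the windowTxhashKeysPresent item can be shown in isolation. This is a minimal sketch with a hypothetical helper name; it uses a plain uint32 where the PR uses chunk.ID.

```go
package main

import (
	"fmt"
	"math"
)

// forEachChunkInclusive visits first..last, both ends included. The break
// runs before the increment, so the loop stays finite even when last is
// the maximum uint32. A loop condition of "cid <= last" would always be
// true in that case, because cid++ wraps around to 0.
func forEachChunkInclusive(first, last uint32, visit func(uint32)) {
	if first > last {
		return
	}
	for cid := first; ; cid++ {
		visit(cid)
		if cid == last {
			break
		}
	}
}

// countChunks counts how many ids the loop visits.
func countChunks(first, last uint32) int {
	n := 0
	forEachChunkInclusive(first, last, func(uint32) { n++ })
	return n
}

func main() {
	fmt.Println(countChunks(10, 12))
	fmt.Println(countChunks(math.MaxUint32-2, math.MaxUint32))
}
```

With the explicit break in place, an additional cid <= last condition adds nothing, which is why the commit drops it.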
gofmt clean; the new fsync helpers were build- and behavior-checked standalone.
Part of #815.
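The durability fix in this commit message (record the deepest pre-existing ancestor, create the chain, then fsync it) can be sketched as follows. The helper names deepestExistingDir and fsyncNewDirs come from the message; their signatures and bodies, and the mkdirAllDurable wrapper, are assumptions, and the directory fsync relies on Unix semantics.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// fsyncDir flushes a directory so the entries inside it are durable.
func fsyncDir(dir string) error {
	d, err := os.Open(dir)
	if err != nil {
		return err
	}
	syncErr := d.Sync()
	closeErr := d.Close()
	if syncErr != nil {
		return syncErr
	}
	return closeErr
}

// deepestExistingDir walks up from path to the first directory that exists.
func deepestExistingDir(path string) (string, error) {
	dir := filepath.Clean(path)
	for {
		info, err := os.Stat(dir)
		if err == nil {
			if !info.IsDir() {
				return "", fmt.Errorf("%s is not a directory", dir)
			}
			return dir, nil
		}
		if !errors.Is(err, os.ErrNotExist) {
			return "", err
		}
		parent := filepath.Dir(dir)
		if parent == dir {
			return "", fmt.Errorf("no existing ancestor for %s", path)
		}
		dir = parent
	}
}

// fsyncNewDirs fsyncs root and each ancestor up to and including existing:
// every created directory, plus the old one that gained a new entry.
func fsyncNewDirs(root, existing string) error {
	dir := filepath.Clean(root)
	for {
		if err := fsyncDir(dir); err != nil {
			return err
		}
		if dir == existing {
			return nil
		}
		parent := filepath.Dir(dir)
		if parent == dir {
			return fmt.Errorf("%s is not an ancestor of %s", existing, root)
		}
		dir = parent
	}
}

// mkdirAllDurable creates root and makes the whole new chain crash-safe.
func mkdirAllDurable(root string) error {
	existing, err := deepestExistingDir(root)
	if err != nil {
		return err
	}
	if err := os.MkdirAll(root, 0o755); err != nil {
		return err
	}
	return fsyncNewDirs(root, existing)
}

// demoDurableMkdir creates a three-level chain under a temp dir.
func demoDurableMkdir() error {
	tmp, err := os.MkdirTemp("", "durable-mkdir")
	if err != nil {
		return err
	}
	defer os.RemoveAll(tmp)
	return mkdirAllDurable(filepath.Join(tmp, "a", "b", "c"))
}

func main() {
	fmt.Println("mkdir error:", demoDurableMkdir())
}
```

When the root already exists, the loop in fsyncNewDirs stops after one iteration, which matches the "one extra dir fsync per root at startup" cost stated above.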
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3347348492
chunks_per_txhash_index was a settable [layout] TOML field, pinned to the metastore on first start and validated-or-abort on every restart. Changing it invalidates every existing tx-hash index boundary, so there is no good reason to ever deviate from the default. Make it the fixed geometry.ChunksPerTxhashIndex constant (1000, aliasing txhash.DefaultChunksPerIndex) and remove the settable + pinned machinery:
- config.go: drop the [layout] section, the ChunksPerTxhashIndex field, DefaultChunksPerTxhashIndex, and its WithDefaults default.
- catalog: remove the config:chunks_per_txhash_index pin key and the ChunksPerTxhashIndex() getter; PinLayout -> PinEarliestLedger (a single synced Put, since earliest_ledger is now the only pin).
- geometry: add the ChunksPerTxhashIndex constant; TxHashIndexLayout stays parameterized for test coverage but is built from the constant in prod.
Tests updated; adds a regression test that the removed key is now rejected.
Rebased onto #817's foundations after the round-2 review reworked them: the #824 geometry+catalog subpackage split, Windows support dropped, and the crash-test hooks removed. The primitives spine (processChunk, buildTxhashIndex, buildThenSweep + the cold backfill source order) stays in package streaming and now imports the new subpackages:
- *Catalog -> *catalog.Catalog. The two former *Catalog helpers (txhashBinInputs, windowDemotedTxhashRefs) become free functions in streaming over the catalog's exported API, since the type now lives in another package and methods can't be added across the boundary.
- Key/state/layout/index types and the fsync barrier (BarrierNewFile) resolve through geometry.*; the ArtifactSet -> ingest.Config translation stays in streaming (ingestConfigFor) so catalog keeps its one-way dependency on geometry alone (the #824 split invariant).
Crash-test hooks dropped to match #817. The in-method ordering observations (afterMarkFreezing / afterBarrier / afterIndexMark / afterCommitBeforeSweep) are gone; the §7.6 crash matrix is reconstructed hook-free through the public protocol and the buildTxhashIndex(commit) / buildThenSweep(commit+sweep) seam, asserting recovery convergence on the durable states a crash leaves behind. The pure mid-method ordering checks are deferred to the fault-injection harness (#823), mirroring catalog's TestCrashSafety_FileWrittenKeyNotFlipped.
go build, go vet, gofmt and go test (streaming + catalog + geometry + ingest, RocksDB 10.9.1 cgo toolchain) are all green. Part of #815.
Restacked on the split/no-hooks #819 and ported the hot tier across the new package boundary:
- hot key schema -> geometry (HotState/HotReady/HotTransient, exported HotChunkKey/ParseHotChunkKey/HotChunkPrefix); hot catalog methods -> catalog (HotState, PutHotTransient, FlipHotReady, DeleteHotKey, {Ready,}HotChunkKeys)
- processChunk hot-source branch + progress hot refinement (lastCommittedLedger(cat, probe), highestReadyChunkSigned, refineWithHotDB)
- new files: pkg/stores/hotchunk, streaming/{eligibility,hotsource,ingest,lifecycle}
- daemon wires the cold-only catch-up's HotProbe (NewRocksHotProbe)
- crash-hooks REMOVED to match #817/#818 (the split makes cat.hooks unreachable from streaming); the one beforeHotTransient hook test is dropped, the rest are the structural crash tests #817/#818 established
- propagated renames: window->tx-hash-index, RetentionGate->RetentionFloor, cat.Has->public HotState, cat.layout->Layout()
build + vet + go test -short green on ./cmd/stellar-rpc/internal/fullhistory/...
Rebased onto #817's foundations after the round-2 review reworked them: the #824 geometry+catalog subpackage split, Windows support dropped, and the crash-test hooks removed. The primitives spine (processChunk, buildTxhashIndex, buildThenSweep + the cold backfill source order) lives in a new streaming/backfill/ package (per #824, named for what it does) and imports the new subpackages:
- *Catalog -> *catalog.Catalog. The two former *Catalog helpers (txhashBinInputs, windowDemotedTxhashRefs) are free functions in backfill over the catalog's exported API, since the type now lives in another package and methods can't be added across the boundary.
- Key/state/layout/index types and the fsync barrier (BarrierNewFile) resolve through geometry.*; the ArtifactSet -> ingest.Config translation stays in backfill (ingestConfigFor) so catalog keeps its one-way dependency on geometry alone (the #824 split invariant).
backfill/ nests under streaming/ for now, mirroring catalog/ and geometry/; dropping the streaming codename is left to the wider #824 pass. It depends only on ingest, catalog, and geometry: one direction, no cycle.
The single-chunk cold primitive lands in ingest as RunColdChunk/ColdDirs, and buildColdIngesters now delegates to the explicit-per-type-dir builder buildColdIngestersIn (deriving its dirs from one coldDir), so the constructor table and rollback semantics keep a single definition site. RunCold's behavior is unchanged.
Crash-test hooks dropped to match #817. The in-method ordering observations (afterMarkFreezing / afterBarrier / afterIndexMark / afterCommitBeforeSweep) are gone; the §7.6 crash matrix is reconstructed hook-free through the public protocol and the buildTxhashIndex(commit) / buildThenSweep(commit+sweep) seam, asserting recovery convergence on the durable states a crash leaves behind. The pure mid-method ordering checks are deferred to the fault-injection harness (#823), mirroring catalog's TestCrashSafety_FileWrittenKeyNotFlipped.
go build, go vet, gofmt and go test (streaming + backfill + catalog + geometry + ingest, RocksDB 10.9.1 cgo toolchain) are all green. Part of #815.
Rebased onto #817's foundations after the round-2 review reworked them: the #824 geometry+catalog subpackage split, Windows support dropped, and the crash-test hooks removed. The primitives spine (processChunk, buildTxhashIndex, buildThenSweep + the cold backfill source order) lives in a new fullhistory/backfill/ package (per #824, named for what it does, with the "streaming" codename dropped) and imports #817's subpackages:
- *Catalog -> *catalog.Catalog. The two former *Catalog helpers (txhashBinInputs, windowDemotedTxhashRefs) are free functions in backfill over the catalog's exported API, since the type now lives in another package and methods can't be added across the boundary.
- Key/state/layout/index types and the fsync barrier (BarrierNewFile) resolve through geometry.*; the ArtifactSet -> ingest.Config translation stays in backfill (ingestConfigFor) so catalog keeps its one-way dependency on geometry alone (the #824 split invariant).
backfill/ sits directly under fullhistory/ (per #824). It still imports the catalog and geometry subpackages that currently live under streaming/ (relocating those out of streaming/ is a later #824 step) and depends only on ingest, catalog, and geometry: one direction, no cycle.
The single-chunk cold primitive lands in ingest as RunColdChunk/ColdDirs, and buildColdIngesters now delegates to the explicit-per-type-dir builder buildColdIngestersIn (deriving its dirs from one coldDir), so the constructor table and rollback semantics keep a single definition site. RunCold's behavior is unchanged.
Crash-test hooks dropped to match #817. The in-method ordering observations (afterMarkFreezing / afterBarrier / afterIndexMark / afterCommitBeforeSweep) are gone; the §7.6 crash matrix is reconstructed hook-free through the public protocol and the buildTxhashIndex(commit) / buildThenSweep(commit+sweep) seam, asserting recovery convergence on the durable states a crash leaves behind. The pure mid-method ordering checks are deferred to the fault-injection harness (#823), mirroring catalog's TestCrashSafety_FileWrittenKeyNotFlipped.
go build, go vet, gofmt and go test (backfill + streaming + catalog + geometry + ingest, RocksDB 10.9.1 cgo toolchain) are all green. Part of #815.
Move the streaming package up one level so its files live directly under
cmd/stellar-rpc/internal/fullhistory/, removing the streaming/ directory:
- top-level files become package fullhistory (was package streaming)
- catalog/ and geometry/ subpackages move up to fullhistory/{catalog,geometry}
- import paths rewritten: .../fullhistory/streaming/* -> .../fullhistory/*
Pure move/rename; no behavior change. go build + go vet pass.
Rebased onto #817's foundations after the round-2 review reworked them: the #824 geometry+catalog subpackage split, the streaming/ subdir flattened into fullhistory/, Windows support dropped, and the crash-test hooks removed. The primitives spine (processChunk, buildTxhashIndex, buildThenSweep + the cold backfill source order) lives in a new fullhistory/backfill/ package (per #824, named for what it does) and imports #817's packages:
- *Catalog -> *catalog.Catalog. The two former *Catalog helpers (txhashBinInputs, windowDemotedTxhashRefs) are free functions in backfill over the catalog's exported API, since the type now lives in another package and methods can't be added across the boundary.
- Key/state/layout/index types and the fsync barrier (BarrierNewFile) resolve through geometry.*; the ArtifactSet -> ingest.Config translation stays in backfill (ingestConfigFor) so catalog keeps its one-way dependency on geometry alone (the #824 split invariant).
backfill/ sits directly under fullhistory/, alongside the catalog and geometry packages #817 flattened out of streaming/ (per #824). It depends only on ingest, catalog, and geometry: one direction, no cycle.
The single-chunk cold primitive lands in ingest as RunColdChunk/ColdDirs, and buildColdIngesters now delegates to the explicit-per-type-dir builder buildColdIngestersIn (deriving its dirs from one coldDir), so the constructor table and rollback semantics keep a single definition site. RunCold's behavior is unchanged.
Crash-test hooks dropped to match #817. The in-method ordering observations (afterMarkFreezing / afterBarrier / afterIndexMark / afterCommitBeforeSweep) are gone; the §7.6 crash matrix is reconstructed hook-free through the public protocol and the buildTxhashIndex(commit) / buildThenSweep(commit+sweep) seam, asserting recovery convergence on the durable states a crash leaves behind. The pure mid-method ordering checks are deferred to the fault-injection harness (#823), mirroring catalog's TestCrashSafety_FileWrittenKeyNotFlipped.
go build, go vet, gofmt and go test (backfill + ingest, RocksDB 10.9.1 cgo toolchain) are all green. Part of #815.
Rebased onto #817's foundations after the round-2 review reworked them: the #824 geometry+catalog subpackage split, the streaming/ subdir flattened into fullhistory/, Windows support dropped, and the crash-test hooks removed. The primitives spine (processChunk, buildTxhashIndex, buildThenSweep + the cold backfill source order) lives in a new fullhistory/backfill/ package (per #824, named for what it does) and imports #817's packages: - *Catalog -> *catalog.Catalog. The two former *Catalog helpers (txhashBinInputs, windowDemotedTxhashRefs) are free functions in backfill over the catalog's exported API, since the type now lives in another package and methods can't be added across the boundary. - Key/state/layout/index types and the fsync barrier (BarrierNewFile) resolve through geometry.*; the ArtifactSet -> ingest.Config translation stays in backfill (ingestConfigFor) so catalog keeps its one-way dependency on geometry alone (the #824 split invariant). backfill/ sits directly under fullhistory/, alongside the catalog and geometry packages #817 flattened out of streaming/ (per #824). It depends only on ingest, catalog, and geometry — one direction, no cycle. The single-chunk cold primitive lands in ingest as RunColdChunk/ColdDirs, and buildColdIngesters now delegates to the explicit-per-type-dir builder buildColdIngestersIn (deriving its dirs from one coldDir), so the constructor table and rollback semantics keep a single definition site. RunCold's behavior is unchanged. Crash-test hooks dropped to match #817. The in-method ordering observations (afterMarkFreezing / afterBarrier / afterIndexMark / afterCommitBeforeSweep) are gone; the §7.6 crash matrix is reconstructed hook-free through the public protocol and the buildTxhashIndex(commit) / buildThenSweep(commit+sweep) seam, asserting recovery convergence on the durable states a crash leaves behind. 
The pure mid-method ordering checks are deferred to the fault-injection harness (#823), mirroring catalog's TestCrashSafety_FileWrittenKeyNotFlipped. go build, go vet, gofmt and go test (backfill + ingest, RocksDB 10.9.1 cgo toolchain) are all green. Part of #815.
Rebased onto #817's foundations after the round-2 review reworked them: the #824 geometry+catalog subpackage split, the streaming/ subdir flattened into fullhistory/, Windows support dropped, and the crash-test hooks removed. The primitives spine (processChunk, buildTxhashIndex, buildThenSweep + the cold backfill source order) lives in a new fullhistory/backfill/ package (per #824, named for what it does) and imports #817's packages: - *Catalog -> *catalog.Catalog. The two former *Catalog helpers (txhashBinInputs, windowDemotedTxhashRefs) are free functions in backfill over the catalog's exported API, since the type now lives in another package and methods can't be added across the boundary. - Key/state/layout/index types and the fsync barrier (BarrierNewFile) resolve through geometry.*; the ArtifactSet -> ingest.Config translation stays in backfill (ingestConfigFor) so catalog keeps its one-way dependency on geometry alone (the #824 split invariant). backfill/ sits directly under fullhistory/, alongside the catalog and geometry packages #817 flattened out of streaming/ (per #824). It depends only on ingest, catalog, and geometry — one direction, no cycle. The single-chunk cold primitive lands in ingest as RunColdChunk/ColdDirs, and buildColdIngesters now delegates to the explicit-per-type-dir builder buildColdIngestersIn (deriving its dirs from one coldDir), so the constructor table and rollback semantics keep a single definition site. RunCold's behavior is unchanged. Crash-test hooks dropped to match #817. The in-method ordering observations (afterMarkFreezing / afterBarrier / afterIndexMark / afterCommitBeforeSweep) are gone; the §7.6 crash matrix is reconstructed hook-free through the public protocol and the buildTxhashIndex(commit) / buildThenSweep(commit+sweep) seam, asserting recovery convergence on the durable states a crash leaves behind. 
The pure mid-method ordering checks are deferred to the fault-injection harness (#823), mirroring catalog's TestCrashSafety_FileWrittenKeyNotFlipped. go build, go vet, gofmt and go test (backfill + ingest, RocksDB 10.9.1 cgo toolchain) are all green. Part of #815.
Rebased onto #817's foundations after the round-2 review reworked them: the #824 geometry+catalog subpackage split, the streaming/ subdir flattened into fullhistory/, Windows support dropped, and the crash-test hooks removed.

The primitives spine (processChunk, buildTxhashIndex, buildThenSweep + the cold backfill source order) lives in a new fullhistory/backfill/ package (per #824, named for what it does) and imports #817's packages:
- *Catalog -> *catalog.Catalog. The two former *Catalog helpers (txhashBinInputs, windowDemotedTxhashRefs) are free functions in backfill over the catalog's exported API, since the type now lives in another package and methods can't be added across the boundary.
- Key/state/layout/index types and the fsync barrier (BarrierNewFile) resolve through geometry.*; the ArtifactSet -> ingest.Config translation stays in backfill (ingestConfigFor) so catalog keeps its one-way dependency on geometry alone (the #824 split invariant).

backfill/ sits directly under fullhistory/, alongside the catalog and geometry packages #817 flattened out of streaming/ (per #824). It depends only on ingest, catalog, and geometry: one direction, no cycle.

The single-chunk cold primitive lands in ingest as RunColdChunk/ColdDirs, and buildColdIngesters now delegates to the explicit-per-type-dir builder buildColdIngestersIn (deriving its dirs from one coldDir), so the constructor table and rollback semantics keep a single definition site. RunCold's behavior is unchanged.

Crash-test hooks dropped to match #817. The in-method ordering observations (afterMarkFreezing / afterBarrier / afterIndexMark / afterCommitBeforeSweep) are gone; the §7.6 crash matrix is reconstructed hook-free through the public protocol and the buildTxhashIndex(commit) / buildThenSweep(commit+sweep) seam, asserting recovery convergence on the durable states a crash leaves behind. The pure mid-method ordering checks are deferred to the fault-injection harness (#823), mirroring catalog's TestCrashSafety_FileWrittenKeyNotFlipped.

go build, go vet, gofmt and go test (backfill + ingest, RocksDB 10.9.1 cgo toolchain) are all green. Part of #815.
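The hook-free crash matrix described above checks convergence from the durable state a crash leaves behind, rather than observing ordering inside a method. A toy sketch of that idea follows. The disk type, recoverDisk, and converged are hypothetical names, and the recovery policy (discard freezing and pruning keys, never touch frozen ones) is an assumption for illustration, not the PR's recovery code.

```go
package main

import "fmt"

// State mirrors the lifecycle names used by the catalog; the empty State
// stands for "key absent".
type State string

const (
	Freezing State = "freezing"
	Frozen   State = "frozen"
	Pruning  State = "pruning"
)

// disk is a toy stand-in for what a crash leaves behind: catalog keys plus
// the set of files present. Hypothetical, not one of the PR's types.
type disk struct {
	keys  map[string]State
	files map[string]bool
}

// recoverDisk applies an assumed recovery policy for this sketch only:
//   - freezing: the write never committed, so drop any file and the key;
//   - pruning: finish the unlink, then delete the key;
//   - frozen: left alone (never unlink under a frozen key).
func recoverDisk(d *disk) {
	for k, s := range d.keys {
		if s == Freezing || s == Pruning {
			delete(d.files, k)
			delete(d.keys, k)
		}
	}
}

// converged reports whether every remaining key is frozen with its file
// present, and no file exists without a frozen key.
func converged(d *disk) bool {
	for k, s := range d.keys {
		if s != Frozen || !d.files[k] {
			return false
		}
	}
	for f := range d.files {
		if d.keys[f] != Frozen {
			return false
		}
	}
	return true
}

func main() {
	d := &disk{
		keys: map[string]State{
			"chunk:1": Freezing, // file written but key not flipped
			"chunk:2": Pruning,  // unlink durable but key not deleted
			"chunk:3": Frozen,   // committed artifact
		},
		files: map[string]bool{"chunk:1": true, "chunk:3": true},
	}
	recoverDisk(d)
	fmt.Println("converged:", converged(d), "remaining keys:", len(d.keys))
}
```

Because the assertion is made on the post-recovery state, the same check works for any crash point that produces one of these durable states, which is why no in-method hook is needed.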
Rebase the daemon layer onto the split #818 and follow the issue #824 package layout instead of the old single streaming package:
- backfill/: resolve.go + execute.go join process.go/txindex.go (the planner + bounded-worker executor). RunBackfill and the RunChunk/RunIndex test seams are exported so the daemon package's catch-up tests can inject fakes across the package boundary.
- observability/: the daemon's control-plane Metrics sink (NopMetrics and MetricsOrNop exported for backfill, which consumes the interface).
- fullhistory/ (top level, package fullhistory): daemon, startup (catch-up + serve), config_validate, progress, doc.

Adopt #817's chunks_per_txhash_index-is-a-constant change in the daemon layer: drop the cpi config field + its form/restart validation, replace PinLayout(cpi, earliest) with PinEarliestLedger(earliest), and build the TxHashIndexLayout from geometry.ChunksPerTxhashIndex. The daemon E2E test now exercises a non-terminal (rolling) index coverage, since one complete chunk no longer finalizes a 1000-chunk window; the terminal demote+sweep stays unit-tested in backfill/txindex_test.

Propagate #818's window->index naming into the daemon: windowsOverlapping -> indexesOverlapping, the TxHashIndexLayout vars -> txLayout.

closes #815
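To make the "one complete chunk no longer finalizes a 1000-chunk window" point concrete, here is a minimal sketch of the window arithmetic. It assumes zero-based chunk IDs, windows aligned to multiples of 1000, and a coverage expressed as an inclusive chunk range. The helper names are hypothetical and are not the signatures in geometry.

```go
package main

import "fmt"

// chunksPerTxhashIndex is the fixed window width. 1000 matches the
// "1000-chunk window" mentioned above; the real constant lives in geometry.
const chunksPerTxhashIndex = 1000

// indexOf returns the ID of the index window containing a chunk
// (hypothetical helper, assumes zero-based chunk IDs).
func indexOf(chunk uint32) uint32 { return chunk / chunksPerTxhashIndex }

// windowBounds returns the first and last chunk of an index window.
func windowBounds(index uint32) (first, last uint32) {
	first = index * chunksPerTxhashIndex
	return first, first + chunksPerTxhashIndex - 1
}

// isTerminalCoverage reports whether the inclusive chunk range [lo, hi]
// spans exactly one whole window. Anything shorter is a rolling,
// non-terminal coverage in this sketch.
func isTerminalCoverage(lo, hi uint32) bool {
	first, last := windowBounds(indexOf(lo))
	return lo == first && hi == last
}

func main() {
	fmt.Println(isTerminalCoverage(0, 0))   // one complete chunk: rolling
	fmt.Println(isTerminalCoverage(0, 999)) // the whole window: terminal
	fmt.Println(indexOf(1999), indexOf(2000))
}
```

With the width fixed as a constant, a one-chunk E2E run can only reach a rolling coverage, so the terminal path has to be exercised by a unit test that builds a full window.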
backendTip/newBackendTip had no production caller (buildProductionBoundaries wires notConfiguredTip + a nil Backend); only a test exercised it. Per the #817 review pattern (remove speculative API ahead of its caller — Admits, Floor, PutEarliestLedger were all cut), drop it until #772 wires the real backend/read path. The fakeBackend test helper loses its now-unused tipErr.
Epic #808 → Phase 1 #815 → this PR (layer 1 of 3).
design-docs/full-history-streaming-workflow.md + gettransaction-full-history-design.md (section links inline below).
What this layer implements
New cmd/stellar-rpc/internal/fullhistory/streaming package, cold-only: the catalog and the immutable cold artifacts (.pack / events segment / .bin / .idx). The hot tier and live loop are Phase 2 (#816).
- window.go: geometry over pkg/chunk; Windows bound to one immutable chunks_per_txhash_index, plus MaxChunksPerTxhashIndex and IsTerminalCoverage.
- keys.go, paths.go: the chunk: / index: / config: key families, the freezing|frozen|pruning states, and Layout as the single source of the strict key→file mapping that lets cleanup walk keys instead of listing directories.
- catalog.go: the catalog over metastore.Store (progress is derived, never stored); typed reads, key-driven scans, FrozenCoverage (asserts the single-canonical-state invariant), and PinLayout (the atomic first-start commit primitive; its caller validateConfig is a later layer).
- catalog_protocol.go + fsync barriers in paths.go: CommitIndex is the full index-finalization batch (promote coverage → demote predecessor → on a terminal build demote in-window txhash keys).
- catalog_sweep.go
- config.go: the --config TOML schema, strict parsing (unknown keys rejected), defaults, ResolvePaths, and LockRoots. validateConfig itself is deferred to the orchestration layer.
- config_lock.go: flock per storage root; a second daemon on any shared root fails fast, and the kernel drops the lock on exit.
- retention.go: RetentionGate makes a below-floor read not-found regardless of on-disk state, which is what lets prune/sweep unlink without coordinating with the index lifecycle. Floor/gate only; retention reconfiguration dropped for MVP.
- artifacts.go, hooks.go: ArtifactSet (the Kind set processChunk will consume) and the test-only crashHooks fault-injection seam (nil in production).
How this maps to #815 / #808
Phase 1 (#815) groups its scope as foundations → primitives → orchestration → entrypoint. This PR delivers the foundations in full, minus the hot-tier keys:
- window.go
- keys.go, paths.go
- freezing/frozen/pruning → keys.go
- catalog_protocol.go, catalog_sweep.go
- config.go
- flock → config_lock.go
- RetentionGate (floor/gate; reconfig dropped) → retention.go
- config:* pins + the two-pin atomic commit primitive → PinLayout (caller deferred)
- hot:chunk:* keys + transient/ready states (the hot-tier keys, not in this PR)
Deferred to the remaining Phase 1 layers (#815):
- processChunk, backfillSource, buildTxhashIndex
- validateConfig, resolve / executePlan / runBackfill, startStreaming + the catch-up loop
- ColdService / NewPrometheusSink wiring
Out of scope of #815 entirely:
Tests
The package compiles and passes -short on its own. Coverage in streaming_test.go (+ config_test.go, config_lock_test.go, retention_test.go):
- FrozenCoverage uniqueness (incl. detecting two-frozen as a bug), CommitIndex promote/demote, terminal txhash-key demotion, batch atomicity, and re-commit idempotency.
- crashHooks: file-written-but-key-not-flipped, unlink-durable-but-key-not-deleted (chunk and index), and never-unlink-under-a-frozen-key.
Verification:
- go build + go vet + go test -short green on ./cmd/stellar-rpc/internal/fullhistory/... (cgo RocksDB toolchain).
- cmd/stellar-rpc binary link needs the Rust libpreflight/libxdr2json (CI make build-libs); go vet ./cmd/stellar-rpc/ type-checks the entrypoint locally.
- golangci-lint runs in CI.
Stack
- streaming-phase1-foundations → feature/full-history.
- pkg/{chunk,rocksdb,stores/metastore}, pkg/stores/{ledger,eventstore}
- streamhash