bench(fullhistory): make apply-load synthetic LCM usable by the read benches at target TPS #760
chowbao wants to merge 29 commits into
Markdown report covering the cross-machine bench run captured under
gs://rpc-full-history/benchmarks/{c6id.2xlarge,c6id.4xlarge,c6id.8xlarge,im4gn.4xlarge}-2026-05-21*.
Tables + Mermaid xychart-beta blocks for: peak read throughput,
worker scaling (cold and hot n=1), tx-page page-size sweep,
xdr-views vs round-trip on tx-hash + events-ingest, per-ledger ingest,
bulk ingest, cold-vs-hot speedup, and x86 vs Graviton2 at matched vCPU.
Source per-iter CSVs and the summary CSVs that back every table here
live at gs://rpc-full-history/benchmarks/_summary/.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… report New section 11 transposes the cross-machine tables: one consolidated table per machine (c6id.2xlarge, c6id.4xlarge, c6id.8xlarge, im4gn.4xlarge) listing every bench result — full ledger grid sweep, tx-page, tx-hash (hit/miss × xdrviews/roundtrip), per-ledger ingest, and bulk ingest — with p50/p90/p99 and throughput. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New Section 2 ("Internal vs production RPC providers") includes the
prior black-box benchmark across 4–6 production RPC providers and
juxtaposes their p50s with the internal hot/cold tiers. Adds a
Mermaid bar/line chart of the per-workload speedups. Remaining
sections renumbered 3–12.
Headline: hot/cold full-history is 10×–1773× faster than the average
production RPC across ledger-point, ledger-range, tx-page, tx-hash,
and the four event-filter scenarios. Note: 'onfinality' and 'sorobanrpc'
are absent from tx-hash and events workloads (n=4 instead of 6).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Colons in x-axis labels (ev:nofilt, ev:contract, ev:topic, ev:both) break Mermaid's xychart-beta parser. Replaced with hyphens. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New-data-only report over the 2026-06-03 runs (4 machines) on the rewritten rpc-hack bench harness. Notes methodology changes vs 2026-05-21: ops/s is no longer comparable across runs (only single-in-flight p50 latency is), the sweep axis is now query-concurrency 1-16, and ledger/tx-page/tx-hash read coverage narrowed while events query + ingest stage detail broadened. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Condensed two-table view (typical p50 latency + peak throughput) with a full glossary defining every row, column, tier, and variable (n, page, c, p50/p99, ops/s). Links back to the full cross-machine report. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds Table 3 (ingest throughput: hot-ingest ledgers/s, build-txhash-index keys/s) and Table 4 (per-stage ingest cost), plus glossary entries for the ingest workloads and ledgers/s, keys/s, and stage terms. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
cold-ingest ledgers/s computed as sum(chunk_wall) / chunk-workers (upper-bound estimate, since the harness records summed per-chunk wall, not true end-to-end wall). Flagged as an estimate; scales with --chunk-workers. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
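For readers following the arithmetic, here is a minimal Go sketch of the estimate that commit message describes. The function name, the float-seconds input, and the sample numbers are illustrative assumptions, not the report generator's actual code. It assumes the chunk workers overlap perfectly, which is why the commit flags the result as an estimate.

```go
package main

import "fmt"

// estimateColdLedgersPerSec approximates cold-ingest throughput when only the
// summed per-chunk wall time is recorded. It assumes the workers overlap
// perfectly, so the wall-clock estimate is sum(chunk_wall) / chunk-workers.
// Hypothetical helper: names and signature are not from the harness.
func estimateColdLedgersPerSec(chunkWallSecs []float64, ledgersPerChunk, chunkWorkers int) float64 {
	if chunkWorkers <= 0 || len(chunkWallSecs) == 0 {
		return 0
	}
	sum := 0.0
	for _, w := range chunkWallSecs {
		sum += w
	}
	estWall := sum / float64(chunkWorkers)
	if estWall <= 0 {
		return 0
	}
	totalLedgers := float64(ledgersPerChunk * len(chunkWallSecs))
	return totalLedgers / estWall
}

func main() {
	// Two 10k-ledger chunks, 100 s of wall each, ingested by 2 workers.
	fmt.Printf("%.0f ledgers/s\n", estimateColdLedgersPerSec([]float64{100, 100}, 10000, 2))
}
```

Because the divisor is the worker count, a wrong worker count skews the figure proportionally.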
…1 report Source/summary CSV paths were missing the dated prefix (data lives under .../benchmarks/2026-05-21/, the undated paths don't exist). Also dates the title and forward-links the 2026-06-03 run, noting the harness changed and ops/s is not comparable across runs. Historical 5/21 numbers are unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Drives the full read + ingest bench suite in bench-fullhistory: builds the binary once, then runs cold+hot ledgers/txpage/txhash/events read benches (each a 1,4,8,16 query-concurrency sweep) plus the hot-ingest, cold-ingest, and build-txhash-index ingest benches.
By default the reads use prebuilt fixtures and ingest writes to scratch (independent measurements). INGEST_FIRST=1 instead ingests first and repoints every read bench at the freshly-ingested stores, so the suite is self-contained from a single raw-ledger packfile seed and usable on a fresh machine with no prebuilt data.
Paths/sizing knobs are env-overridable for running across different machines.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PR #750 review (tamirms) flagged two harness gaps and several execution issues.
Code fixes:
- txpage (hot+cold) previously only touched TransactionHash + ResultPair. It never fetched the page contents, so it measured a tx *count*, not a getTransactions response. New walkPageMaterialize (tx_page_helpers.go) builds a full db.Transaction per tx in the page (envelope, result, meta, events, hash, application order, ledger info).
- txpage (hot+cold) had no --xdr-views flag, so it only measured the slow full-decode path. Added --xdr-views with a single-pass view materializer, mirroring the txhash bench. CSVs suffix -roundtrip / -xdrviews; detail column scan_ns -> materialize_ns (decode_ns stays 0 under views).
Execution (run-all-benches.sh):
- Run the decode-heavy query benches (txpage/txhash/events) once per mode (QUERY_VIEW_MODES = roundtrip + xdrviews) so the report can compare with/without XDR views. Previously every query ran views-off (slow path).
- Events use the worst-case query (EVENTS_BUCKETS=15, max filters/request).
- Ingest runs with --parallel; hot-ingest runs both xdr-views on and off (the views run feeds the reads, the parsed run is kept for its CSVs).
Smoke-tested: 0 errors, pages fully materialized; views 4-8x faster than round-trip (decode_ns=0 confirms the path dispatch).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
)
Re-ran c6id.8xlarge with the corrected harness and rewrote the report to address the PR #750 review:
- New "c6id.8xlarge — corrected" section: query latency split into hot/cold tables with roundtrip vs xdr-views columns and P50+P99; events use worst-case K=15; ingest shown hot (parsed vs view, --parallel) and cold with the per-stage phase breakdown + per-ledger driver total.
- The other three machines (2xlarge/4xlarge/im4gn) are marked STALE (old harness: tx-page-as-count, views-off) pending a re-run.
- Dropped the per-machine raw-cell dump (§12); the CSVs are on GCS.
- Summary table: same treatment (banner, corrected c6id.8xlarge rows, stale markers on the rest).
Headline corrected numbers: xdr-views cuts tx-page/tx-hash p50 4-9x (hot tx-hash 10.6->1.2ms) and lifts peak throughput 5-10x (hot tx-hash 706->7253 ops/s); events is decode-insensitive (1.1-1.4x). Hot ingest with views is ~2.1x faster than parsed (skips the 8.4ms/ledger UnmarshalBinary).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ply-load
Adds an `lcm` ledger source and an apply-load-gen.sh driver so the bench-fullhistory suite can run on fully synthetic, density-controlled data instead of real pubnet chunks.
- sources.go: new --source=lcm reader over apply-load's framed-XDR METADATA_OUTPUT_STREAM. Skips setup ledgers (<= --lcm-checkpoint) and decode-free frame-skips to each chunk's 10k-ledger block; reuses the entire cold-ingest/hot-ingest/build-txhash-index pipeline. Wired --lcm-file/--lcm-checkpoint flags into both ingest commands.
- apply-load-gen.sh: drives stellar-core new-db/new-hist/apply-load -> meta.xdr -> cold-ingest --source=lcm -> packfiles -> build-txhash-index. Profiles map to apply-load model txs + target TPS: sac (~10k), token/oz (~9k custom_token), soroswap (~2.5k). Uses the installed core's protocol.
- lcm_source_test.go: unit-tests setup-skip, chunk-block mapping, short-read.
- README: documents the lcm source, the driver, profiles, BUILD_TESTS requirement, and the real cost of full 10k-ledger chunks.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- absolutize OUT_ROOT so the config/meta paths survive the cd into the per-profile work dir (core was erroring "No config file ... found")
- default NETWORK_PASSPHRASE to pubnet to match the bench binary's hardcoded pubnetPassphrase: the ingest reader recomputes each tx hash under this passphrase and matches it against the result entries, so a mismatch broke the roundtrip txpage/txhash read paths with "unknown tx hash in LedgerCloseMeta".
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…benches
apply-load streams a LedgerCloseMeta whose tx-set and TxProcessing are the same transactions in different order, but whose stored result hash does not equal any envelope's real hash under the network passphrase (confirmed against core 26.1.1: 0/N result hashes matched an envelope under pubnet/testnet/standalone, while every envelope's source account was fee-charged in exactly one TxProcessing entry, a clean bijection). The go-stellar-sdk ingest LedgerTransactionReader pairs envelope↔result BY HASH, so it rejected the meta with "unknown tx hash in LedgerCloseMeta", breaking the roundtrip tx-page and tx-hash read benches. (The xdr-views path, which pairs positionally, was unaffected.)
- lcm_fixup.go: for each result, find the fee-charged account, map it back to the unique envelope with that source, and stamp the envelope's real tx hash. This is a correct pairing, not merely self-consistent. cold-ingest --source=lcm applies it by default (--lcm-fix-tx-hashes); logs fixed/skipped per chunk.
- sources.go: lcmStream applies the fixup and tolerates a short final chunk (--lcm-allow-partial) so runs sized below a full 10k-ledger chunk work.
- cold-ledgers / cold-txhash: clamp sampling + start cursors to each chunk's actual ledger range (FirstSeq/LastSeq) so partial chunks don't short-read.
- apply-load-gen.sh: NUM_LEDGERS knob for quick runs. TPS is set by per-ledger density, not ledger count, so a few hundred ledgers hit the profile target.
- README: document the fixup, partial chunks, NUM_LEDGERS, and that cold-events is unsupported on apply-load data (single-contract; corpus needs >=3).
Validated end-to-end: cold-ledgers / cold-txpage / cold-txhash all run with 0 errors on both a 308-ledger SAC store and an 892-ledger / 7.65M-tx token store (fixup paired 7650042/7650042, skipped 0).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
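The fixup described in that commit can be illustrated with a small Go sketch. The `envelope` and `result` types below are hypothetical stand-ins with string fields, not the SDK's XDR types, and `stampHashes` is an assumed name rather than the function in lcm_fixup.go. The sketch pairs each result with the envelope whose source account was fee-charged and skips any source that is not unique.

```go
package main

import "fmt"

// envelope and result are simplified, hypothetical stand-ins for the
// tx-set envelopes and TxProcessing results in a LedgerCloseMeta.
type envelope struct {
	Source string // source account of the transaction
	Hash   string // real tx hash under the network passphrase
}

type result struct {
	FeeCharged string // account charged the fee in this entry
	TxHash     string // stored hash (does not match any envelope before fixup)
}

// stampHashes writes each envelope's real hash onto the result whose
// fee-charged account equals the envelope's source. Sources that appear on
// more than one envelope are ambiguous and are skipped.
func stampHashes(envs []envelope, results []result) (fixed, skipped int) {
	bySource := make(map[string]int, len(envs))
	dup := make(map[string]bool)
	for i, e := range envs {
		if _, seen := bySource[e.Source]; seen {
			dup[e.Source] = true
		}
		bySource[e.Source] = i
	}
	for i := range results {
		src := results[i].FeeCharged
		idx, ok := bySource[src]
		if !ok || dup[src] {
			skipped++
			continue
		}
		results[i].TxHash = envs[idx].Hash
		fixed++
	}
	return fixed, skipped
}

func main() {
	envs := []envelope{{Source: "GA", Hash: "hash-a"}, {Source: "GB", Hash: "hash-b"}}
	results := []result{{FeeCharged: "GB", TxHash: "bogus"}, {FeeCharged: "GA", TxHash: "bogus"}}
	fixed, skipped := stampHashes(envs, results)
	fmt.Println(fixed, skipped, results[0].TxHash, results[1].TxHash)
}
```

Counting fixed and skipped separately mirrors the per-chunk logging the commit mentions, so a non-zero skipped count is visible rather than silent.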
The cold-events corpus builder hard-required ≥3 distinct contracts emitting 4-topic events (termsPerCategory anchors), which excluded apply-load's single-contract synthetic workloads. But the real requirement is enough unique FILTERABLE TERMS to fill the K-bucket sweep (a contract anchor plus topic values), not a minimum contract count. A single contract with topic diversity (e.g. a SAC's `transfer` events varying from/to over thousands of accounts) provides them.
- scanForTopTerms: accept ≥1 contract (anchors = min(3, nContracts)); fill the rest of the 15-term budget from topic values. Only fail when NO contract emits 4-topic events.
- newCorpus: validate total terms ≥ max(buckets), the actual sweep requirement, with a message that points at topic diversity / --buckets, not contracts.
Validated: cold-events now runs the full K=2..15 sweep on a synthetic SAC store (1 contract + 14 topic terms = 15) and a soroswap store (2 contracts + 13). token/custom_token still yields nothing because its events are not 4-topic (a workload property). Existing pubnet-shaped corpus behaviour is unchanged (still picks 3 contract anchors when ≥3 are present).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
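A rough Go sketch of the term-budget rule in that commit (up to 3 contract anchors, the rest of a 15-term budget filled from topic values, failing only when there is no contract). The names `pickTerms`, `makeTopics`, `maxAnchors`, and `termBudget` are assumptions for illustration; the real scanForTopTerms presumably also ranks terms by frequency, which this sketch omits.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	maxAnchors = 3  // at most this many contract anchors
	termBudget = 15 // total filterable terms wanted for the K-bucket sweep
)

// pickTerms takes contract IDs and topic values (both assumed to be already
// ordered by preference) and returns up to termBudget filterable terms.
func pickTerms(contracts, topicValues []string) ([]string, error) {
	if len(contracts) == 0 {
		return nil, errors.New("no contract emits events with the required topic count")
	}
	anchors := len(contracts)
	if anchors > maxAnchors {
		anchors = maxAnchors
	}
	terms := append([]string{}, contracts[:anchors]...)
	for _, t := range topicValues {
		if len(terms) == termBudget {
			break
		}
		terms = append(terms, t)
	}
	return terms, nil
}

// makeTopics generates n placeholder topic values for the demo.
func makeTopics(n int) []string {
	out := make([]string, n)
	for i := range out {
		out[i] = fmt.Sprintf("t%d", i)
	}
	return out
}

func main() {
	terms, err := pickTerms([]string{"C1"}, makeTopics(20))
	fmt.Println(len(terms), terms[0], terms[1], err)
}
```

With one contract and plenty of topic values the result is 1 anchor plus 14 topic terms, matching the SAC case in the commit message.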
With APPLY_LOAD_BATCH_SAC_COUNT=100 the sac profile folded 100 transfers into a single InvokeHostFunction tx, and core's benchmark mode closed/streamed just one such tx per ledger, so the usable pack carried ~100 transfers/ledger (~100 TPS), 100x below the 10k target (verified by decoding the pack: 1 tx, 1 op, ~97 events per ledger). Setting BATCH_SAC=1 makes every transfer its own tx, so the closed ledger carries the full count.
Verified by decoding the regenerated packs (tail/benchmark ledgers):
  sac      : 10000 tx / 10000 ops / 10000 events per ledger -> 10000 TPS
  soroswap :  2500 tx /  2500 ops / 12500 events per ledger ->  2500 TPS
  token    :  9000 tx /  9000 ops /  9000 events per ledger ->  9000 TPS (unchanged)
All four read benches (cold-ledgers/txpage/txhash/events) run with 0 errors and miss-rate=0 on the sac and soroswap stores.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b75236a8a4
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
    applyIdx := start + k
    envelopeRaw, envType, eerr := envAt(applyIdx)
Pair page envelopes by hash instead of index
For --source=lcm apply-load data, lcm_fixup.go documents that the generalized tx set and TxProcessing contain the same transactions in different orders. In the new --xdr-views txpage path, parts[k] comes from TxProcessing[start+k] but the envelope is fetched from the tx set at the same index, so V1/V2 synthetic pages can return a db.Transaction whose envelope does not match its hash/result/meta. This affects the apply-load read benches the commit adds; the roundtrip path avoids it by pairing through the SDK's hash map after fixup.
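One way to read this suggestion, as a hedged Go sketch: build a hash-to-index map over the tx set once, then look up each TxProcessing entry's envelope by hash rather than by position. The `txEnvelope` and `processed` types and `envelopesForPage` are hypothetical simplifications, not the helpers in tx_page_helpers.go, and the sketch assumes the result hashes have already been stamped by the fixup.

```go
package main

import "fmt"

// txEnvelope is a hypothetical stand-in for an envelope in the tx set.
type txEnvelope struct {
	Hash string
	Raw  string
}

// processed is a hypothetical stand-in for a TxProcessing entry (apply order).
type processed struct {
	Hash string
}

// envelopesForPage returns, for TxProcessing entries [start, start+n), the
// envelope with the matching hash, regardless of its position in the tx set.
func envelopesForPage(txSet []txEnvelope, processing []processed, start, n int) ([]txEnvelope, error) {
	if start < 0 || n < 0 || start > len(processing) {
		return nil, fmt.Errorf("page [%d,+%d) out of range", start, n)
	}
	byHash := make(map[string]int, len(txSet))
	for i, e := range txSet {
		byHash[e.Hash] = i
	}
	end := start + n
	if end > len(processing) {
		end = len(processing)
	}
	page := make([]txEnvelope, 0, end-start)
	for _, p := range processing[start:end] {
		idx, ok := byHash[p.Hash]
		if !ok {
			return nil, fmt.Errorf("no envelope for tx hash %q", p.Hash)
		}
		page = append(page, txSet[idx])
	}
	return page, nil
}

func main() {
	txSet := []txEnvelope{{"h1", "env1"}, {"h2", "env2"}, {"h3", "env3"}}
	processing := []processed{{"h3"}, {"h1"}, {"h2"}}
	page, err := envelopesForPage(txSet, processing, 0, 2)
	fmt.Println(page, err)
}
```

The map costs one pass over the tx set per ledger; whether that is acceptable on the views fast path is a trade-off for the author to weigh.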
Resolves 3 conflicts, all from the bench-report lineage (NOT the apply-load work, which merged cleanly):
- results/2026-06-03-cross-machine.md, results/2026-06-03-summary-table.md: took rpc-hack's versions (stale local copies superseded there).
- tx_page_helpers.go: took rpc-hack's version (its newer single-pass envelope collection; supporting helpers come in via rpc-hack's other files).
Package builds and tests pass; the apply-load SAC store still reads cleanly (cold-txpage errors=0, cold-txhash miss-rate=0, cold-events ok).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rgets
Per issue #762, the synthetic datasets target specific load shapes. The network target is 600ms blocks, so TPS is taken at a 600ms block time: per-ledger tx count = TPS * 0.6. Replaces the 1s assumption (CLOSE_TIME_S) with CLOSE_TIME_MS (default 600). The ledger header closeTime is whole seconds in XDR, so the sub-second cadence can't be a timestamp; it's modeled purely by density.
Resulting per-ledger densities (BATCH_SAC=1):
  sac       10,000 TPS -> 6,000 txs/ledger
  token/OZ   9,000 TPS -> 5,400 txs/ledger
  soroswap   2,500 TPS -> 1,500 txs/ledger
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
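The density arithmetic in that commit, written out as a tiny Go sketch. `txsPerLedger` is an illustrative name, not a function from apply-load-gen.sh (which is a shell script); integer milliseconds are used to avoid floating-point rounding on TPS * 0.6.

```go
package main

import "fmt"

// txsPerLedger converts a target TPS into a per-ledger transaction count
// for a given close time in milliseconds: TPS * closeMs / 1000.
func txsPerLedger(targetTPS, closeTimeMs int) int {
	return targetTPS * closeTimeMs / 1000
}

func main() {
	profiles := []struct {
		name string
		tps  int
	}{
		{"sac", 10000},
		{"token/OZ", 9000},
		{"soroswap", 2500},
	}
	for _, p := range profiles {
		fmt.Printf("%-9s %5d TPS -> %d txs/ledger\n", p.name, p.tps, txsPerLedger(p.tps, 600))
	}
}
```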
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b02f0eb5cc
    // End of the framed file. For the final/only chunk this is
    // expected when the synthetic run was sized below a full
    // chunk: yield what we have (if allowed) rather than error.
    if p.opts.allowPartial && isEnd(rerr) {
Treat truncated LCM frames as ingest errors
When --lcm-allow-partial is enabled by default, this branch also accepts io.ErrUnexpectedEOF, which readFrame returns if the file ends in the middle of a length prefix or payload. In an interrupted apply-load run or a partially copied meta.xdr, cold-ingest will silently stop at the last complete ledger and commit a shortened pack instead of surfacing that the input is corrupt; allowing a short final chunk should only accept a clean EOF before the next frame.
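The distinction this comment draws can be sketched in Go using the standard library's `io.ReadFull`, which returns `io.EOF` only when no bytes were read and `io.ErrUnexpectedEOF` when the input ends partway through. The framing below (4-byte big-endian length prefix with the high bit as a last-fragment marker) is an assumption about the stream format, and `readFrame`, `countFrames`, `isTruncated`, `frame`, and the `maxFrame` cap are illustrative names, not the harness's own.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

const maxFrame = 1 << 30 // sanity cap on a single frame (assumed value)

// readFrame reads one length-prefixed frame. It returns io.EOF only at a clean
// frame boundary, and io.ErrUnexpectedEOF if the input ends mid-prefix or
// mid-payload.
func readFrame(r io.Reader) ([]byte, error) {
	var hdr [4]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return nil, err // io.EOF at a boundary, io.ErrUnexpectedEOF after 1-3 bytes
	}
	n := binary.BigEndian.Uint32(hdr[:]) & 0x7fffffff
	if n > maxFrame {
		return nil, fmt.Errorf("frame length %d exceeds cap", n)
	}
	payload := make([]byte, n)
	if _, err := io.ReadFull(r, payload); err != nil {
		if errors.Is(err, io.EOF) {
			err = io.ErrUnexpectedEOF // prefix present but payload missing
		}
		return nil, err
	}
	return payload, nil
}

// countFrames reads frames until the input ends. With allowPartial, a clean
// EOF ends the (possibly short) chunk; a truncated frame is always an error.
func countFrames(data []byte, allowPartial bool) (int, error) {
	r := bytes.NewReader(data)
	frames := 0
	for {
		_, err := readFrame(r)
		switch {
		case err == nil:
			frames++
		case errors.Is(err, io.EOF) && allowPartial:
			return frames, nil
		default:
			return frames, fmt.Errorf("after %d complete frames: %w", frames, err)
		}
	}
}

// isTruncated reports whether err marks input that ended inside a frame.
func isTruncated(err error) bool { return errors.Is(err, io.ErrUnexpectedEOF) }

// frame encodes a payload with the assumed length prefix (demo helper).
func frame(p []byte) []byte {
	hdr := make([]byte, 4)
	binary.BigEndian.PutUint32(hdr, uint32(len(p))|0x80000000)
	return append(hdr, p...)
}

func main() {
	good := append(frame([]byte("a")), frame([]byte("bc"))...)
	n, err := countFrames(good, true)
	fmt.Println(n, err)
	n, err = countFrames(good[:len(good)-1], true)
	fmt.Println(n, isTruncated(err))
}
```

Under these assumptions, a short final chunk still ingests, while a meta.xdr cut off mid-frame surfaces as an error instead of a silently shortened pack.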
… speed
APPLY_LOAD_LEDGER_MAX_DEPENDENT_TX_CLUSTERS sets the number of parallel apply threads. It is purely a generation-speed knob and doesn't change the workload. Per upstream guidance, default it to 8 (was per-profile 1/2) and cap there: stellar-core's multi-threaded apply has known perf issues above 8 even on bigger boxes. Promoted to a top-level CLUSTERS env knob; removed from the per-profile table.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f2e7fd326d
    v1 := lcm.MustV1()
    for i := range v1.TxProcessing {
        if h, ok := stamp(v1.TxProcessing[i].FeeProcessing); ok {
            v1.TxProcessing[i].Result.TransactionHash = h
Pair envelopes after the LCM hash fixup
When --source=lcm ingests V1/V2 apply-load metadata, this only stamps each TxProcessing result with the envelope's real hash but leaves TxProcessing in apply-load order while the generalized tx set remains in its different envelope order (as documented at the top of this file). This is a separate affected path from the existing tx-page note: the xdr-view txhash bench finds the stamped hash in TxProcessing and then materializeViews fetches the envelope by the same apply index, so synthetic LCM packs can return a db.Transaction whose envelope belongs to a different transaction; the fixup needs to make the view materializers pair by hash or otherwise align the two orders.
…parallel stellar-core binds its HTTP server (default port 11626); running multiple apply-load generations concurrently failed the 2nd/3rd with "bind: address already in use". apply-load doesn't need the HTTP endpoint, so default HTTP_PORT=0 (disabled), env-overridable. Lets all profiles generate in parallel — on a 32-vCPU box that cuts a 3-profile 20k run from ~99h sequential to ~the slowest profile (~42h). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2d00293ff1
            }
            payload = raw
        }
        if !yield(p.applyFixup(payload, &fx), nil) {
Rewrite LCM header sequences before storing synthetic chunks
When --source=lcm is used with apply-load output whose benchmark ledger numbers do not already match the target chunk range (the script maps chunk 1 to ledger 10002+), this yields the raw LedgerCloseMeta unchanged while the driver indexes it under the synthetic l.Seq. Later read paths derive response metadata from the raw header (ledgerInfoFromHeader for tx pages and events.LCMToPayloads* for event payloads), so getTransaction/getTransactions/getEvents can return ledger sequences from the apply-load file rather than the pack/index sequence that was queried. The stream should rewrite the LCM header sequence to the positional chunk sequence before yielding/storing it.
Make the synthetic-ledger generation reproducible on another machine. The per-profile generator (apply-load-gen.sh) and the meta fixup were already committed; this adds the orchestration + docs that were previously ad-hoc:
- synthetic-run.sh: loop profiles -> apply-load-gen.sh (generate) -> bench-suite.sh (read benches) -> optional GCS upload. Sequential by default; PARALLEL=1 opt-in. Auto-builds the bench binary if BENCH_BIN unset.
- bench-suite.sh: cold-* and hot-* read suite per profile (both decode modes, concurrency sweep); skips events for non-4-topic profiles (token).
- SYNTHETIC-LEDGERS.md: host prereqs (~buildtests core, RocksDB cgo, Go), the TPS/600ms model, run commands, outputs, and the RAM ceiling.
RAM is the real limit: dense apply-load accumulates in-memory soroban state (~8.5 MB/ledger at 6000 SAC tx/ledger), so a full 10k-ledger 10k-TPS SAC chunk needs ~96-128 GB; on a 61 GB box cap sac/token near ~6000 ledgers. Documented with a per-box sizing table so the run can target a larger machine.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3ea7cc40c7
    gen_one(){
      local P="$1"
      log "generate $P (num_ledgers=$NUM_LEDGERS clusters=$CLUSTERS close_ms=$CLOSE_TIME_MS)"
      PROFILE="$P" "$SCRIPT_DIR/apply-load-gen.sh" > "$OUT_ROOT/$P.gen.log" 2>&1
Propagate generation failures before benchmarking
When apply-load-gen.sh fails for a profile (for example because stellar-core OOMs or no meta.xdr is produced), this command returns non-zero but the function immediately runs log, so gen_one itself succeeds; with set -uo pipefail the sequential loop continues and the parallel wait path can also miss failures. The orchestrator can then run benches or upload using missing/stale cold stores and still print DONE with exit 0 instead of stopping on the failed generation.
… terms
The events corpus hard-required exactly-4-topic events, so apply-load's custom_token profile (whose transfer events carry 3 topics) produced zero terms and couldn't run the events bench at all.
- EVENTS_TOPIC_COUNT env (default 4) sets the required topic count; extractors and the scan loop use it instead of a literal 4. sac/soroswap (4-topic) unchanged; token runs with EVENTS_TOPIC_COUNT=3.
- newCorpus: instead of erroring when the workload can't reach max(buckets), CAP the K-bucket sweep to the terms available (dedup), logging the cap. Lets low-diversity workloads run at the largest K they support.
Verified: token (3-topic, 1 contract) now builds the full 15-term universe (contract + transfer symbol + 13 from-addresses) and runs the K=1..15 sweep; cold-events 235ms@c=1 / 111 ops peak, hot-events 14ms / 1140 ops.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
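The "cap and dedup" behaviour that commit describes for newCorpus might look roughly like the Go sketch below. `capSweep` is a hypothetical name and the sample bucket list is made up; the sketch only shows the clamping rule, not the logging or corpus construction.

```go
package main

import "fmt"

// capSweep clamps each requested K (number of filter terms per query) to the
// number of terms actually available, then drops duplicates while keeping the
// original order. Values below 1 are discarded.
func capSweep(buckets []int, available int) []int {
	seen := make(map[int]bool, len(buckets))
	out := make([]int, 0, len(buckets))
	for _, k := range buckets {
		if k > available {
			k = available
		}
		if k < 1 || seen[k] {
			continue
		}
		seen[k] = true
		out = append(out, k)
	}
	return out
}

func main() {
	// Only 7 terms available: K=10 and K=15 both collapse to a single K=7 run.
	fmt.Println(capSweep([]int{2, 5, 10, 15}, 7))
}
```

Deduplicating after the clamp avoids running the same K twice when several requested buckets exceed the available terms.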
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2fd1966903
    # ---- COLD read benches (auto-discover chunk range) ----
    for n in $LEDGER_NS; do
      "$BENCH_BIN" cold-ledgers --cold-dir="$COLD/ledgers" --n="$n" --iters="$LEDGERS_ITERS" \
        --query-concurrency="$QC" --out="$O" > "$O/cold-ledgers-n$n.log" 2>&1 || echo " cold-ledgers n=$n FAILED"
Return failure when a benchmark command fails
When any bench invocation fails (for example a missing txhash.idx or a cold-txpage error), the || echo ... FAILED branch consumes the non-zero status, and this script has no failure accumulator or final non-zero exit. As a result, direct bench-suite.sh runs and callers such as synthetic-run.sh can report/upload an incomplete result set as successful; if continuing through the remaining benches is desired, record the failure and exit 1 at the end.
    CHECKPOINT="$(grep -oE 'Published final checkpoint before benchmark: ledger [0-9]+' "$APPLY_LOG" \
      | grep -oE '[0-9]+$' | tail -1 || true)"
    CHECKPOINT="${CHECKPOINT:-0}"
Fail when the apply-load checkpoint cannot be found
If stellar-core does not emit this exact log message (or the wording changes), CHECKPOINT silently falls back to 0, so cold-ingest --source=lcm skips no pre-benchmark setup frames. The generated cold store then includes setup/account-creation ledgers before the dense benchmark ledgers, which corrupts the target TPS/density assumptions for the synthetic read benches; this should error unless the checkpoint was actually found or explicitly provided.
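The driver itself is a shell script, but the fail-instead-of-default behaviour this comment asks for can be shown in a short Go sketch. `findCheckpoint` is a hypothetical helper; the log message it matches is the one from the grep in the snippet above, and taking the last match mirrors the `tail -1`.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// checkpointRE matches the log line the driver greps for.
var checkpointRE = regexp.MustCompile(`Published final checkpoint before benchmark: ledger ([0-9]+)`)

// findCheckpoint returns the ledger number from the last matching log line.
// Unlike a silent fallback to 0, it returns an error when no line matches.
func findCheckpoint(applyLog string) (uint32, error) {
	matches := checkpointRE.FindAllStringSubmatch(applyLog, -1)
	if len(matches) == 0 {
		return 0, fmt.Errorf("apply-load checkpoint line not found; provide the checkpoint explicitly")
	}
	last := matches[len(matches)-1][1]
	n, err := strconv.ParseUint(last, 10, 32)
	if err != nil {
		return 0, fmt.Errorf("bad checkpoint %q: %w", last, err)
	}
	return uint32(n), nil
}

func main() {
	log := "setup...\nPublished final checkpoint before benchmark: ledger 127\nbenchmark...\n"
	fmt.Println(findCheckpoint(log))
	fmt.Println(findCheckpoint("no such line"))
}
```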
Results report for the three synthetic apply-load datasets (sac/token/soroswap, 10k SAC / 9k OZ / 2.5k Soroswap TPS @ 600ms blocks) run through the full read + ingest suite on c6id.8xlarge. Datasets, configs, CSVs, and RESULTS.md are in GCS at gs://rpc-full-history/synthetic-ledgers/2026-06-04-apply-load-20k/. Covers query latency (both decode paths, p50/p99), peak throughput, per-stage ingest, and a comparison to the pubnet chunk-5860 baseline (synthetic is ~5-9x slower per query due to per-ledger density; ingest is item-bound). Implements #762's acceptance criteria via --source=lcm (no SDK loadtest dependency). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…d results (#762)
Self-contained, offline-capable HTML explorer (same tool as #758) for the three synthetic datasets. The "profile" dimension (sac/token/soroswap) maps onto the explorer's per-subdir axis; UI labels relabeled machine->Profile.
- adds make_explorer.py (from #758) + the generated 2026-06-09-synthetic-apply-load-explorer.html (all CSVs embedded; no deps).
- fixes the cold-throughput calc: divide chunk_wall by the actual chunk-workers (= chunk count for these runs; override via COLD_CHUNK_WORKERS) instead of a hardcoded 8, so cold ledgers/s reads correctly (sac 20, token 38, soroswap 108).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ce8d9c272d
    HOT="${HOT:-1}" # 1 = also build a hot store per profile and run hot-* benches
    # Profiles whose events are NOT 4-topic (skip cold/hot-events). apply-load's
    # custom_token emits non-4-topic events; sac/soroswap are fine.
    NO_EVENTS="${NO_EVENTS:-token}"
Skip events for the oz profile alias
When PROFILES uses the supported oz alias for custom_token (accepted by apply-load-gen.sh), this default does not skip event benches even though the same non-4-topic event shape applies. The suite then runs cold-events/hot-events for oz and hits the corpus error path instead of producing a clean read-bench run; include oz in the default skip list or normalize the profile name before this check.
…er + report (#762)
Include the real pubnet (non-synthetic) chunk-5860 baseline alongside the three synthetic datasets, on the same c6id.8xlarge, so the comparison is interactive and in the tables, not just prose.
- explorer: replace the synthetic-only HTML with 2026-06-09-synthetic-vs-pubnet-explorer.html: 4 datasets (pubnet + sac/token/soroswap), 200 query rows. pubnet contributes its query sweeps (the headline comparison); the ingest tab stays synthetic-only (pubnet used 8 chunk-workers vs synthetic 1-2, so a single throughput divisor would misreport it; its ingest lives in results/2026-06-03-cross-machine.md). Dimension relabeled Profile -> Dataset.
- report: add a pubnet column to the query-latency and throughput tables, and link the explorer.
Pubnet query sweeps pulled from gs://.../benchmarks/2026-06-03/c6id.8xlarge-... -corrected (same harness/CSV layout as the synthetic run).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 23bee2f951
    "$BENCH_BIN" cold-ledgers --cold-dir="$COLD/ledgers" --n="$n" --iters="$LEDGERS_ITERS" \
      --query-concurrency="$QC" --out="$O" > "$O/cold-ledgers-n$n.log" 2>&1 || echo " cold-ledgers n=$n FAILED"
Preserve each ledger-size sweep CSV
With the default LEDGER_NS="1 10 20", every iteration writes to the same --out=$O directory while cold-ledgers uses fixed output names (cold-ledgers.csv/cold-ledgers-sweep.csv via createCSV, which truncates with os.Create). The per-n logs are distinct, but the CSV data for n=1 and n=10 is overwritten by the later runs, so the result directory and explorer only retain the last ledger-size sweep; the hot-ledgers loop below has the same pattern.
    @@ -0,0 +1,168 @@
    package main
this is no longer necessary now that stellar/stellar-core#5319 has been merged. Even without that fix in stellar-core, the better workaround is to just configure the network passphrase in the benchmark commands to be "Apply Load"
    const (
        sourcePack = "pack"
        sourceBSB = "bsb"
        sourceLCM = "lcm"
instead of introducing a new source, I think it would be better to have a post-processing step after apply-load which takes the generated ledgers and converts them into the pack format. Then we can run the benchmarks with the source set to pack, pointing at the synthetic-ledgers packfile.
…t p99 tail (#762)
results/rocksdb-config.md: per-CF key knobs (events/ledgers/txhash) extracted from the OPTIONS files RocksDB wrote, plus the full verbatim events-CF OPTIONS.
Reveals the p99 ingest-tail cause: events & ledgers CFs run on RocksDB defaults (auto-compaction on, max_background_jobs=2, L0 slowdown@20/stop@36), while txhash is tuned write-once (disable_auto_compactions, L0 triggers 999, 8 bg jobs). The events CF's default L0 throttling under dense writes is what produces the ~8x p99/p50 on events hot_write.
Linked from the synthetic report.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
📊 Results report (addresses #762)
Benchmark report committed at cmd/stellar-rpc/scripts/bench-fullhistory/results/2026-06-09-synthetic-apply-load.md: the three synthetic datasets (sac 10k / token 9k / soroswap 2.5k TPS @ 600 ms) run through the full read + ingest suite on c6id.8xlarge, with a comparison to the pubnet chunk-5860 baseline.
Datasets + configs + CSVs + machine-readable RESULTS.md are in GCS at gs://rpc-full-history/synthetic-ledgers/2026-06-04-apply-load-20k/.
Summary
Addresses #762: a controllable synthetic-ledger dataset whose transaction profile we set deliberately, feeding the existing cold-*/hot-* ingest + query benches. Builds on the apply-load driver from the parent branch (a8c82958).
It delivers the issue's acceptance criteria via the --source=lcm path: apply-load-gen.sh runs stellar-core apply-load to emit a framed LedgerCloseMeta stream, and cold-ingest --source=lcm turns it into the cold packfiles (+ hot store) the query benches read, unchanged. This is used instead of the proposed --source=synthetic + ingest/loadtest.ApplyLoad wiring because the pinned go-stellar-sdk doesn't yet include ingest/loadtest (stellar/go-stellar-sdk#5940), so --source=lcm reaches the same goal with no dependency bump. Swapping in loadtest.ApplyLoad later is a drop-in producer change; the ingest/query path is unaffected.
apply-load config
apply-load-gen.sh writes the upstream docs/apply-load-benchmark-*.cfg shape plus meta output, parameterized per profile:
- APPLY_LOAD_MODE="benchmark", APPLY_LOAD_MODEL_TX=sac|custom_token|soroswap
- APPLY_LOAD_MAX_SOROBAN_TX_COUNT = txs-per-ledger (the density knob)
- APPLY_LOAD_LEDGER_MAX_DEPENDENT_TX_CLUSTERS=CLUSTERS (default 8): parallel apply threads, a generation-speed knob only; capped at 8 (known multi-threaded-apply perf issues above that)
- APPLY_LOAD_NUM_LEDGERS = ledgers to close after setup (NUM_LEDGERS)
- METADATA_OUTPUT_STREAM + DISABLE_TX_META_FOR_TESTING=false, BL pre-gen disabled, GENESIS_TEST_ACCOUNT_COUNT ≥ 2× TPL, quorum/history boilerplate.
Target load shapes (10k SAC / 9k OZ / 2.5k Soroswap @ 600 ms blocks)
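TPS is taken at the network's 600 ms block time (CLOSE_TIME_MS), so per-ledger tx count = TPS × 0.6. (The ledger header closeTime is whole seconds in XDR, so the sub-second cadence is modeled by density, not timestamps.)
Profile targets (per-ledger count derived from TPS × 0.6):
- sac: 10k TPS, 6,000 txs per ledger
- custom_token: 9k TPS, 5,400 txs per ledger
- soroswap: 2.5k TPS, 1,500 txs per ledger
A small NUM_LEDGERS is enough: TPS is set by density, not ledger count.
What it took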
- apply-load-gen.sh path + pubnet passphrase: absolutize OUT_ROOT; default NETWORK_PASSPHRASE to pubnet (the bench binary hardcodes it).
- lcm_fixup.go: apply-load's streamed meta holds the same txs in the tx-set and in TxProcessing but in a different order, and the stored result hash matches no envelope under any passphrase, so the SDK ingest reader (which pairs by hash) rejected it with "unknown tx hash in LedgerCloseMeta", breaking roundtrip cold-txpage/cold-txhash. We repair it: each result's unique fee-charged account identifies its envelope, so we stamp the real hash (a correct pairing). Default-on (--lcm-fix-tx-hashes). Plus partial final chunk (--lcm-allow-partial) + cold-ledgers/cold-txhash cursor clamping, so runs below a full 10k-ledger chunk work; NUM_LEDGERS knob.
- corpus.go: relaxed the ≥3-contract floor to "enough filterable terms (anchors + topic values ≥ max K)"; a single SAC contract's transfer events over many accounts give the needed 15 terms. Pubnet behaviour unchanged.
- sac BATCH_SAC=1: BATCH_SAC>1 folds transfers into one tx and only that tx is streamed, so the pack carried ~1 tx/ledger; =1 makes each transfer its own tx so density equals the TPS target.
- CLOSE_TIME_MS knob (default 600), and CLUSTERS default 8 for generation speed.
Verification: generated all 3 profiles (100 ledgers each, CLUSTERS=8) and decoded the packs
Fixup paired 100% (0 skipped) on all three; read benches run 0-errors / miss-rate 0.
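The pairing rule behind these numbers (each result's unique fee-charged account identifies its envelope) can be sketched roughly as follows. This is a simplified illustration, not the code in lcm_fixup.go: the Envelope and Result types, their fields, and pairByFeeAccount are hypothetical stand-ins for the XDR structures, and the sketch assumes each account pays the fee for at most one transaction per ledger.

```go
package main

import "fmt"

// Envelope is a hypothetical stand-in for a transaction envelope.
type Envelope struct {
	Hash       string // real transaction hash computed from the envelope
	FeeAccount string // account charged the fee
}

// Result is a hypothetical stand-in for a TxProcessing entry.
type Result struct {
	Hash       string // hash stored in the meta (may match no envelope)
	FeeAccount string // account touched by the fee change
}

// pairByFeeAccount stamps each result with the hash of the envelope that
// shares its fee account. Results with no match, or whose account is shared
// by several envelopes, are left untouched and counted as skipped.
func pairByFeeAccount(envs []Envelope, results []Result) (paired, skipped int) {
	byAccount := make(map[string]string, len(envs))
	ambiguous := make(map[string]bool)
	for _, e := range envs {
		if _, dup := byAccount[e.FeeAccount]; dup {
			ambiguous[e.FeeAccount] = true
		}
		byAccount[e.FeeAccount] = e.Hash
	}
	for i := range results {
		acct := results[i].FeeAccount
		hash, ok := byAccount[acct]
		if !ok || ambiguous[acct] {
			skipped++
			continue
		}
		results[i].Hash = hash
		paired++
	}
	return paired, skipped
}

func main() {
	// The tx-set and the results list the same txs in different order.
	envs := []Envelope{
		{Hash: "h1", FeeAccount: "A"},
		{Hash: "h2", FeeAccount: "B"},
	}
	results := []Result{
		{Hash: "bogus", FeeAccount: "B"},
		{Hash: "bogus", FeeAccount: "A"},
	}
	paired, skipped := pairByFeeAccount(envs, results)
	fmt.Println(paired, skipped, results[0].Hash, results[1].Hash) // prints "2 0 h2 h1"
}
```

Counting skipped results separately is what makes a "100% paired, 0 skipped" check meaningful: an ambiguous or missing fee account is reported instead of being paired by guesswork.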
Known limitation
cold-events is not supported for token/custom_token: its events are not 4-topic (a workload property). Use sac or soroswap for event benches.
Base note: targeted at rpc-hack, which doesn't yet contain a8c82958 nor the 2026-06-03 bench-report commits, so the diff also carries those preceding bench(fullhistory) commits + a merge of rpc-hack (conflicts were stale report docs + a txpage helper, all resolved to rpc-hack; the apply-load files merged cleanly). The substantive new work is the apply-load commits above.
🤖 Generated with Claude Code