bench(fullhistory): make apply-load synthetic LCM usable by the read benches at target TPS #760

Open
chowbao wants to merge 29 commits into rpc-hack from bench/apply-load-synthetic-ledgers

@chowbao chowbao commented Jun 4, 2026

Contributor

📊 Results report (addresses #762)

Benchmark report committed at cmd/stellar-rpc/scripts/bench-fullhistory/results/2026-06-09-synthetic-apply-load.md — the three synthetic datasets (sac 10k / token 9k / soroswap 2.5k TPS @ 600 ms) run through the full read + ingest suite on c6id.8xlarge, with a comparison to the pubnet chunk-5860 baseline.

Datasets + configs + CSVs + machine-readable RESULTS.md are in GCS at gs://rpc-full-history/synthetic-ledgers/2026-06-04-apply-load-20k/.


Summary

Addresses #762 — a controllable synthetic-ledger dataset whose transaction profile we set deliberately, feeding the existing cold-* / hot-* ingest + query benches. Builds on the apply-load driver from the parent branch (a8c82958).

It delivers the issue's acceptance criteria via the --source=lcm path: apply-load-gen.sh runs stellar-core apply-load to emit a framed LedgerCloseMeta stream, and cold-ingest --source=lcm turns it into the cold packfiles (+ hot store) the query benches read — unchanged. This is used instead of the proposed --source=synthetic + ingest/loadtest.ApplyLoad wiring because the pinned go-stellar-sdk doesn't yet include ingest/loadtest (stellar/go-stellar-sdk#5940), so --source=lcm reaches the same goal with no dependency bump. Swapping in loadtest.ApplyLoad later is a drop-in producer change; the ingest/query path is unaffected.

apply-load config

apply-load-gen.sh writes the upstream docs/apply-load-benchmark-*.cfg shape plus meta output, parameterized per profile:

  • APPLY_LOAD_MODE="benchmark", APPLY_LOAD_MODEL_TX=sac|custom_token|soroswap
  • APPLY_LOAD_MAX_SOROBAN_TX_COUNT = txs-per-ledger (the density knob)
  • APPLY_LOAD_LEDGER_MAX_DEPENDENT_TX_CLUSTERS = CLUSTERS (default 8) — parallel apply threads, a generation-speed knob only; capped at 8 (known multi-threaded-apply perf issues above that)
  • APPLY_LOAD_NUM_LEDGERS = ledgers to close after setup (NUM_LEDGERS)
  • METADATA_OUTPUT_STREAM + DISABLE_TX_META_FOR_TESTING=false, BL pre-gen disabled, GENESIS_TEST_ACCOUNT_COUNT ≥ 2× TPL, quorum/history boilerplate.

Target load shapes (10k SAC / 9k OZ / 2.5k Soroswap @ 600 ms blocks)

TPS is taken at the network's 600 ms block time (CLOSE_TIME_MS), so per-ledger tx count = TPS × 0.6. (The ledger header closeTime is whole seconds in XDR, so the sub-second cadence is modeled by density, not timestamps.)

| profile | model tx | target | txs/ledger @600ms |
|---|---|---|---|
| sac | sac | 10 000 TPS | 6 000 |
| token/OZ | custom_token | 9 000 TPS | 5 400 |
| soroswap | soroswap | 2 500 TPS | 1 500 |

A small NUM_LEDGERS is enough — TPS is set by density, not ledger count.
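
The TPS-to-density conversion above can be sketched as follows. This is a minimal illustration, not the script's actual code; the helper name `txsPerLedger` is hypothetical, and only the 600 ms block time and the three target TPS values come from the description.

```go
package main

import "fmt"

// txsPerLedger converts a target TPS into a per-ledger transaction count for
// a block time given in milliseconds. Integer math avoids float rounding.
// Hypothetical helper, named for illustration only.
func txsPerLedger(tps, closeTimeMs int) int {
	return tps * closeTimeMs / 1000
}

func main() {
	profiles := []struct {
		name string
		tps  int
	}{{"sac", 10000}, {"token/OZ", 9000}, {"soroswap", 2500}}

	for _, p := range profiles {
		fmt.Printf("%-9s %5d TPS -> %d txs/ledger\n", p.name, p.tps, txsPerLedger(p.tps, 600))
	}
}
```

Because the count depends only on TPS and block time, NUM_LEDGERS changes how long the dataset is, not how dense each ledger is.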

What it took

  1. apply-load-gen.sh path + pubnet passphrase — absolutize OUT_ROOT; default NETWORK_PASSPHRASE to pubnet (the bench binary hardcodes it).
  2. Make the synthetic LCM consumable (lcm_fixup.go) — apply-load's streamed meta holds the same txs in the tx-set and TxProcessing but in different order, and the stored result hash matches no envelope under any passphrase, so the SDK ingest reader (pairs by hash) rejected it with unknown tx hash in LedgerCloseMeta, breaking roundtrip cold-txpage/cold-txhash. We repair it: each result's unique fee-charged account identifies its envelope, so we stamp the real hash (a correct pairing). Default-on (--lcm-fix-tx-hashes). Plus partial final chunk (--lcm-allow-partial) + cold-ledgers/cold-txhash cursor clamping, so runs below a full 10k-ledger chunk work; NUM_LEDGERS knob.
  3. Single-contract events corpus (corpus.go) — relaxed the ≥3-contract floor to "enough filterable terms (anchors + topic values ≥ max K)"; a single SAC contract's transfer events over many accounts give the needed 15 terms. Pubnet behaviour unchanged.
  4. sac BATCH_SAC=1: with BATCH_SAC>1, transfers are folded into one tx and only that tx is streamed, so the pack carried ~1 tx/ledger; BATCH_SAC=1 makes each transfer its own tx, so density equals the TPS target.
  5. 600 ms block model (CLOSE_TIME_MS, default 600) and CLUSTERS default 8 for generation speed.

Verification — generated all 3 profiles (100 ledgers each, CLUSTERS=8) and decoded the packs

profile tx/ledger (decoded) TPS @600ms tx-hash fixup cold-ledgers cold-txpage cold-txhash cold-events
sac 6 000 10 000 600022 / 600022 ✅ miss-rate 0
token/OZ 5 400 9 000 540028 / 540028 ✅ miss-rate 0 n/a
soroswap 1 500 2 500 164445 / 164445 ✅ miss-rate 0

Fixup paired 100% (0 skipped) on all three; read benches run 0-errors / miss-rate 0.

Known limitation

cold-events is not supported for token/custom_token — its events are not 4-topic (a workload property). Use sac or soroswap for event benches.


Base note: targeted at rpc-hack, which contains neither a8c82958 nor the 2026-06-03 bench-report commits yet, so the diff also carries those preceding bench(fullhistory) commits plus a merge of rpc-hack (the conflicts were stale report docs and a txpage helper, all resolved to rpc-hack's versions; the apply-load files merged cleanly). The substantive new work is the apply-load commits above.

🤖 Generated with Claude Code

chowbao and others added 19 commits May 21, 2026 19:37
Markdown report covering the cross-machine bench run captured under
gs://rpc-full-history/benchmarks/{c6id.2xlarge,c6id.4xlarge,c6id.8xlarge,im4gn.4xlarge}-2026-05-21*.
Tables + Mermaid xychart-beta blocks for: peak read throughput,
worker scaling (cold and hot n=1), tx-page page-size sweep,
xdr-views vs round-trip on tx-hash + events-ingest, per-ledger ingest,
bulk ingest, cold-vs-hot speedup, and x86 vs Graviton2 at matched vCPU.

Source per-iter CSVs and the summary CSVs that back every table here
live at gs://rpc-full-history/benchmarks/_summary/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… report

New section 11 transposes the cross-machine tables: one consolidated
table per machine (c6id.2xlarge, c6id.4xlarge, c6id.8xlarge,
im4gn.4xlarge) listing every bench result — full ledger grid sweep,
tx-page, tx-hash (hit/miss × xdrviews/roundtrip), per-ledger ingest,
and bulk ingest — with p50/p90/p99 and throughput.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New Section 2 ("Internal vs production RPC providers") includes the
prior black-box benchmark across 4–6 production RPC providers and
juxtaposes their p50s with the internal hot/cold tiers. Adds a
Mermaid bar/line chart of the per-workload speedups. Remaining
sections renumbered 3–12.

Headline: hot/cold full-history is 10×–1773× faster than the average
production RPC across ledger-point, ledger-range, tx-page, tx-hash,
and the four event-filter scenarios. Note: 'onfinality' and 'sorobanrpc'
are absent from tx-hash and events workloads (n=4 instead of 6).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Colons in x-axis labels (ev:nofilt, ev:contract, ev:topic, ev:both)
break Mermaid's xychart-beta parser. Replaced with hyphens.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New-data-only report over the 2026-06-03 runs (4 machines) on the rewritten
rpc-hack bench harness. Notes methodology changes vs 2026-05-21: ops/s is no
longer comparable across runs (only single-in-flight p50 latency is), the
sweep axis is now query-concurrency 1-16, and ledger/tx-page/tx-hash read
coverage narrowed while events query + ingest stage detail broadened.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Condensed two-table view (typical p50 latency + peak throughput) with a full
glossary defining every row, column, tier, and variable (n, page, c, p50/p99,
ops/s). Links back to the full cross-machine report.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds Table 3 (ingest throughput: hot-ingest ledgers/s, build-txhash-index
keys/s) and Table 4 (per-stage ingest cost), plus glossary entries for the
ingest workloads and ledgers/s, keys/s, and stage terms.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
cold-ingest ledgers/s computed as sum(chunk_wall) / chunk-workers (upper-bound
estimate, since the harness records summed per-chunk wall, not true end-to-end
wall). Flagged as an estimate; scales with --chunk-workers.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…1 report

Source/summary CSV paths were missing the dated prefix (data lives under
.../benchmarks/2026-05-21/, the undated paths don't exist). Also dates the title
and forward-links the 2026-06-03 run, noting the harness changed and ops/s is
not comparable across runs. Historical 5/21 numbers are unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Drives the full read + ingest bench suite in bench-fullhistory: builds
the binary once, then runs cold+hot ledgers/txpage/txhash/events read
benches (each a 1,4,8,16 query-concurrency sweep) plus the hot-ingest,
cold-ingest, and build-txhash-index ingest benches.

By default the reads use prebuilt fixtures and ingest writes to scratch
(independent measurements). INGEST_FIRST=1 instead ingests first and
repoints every read bench at the freshly-ingested stores, so the suite
is self-contained from a single raw-ledger packfile seed — usable on a
fresh machine with no prebuilt data. Paths/sizing knobs are env-
overridable for running across different machines.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PR #750 review (tamirms) flagged two harness gaps and several execution
issues. Code fixes:

- txpage (hot+cold) previously only touched TransactionHash + ResultPair —
  it never fetched the page contents, so it measured a tx *count*, not a
  getTransactions response. New walkPageMaterialize (tx_page_helpers.go)
  builds a full db.Transaction per tx in the page (envelope, result, meta,
  events, hash, application order, ledger info).
- txpage (hot+cold) had no --xdr-views flag, so it only measured the slow
  full-decode path. Added --xdr-views with a single-pass view materializer,
  mirroring the txhash bench. CSVs suffix -roundtrip / -xdrviews; detail
  column scan_ns -> materialize_ns (decode_ns stays 0 under views).

Execution (run-all-benches.sh):

- Run the decode-heavy query benches (txpage/txhash/events) once per mode
  (QUERY_VIEW_MODES = roundtrip + xdrviews) so the report can compare with/
  without XDR views. Previously every query ran views-off (slow path).
- Events use the worst-case query (EVENTS_BUCKETS=15, max filters/request).
- Ingest runs with --parallel; hot-ingest runs both xdr-views on and off
  (the views run feeds the reads, the parsed run is kept for its CSVs).

Smoke-tested: 0 errors, pages fully materialized; views 4-8x faster than
round-trip (decode_ns=0 confirms the path dispatch).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
)

Re-ran c6id.8xlarge with the corrected harness and rewrote the report to
address the PR #750 review:

- New "c6id.8xlarge — corrected" section: query latency split into hot/cold
  tables with roundtrip vs xdr-views columns and P50+P99; events use
  worst-case K=15; ingest shown hot (parsed vs view, --parallel) and cold
  with the per-stage phase breakdown + per-ledger driver total.
- The other three machines (2xlarge/4xlarge/im4gn) are marked STALE (old
  harness: tx-page-as-count, views-off) pending a re-run.
- Dropped the per-machine raw-cell dump (§12) — the CSVs are on GCS.
- Summary table: same treatment (banner, corrected c6id.8xlarge rows, stale
  markers on the rest).

Headline corrected numbers: xdr-views cuts tx-page/tx-hash p50 4-9x (hot
tx-hash 10.6->1.2ms) and lifts peak throughput 5-10x (hot tx-hash 706->7253
ops/s); events is decode-insensitive (1.1-1.4x). Hot ingest with views is
~2.1x faster than parsed (skips the 8.4ms/ledger UnmarshalBinary).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ply-load

Adds an `lcm` ledger source and an apply-load-gen.sh driver so the
bench-fullhistory suite can run on fully synthetic, density-controlled data
instead of real pubnet chunks.

- sources.go: new --source=lcm reader over apply-load's framed-XDR
  METADATA_OUTPUT_STREAM. Skips setup ledgers (<= --lcm-checkpoint) and
  decode-free frame-skips to each chunk's 10k-ledger block; reuses the entire
  cold-ingest/hot-ingest/build-txhash-index pipeline. Wired --lcm-file/
  --lcm-checkpoint flags into both ingest commands.
- apply-load-gen.sh: drives stellar-core new-db/new-hist/apply-load ->
  meta.xdr -> cold-ingest --source=lcm -> packfiles -> build-txhash-index.
  Profiles map to apply-load model txs + target TPS: sac (~10k), token/oz
  (~9k custom_token), soroswap (~2.5k). Uses the installed core's protocol.
- lcm_source_test.go: unit-tests setup-skip, chunk-block mapping, short-read.
- README: documents the lcm source, the driver, profiles, BUILD_TESTS
  requirement, and the real cost of full 10k-ledger chunks.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- absolutize OUT_ROOT so the config/meta paths survive the cd into the
  per-profile work dir (core was erroring "No config file ... found")
- default NETWORK_PASSPHRASE to pubnet to match the bench binary's
  hardcoded pubnetPassphrase: the ingest reader recomputes each tx hash
  under this passphrase and matches it against the result entries, so a
  mismatch broke the roundtrip txpage/txhash read paths with
  "unknown tx hash in LedgerCloseMeta".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…benches

apply-load streams a LedgerCloseMeta whose tx-set and TxProcessing are the same
transactions in different order, but whose stored result hash does not equal any
envelope's real hash under the network passphrase (confirmed against core
26.1.1: 0/N result hashes matched an envelope under pubnet/testnet/standalone,
while every envelope's source account was fee-charged in exactly one
TxProcessing entry — a clean bijection). The go-stellar-sdk ingest
LedgerTransactionReader pairs envelope↔result BY HASH, so it rejected the meta
with "unknown tx hash in LedgerCloseMeta", breaking the roundtrip tx-page and
tx-hash read benches. (The xdr-views path, which pairs positionally, was
unaffected.)

- lcm_fixup.go: for each result, find the fee-charged account, map it back to
  the unique envelope with that source, and stamp the envelope's real tx hash.
  This is a correct pairing, not merely self-consistent. cold-ingest --source=lcm
  applies it by default (--lcm-fix-tx-hashes); logs fixed/skipped per chunk.
- sources.go: lcmStream applies the fixup and tolerates a short final chunk
  (--lcm-allow-partial) so runs sized below a full 10k-ledger chunk work.
- cold-ledgers / cold-txhash: clamp sampling + start cursors to each chunk's
  actual ledger range (FirstSeq/LastSeq) so partial chunks don't short-read.
- apply-load-gen.sh: NUM_LEDGERS knob for quick runs — TPS is set by per-ledger
  density, not ledger count, so a few hundred ledgers hit the profile target.
- README: document the fixup, partial chunks, NUM_LEDGERS, and that cold-events
  is unsupported on apply-load data (single-contract; corpus needs >=3).

Validated end-to-end: cold-ledgers / cold-txpage / cold-txhash all run with 0
errors on both a 308-ledger SAC store and an 892-ledger / 7.65M-tx token store
(fixup paired 7650042/7650042, skipped 0).
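
The pairing idea behind the fixup can be sketched roughly as below. This is not the code in lcm_fixup.go: `Envelope`, `Result`, `realHash`, and `fixHashes` are hypothetical, simplified stand-ins, and the SHA-256 of the body is only a placeholder for the real transaction hash (which also covers the network ID).

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Envelope and Result are simplified, hypothetical stand-ins for the XDR
// transaction envelope and the TxProcessing result entry.
type Envelope struct {
	Source string // source account of the transaction
	Body   []byte // serialized envelope bytes
}

type Result struct {
	FeeAccount string   // account charged the fee in this entry
	Hash       [32]byte // stored transaction hash (not valid before the fixup)
}

// realHash is a placeholder for the real transaction hash computation.
func realHash(e Envelope) [32]byte { return sha256.Sum256(e.Body) }

// fixHashes stamps each result with the hash of the unique envelope whose
// source account was fee-charged in that result. Results whose account is
// missing or ambiguous are skipped rather than guessed.
func fixHashes(envs []Envelope, results []Result) (fixed, skipped int) {
	count := make(map[string]int, len(envs))
	bySource := make(map[string]int, len(envs))
	for i, e := range envs {
		count[e.Source]++
		bySource[e.Source] = i
	}
	for i := range results {
		acct := results[i].FeeAccount
		if count[acct] != 1 {
			skipped++
			continue
		}
		results[i].Hash = realHash(envs[bySource[acct]])
		fixed++
	}
	return fixed, skipped
}

func main() {
	envs := []Envelope{
		{Source: "GA", Body: []byte("tx-a")},
		{Source: "GB", Body: []byte("tx-b")},
		{Source: "GC", Body: []byte("tx-c")},
	}
	// Same transactions in a different order, hashes not yet valid.
	results := []Result{{FeeAccount: "GC"}, {FeeAccount: "GA"}, {FeeAccount: "GB"}}
	fixed, skipped := fixHashes(envs, results)
	fmt.Printf("fixed=%d skipped=%d\n", fixed, skipped)
}
```

The sketch relies on the property the commit message reports: each source account appears in exactly one envelope per ledger. Where that does not hold, the entry is counted as skipped instead of being paired arbitrarily.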

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The cold-events corpus builder hard-required ≥3 distinct contracts emitting
4-topic events (termsPerCategory anchors), which excluded apply-load's
single-contract synthetic workloads. But the real requirement is enough unique
FILTERABLE TERMS to fill the K-bucket sweep — a contract anchor plus topic
values — not a minimum contract count. A single contract with topic diversity
(e.g. a SAC's `transfer` events varying from/to over thousands of accounts)
provides them.

- scanForTopTerms: accept ≥1 contract (anchors = min(3, nContracts)); fill the
  rest of the 15-term budget from topic values. Only fail when NO contract emits
  4-topic events.
- newCorpus: validate total terms ≥ max(buckets) — the actual sweep requirement
  — with a message that points at topic diversity / --buckets, not contracts.

Validated: cold-events now runs the full K=2..15 sweep on a synthetic SAC store
(1 contract + 14 topic terms = 15) and a soroswap store (2 contracts + 13).
token/custom_token still yields nothing — its events are not 4-topic (a workload
property). Existing pubnet-shaped corpus behaviour is unchanged (still picks 3
contract anchors when ≥3 are present).
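
A rough sketch of the relaxed term selection, under stated assumptions: `pickTerms` and `maxAnchors` are hypothetical names, inputs are assumed to be pre-ranked and free of duplicates, and the error wording is illustrative. Only the rule itself (anchors = min(3, nContracts), the rest filled from topic values, fail when the budget cannot be met) comes from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

const maxAnchors = 3 // at most three contract anchors, as on pubnet-shaped data

// pickTerms builds the filterable-term universe: contract anchors first,
// then topic values until the budget is filled.
func pickTerms(contracts, topicValues []string, budget int) ([]string, error) {
	if len(contracts) == 0 {
		return nil, errors.New("no contract emits 4-topic events")
	}
	anchors := len(contracts)
	if anchors > maxAnchors {
		anchors = maxAnchors
	}
	if anchors > budget {
		anchors = budget
	}
	terms := append([]string{}, contracts[:anchors]...)
	for _, v := range topicValues {
		if len(terms) >= budget {
			break
		}
		terms = append(terms, v)
	}
	if len(terms) < budget {
		return nil, fmt.Errorf("only %d filterable terms, need %d: add topic diversity or lower --buckets",
			len(terms), budget)
	}
	return terms, nil
}

func main() {
	topics := make([]string, 20)
	for i := range topics {
		topics[i] = fmt.Sprintf("addr-%02d", i)
	}
	terms, err := pickTerms([]string{"sac-contract"}, topics, 15)
	fmt.Println(len(terms), err) // 1 anchor + 14 topic values
}
```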

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
With APPLY_LOAD_BATCH_SAC_COUNT=100 the sac profile folded 100 transfers into a
single InvokeHostFunction tx, and core's benchmark mode closed/streamed just one
such tx per ledger — so the usable pack carried ~100 transfers/ledger (~100 TPS),
100x below the 10k target (verified by decoding the pack: 1 tx, 1 op, ~97 events
per ledger). Setting BATCH_SAC=1 makes every transfer its own tx, so the closed
ledger carries the full count.

Verified by decoding the regenerated packs (tail/benchmark ledgers):
  sac      : 10000 tx / 10000 ops / 10000 events per ledger -> 10000 TPS
  soroswap :  2500 tx /  2500 ops / 12500 events per ledger ->  2500 TPS
  token    :  9000 tx /  9000 ops /  9000 events per ledger ->  9000 TPS (unchanged)
All four read benches (cold-ledgers/txpage/txhash/events) run with 0 errors and
miss-rate=0 on the sac and soroswap stores.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b75236a8a4

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +242 to +243
applyIdx := start + k
envelopeRaw, envType, eerr := envAt(applyIdx)

P2: Pair page envelopes by hash instead of index

For --source=lcm apply-load data, lcm_fixup.go documents that the generalized tx set and TxProcessing contain the same transactions in different orders. In the new --xdr-views txpage path, parts[k] comes from TxProcessing[start+k] but the envelope is fetched from the tx set at the same index, so V1/V2 synthetic pages can return a db.Transaction whose envelope does not match its hash/result/meta. This affects the apply-load read benches the commit adds; the roundtrip path avoids it by pairing through the SDK's hash map after fixup.
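
One way to read this suggestion is to index the tx-set envelopes by hash and look each TxProcessing entry up in that index. The sketch below uses hypothetical, simplified types (`Envelope`, `Processed`) and a placeholder `hashOf`; it is not the view materializer's actual code.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Envelope and Processed are hypothetical stand-ins for a tx-set envelope
// and a TxProcessing entry.
type Envelope struct{ Body []byte }
type Processed struct{ Hash [32]byte }

// hashOf is a placeholder for the real transaction hash.
func hashOf(e Envelope) [32]byte { return sha256.Sum256(e.Body) }

// pairByHash returns, for each processed entry, the envelope with the same
// hash, regardless of where that envelope sits in the tx set.
func pairByHash(txSet []Envelope, parts []Processed) ([]Envelope, error) {
	index := make(map[[32]byte]int, len(txSet))
	for i, e := range txSet {
		index[hashOf(e)] = i
	}
	out := make([]Envelope, len(parts))
	for k, p := range parts {
		i, ok := index[p.Hash]
		if !ok {
			return nil, fmt.Errorf("entry %d: no envelope with hash %x", k, p.Hash[:4])
		}
		out[k] = txSet[i]
	}
	return out, nil
}

func main() {
	txSet := []Envelope{{[]byte("tx-a")}, {[]byte("tx-b")}, {[]byte("tx-c")}}
	// Apply order differs from tx-set order.
	parts := []Processed{{hashOf(txSet[2])}, {hashOf(txSet[0])}}
	page, err := pairByHash(txSet, parts)
	fmt.Println(string(page[0].Body), string(page[1].Body), err)
}
```

Building the index costs one hash per envelope per ledger, so whether this is acceptable on the xdr-views fast path is a trade-off the bench itself would have to measure.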

Simon Chow and others added 2 commits June 4, 2026 18:45
Resolves 3 conflicts, all from the bench-report lineage (NOT the apply-load
work, which merged cleanly):
- results/2026-06-03-cross-machine.md, results/2026-06-03-summary-table.md:
  took rpc-hack's versions (stale local copies superseded there).
- tx_page_helpers.go: took rpc-hack's version (its newer single-pass envelope
  collection; supporting helpers come in via rpc-hack's other files).

Package builds and tests pass; the apply-load SAC store still reads cleanly
(cold-txpage errors=0, cold-txhash miss-rate=0, cold-events ok).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rgets

Per issue #762, the synthetic datasets target specific load shapes. The network
target is 600ms blocks, so TPS is taken at a 600ms block time: per-ledger tx
count = TPS * 0.6. Replaces the 1s assumption (CLOSE_TIME_S) with CLOSE_TIME_MS
(default 600). The ledger header closeTime is whole seconds in XDR, so the
sub-second cadence can't be a timestamp — it's modeled purely by density.

Resulting per-ledger densities (BATCH_SAC=1):
  sac      10,000 TPS -> 6,000 txs/ledger
  token/OZ  9,000 TPS -> 5,400 txs/ledger
  soroswap  2,500 TPS -> 1,500 txs/ledger

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b02f0eb5cc

// End of the framed file. For the final/only chunk this is
// expected when the synthetic run was sized below a full
// chunk: yield what we have (if allowed) rather than error.
if p.opts.allowPartial && isEnd(rerr) {

P2: Treat truncated LCM frames as ingest errors

When --lcm-allow-partial is enabled by default, this branch also accepts io.ErrUnexpectedEOF, which readFrame returns if the file ends in the middle of a length prefix or payload. In an interrupted apply-load run or a partially copied meta.xdr, cold-ingest will silently stop at the last complete ledger and commit a shortened pack instead of surfacing that the input is corrupt; allowing a short final chunk should only accept a clean EOF before the next frame.
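
The distinction the review draws (clean EOF at a frame boundary versus a cut inside a frame) can be sketched as below. The framing is an assumption for illustration: a 4-byte big-endian length prefix whose top bit is a record-mark flag. `encodeFrame`, `readChunk`, and `classify` are hypothetical helpers, and this `readFrame` is a stand-in, not the one in sources.go.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// encodeFrame prefixes payload with a 4-byte big-endian length.
// Assumption: the top bit is a record-mark flag.
func encodeFrame(payload []byte) []byte {
	out := make([]byte, 4, 4+len(payload))
	binary.BigEndian.PutUint32(out, uint32(len(payload))|0x80000000)
	return append(out, payload...)
}

// readFrame returns io.EOF only when the stream ends exactly on a frame
// boundary; a cut inside the prefix or payload is io.ErrUnexpectedEOF.
func readFrame(r io.Reader) ([]byte, error) {
	var hdr [4]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return nil, err
	}
	n := binary.BigEndian.Uint32(hdr[:]) &^ 0x80000000
	payload := make([]byte, n)
	if _, err := io.ReadFull(r, payload); err != nil {
		if errors.Is(err, io.EOF) {
			err = io.ErrUnexpectedEOF // prefix present, payload missing
		}
		return nil, err
	}
	return payload, nil
}

// readChunk reads up to want frames. A short chunk is accepted only on a
// clean EOF and only when allowPartial is set.
func readChunk(r io.Reader, want int, allowPartial bool) ([][]byte, error) {
	var frames [][]byte
	for len(frames) < want {
		f, err := readFrame(r)
		if err == io.EOF {
			if allowPartial && len(frames) > 0 {
				return frames, nil
			}
			return nil, fmt.Errorf("short chunk: got %d of %d frames", len(frames), want)
		}
		if err != nil {
			return nil, fmt.Errorf("corrupt input after %d frames: %w", len(frames), err)
		}
		frames = append(frames, f)
	}
	return frames, nil
}

// classify is a demo helper that names the outcome for a byte buffer.
func classify(data []byte, want int) string {
	frames, err := readChunk(bytes.NewReader(data), want, true)
	switch {
	case errors.Is(err, io.ErrUnexpectedEOF):
		return "truncated"
	case err != nil:
		return "error"
	case len(frames) < want:
		return "partial"
	default:
		return "full"
	}
}

func main() {
	data := append(encodeFrame([]byte("ledger-1")), encodeFrame([]byte("ledger-2"))...)
	fmt.Println(classify(data, 2))              // all requested frames present
	fmt.Println(classify(data, 3))              // clean EOF before the third frame
	fmt.Println(classify(data[:len(data)-1], 3)) // last payload cut short
}
```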

… speed

APPLY_LOAD_LEDGER_MAX_DEPENDENT_TX_CLUSTERS sets the number of parallel apply
threads — purely a generation-speed knob, it doesn't change the workload. Per
upstream guidance, default it to 8 (was per-profile 1/2) and cap there: stellar-
core's multi-threaded apply has known perf issues above 8 even on bigger boxes.
Promoted to a top-level CLUSTERS env knob; removed from the per-profile table.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f2e7fd326d

v1 := lcm.MustV1()
for i := range v1.TxProcessing {
if h, ok := stamp(v1.TxProcessing[i].FeeProcessing); ok {
v1.TxProcessing[i].Result.TransactionHash = h

P2: Pair envelopes after the LCM hash fixup

When --source=lcm ingests V1/V2 apply-load metadata, this only stamps each TxProcessing result with the envelope's real hash but leaves TxProcessing in apply-load order while the generalized tx set remains in its different envelope order (as documented at the top of this file). This is a separate affected path from the existing tx-page note: the xdr-view txhash bench finds the stamped hash in TxProcessing and then materializeViews fetches the envelope by the same apply index, so synthetic LCM packs can return a db.Transaction whose envelope belongs to a different transaction; the fixup needs to make the view materializers pair by hash or otherwise align the two orders.

…parallel

stellar-core binds its HTTP server (default port 11626); running multiple
apply-load generations concurrently failed the 2nd/3rd with "bind: address
already in use". apply-load doesn't need the HTTP endpoint, so default HTTP_PORT=0
(disabled), env-overridable. Lets all profiles generate in parallel — on a
32-vCPU box that cuts a 3-profile 20k run from ~99h sequential to ~the slowest
profile (~42h).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2d00293ff1

}
payload = raw
}
if !yield(p.applyFixup(payload, &fx), nil) {

P2: Rewrite LCM header sequences before storing synthetic chunks

When --source=lcm is used with apply-load output whose benchmark ledger numbers do not already match the target chunk range (the script maps chunk 1 to ledger 10002+), this yields the raw LedgerCloseMeta unchanged while the driver indexes it under the synthetic l.Seq. Later read paths derive response metadata from the raw header (ledgerInfoFromHeader for tx pages and events.LCMToPayloads* for event payloads), so getTransaction/getTransactions/getEvents can return ledger sequences from the apply-load file rather than the pack/index sequence that was queried. The stream should rewrite the LCM header sequence to the positional chunk sequence before yielding/storing it.

Make the synthetic-ledger generation reproducible on another machine. The
per-profile generator (apply-load-gen.sh) and the meta fixup were already
committed; this adds the orchestration + docs that were previously ad-hoc:

- synthetic-run.sh: loop profiles -> apply-load-gen.sh (generate) -> bench-suite.sh
  (read benches) -> optional GCS upload. Sequential by default; PARALLEL=1 opt-in.
  Auto-builds the bench binary if BENCH_BIN unset.
- bench-suite.sh: cold-* and hot-* read suite per profile (both decode modes,
  concurrency sweep); skips events for non-4-topic profiles (token).
- SYNTHETIC-LEDGERS.md: host prereqs (~buildtests core, RocksDB cgo, Go), the
  TPS/600ms model, run commands, outputs, and the RAM ceiling.

RAM is the real limit: dense apply-load accumulates in-memory soroban state
(~8.5 MB/ledger at 6000 SAC tx/ledger), so a full 10k-ledger 10k-TPS SAC chunk
needs ~96-128 GB; on a 61 GB box cap sac/token near ~6000 ledgers. Documented
with a per-box sizing table so the run can target a larger machine.
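
The sizing arithmetic can be checked with a small sketch. It models only the accumulated soroban state at the quoted ~8.5 MB/ledger (decimal units) and nothing else that stellar-core holds in memory, so the figures are lower bounds; `stateGB` and `maxLedgers` are hypothetical helper names.

```go
package main

import "fmt"

// stateKBPerLedger is the ~8.5 MB/ledger figure quoted above for
// 6000 SAC tx/ledger, expressed in KB to keep integer math exact.
const stateKBPerLedger = 8500

// stateGB estimates accumulated in-memory state for a run of this length.
func stateGB(ledgers int) float64 {
	return float64(ledgers) * stateKBPerLedger / 1e6
}

// maxLedgers is the state-only ceiling for a given RAM size; it ignores all
// other memory use, so real runs need to stay well below it.
func maxLedgers(ramGB int) int {
	return ramGB * 1_000_000 / stateKBPerLedger
}

func main() {
	fmt.Printf("10000 ledgers -> ~%.0f GB of state\n", stateGB(10000))
	fmt.Printf(" 6000 ledgers -> ~%.0f GB of state\n", stateGB(6000))
	fmt.Printf("61 GB box     -> state-only ceiling of %d ledgers\n", maxLedgers(61))
}
```

State alone comes to about 85 GB for a full 10k-ledger chunk, which is consistent with the ~96-128 GB recommendation once other memory use is included. On a 61 GB box the state-only ceiling is 7176 ledgers, so the ~6000 cap leaves roughly 10 GB of headroom.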

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3ea7cc40c7

gen_one(){
local P="$1"
log "generate $P (num_ledgers=$NUM_LEDGERS clusters=$CLUSTERS close_ms=$CLOSE_TIME_MS)"
PROFILE="$P" "$SCRIPT_DIR/apply-load-gen.sh" > "$OUT_ROOT/$P.gen.log" 2>&1

P2: Propagate generation failures before benchmarking

When apply-load-gen.sh fails for a profile (for example because stellar-core OOMs or no meta.xdr is produced), this command returns non-zero but the function immediately runs log, so gen_one itself succeeds; with set -uo pipefail the sequential loop continues and the parallel wait path can also miss failures. The orchestrator can then run benches or upload using missing/stale cold stores and still print DONE with exit 0 instead of stopping on the failed generation.

… terms

The events corpus hard-required exactly-4-topic events, so apply-load's
custom_token profile (whose transfer events carry 3 topics) produced zero terms
and couldn't run the events bench at all.

- EVENTS_TOPIC_COUNT env (default 4) sets the required topic count; extractors
  and the scan loop use it instead of a literal 4. sac/soroswap (4-topic)
  unchanged; token runs with EVENTS_TOPIC_COUNT=3.
- newCorpus: instead of erroring when the workload can't reach max(buckets),
  CAP the K-bucket sweep to the terms available (dedup), logging the cap. Lets
  low-diversity workloads run at the largest K they support.

Verified: token (3-topic, 1 contract) now builds the full 15-term universe
(contract + transfer symbol + 13 from-addresses) and runs the K=1..15 sweep;
cold-events 235ms@c=1 / 111 ops peak, hot-events 14ms / 1140 ops.
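
One possible reading of the capping rule is sketched below: deduplicate the terms, clamp every K above the number of unique terms down to that number, and drop repeated K values. `capBuckets` is a hypothetical name, and whether the real code clamps or drops oversized buckets is an assumption here.

```go
package main

import (
	"fmt"
	"sort"
)

// capBuckets clamps the K-bucket sweep to the number of unique terms.
// Buckets above the limit collapse to the limit; duplicates are removed.
func capBuckets(buckets []int, terms []string) []int {
	uniq := make(map[string]struct{}, len(terms))
	for _, t := range terms {
		uniq[t] = struct{}{}
	}
	limit := len(uniq)

	seen := make(map[int]bool, len(buckets))
	var out []int
	for _, k := range buckets {
		if k > limit {
			k = limit
		}
		if k > 0 && !seen[k] {
			seen[k] = true
			out = append(out, k)
		}
	}
	sort.Ints(out)
	return out
}

func main() {
	// Eight terms, seven of them unique.
	terms := []string{"a", "b", "a", "c", "d", "e", "f", "g"}
	fmt.Println(capBuckets([]int{2, 5, 10, 15}, terms))
}
```

With seven unique terms, a requested sweep of 2, 5, 10, 15 collapses to 2, 5, 7, so the workload still runs at the largest K it supports.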

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2fd1966903

# ---- COLD read benches (auto-discover chunk range) ----
for n in $LEDGER_NS; do
"$BENCH_BIN" cold-ledgers --cold-dir="$COLD/ledgers" --n="$n" --iters="$LEDGERS_ITERS" \
--query-concurrency="$QC" --out="$O" > "$O/cold-ledgers-n$n.log" 2>&1 || echo " cold-ledgers n=$n FAILED"

P2: Return failure when a benchmark command fails

When any bench invocation fails (for example a missing txhash.idx or a cold-txpage error), the || echo ... FAILED branch consumes the non-zero status, and this script has no failure accumulator or final non-zero exit. As a result, direct bench-suite.sh runs and callers such as synthetic-run.sh can report/upload an incomplete result set as successful; if continuing through the remaining benches is desired, record the failure and exit 1 at the end.
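
The "record the failure and exit 1 at the end" pattern is shown here in Go purely for illustration; the script under review is shell, and `step`, `runAll`, and `exitCode` are hypothetical names. The sketch assumes Go 1.20 or later for `errors.Join`.

```go
package main

import (
	"errors"
	"fmt"
)

// step is one bench invocation: a label plus the function that runs it.
type step struct {
	name string
	run  func() error
}

// errMissingIndex simulates a failing bench for the demo.
var errMissingIndex = errors.New("missing txhash.idx")

// runAll runs every step even after a failure, then reports all failures
// together. errors.Join returns nil when nothing failed.
func runAll(steps []step) error {
	var errs []error
	for _, s := range steps {
		if err := s.run(); err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", s.name, err))
		}
	}
	return errors.Join(errs...)
}

// exitCode maps the aggregate result to a process exit status.
func exitCode(err error) int {
	if err != nil {
		return 1
	}
	return 0
}

func main() {
	steps := []step{
		{"cold-ledgers", func() error { return nil }},
		{"cold-txhash", func() error { return errMissingIndex }},
		{"cold-events", func() error { return nil }},
	}
	err := runAll(steps)
	if err != nil {
		fmt.Println("failures:")
		fmt.Println(err)
	}
	// A real driver would pass this to os.Exit so callers see the failure.
	fmt.Println("exit code:", exitCode(err))
}
```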

Comment on lines +210 to +212
CHECKPOINT="$(grep -oE 'Published final checkpoint before benchmark: ledger [0-9]+' "$APPLY_LOG" \
| grep -oE '[0-9]+$' | tail -1 || true)"
CHECKPOINT="${CHECKPOINT:-0}"

P2 Badge Fail when the apply-load checkpoint cannot be found

If stellar-core does not emit this exact log message (or the wording changes), CHECKPOINT silently falls back to 0, so cold-ingest --source=lcm skips no pre-benchmark setup frames. The generated cold store then includes setup/account-creation ledgers before the dense benchmark ledgers, which corrupts the target TPS/density assumptions for the synthetic read benches; this should error unless the checkpoint was actually found or explicitly provided.
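
One way this could look, as a sketch only: the grep pattern is the one from the snippet above, while `find_checkpoint` and `resolve_checkpoint` are hypothetical helper names, and treating a pre-set `CHECKPOINT` as the explicit override is an assumption:

```shell
#!/usr/bin/env bash
# find_checkpoint LOGFILE: print the last checkpoint ledger found in the log,
# or fail loudly when the expected line is missing (no silent fallback to 0).
find_checkpoint() {
  local log="$1" ckpt
  ckpt="$(grep -oE 'Published final checkpoint before benchmark: ledger [0-9]+' "$log" \
    | grep -oE '[0-9]+$' | tail -1 || true)"
  if [ -z "$ckpt" ]; then
    echo "error: checkpoint line not found in $log; set CHECKPOINT explicitly" >&2
    return 1
  fi
  printf '%s\n' "$ckpt"
}

# resolve_checkpoint LOGFILE: an explicitly provided CHECKPOINT wins;
# otherwise the log must contain the checkpoint line.
resolve_checkpoint() {
  if [ -n "${CHECKPOINT:-}" ]; then
    printf '%s\n' "$CHECKPOINT"
  else
    find_checkpoint "$1"
  fi
}
```

The caller would then use something like `CHECKPOINT="$(resolve_checkpoint "$APPLY_LOG")" || exit 1`, so a changed log message stops the run instead of ingesting the setup ledgers.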

Simon Chow and others added 2 commits June 9, 2026 14:49
Results report for the three synthetic apply-load datasets (sac/token/soroswap,
10k SAC / 9k OZ / 2.5k Soroswap TPS @ 600ms blocks) run through the full read +
ingest suite on c6id.8xlarge. Datasets, configs, CSVs, and RESULTS.md are in GCS
at gs://rpc-full-history/synthetic-ledgers/2026-06-04-apply-load-20k/.

Covers query latency (both decode paths, p50/p99), peak throughput, per-stage
ingest, and a comparison to the pubnet chunk-5860 baseline (synthetic is ~5-9x
slower per query due to per-ledger density; ingest is item-bound). Implements
#762's acceptance criteria via --source=lcm (no SDK loadtest dependency).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…d results (#762)

Self-contained, offline-capable HTML explorer (same tool as #758) for the three
synthetic datasets. The "profile" dimension (sac/token/soroswap) maps onto the
explorer's per-subdir axis; UI labels relabeled machine->Profile.

- adds make_explorer.py (from #758) + the generated
  2026-06-09-synthetic-apply-load-explorer.html (all CSVs embedded; no deps).
- fixes the cold-throughput calc: divide chunk_wall by the actual chunk-workers
  (= chunk count for these runs; override via COLD_CHUNK_WORKERS) instead of a
  hardcoded 8, so cold ledgers/s reads correctly (sac 20, token 38, soroswap 108).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ce8d9c272d

HOT="${HOT:-1}" # 1 = also build a hot store per profile and run hot-* benches
# Profiles whose events are NOT 4-topic (skip cold/hot-events). apply-load's
# custom_token emits non-4-topic events; sac/soroswap are fine.
NO_EVENTS="${NO_EVENTS:-token}"

P2 Badge Skip events for the oz profile alias

When PROFILES uses the supported oz alias for custom_token (accepted by apply-load-gen.sh), this default does not skip event benches even though the same non-4-topic event shape applies. The suite then runs cold-events/hot-events for oz and hits the corpus error path instead of producing a clean read-bench run; include oz in the default skip list or normalize the profile name before this check.
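
A sketch of the "normalize the profile name before this check" option. `normalize_profile` and `skips_events` are hypothetical helpers; the alias set (`oz`, `custom_token`, `token`) and the space-separated format of `NO_EVENTS` are assumptions based on the names used in this thread:

```shell
#!/usr/bin/env bash
# normalize_profile NAME: map known aliases onto one canonical profile name.
normalize_profile() {
  case "$1" in
    oz|custom_token|token) echo token ;;
    *) echo "$1" ;;
  esac
}

# Same default as in the script under review.
NO_EVENTS="${NO_EVENTS:-token}"

# skips_events NAME: succeed if the normalized profile is listed in NO_EVENTS.
skips_events() {
  local p
  p="$(normalize_profile "$1")"
  case " $NO_EVENTS " in
    *" $p "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

With this shape, `skips_events oz` and `skips_events token` take the same branch, so the default skip list does not need to enumerate every alias.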

…er + report (#762)

Include the real pubnet (non-synthetic) chunk-5860 baseline alongside the three
synthetic datasets, on the same c6id.8xlarge, so the comparison is interactive
and in the tables — not just prose.

- explorer: replace the synthetic-only HTML with
  2026-06-09-synthetic-vs-pubnet-explorer.html — 4 datasets (pubnet + sac/token/
  soroswap), 200 query rows. pubnet contributes its query sweeps (the headline
  comparison); the ingest tab stays synthetic-only (pubnet used 8 chunk-workers
  vs synthetic 1-2, so a single throughput divisor would misreport it — its
  ingest lives in results/2026-06-03-cross-machine.md). Dimension relabeled
  Profile -> Dataset.
- report: add a pubnet column to the query-latency and throughput tables, and
  link the explorer.

Pubnet query sweeps pulled from gs://.../benchmarks/2026-06-03/c6id.8xlarge-...
-corrected (same harness/CSV layout as the synthetic run).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 23bee2f951

Comment on lines +48 to +49
"$BENCH_BIN" cold-ledgers --cold-dir="$COLD/ledgers" --n="$n" --iters="$LEDGERS_ITERS" \
--query-concurrency="$QC" --out="$O" > "$O/cold-ledgers-n$n.log" 2>&1 || echo " cold-ledgers n=$n FAILED"

P2 Badge Preserve each ledger-size sweep CSV

With the default LEDGER_NS="1 10 20", every iteration writes to the same --out=$O directory while cold-ledgers uses fixed output names (cold-ledgers.csv/cold-ledgers-sweep.csv via createCSV, which truncates with os.Create). The per-n logs are distinct, but the CSV data for n=1 and n=10 is overwritten by the later runs, so the result directory and explorer only retain the last ledger-size sweep; the hot-ledgers loop below has the same pattern.
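
One possible fix, sketched under assumptions: give each `n` its own output directory so the fixed CSV names cannot collide. `fake_cold_ledgers` is a hypothetical stand-in that mimics the truncating write described above, and the `cold-ledgers-n$n` directory naming is illustrative; the explorer would need to read this layout.

```shell
#!/usr/bin/env bash
# Sketch: one output directory per n, so fixed CSV names cannot overwrite each other.
O="$(mktemp -d)"
LEDGER_NS="1 10 20"

# fake_cold_ledgers N OUTDIR: hypothetical stand-in for `cold-ledgers`. Like the
# real bench, it truncates and rewrites a fixed file name inside OUTDIR.
fake_cold_ledgers() {
  printf 'n\n%s\n' "$1" > "$2/cold-ledgers.csv"
}

for n in $LEDGER_NS; do
  out="$O/cold-ledgers-n$n"   # per-n subdirectory instead of the shared $O
  mkdir -p "$out"
  fake_cold_ledgers "$n" "$out" > "$out/run.log" 2>&1
done
```

The same pattern would apply to the hot-ledgers loop. An alternative is to keep `--out="$O"` and rename the CSVs after each run, at the cost of depending on the bench's output file names.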

@@ -0,0 +1,168 @@
package main

Contributor

This is no longer necessary now that stellar/stellar-core#5319 has been merged. Even without that fix in stellar-core, the better workaround is to configure the network passphrase in the benchmark commands to be "Apply Load".

const (
sourcePack = "pack"
sourceBSB = "bsb"
sourceLCM = "lcm"

Contributor

Instead of introducing a new source, I think it would be better to have a post-processing step after apply-load that takes the generated ledgers and converts them into the pack format. Then we can run the benchmarks with the source set to pack, pointing at the synthetic ledgers packfile.

…t p99 tail (#762)

results/rocksdb-config.md — per-CF key knobs (events/ledgers/txhash) extracted
from the OPTIONS files RocksDB wrote, plus the full verbatim events-CF OPTIONS.

Reveals the p99 ingest-tail cause: events & ledgers CFs run on RocksDB defaults
(auto-compaction on, max_background_jobs=2, L0 slowdown@20/stop@36), while txhash
is tuned write-once (disable_auto_compactions, L0 triggers 999, 8 bg jobs). The
events CF's default L0 throttling under dense writes is what produces the ~8x
p99/p50 on events hot_write. Linked from the synthetic report.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>