restore: increase MTU from 64K to 256K #10173
Use tsorig field for frag sizes instead of sz field
Pull request overview
This PR updates the snapshot restore pipeline to support a larger per-fragment MTU (64 KiB → 256 KiB) by carrying fragment sizes in the tsorig metadata field (since fd_frag_meta_t.sz is only a ushort). It also updates the default/dev topologies to provision the larger MTU on the relevant snapshot links.
Changes:
- Switch snapshot fragment size propagation from `sz` to `tsorig` across the snap* restore tiles (publish + consume paths).
- Update snapshot link MTUs to `1UL<<18` (256 KiB) and reduce link depths to `4096` in multiple topology builders.
- Adjust dcache chunk advancement to use the actual fragment size being forwarded (where applicable).
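
To illustrate why the size has to move out of `sz`, here is a minimal sketch, assuming a simplified stand-in for the fragment metadata. The names `demo_frag_meta_t`, `demo_publish`, `demo_frag_size`, and `DEMO_MTU` are hypothetical and exist only for this example; the real `fd_frag_meta_t` has more fields and a different layout.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for the fragment metadata.
   Only the two fields relevant to this PR are modeled. */
typedef struct {
  unsigned short sz;     /* 16-bit size field, max 65535 */
  unsigned int   tsorig; /* 32-bit field, repurposed to carry the size */
} demo_frag_meta_t;

#define DEMO_MTU (1UL<<18) /* 256 KiB, the new per-fragment MTU */

/* Producer side: leave sz at 0 and store the byte count in tsorig. */
static demo_frag_meta_t
demo_publish( unsigned long frag_sz ) {
  demo_frag_meta_t meta;
  meta.sz     = 0;
  meta.tsorig = (unsigned int)frag_sz;
  return meta;
}

/* Consumer side: read the byte count back from tsorig, ignoring sz. */
static unsigned long
demo_frag_size( demo_frag_meta_t const * meta ) {
  return (unsigned long)meta->tsorig;
}
```

A full 256 KiB fragment cast to a 16-bit value wraps to 0 (262144 mod 65536), so a consumer reading `sz` would see an empty fragment, whereas `demo_frag_size` returns 262144 for a fragment produced by `demo_publish( DEMO_MTU )`.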
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/discof/restore/fd_snapwr_tile.c | Consume snapshot data frags using tsorig as the fragment size. |
| src/discof/restore/fd_snapld_tile.c | Publish snapshot META/DATA/control frags with size carried in tsorig (and sz unused). |
| src/discof/restore/fd_snapin_tile.c | Consume snapshot data frags using tsorig as the fragment size. |
| src/discof/restore/fd_snapdc_tile.c | Forward/decompress snapshot stream while treating tsorig as fragment size. |
| src/discof/restore/fd_snapct_tile.c | Treat incoming snapld stream fragment sizes as tsorig when processing. |
| src/app/firedancer/topology.c | Increase snapshot link MTU to 256 KiB and reduce depths for snapld_dc/snapdc_in. |
| src/app/firedancer-dev/commands/snapshot_load.c | Match snapshot-load dev topology MTU/depth updates for snapshot links. |
| src/app/firedancer-dev/commands/forktest/forktest.c | Match forktest topology MTU/depth updates for snapshot links. |
| src/app/firedancer-dev/commands/backtest.c | Match backtest topology MTU/depth updates for snapshot links. |
      }
    } else {
-     fd_stem_publish( stem, 0UL, FD_SNAPSHOT_MSG_DATA, ctx->out_dc.chunk, (ulong)result, 0UL, 0UL, 0UL );
+     fd_stem_publish( stem, 0UL, FD_SNAPSHOT_MSG_DATA, ctx->out_dc.chunk, 0UL, 0UL, (ulong)result, 0UL );
@@ -501,8 +501,8 @@ fd_topo_initialize( config_t * config ) {
  if( FD_LIKELY( snapshots_enabled ) ) {
    /* TODO: Revisit the depths of all the snapshot links */
Greptile Summary
This PR increases the maximum fragment size (MTU) for snapshot restore links (
Confidence Score: 5/5
This PR is safe to merge: all producers and consumers in the snapshot pipeline are consistently updated.
The change is mechanically straightforward: move the size from the 16-bit sz mcache field to the 32-bit tsorig field across all snapshot tiles.
- All four topology files are updated identically.
- All five tile files are consistently updated on both producer and consumer sides.
- The tsorig field (uint, maximum value about 4.29 billion) easily accommodates the new 256K MTU.
- Depth reduction from 16384 to 4096 maintains the same total dcache footprint.
- No during_frag callbacks exist in these tiles, so no additional consumers were missed.
- Verified that both snapin and snapwr consume from snapdc_in, and both are updated.
No files require special attention; all changes are consistent and correct.
Important Files Changed
Sequence Diagram
sequenceDiagram
participant snapct as snapct_tile
participant snapld as snapld_tile
participant snapdc as snapdc_tile
participant snapin as snapin_tile
participant snapwr as snapwr_tile
snapct->>snapld: "snapct_ld: init msg via tsorig"
snapld->>snapdc: "snapld_dc: data/ctrl via tsorig (MTU=256K)"
snapld->>snapdc: "snapld_dc: meta msg via tsorig"
snapdc->>snapin: "snapdc_in: decompressed data via tsorig (MTU=256K)"
snapdc->>snapwr: "snapdc_in: decompressed data via tsorig (MTU=256K)"
Note over snapct,snapwr: sz field always 0. Actual size carried in tsorig (uint 32-bit)
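
The review's statement that reducing depth from 16384 to 4096 keeps the total dcache footprint unchanged can be checked with a rough calculation. The sketch below assumes the footprint is approximated as depth times MTU; `demo_link_footprint` is a hypothetical helper, and the real dcache sizing may add per-link overhead that is not modeled here.

```c
#include <assert.h>

/* Hypothetical helper: approximate the data footprint of a link as
   depth * mtu bytes. Any extra slack or alignment the real dcache
   sizing applies is deliberately ignored in this sketch. */
static unsigned long
demo_link_footprint( unsigned long depth, unsigned long mtu ) {
  return depth * mtu;
}
```

Under that assumption, `demo_link_footprint( 16384UL, 1UL<<16 )` and `demo_link_footprint( 4096UL, 1UL<<18 )` both evaluate to `1UL<<30` (1 GiB): quadrupling the MTU while quartering the depth leaves the product the same.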
Reviews (1): Last reviewed commit: "restore: increase MTU from 64K to 256K"