gui: add snapwr to boot_progress, fix live_network_metrics docs #10210
jherrera-jump wants to merge 1 commit into
Pull request overview
This PR extends Firedancer’s GUI boot progress reporting to include snapwr (snapshot write) stage metrics, and updates API/docs to reflect the expanded boot progress payload and live network metrics layout.
Changes:
- Add `snapwr_accounts_written` gauge and track it in the snapwr tile.
- Include snapwr progress (in-bytes, out-bytes, accounts) in `boot_progress` summary output and websocket docs.
- Update `live_network_metrics` documentation to include the `rserve` protocol slot.
Reviewed changes
Copilot reviewed 7 out of 9 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/discof/restore/fd_snapwr_tile.c | Track accounts_written and preserve full-snapshot baseline across incremental/retries. |
| src/disco/metrics/metrics.xml | Define the new Snapwr AccountsWritten gauge. |
| src/disco/metrics/generated/fd_metrics_snapwr.h | Generated metadata for new Snapwr gauge. |
| src/disco/metrics/generated/fd_metrics_snapwr.c | Generated metric table includes new Snapwr gauge. |
| src/disco/gui/fd_gui.h | Extend boot progress snapshot struct with snapwr fields. |
| src/disco/gui/fd_gui.c | Sample snapwr metrics and compute per-snapshot (full vs incremental) deltas. |
| src/disco/gui/fd_gui_printf.c | Emit new boot progress JSON fields for snapwr. |
| book/api/websocket.md | Document new boot progress fields and fix live network metrics examples/tiles list. |
| book/api/metrics-generated.md | Generated metrics docs include new Snapwr gauge. |
<tile name="snapwr">
    <gauge name="FullBytesRead" summary="Number of decompressed snapshot bytes consumed from the full snapshot. Might decrease if snapshot load is aborted and restarted" />
    <gauge name="IncrementalBytesRead" summary="Number of decompressed snapshot bytes consumed from the incremental snapshot. Might decrease if snapshot load is aborted and restarted" />
    <gauge name="BytesWritten" summary="Number of bytes written to the accounts database on disk. Monotonically increasing across snapshot loads." />
    <gauge name="AccountsWritten" summary="Number of accounts written to the accounts database on disk. Might decrease if snapshot load is aborted and restarted" />
ulong snapwr_in_bytes_decompressed;
ulong snapwr_out_bytes_decompressed;
ulong snapwr_accounts_current;
jsonp_ulong_as_str( gui->http, "loading_" FD_STRINGIFY(snapshot_type) "_snapshot_snapwr_in_bytes_decompressed", gui->summary.boot_progress.loading_snapshot[ snapshot_idx ].snapwr_in_bytes_decompressed ); \
jsonp_ulong_as_str( gui->http, "loading_" FD_STRINGIFY(snapshot_type) "_snapshot_snapwr_out_bytes_decompressed", gui->summary.boot_progress.loading_snapshot[ snapshot_idx ].snapwr_out_bytes_decompressed ); \
jsonp_ulong ( gui->http, "loading_" FD_STRINGIFY(snapshot_type) "_snapshot_snapwr_accounts", gui->summary.boot_progress.loading_snapshot[ snapshot_idx ].snapwr_accounts_current ); \
| loading_{full\|incremental}_snapshot_insert_accounts | `number\|null` | If the phase is at least `loading_{full\|incremental}_snapshot`, this is the current number of accounts inserted into the validator's accounts database from this snapshot. Otherwise, `null` |
| loading_{full\|incremental}_snapshot_snapwr_in_bytes_decompressed | `number\|null` | If the phase is at least `loading_{full\|incremental}_snapshot`, this is the (decompressed) number of bytes consumed from the snapshot by the snapshot write (snapwr) stage so far. Otherwise, `null` |
| loading_{full\|incremental}_snapshot_snapwr_out_bytes_decompressed | `number\|null` | If the phase is at least `loading_{full\|incremental}_snapshot`, this is the number of bytes written to the on-disk account database by the snapshot write (snapwr) stage for this snapshot so far. Otherwise, `null` |
| loading_{full\|incremental}_snapshot_snapwr_accounts | `number\|null` | If the phase is at least `loading_{full\|incremental}_snapshot`, this is the current number of accounts written to the on-disk account database by the snapshot write (snapwr) stage for this snapshot so far. Otherwise, `null` |
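The "number if the phase has been reached, otherwise `null`" rule described in these rows can be sketched as follows. This is only an illustration: `boot_phase_t`, its ordering, and `format_field` are hypothetical names, and the sketch does not use the real `jsonp_*` helpers from fd_gui_printf.c.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical phase ordering, assumed for illustration only. */
typedef enum {
  PHASE_JOINING_GOSSIP = 0,
  PHASE_LOADING_FULL_SNAPSHOT,
  PHASE_LOADING_INCREMENTAL_SNAPSHOT
} boot_phase_t;

/* Writes `"key": <value>` when the current phase has reached min_phase,
   and `"key": null` otherwise.  Returns what snprintf returns, i.e. the
   number of characters that the full output requires. */
static int
format_field( char *        buf,
              size_t        buf_sz,
              char const *  key,
              boot_phase_t  cur,
              boot_phase_t  min_phase,
              unsigned long value ) {
  assert( buf && key && buf_sz );
  if( cur>=min_phase ) return snprintf( buf, buf_sz, "\"%s\": %lu", key, value );
  return snprintf( buf, buf_sz, "\"%s\": null", key );
}
```

A client consuming these fields should therefore treat `null` as "phase not reached yet" rather than as zero.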
Greptile Summary
This PR adds snapshot write (snapwr) stage metrics to the GUI boot progress display and updates the websocket API documentation.
Confidence Score: 4/5
This PR is safe to merge; it adds observability metrics and documentation with correct baseline subtraction logic.
The new snapwr metrics follow established patterns from neighboring snapshot tiles. The baseline subtraction logic is correct and consistent with the existing cumulative metric handling. The only minor concern is that the websocket docs imply EMA arrays have the same element count as raw ingress/egress arrays, but this is a pre-existing issue.
book/api/websocket.md has a minor documentation accuracy issue with EMA array element counts.
Sequence Diagram
sequenceDiagram
participant snapct as snapct tile
participant snapdc as snapdc tile
participant snapin as snapin tile
participant snapwr as snapwr tile
participant gui as GUI (fd_gui)
participant ws as WebSocket Client
Note over snapwr: INIT_FULL: reset accounts_written=0
snapwr->>snapwr: Process full snapshot accounts
snapwr->>snapwr: Increment accounts_written per account header
Note over snapwr: NEXT: backup full_accounts_written = accounts_written
Note over snapwr: INIT_INCR: accounts_written = full_accounts_written
snapwr->>snapwr: Process incremental snapshot accounts
snapwr->>snapwr: Increment accounts_written per account header
loop Every 100ms sampling
gui->>snapct: Read FULL_SIZE_BYTES, FULL_BYTES_READ, etc.
gui->>snapdc: Read FULL_DECOMPRESSED_BYTES_WRITTEN, etc.
gui->>snapin: Read ACCOUNT_LOADED (cumulative)
gui->>snapwr: Read FULL_BYTES_READ, ACCOUNTS_WRITTEN, BYTES_WRITTEN
gui->>gui: Compute per-snapshot values via baseline subtraction
gui->>ws: Broadcast boot_progress JSON (incl. snapwr fields)
end
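The counter handling in the diagram can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the code in fd_snapwr_tile.c or fd_gui.c: the struct and function names (`snapwr_sketch_t`, `on_init_full`, `on_next`, `on_init_incr`, `on_account_header`, `per_snapshot_accounts`) are hypothetical, and only the reset, backup, and restore steps named in the diagram are modeled.

```c
#include <assert.h>

/* Hypothetical stand-in for the snapwr tile's counter state. */
typedef struct {
  unsigned long accounts_written;      /* cumulative value exported as the gauge */
  unsigned long full_accounts_written; /* baseline saved when the full snapshot finishes */
} snapwr_sketch_t;

/* INIT_FULL: starting (or restarting) a full snapshot resets the counter. */
static void on_init_full( snapwr_sketch_t * s ) { s->accounts_written = 0UL; }

/* Each account header processed counts as one account written. */
static void on_account_header( snapwr_sketch_t * s ) { s->accounts_written++; }

/* NEXT: remember how many accounts the full snapshot contributed. */
static void on_next( snapwr_sketch_t * s ) { s->full_accounts_written = s->accounts_written; }

/* INIT_INCR: starting (or retrying) an incremental snapshot rolls the
   counter back to the full-snapshot baseline. */
static void on_init_incr( snapwr_sketch_t * s ) { s->accounts_written = s->full_accounts_written; }

/* GUI side: per-snapshot value = cumulative gauge minus baseline. */
static unsigned long
per_snapshot_accounts( unsigned long cumulative, unsigned long baseline ) {
  assert( cumulative>=baseline );
  return cumulative - baseline;
}
```

Because `on_init_incr` restores the saved baseline instead of zeroing the counter, a retried incremental load makes the gauge decrease without losing the full-snapshot total, which matches the "Might decrease if snapshot load is aborted and restarted" wording in the gauge summary.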
"ingress_ema": [1234543.00, 543123.00, 9234.00, 4321.00, 876.00, ...],
"egress_ema": [1234543.00, 543123.00, 9234.00, 4321.00, 876.00, ...],
EMA arrays shorter than documented
The `ingress_ema` and `egress_ema` arrays will have 5 elements (turbine, gossip, tpu, repair, metric) because `FD_GUI_NET_PROTO_CNT` is 5 and `rserve` is excluded from the EMA rate calculation in `fd_gui_network_rate_max_update` (fd_gui.c:643-655). Meanwhile, the `ingress` and `egress` arrays output 6 elements (including `rserve`, at fd_gui_printf.c:726/734).
Showing 5 example values followed by `...` for all four arrays suggests that they all have the same length, which would confuse consumers who discover the mismatch at runtime.
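To illustrate why the mismatch matters to a consumer, here is a small sketch that maps an array index to a protocol label while respecting each array's own length. The macro names, the label table, and `proto_label` are hypothetical, and placing `rserve` last is an assumption based on the element lists quoted above.

```c
#include <assert.h>
#include <stddef.h>

#define EMA_PROTO_CNT 5UL /* per the review: FD_GUI_NET_PROTO_CNT, no rserve */
#define RAW_PROTO_CNT 6UL /* per the review: raw arrays include rserve */

/* Assumed slot order; rserve is placed last for illustration. */
static char const * const proto_labels[ RAW_PROTO_CNT ] = {
  "turbine", "gossip", "tpu", "repair", "metric", "rserve"
};

/* Returns the label for slot idx of an array holding array_len elements,
   or NULL when the array has no such slot. */
static char const *
proto_label( size_t idx, size_t array_len ) {
  assert( array_len<=RAW_PROTO_CNT );
  return idx<array_len ? proto_labels[ idx ] : NULL;
}
```

A consumer that sized its loop from the raw `ingress` array and reused that bound for `ingress_ema` would read one slot past the end, so the bound has to come from each array individually.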