gui: add snapwr to boot_progress, fix live_network_metrics docs#10210

Open
jherrera-jump wants to merge 1 commit into
firedancer-io:mainfrom
jherrera-jump:jherrera/gui-snapwr-support

Conversation

@jherrera-jump

@jherrera-jump jherrera-jump commented Jun 12, 2026

Contributor

No description provided.

@jherrera-jump jherrera-jump changed the title gui: add snapwr to summary.boot_progress, fix summary.live_network_me… gui: add snapwr to summary.boot_progress, fix live_network_metrics docs Jun 12, 2026
@jherrera-jump jherrera-jump changed the title gui: add snapwr to summary.boot_progress, fix live_network_metrics docs gui: add snapwr to boot_progress, fix live_network_metrics docs Jun 12, 2026
@jherrera-jump jherrera-jump force-pushed the jherrera/gui-snapwr-support branch from 9eaddda to 4e4a53f Compare June 12, 2026 16:27
@jherrera-jump jherrera-jump force-pushed the jherrera/gui-snapwr-support branch from 4e4a53f to 80b7701 Compare June 13, 2026 19:48
@jherrera-jump jherrera-jump marked this pull request as ready for review June 13, 2026 19:48

Copilot AI left a comment

Contributor

Pull request overview

This PR extends Firedancer’s GUI boot progress reporting to include snapwr (snapshot write) stage metrics, and updates API/docs to reflect the expanded boot progress payload and live network metrics layout.

Changes:

  • Add snapwr_accounts_written gauge and track it in the snapwr tile.
  • Include snapwr progress (in-bytes, out-bytes, accounts) in boot_progress summary output and websocket docs.
  • Update live_network_metrics documentation to include the rserve protocol slot.

Reviewed changes

Copilot reviewed 7 out of 9 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/discof/restore/fd_snapwr_tile.c | Track accounts_written and preserve full-snapshot baseline across incremental/retries. |
| src/disco/metrics/metrics.xml | Define the new Snapwr AccountsWritten gauge. |
| src/disco/metrics/generated/fd_metrics_snapwr.h | Generated metadata for new Snapwr gauge. |
| src/disco/metrics/generated/fd_metrics_snapwr.c | Generated metric table includes new Snapwr gauge. |
| src/disco/gui/fd_gui.h | Extend boot progress snapshot struct with snapwr fields. |
| src/disco/gui/fd_gui.c | Sample snapwr metrics and compute per-snapshot (full vs incremental) deltas. |
| src/disco/gui/fd_gui_printf.c | Emit new boot progress JSON fields for snapwr. |
| book/api/websocket.md | Document new boot progress fields and fix live network metrics examples/tiles list. |
| book/api/metrics-generated.md | Generated metrics docs include new Snapwr gauge. |

Comment on lines 1622 to +1626
<tile name="snapwr">
<gauge name="FullBytesRead" summary="Number of decompressed snapshot bytes consumed from the full snapshot. Might decrease if snapshot load is aborted and restarted" />
<gauge name="IncrementalBytesRead" summary="Number of decompressed snapshot bytes consumed from the incremental snapshot. Might decrease if snapshot load is aborted and restarted" />
<gauge name="BytesWritten" summary="Number of bytes written to the accounts database on disk. Monotonically increasing across snapshot loads." />
<gauge name="AccountsWritten" summary="Number of accounts written to the accounts database on disk. Might decrease if snapshot load is aborted and restarted" />
Comment thread src/disco/gui/fd_gui.h
Comment on lines +633 to +635
ulong snapwr_in_bytes_decompressed;
ulong snapwr_out_bytes_decompressed;
ulong snapwr_accounts_current;
Comment on lines +2531 to +2533
jsonp_ulong_as_str( gui->http, "loading_" FD_STRINGIFY(snapshot_type) "_snapshot_snapwr_in_bytes_decompressed", gui->summary.boot_progress.loading_snapshot[ snapshot_idx ].snapwr_in_bytes_decompressed ); \
jsonp_ulong_as_str( gui->http, "loading_" FD_STRINGIFY(snapshot_type) "_snapshot_snapwr_out_bytes_decompressed", gui->summary.boot_progress.loading_snapshot[ snapshot_idx ].snapwr_out_bytes_decompressed ); \
jsonp_ulong ( gui->http, "loading_" FD_STRINGIFY(snapshot_type) "_snapshot_snapwr_accounts", gui->summary.boot_progress.loading_snapshot[ snapshot_idx ].snapwr_accounts_current ); \
Comment thread book/api/websocket.md
Comment on lines +583 to +586
| loading_{full\|incremental}_snapshot_insert_accounts | `number\|null` | If the phase is at least `loading_{full\|incremental}_snapshot`, this is the current number of accounts inserted into the validator's accounts database from this snapshot. Otherwise, `null` |
| loading_{full\|incremental}_snapshot_snapwr_in_bytes_decompressed | `number\|null` | If the phase is at least `loading_{full\|incremental}_snapshot`, this is the (decompressed) number of bytes consumed from the snapshot by the snapshot write (snapwr) stage so far. Otherwise, `null` |
| loading_{full\|incremental}_snapshot_snapwr_out_bytes_decompressed | `number\|null` | If the phase is at least `loading_{full\|incremental}_snapshot`, this is the number of bytes written to the on-disk account database by the snapshot write (snapwr) stage for this snapshot so far. Otherwise, `null` |
| loading_{full\|incremental}_snapshot_snapwr_accounts | `number\|null` | If the phase is at least `loading_{full\|incremental}_snapshot`, this is the current number of accounts written to the on-disk account database by the snapshot write (snapwr) stage for this snapshot so far. Otherwise, `null` |
@greptile-jt

greptile-jt Bot commented Jun 13, 2026


Greptile Summary

This PR adds snapshot write (snapwr) stage metrics to the GUI boot progress display and updates the websocket API documentation.

  • Adds a new AccountsWritten gauge metric to the snapwr tile, with a full_accounts_written backup field to correctly split cumulative counts between full and incremental snapshot phases
  • Surfaces three new snapwr fields in the boot progress JSON output: snapwr_in_bytes_decompressed, snapwr_out_bytes_decompressed, and snapwr_accounts
  • Applies per-snapshot baseline subtraction to insert_accounts (fixing a pre-existing issue where the incremental snapshot showed cumulative counts instead of per-snapshot counts)
  • Documents the rserve (repair serve) network protocol in the websocket API docs and updates live_network_metrics examples to reflect the current 6-protocol set
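
The baseline subtraction described in the summary above can be sketched roughly as follows. This is a minimal illustration, not the code from fd_gui.c: the helper names `sat_sub_u64` and `incremental_accounts` are hypothetical, and the sketch assumes the cumulative counter and the full-snapshot baseline are both available as plain 64-bit values.

```c
#include <stdint.h>

/* Saturating subtraction: returns a-b, clamped to 0 when b>a.
   Clamping matters because a gauge may decrease if a snapshot load
   is aborted and restarted, which could otherwise underflow. */
static inline uint64_t
sat_sub_u64( uint64_t a, uint64_t b ) {
  return a>b ? a-b : (uint64_t)0;
}

/* Hypothetical helper (name not from the PR): given a cumulative
   account counter and the value it held when the full snapshot
   finished, return the portion attributable to the incremental
   snapshot. */
static inline uint64_t
incremental_accounts( uint64_t cumulative, uint64_t full_baseline ) {
  return sat_sub_u64( cumulative, full_baseline );
}
```

With a cumulative count of 150 and a full-snapshot baseline of 100, the incremental snapshot would be reported as 50 accounts; if the counter has been reset below the baseline, the result is 0 rather than a wrapped value.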

Confidence Score: 4/5

This PR is safe to merge; it adds observability metrics and documentation with correct baseline subtraction logic.

The new snapwr metrics follow established patterns from neighboring snapshot tiles. The baseline subtraction logic is correct and consistent with the existing cumulative metric handling. The only minor concern is that the websocket docs imply EMA arrays have the same element count as raw ingress/egress arrays, but this is a pre-existing issue.

book/api/websocket.md has a minor documentation accuracy issue with EMA array element counts.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/disco/gui/fd_gui.c | Adds snapwr tile metrics lookup and per-snapshot baseline subtraction for snapwr and insert accounts. Logic correctly mirrors the existing snapin/snapdc patterns with saturating subtraction. |
| src/disco/gui/fd_gui.h | Adds three new fields (snapwr_in_bytes_decompressed, snapwr_out_bytes_decompressed, snapwr_accounts_current) to the boot_progress loading_snapshot struct. Clean, consistent with existing members. |
| src/disco/gui/fd_gui_printf.c | Adds JSON output fields for the new snapwr boot progress metrics in both the active and null branches of the HANDLE_SNAPSHOT_STATE macro. Consistent with existing output patterns. |
| src/discof/restore/fd_snapwr_tile.c | Adds accounts_written metric with full_accounts_written backup mechanism for full/incremental snapshot transitions. Correctly increments on account header processing and manages resets on INIT/NEXT/FAIL control messages. |
| src/disco/metrics/metrics.xml | Adds AccountsWritten gauge metric to the snapwr tile definition. |
| src/disco/metrics/generated/fd_metrics_snapwr.h | Generated file updated to include ACCOUNTS_WRITTEN offset and metadata. Total metric count correctly updated to 4. |
| src/disco/metrics/generated/fd_metrics_snapwr.c | Generated file updated to declare the new ACCOUNTS_WRITTEN gauge metric. |
| book/api/websocket.md | Documents new snapwr boot progress fields and rserve network protocol. The EMA array examples suggest 6 elements but code only outputs 5 (rserve missing from EMA). |
| book/api/metrics-generated.md | Auto-generated metrics documentation updated to include the new snapwr AccountsWritten metric. |

Sequence Diagram

sequenceDiagram
    participant snapct as snapct tile
    participant snapdc as snapdc tile
    participant snapin as snapin tile
    participant snapwr as snapwr tile
    participant gui as GUI (fd_gui)
    participant ws as WebSocket Client

    Note over snapwr: INIT_FULL: reset accounts_written=0
    snapwr->>snapwr: Process full snapshot accounts
    snapwr->>snapwr: Increment accounts_written per account header
    Note over snapwr: NEXT: backup full_accounts_written = accounts_written

    Note over snapwr: INIT_INCR: accounts_written = full_accounts_written
    snapwr->>snapwr: Process incremental snapshot accounts
    snapwr->>snapwr: Increment accounts_written per account header

    loop Every 100ms sampling
        gui->>snapct: Read FULL_SIZE_BYTES, FULL_BYTES_READ, etc.
        gui->>snapdc: Read FULL_DECOMPRESSED_BYTES_WRITTEN, etc.
        gui->>snapin: Read ACCOUNT_LOADED (cumulative)
        gui->>snapwr: Read FULL_BYTES_READ, ACCOUNTS_WRITTEN, BYTES_WRITTEN
        gui->>gui: Compute per-snapshot values via baseline subtraction
        gui->>ws: Broadcast boot_progress JSON (incl. snapwr fields)
    end

Reviews (1): Last reviewed commit: "gui: add snapwr to summary.boot_progress..." | Re-trigger Greptile
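
The counter handling noted in the sequence diagram (reset on INIT_FULL, back up on NEXT, restore on INIT_INCR) might look something like the sketch below. The enum, struct, and function names are assumptions made for illustration and are not taken from fd_snapwr_tile.c; FAIL handling is omitted because the diagram does not describe it.

```c
#include <stdint.h>

/* Hypothetical control message kinds, named after the diagram notes. */
enum ctrl_msg { CTRL_INIT_FULL, CTRL_NEXT, CTRL_INIT_INCR };

struct snapwr_counters {
  uint64_t accounts_written;      /* cumulative across full + incremental */
  uint64_t full_accounts_written; /* backup taken when the full snapshot ends */
};

/* Apply one control message to the counters. */
static void
on_ctrl( struct snapwr_counters * c, enum ctrl_msg msg ) {
  switch( msg ) {
  case CTRL_INIT_FULL: /* starting (or restarting) the full snapshot */
    c->accounts_written = 0;
    break;
  case CTRL_NEXT:      /* full snapshot done: remember its total */
    c->full_accounts_written = c->accounts_written;
    break;
  case CTRL_INIT_INCR: /* starting (or retrying) the incremental snapshot */
    c->accounts_written = c->full_accounts_written;
    break;
  }
}

/* Called once per account header processed. */
static void
on_account_header( struct snapwr_counters * c ) {
  c->accounts_written++;
}
```

Restoring from the backup on INIT_INCR means a retried incremental load starts again from the full-snapshot total instead of double counting, which is what lets a reader derive a per-snapshot value by subtracting the baseline.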

Comment thread book/api/websocket.md
Comment on lines +940 to +941
"ingress_ema": [1234543.00, 543123.00, 9234.00, 4321.00, 876.00, ...],
"egress_ema": [1234543.00, 543123.00, 9234.00, 4321.00, 876.00, ...],

P2 EMA arrays shorter than documented

The ingress_ema and egress_ema arrays will have 5 elements (turbine, gossip, tpu, repair, metric) because FD_GUI_NET_PROTO_CNT is 5 and rserve is excluded from the EMA rate calculation in fd_gui_network_rate_max_update (fd_gui.c:643-655). Meanwhile, the ingress and egress arrays output 6 elements (including rserve, at fd_gui_printf.c:726/734).

Showing 5 example values + ... for all four arrays suggests they all have the same length, which would be confusing for consumers who discover the mismatch at runtime.
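
One way a consumer could guard against the length mismatch described above is to carry each array's element count alongside it and bounds-check every lookup. The sketch below is illustrative only: `value_or` is a hypothetical helper, and the counts of 5 and 6 are taken from this review comment rather than verified against the source.

```c
#include <stddef.h>

/* Returns arr[idx] when idx is in range, otherwise the fallback.
   This lets a client use one protocol index for both the raw arrays
   (6 entries per the review) and the EMA arrays (5 entries) without
   reading past the end of the shorter one. */
static double
value_or( double const * arr, size_t cnt, size_t idx, double fallback ) {
  return idx<cnt ? arr[ idx ] : fallback;
}
```

A client iterating over six protocol slots would then receive the fallback for the slot the EMA arrays do not contain, instead of an out-of-bounds read.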



2 participants