Make EXPLAIN ANALYZE-like visualization less verbose #527

Merged
gabotechs merged 2 commits into datafusion-contrib:main from Tristan1900:tristan1900/less-verbose-explain on Jul 3, 2026

@Tristan1900 Tristan1900 commented Jul 2, 2026

Contributor

Closes #525.

Two tweaks to the plan display, aimed at the wide/long distributed plans the issue calls out.

Stage header — instead of one entry per task:

before:  ┌───── Stage 1 ── Tasks: t0:[p0..p15] t1:[p16..p31] … t79:[p1264..p1279]
after:   ┌───── Stage 1 ── tasks=80, partitions=1280
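A minimal sketch of how such a summary header could be produced, assuming the formatter is handed the number of partitions each task owns. The function name `format_stage_header` and its signature are hypothetical and are not taken from stage.rs:

```rust
/// Hypothetical helper (not the actual stage.rs formatter): summarizes a
/// stage header from the number of partitions owned by each task.
fn format_stage_header(stage_num: usize, partitions_per_task: &[usize]) -> String {
    // One slice entry per task, so the slice length is the task count.
    let tasks = partitions_per_task.len();
    // The partition total is the sum over all tasks.
    let partitions: usize = partitions_per_task.iter().sum();
    format!("┌───── Stage {stage_num} ── tasks={tasks}, partitions={partitions}")
}
```

With 80 tasks of 16 partitions each, `format_stage_header(1, &[16; 80])` yields the `after` line above. The header length no longer grows with the number of tasks.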

Per-task node metrics — instead of repeating the metric name for every task:

before:  output_rows_0=132, output_rows_1=216, output_rows_2=200, …
after:   output_rows={0:132, 1:216, 2:200, …}

Also updated the docs/examples that show plan output, regenerated the inline snapshots, and added unit tests for both formatters in stage.rs.

Stage headers now summarize as `tasks=N, partitions=M` instead of listing
every task's partition range, and per-task node metrics collapse into a
`name={task_id:value, ...}` map instead of repeating the metric name for
each task.

The map keeps task ids explicit, which matters when a node runs on only a
subset of tasks (e.g. under a ChildrenIsolatorUnionExec) and reports a
non-contiguous set of ids. Metrics with no task id (non-distributed plans)
keep the plain name=value form.

Regenerate the inline snapshots, update the docs/examples that show plan
output, and add unit tests for both formatters.
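The metric collapsing described in this commit message could look roughly like the sketch below. It assumes each sample carries an optional task id; the `Sample` type and `format_metrics` function are hypothetical, and the ordering (plain metrics first, then per-task maps sorted by name) is an arbitrary choice made for the example. It uses the curly-brace form from the PR description:

```rust
use std::collections::BTreeMap;

/// Hypothetical input: one metric sample, optionally tagged with the id of
/// the task that reported it (None for non-distributed plans).
struct Sample<'a> {
    name: &'a str,
    task_id: Option<usize>,
    value: u64,
}

/// Hypothetical formatter: groups per-task samples into `name={id:value, ...}`
/// and leaves samples without a task id in the plain `name=value` form.
fn format_metrics(samples: &[Sample<'_>]) -> String {
    // name -> (task_id -> value); BTreeMap keeps names and task ids sorted.
    let mut per_task: BTreeMap<&str, BTreeMap<usize, u64>> = BTreeMap::new();
    let mut out: Vec<String> = Vec::new();
    for s in samples {
        match s.task_id {
            // A repeated (name, task id) pair overwrites the earlier value.
            Some(id) => {
                per_task.entry(s.name).or_default().insert(id, s.value);
            }
            None => out.push(format!("{}={}", s.name, s.value)),
        }
    }
    for (name, by_task) in &per_task {
        // Task ids stay explicit, so a non-contiguous set such as {0, 3}
        // is rendered exactly as reported.
        let entries: Vec<String> = by_task
            .iter()
            .map(|(id, value)| format!("{id}:{value}"))
            .collect();
        out.push(format!("{name}={{{}}}", entries.join(", ")));
    }
    out.join(", ")
}
```

Keying the inner map by task id rather than by position is what keeps the output unambiguous when a node runs on only a subset of tasks.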
Comment thread docs/source/user-guide/how-a-distributed-plan-is-built.md Outdated
DistributedExec is the coordinator's single-task head, and the planner
coalesces its input to a single partition before wrapping it, so its
header was always `tasks=1, partitions=1` — pure noise. Render just
`DistributedExec` (plus its coordinator-side metrics when shown).
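As a rough illustration of this special case, the sketch below models the two kinds of header with a hypothetical `Head` enum and `render_head` function. Neither name comes from the codebase, and coordinator-side metrics are left out:

```rust
/// Hypothetical model of what heads a rendered plan section.
enum Head {
    /// The coordinator's single-task head node.
    Coordinator,
    /// A distributed stage with its task and partition totals.
    Stage { num: usize, tasks: usize, partitions: usize },
}

/// Hypothetical renderer: skips the counts for the coordinator head.
fn render_head(head: &Head) -> String {
    match head {
        // Always one task and one partition, so the counts add nothing.
        Head::Coordinator => "DistributedExec".to_string(),
        Head::Stage { num, tasks, partitions } => {
            format!("┌───── Stage {num} ── tasks={tasks}, partitions={partitions}")
        }
    }
}
```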
@Tristan1900 Tristan1900 marked this pull request as ready for review July 2, 2026 19:36
@datafusion-contrib datafusion-contrib deleted a comment from Tristan1900 Jul 3, 2026
@gabotechs

Collaborator

🤔 I'm a bit on the fence whether we should do:

output_rows={0:132, 1:216, 2:200, …}

or

output_rows=[0:132, 1:216, 2:200, …]

I don't have a strong opinion, but maybe the latter is actually easier on the eye, given that square brackets are also used in other places.

@gabotechs gabotechs left a comment

Collaborator

Let's keep the square brackets, it's easy to revisit later if needed

@gabotechs gabotechs merged commit 2fc982c into datafusion-contrib:main Jul 3, 2026
31 checks passed