Skip to content

perf(parser): fix O(n²) JSONL streaming that made Sync take >2 minutes#161

Open
san360 wants to merge 3 commits into
mainfrom
san360-sync-timing-telemetry
Open

perf(parser): fix O(n²) JSONL streaming that made Sync take >2 minutes#161
san360 wants to merge 3 commits into
mainfrom
san360-sync-timing-telemetry

Conversation

@san360

@san360 san360 commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Summary

Sync Sessions took >2 minutes for large session volumes (follow-up to the #106 OOM fix). Root-caused via CPU profiling and fixed an O(n²) hotspot in the JSONL streaming readers. Phase-2 parse drops from ~197s → ~14s (~14×) with identical parsed results (same 616 sessions / 132 workspaces in my test data).

Root cause

The cold-parse JSONL readers (forEachJsonlLine / forEachJsonlLineAsync in parser-vscode-files.ts) used:

pending += decoder.write(chunk);
const parts = pending.split('\n');   // re-splits the ENTIRE growing buffer on every 1MB read
pending = parts.pop() ?? '';

Copilot CLI events.jsonl lines embed tens of MB of inline base64 (images). When a single line spans many 1 MB read chunks, every read re-concatenated and re-split the whole accumulated buffer → O(n²) CPU plus massive GC churn.

Evidence (CPU profile of a real 616-session / 132-workspace parse)

Before After
Phase 2 (parse) ~197 s ~14 s
forEachJsonlLineAsync self-time 154.5 s negligible
Garbage collector 29.8 s 0.9 s
Idle (waiting on disk I/O) 2.6 s 9.4 s (healthy)

This ruled out the earlier hypotheses (per-file setImmediate yields, prefetch I/O, the worker pool) — the profile attributed 78% of wall time to a single function.

The fix

A shared emitChunkLines() helper that scans only the freshly read chunk with indexOf, joining the carried partial line only when its terminating newline arrives. Total work is now linear in file size, with byte-for-byte identical line-splitting semantics. Applied to both the sync and async readers.

Telemetry (local-only, as requested)

Adds Sync timing instrumentation to the VS Code Output channel to attribute future regressions — all local, no remote telemetry:

  • [panel] sync-timing — end-to-end Sync breakdown (discover / parse / analyzer / warm-up / dataReady).
  • [parser] worker-phase-timing — per-phase wall clock.
  • cold-parse-breakdown / phase2-attribution — fine-grained cold-parse attribution.

Tests

  • Added regression tests for a single line spanning multiple read chunks (the O(n²) trigger) in both readers.
  • npm run check green — 1215 tests pass (+2 new).

Verification

VSIX repackaged and installed locally; "Developer: Reload Window" then re-run Sync to see the ~14× improvement and the sync-timing line in the Output channel.

Parsed-output parity verification

Added npm run parity (scripts/parse-parity.ts + scripts/parity-parse-entry.ts) to prove this change is output-preserving. There was no existing automated way to compare parsed JSON against main, and the harness can't live on main — so instead of a committed golden snapshot it builds the parser from both trees and diffs them directly:

  1. Bundles parseAllLogs from this branch's src/ and from a main checkout's src/.
  2. Runs both against the same real local logs.
  3. Serializes each ParseResult to canonical, stable-sorted NDJSON and compares by SHA-256.

Result — byte-for-byte identical across both trees (main runs the old split-based reader, branch runs emitChunkLines):

branch sha256 = 849183b49f4556f44678256104a169f3073f2002baa711e5ddc5ed95efe13880
main   sha256 = 849183b49f4556f44678256104a169f3073f2002baa711e5ddc5ed95efe13880
sessions=616 workspaces=132 editLoc=1771 sources=599  (identical)

PARITY OK — parsed output is byte-for-byte identical across both trees.

Usage: npm run parity (defaults to the local main checkout) or npm run parity -- --main <path-to-main-checkout>.

The cold-parse JSONL readers accumulated `pending += chunk` and then
re-ran `pending.split('\n')` over the entire growing buffer on every 1MB
read. Copilot CLI events.jsonl lines embed tens of MB of inline base64,
so a single long line spanning many chunks cost O(n^2) CPU plus heavy GC.

CPU-profiling a real 616-session / 132-workspace parse attributed 154.5s
of 198s to forEachJsonlLineAsync self-time and 29.8s to GC. Replacing the
repeated whole-buffer split with emitChunkLines(), which scans only each
freshly read chunk via indexOf, makes the work linear with identical line
semantics: phase-2 parse dropped from ~197s to ~14s (~14x) with the same
parsed result.

Also adds local-only Sync timing telemetry to the VS Code Output channel
(sync-timing, worker-phase-timing, cold-parse-breakdown, phase2-attribution)
to attribute future regressions. No remote telemetry or network calls.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

Copy link
Copy Markdown

⚠️ Deprecation Warning: The deny-licenses option is deprecated for possible removal in the next major release. For more information, see issue 997.

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@san360 san360 requested a review from aymenfurter June 24, 2026 17:22
Adds `npm run parity`, which builds the parser from two source trees (this
branch and a main checkout), runs both against the same real local logs,
and compares a canonical NDJSON serialization of the ParseResult by SHA-256.

This is how we prove the O(n^2) JSONL streaming fix changed performance
only, not output: there is no committed golden snapshot, and the harness
itself lives only on the branch, so it diffs branch-vs-main builds directly
instead of against a baseline. Verified: both trees emit an identical
sha256 (sessions=616 workspaces=132 editLoc=1771 sources=599).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@san360

san360 commented Jun 25, 2026

Copy link
Copy Markdown
Contributor Author

Thanks for the automated check. This is informational: Dependency Review found no vulnerabilities or license issues for this PR, which is expected since it adds no runtime dependencies (only source changes plus two dev-only scripts). No action needed here.

The "deny-licenses is deprecated" note refers to the repository's dependency-review workflow configuration, not anything in this PR's diff, so it is out of scope for this change and best handled as a separate workflow-maintenance update.

@aymenfurter aymenfurter left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Improves cold-parse performance for large Copilot JSONL logs by replacing an O(n²) chunk-splitting hotspot with a linear-time line emitter, and adds local-only timing instrumentation (plus a parity harness) to make future regressions diagnosable while preserving parsed output.

Changes:

  • Reworked JSONL streaming readers to scan only the newly read chunk for newlines (linear-time behavior) and added regression coverage for long single-line spans.
  • Added local-only sync/parse timing instrumentation across the panel, worker host, and parser to attribute wall-clock time by phase/sub-phase.
  • Introduced a npm run parity harness that builds/runs parsers from two trees and SHA-compares canonicalized NDJSON output.
Show a summary per file
File Description
src/webview/panel.ts Adds end-to-end sync timing breakdown emitted via runtimeDebug (warm-cache + cold).
src/core/parser.ts Adds phase-2 attribution logging and cold-parse sub-phase breakdown logging hooks.
src/core/parser-worker-host.ts Tracks per-phase wall-clock durations from worker progress and logs a summary on completion.
src/core/parser-worker-host.test.ts Adds coverage ensuring the worker-phase timing summary is emitted on done.
src/core/parser-vscode.ts Adds sub-step timing accumulation around chat/CLI/edit parsing operations.
src/core/parser-vscode-files.ts Replaces pending.split('\n') loop with emitChunkLines() to avoid O(n²) behavior on long lines.
src/core/parser-vscode-files.test.ts Adds regression tests for single JSONL lines spanning multiple read chunks (sync + async readers).
src/core/parser-shared.ts Introduces parse timing counters and integrates forced-GC counting for attribution.
scripts/parse-parity.ts Adds a two-tree parse parity runner that bundles/runs both parsers and compares SHA-256 output hashes.
scripts/parity-parse-entry.ts Adds the deterministic canonicalization + NDJSON emission entrypoint used by the parity runner.
package.json Adds parity npm script.
.gitignore Ignores parity output/temp artifacts.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 11/12 changed files
  • Comments generated: 2

Comment thread scripts/parse-parity.ts
Comment thread src/core/parser-shared.ts Outdated
- parse-parity: replace the hard-coded Windows-only main checkout path with
  git-based discovery (git worktree list -> the main branch's working tree),
  so `npm run parity` works on any OS and checkout layout, and fails with an
  actionable message when no main checkout is found.
- parser-shared(maybeForceGc): increment parseTiming.forcedGc only after gc()
  returns, so the counter reflects GCs actually forced rather than attempted.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants