Add DriverExtension hook for embedding paralegal as a library#196

Draft
JustusAdam wants to merge 4 commits into main from haven-driver-extension

@JustusAdam
Collaborator

Summary

Adds a DriverExtension trait and a run_with_extension entry point so external rustc drivers can plug into paralegal between borrow-check and PDG construction without forking. The motivating consumer is haven's planned MIR rewrite pass, which needs to mutate MIR before paralegal walks it.

  • DriverExtension::pre_pdg(tcx, &MarkerCtx) fires after MIR borrow-check, before paralegal consumes MIR.
  • run_with_extension(args, plugin_args, Box<dyn DriverExtension>); existing run becomes sugar that passes a NoopExtension.
  • Pctx is now built once at the top of Callbacks::after_expansion and shared with both the hook and the analyzer, collapsing the previous double MarkerDatabase::init.
  • DumpOnlyCallbacks opts into the same hook for dependency crates only when the extension sets requires_dependency_markers, preserving the standalone paralegal cost profile.
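
As a rough illustration of the trait surface described above, here is a minimal sketch that compiles on stable Rust. `TyCtxt` and `MarkerCtx` are hypothetical stand-ins for the real rustc and paralegal types, the method signatures are guesses based on the bullet points, and `run_with_extension` omits the `args` and `plugin_args` parameters for brevity. `RewriteExtension` is an invented example, not haven's actual pass.

```rust
/// Stand-in for rustc's `TyCtxt`; the real type is not constructible here.
pub struct TyCtxt {
    pub mir_bodies: Vec<String>,
}

/// Stand-in for paralegal's `MarkerCtx`.
pub struct MarkerCtx;

/// Assumed shape of the hook trait. Method names come from the PR
/// description; the exact signatures are assumptions.
pub trait DriverExtension {
    /// Fires after MIR borrow-check, before the analyzer consumes MIR.
    fn pre_pdg(&mut self, tcx: &mut TyCtxt, markers: &MarkerCtx);

    /// Opt in to building marker data for dependency crates as well.
    fn requires_dependency_markers(&self) -> bool {
        false
    }
}

/// Does nothing, so the standalone binary keeps its old behavior.
pub struct NoopExtension;

impl DriverExtension for NoopExtension {
    fn pre_pdg(&mut self, _tcx: &mut TyCtxt, _markers: &MarkerCtx) {}
}

/// Hypothetical extension that rewrites every body before analysis.
pub struct RewriteExtension;

impl DriverExtension for RewriteExtension {
    fn pre_pdg(&mut self, tcx: &mut TyCtxt, _markers: &MarkerCtx) {
        for body in &mut tcx.mir_bodies {
            body.push_str(" [rewritten]");
        }
    }

    fn requires_dependency_markers(&self) -> bool {
        true
    }
}

/// Placeholder analyzer: returns the bodies it was handed.
fn analyze(tcx: &TyCtxt, _markers: &MarkerCtx) -> Vec<String> {
    tcx.mir_bodies.clone()
}

/// Builds the marker context once, runs the hook, then analyzes.
pub fn run_with_extension(mut tcx: TyCtxt, mut ext: Box<dyn DriverExtension>) -> Vec<String> {
    let markers = MarkerCtx; // built once, shared by hook and analyzer
    ext.pre_pdg(&mut tcx, &markers);
    analyze(&tcx, &markers)
}

/// The existing entry point becomes sugar over the extension-aware one.
pub fn run(tcx: TyCtxt) -> Vec<String> {
    run_with_extension(tcx, Box::new(NoopExtension))
}
```

With this shape, an embedder would call `run_with_extension(tcx, Box::new(RewriteExtension))` and the analyzer would only ever see the rewritten bodies, while `run(tcx)` leaves them untouched.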

Notes for reviewers

The non-obvious ordering constraint surfaced by the cross-crate tests: MarkerDatabase::init reads each included crate's .pmeta from disk, including the local one. dump_markers(tcx) therefore has to run before Pctx::new. Both after_expansion paths now document that constraint in code, and the run_in_context_with_pctx helper expects callers to have already dumped.
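
The ordering constraint can be modeled with a small, self-contained sketch. Everything here is a hypothetical stand-in: `Disk` plays the role of the `.pmeta` files on disk, `dump_markers` and `init_marker_database` mirror the real functions in name only, and returning an error for a missing file is an assumption (the PR does not say how the real code behaves in that case).

```rust
use std::collections::HashMap;

/// Stand-in for the on-disk `.pmeta` files, keyed by crate name.
#[derive(Default)]
pub struct Disk {
    pmeta: HashMap<String, Vec<String>>,
}

/// Hypothetical analogue of `dump_markers`: writes one crate's markers.
pub fn dump_markers(disk: &mut Disk, crate_name: &str, markers: &[&str]) {
    let owned = markers.iter().map(|m| (*m).to_string()).collect();
    disk.pmeta.insert(crate_name.to_string(), owned);
}

/// Hypothetical analogue of `MarkerDatabase::init`: reads the `.pmeta`
/// of every included crate, the local one included.
pub fn init_marker_database(
    disk: &Disk,
    included: &[&str],
) -> Result<HashMap<String, Vec<String>>, String> {
    let mut db = HashMap::new();
    for name in included {
        let markers = disk
            .pmeta
            .get(*name)
            .ok_or_else(|| format!("missing .pmeta for crate `{}`", name))?;
        db.insert((*name).to_string(), markers.clone());
    }
    Ok(db)
}
```

In this model, calling `init_marker_database(&disk, &["dep", "local"])` before `dump_markers(&mut disk, "local", ..)` fails because the local crate's entry does not exist yet; calling it afterwards succeeds.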

Marked draft: opening this now to get directional feedback on the trait surface and the timing/ordering changes before haven starts depending on the API.

🤖 Generated with Claude Code

JustusAdam and others added 4 commits May 15, 2026 23:44
Lets external rustc drivers (haven's planned MIR rewriter being the
motivating case) plug into paralegal between borrow-check and PDG
construction without forking. The pre_pdg hook receives the shared
Pctx/MarkerCtx so any MIR mutation happens before paralegal's analyzer
consumes MIR and any marker lookup the extension does hits the same
cache the downstream analyzer will use.

NoopExtension preserves the existing standalone paralegal binary's
cost profile; DumpOnlyCallbacks skips the (expensive) MarkerCtx build
for dependency crates unless an extension opts in via
requires_dependency_markers.

The Pctx lift incidentally collapses the two MarkerDatabase::init
calls the previous flow made per analyzed crate into one; the call in
Pctx::new inside into_generator is no longer needed. Ordering
matters: MarkerDatabase::init reads each included crate's .pmeta from
disk, including the local one, so dump_markers must run before
Pctx::new. The cross-crate tests caught this when the dumps and the
init were reordered. Both after_expansion paths now document the
constraint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Extensions like haven need to install their own query providers (notably
`mir_borrowck` and `fn_sig` for synthetic DefIds) alongside paralegal's
own. Since `Config::override_queries` is `Option<fn(..)>` and can't be
layered externally, expose a hook on the trait and compose inside
`configure()` via a process-wide OnceLock.

No behavior change when no extension is registered.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
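
The composition trick in the commit above can be sketched as follows. Because the override slot holds a plain `fn` pointer, it cannot capture an extension in a closure, so the extension's provider hook is parked in a process-wide `OnceLock` and a fixed function reads it back. `Providers`, `OverrideFn`, and all function names are hypothetical stand-ins; the stand-in callback takes only the provider table, and the order (paralegal first, extension second) is an assumption.

```rust
use std::sync::OnceLock;

/// Stand-in for rustc's provider table; the fields are placeholders
/// that record which party installed each query.
#[derive(Default)]
pub struct Providers {
    pub mir_borrowck: &'static str,
    pub fn_sig: &'static str,
}

/// A plain function pointer, matching the `Option<fn(..)>` shape of the
/// slot: no captured state is possible.
pub type OverrideFn = fn(&mut Providers);

/// Process-wide slot for the extension's provider hook.
static EXTENSION_OVERRIDE: OnceLock<OverrideFn> = OnceLock::new();

/// Registers the extension hook; fails if one is already registered.
pub fn register_extension_override(f: OverrideFn) -> Result<(), OverrideFn> {
    EXTENSION_OVERRIDE.set(f)
}

/// Placeholder for paralegal's own provider overrides.
fn paralegal_override(p: &mut Providers) {
    p.mir_borrowck = "paralegal";
}

/// Hypothetical extension hook, in the spirit of the haven use case.
pub fn haven_override(p: &mut Providers) {
    p.mir_borrowck = "haven";
    p.fn_sig = "haven";
}

/// The single function handed to the compiler: paralegal's overrides
/// run first, then the extension's, if one was registered.
pub fn composed_override(p: &mut Providers) {
    paralegal_override(p);
    if let Some(ext) = EXTENSION_OVERRIDE.get() {
        ext(p);
    }
}

/// Stand-in for the part of `configure()` that fills the override slot.
pub fn configure() -> Option<OverrideFn> {
    let f: OverrideFn = composed_override;
    Some(f)
}
```

When nothing is registered, `composed_override` behaves exactly like `paralegal_override`, which matches the "no behavior change" claim. The cost of this design is that the slot is global to the process, so only one extension can be registered.
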
Lets embedders (haven, future paralegal consumers) swap the
rustc_driver-linked analyzer binary without forking the cargo
orchestrator. The wrapper-shim execs the overridden binary, and the
build hash mixes in its identity so cached artifacts from a different
analyzer aren't reused silently.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
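
To illustrate why the analyzer's identity belongs in the build hash, here is a minimal sketch using the standard library's `DefaultHasher`. `BuildKey` and `build_hash` are hypothetical, and using the binary's path as its "identity" is an assumption; the commit does not say which hasher or which notion of identity the real code uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical inputs to the artifact cache key.
pub struct BuildKey<'a> {
    pub analyzer_args: &'a [&'a str],
    /// Identity of the overridden analyzer binary, if any.
    /// Assumption: the path stands in for the identity.
    pub analyzer_binary: Option<&'a str>,
}

/// Hashes every input, so changing the analyzer changes the key.
pub fn build_hash(key: &BuildKey<'_>) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.analyzer_args.hash(&mut hasher);
    key.analyzer_binary.hash(&mut hasher);
    hasher.finish()
}
```

Two keys that differ only in `analyzer_binary` hash to different values, so artifacts produced by the default analyzer and by an overridden one end up in separate cache entries.
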
`prepare_compiler_args` in cargo_paralegal_flow owns the boilerplate
(VersionArgs parsing, --version/-vV print + nightly-version spoof,
EXEC_HASH_ARG strip, RUSTC_WRAPPER mode detection, sysroot append) so
both paralegal-flow-impl and embedders linking the analyzer as a
library (haven-driver) share one source-of-truth instead of keeping
~100 lines of mirrored argv-prep in sync.

Co-located orchestrator change: new --build flag flips the wrapped
cargo invocation from `check` to `build` so MIR-rewriting embedders
can codegen the rewritten bodies into real binaries. Both paralegal
Callbacks already return Compilation::Continue, so codegen runs
unconditionally past the analyzer hooks; the flag just unlocks it at
the cargo layer. Mixed into the build hash to keep check/build caches
disjoint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>