Surface actionable diagnostic when TypeScript codegen generator is dropped by load failure #18125
sebastienros wants to merge 7 commits into
…failure
A TypeScript AppHost whose configured SDK version diverges from the installed
CLI fails with a cryptic "No code generator found for language: TypeScript".
The real cause is a binary mismatch: the code-generation assembly references an
Aspire.TypeSystem version that is absent on disk, so type discovery raises a
ReflectionTypeLoadException that CodeGeneratorResolver swallows. The generator is
silently dropped, and GenerateCode then throws a plain ArgumentException that the
diagnostic builder does not classify, so the actionable IncompatibleAspireSdk
("run aspire update") diagnostic the CLI knows how to render is never emitted.
- CodeGeneratorResolver now retains the swallowed ReflectionTypeLoadExceptions and
exposes them via GetDiscoveryLoadFailures().
- GenerateCode chains the captured load failure as the inner exception when no
generator is found, so the existing catch path produces the IncompatibleAspireSdk
diagnostic with the version-alignment remediation hint.
- CodeGenerationDiagnosticBuilder recognizes FileNotFoundException (the actual
loader exception for a missing dependency assembly) and captures its file name.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
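The three changes in this commit message can be modeled roughly as follows. This is a minimal TypeScript sketch, not the C# in Aspire.Hosting.RemoteHost: the `Failure` tagged objects stand in for the .NET exception types, and `Candidate`, `Resolver`, `findMissingAssembly`, and `classify` are hypothetical names invented for illustration. Only `GetDiscoveryLoadFailures`, `GenerateCode`, and `IncompatibleAspireSdk` come from the commit message.

```typescript
// Tagged objects standing in for the .NET exception types named above.
type Failure =
  | { kind: "FileNotFound"; fileName: string }
  | { kind: "ReflectionTypeLoad"; loaderExceptions: Failure[] }
  | { kind: "Argument"; message: string; inner: Failure | undefined };

// Hypothetical candidate assembly: loading yields a language or a load failure.
interface Candidate {
  load(): { language: string } | Failure;
}

class Resolver {
  private readonly languages = new Set<string>();
  private readonly loadFailures: Failure[] = [];

  constructor(candidates: Candidate[]) {
    for (const candidate of candidates) {
      const result = candidate.load();
      if ("kind" in result) {
        // Still not fatal for discovery, but retained instead of discarded.
        this.loadFailures.push(result);
      } else {
        this.languages.add(result.language);
      }
    }
  }

  has(language: string): boolean {
    return this.languages.has(language);
  }

  getDiscoveryLoadFailures(): readonly Failure[] {
    return this.loadFailures;
  }
}

// Returns undefined on success, or an "Argument" failure whose inner
// failure is the first load failure retained during discovery.
function generateCode(resolver: Resolver, language: string): Failure | undefined {
  if (resolver.has(language)) {
    return undefined; // real generation would happen here
  }
  return {
    kind: "Argument",
    message: `No code generator found for language: ${language}`,
    inner: resolver.getDiscoveryLoadFailures()[0],
  };
}

// Walks the failure chain looking for a missing-assembly failure.
function findMissingAssembly(failure: Failure | undefined): string | undefined {
  if (failure === undefined) return undefined;
  if (failure.kind === "FileNotFound") return failure.fileName;
  if (failure.kind === "Argument") return findMissingAssembly(failure.inner);
  for (const loaderException of failure.loaderExceptions) {
    const found = findMissingAssembly(loaderException);
    if (found !== undefined) return found;
  }
  return undefined;
}

function classify(failure: Failure | undefined): string {
  return findMissingAssembly(failure) !== undefined
    ? "IncompatibleAspireSdk"
    : "Unclassified";
}
```

In this model, an "Argument" failure with no inner failure stays unclassified, which corresponds to the behavior before the change: with nothing chained, the diagnostic builder has no evidence of a binary mismatch.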
🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 18125
Or
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 18125"
…mbly Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Only treat a bare FileNotFoundException as an incompatible-SDK signal when its FileName is an assembly display name, so a genuine missing-file IO error during code generation is not misreported as a version mismatch. FileNotFoundExceptions inside a ReflectionTypeLoadException's loader exceptions are still always treated as assembly-bind failures.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
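One way such a check could look is sketched below in TypeScript. The actual heuristic in the PR is not shown on this page, so `looksLikeAssemblyDisplayName` and its rules (no path separators, a simple name followed by a `Version=` component) are assumptions for illustration only.

```typescript
// Hypothetical heuristic: an assembly display name looks like
// "Name, Version=1.2.3.4, Culture=..., PublicKeyToken=...",
// while a file IO failure usually carries a path or a bare file name.
function looksLikeAssemblyDisplayName(fileName: string | undefined): boolean {
  if (!fileName) return false;

  // Paths contain directory separators; display names do not.
  if (/[\\\/]/.test(fileName)) return false;

  const [simpleName, ...components] = fileName
    .split(",")
    .map((part) => part.trim());
  if (!simpleName) return false;

  // Require a Version component with two to four numeric parts.
  return components.some((component) =>
    /^Version=\d+(\.\d+){1,3}$/.test(component),
  );
}
```

Under these assumed rules, a value such as `Aspire.TypeSystem, Version=13.4.3.0, Culture=neutral, PublicKeyToken=null` (the culture and token here are placeholders) is treated as an assembly-bind failure, while `settings.json` or an absolute path is left to the generic error path.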
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
PR #18125 - Functional Testing Report
PR: #18125 - TypeScript AppHost codegen failure -> actionable diagnostic
What the PR fixes
When the installed CLI is older than the configured SDK, the bundled apphost server's …
Scenario 1 - No regression on the SHIPPED PR artifact (PASS)
Installed the PR CLI to an isolated dir and ran … Generated …
Scenario 2 - Diagnostic flip proven against real production code (PASS)
The fix lives in … Key deterministic proof - … Supporting coverage: …
Verdict
PR is functionally sound for its stated intent (fixes #18110, #17910).
Re-running the failed jobs in the CI workflow for this pull request because 1 job was identified as a retry-safe transient failure in the CI run attempt.
JamesNK
left a comment
LGTM. The diagnostic chain is well-structured — the heuristic to distinguish assembly-bind FileNotFoundException from plain file IO is careful, and the test coverage is thorough across all the relevant exception shapes.
We need to harden this, not just fix the error. I think we need a breakdown of why this is happening. Unless there was some fundamental contract breakage in the apphost server between 13.4 and 13.5 (which I am not aware of), this should not happen. The version of Aspire.Hosting should come from the apphost; the apphost server's version should not impact the ALC where integrations are loaded.
davidfowl
left a comment
No issues with improving the error, but I think there's more to fix here: this scenario should not be broken.
Replace the negative/substring assertions with a single Assert.Equal against the full SafeMessage so the test pins the actionable message verbatim.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
❓ CLI E2E Tests unknown: 115 passed, 0 failed, 2 unknown (commit …)
📹 Recordings uploaded automatically from CI run #27423488724
Should this be draft till then?
….0.0.0
The shipped 1.0.0.0 freeze fixed old-CLI + new-SDK skew but regressed the common backward-compat direction (new CLI + stable-channel SDK), where the bundled Aspire.TypeSystem (1.0.0.0) could not satisfy the codegen assemblies' strong-named reference to the real released version (e.g. 13.4.3.0). The CLR only binds when the loaded copy's version is >= the requested version, so a single frozen value must sit ABOVE every shipped real version to satisfy the must-work direction. Freeze at 10001.0.0.0 (a '1.0' seeded high; major slot is the breaking-change bump axis).
- Aspire.TypeSystem.csproj: AssemblyVersion 1.0.0.0 -> 10001.0.0.0; drop DisablePackageBaselineValidation (10001 >= package baseline, so CP0003 passes); rewrite rationale comment.
- IntegrationLoadContext: explain the bundled high copy satisfies any lower/equal reference; residual old-CLI case is #18125's actionable 'update your CLI' path.
- AssemblyLoader.WarnIfSharedAssemblyMismatch: warn only when bundled < libs (the only failing order); bundled >= libs binds and logs at Debug.
- AtsSharedContractSurfaceTests / SharedContractSurface: reframe the freeze guard around constant-version binding (content compat, not version) and reflow comments.
Fixes the regression behind #18110 and #17910; complements the #18125 diagnostics, which still surface the genuine new-member case (already-shipped old CLI + post-freeze codegen) as an actionable error.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
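The version arithmetic behind this commit can be checked with a small model. The TypeScript sketch below encodes only the rule as stated in the commit message (the bundled copy binds when its version is >= the requested one); `parseVersion`, `compareVersions`, `canBind`, and `mismatchLogLevel` are hypothetical helper names, not Aspire or CLR APIs.

```typescript
// Parses a four-part assembly version such as "13.4.3.0".
function parseVersion(text: string): number[] {
  const parts = text.split(".");
  if (parts.length !== 4 || parts.some((part) => !/^\d+$/.test(part))) {
    throw new Error(`Expected a four-part version, got "${text}"`);
  }
  return parts.map((part) => Number(part));
}

// Compares two versions component by component: -1, 0, or 1.
function compareVersions(a: string, b: string): number {
  const left = parseVersion(a);
  const right = parseVersion(b);
  for (let i = 0; i < 4; i++) {
    const x = left[i] ?? 0;
    const y = right[i] ?? 0;
    if (x !== y) return x < y ? -1 : 1;
  }
  return 0;
}

// Rule from the commit message: the loaded (bundled) copy satisfies a
// reference only when its version is >= the requested version.
function canBind(bundled: string, requested: string): boolean {
  return compareVersions(bundled, requested) >= 0;
}

// Mirrors the described logging policy: warn only in the failing order.
function mismatchLogLevel(bundled: string, libs: string): "Warning" | "Debug" {
  return canBind(bundled, libs) ? "Debug" : "Warning";
}
```

With the values quoted in the commit message, `canBind("1.0.0.0", "13.4.3.0")` is false (the regression), while `canBind("10001.0.0.0", "13.4.3.0")` is true, which is why the frozen value has to sit above every shipped release version.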
Description
When the configured Aspire SDK is out of sync with the installed CLI, running or restoring a TypeScript (Node.js) AppHost could fail with a cryptic, unactionable error:
The root cause is a binary mismatch between the bundled apphost server and the integration assemblies on disk. The TypeScript code generator assembly (from the SDK packages) references an Aspire.TypeSystem assembly that the running apphost server cannot provide (for example, an older bundled server that predates that assembly version). During assembly discovery, CodeGeneratorResolver.DiscoverGenerators calls GetTypes(), hits a ReflectionTypeLoadException (whose loader exception is a FileNotFoundException for Aspire.TypeSystem), and silently drops the TypeScript generator. The downstream CodeGenerationService.GenerateCode then threw a plain ArgumentException, which the diagnostic builder did not classify, so the existing actionable IncompatibleAspireSdk diagnostic (with a "run aspire update" remediation hint) was never emitted. The bare exception then propagated across StreamJsonRpc to the CLI and surfaced as a raw stack trace / "unexpected error".
This change routes the swallowed load failure into that existing actionable diagnostic so the user instead sees:
Approach
The fix is contained entirely within Aspire.Hosting.RemoteHost (the apphost server, which is bundled into the CLI). It has three parts:
- CodeGeneratorResolver now retains the ReflectionTypeLoadExceptions it swallows during discovery (via a new DiscoveryResult and GetDiscoveryLoadFailures()), instead of discarding them.
- CodeGenerationService.GenerateCode chains the first retained load failure as the inner exception of the ArgumentException it throws when no generator is found.
- CodeGenerationDiagnostic recognizes FileNotFoundException in the exception chain (in addition to the existing reflection-load cases) and builds the structured IncompatibleAspireSdk RPC error that the CLI already renders with the actionable "run aspire update" message.
Because both the affected server code and the CLI client catch path (AppHostRpcClient.InvokeCodeGenerationAsync) are shared, this applies to every command that triggers code generation (aspire run, aspire restore, etc.). No public API or codegen exports coverage changed; the CLI-side rendering path was already in place.
Verification
In addition to the unit tests, I ran a functional A/B test against a real TypeScript AppHost configured with a daily SDK, injecting the same Aspire.TypeSystem load failure. With the fix, the message flips from the cryptic "No code generator found..." to the actionable "...incompatible with the configured SDK version. Run 'aspire update'...". This was verified end-to-end via aspire run; aspire restore exercises the same shared RPC and client paths.
Fixes #18110
Fixes #17910
Checklist
<remarks /> and <code /> elements on your triple slash comments?