Add backend-neutral raw strided bgemm API #134
Merged
Summary
This PR adds a backend-neutral raw borrowed-layout entry point for strided
batched GEMM.
The normal StridedView/StridedViewMut APIs remain unchanged. The new API is for
prepared replay paths that already have validated dims/strides/offset
descriptors and should not rebuild owning view metadata for every small GEMM.
Motivation
Tensor contraction engines often split execution into two phases:
In the replay phase, the caller already has borrowed layout metadata:
Constructing a new StridedView for every replay allocates owned dynamic-rank
metadata. That is fine for the general view API, but it is avoidable overhead
for compiled replay. This PR adds a raw borrowed-layout API for that case.
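To make the owned-versus-borrowed distinction concrete, here is a minimal sketch. All names in it (OwnedLayout, RawLayout, as_raw, offset_range, fits_in) are hypothetical and are not the actual strided-view types; it only assumes that a prepared plan stores its dims and strides once and lends them out as slices on each replay.

```rust
use std::convert::TryFrom;

/// Hypothetical owning metadata: building one allocates two Vecs.
pub struct OwnedLayout {
    pub dims: Vec<usize>,
    pub strides: Vec<isize>,
    pub offset: isize,
}

/// Hypothetical borrowed metadata: slices into storage a plan already owns.
/// Creating one performs no allocation.
#[derive(Clone, Copy, Debug)]
pub struct RawLayout<'a> {
    pub dims: &'a [usize],
    pub strides: &'a [isize],
    pub offset: isize,
}

impl OwnedLayout {
    /// Lend the stored metadata without copying it.
    pub fn as_raw(&self) -> RawLayout<'_> {
        RawLayout { dims: &self.dims, strides: &self.strides, offset: self.offset }
    }
}

impl<'a> RawLayout<'a> {
    /// Inclusive (min, max) element offsets the layout can touch.
    /// Returns None on rank mismatch, an empty dimension, or overflow.
    pub fn offset_range(&self) -> Option<(isize, isize)> {
        if self.dims.len() != self.strides.len() || self.dims.contains(&0) {
            return None;
        }
        let (mut lo, mut hi) = (self.offset, self.offset);
        for (&d, &s) in self.dims.iter().zip(self.strides) {
            let steps = isize::try_from(d - 1).ok()?;
            let span = steps.checked_mul(s)?;
            // Negative strides extend the range downward, positive ones upward.
            if span < 0 {
                lo = lo.checked_add(span)?;
            } else {
                hi = hi.checked_add(span)?;
            }
        }
        Some((lo, hi))
    }

    /// True if every reachable offset lies inside a buffer of `len` elements.
    pub fn fits_in(&self, len: usize) -> bool {
        match self.offset_range() {
            Some((lo, hi)) => lo >= 0 && (hi as usize) < len,
            None => false,
        }
    }
}
```

In this sketch a plan would run fits_in once at compile time and then hand out as_raw() on every replay, which is the kind of one-time validation the raw API presupposes.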
API shape
Raw layout types live in strided-view, not in a concrete backend module:
The backend-neutral GEMM entry point is re-exported from strided-einsum2:
For compiled plans that have already validated bounds and ranks:
The unchecked path requires the caller to prove:
- [lo, sum, batch], [sum, ro, batch], and [lo, ro, batch],
- C does not alias A or B in a way that violates mutable access.
Backend behavior
The raw API is not faer-specific. It lowers through the same backend-neutral
prepare path:
That means faer, BLAS, and future backends can share the same raw metadata
boundary. The concrete backend difference stays at the final GEMM call.
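The idea of a shared prepare result with the backend difference confined to the final GEMM call can be sketched as below. This is an illustration under stated assumptions, not the crate's implementation: Mat, Prepared, GemmBackend, NaiveBackend, and bgemm_raw are hypothetical names, element type is fixed to f64, and strides are restricted to non-negative values for brevity.

```rust
/// Hypothetical 2-D operand layout plus a batch stride, all in elements.
#[derive(Clone, Copy)]
pub struct Mat {
    pub offset: usize,
    pub rs: usize, // row stride
    pub cs: usize, // column stride
    pub bs: usize, // batch stride
}

/// Hypothetical backend-neutral problem description from a shared prepare step.
/// A is lo x sum, B is sum x ro, C is lo x ro, each repeated `batch` times.
pub struct Prepared {
    pub lo: usize,
    pub sum: usize,
    pub ro: usize,
    pub batch: usize,
    pub a: Mat,
    pub b: Mat,
    pub c: Mat,
}

/// Only this trait differs between backends in the sketch.
pub trait GemmBackend {
    /// Computes C = alpha * A * B + beta * C for one batch entry.
    fn gemm(&self, p: &Prepared, batch: usize, alpha: f64, a: &[f64], b: &[f64], beta: f64, c: &mut [f64]);
}

/// Reference backend using a plain triple loop.
pub struct NaiveBackend;

impl GemmBackend for NaiveBackend {
    fn gemm(&self, p: &Prepared, batch: usize, alpha: f64, a: &[f64], b: &[f64], beta: f64, c: &mut [f64]) {
        let ao = p.a.offset + batch * p.a.bs;
        let bo = p.b.offset + batch * p.b.bs;
        let co = p.c.offset + batch * p.c.bs;
        for i in 0..p.lo {
            for j in 0..p.ro {
                let mut acc = 0.0;
                for k in 0..p.sum {
                    acc += a[ao + i * p.a.rs + k * p.a.cs] * b[bo + k * p.b.rs + j * p.b.cs];
                }
                let idx = co + i * p.c.rs + j * p.c.cs;
                c[idx] = alpha * acc + beta * c[idx];
            }
        }
    }
}

/// Backend-neutral driver: iterates the batch and defers to the backend.
pub fn bgemm_raw<B: GemmBackend>(backend: &B, p: &Prepared, alpha: f64, a: &[f64], b: &[f64], beta: f64, c: &mut [f64]) {
    for batch in 0..p.batch {
        backend.gemm(p, batch, alpha, a, b, beta, c);
    }
}
```

A second backend would implement GemmBackend with its own kernel while reusing Prepared and bgemm_raw unchanged.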
Compatibility
Existing bgemm_strided_into callers continue to use StridedView/StridedViewMut.
The faer module keeps its compatibility wrapper, but external callers should
use the backend-neutral strided_einsum2::bgemm_raw_strided_into API for
prepared replay.
Tests
Validated:
Coverage includes:
- f64
- f32
- Complex64 raw GEMM with conjugation
- beta == 0 and beta != 0
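For readers unfamiliar with why conjugation and the two beta cases are tested separately, the following sketch shows the behavior such tests typically pin down. It is not taken from this PR's test suite: C64 is a hand-rolled stand-in for a complex type, update is a hypothetical single-element kernel, and treating beta == 0 as "overwrite C without reading it" is an assumption borrowed from the usual BLAS convention.

```rust
/// Minimal stand-in for a complex number (hypothetical, std-only).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct C64 {
    pub re: f64,
    pub im: f64,
}

impl C64 {
    pub const ZERO: C64 = C64 { re: 0.0, im: 0.0 };

    pub fn new(re: f64, im: f64) -> Self {
        C64 { re, im }
    }
    pub fn conj(self) -> Self {
        C64::new(self.re, -self.im)
    }
    pub fn mul(self, o: C64) -> Self {
        C64::new(self.re * o.re - self.im * o.im, self.re * o.im + self.im * o.re)
    }
    pub fn add(self, o: C64) -> Self {
        C64::new(self.re + o.re, self.im + o.im)
    }
    pub fn is_zero(self) -> bool {
        self.re == 0.0 && self.im == 0.0
    }
}

/// Computes c = alpha * sum_k op(a[k]) * b[k] + beta * c for one output element,
/// where op is conjugation when `conj_a` is true and the identity otherwise.
pub fn update(conj_a: bool, alpha: C64, a: &[C64], b: &[C64], beta: C64, c: &mut C64) {
    let mut acc = C64::ZERO;
    for (&x, &y) in a.iter().zip(b) {
        let x = if conj_a { x.conj() } else { x };
        acc = acc.add(x.mul(y));
    }
    let scaled = alpha.mul(acc);
    // Assumed convention: with beta == 0 the old value of c is never read,
    // so NaN or uninitialized garbage in c cannot leak into the result.
    *c = if beta.is_zero() { scaled } else { scaled.add(beta.mul(*c)) };
}
```

With a = b = [i], the unconjugated product is -1 and the conjugated product is +1, which makes a wrongly ignored conjugation flag visible; seeding c with NaN before a beta == 0 call checks the overwrite path.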