[AIROCMLIR-811] Fix vectorization through held-constant unmerge dims#2373
Conversation
There was a problem hiding this comment.
Pull request overview
This PR fixes a vectorization inference bug in Rock’s transform-map analysis where a held-constant non-unit Unmerge dimension could incorrectly allow later dimensions to extend the inferred contiguous vectorization length, leading to misvectorized reads/writes (and verification failures in attention workloads).
Changes:
- Update
propagateUnmergeVectorizationto stop extending contiguous vectorization once it encounters a held-constant dimension withdimLength > 1, while preserving alignment computed so far. - Add an MLIR regression test that reproduces the minimal Merge → Unmerge-with-held-const → Unmerge chain and verifies the corrected inferred vector length (4 vs 8).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
mlir/lib/Dialect/Rock/utility/transformMapUtils.cpp |
Prevents over-extension of contiguous vectorization through held-constant non-unit Unmerge dimensions by breaking inference in that case. |
mlir/test/Dialect/Rock/test_vectorization_inference.mlir |
Adds a targeted regression test validating the corrected vectorization length inference. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## develop #2373 +/- ##
===========================================
+ Coverage 79.50% 81.01% +1.51%
===========================================
Files 100 126 +26
Lines 31016 42492 +11476
Branches 4819 7006 +2187
===========================================
+ Hits 24659 34424 +9765
- Misses 4245 5328 +1083
- Partials 2112 2740 +628
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
justinrosner
left a comment
There was a problem hiding this comment.
We also have some unittests for transformMapUtils in mlir/unittests/Dialect/Rock, can you add a test case for this failure there as well?
justinrosner
left a comment
There was a problem hiding this comment.
Changes look good to me now!
Motivation
f16/bf16 attention with perf config
attn:v3:128,128,256,4,128,64,16,4,1,1,2,0,1on gfx942/gfx950 was failing verification during the Weekly CI tuning phase.Technical Details
In
propagateUnmergeVectorization, an upper dim with no vectorization data was treated as "held constant" but its full length was folded intopreviousDimsStride. A subsequent slower dim whoseneedsCoefficientmatched that accumulated stride would then incorrectly extend the contiguous vectorization length, even though only one value of the held-constant dim is accessed.Fix: in the held-constant branch, break when the dim's size is > 1. Alignment computed so far is preserved; size-1 dims are unaffected.
This was reporting
vectorSrcLen=8instead of4for the bypass-LDS B-operand write ofrock.gridwise_attention_accel's second GEMM, so the loweredvector.transfer_readcrossed thenigap (flat = m_i*16 + ni*4 + vec_item) and corrupted all but the lastnislot, yielding NaNs downstream in softmax.Test Plan
A regression test (
@test_unmerge_held_constant_non_unit_dimin test_vectorization_inference.mlir) reconstructs the minimal Merge → Unmerge-with-held-const → Unmerge chain; it returns 4 with the fix and 8 without.Submission Checklist