[bug] fix for incorrect output from grouped Conv and regression test for Conv with batch>1 & group>1#24
Open
harz05 wants to merge 2 commits into
Open
[bug] fix for incorrect output from grouped Conv and regression test for Conv with batch>1 & group>1#24harz05 wants to merge 2 commits into
harz05 wants to merge 2 commits into
Conversation
Author
|
@sanjibansg a quick check on this. I noticed that in the recent operator init refactor, the |
17906b3 to
d732b8d
Compare
Author
|
also added a ConvGroupBatch GTest that exercises group>1 and batch>1 together (group=2, batch=4), which was never covered. Verified on Colab T4 with the tests passing |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #23
In
Generate_GPU_ALPAKAandGetBlasConfigofROperator_Conv.hxx,gemm_nwas being set to total output channels and used directly in per-group matmul calls and BLAS layout config without dividing byfAttrGroupfor the grouped path. The comment mentions that "we divide per group at launch" but the division was missing and same was verified with test failure as reported by #22 ,this caused per-group matmuls to compute all output channels instead of just the group's slice.Restoring the division at the variable assignment fixes it in one spot for both call sites.
EDIT: The PR nowadds a ConvGroupBatch GTest (group=2, batch=4) as the regression test for this path; it fails without this fix and passes with it.