BurstAttention and Ulysses all2all support for long sequence training. #203
Open
MayDomine wants to merge 11 commits into