Add TurboQuant KV cache compression with native Metal SDPA kernel #3328

Closed
arozanov wants to merge 6 commits into ml-explore:main from arozanov:feature/turboquant-kv-cache

Fix k_norms dimension index for sequence length in quantized SDPA

fbd6385