Skip to content

Commit 28c320e

Browse files
authored
🔥🔥[FBCache] Fastest HunyuanVideo Inference with Context Parallelism and First Block Cache on NVIDIA L20 GPUs (#23)
1 parent 847bca8 commit 28c320e

1 file changed

Lines changed: 1 addition & 0 deletions

File tree

README.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -88,6 +88,7 @@
8888
|2024.10| 🔥🔥[**ToCa**] ToCa: Accelerating Diffusion Transformers with Token-wise Feature Caching(@SJTU)|[[pdf]](https://arxiv.org/pdf/2410.05317) | [[ToCa]](https://github.com/Shenyi-Z/ToCa) ![](https://img.shields.io/github/stars/Shenyi-Z/ToCa.svg?style=social)|⭐️⭐️ |
8989
|2024.11| 🔥🔥[**SkipCache**] Accelerating Vision Diffusion Transformers with Skip Branches(@SJTU)|[[pdf]](https://arxiv.org/pdf/2411.17616) | [[Skip-DiT]](https://github.com/OpenSparseLLMs/Skip-DiT) ![](https://img.shields.io/github/stars/OpenSparseLLMs/Skip-DiT.svg?style=social)|⭐️⭐️ |
9090
|2024.12| 🔥🔥[**DuCa**] Accelerating Diffusion Transformers with Dual Feature Caching(@SJTU)|[[pdf]](https://arxiv.org/pdf/2412.18911) | [[DuCa]](https://github.com/Shenyi-Z/DuCa) ![](https://img.shields.io/github/stars/Shenyi-Z/DuCa.svg?style=social)|⭐️⭐️ |
91+
|2025.01| 🔥🔥[**FBCache**] Fastest HunyuanVideo Inference with Context Parallelism and First Block Cache on NVIDIA L20 GPUs(@chengzeyi)| [[docs]](https://github.com/chengzeyi/ParaAttention/blob/main/doc/fastest_hunyuan_video.md) | [[ParaAttention]](https://github.com/chengzeyi/ParaAttention) ![](https://img.shields.io/github/stars/chengzeyi/ParaAttention.svg?style=social)|⭐️⭐️ |
9192

9293
## 📙Awesome Diffusion Distributed Inference with Multi-GPUs
9394

0 commit comments

Comments
 (0)