📒A **small** curated list of Awesome **SD/DiT/ViT/Diffusion** **Distributed/Caching Inference** Papers with codes. For Awesome LLM Inference, please check 📖[Awesome-LLM-Inference](https://github.com/DefTruth/Awesome-LLM-Inference)

## 🤖Contents

- [📙Awesome SD Inference with Caching](#Caching)
- [📙Awesome SD Distributed Inference with Multi-GPUs](#Distributed)

## ©️Citations

}
```

## 📙Awesome SD Inference with Caching

<div id="Caching"></div>

- **UNet Based (DeepCache)**

- **DiT Based (Fast-Forward Caching)**
<img width="1119" alt="image" src="https://github.com/user-attachments/assets/fad8f187-d4ac-4290-9943-7b34116fed05">

|Date|Title|Paper|Code|Recom|
|:---:|:---:|:---:|:---:|:---:|
|2023.05|🔥🔥[**Cache-Enabled Sparse Diffusion**] Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference(@pku.edu.cn etc)|[[pdf]](https://arxiv.org/pdf/2305.17423) |⚠️|⭐️⭐️ |
|2023.12|🔥🔥[**DeepCache**] DeepCache: Accelerating Diffusion Models for Free(@nus.edu)|[[pdf]](https://arxiv.org/pdf/2312.00858) | [[DeepCache]](https://github.com/horseee/DeepCache) | ⭐️⭐️ |
|2023.12|🔥🔥[**Block Caching**] Cache Me if You Can: Accelerating Diffusion Models through Block Caching(@Meta GenAI etc)|[[pdf]](https://arxiv.org/pdf/2312.03209) |⚠️|⭐️⭐️ |
|2023.12|🔥🔥[**Approximate Caching**] Approximate Caching for Efficiently Serving Diffusion Models(@Adobe)|[[pdf]](https://arxiv.org/pdf/2312.04429) |⚠️|⭐️⭐️ |
|2024.06| 🔥🔥[**Layer Caching**] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching(@nus.edu) | [[pdf]](https://arxiv.org/pdf/2406.01733) | [[learning-to-cache]](https://github.com/horseee/learning-to-cache/) | ⭐️⭐️ |
|2024.07|🔥[**ElasticCache-LVLM**] Efficient Inference of Vision Instruction-Following Models with Elastic Cache(@Tsinghua University etc)|[[pdf]](https://arxiv.org/pdf/2407.18121)|[[ElasticCache]](https://github.com/liuzuyan/ElasticCache) |⭐️ |
|2024.07| 🔥🔥[**Fast-Forward Caching(DiT)**] FORA: Fast-Forward Caching in Diffusion Transformer Acceleration(@microsoft.com etc) | [[pdf]](https://arxiv.org/pdf/2407.01425) | [[FORA]](https://github.com/prathebaselva/FORA) |⭐️⭐️ |

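The caching papers above share one observation: the deep features of the denoising network drift slowly between adjacent timesteps, so they can be computed once and reused for several steps while the cheap outer layers are still run every step. A toy NumPy sketch of this step-level caching idea (the `shallow_blocks`/`deep_blocks` split, the functions themselves, and the cache interval are illustrative stand-ins, not taken from any specific paper):

```python
import numpy as np

def shallow_blocks(x, t):
    # cheap outer layers: recomputed at every denoising step (toy stand-in)
    return x + 0.01 * t

def deep_blocks(h):
    # expensive inner layers whose output changes slowly across steps (toy stand-in)
    return np.tanh(h)

def sample(x, num_steps=10, cache_interval=3):
    """Run `num_steps` toy denoising steps, recomputing the deep path only
    every `cache_interval` steps and reusing the cached output otherwise."""
    deep_cache = None
    full_passes = 0
    for t in range(num_steps):
        h = shallow_blocks(x, t)
        if deep_cache is None or t % cache_interval == 0:
            deep_cache = deep_blocks(h)  # full forward through the deep blocks
            full_passes += 1
        x = h + deep_cache  # cheap path plus (possibly stale) cached deep features
    return x, full_passes
```

With `cache_interval=3`, only 4 of the 10 steps pay for the deep path; the papers above differ mainly in *which* features are cached and *how* the reuse schedule is chosen (fixed interval in FORA, learned per-layer in Learning-to-Cache, etc.).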
## 📙Awesome SD Distributed Inference with Multi-GPUs

<div id="Distributed"></div>

- **UNet Based: Displaced Patch Parallelism (DistriFusion)**

<img width="1677" alt="image" src="https://github.com/user-attachments/assets/aefb2ae7-73eb-4e9c-bf1a-ec540f4dfa7d">

|Date|Title|Paper|Code|Recom|
|:---:|:---:|:---:|:---:|:---:|
|2024.02|🔥🔥[**DistriFusion**] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models(@MIT etc)|[[pdf]](https://arxiv.org/abs/2402.19481) | [[distrifuser]](https://github.com/mit-han-lab/distrifuser) | ⭐️⭐️ |
|2024.05|🔥🔥[**PipeFusion**] PipeFusion: Displaced Patch Pipeline Parallelism for Inference of Diffusion Transformer Models(@Tencent etc)|[[pdf]](https://arxiv.org/pdf/2405.14430) | [[xDiT]](https://github.com/xdit-project/xDiT) | ⭐️⭐️ |
|2024.05 | 🔥🔥[**TensorRT-LLM SDXL**] SDXL Distributed Inference with TensorRT-LLM and synchronous comm(@Zars19) | [[pdf]](https://arxiv.org/abs/2402.19481) | [[SDXL-TensorRT-LLM]](https://github.com/NVIDIA/TensorRT-LLM/pull/1514) | ⭐️⭐️ |
|2024.05| 🔥🔥[**FIFO-Diffusion**] FIFO-Diffusion: Generating Infinite Videos from Text without Training(@Seoul National University)|[[pdf]](https://arxiv.org/pdf/2405.11473) | [[FIFO-Diffusion]](https://github.com/jjihwan/FIFO-Diffusion_public) |⭐️⭐️ |
|2024.06| 🔥🔥[**AsyncDiff**] AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising(@nus.edu) | [[pdf]](https://arxiv.org/pdf/2406.06911) | [[AsyncDiff]](https://github.com/czg1225/AsyncDiff) | ⭐️⭐️ |
|2024.06| 🔥🔥[**Clip Parallelism**] Video-Infinity: Distributed Long Video Generation(@nus.edu)|[[pdf]](https://arxiv.org/pdf/2406.16260) | [[Video-Infinity]](https://github.com/Yuanshi9815/Video-Infinity) |⭐️⭐️ |

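The patch-parallel approaches above split the latent across devices so each device denoises only its own patch, exchanging just the boundary ("halo") activations; DistriFusion's displacement is that a device reads halos produced by its neighbours at the *previous* step, so communication can overlap with computation instead of blocking it. A toy single-process NumPy sketch of that data flow (the 3-tap `step_patch` operator, the patching scheme, and all names are illustrative stand-ins, not real multi-GPU code):

```python
import numpy as np

def step_patch(patch, left, right):
    # one toy denoising step with a 3-tap local receptive field,
    # padded with the neighbour boundary values (the "halo")
    padded = np.concatenate(([left], patch, [right]))
    return (padded[:-2] + padded[1:-1] + padded[2:]) / 3

def patch_parallel(x, num_devices=2, num_steps=4):
    """Each 'device' holds one patch; halos come from the neighbours'
    previous-step output, so no mid-step synchronisation is needed."""
    patches = np.array_split(x.astype(float), num_devices)
    for _ in range(num_steps):
        # gather halos from the previous step's patches before anyone updates
        halos = [
            (patches[i - 1][-1] if i > 0 else patches[i][0],
             patches[i + 1][0] if i < num_devices - 1 else patches[i][-1])
            for i in range(num_devices)
        ]
        # every patch can now update "in parallel" using only its halos
        patches = [step_patch(p, l, r) for p, (l, r) in zip(patches, halos)]
    return np.concatenate(patches)
```

Because each step depends only on the previous step's boundaries, the halo exchange for step *t+1* can be launched asynchronously while step *t* is still computing; the methods above differ in whether the stale halos are exact (synchronous) or approximate (displaced/pipelined).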
## ©️License
