Zigeng committed on
Commit
22b6641
·
verified ·
1 Parent(s): 5a09af0

Update README.md

Files changed (1)
  1. README.md +27 -3
README.md CHANGED
@@ -1,3 +1,27 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ ---
+
+ <div align="center">
+ <h1>🚀 CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient</h1>
+ </div>
+
+ > **Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient**
+ > [Zigeng Chen](https://github.com/czg1225), [Xinyin Ma](https://horseee.github.io/), [Gongfan Fang](https://fangggf.github.io/), [Xinchao Wang](https://sites.google.com/site/sitexinchaowang/)
+ > [Learning and Vision Lab](http://lv-nus.org/), National University of Singapore
+ > 🥯 [[Paper]](https://arxiv.org/abs/2406.06911) 🎄 [[Project Page]](https://czg1225.github.io/asyncdiff_page/) 💻 [[GitHub]](https://github.com/czg1225/CoDe)
+
+
+ <div align="center">
+ <img src="assets/teaser.png" width="100%" ></img>
+ <br>
+ <em>
+ 1.7x speedup and 0.5x memory consumption on ImageNet-256 generation. Top: original VAR-d30; Bottom: CoDe N=8. Speed measurements do not include the VAE decoder.
+ </em>
+ </div>
+ <br>
+
+ ## 💡 Introduction
+ We propose Collaborative Decoding (CoDe), a novel decoding strategy tailored to the VAR framework. CoDe capitalizes on two critical observations: the substantially reduced parameter demands at larger scales and the exclusive generation patterns across different scales. Based on these insights, we partition the multi-scale inference process into a seamless collaboration between a large model and a small model. This collaboration yields remarkable efficiency with minimal impact on quality: CoDe achieves a 1.7x speedup, cuts memory usage by around 50%, and preserves image quality with only a negligible FID increase from 1.95 to 1.98. When the number of drafting steps is further decreased, CoDe achieves a 2.9x acceleration, reaching over 41 images/s at 256x256 resolution on a single NVIDIA 4090 GPU, while preserving a commendable FID of 2.27.
+ ![figure](assets/curve.png)
+ ![figure](assets/frame.png)
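
The partition described in the Introduction can be illustrated with a minimal Python sketch. It is not the CoDe implementation: `collaborative_decode`, `make_toy_model`, and the role names `drafter` and `refiner` are hypothetical, the models are toy stand-ins that only emit placeholder tokens, and the scale schedule `SCALES` is an assumed VAR-style list of token-map side lengths for 256x256 images. It only shows how the first `draft_steps` scales go to the large model and the remaining scales go to the small model.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: a model receives the scale index, the side length of
# the token map to produce, and all token maps generated so far.
ScaleModel = Callable[[int, int, List[List[int]]], List[int]]


def collaborative_decode(
    drafter: ScaleModel,
    refiner: ScaleModel,
    scales: Sequence[int],
    draft_steps: int,
) -> Tuple[List[List[int]], List[str]]:
    """Generate one token map per scale, switching models after draft_steps."""
    if not 0 <= draft_steps <= len(scales):
        raise ValueError("draft_steps must be between 0 and len(scales)")
    token_maps: List[List[int]] = []
    schedule: List[str] = []
    for k, side in enumerate(scales):
        use_large = k < draft_steps  # early, small scales go to the large model
        model = drafter if use_large else refiner
        token_maps.append(model(k, side, token_maps))
        schedule.append("large" if use_large else "small")
    return token_maps, schedule


def make_toy_model(token_id: int) -> ScaleModel:
    """Toy stand-in that fills a side x side token map with one token id."""
    def model(k: int, side: int, previous: List[List[int]]) -> List[int]:
        return [token_id] * (side * side)
    return model


# Assumed scale schedule (side lengths of the token maps); not taken from this README.
SCALES = (1, 2, 3, 4, 5, 6, 8, 10, 13, 16)

maps, schedule = collaborative_decode(
    make_toy_model(0), make_toy_model(1), SCALES, draft_steps=8
)
large_tokens = sum(len(m) for m, who in zip(maps, schedule) if who == "large")
small_tokens = sum(len(m) for m, who in zip(maps, schedule) if who == "small")
print(schedule)
print(large_tokens, small_tokens)
```

Under this assumed schedule with `draft_steps=8`, the large model produces 255 of the 680 tokens and the small model produces the remaining 425, which is one way to see why handing the last, largest scales to a small model can save time and memory.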