MTDoven committed
Commit c3bf5ce · verified · 1 Parent(s): 71e7732

Update README.md

Files changed (1):
  1. README.md (+15 −16)

README.md CHANGED
@@ -14,19 +14,19 @@ pipeline_tag: any-to-any
 
 
 ## Abstract
-Parameter generation has long struggled to match the scale of todays large vision and language
-models, curbing its broader utility. In this paper, we introduce **R**ecurrent Diffusion for Large-Scale
-**P**arameter **G**eneration (**RPG**), a novel framework that generates full neural network parameters—up
-to **hundreds of millions**—on a **single GPU**. Our approach first partitions a networks parameters
-into non-overlapping tokens’, each corresponding to a distinct portion of the model. A recurrent
-mechanism then learns the inter-token relationships, producing prototypes which serve as conditions
-for a diffusion process that ultimately synthesizes the full parameters. Across a spectrum of
-architectures and tasks—including ResNets, ConvNeXts and ViTs on ImageNet-1K and COCO,
-and even LoRA-based LLMs—RPG achieves performance on par with fully trained networks while
-avoiding excessive memory overhead. Notably, it generalizes beyond its training set to generate
-valid parameters for previously unseen tasks, highlighting its flexibility in dynamic and open-ended
+Parameter generation has long struggled to match the scale of today's large vision and language
+models, curbing its broader utility. In this paper, we introduce **R**ecurrent Diffusion for Large-Scale
+**P**arameter **G**eneration (**RPG**), a novel framework that generates full neural network parameters—up
+to **hundreds of millions**—on a **single GPU**. Our approach first partitions a network's parameters
+into non-overlapping 'tokens', each corresponding to a distinct portion of the model. A recurrent
+mechanism then learns the inter-token relationships, producing 'prototypes' which serve as conditions
+for a diffusion process that ultimately synthesizes the parameters. Across a spectrum of
+architectures and tasks—including ResNets, ConvNeXts and ViTs on ImageNet-1K and COCO,
+and even LoRA-based LLMs—RPG achieves performance on par with fully trained networks while
+avoiding excessive memory overhead. Notably, it generalizes beyond its training set to generate
+valid parameters for previously unseen tasks, highlighting its flexibility in open-ended
 scenarios. By overcoming the longstanding memory and scalability barriers,
-RPG serves as a critical advance in AI generating AI’, potentially
+RPG serves as a critical advance in 'AI generating AI', potentially
 enabling efficient weight generation at scales previously deemed infeasible.
 
 
@@ -149,8 +149,7 @@ We thank
 [Mingjia Shi](bdemo.github.io/homepage),
 [Zangwei Zheng](https://zhengzangw.github.io/),
 [Ziheng Qin](https://henryqin1997.github.io/ziheng_qin/),
-[Tianlong Chen](https://tianlong-chen.github.io/),
-and [Zhangyang Wang](https://www.ece.utexas.edu/people/faculty/atlas-wang)
+and [Tianlong Chen](https://tianlong-chen.github.io/)
 for valuable discussions and feedbacks.
 This research is supported by the National Research Foundation,
 Singapore under its AI Singapore Programme
@@ -160,8 +159,8 @@ Singapore under its AI Singapore Programme
 ## Citation
 ```
 @misc{wang2025recurrent,
-  title={Recurrent Diffusion for Large-Scale Parameter Generation},
-  author={Wang, Kai and Tang, Dongwen and Zhao, Wangbo and You, Yang},
+  title={Scaling Up Parameter Generation: A Recurrent Diffusion Approach},
+  author={Wang, Kai and Tang, Dongwen and Zhao, Wangbo and Schürholt, Konstantin and Wang, Zhangyang and You, Yang},
   year={2025},
 }
 ```
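The abstract above describes a three-stage pipeline: partition a network's flat parameter vector into non-overlapping tokens, run a recurrent model over the token sequence to produce per-token prototypes, and use those prototypes to condition a diffusion-style denoiser. The following is a minimal NumPy sketch of that data flow only — the padding scheme, the toy RNN, the one-step denoiser, and all function names (`partition_into_tokens`, `recurrent_prototypes`, `denoise_step`) are illustrative assumptions, not the RPG implementation.

```python
import numpy as np

def partition_into_tokens(flat_params, token_size):
    """Split a flat parameter vector into non-overlapping fixed-size tokens.
    Zero-padding the tail is an assumption made for this sketch."""
    n = len(flat_params)
    pad = (-n) % token_size
    padded = np.concatenate([flat_params, np.zeros(pad)])
    return padded.reshape(-1, token_size)

def recurrent_prototypes(tokens, hidden_size, rng):
    """Toy RNN over the token sequence: each hidden state acts as a
    'prototype' summarizing inter-token context (a stand-in for RPG's
    recurrent mechanism, with randomly initialized weights)."""
    token_size = tokens.shape[1]
    W_in = rng.standard_normal((token_size, hidden_size)) * 0.1
    W_h = rng.standard_normal((hidden_size, hidden_size)) * 0.1
    h = np.zeros(hidden_size)
    protos = []
    for t in tokens:
        h = np.tanh(t @ W_in + h @ W_h)  # recurrence over tokens
        protos.append(h.copy())
    return np.stack(protos)

def denoise_step(noisy_tokens, prototypes, W_out):
    """Single schematic denoising step conditioned on the prototypes:
    subtract a prototype-predicted correction (a one-step placeholder
    for the full diffusion sampler)."""
    return noisy_tokens - prototypes @ W_out
```

A usage pass would partition a parameter vector, compute prototypes, and apply the conditioned step to noise of matching shape; the point of the sketch is only that the prototype sequence, not the raw parameters, is what conditions the generator.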