Update README.md

README.md (changed):

````diff
@@ -14,19 +14,19 @@ pipeline_tag: any-to-any
 
 
 ## Abstract
 
-Parameter generation has long struggled to match the scale of today
-models, curbing its broader utility. In this paper, we introduce **R**ecurrent Diffusion for Large-Scale
-**P**arameter **G**eneration (**RPG**), a novel framework that generates full neural network parameters—up
-to **hundreds of millions**—on a **single GPU**. Our approach first partitions a network
-into non-overlapping
-mechanism then learns the inter-token relationships, producing
-for a diffusion process that ultimately synthesizes the
-architectures and tasks—including ResNets, ConvNeXts and ViTs on ImageNet-1K and COCO,
-and even LoRA-based LLMs—RPG achieves performance on par with fully trained networks while
-avoiding excessive memory overhead. Notably, it generalizes beyond its training set to generate
-valid parameters for previously unseen tasks, highlighting its flexibility in
+Parameter generation has long struggled to match the scale of today's large vision and language
+models, curbing its broader utility. In this paper, we introduce **R**ecurrent Diffusion for Large-Scale
+**P**arameter **G**eneration (**RPG**), a novel framework that generates full neural network parameters—up
+to **hundreds of millions**—on a **single GPU**. Our approach first partitions a network's parameters
+into non-overlapping 'tokens', each corresponding to a distinct portion of the model. A recurrent
+mechanism then learns the inter-token relationships, producing 'prototypes' which serve as conditions
+for a diffusion process that ultimately synthesizes the parameters. Across a spectrum of
+architectures and tasks—including ResNets, ConvNeXts and ViTs on ImageNet-1K and COCO,
+and even LoRA-based LLMs—RPG achieves performance on par with fully trained networks while
+avoiding excessive memory overhead. Notably, it generalizes beyond its training set to generate
+valid parameters for previously unseen tasks, highlighting its flexibility in open-ended
 scenarios. By overcoming the longstanding memory and scalability barriers,
-RPG serves as a critical advance in
+RPG serves as a critical advance in 'AI generating AI', potentially
 enabling efficient weight generation at scales previously deemed infeasible.
 
 
@@ -149,8 +149,7 @@ We thank
 [Mingjia Shi](bdemo.github.io/homepage),
 [Zangwei Zheng](https://zhengzangw.github.io/),
 [Ziheng Qin](https://henryqin1997.github.io/ziheng_qin/),
-[Tianlong Chen](https://tianlong-chen.github.io/)
-and [Zhangyang Wang](https://www.ece.utexas.edu/people/faculty/atlas-wang)
+and [Tianlong Chen](https://tianlong-chen.github.io/)
 for valuable discussions and feedbacks.
 This research is supported by the National Research Foundation,
 Singapore under its AI Singapore Programme
@@ -160,8 +159,8 @@ Singapore under its AI Singapore Programme
 ## Citation
 ```
 @misc{wang2025recurrent,
-    title={
-    author={Wang, Kai and Tang, Dongwen and Zhao, Wangbo and You, Yang},
+    title={Scaling Up Parameter Generation: A Recurrent Diffusion Approach},
+    author={Wang, Kai and Tang, Dongwen and Zhao, Wangbo and Schürholt, Konstantin and Wang, Zhangyang and You, Yang},
 year={2025},
 }
 ```
````