---
datasets:
- MTDoven/ViTTiny1022
language:
- en
metrics:
- accuracy
pipeline_tag: any-to-any
---


# Scaling Up Parameter Generation: A Recurrent Diffusion Approach
[Paper](https://arxiv.org/pdf/2501.11587) | [Project Page](https://NUS-HPC-AI-Lab.github.io/Recurrent-Parameter-Generation/) | [GitHub](https://github.com/NUS-HPC-AI-Lab/Recurrent-Parameter-Generation) | [Twitter](https://x.com/VictorKaiWang1/status/1881380005118419435)


## Abstract
Parameter generation has long struggled to match the scale of today's large vision and language 
models, curbing its broader utility. In this paper, we introduce **R**ecurrent Diffusion for Large-Scale 
**P**arameter **G**eneration (**RPG**), a novel framework that generates full neural network parameters—up 
to **hundreds of millions**—on a **single GPU**. Our approach first partitions a network's parameters 
into non-overlapping 'tokens', each corresponding to a distinct portion of the model. A recurrent 
mechanism then learns the inter-token relationships, producing 'prototypes' which serve as conditions 
for a diffusion process that ultimately synthesizes the parameters. Across a spectrum of 
architectures and tasks—including ResNets, ConvNeXts and ViTs on ImageNet-1K and COCO, 
and even LoRA-based LLMs—RPG achieves performance on par with fully trained networks while 
avoiding excessive memory overhead. Notably, it generalizes beyond its training set to generate 
valid parameters for previously unseen tasks, highlighting its flexibility in open-ended 
scenarios. By overcoming the longstanding memory and scalability barriers, 
RPG serves as a critical advance in 'AI generating AI', potentially 
enabling efficient weight generation at scales previously deemed infeasible.
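
The tokenization step described above can be illustrated with a minimal sketch. It assumes the parameters are already flattened into one vector and that every token has the same fixed length, with zero padding in the last token; the function names, the token size, and the padding scheme are hypothetical and are not taken from the RPG codebase, whose actual partitioning may differ.

```python
from typing import List, Sequence, Tuple


def partition_into_tokens(
    flat_params: Sequence[float], token_size: int, pad_value: float = 0.0
) -> Tuple[List[List[float]], int]:
    """Split a flat parameter vector into non-overlapping, fixed-size tokens.

    Returns the tokens and the number of padded entries in the last token.
    Fixed-size tokens and zero padding are assumptions made for illustration.
    """
    if token_size <= 0:
        raise ValueError("token_size must be positive")
    tokens = [
        list(flat_params[start:start + token_size])
        for start in range(0, len(flat_params), token_size)
    ]
    pad = 0
    if tokens and len(tokens[-1]) < token_size:
        pad = token_size - len(tokens[-1])
        tokens[-1].extend([pad_value] * pad)
    return tokens, pad


def merge_tokens(tokens: List[List[float]], pad: int) -> List[float]:
    """Invert partition_into_tokens: concatenate tokens and drop the padding."""
    flat = [value for token in tokens for value in token]
    return flat[:len(flat) - pad] if pad else flat


# Ten parameters split into tokens of four values each.
params = [float(i) for i in range(10)]
tokens, pad = partition_into_tokens(params, token_size=4)
print(len(tokens), pad)
```

Because the tokens do not overlap, merging them back and dropping the padding recovers the original vector exactly, so a generator that emits one token at a time can reconstruct a full set of parameters.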





## Environment
Before you start, set up a conda environment.
1. Create your conda environment.
```shell
conda create -n rpg python=3.11
conda activate rpg
conda install pytorch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 pytorch-cuda=12.1 -c pytorch -c nvidia
```
2. Install mamba-ssm. (If you run into compilation issues, refer to the [official mamba-ssm repository](https://github.com/state-spaces/mamba) for details.)
```shell
pip install causal-conv1d
pip install mamba-ssm[causal-conv1d]
```
3. Install other dependencies for this repository.
```shell
git lfs install
git clone https://huggingface.co/MTDoven/Recurrent-Parameter-Generation.git
cd Recurrent-Parameter-Generation
pip install -r requirements.txt
```
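
After the three steps above, a quick check along the following lines can confirm that the main packages are importable. This is only a sketch: the list of module names is an assumption based on the packages installed above (`causal_conv1d` and `mamba_ssm` are the import names of the pip packages), and the helper `find_missing` is hypothetical rather than part of this repository.

```python
import importlib.util
from typing import Iterable, List


def find_missing(modules: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


# Assumed import names for the packages installed in the steps above.
REQUIRED = ["torch", "torchvision", "causal_conv1d", "mamba_ssm"]

missing = find_missing(REQUIRED)
print("missing:", ", ".join(missing) if missing else "none")
```

The check only looks the modules up and does not import them, so it will not catch a broken CUDA build; importing `mamba_ssm` directly is the stricter test.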




## Quick Start
Try generating a model with RPG.
```shell
cd ./workspace
CUDA_VISIBLE_DEVICES=0 sh demo.sh
# CUDA_VISIBLE_DEVICES=<GPU_index> sh demo.sh
```
<details>
<summary>Here are some examples.</summary>

```yaml
description: "Give me a model to select all living things"
expected_class: [0,0,1,1,1,1,1,1,0,0]  # bird, cat, deer, dog, frog, horse

description: "Find all vehicles that operate on roads"
expected_class: [0,1,0,0,0,0,0,0,0,1]  # automobile, truck

description: "Select all things that can fly"
expected_class: [1,0,1,0,0,0,0,0,0,0]  # airplane, bird

description: "Find all transportation methods that travel on water"
expected_class: [0,0,0,0,0,0,0,0,1,0]  # ship

description: "Classify all mammals"
expected_class: [0,0,0,1,1,1,0,1,0,0]  # cat, deer, dog, horse

description: "Find all animals with fur"
expected_class: [0,0,1,1,1,1,0,1,0,0]  # bird, cat, deer, dog, horse

description: "Select all pets commonly found in households"
expected_class: [0,0,1,1,0,1,0,0,0,0]  # bird, cat, dog

description: "Identify all cold-blooded animals"
expected_class: [0,0,0,0,0,0,1,0,0,0]  # frog

description: "Find all objects that can carry cargo"
expected_class: [1,1,0,0,0,0,0,0,1,1]  # airplane, automobile, ship, truck

description: "Select all things used for commercial transportation"
expected_class: [1,1,0,0,0,0,0,0,1,1]  # airplane, automobile, ship, truck

description: "Identify all animals that can swim naturally"
expected_class: [0,0,0,1,0,0,1,0,0,0]  # cat, frog

description: "Find all things with wheels"
expected_class: [1,1,0,0,0,0,0,0,0,1]  # airplane, automobile, truck

description: "Select all creatures with four legs"
expected_class: [0,0,0,1,1,1,0,1,0,0]  # cat, deer, dog, horse

description: "Identify all creatures that live in forests"
expected_class: [0,0,1,1,1,1,0,0,0,0]  # bird, cat, deer, dog

description: "Find all animals that can live near water"
expected_class: [0,0,1,0,0,0,1,0,0,0]  # bird, frog

description: "Select all man-made objects"
expected_class: [1,1,0,0,0,0,0,0,1,1]  # airplane, automobile, ship, truck

description: "Find all things that make noise naturally"
expected_class: [0,0,1,1,1,1,1,1,0,0]  # all animals

description: "Identify all animals that can climb trees"
expected_class: [0,0,1,1,0,1,0,0,0,0]  # bird, cat, dog

description: "Select all animals that hunt other animals"
expected_class: [0,0,0,1,0,1,0,0,0,0]  # cat, dog

description: "Find all things that are both man-made and can operate on water"
expected_class: [0,0,0,0,0,0,0,0,1,0]  # ship

description: "Select all animals that are both pets and can climb"
expected_class: [0,0,0,1,0,1,0,0,0,0]  # cat, dog
```

</details>
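
Each `expected_class` vector in the examples is a 10-element binary mask over the classes named in the comments. The sketch below decodes such a mask into class names. The class order is inferred from those comments (it matches the usual CIFAR-10 ordering, which is an assumption here), and `decode_expected_class` is a hypothetical helper, not part of the demo script.

```python
from typing import List, Sequence

# Class order inferred from the example comments; assumed to follow CIFAR-10.
CLASS_NAMES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def decode_expected_class(mask: Sequence[int]) -> List[str]:
    """Map a binary mask to the names of the selected classes."""
    if len(mask) != len(CLASS_NAMES):
        raise ValueError(f"expected {len(CLASS_NAMES)} entries, got {len(mask)}")
    return [name for name, flag in zip(CLASS_NAMES, mask) if flag == 1]


print(decode_expected_class([0, 1, 0, 0, 0, 0, 0, 0, 0, 1]))
# ['automobile', 'truck']
```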

You can find more information on [GitHub](https://github.com/NUS-HPC-AI-Lab/Recurrent-Parameter-Generation) and the [Project Page](https://NUS-HPC-AI-Lab.github.io/Recurrent-Parameter-Generation/).




## Acknowledgment
We thank 
[Zhiyuan Liang](https://jerryliang24.github.io/),
[Zhuang Liu](https://liuzhuang13.github.io/),
[Gongfan Fang](https://fangggf.github.io/),
[Xuanlei Zhao](https://oahzxl.github.io/),
[Yuhao Zhou](https://github.com/Soptq),
[Mingjia Shi](https://bdemo.github.io/homepage),
[Zangwei Zheng](https://zhengzangw.github.io/), 
[Ziheng Qin](https://henryqin1997.github.io/ziheng_qin/),
and [Tianlong Chen](https://tianlong-chen.github.io/)
for valuable discussions and feedback.
This research is supported by the National Research Foundation, 
Singapore under its AI Singapore Programme 
(AISG Award No: AISG2-PhD-2021-08-008).


## Citation
```
@misc{wang2025recurrent,
      title={Scaling Up Parameter Generation: A Recurrent Diffusion Approach},
      author={Wang, Kai and Tang, Dongwen and Zhao, Wangbo and Schürholt, Konstantin and Wang, Zhangyang and You, Yang},
      year={2025},
      eprint={2501.11587},
      archivePrefix={arXiv},
      url={https://arxiv.org/abs/2501.11587},
}
```