	---
	datasets:
	- MTDoven/ViTTiny1022
	language:
	- en
	metrics:
	- accuracy
	pipeline_tag: any-to-any
	---


	# Scaling Up Parameter Generation: A Recurrent Diffusion Approach
	[Paper](https://arxiv.org/pdf/2501.11587) \| [Project Page](https://NUS-HPC-AI-Lab.github.io/Recurrent-Parameter-Generation/) \| [GitHub](https://github.com/NUS-HPC-AI-Lab/Recurrent-Parameter-Generation) \| [Twitter](https://x.com/VictorKaiWang1/status/1881380005118419435)


	## Abstract
	Parameter generation has long struggled to match the scale of today's large vision and language
	models, curbing its broader utility. In this paper, we introduce Recurrent Diffusion for Large-Scale
	Parameter Generation (RPG), a novel framework that generates full neural network parameters—up
	to hundreds of millions—on a single GPU. Our approach first partitions a network's parameters
	into non-overlapping 'tokens', each corresponding to a distinct portion of the model. A recurrent
	mechanism then learns the inter-token relationships, producing 'prototypes' which serve as conditions
	for a diffusion process that ultimately synthesizes the parameters. Across a spectrum of
	architectures and tasks—including ResNets, ConvNeXts and ViTs on ImageNet-1K and COCO,
	and even LoRA-based LLMs—RPG achieves performance on par with fully trained networks while
	avoiding excessive memory overhead. Notably, it generalizes beyond its training set to generate
	valid parameters for previously unseen tasks, highlighting its flexibility in open-ended
	scenarios. By overcoming the longstanding memory and scalability barriers,
	RPG serves as a critical advance in 'AI generating AI', potentially
	enabling efficient weight generation at scales previously deemed infeasible.





	## Environment
	Set up a conda environment before you get started.
	1. Create your conda environment.
	```shell
	conda create -n rpg python=3.11
	conda activate rpg
	conda install pytorch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 pytorch-cuda=12.1 -c pytorch -c nvidia
	```
	2. Install mamba-ssm. (If you run into compilation issues, refer to the [official mamba-ssm repository](https://github.com/state-spaces/mamba) for details.)
	```shell
	pip install causal-conv1d
	pip install "mamba-ssm[causal-conv1d]"  # quoted so shells such as zsh do not expand the brackets
	```
	3. Install other dependencies for this repository.
	```shell
	git lfs install
	git clone https://huggingface.co/MTDoven/Recurrent-Parameter-Generation.git
	cd Recurrent-Parameter-Generation
	pip install -r requirements.txt
	```
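Once the three steps above have finished, a quick import check can confirm that the environment is usable before running anything heavier. The snippet below is only a sketch: `check_module` is a hypothetical helper that is not part of this repository, and the `PYTHON` override variable is an assumption for systems where the interpreter is not called `python`.

```shell
# Hypothetical helper (not part of this repository): report whether a Python
# module can be imported by the active interpreter.
# PYTHON is an assumed override; it defaults to the `python` on your PATH.
check_module() {
    if "${PYTHON:-python}" -c "import $1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check the packages installed in the steps above. `|| true` keeps the loop
# going so that every missing module is reported, not just the first one.
for module in torch torchvision torchaudio causal_conv1d mamba_ssm; do
    check_module "$module" || true
done
```

If every line prints `ok`, running `python -c "import torch; print(torch.cuda.is_available())"` is a reasonable next check that PyTorch can see your GPU.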




	## Quick Start
	Try generating parameters with the RPG model.
	```shell
	cd ./workspace
	CUDA_VISIBLE_DEVICES=0 sh demo.sh
	# CUDA_VISIBLE_DEVICES=<GPU_index> sh demo.sh
	```
	<details>
	<summary>Here are some examples.</summary>

	```text
	description: "Give me a model to select all living things"
	expected_class: [0,0,1,1,1,1,1,1,0,0] # bird, cat, deer, dog, frog, horse

	description: "Find all vehicles that operate on roads"
	expected_class: [0,1,0,0,0,0,0,0,0,1] # automobile, truck

	description: "Select all things that can fly"
	expected_class: [1,0,1,0,0,0,0,0,0,0] # airplane, bird

	description: "Find all transportation methods that travel on water"
	expected_class: [0,0,0,0,0,0,0,0,1,0] # ship

	description: "Classify all mammals"
	expected_class: [0,0,0,1,1,1,0,1,0,0] # cat, deer, dog, horse

	description: "Find all animals with fur"
	expected_class: [0,0,1,1,1,1,0,1,0,0] # bird, cat, deer, dog, horse

	description: "Select all pets commonly found in households"
	expected_class: [0,0,1,1,0,1,0,0,0,0] # bird, cat, dog

	description: "Identify all cold-blooded animals"
	expected_class: [0,0,0,0,0,0,1,0,0,0] # frog

	description: "Find all objects that can carry cargo"
	expected_class: [1,1,0,0,0,0,0,0,1,1] # airplane, automobile, ship, truck

	description: "Select all things used for commercial transportation"
	expected_class: [1,1,0,0,0,0,0,0,1,1] # airplane, automobile, ship, truck

	description: "Identify all animals that can swim naturally"
	expected_class: [0,0,0,1,0,0,1,0,0,0] # cat, frog

	description: "Find all things with wheels"
	expected_class: [1,1,0,0,0,0,0,0,0,1] # airplane, automobile, truck

	description: "Select all creatures with four legs"
	expected_class: [0,0,0,1,1,1,0,1,0,0] # cat, deer, dog, horse

	description: "Identify all creatures that live in forests"
	expected_class: [0,0,1,1,1,1,0,0,0,0] # bird, cat, deer, dog

	description: "Find all animals that can live near water"
	expected_class: [0,0,1,0,0,0,1,0,0,0] # bird, frog

	description: "Select all man-made objects"
	expected_class: [1,1,0,0,0,0,0,0,1,1] # airplane, automobile, ship, truck

	description: "Find all things that make noise naturally"
	expected_class: [0,0,1,1,1,1,1,1,0,0] # all animals

	description: "Identify all animals that can climb trees"
	expected_class: [0,0,1,1,0,1,0,0,0,0] # bird, cat, dog

	description: "Select all animals that hunt other animals"
	expected_class: [0,0,0,1,0,1,0,0,0,0] # cat, dog

	description: "Find all things that are both man-made and can operate on water"
	expected_class: [0,0,0,0,0,0,0,0,1,0] # ship

	description: "Select all animals that are both pets and can climb"
	expected_class: [0,0,0,1,0,1,0,0,0,0] # cat, dog
	```

	</details>
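Each `expected_class` vector in the examples is a 0/1 mask over ten classes. To make a mask easier to read, it can be decoded back into class names, as in the sketch below. The class order is inferred from the comments in the examples (it matches the standard CIFAR-10 label order), and `decode_mask` is a hypothetical helper, not something shipped with the demo.

```shell
# Class names in index order, inferred from the comments in the examples.
CLASS_NAMES="airplane automobile bird cat deer dog frog horse ship truck"

# Hypothetical helper: print the names selected by a 0/1 mask such as
# "[0,1,0,0,0,0,0,0,0,1]".
decode_mask() {
    # Replace everything that is not 0 or 1 with a space, leaving one word per bit.
    bits=$(printf '%s\n' "$1" | sed 's/[^01]/ /g')
    result=""
    # Load the class names into the positional parameters ($1 is airplane, ...).
    set -- $CLASS_NAMES
    for bit in $bits; do
        [ "$#" -gt 0 ] || break          # stop if the mask is longer than the class list
        if [ "$bit" = "1" ]; then
            result="${result:+$result, }$1"
        fi
        shift                            # move on to the next class name
    done
    echo "$result"
}

decode_mask "[0,1,0,0,0,0,0,0,0,1]"   # prints: automobile, truck
decode_mask "[1,0,1,0,0,0,0,0,0,0]"   # prints: airplane, bird
```

The same mapping works in the other direction when writing your own descriptions: put a 1 at the index of every class the description should select.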

	For more information, see the [GitHub repository](https://github.com/NUS-HPC-AI-Lab/Recurrent-Parameter-Generation) and the [project page](https://NUS-HPC-AI-Lab.github.io/Recurrent-Parameter-Generation/).




	## Acknowledgment
	We thank
	[Zhiyuan Liang](https://jerryliang24.github.io/),
	[Zhuang Liu](https://liuzhuang13.github.io/),
	[Gongfan Fang](https://fangggf.github.io/),
	[Xuanlei Zhao](https://oahzxl.github.io/),
	[Yuhao Zhou](https://github.com/Soptq),
	[Mingjia Shi](https://bdemo.github.io/homepage),
	[Zangwei Zheng](https://zhengzangw.github.io/),
	[Ziheng Qin](https://henryqin1997.github.io/ziheng_qin/),
	and [Tianlong Chen](https://tianlong-chen.github.io/)
	for valuable discussions and feedback.
	This research is supported by the National Research Foundation,
	Singapore under its AI Singapore Programme
	(AISG Award No: AISG2-PhD-2021-08-008).


	## Citation
	```
	@misc{wang2025recurrent,
	  title={Scaling Up Parameter Generation: A Recurrent Diffusion Approach},
	  author={Wang, Kai and Tang, Dongwen and Zhao, Wangbo and Schürholt, Konstantin and Wang, Zhangyang and You, Yang},
	  year={2025},
	  eprint={2501.11587},
	  archivePrefix={arXiv},
	}
	```