	---
	license: apache-2.0
	tags:
	- parameter-generation
	- diffusion
	- personalization
	- text-to-model
	- neural-network-diffusion
	- image-classification
	datasets:
	- cifar100
	language:
	- en
	pipeline_tag: other
	---

	# Tina: Text-to-Model Generative AI (CIFAR-100, CNN)

	Tina is a text-conditioned neural network diffusion model that generates personalized image classifiers from natural language prompts. Given a text description of the desired classification task (e.g., a list of class names), Tina directly outputs the full parameters of a lightweight CNN, with no gradient-based training required at inference time.

	This checkpoint is the Tina model trained on CIFAR-100, capable of generating 10-class personalized CNN classifiers (~5K parameters) from text prompts.

	## Model Description

	| Property | Value |
	|---|---|
	| Architecture | Diffusion Transformer (DiT), GPT-2 style backbone |
	| Text Encoder | CLIP ViT-B/32 (frozen) |
	| Hidden Size | 2048 |
	| Transformer Layers | 12 encoder layers + 12 decoder layers |
	| Attention Heads | 16 |
	| Diffusion Steps | 1000 (DDPM sampling) |
	| Prediction Type | Signal prediction (x₀) |
	| Generated Model | 2-layer CNN, ~5K parameters |
	| Max Classification Classes | 10 |
	| Training p-Models | 1000 personalized models |
	| Training Dataset | CIFAR-100 (100 classes, 32×32 images) |
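
To make the "~5K parameters" figure concrete, the sketch below counts the parameters of one hypothetical small CNN and slices a flat generated vector into per-layer chunks. The layer names and shapes in `LAYER_SHAPES` are assumptions chosen only to land near 5K parameters; they are not taken from the released architecture.

```python
from math import prod

# Hypothetical layer shapes (an assumption, not the released architecture):
# two 3x3 conv layers followed by a 10-way linear head on pooled features.
LAYER_SHAPES = {
    "conv1.weight": (16, 3, 3, 3),
    "conv1.bias": (16,),
    "conv2.weight": (32, 16, 3, 3),
    "conv2.bias": (32,),
    "fc.weight": (10, 32),
    "fc.bias": (10,),
}


def count_parameters(shapes):
    """Total number of scalar parameters across all layers."""
    return sum(prod(shape) for shape in shapes.values())


def split_flat_vector(flat, shapes):
    """Slice a flat parameter vector into per-layer chunks (kept as flat lists)."""
    expected = count_parameters(shapes)
    if len(flat) != expected:
        raise ValueError(f"expected {expected} values, got {len(flat)}")
    chunks, offset = {}, 0
    for name, shape in shapes.items():
        size = prod(shape)
        chunks[name] = flat[offset:offset + size]
        offset += size
    return chunks


total = count_parameters(LAYER_SHAPES)
chunks = split_flat_vector([0.0] * total, LAYER_SHAPES)
print(total)  # 5418
```

A generator that emits one flat vector per prompt only needs a fixed layout like this to turn its output back into a usable classifier.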

	## How It Works

	Tina treats model generation as a conditional diffusion process. Just as text-to-image diffusion models denoise random pixels into coherent images, Tina denoises random vectors into functional neural network parameters.

	1. Training: Tina is trained on (task description, personalized model) pairs. Each personalized model is a CNN fine-tuned on a specific 10-class subset of CIFAR-100.
	2. Inference: Given a text prompt listing the desired classes (e.g., `["apple", "bear", "bicycle", "bus", "castle", "clock", "cloud", "forest", "mountain", "train"]`), Tina generates a complete CNN classifier in a single sampling run of 1000 denoising steps, with no gradient updates to the generated model.
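
The inference step can be pictured as standard DDPM ancestral sampling with an x₀-predicting denoiser. The sketch below is a minimal pure-Python illustration, not the released implementation: the linear beta schedule, the names `linear_betas` and `sample_parameters`, and the stand-in `toy_predict_x0` (which returns the conditioning vector in place of the real CLIP-conditioned transformer) are all assumptions.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (assumed here; the checkpoint's schedule may differ)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def sample_parameters(predict_x0, cond, dim, num_steps=1000, seed=0):
    """DDPM ancestral sampling where the denoiser predicts the clean signal x0."""
    rng = random.Random(seed)
    betas = linear_betas(num_steps)
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)

    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # start from pure noise
    for t in reversed(range(num_steps)):
        x0_hat = predict_x0(x, t, cond)
        abar_t = alpha_bars[t]
        abar_prev = alpha_bars[t - 1] if t > 0 else 1.0
        # Mean and std of the Gaussian posterior q(x_{t-1} | x_t, x0_hat).
        coef_x0 = math.sqrt(abar_prev) * betas[t] / (1.0 - abar_t)
        coef_xt = math.sqrt(1.0 - betas[t]) * (1.0 - abar_prev) / (1.0 - abar_t)
        sigma = math.sqrt(betas[t] * (1.0 - abar_prev) / (1.0 - abar_t))
        x = [coef_x0 * a + coef_xt * b + sigma * rng.gauss(0.0, 1.0)
             for a, b in zip(x0_hat, x)]
    return x


def toy_predict_x0(x_t, t, cond):
    """Hypothetical stand-in for the text-conditioned denoiser."""
    return list(cond)


target = [0.5, -1.0, 2.0]
params = sample_parameters(toy_predict_x0, cond=target, dim=len(target), num_steps=50)
```

At the last step the posterior variance is zero, so with the toy denoiser the sample collapses onto the predicted x₀. In the real model, the resulting vector would be reshaped into the CNN's weights.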

	Thanks to the vision-language alignment of CLIP, Tina also supports:
	- Image prompts: Zero-shot and few-shot image-prompted generation
	- Natural language descriptions: Using class descriptions instead of class names
	- Unseen classes: Generalization to classes not seen during training
	- Variable class counts: Any number of classes up to 10 via classification sequence padding
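
As a rough illustration of the padding idea for variable class counts, the sketch below pads a shorter class list to the fixed length of 10 and returns a mask of the real entries. The `<pad>` token and the function name `pad_class_prompt` are hypothetical; how the released code encodes padding may differ.

```python
MAX_CLASSES = 10
PAD_TOKEN = "<pad>"  # hypothetical placeholder, not taken from the Tina code


def pad_class_prompt(class_names, max_classes=MAX_CLASSES, pad_token=PAD_TOKEN):
    """Pad a class list to a fixed length and mark which slots are real classes."""
    if not class_names:
        raise ValueError("at least one class name is required")
    if len(class_names) > max_classes:
        raise ValueError(f"at most {max_classes} classes are supported")
    n_pad = max_classes - len(class_names)
    padded = list(class_names) + [pad_token] * n_pad
    mask = [True] * len(class_names) + [False] * n_pad
    return padded, mask


padded, mask = pad_class_prompt(["apple", "bear", "bus"])
print(padded[:4])  # ['apple', 'bear', 'bus', '<pad>']
print(sum(mask))   # 3
```

Keeping the sequence length fixed lets a single generator serve any task with up to 10 classes, with the mask indicating which output slots are meaningful.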

	## Intended Use

	- On-demand personalized classification: Quickly generate a lightweight classifier tailored to a user's specific needs without any training data or GPU-intensive fine-tuning.
	- Edge AI deployment: The generated CNN (~5K params) is extremely lightweight, suitable for resource-constrained devices.
	- Research on text-to-model generation: Exploring the paradigm of generating functional AI models from natural language.

	## Performance

	### Main Results on CIFAR-100 (10-class personalization)

	| Method | In-Distribution Accuracy (%) | Out-of-Distribution Accuracy (%) |
	|---|---|---|
	| Generic Model | 28.72 | 29.88 |
	| Classifier Selection | 64.83 | 64.15 |
	| TAPER | 67.71 | 66.85 |
	| Tina (this model) | 68.35 | 67.14 |

	### Inference Efficiency

	| Method | Time per model (CNN) |
	|---|---|
	| Pretrain + fine-tune | 94.35 s |
	| TAPER | 18.10 s |
	| Tina | 4.88 s |
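
For a quick sense of scale, the timings in the table above can be turned into relative speedups. This is plain arithmetic on the reported numbers; the helper name `speedup_over` is only illustrative.

```python
# Seconds per generated CNN, copied from the table above.
SECONDS_PER_MODEL = {
    "Pretrain + fine-tune": 94.35,
    "TAPER": 18.10,
    "Tina": 4.88,
}


def speedup_over(baseline, method="Tina", timings=SECONDS_PER_MODEL):
    """How many times longer `baseline` takes than `method`."""
    return timings[baseline] / timings[method]


for name in ("Pretrain + fine-tune", "TAPER"):
    print(f"{name}: {speedup_over(name):.1f}x slower than Tina")
# Pretrain + fine-tune: 19.3x slower than Tina
# TAPER: 3.7x slower than Tina
```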

	## Limitations

	- This checkpoint generates CNN classifiers only (2-layer, ~5K parameters) for CIFAR-100 class subsets.
	- Input images are expected to be 32×32 resolution.
	- A single Tina checkpoint is tied to one target architecture and modality; it cannot generate models for several architectures or modalities at once.
	- Performance on entirely out-of-domain classes (beyond CIFAR-100 semantic scope) may degrade.

	<!-- ## Citation

	If you use this model, please cite our paper:

	```bibtex
	@article{li2026tina,
	title={Tina: A Diffusion Neural Network for Generating Personalized AI Models from Text Prompts},
	author={Li, Zexi and Gao, Lingzhi and Cai, Dongqi and Lane, Nicholas D. and Wu, Chao},
	journal={Patterns},
	year={2026},
	publisher={Cell Press}
	}
	``` -->

	## Links

	<!-- - Paper: Patterns (Cell Press), 2026 -->
	- Code: [https://github.com/aoliliao/Tina](https://github.com/aoliliao/Tina)
	<!-- - Zenodo Archive: [https://doi.org/10.5281/zenodo.19062137](https://doi.org/10.5281/zenodo.19062137) -->