---
license: apache-2.0
tags:
  - parameter-generation
  - diffusion
  - personalization
  - text-to-model
  - neural-network-diffusion
  - image-classification
datasets:
  - cifar100
language:
  - en
pipeline_tag: other
---

# Tina: Text-to-Model Generative AI (CIFAR-100, CNN)

**Tina** is a text-conditioned neural network diffusion model that generates personalized image classifiers from natural language prompts. Given a text description of the desired classification task (e.g., a list of class names), Tina directly outputs the full parameters of a lightweight CNN, with no gradient-based training required at inference time.

This checkpoint is the Tina model trained on **CIFAR-100**, capable of generating **10-class personalized CNN classifiers** (~5K parameters) from text prompts.

## Model Description

| Property | Value |
|---|---|
| **Architecture** | Diffusion Transformer (DiT), GPT-2 style backbone |
| **Text Encoder** | CLIP ViT-B/32 (frozen) |
| **Hidden Size** | 2048 |
| **Transformer Layers** | 12 encoder layers + 12 decoder layers |
| **Attention Heads** | 16 |
| **Diffusion Steps** | 1000 (DDPM sampling) |
| **Prediction Type** | Signal prediction (x₀) |
| **Generated Model** | 2-layer CNN, ~5K parameters |
| **Max Classification Classes** | 10 |
| **Training p-Models (personalized models)** | 1000 |
| **Training Dataset** | CIFAR-100 (100 classes, 32×32 images) |
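
Because the generated model is emitted as one flat parameter vector, it has to be cut back into per-layer tensors before it can be used as a CNN. The sketch below shows that bookkeeping in plain Python. The layer names and shapes in `EXAMPLE_SHAPES` and the helper `split_flat_params` are hypothetical and chosen only for illustration; they are not the checkpoint's actual layout, which is defined in the linked code repository.

```python
from math import prod

# Hypothetical layer shapes for a small 2-conv-layer CNN with a linear head.
# These are illustrative assumptions, NOT the layout used by this checkpoint.
EXAMPLE_SHAPES = {
    "conv1.weight": (16, 3, 3, 3),
    "conv1.bias": (16,),
    "conv2.weight": (32, 16, 3, 3),
    "conv2.bias": (32,),
    "fc.weight": (10, 32),
    "fc.bias": (10,),
}


def split_flat_params(flat, shapes):
    """Cut a flat list of numbers into (shape, values) chunks, one per layer."""
    total = sum(prod(shape) for shape in shapes.values())
    if len(flat) != total:
        raise ValueError(f"expected {total} values, got {len(flat)}")
    chunks, offset = {}, 0
    for name, shape in shapes.items():
        size = prod(shape)
        chunks[name] = (shape, flat[offset:offset + size])
        offset += size
    return chunks


# Usage: a dummy vector stands in for a vector sampled from the diffusion model.
num_params = sum(prod(shape) for shape in EXAMPLE_SHAPES.values())
chunks = split_flat_params([0.0] * num_params, EXAMPLE_SHAPES)
```

With a tensor library, each chunk would then be reshaped to its stored shape and loaded into the matching layer.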

## How It Works

Tina treats model generation as a conditional diffusion process. Just as text-to-image diffusion models denoise random pixels into coherent images, Tina denoises random vectors into functional neural network parameters.

1. **Training**: Tina is trained on (task description, personalized model) pairs. Each personalized model is a CNN fine-tuned on a specific 10-class subset of CIFAR-100.
2. **Inference**: Given a text prompt listing the desired classes (e.g., `["apple", "bear", "bicycle", "bus", "castle", "clock", "cloud", "forest", "mountain", "train"]`), Tina generates a complete CNN classifier in a single sampling run of 1000 denoising steps.
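
The inference step can be sketched as standard DDPM ancestral sampling with a signal-predicting (x₀) denoiser. This is a minimal illustration, not Tina's implementation: the linear beta schedule, the function names, and the toy denoiser (which stands in for the CLIP-conditioned transformer) are all assumptions.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumption; the checkpoint may use another)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def ddpm_sample_x0(denoiser, dim, num_steps=1000, seed=0):
    """Ancestral DDPM sampling where denoiser(x_t, t) returns a guess of x_0."""
    rng = random.Random(seed)
    betas = linear_betas(num_steps)
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)

    # Start from pure Gaussian noise in parameter space.
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in reversed(range(num_steps)):
        x0_hat = denoiser(x, t)
        abar_t = alpha_bars[t]
        abar_prev = alpha_bars[t - 1] if t > 0 else 1.0
        # Mean and std of the Gaussian posterior q(x_{t-1} | x_t, x_0).
        coef_x0 = math.sqrt(abar_prev) * betas[t] / (1.0 - abar_t)
        coef_xt = math.sqrt(alphas[t]) * (1.0 - abar_prev) / (1.0 - abar_t)
        sigma = math.sqrt(betas[t] * (1.0 - abar_prev) / (1.0 - abar_t))
        x = [
            coef_x0 * x0_i + coef_xt * x_i + sigma * rng.gauss(0.0, 1.0)
            for x0_i, x_i in zip(x0_hat, x)
        ]
    return x


# Toy stand-in for the text-conditioned network: always predicts one target.
TARGET = [0.5, -1.0, 0.25, 2.0]


def toy_denoiser(x_t, t):
    return TARGET


sample = ddpm_sample_x0(toy_denoiser, dim=len(TARGET), num_steps=50)
```

At the final step the posterior variance is zero, so the sample collapses onto the denoiser's last x₀ prediction. In the real model that prediction would depend on the CLIP embedding of the prompt.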

Thanks to the vision-language alignment of CLIP, Tina also supports:
- **Image prompts**: Zero-shot and few-shot image-prompted generation
- **Natural language descriptions**: Using class descriptions instead of class names
- **Unseen classes**: Generalization to classes not seen during training
- **Variable class counts**: Any number of classes up to 10 via classification sequence padding
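
As a rough illustration of the padding idea behind variable class counts, the sketch below pads a shorter class list to the fixed length of 10 and returns a mask marking the real entries. The placeholder string `"<pad>"`, the mask, and the helper name `pad_class_prompt` are hypothetical; the actual padding scheme is defined in the linked code repository.

```python
MAX_CLASSES = 10
PAD_TOKEN = "<pad>"  # hypothetical placeholder, not taken from the Tina code


def pad_class_prompt(class_names, max_classes=MAX_CLASSES, pad_token=PAD_TOKEN):
    """Pad a list of class names to a fixed length and mark the real entries."""
    if not 1 <= len(class_names) <= max_classes:
        raise ValueError(f"expected between 1 and {max_classes} class names")
    num_pad = max_classes - len(class_names)
    padded = list(class_names) + [pad_token] * num_pad
    mask = [True] * len(class_names) + [False] * num_pad
    return padded, mask


# Usage: a 3-class task padded to the 10-slot prompt format.
padded, mask = pad_class_prompt(["apple", "bear", "bus"])
```

The mask lets downstream code ignore the classifier outputs that correspond to padded slots.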

## Intended Use

- **On-demand personalized classification**: Quickly generate a lightweight classifier tailored to a user's specific needs without any training data or GPU-intensive fine-tuning.
- **Edge AI deployment**: The generated CNN (~5K params) is extremely lightweight, suitable for resource-constrained devices.
- **Research on text-to-model generation**: Exploring the paradigm of generating functional AI models from natural language.

## Performance

### Main Results on CIFAR-100 (10-class personalization)

| Method | In-Distribution Accuracy (%) | Out-of-Distribution Accuracy (%) |
|---|---|---|
| Generic Model | 28.72 | 29.88 |
| Classifier Selection | 64.83 | 64.15 |
| TAPER | 67.71 | 66.85 |
| **Tina (this model)** | **68.35** | **67.14** |

### Inference Efficiency

| Method | Time per model (CNN) |
|---|---|
| Pretrain + fine-tune | 94.35s |
| TAPER | 18.10s |
| **Tina** | **4.88s** |
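
To put the timings in relative terms, the short calculation below divides each baseline's time by Tina's. It uses only the numbers from the table; the variable names are illustrative.

```python
# Seconds per generated CNN, copied from the table above.
seconds_per_model = {
    "Pretrain + fine-tune": 94.35,
    "TAPER": 18.10,
    "Tina": 4.88,
}

# Speedup of Tina relative to each baseline (baseline time / Tina time).
tina_time = seconds_per_model["Tina"]
speedups = {
    method: seconds / tina_time
    for method, seconds in seconds_per_model.items()
    if method != "Tina"
}

for method, factor in speedups.items():
    print(f"Tina is {factor:.1f}x faster than {method}")
```

This works out to roughly 19x faster than pretraining plus fine-tuning and about 3.7x faster than TAPER.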

## Limitations

- This checkpoint generates **CNN classifiers only** (2-layer, ~5K parameters) for **CIFAR-100** class subsets.
- Input images are expected to be 32×32 resolution.
- A single Tina checkpoint targets one architecture and one modality; it cannot generate models across several architectures or modalities at once.
- Performance on entirely out-of-domain classes (beyond CIFAR-100 semantic scope) may degrade.

<!-- ## Citation

If you use this model, please cite our paper:

```bibtex
@article{li2026tina,
  title={Tina: A Diffusion Neural Network for Generating Personalized AI Models from Text Prompts},
  author={Li, Zexi and Gao, Lingzhi and Cai, Dongqi and Lane, Nicholas D. and Wu, Chao},
  journal={Patterns},
  year={2026},
  publisher={Cell Press}
}
``` -->

## Links

<!-- - **Paper**: *Patterns* (Cell Press), 2026 -->
- **Code**: [https://github.com/aoliliao/Tina](https://github.com/aoliliao/Tina)
<!-- - **Zenodo Archive**: [https://doi.org/10.5281/zenodo.19062137](https://doi.org/10.5281/zenodo.19062137) -->