---
license: apache-2.0
datasets:
- huggan/anime-faces
pipeline_tag: unconditional-image-generation
---
# TinyDiT
TinyDiT is an **85 million parameter unconditional image generation model** trained on **21,000+ anime face images**.
## Model Details
* **Model Name:** TinyDiT
* **Architecture:** Diffusion Transformer (DiT-inspired)
* **Parameters:** 85M
* **Task:** Unconditional Image Generation
* **Dataset Size:** 21,000+ anime face images
* **VAE:** Lightweight 13M parameter VAE
* **Generation Type:** Anime face generation from random noise (no text conditioning)
* **Image Size:** 64x64px
* **GitHub Repo:** https://github.com/Nitesh1405/TinyDiT/tree/main
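To make "generation from random noise (no text conditioning)" concrete, here is a minimal sketch of an unconditional sampling loop. The card does not say which sampler or noise schedule TinyDiT uses, so this is a generic DDPM-style ancestral sampler, not the repository's code. The linear beta schedule, the 1000-step count, the 256-value latent size, and `placeholder_noise_predictor` are all assumptions made for illustration.

```python
import math
import random


def ddpm_sample(predict_noise, num_values, num_steps=1000, seed=0):
    """Generic DDPM ancestral sampling over a flat list of latent values."""
    rng = random.Random(seed)

    # Linear beta schedule (an assumption, not taken from the TinyDiT repo).
    betas = [1e-4 + (0.02 - 1e-4) * i / (num_steps - 1) for i in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars = []
    running = 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)

    # Start from pure Gaussian noise: no prompt or label is supplied anywhere.
    x = [rng.gauss(0.0, 1.0) for _ in range(num_values)]

    for t in reversed(range(num_steps)):
        eps = predict_noise(x, t)  # the network's estimate of the noise in x
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alphas[t]) for xi, ei in zip(x, eps)]
        if t > 0:
            # Add fresh noise on every step except the last one.
            sigma = math.sqrt(betas[t])
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean
    return x


def placeholder_noise_predictor(x, t):
    # Hypothetical stand-in for the trained transformer; it always predicts zero noise.
    return [0.0 for _ in x]


# 256 is an arbitrary placeholder size for the latent, chosen only for the demo.
latent = ddpm_sample(placeholder_noise_predictor, num_values=256)
print(len(latent))
```

In a real pipeline the placeholder would be replaced by the trained network, and the final latent would be passed through the VAE decoder to obtain the 64x64 image.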
## Dataset
TinyDiT was trained on a curated anime face dataset containing over 21k images.
**Dataset Repository:** `huggan/anime-faces`
## VAE
The model uses a compact **13M parameter Variational Autoencoder (VAE)** for latent-space encoding and decoding.
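The effect of working in latent space can be illustrated with a little arithmetic. The sketch below computes how many tokens a transformer would see for a 64x64 image once a VAE has downsampled it. Only the 64x64 image size comes from this card; the downsampling factor, latent channel count, and patch size are hypothetical values picked for the example, and `latent_grid` is an illustrative helper, not a function from the TinyDiT repository.

```python
def latent_grid(image_size, downsample_factor, latent_channels, patch_size):
    """Return (latent side length, token count, values per token)."""
    if image_size % downsample_factor != 0:
        raise ValueError("image_size must be divisible by downsample_factor")
    latent_size = image_size // downsample_factor
    if latent_size % patch_size != 0:
        raise ValueError("latent size must be divisible by patch_size")

    # The latent is cut into square patches; each patch becomes one token.
    tokens = (latent_size // patch_size) ** 2
    values_per_token = latent_channels * patch_size ** 2
    return latent_size, tokens, values_per_token


# 64 is the image size stated in the card; the other three values are assumptions.
side, tokens, values_per_token = latent_grid(
    image_size=64, downsample_factor=8, latent_channels=4, patch_size=1
)
print(side, tokens, values_per_token)
```

Under these assumed settings the transformer would process 64 tokens instead of 4,096 raw pixels, which is the main reason latent diffusion is cheaper than diffusion in pixel space.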
## Example Generated Images
Below are sample images generated by TinyDiT:
<p align="center" style="display: flex;">
<img src="images/sample.webp" width="64"/>
<img src="images/sample2.webp" width="64"/>
<img src="images/sample3.webp" width="64"/>
<img src="images/sample4.webp" width="64"/>
<img src="images/sample5.webp" width="64"/>
<img src="images/sample6.webp" width="64"/>
<img src="images/sample7.webp" width="64"/>
<img src="images/sample8.webp" width="64"/>
<img src="images/sample9.webp" width="64"/>
<img src="images/sample10.webp" width="64"/>
<img src="images/sample11.webp" width="64"/>
</p>
<p align="center" style="display: flex;">
<img src="images/sample22.webp" width="64"/>
<img src="images/sample12.webp" width="64"/>
<img src="images/sample13.webp" width="64"/>
<img src="images/sample14.webp" width="64"/>
<img src="images/sample15.webp" width="64"/>
<img src="images/sample16.webp" width="64"/>
<img src="images/sample17.webp" width="64"/>
<img src="images/sample18.webp" width="64"/>
<img src="images/sample19.webp" width="64"/>
<img src="images/sample20.webp" width="64"/>
<img src="images/sample21.webp" width="64"/>
</p>
## Usage
* **HuggingFace Space:** https://huggingface.co/spaces/nitesh501/TinyDiT
```bash
git clone https://github.com/Nitesh1405/TinyDiT.git && cd TinyDiT
pip install -r requirements.txt
python app.py
# The model downloads automatically on first run if wget is installed.
# Otherwise, download it from https://huggingface.co/nitesh501/tinydit and place it in the TinyDiT folder.
```
## Limitations
* Trained only on anime face data
* Unconditional generation only
* Limited diversity compared to larger diffusion models
* May occasionally generate blurry or distorted outputs
## Acknowledgements
Inspired by DiT architectures, latent diffusion models, and the open-source generative AI community.