Update README.md

# TinyDiT

TinyDiT is an **85-million-parameter unconditional image generation model** trained on **21,000+ anime face images**. It is designed to be lightweight and fast while still producing visually appealing anime-style faces.

The project explores compact diffusion transformer architectures capable of generating high-quality images with relatively low computational requirements.

## Model Details

* **Model Name:** TinyDiT
* **Architecture:** Diffusion Transformer (DiT-inspired)
* **Parameters:** 85M
* **Task:** Unconditional Image Generation
* **Dataset Size:** 21,000+ anime face images
* **VAE:** Lightweight 13M parameter VAE
* **Generation Type:** Anime face generation from random noise (no text conditioning)
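
To give a feel for what an 85M parameter budget means, here is a rough back-of-the-envelope estimate for a plain transformer backbone. This README does not state TinyDiT's depth or hidden size, so the configuration below (12 blocks, hidden size 768, MLP ratio 4) is purely hypothetical, and the helper names `block_params` and `backbone_params` are made up for this sketch. It also ignores patch embedding, timestep embedding, and any conditioning layers.

```python
def block_params(hidden_size: int, mlp_ratio: int = 4) -> int:
    """Parameter count of one standard transformer block, biases included."""
    d = hidden_size
    attention = 4 * d * d + 4 * d                    # Q, K, V and output projections
    mlp = 2 * mlp_ratio * d * d + mlp_ratio * d + d  # two linear layers
    norms = 4 * d                                    # two LayerNorms (scale and shift)
    return attention + mlp + norms


def backbone_params(depth: int, hidden_size: int, mlp_ratio: int = 4) -> int:
    """Parameter count of `depth` identical transformer blocks."""
    return depth * block_params(hidden_size, mlp_ratio)


# Hypothetical configuration, NOT taken from the TinyDiT source.
total = backbone_params(depth=12, hidden_size=768)
print(f"{total / 1e6:.1f}M parameters")  # prints "85.1M parameters"
```

Many depth and width combinations land near the same total, so this should be read as an illustration of scale and not as a description of the actual architecture.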

## Dataset

TinyDiT was trained on a curated anime face dataset containing over 21k images.

**Dataset Repository:** `bob80333/animefacesv2`

## VAE

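The model uses a compact **13M parameter Variational Autoencoder (VAE)** for latent-space encoding and decoding. Because the diffusion transformer operates on the VAE's compressed latents and not on raw pixels, it processes far fewer values per image, which lowers training cost and speeds up inference.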
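
The saving from working in latent space can be sketched with simple arithmetic. The README does not state the image resolution, the VAE's downsampling factor, the number of latent channels, or the transformer patch size, so the values used here (256 px images, 8x downsampling, 4 latent channels, patch size 2) are assumptions borrowed from common latent diffusion setups. The function names are illustrative only.

```python
def latent_shape(image_size: int, downsample: int = 8, latent_channels: int = 4):
    """Shape (channels, height, width) of the latent for a square image."""
    if image_size % downsample != 0:
        raise ValueError("image_size must be divisible by downsample")
    side = image_size // downsample
    return (latent_channels, side, side)


def compression_ratio(image_size: int, downsample: int = 8,
                      latent_channels: int = 4, image_channels: int = 3) -> float:
    """How many times fewer values the latent holds than the RGB image."""
    pixel_values = image_channels * image_size * image_size
    c, h, w = latent_shape(image_size, downsample, latent_channels)
    return pixel_values / (c * h * w)


def num_tokens(image_size: int, downsample: int = 8, patch_size: int = 2) -> int:
    """Sequence length seen by the transformer after patchifying."""
    side = image_size // downsample // patch_size
    return side * side


# Assumed values, not confirmed by the TinyDiT README.
print(latent_shape(256))                          # (4, 32, 32)
print(compression_ratio(256))                     # 48.0
print(num_tokens(256))                            # 256 tokens in latent space
print(num_tokens(256, downsample=1))              # 16384 tokens in pixel space
```

Since self-attention cost grows quadratically with sequence length, the shorter latent sequence is where most of the efficiency would come from under these assumptions.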

## Features

* Compact 85M parameter architecture
* Fast and lightweight image generation
* Anime-style face synthesis
* Efficient latent diffusion training
* Suitable for low-resource GPUs and experimentation

## Example Generated Image

Below is a sample image generated by TinyDiT:

<p align="center">
<img src="generated_sample.png" width="256"/>
</p>

The model produces soft anime-style portraits with coherent facial structure and color consistency despite its relatively small size.

## Usage

```python
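import torch
from diffusers import DiffusionPipeline

# Replace the placeholder with the actual model repository ID.
pipe = DiffusionPipeline.from_pretrained("YOUR_USERNAME/tinydit")

# Use the GPU when one is available; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
pipe.to(device)

# Unconditional generation: no prompt is passed.
image = pipe().images[0]
image.save("tinydit_sample.png")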
```

## Training

TinyDiT was trained using latent diffusion techniques on anime face images with a lightweight transformer backbone.

### Training Highlights

* 21k+ anime face dataset
* Latent-space diffusion training
* Compact transformer architecture
* Memory-efficient VAE
* Optimized for smaller GPUs
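
As a rough illustration of what latent-space diffusion training involves, the sketch below implements the forward noising step on a flat list of latent values. The README does not specify the noise schedule or training objective, so this assumes a DDPM-style linear beta schedule with 1000 steps and a noise-prediction target. All function names are hypothetical, and plain Python lists stand in for tensors to keep the example dependency-free.

```python
import math
import random


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed DDPM-style defaults)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product of (1 - beta_s) for s <= t."""
    alpha_bars, product = [], 1.0
    for beta in betas:
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars


def add_noise(latent, t, alpha_bars, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps."""
    a = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in latent]
    noisy = [math.sqrt(a) * x + math.sqrt(1.0 - a) * n
             for x, n in zip(latent, noise)]
    return noisy, noise  # the noise is the regression target for the model


betas = linear_beta_schedule()
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)
clean_latent = [0.5, -1.0, 0.25, 0.0]  # stand-in for a flattened VAE latent
noisy_latent, target_noise = add_noise(clean_latent, t=500, alpha_bars=alpha_bars, rng=rng)
```

In a training loop of this kind, the transformer would receive `noisy_latent` and the timestep, and a mean squared error between its output and `target_noise` would serve as the loss.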

## Limitations

* Trained only on anime face data
* Unconditional generation only
* Limited diversity compared to larger diffusion models
* Lower image sharpness at higher resolutions
* May occasionally generate blurry or distorted outputs

## Future Improvements

* Text-conditioned generation
* Larger and more diverse datasets
* Higher-resolution image synthesis
* Improved sampling methods
* Better facial detail consistency

## License

This repository is released under the Apache 2.0 license, as declared in the model card metadata (`license: apache-2.0`).

## Acknowledgements

Inspired by DiT architectures, latent diffusion models, and the open-source generative AI community.

Files changed (1)

README.md +3 -1

@@ -1,3 +1,5 @@
 ---
 license: apache-2.0
----
+datasets:
+- bob80333/animefacesv2
+---