nitesh501/tinydit

Unconditional Image Generation

nitesh501 committed 28 days ago

Commit 5284a4f (verified) · 1 parent: be9eb25

Update README.md

Files changed (1)

README.md +3 -38

@@ -6,9 +6,7 @@ datasets:
 # TinyDiT
-TinyDiT is an **85 million parameter unconditional image generation model** trained on **21,000+ anime face images**. The model is designed to be lightweight, efficient, and fast while still producing visually appealing anime-style face generations.
-The project explores compact diffusion transformer architectures capable of generating high-quality images with relatively low computational requirements.
 ## Model Details
@@ -26,29 +24,19 @@ TinyDiT was trained on a curated anime face dataset containing over 21k images.
 **Dataset Repository:** `YOUR_DATASET_REPO_ID`
-Replace the placeholder above with your actual Hugging Face dataset repository ID.
 ## VAE
-The model uses a compact **13M parameter Variational Autoencoder (VAE)** for latent-space encoding and decoding. This significantly reduces training cost and improves inference efficiency.
-## Features
-* Compact 85M parameter architecture
-* Fast and lightweight image generation
-* Anime-style face synthesis
-* Efficient latent diffusion training
-* Suitable for low-resource GPUs and experimentation
 ## Example Generated Image
 Below is a sample image generated by TinyDiT:
 <p align="center">
-  <img src="generated_sample.png" width="256"/>
 </p>
-The model produces soft anime-style portraits with coherent facial structure and color consistency despite its relatively small size.
 ## Usage
@@ -63,37 +51,14 @@ image = pipe().images[0]
 image.save("tinydit_sample.png")
 ```
-## Training
-TinyDiT was trained using latent diffusion techniques on anime face images with a lightweight transformer backbone.
-### Training Highlights
-* 21k+ anime face dataset
-* Latent-space diffusion training
-* Compact transformer architecture
-* Memory-efficient VAE
-* Optimized for smaller GPUs
 ## Limitations
 * Trained only on anime face data
 * Unconditional generation only
 * Limited diversity compared to larger diffusion models
-* Lower image sharpness at higher resolutions
 * May occasionally generate blurry or distorted outputs
-## Future Improvements
-* Text-conditioned generation
-* Larger and more diverse datasets
-* Higher-resolution image synthesis
-* Improved sampling methods
-* Better facial detail consistency
-## License
-Please specify the appropriate license for this repository.
 ## Acknowledgements

 # TinyDiT
+TinyDiT is an **85 million parameter unconditional image generation model** trained on **21,000+ anime face images**.
 ## Model Details
 **Dataset Repository:** `YOUR_DATASET_REPO_ID`
 ## VAE
+The model uses a compact **13M parameter Variational Autoencoder (VAE)** for latent-space encoding and decoding.
 ## Example Generated Image
 Below is a sample image generated by TinyDiT:
 <p align="center">
+  <img src="sample.png" width="256"/>
 </p>
 ## Usage
 image.save("tinydit_sample.png")
 ```
 ## Limitations
 * Trained only on anime face data
 * Unconditional generation only
 * Limited diversity compared to larger diffusion models
 * May occasionally generate blurry or distorted outputs
 ## Acknowledgements