nitesh501 committed on
Commit 5284a4f · verified · 1 Parent(s): be9eb25

Update README.md

Files changed (1)
  1. README.md +3 -38
README.md CHANGED
@@ -6,9 +6,7 @@ datasets:

  # TinyDiT

- TinyDiT is an **85 million parameter unconditional image generation model** trained on **21,000+ anime face images**. The model is designed to be lightweight, efficient, and fast while still producing visually appealing anime-style face generations.
-
- The project explores compact diffusion transformer architectures capable of generating high-quality images with relatively low computational requirements.
+ TinyDiT is an **85 million parameter unconditional image generation model** trained on **21,000+ anime face images**.

  ## Model Details

@@ -26,29 +24,19 @@ TinyDiT was trained on a curated anime face dataset containing over 21k images.

  **Dataset Repository:** `YOUR_DATASET_REPO_ID`

- Replace the placeholder above with your actual Hugging Face dataset repository ID.
-
  ## VAE

- The model uses a compact **13M parameter Variational Autoencoder (VAE)** for latent-space encoding and decoding. This significantly reduces training cost and improves inference efficiency.
-
- ## Features
+ The model uses a compact **13M parameter Variational Autoencoder (VAE)** for latent-space encoding and decoding.

- * Compact 85M parameter architecture
- * Fast and lightweight image generation
- * Anime-style face synthesis
- * Efficient latent diffusion training
- * Suitable for low-resource GPUs and experimentation

  ## Example Generated Image

  Below is a sample image generated by TinyDiT:

  <p align="center">
- <img src="generated_sample.png" width="256"/>
+ <img src="sample.png" width="256"/>
  </p>

- The model produces soft anime-style portraits with coherent facial structure and color consistency despite its relatively small size.

  ## Usage

@@ -63,37 +51,14 @@ image = pipe().images[0]
  image.save("tinydit_sample.png")
  ```

- ## Training
-
- TinyDiT was trained using latent diffusion techniques on anime face images with a lightweight transformer backbone.
-
- ### Training Highlights
-
- * 21k+ anime face dataset
- * Latent-space diffusion training
- * Compact transformer architecture
- * Memory-efficient VAE
- * Optimized for smaller GPUs

  ## Limitations

  * Trained only on anime face data
  * Unconditional generation only
  * Limited diversity compared to larger diffusion models
- * Lower image sharpness at higher resolutions
  * May occasionally generate blurry or distorted outputs

- ## Future Improvements
-
- * Text-conditioned generation
- * Larger and more diverse datasets
- * Higher-resolution image synthesis
- * Improved sampling methods
- * Better facial detail consistency
-
- ## License
-
- Please specify the appropriate license for this repository.

  ## Acknowledgements
