---
language:
- km
pipeline_tag: text-to-image
---

## Model Description

This project explores Khmer text-to-image generation, inspired by the architecture of Stable Diffusion. It builds upon the base model [`channudam/unet2dcon-khm-35`](https://huggingface.co/channudam/unet2dcon-khm-35) by integrating key components from the Stable Diffusion framework, which enhances image quality, provides better control, and offers more flexibility for downstream tasks.

- **Developed by:** Mr. Channudam Ray
- **Funded by:** Factory.io
- **Model type:** Stable Diffusion-based
- **Language:** Khmer (Central dialect)

## Fine-Tuning

This is a base model, intended to be fine-tuned for specific tasks or datasets. It was trained on images at a resolution of **128×64**, but the resolution can be adjusted during fine-tuning to match your desired output size.
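Whatever resolution you choose, note that Stable-Diffusion-style VAEs downsample each spatial dimension by a fixed factor (8 in standard Stable Diffusion), so height and width should stay multiples of that factor. A quick sketch (the `latent_shape` helper is illustrative, not part of the model):

```python
def latent_shape(height, width, factor=8, latent_channels=4):
    """Latent-tensor shape for a given image resolution, assuming a
    Stable-Diffusion-style VAE that downsamples by `factor`."""
    if height % factor or width % factor:
        raise ValueError("height and width must be multiples of the VAE factor")
    return (latent_channels, height // factor, width // factor)

# The 128x64 training resolution maps to a 16x8 latent grid
print(latent_shape(64, 128))  # (4, 8, 16)
```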

For best results, fine-tune the following three main components rather than just the core UNet model:

- **Text Encoder** – [`RobertaModel`]
- **Variational Autoencoder** – [`AutoencoderKL`]
- **Image Generation Model** – [`UNet2DConditionModel`]
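One common way to fine-tune all three components jointly is to hand their parameters to a single optimizer. A minimal sketch with toy `torch.nn` modules standing in for the real `RobertaModel`, `AutoencoderKL`, and `UNet2DConditionModel` (module sizes and the learning rate are placeholders):

```python
import torch

# Toy stand-ins; in practice these are the pipeline's text encoder,
# VAE, and UNet loaded from the checkpoint.
text_encoder = torch.nn.Linear(8, 8)
vae = torch.nn.Linear(8, 8)
unet = torch.nn.Linear(8, 8)

# Collect parameters from all three so the optimizer updates them jointly,
# rather than fine-tuning the UNet alone.
params = [p for m in (text_encoder, vae, unet) for p in m.parameters()]
optimizer = torch.optim.AdamW(params, lr=1e-5)

print(len(params))  # 2 tensors (weight + bias) per module -> 6
```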

## Usage (with GPU)

```python
from diffusers import StableDiffusionPipeline
import matplotlib.pyplot as plt
import torch

# Load the pipeline in half precision and move it to the GPU
pipe = StableDiffusionPipeline.from_pretrained(
    "stable_diffusion_v1",
    torch_dtype=torch.float16,
).to("cuda")

# Generate an image from a Khmer prompt ("Battambang")
image = pipe("បាត់ដំបង", guidance_scale=2).images[0]

plt.imshow(image)
plt.axis("off")
plt.show()
```
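The `guidance_scale` argument controls classifier-free guidance: at each denoising step the pipeline blends an unconditional and a text-conditioned noise prediction. A sketch of the standard combination rule (illustrative, not the pipeline's internal code):

```python
import torch

def apply_guidance(noise_uncond, noise_cond, guidance_scale):
    # Classifier-free guidance: extrapolate from the unconditional
    # prediction toward the text-conditioned one.
    return noise_uncond + guidance_scale * (noise_cond - noise_uncond)

uncond = torch.zeros(3)
cond = torch.ones(3)
print(apply_guidance(uncond, cond, 2.0))  # tensor([2., 2., 2.])
```

A scale of 1 reproduces the conditional prediction; values above 1 push the image to follow the prompt more closely. The `guidance_scale=2` used above is mild compared with the 7–8 typical for full-size Stable Diffusion, so it is worth tuning for your prompts.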