Add Yi syllable diffusion model
- README.md +47 -0
- model_index.json +12 -0
- scheduler/scheduler_config.json +19 -0
- unet/config.json +46 -0
- unet/diffusion_pytorch_model.safetensors +3 -0
README.md
ADDED
@@ -0,0 +1,47 @@
---
license: apache-2.0
tags:
- diffusion
- unconditional-image-generation
- ddpm
- diffusers
- yi-script
library_name: diffusers
pipeline_tag: unconditional-image-generation
---

# Yi Syllable Diffusion

An unconditional **DDPM** that generates images of **Yi script syllables**
(Unicode block `U+A000`–`U+A48C`). Trained on 1,165 glyphs rendered from the
`NotoSansYi-Regular` font.

## Usage

```python
from diffusers import DDPMPipeline
pipe = DDPMPipeline.from_pretrained("pratik220704/yi-syllable-diffusion").to("cuda")
image = pipe(num_inference_steps=50).images[0]
image.save("yi.png")
```

## Training data
1,165 grayscale 64×64 PNGs, one per Yi syllable, rendered with PIL from
`NotoSansYi-Regular.ttf`.
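
The dataset description above can be sketched as follows. This is a minimal illustration, not the author's rendering script: the function names, the 48 px font size, and the centring logic are assumptions. Only the code-point range, the 64×64 grayscale canvas, and the use of PIL come from the text above.

```python
import unicodedata

# Assigned Yi syllables run from U+A000 to U+A48C inclusive.
YI_FIRST, YI_LAST = 0xA000, 0xA48C


def yi_syllables():
    """Return every Yi syllable in the range as a one-character string."""
    return [chr(cp) for cp in range(YI_FIRST, YI_LAST + 1)]


def render_glyph(char, font_path, size=64, font_px=48):
    """Draw one glyph in black on a white grayscale canvas.

    font_px is an assumed value; the model card does not state it.
    """
    # Imported here so that yi_syllables() also works without Pillow.
    from PIL import Image, ImageDraw, ImageFont

    font = ImageFont.truetype(font_path, font_px)
    image = Image.new("L", (size, size), color=255)
    draw = ImageDraw.Draw(image)
    left, top, right, bottom = draw.textbbox((0, 0), char, font=font)
    # Shift the drawing origin so the glyph's bounding box is centred.
    x = (size - (right - left)) / 2 - left
    y = (size - (bottom - top)) / 2 - top
    draw.text((x, y), char, fill=0, font=font)
    return image


syllables = yi_syllables()
print(len(syllables), unicodedata.name(syllables[0]))
```

With the font file available locally, `render_glyph(syllables[0], "NotoSansYi-Regular.ttf").save("a000.png")` would write a single training-style image. The count printed by the last line should match the 1,165 glyphs quoted above.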

## Training procedure
- Architecture: `UNet2DModel` (diffusers), 1-channel in/out, ~17 M params.
- Noise schedule: cosine-beta DDPM (1000 steps) with **zero terminal SNR**.
- Objective: **v-prediction**.
- Sampler: `DDIMScheduler`, `timestep_spacing="trailing"`, `clip_sample=True`, 50 steps.
- Optimizer: AdamW, lr 1e-4, cosine LR warmup. Epochs: 10.

The zero-SNR + v-prediction recipe is what produces crisp black-on-white glyphs
(plain epsilon-prediction yields a grey haze). FID (full dataset) ≈ 108.6.
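
To make the zero-SNR and v-prediction terms concrete, here is a minimal sketch in plain Python. It is not the training code: the function names are illustrative, and the cosine formula and rescaling step follow the commonly used formulation rather than anything stated in this card. It builds a capped cosine schedule, rescales it so the final step carries no signal, and evaluates the v-prediction target at that final step.

```python
import math


def cosine_alpha_bar(num_steps=1000, max_beta=0.999):
    """Cumulative alpha-bar values for a capped cosine beta schedule."""
    def f(t):
        return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2

    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = min(1 - f((i + 1) / num_steps) / f(i / num_steps), max_beta)
        running *= 1 - beta
        alpha_bar.append(running)
    return alpha_bar


def rescale_zero_terminal_snr(alpha_bar):
    """Shift and scale sqrt(alpha_bar) so that the last value is zero."""
    s = [math.sqrt(a) for a in alpha_bar]
    first, last = s[0], s[-1]
    return [((x - last) * first / (first - last)) ** 2 for x in s]


def v_target(x0, eps, alpha_bar_t):
    """v-prediction target: sqrt(abar) * eps - sqrt(1 - abar) * x0."""
    return math.sqrt(alpha_bar_t) * eps - math.sqrt(1 - alpha_bar_t) * x0


alpha_bar = cosine_alpha_bar()
rescaled = rescale_zero_terminal_snr(alpha_bar)
print(alpha_bar[-1] > 0, rescaled[-1])
print(v_target(x0=0.5, eps=-1.2, alpha_bar_t=rescaled[-1]))
```

At the rescaled final step the noisy input is pure noise, so an epsilon target would simply repeat the input, whereas the v target reduces to `-x0` and still depends on the clean image. That is one plausible reading of why the two objectives behave differently here.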

## Limitations
Unconditional: you cannot request a specific syllable. Quality is bounded by the
64 px resolution and short (10-epoch) training budget.

## License
Model weights: Apache-2.0. The Noto fonts are licensed under the SIL Open Font License.
model_index.json
ADDED
@@ -0,0 +1,12 @@
{
  "_class_name": "DDPMPipeline",
  "_diffusers_version": "0.31.0",
  "scheduler": [
    "diffusers",
    "DDIMScheduler"
  ],
  "unet": [
    "diffusers",
    "UNet2DModel"
  ]
}
scheduler/scheduler_config.json
ADDED
@@ -0,0 +1,19 @@
{
  "_class_name": "DDIMScheduler",
  "_diffusers_version": "0.31.0",
  "beta_end": 0.02,
  "beta_schedule": "squaredcos_cap_v2",
  "beta_start": 0.0001,
  "clip_sample": true,
  "clip_sample_range": 1.0,
  "dynamic_thresholding_ratio": 0.995,
  "num_train_timesteps": 1000,
  "prediction_type": "v_prediction",
  "rescale_betas_zero_snr": true,
  "sample_max_value": 1.0,
  "set_alpha_to_one": true,
  "steps_offset": 0,
  "thresholding": false,
  "timestep_spacing": "trailing",
  "trained_betas": null
}
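
The `timestep_spacing` setting in the scheduler config decides which of the 1000 training timesteps a 50-step sampler visits. The sketch below is a plain-Python approximation for the values in this config, not the library's own code, and the helper names are hypothetical. It contrasts "trailing" with "leading" spacing.

```python
def trailing_timesteps(num_train_timesteps=1000, num_inference_steps=50):
    """Count down from the final training step in equal strides."""
    step_ratio = num_train_timesteps / num_inference_steps
    return [
        round(num_train_timesteps - i * step_ratio) - 1
        for i in range(num_inference_steps)
    ]


def leading_timesteps(num_train_timesteps=1000, num_inference_steps=50,
                      steps_offset=0):
    """Count up from zero in equal strides, then reverse for sampling."""
    step_ratio = num_train_timesteps // num_inference_steps
    ascending = [i * step_ratio + steps_offset
                 for i in range(num_inference_steps)]
    return ascending[::-1]


trailing = trailing_timesteps()
leading = leading_timesteps()
print(trailing[:3], trailing[-1])
print(leading[:3], leading[-1])
```

With trailing spacing, sampling starts at timestep 999, the step that `rescale_betas_zero_snr` turns into pure noise. Leading spacing would start at 980 and never visit that step.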
unet/config.json
ADDED
@@ -0,0 +1,46 @@
{
  "_class_name": "UNet2DModel",
  "_diffusers_version": "0.31.0",
  "_name_or_path": "runs/full_v/unet",
  "act_fn": "silu",
  "add_attention": true,
  "attention_head_dim": 8,
  "attn_norm_num_groups": null,
  "block_out_channels": [
    64,
    128,
    128,
    256
  ],
  "center_input_sample": false,
  "class_embed_type": null,
  "down_block_types": [
    "DownBlock2D",
    "DownBlock2D",
    "AttnDownBlock2D",
    "DownBlock2D"
  ],
  "downsample_padding": 1,
  "downsample_type": "conv",
  "dropout": 0.0,
  "flip_sin_to_cos": true,
  "freq_shift": 0,
  "in_channels": 1,
  "layers_per_block": 2,
  "mid_block_scale_factor": 1,
  "norm_eps": 1e-05,
  "norm_num_groups": 32,
  "num_class_embeds": null,
  "num_train_timesteps": null,
  "out_channels": 1,
  "resnet_time_scale_shift": "default",
  "sample_size": 64,
  "time_embedding_type": "positional",
  "up_block_types": [
    "UpBlock2D",
    "AttnUpBlock2D",
    "UpBlock2D",
    "UpBlock2D"
  ],
  "upsample_type": "conv"
}
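
The block lists in the UNet config can be read as a resolution ladder. The following sketch walks the encoder side and reports the channel width and feature-map size each block works at. The helper name is hypothetical, and the sketch assumes that every down block except the last halves the resolution.

```python
# Values copied from the UNet config above.
config = {
    "sample_size": 64,
    "block_out_channels": [64, 128, 128, 256],
    "down_block_types": [
        "DownBlock2D",
        "DownBlock2D",
        "AttnDownBlock2D",
        "DownBlock2D",
    ],
}


def trace_down_path(cfg):
    """Return (block type, channels, resolution) for each encoder block."""
    size = cfg["sample_size"]
    last = len(cfg["down_block_types"]) - 1
    rows = []
    pairs = zip(cfg["down_block_types"], cfg["block_out_channels"])
    for index, (kind, channels) in enumerate(pairs):
        rows.append((kind, channels, size))
        if index != last:  # assumed: the final block does not downsample
            size //= 2
    return rows


for kind, channels, size in trace_down_path(config):
    print(f"{kind:<16} {channels:>4} channels at {size}x{size}")
```

Under that assumption, the single attention block in the encoder operates on 16×16 feature maps, which keeps self-attention inexpensive for 64×64 inputs.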
unet/diffusion_pytorch_model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:874d87e631d36dde353bb35e7a6b8a96e1f912d53cfe2a91c57f3e29cdda3511
size 68897084
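
The three lines above are a Git LFS pointer, not the weights themselves. As a rough sketch with illustrative helper names, the pointer can be parsed to check a downloaded file against its recorded hash and size. The parameter estimate at the end assumes float32 storage and ignores the small safetensors header, neither of which is stated in the commit.

```python
import hashlib

POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:874d87e631d36dde353bb35e7a6b8a96e1f912d53cfe2a91c57f3e29cdda3511
size 68897084
"""


def parse_lfs_pointer(text):
    """Split a pointer file into hash algorithm, hex digest, and byte size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at path has the recorded size and digest."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
# Assumption: 4 bytes per parameter (float32), header bytes ignored.
approx_params = pointer["size"] / 4
print(f"about {approx_params / 1e6:.1f} M parameters")
```

Under the float32 assumption the file size works out to roughly 17.2 M parameters, which is consistent with the "~17 M params" figure in the README.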