Commit 39fa6c7 (verified) by pratik220704 · 1 parent: 7960b6e

Add Yi syllable diffusion model

README.md ADDED
@@ -0,0 +1,47 @@
+ ---
+ license: apache-2.0
+ tags:
+ - diffusion
+ - unconditional-image-generation
+ - ddpm
+ - diffusers
+ - yi-script
+ library_name: diffusers
+ pipeline_tag: unconditional-image-generation
+ ---
+
+ # Yi Syllable Diffusion
+
+ An unconditional **DDPM** that generates images of **Yi script syllables**
+ (code points `U+A000`–`U+A48C` of the Unicode Yi Syllables block). Trained on
+ 1,165 glyphs rendered from the `NotoSansYi-Regular` font.
+
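+ ## Usage
+
+ ```python
+ from diffusers import DDPMPipeline
+
+ # Load the pipeline and move it to the GPU; use .to("cpu") if CUDA is unavailable.
+ pipe = DDPMPipeline.from_pretrained("pratik220704/yi-syllable-diffusion").to("cuda")
+ # 50 steps matches the sampler setting listed under "Training procedure".
+ image = pipe(num_inference_steps=50).images[0]
+ image.save("yi.png")
+ ```
+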
+ ## Training data
+ 1,165 grayscale 64×64 PNGs, one per Yi syllable, rendered with PIL from
+ `NotoSansYi-Regular.ttf`.
+
+ ## Training procedure
+ - Architecture: `UNet2DModel` (diffusers), 1-channel in/out, ~17 M params.
+ - Noise schedule: cosine-beta DDPM (1000 steps) with **zero terminal SNR**.
+ - Objective: **v-prediction**.
+ - Sampler: `DDIMScheduler`, `timestep_spacing="trailing"`, `clip_sample=True`, 50 steps.
+ - Optimizer: AdamW, lr 1e-4, cosine LR schedule with warmup. Epochs: 10.
+
+ The zero-terminal-SNR + v-prediction recipe is what produces crisp black-on-white
+ glyphs (plain epsilon-prediction yields a grey haze). FID (full dataset) ≈ 108.6.
+
+ ## Limitations
+ Unconditional: you cannot request a specific syllable. Quality is bounded by the
+ 64 px resolution and the short (10-epoch) training budget.
+
+ ## License
+ Model weights: Apache-2.0. The Noto fonts are licensed under the SIL Open Font License.
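
The glyph count in the README can be checked against the code point range it cites. The following is a minimal sketch, assuming the training set covers every code point from `U+A000` to `U+A48C` inclusive; the helper name `yi_syllables` is illustrative and does not appear in the repository.

```python
import unicodedata

# Inclusive code point range cited in the README.
YI_FIRST, YI_LAST = 0xA000, 0xA48C


def yi_syllables():
    """Return every character in the cited range, in code point order."""
    return [chr(cp) for cp in range(YI_FIRST, YI_LAST + 1)]


syllables = yi_syllables()
print(len(syllables))                  # number of glyphs to render
print(unicodedata.name(syllables[0]))  # Unicode name of the first syllable
```

The same list could feed a rendering loop (for example with PIL, as the README describes), one image per character.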
model_index.json ADDED
@@ -0,0 +1,12 @@
+ {
+   "_class_name": "DDPMPipeline",
+   "_diffusers_version": "0.31.0",
+   "scheduler": [
+     "diffusers",
+     "DDIMScheduler"
+   ],
+   "unet": [
+     "diffusers",
+     "UNet2DModel"
+   ]
+ }
scheduler/scheduler_config.json ADDED
@@ -0,0 +1,19 @@
+ {
+   "_class_name": "DDIMScheduler",
+   "_diffusers_version": "0.31.0",
+   "beta_end": 0.02,
+   "beta_schedule": "squaredcos_cap_v2",
+   "beta_start": 0.0001,
+   "clip_sample": true,
+   "clip_sample_range": 1.0,
+   "dynamic_thresholding_ratio": 0.995,
+   "num_train_timesteps": 1000,
+   "prediction_type": "v_prediction",
+   "rescale_betas_zero_snr": true,
+   "sample_max_value": 1.0,
+   "set_alpha_to_one": true,
+   "steps_offset": 0,
+   "thresholding": false,
+   "timestep_spacing": "trailing",
+   "trained_betas": null
+ }
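
To make the `squaredcos_cap_v2`, `rescale_betas_zero_snr`, and `v_prediction` settings concrete, here is a dependency-free sketch of what they amount to numerically. It is an illustration written for this note, not the diffusers source: the function names are hypothetical, the 0.008 offset and 0.999 beta cap are assumed to match the usual squared-cosine formulation, and `x0_from_v` is a scalar simplification. As far as I know, with this schedule the betas come from the cosine curve, so `beta_start` and `beta_end` do not shape them.

```python
import math


def cosine_betas(num_steps, max_beta=0.999):
    """Betas for a squared-cosine schedule, capped at max_beta."""
    def alpha_bar(t):
        return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2

    return [
        min(1 - alpha_bar((i + 1) / num_steps) / alpha_bar(i / num_steps), max_beta)
        for i in range(num_steps)
    ]


def cumulative_alphas(betas):
    """Running product of (1 - beta), i.e. alpha_bar_t for each step."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def rescale_zero_terminal_snr(alpha_bars):
    """Shift and scale sqrt(alpha_bar) so the last value is exactly zero."""
    roots = [math.sqrt(a) for a in alpha_bars]
    first, last = roots[0], roots[-1]
    scale = first / (first - last)
    return [((r - last) * scale) ** 2 for r in roots]


def x0_from_v(x_t, v, alpha_bar):
    """Recover the clean sample from a v-prediction (scalar version)."""
    return math.sqrt(alpha_bar) * x_t - math.sqrt(1.0 - alpha_bar) * v


original = cumulative_alphas(cosine_betas(1000))
rescaled = rescale_zero_terminal_snr(original)
print(original[-1], rescaled[-1])  # terminal alpha_bar before and after rescaling
```

Once the terminal `alpha_bar` is zero, the last noisy sample is pure noise, so predicting the noise carries no information about the image. The v target at that step reduces to the negated clean sample, which is why the two settings are typically paired.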
unet/config.json ADDED
@@ -0,0 +1,46 @@
+ {
+   "_class_name": "UNet2DModel",
+   "_diffusers_version": "0.31.0",
+   "_name_or_path": "runs/full_v/unet",
+   "act_fn": "silu",
+   "add_attention": true,
+   "attention_head_dim": 8,
+   "attn_norm_num_groups": null,
+   "block_out_channels": [
+     64,
+     128,
+     128,
+     256
+   ],
+   "center_input_sample": false,
+   "class_embed_type": null,
+   "down_block_types": [
+     "DownBlock2D",
+     "DownBlock2D",
+     "AttnDownBlock2D",
+     "DownBlock2D"
+   ],
+   "downsample_padding": 1,
+   "downsample_type": "conv",
+   "dropout": 0.0,
+   "flip_sin_to_cos": true,
+   "freq_shift": 0,
+   "in_channels": 1,
+   "layers_per_block": 2,
+   "mid_block_scale_factor": 1,
+   "norm_eps": 1e-05,
+   "norm_num_groups": 32,
+   "num_class_embeds": null,
+   "num_train_timesteps": null,
+   "out_channels": 1,
+   "resnet_time_scale_shift": "default",
+   "sample_size": 64,
+   "time_embedding_type": "positional",
+   "up_block_types": [
+     "UpBlock2D",
+     "AttnUpBlock2D",
+     "UpBlock2D",
+     "UpBlock2D"
+   ],
+   "upsample_type": "conv"
+ }
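
The block lists above are easier to read alongside the spatial resolution each block sees. The sketch below walks the down path, assuming (as I understand `UNet2DModel` to behave) that every down block except the last ends with a 2× downsample. `down_path_shapes` is a hypothetical helper written for this note, not a diffusers function.

```python
def down_path_shapes(sample_size, block_out_channels, down_block_types):
    """Return (block type, output channels, working resolution) per down block.

    Assumes every down block except the last halves the resolution
    after its ResNet (and attention) layers have run.
    """
    shapes, size = [], sample_size
    last = len(down_block_types) - 1
    for i, (block, channels) in enumerate(zip(down_block_types, block_out_channels)):
        shapes.append((block, channels, size))
        if i != last:
            size //= 2
    return shapes


shapes = down_path_shapes(
    64,
    [64, 128, 128, 256],
    ["DownBlock2D", "DownBlock2D", "AttnDownBlock2D", "DownBlock2D"],
)
for block, channels, size in shapes:
    print(f"{block:<16} {channels:>3} ch @ {size}x{size}")
```

Under that assumption the only attention block in the down path runs at 16×16, and the bottleneck sits at 8×8.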
unet/diffusion_pytorch_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:874d87e631d36dde353bb35e7a6b8a96e1f912d53cfe2a91c57f3e29cdda3511
+ size 68897084
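
The pointer's `size` field gives a quick sanity check on the parameter count quoted in the README. A minimal sketch, assuming the weights are stored as 4-byte float32 values and that the safetensors header is negligible next to the tensor data; `parse_lfs_pointer` is an illustrative helper, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash, and size fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    hash_algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:874d87e631d36dde353bb35e7a6b8a96e1f912d53cfe2a91c57f3e29cdda3511
size 68897084
"""

pointer = parse_lfs_pointer(POINTER)
# Assumption: float32 weights, so 4 bytes per parameter (an upper bound,
# since the file also contains the safetensors header).
approx_params = pointer["size"] / 4
print(f"about {approx_params / 1e6:.1f} M parameters")
```

Under the float32 assumption this comes to roughly 17.2 M parameters, in line with the "~17 M params" figure in the README. The `digest` value can be compared against `hashlib.sha256` of the downloaded file to verify its integrity.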