---
license: apache-2.0
tags:
- diffusion
- unconditional-image-generation
- ddpm
- diffusers
- yi-script
library_name: diffusers
pipeline_tag: unconditional-image-generation
---

# Yi Syllable Diffusion

An unconditional **DDPM** that generates images of **Yi script syllables** (the assigned code points `U+A000`–`U+A48C` of the Unicode Yi Syllables block). It was trained on 1,165 glyphs rendered from the `NotoSansYi-Regular` font.
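The glyph count follows directly from the code-point range. As a minimal sketch using only the standard library (the constant and variable names are illustrative, not taken from the training code), the syllables can be enumerated like this:

```python
import unicodedata

# Range as stated in this card: U+A000 through U+A48C, inclusive.
YI_FIRST, YI_LAST = 0xA000, 0xA48C

# One character per code point in the range.
syllables = [chr(cp) for cp in range(YI_FIRST, YI_LAST + 1)]

# Official Unicode names; unicodedata.name raises ValueError for unassigned code points.
names = [unicodedata.name(ch) for ch in syllables]

print(len(syllables))
print(names[0], "...", names[-1])
```

Each entry of `syllables` corresponds to one training image.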

[Figures: denoising animation (left); quality vs. inference steps (right)]

Left: reverse diffusion (noise → glyph). Right: the same glyph sharpening as the number of inference steps increases.

## Sample output

![real vs generated](real_vs_generated.png)

Top: real glyphs (font). Bottom: generated by this model.

## Usage

```python
from diffusers import DDPMPipeline

pipe = DDPMPipeline.from_pretrained("pratik220704/yi-syllable-diffusion").to("cuda")
image = pipe(num_inference_steps=50).images[0]
image.save("yi.png")
```

## Training data

1,165 grayscale 64×64 PNGs, one per Yi syllable, rendered with PIL from `NotoSansYi-Regular.ttf`.

## Training procedure

- Architecture: `UNet2DModel` (diffusers), 1-channel in/out, ~17 M params.
- Noise schedule: cosine-beta DDPM (1000 steps) with **zero terminal SNR**.
- Objective: **v-prediction**.
- Sampler: `DDIMScheduler`, `timestep_spacing="trailing"`, `clip_sample=True`, 50 steps.
- Optimizer: AdamW, lr 1e-4, cosine LR warmup. Epochs: 10.

The zero-SNR + v-prediction recipe is what produces crisp black-on-white glyphs (plain epsilon-prediction yields a grey haze).

FID (full dataset) ≈ 108.6.

## Limitations

Unconditional: you cannot request a specific syllable. Quality is bounded by the 64 px resolution and short (10-epoch) training budget.

## License

Model weights: Apache-2.0. The Noto fonts are licensed under the SIL Open Font License.
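## Appendix: zero terminal SNR and v-prediction

The training procedure combines a cosine schedule rescaled to zero terminal SNR with a v-prediction objective. The sketch below illustrates that idea in plain Python; it is not the training code. It assumes the standard squared-cosine schedule with betas capped at 0.999 and the usual shift-and-scale rescaling of the square root of the cumulative alphas. In diffusers this would presumably correspond to `beta_schedule="squaredcos_cap_v2"` with `rescale_betas_zero_snr=True`, but the card does not state the exact scheduler arguments, so treat those as assumptions. All function names here are hypothetical.

```python
import math

def cosine_betas(num_steps, max_beta=0.999):
    """Squared-cosine beta schedule, with each beta capped at max_beta."""
    def alpha_bar(t):
        return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2
    return [
        min(1.0 - alpha_bar((i + 1) / num_steps) / alpha_bar(i / num_steps), max_beta)
        for i in range(num_steps)
    ]

def cumulative_alphas(betas):
    """Running product of (1 - beta), i.e. alpha_bar at every timestep."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out

def rescale_zero_terminal_snr(betas):
    """Shift and scale sqrt(alpha_bar) so the last value is exactly 0
    while the first value stays (numerically) unchanged."""
    sqrt_ab = [math.sqrt(a) for a in cumulative_alphas(betas)]
    first, last = sqrt_ab[0], sqrt_ab[-1]
    scale = first / (first - last)
    ab = [((s - last) * scale) ** 2 for s in sqrt_ab]
    # Recover per-step alphas from the rescaled cumulative products.
    alphas = [ab[0]] + [ab[i] / ab[i - 1] for i in range(1, len(ab))]
    return [1.0 - a for a in alphas]

def v_target(x0, eps, alpha_bar):
    """v-prediction target: sqrt(alpha_bar) * eps - sqrt(1 - alpha_bar) * x0."""
    return math.sqrt(alpha_bar) * eps - math.sqrt(1.0 - alpha_bar) * x0

betas = cosine_betas(1000)
plain = cumulative_alphas(betas)
rescaled = cumulative_alphas(rescale_zero_terminal_snr(betas))

# Terminal alpha_bar before and after rescaling.
print(plain[-1], rescaled[-1])
# v target at the terminal step for one pixel (x0 and eps are illustrative values).
print(v_target(x0=1.0, eps=0.3, alpha_bar=rescaled[-1]))
```

With the unrescaled schedule the final step still carries a small amount of signal, so sampling from pure noise does not quite match what the model saw in training. After rescaling, the final step is pure noise. There an epsilon target would simply equal the network input, whereas the v target reduces to `-x0` and still carries information about the image. This is consistent with the crisp output reported above, though the sketch does not reproduce that result.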