license: apache-2.0
tags:
- diffusion
- unconditional-image-generation
- ddpm
- diffusers
- yi-script
library_name: diffusers
pipeline_tag: unconditional-image-generation
Yi Syllable Diffusion
An unconditional DDPM that generates images of Yi script syllables
(code points U+A000–U+A48C in the Unicode Yi Syllables block). Trained on
1,165 glyphs rendered from the NotoSansYi-Regular font.
Left: reverse diffusion (noise → glyph). Right: the same glyph sharpening as the number of inference steps increases.
Sample output
Top: real glyphs (font). Bottom: generated by this model.
Usage
from diffusers import DDPMPipeline
pipe = DDPMPipeline.from_pretrained("pratik220704/yi-syllable-diffusion").to("cuda")
image = pipe(num_inference_steps=50).images[0]
image.save("yi.png")
Training data
1,165 grayscale 64×64 PNGs, one per Yi syllable, rendered with PIL from
NotoSansYi-Regular.ttf.
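A dataset like the one described above could be rebuilt along the following lines. This is only a sketch, not the author's rendering script: the helper names (`yi_syllables`, `centered_origin`, `render_glyph`), the 48 px font size, and the centering strategy are assumptions made for illustration. Only the code point range, the 64×64 grayscale format, and the use of PIL come from this card.

```python
def yi_syllables():
    """Return the 1,165 characters U+A000..U+A48C as a list of strings."""
    return [chr(cp) for cp in range(0xA000, 0xA48C + 1)]


def centered_origin(bbox, size):
    """Top-left drawing origin that centers a (left, top, right, bottom) box
    inside a size x size canvas."""
    left, top, right, bottom = bbox
    x = (size - (right - left)) / 2 - left
    y = (size - (bottom - top)) / 2 - top
    return x, y


def render_glyph(char, font_path, size=64, font_px=48):
    """Render one character as black-on-white grayscale.

    font_px=48 is a guess; the card does not state the font size used.
    """
    # Pillow is imported lazily so the pure helpers above work without it.
    from PIL import Image, ImageDraw, ImageFont

    font = ImageFont.truetype(font_path, font_px)
    img = Image.new("L", (size, size), color=255)  # white background
    draw = ImageDraw.Draw(img)
    bbox = draw.textbbox((0, 0), char, font=font)  # ink extent at origin (0, 0)
    draw.text(centered_origin(bbox, size), char, fill=0, font=font)
    return img
```

With a local copy of the font, `render_glyph(ch, "NotoSansYi-Regular.ttf").save(f"{ord(ch):04X}.png")` for each `ch` in `yi_syllables()` would produce one PNG per syllable; the file naming here is likewise hypothetical.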
Training procedure
- Architecture: UNet2DModel (diffusers), 1-channel in/out, ~17 M params.
- Noise schedule: cosine-beta DDPM (1000 steps) with zero terminal SNR.
- Objective: v-prediction.
- Sampler: DDIMScheduler, timestep_spacing="trailing", clip_sample=True, 50 steps.
- Optimizer: AdamW, lr 1e-4, cosine LR warmup.
- Epochs: 10.
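To make the noise-schedule item concrete, here is a dependency-free sketch of a cosine beta schedule followed by a zero-terminal-SNR rescale. It assumes the commonly used formulation (squared-cosine cumulative alphas with betas capped at 0.999, then shifting and scaling the square root of the cumulative alphas so the last one is exactly zero). The card does not say how its schedule was actually built, so the function names and constants below are illustrative.

```python
import math


def cosine_betas(num_steps=1000, max_beta=0.999, s=0.008):
    """Betas derived from a squared-cosine cumulative-alpha curve."""
    def alpha_bar(u):
        return math.cos((u + s) / (1 + s) * math.pi / 2) ** 2

    return [
        min(1 - alpha_bar((i + 1) / num_steps) / alpha_bar(i / num_steps), max_beta)
        for i in range(num_steps)
    ]


def alpha_bars(betas):
    """Cumulative product of (1 - beta_t)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def rescale_zero_terminal_snr(betas):
    """Rescale betas so the final cumulative alpha (and hence the SNR) is 0."""
    sqrt_ab = [math.sqrt(a) for a in alpha_bars(betas)]
    first, last = sqrt_ab[0], sqrt_ab[-1]
    # Shift so the last value becomes 0, then scale so the first is preserved.
    sqrt_ab = [(v - last) * first / (first - last) for v in sqrt_ab]
    ab = [v * v for v in sqrt_ab]
    # Convert cumulative alphas back to per-step alphas, then to betas.
    alphas = [ab[0]] + [ab[i] / ab[i - 1] for i in range(1, len(ab))]
    return [1.0 - a for a in alphas]
```

After the rescale the last beta is 1, so the final training timestep is pure noise. That is the property "zero terminal SNR" refers to, and it is why sampling is started from the last timestep ("trailing" spacing).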
The zero-SNR + v-prediction recipe is what produces crisp black-on-white glyphs (plain epsilon-prediction yields a grey haze). FID (full dataset) ≈ 108.6.
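One way to see why the two choices go together is to write out the standard v-parameterization for a single pixel. The sketch below uses scalar values and hypothetical helper names; it illustrates the general technique and is not code taken from this model's training script.

```python
import math


def add_noise(x0, eps, alpha_bar):
    """Forward process: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    return math.sqrt(alpha_bar) * x0 + math.sqrt(1 - alpha_bar) * eps


def v_target(x0, eps, alpha_bar):
    """Training target under v-prediction."""
    return math.sqrt(alpha_bar) * eps - math.sqrt(1 - alpha_bar) * x0


def x0_from_v(x_t, v, alpha_bar):
    """Recover the clean pixel from the noisy pixel and a v estimate."""
    return math.sqrt(alpha_bar) * x_t - math.sqrt(1 - alpha_bar) * v


x0, eps = -1.0, 0.3                 # a black pixel in [-1, 1] and one noise draw
x_t = add_noise(x0, eps, 0.0)       # at zero SNR the input is exactly the noise
v = v_target(x0, eps, 0.0)          # the target is -x0, which still describes the image
```

At the zero-SNR step an epsilon target would simply equal the network input and say nothing about the glyph, whereas the v target equals the negated clean image, so the model still has something to learn at that step.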
Limitations
Unconditional: you cannot request a specific syllable. Quality is bounded by the 64 px resolution and the short (10-epoch) training budget.
License
Model weights: Apache-2.0. The Noto fonts are licensed under the SIL Open Font License.
