LLaDA CoreML Diffusion Loop Examples

These scripts show how to run llada_8b_instruct_seq192.mlpackage in an iterative diffusion loop, rather than taking the argmax of a single forward pass.
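To make the difference from single-pass argmax concrete, here is a minimal Python sketch of a masked-diffusion decoding loop. It is an illustration under stated assumptions, not the logic in llada_diffuse.swift: the MASK id, the toy_predict stand-in for the CoreML model, and the "commit the most confident predictions first" rule are all hypothetical choices made for the example.

```python
import random

MASK = -1  # hypothetical placeholder id standing in for the mask token


def toy_predict(tokens):
    """Stand-in for the model (assumption): one (token, confidence) pair per position."""
    rng = random.Random(sum(1 for t in tokens if t == MASK))
    return [(100 + i, rng.random()) for i in range(len(tokens))]


def diffusion_decode(prompt_ids, max_new_tokens, steps, predict=toy_predict):
    """Start from an all-mask answer region and unmask it over several passes."""
    tokens = list(prompt_ids) + [MASK] * max_new_tokens
    for step in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        preds = predict(tokens)  # one forward pass over the whole sequence
        # Spread the remaining masked positions over the remaining steps.
        remaining_steps = steps - step
        k = -(-len(masked) // remaining_steps)  # ceiling division
        # Commit only the k most confident predictions; the rest stay masked
        # and are predicted again on the next pass with more context.
        best = sorted(masked, key=lambda i: preds[i][1], reverse=True)[:k]
        for i in best:
            tokens[i] = preds[i][0]
    return tokens


if __name__ == "__main__":
    print(diffusion_decode([1, 2, 3], max_new_tokens=8, steps=4))
```

A single-pass argmax would correspond to steps=1, where every masked position is committed from one forward pass; with more steps, later predictions can condition on tokens committed earlier.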

Files

  • llada_generate.py: text prompt -> tokenize -> run diffusion loop -> decode output
  • llada_diffuse.swift: CoreML denoising loop runner (called by the Python wrapper)

Prerequisites

  • macOS with Xcode command line tools (xcrun, swift)
  • Python 3.10+
  • Hugging Face access for the tokenizer (GSAI-ML/LLaDA-8B-Instruct)

Install Python deps:

python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install transformers sentencepiece jinja2

Quick run

From the repo root (where llada_8b_instruct_seq192.mlpackage exists):

source .venv/bin/activate
python examples/llada_generate.py "Write one short sentence about the moon." --max-new-tokens 48 --steps 32

Notes

  • Uses <|mdm_mask|> as the diffusion mask token.
  • --steps and --max-new-tokens are the main quality/speed knobs.
  • The model is loaded through CoreML (MLModel), and the .mlpackage is compiled automatically at runtime.
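As a rough way to reason about the two knobs, the sketch below plans how many masked tokens get committed on each step. It rests on assumptions not stated in these notes: that "seq192" in the package name means a fixed window of 192 tokens shared by prompt and output, and that the work is spread evenly across steps. The function name unmask_schedule is hypothetical and is not part of the example scripts.

```python
SEQ_LEN = 192  # assumption: inferred from "seq192" in the .mlpackage name


def unmask_schedule(prompt_len, max_new_tokens, steps):
    """Return a list with the number of tokens to commit on each step."""
    if max_new_tokens <= 0 or steps <= 0:
        raise ValueError("max_new_tokens and steps must be positive")
    if prompt_len + max_new_tokens > SEQ_LEN:
        raise ValueError("prompt plus new tokens exceeds the fixed window")
    # More steps than tokens would leave some steps with nothing to commit.
    effective_steps = min(steps, max_new_tokens)
    base, extra = divmod(max_new_tokens, effective_steps)
    # The first `extra` steps commit one additional token each.
    return [base + 1 if i < extra else base for i in range(effective_steps)]


if __name__ == "__main__":
    plan = unmask_schedule(prompt_len=20, max_new_tokens=48, steps=32)
    print(len(plan), "forward passes;", sum(plan), "tokens committed")
```

With the quick-run values (48 new tokens, 32 steps), this even split gives 16 steps that commit 2 tokens and 16 that commit 1. Each step costs one forward pass, so lowering --steps trades refinement passes for speed.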