# LLaDA CoreML Diffusion Loop Examples
These scripts show how to run `llada_8b_instruct_seq192.mlpackage` in an iterative diffusion loop, rather than a single-pass argmax decode.
## Files
- `llada_generate.py`: text prompt -> tokenize -> run diffusion loop -> decode output
- `llada_diffuse.swift`: CoreML denoising loop runner (called by the Python wrapper)
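The core idea behind the two scripts is a mask-predict loop: start with the prompt followed by mask tokens, and over several steps commit the model's most confident predictions while re-masking the rest. A minimal sketch in plain Python, where `predict` stands in for the CoreML model call (all names here are illustrative, not the actual script API):

```python
MASK = -1  # stands in for the <|mdm_mask|> token id

def diffusion_generate(prompt_ids, max_new_tokens, steps, predict):
    """Iteratively replace mask tokens with predictions over `steps` rounds."""
    seq = list(prompt_ids) + [MASK] * max_new_tokens
    masked = [i for i, t in enumerate(seq) if t == MASK]
    for step in range(steps):
        if not masked:
            break
        # Ask the model for a (token, confidence) pair at every masked position.
        proposals = predict(seq, masked)
        # Commit only the highest-confidence fraction this step.
        remaining_steps = steps - step
        n_commit = max(1, len(masked) // remaining_steps)
        best = sorted(masked, key=lambda i: proposals[i][1], reverse=True)[:n_commit]
        for i in best:
            seq[i] = proposals[i][0]
        masked = [i for i in masked if i not in best]
    return seq
```

The confidence-based selection here is one common scheduling choice for masked diffusion decoding; the actual scripts may order commits differently.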
## Prerequisites
- macOS with Xcode command line tools (`xcrun`, `swift`)
- Python 3.10+
- Hugging Face access for tokenizer (`GSAI-ML/LLaDA-8B-Instruct`)
Install the Python dependencies:
```bash
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install transformers sentencepiece jinja2
```
## Quick run
From the repo root (where `llada_8b_instruct_seq192.mlpackage` exists):
```bash
source .venv/bin/activate
python examples/llada_generate.py "Write one short sentence about the moon." --max-new-tokens 48 --steps 32
```
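Before the loop runs, the prompt has to be laid out in the model's fixed seq192 window: prompt tokens first, then `--max-new-tokens` mask tokens, then padding. A hedged sketch of that layout (the mask and pad ids below are illustrative placeholders, not confirmed values from the scripts):

```python
MASK_ID = -1   # placeholder for the <|mdm_mask|> token id
SEQ_LEN = 192  # fixed sequence length baked into the seq192 .mlpackage

def build_input(prompt_ids, max_new_tokens, pad_id=0):
    """Lay out prompt + mask slots + padding in a fixed-length window."""
    if len(prompt_ids) + max_new_tokens > SEQ_LEN:
        raise ValueError("prompt plus generation budget exceeds the seq192 window")
    seq = list(prompt_ids) + [MASK_ID] * max_new_tokens
    seq += [pad_id] * (SEQ_LEN - len(seq))  # right-pad to the fixed length
    return seq
```

This is also why very long prompts fail: the prompt and `--max-new-tokens` together must fit inside the 192-token window.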
## Notes
- Uses `<|mdm_mask|>` as the diffusion mask token.
- `--steps` and `--max-new-tokens` are the main quality/speed knobs.
- The model is loaded via CoreML (`MLModel`); the `.mlpackage` is compiled automatically at runtime.
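To see how the two knobs interact: with a fixed generation budget, `--steps` controls how many tokens are committed per denoising step, which is roughly `max_new_tokens / steps`. A sketch of one simple way to spread the budget evenly (an illustration of the trade-off, not necessarily the scripts' exact schedule):

```python
def unmask_schedule(max_new_tokens, steps):
    """Tokens to commit at each step, spreading the remainder over early steps."""
    base, extra = divmod(max_new_tokens, steps)
    return [base + 1 if i < extra else base for i in range(steps)]
```

For the quick-run settings (`--max-new-tokens 48 --steps 32`), this yields 2 tokens per step for the first 16 steps and 1 per step after: fewer steps means more tokens committed per step (faster, lower quality), and vice versa.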