# LLaDA CoreML Diffusion Loop Examples
These scripts show how to run `llada_8b_instruct_seq192.mlpackage` in an iterative diffusion loop (not a single-pass argmax decode).
## Files

- `llada_generate.py`: text prompt -> tokenize -> run diffusion loop -> decode output
- `llada_diffuse.swift`: CoreML denoising-loop runner (called by the Python wrapper)
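The split between the two files suggests the Python script shells out to the compiled Swift runner. The exact CLI contract is not documented here, so the sketch below shows only the subprocess plumbing; the binary name and flags in the comment are hypothetical.

```python
import subprocess

def run_denoiser(argv: list[str]) -> str:
    """Run an external denoising-loop binary and return its stdout.

    The real wrapper presumably passes token ids and loop parameters to
    the Swift runner; the flags shown below are illustrative only.
    """
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.strip()

# Hypothetical call shape (binary name and flags are not confirmed):
# run_denoiser(["./llada_diffuse", "--steps", "32", "--ids", "1,2,3"])
```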
## Prerequisites

- macOS with Xcode command line tools (`xcrun`, `swift`)
- Python 3.10+
- Hugging Face access for the tokenizer (`GSAI-ML/LLaDA-8B-Instruct`)
Install Python deps:

```sh
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install transformers sentencepiece jinja2
```
## Quick run

From the repo root (where `llada_8b_instruct_seq192.mlpackage` lives):

```sh
source .venv/bin/activate
python examples/llada_generate.py "Write one short sentence about the moon." --max-new-tokens 48 --steps 32
```
## Notes

- Uses `<|mdm_mask|>` as the diffusion mask token.
- `--steps` and `--max-new-tokens` are the main quality/speed knobs.
- The model is loaded through CoreML (`MLModel`); the `.mlpackage` is auto-compiled at runtime.
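To make the loop concrete, here is a minimal, self-contained sketch of one common masked-diffusion decoding scheme: start with an all-masked response region and, over `steps` iterations, commit the most confident predictions while leaving the rest masked. The stand-in `model` returns random logits so the control flow runs end to end; the mask id and the confidence-based unmasking schedule are assumptions, not a guaranteed match for what `llada_generate.py` does.

```python
import math
import random

MASK_ID = 126336   # assumed id for <|mdm_mask|>; check the real tokenizer
VOCAB = 32         # toy vocabulary size for the stand-in model

def model(tokens):
    """Stand-in for the CoreML forward pass: one row of logits per position."""
    rng = random.Random(0)
    return [[rng.random() for _ in range(VOCAB)] for _ in tokens]

def diffusion_generate(prompt_ids, max_new_tokens=8, steps=4):
    # Prompt stays fixed; the response region starts fully masked.
    tokens = list(prompt_ids) + [MASK_ID] * max_new_tokens
    per_step = math.ceil(max_new_tokens / steps)
    for _ in range(steps):
        logits = model(tokens)
        # Score each still-masked position by its argmax confidence.
        cands = []
        for i, tok in enumerate(tokens):
            if tok == MASK_ID:
                best = max(range(VOCAB), key=lambda v: logits[i][v])
                cands.append((logits[i][best], i, best))
        if not cands:
            break
        # Commit only the most confident positions this step.
        for _, i, best in sorted(cands, reverse=True)[:per_step]:
            tokens[i] = best
    return tokens
```

With `--steps` equal to `--max-new-tokens` the loop commits one token per iteration (slowest, typically best quality); fewer steps commit more tokens at once, trading quality for speed, which matches the knobs described above.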