mlx-community/Boogu-Image-0.1-Base-4bit
MLX (int4) conversion of Boogu-Image-0.1-Base (Apache-2.0) for Apple Silicon:
a bilingual (EN/ZH) text-to-image model with an OmniGen2-lineage pipeline
(DiT + FLUX.1 VAE + FlowMatchEuler scheduler). The Qwen3-VL-8B-Instruct text
encoder is the stock model (verified bit-identical), so it is referenced from
mlx-community/Qwen3-VL-8B-Instruct rather than re-hosted here.
Quantization: attention and FFN Linear layers are int4 (group_size=32); per-pass cosine similarity vs bf16 is 0.99896. Size: ~7.4 GB. Quantization is auto-detected via transformer/quant_config.json.
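To illustrate what group-wise int4 quantization with group_size=32 means, here is a minimal plain-Python sketch. It is not the conversion code used for this repo: the function names are hypothetical, the affine min/max scheme is an assumption about the general technique, and the cosine similarity is computed on toy weights directly rather than on per-pass model outputs as in the figure above.

```python
import math
import random

GROUP_SIZE = 32   # matches group_size=32 above
LEVELS = 15       # 4 bits give integer codes 0..15


def quantize_group(values):
    """Map one group of floats to 4-bit codes plus a shared (scale, offset)."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / LEVELS or 1.0  # fall back to 1.0 for a constant group
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize_group(codes, scale, offset):
    """Reconstruct approximate floats from codes and the group's parameters."""
    return [c * scale + offset for c in codes]


def fake_quantize(row):
    """Quantize then dequantize a row, one group of GROUP_SIZE at a time."""
    out = []
    for start in range(0, len(row), GROUP_SIZE):
        codes, scale, offset = quantize_group(row[start:start + GROUP_SIZE])
        out.extend(dequantize_group(codes, scale, offset))
    return out


def cosine(a, b):
    """Cosine similarity between two equal-length sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy weights standing in for one row of a Linear layer (illustrative only).
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(4 * GROUP_SIZE)]
restored = fake_quantize(weights)
similarity = cosine(weights, restored)
```

Smaller groups mean each scale and offset covers fewer weights, which usually lowers rounding error at the cost of storing more per-group parameters.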
Parity (CPU stream, fp32)
- FLUX VAE: decode max_abs 6.7e-6 · encode max_abs 1.97e-4
- Scheduler (flow-match + time-shift): bit-exact
- Full DiT (40-layer): max_abs 1.56e-5
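The max_abs figures above are maximum absolute differences between two outputs computed from identical inputs. The sketch below shows how such a check could be written; the helper names, the toy values, and the tolerance are hypothetical, and the actual harness and reference implementation are not described here.

```python
def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference between two sequences."""
    if len(reference) != len(candidate):
        raise ValueError("length mismatch")
    return max(abs(r - c) for r, c in zip(reference, candidate))


def check_parity(reference, candidate, tol):
    """Return (passed, max_abs); tol=0.0 demands bit-exact agreement."""
    err = max_abs_diff(reference, candidate)
    return err <= tol, err


# Toy values standing in for flattened outputs of two implementations.
ref = [0.25, -1.5, 3.0]
out = [0.25, -1.5 + 1e-5, 3.0]
passed, err = check_parity(ref, out, tol=2e-5)   # small drift within tolerance
exact, _ = check_parity(ref, ref, tol=0.0)       # identical inputs are bit-exact
```

A tolerance of zero corresponds to the "bit-exact" scheduler result, while the VAE and DiT entries report the measured drift instead of a pass/fail threshold.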
Use
pip install mlx mlx-vlm
git clone https://github.com/xocialize/boogu-image-mlx && cd boogu-image-mlx && pip install -e .
from boogu_image_mlx.pipeline_mlx import BooguImagePipeline
from PIL import Image
# Arguments: local path to this repo, then the text-encoder repo
pipe = BooguImagePipeline.from_pretrained("<this repo dir>", "mlx-community/Qwen3-VL-8B-Instruct")
img = pipe.generate("a red panda surfing on a wave, photorealistic", height=1024, width=1024, steps=30, guidance=3.5)
# Save the generated image as a PNG
Image.fromarray(img).save("out.png")
Code: https://github.com/xocialize/boogu-image-mlx