Lens 3.8B (MLX) collection: Apple MLX conversions of microsoft/Lens, a 3.8B text-to-image DiT (GPT-OSS features + FLUX.2 VAE) for Apple Silicon, in bf16, int4, and int8.
How to use mlx-community/Lens-3.8B-8bit with MLX:
# Download the model from the Hub
# (the extra is quoted so that zsh does not treat the brackets as a glob)
pip install "huggingface_hub[hf_xet]"
huggingface-cli download --local-dir Lens-3.8B-8bit mlx-community/Lens-3.8B-8bit
Apple MLX conversion of the microsoft/Lens 3.8B text-to-image DiT, quantized to int8 (group_size 64) with the small in/out/time projections kept at bf16. Higher-fidelity quant, roughly 2x smaller than bf16 (~4.39 GB).
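To make "int8 (group_size 64)" concrete, here is a minimal pure-Python sketch of group-wise affine quantization. It is an illustration and not the conversion code used for this repo: the function names are hypothetical, and it assumes each group of 64 weights stores one scale and one bias at 16 bits each, which is how MLX-style affine quantization is usually described.

```python
import random

def quantize_group(values, bits=8):
    """Affine-quantize one group of floats to unsigned `bits`-bit codes."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1            # 255 code steps for 8 bits
    scale = (hi - lo) / levels or 1.0   # fall back to 1.0 for constant groups
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo             # lo acts as the per-group bias

def dequantize_group(codes, scale, bias):
    """Map integer codes back to approximate float weights."""
    return [c * scale + bias for c in codes]

def quantize(weights, group_size=64, bits=8):
    """Split a flat weight list into groups and quantize each one separately."""
    return [quantize_group(weights[i:i + group_size], bits)
            for i in range(0, len(weights), group_size)]

def bits_per_weight(group_size=64, bits=8, meta_bits=16):
    """Storage cost per weight, assuming a 16-bit scale and bias per group."""
    return bits + 2 * meta_bits / group_size

random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(256)]   # made-up weights
groups = quantize(weights)
restored = [v for g in groups for v in dequantize_group(*g)]
max_err = max(abs(a - b) for a, b in zip(weights, restored))

print(f"max reconstruction error: {max_err:.2e}")
print(f"bits per weight: {bits_per_weight()}")
```

Under these assumptions each weight costs 8.5 bits instead of 16, a ratio of about 1.9. That is in line with the "roughly 2x smaller" figure once the layers left at bf16 are taken into account.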
This repo contains the DiT only (MIT). The full pipeline also uses the GPT-OSS-20B text encoder (Apache-2.0) and the FLUX.2 semantic VAE, pulled from source by the loader (see License). Full-precision: Lens-3.8B-bf16.
Fidelity: a single DiT forward pass has a cosine similarity of 0.99998 against the PyTorch reference. Quantization changes the denoising trajectory, so quantized samples differ in composition from bf16 samples but are equally sharp.
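For readers unfamiliar with the metric, the sketch below shows how a cosine similarity between two flattened outputs can be computed. It is only an illustration: the vectors are made-up numbers, and treating the DiT output of one forward pass as a flat list is an assumption about how the comparison was set up.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up stand-ins for a reference output and a quantized-model output.
reference = [0.12, -0.40, 0.33, 0.95, -0.71]
candidate = [0.12, -0.41, 0.33, 0.94, -0.71]

score = cosine_similarity(reference, candidate)
print(f"cosine similarity: {score:.5f}")
```

A value near 1.0 means the two outputs point in almost the same direction. It says nothing about the full sampling loop, where small per-step differences can accumulate over the denoising steps.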
from lens_mlx.pipeline_mlx import LensPipeline # github.com/xocialize-code/lens-mlx
# `base` = a microsoft/Lens snapshot providing the tokenizer, GPT-OSS encoder, and FLUX.2 VAE.
pipe = LensPipeline.from_pretrained(base, dit_repo="mlx-community/Lens-3.8B-8bit")
img = pipe("A serene lake below snow-capped mountains, golden hour.",
           height=1024, width=1024, num_inference_steps=20, seed=42)
img.save("out.png")
License: the DiT weights are MIT, following microsoft/Lens. The GPT-OSS-20B text encoder is Apache-2.0 (mlx-community/gpt-oss-20b-MXFP4-*).
Upstream: microsoft/Lens · MLX port: xocialize-code/lens-mlx
Base model: microsoft/Lens (this repo is a quantized version)