FLUX.1-schnell MLX Pipeline

Pure MLX (Apple Silicon) inference pipeline for FLUX.1-schnell, a fast text-to-image model by Black Forest Labs.

Zero PyTorch dependency. Runs natively on Apple Silicon via the Metal GPU backend.

Highlights

  • 100% MLX native β€” no torch, no diffusers needed
  • 4-bit quantization support via argmaxinc/mlx-FLUX.1-schnell-4bit-quantized
  • Fast 4-step generation (FLUX.1-schnell is distilled for speed)
  • T5-XXL + CLIP-L dual text encoders
  • FluxTransformer with 19 Joint Blocks + 38 Single Blocks + N-dim RoPE

Architecture

FluxPipeline
├── T5-XXL Encoder (24 layers, hidden=4096)
│   └── Relative positional attention + GatedFFN
├── CLIP-L Encoder (12 layers, hidden=768)
│   └── Causal mask + EOS pooling
├── FluxTransformer (DiT)
│   ├── 19 JointTransformerBlock (txt+img joint attention)
│   ├── 38 SingleTransformerBlock (img self-attention)
│   └── N-dim RoPE (axes_dim=[16,56,56])
├── AutoencoderKL Decoder
│   └── Latent channels=16, block_out=[128,256,512,512]
└── FlowMatchEuler Sampler
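The FlowMatchEuler sampler integrates the model's predicted velocity from pure noise (sigma=1) down to the clean latent (sigma=0). A minimal numpy sketch assuming a plain linear sigma schedule (the repo's sampler.py may apply a shifted schedule, and the real velocity comes from FluxTransformer rather than a toy function):

```python
import numpy as np

def flow_match_euler(x, velocity_fn, num_steps=4):
    """Euler integration of dx/dsigma from sigma=1 (noise) to sigma=0 (image)."""
    sigmas = np.linspace(1.0, 0.0, num_steps + 1)   # e.g. [1.0, 0.75, 0.5, 0.25, 0.0]
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        v = velocity_fn(x, sigma)                    # model's velocity prediction
        x = x + (sigma_next - sigma) * v             # Euler step; sigma decreases
    return x

# Toy check with the identity velocity v = x: each step scales x by 0.75,
# so 4 steps scale by 0.75**4.
out = flow_match_euler(np.ones(2), lambda x, s: x, num_steps=4)
```

FLUX.1-schnell is distilled so that this integration converges in as few as 4 steps, which is why num_steps=4 is the default in the Quick Start below.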

Quick Start

Install

pip install mlx safetensors sentencepiece tokenizers pillow numpy

Download Weights

# 4-bit quantized (recommended, ~5GB)
huggingface-cli download argmaxinc/mlx-FLUX.1-schnell-4bit-quantized

# Or full precision
huggingface-cli download argmaxinc/mlx-FLUX.1-schnell

Generate

from pipeline import FluxPipeline

pipe = FluxPipeline()
pipe.load()

result = pipe.generate_and_save(
    prompt="a beautiful sunset over mountains",
    output_path="output.png",
    width=512,
    height=512,
    num_steps=4,
    seed=42,
)
print(f"Generated in {result['elapsed_s']}s")

pipe.unload()

Files

├── pipeline.py          # Main inference pipeline
├── flux_model.py        # FluxTransformer (JointBlock + SingleBlock)
├── t5_encoder.py        # T5-XXL text encoder
├── clip_encoder.py      # CLIP-L text encoder
├── autoencoder.py       # VAE decoder
├── sampler.py           # FlowMatch Euler sampler
├── tokenizers.py        # T5 + CLIP tokenizers
├── weight_loader.py     # Weight loading + key mapping
└── download_weights.py  # HF Hub download helper

Model Source

Inference code is original work. Weights are loaded from the Hugging Face repos argmaxinc/mlx-FLUX.1-schnell (full precision) and argmaxinc/mlx-FLUX.1-schnell-4bit-quantized (4-bit).

License

Apache 2.0
