FLUX.1-schnell MLX Pipeline

Pure MLX (Apple Silicon) inference pipeline for FLUX.1-schnell, a fast text-to-image model by Black Forest Labs.

Zero PyTorch dependency. Runs natively on Apple Silicon via the Metal GPU backend.

Highlights

  • 100% MLX native β€” no torch, no diffusers needed
  • 4-bit quantization support via argmaxinc/mlx-FLUX.1-schnell-4bit-quantized
  • Fast 4-step generation (FLUX.1-schnell is distilled for speed)
  • T5-XXL + CLIP-L dual text encoders
  • FluxTransformer with 19 Joint Blocks + 38 Single Blocks + N-dim RoPE

Architecture

FluxPipeline
├── T5-XXL Encoder (24 layers, hidden=4096)
│   └── Relative positional attention + GatedFFN
├── CLIP-L Encoder (12 layers, hidden=768)
│   └── Causal mask + EOS pooling
├── FluxTransformer (DiT)
│   ├── 19 JointTransformerBlock (txt+img joint attention)
│   ├── 38 SingleTransformerBlock (img self-attention)
│   └── N-dim RoPE (axes_dim=[16,56,56])
├── AutoencoderKL Decoder
│   └── Latent channels=16, block_out=[128,256,512,512]
└── FlowMatchEuler Sampler
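The FlowMatchEuler sampler integrates the model's predicted velocity from pure noise (sigma=1) down to the clean latent (sigma=0). A minimal numpy sketch assuming a plain linear sigma schedule (the repo's sampler.py may apply a shifted schedule, and the real velocity comes from FluxTransformer rather than a toy function):

```python
import numpy as np

def flow_match_euler(x, velocity_fn, num_steps=4):
    """Euler integration of dx/dsigma from sigma=1 (noise) to sigma=0 (image)."""
    sigmas = np.linspace(1.0, 0.0, num_steps + 1)   # e.g. [1.0, 0.75, 0.5, 0.25, 0.0]
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        v = velocity_fn(x, sigma)                    # model's velocity prediction
        x = x + (sigma_next - sigma) * v             # Euler step; sigma decreases
    return x

# Toy check with the identity velocity v = x: each step scales x by 0.75,
# so 4 steps scale by 0.75**4.
out = flow_match_euler(np.ones(2), lambda x, s: x, num_steps=4)
```

FLUX.1-schnell is distilled so that this integration converges in as few as 4 steps, which is why num_steps=4 is the default in the Quick Start below.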

Quick Start

Install

pip install mlx safetensors sentencepiece tokenizers pillow numpy

Download Weights

# 4-bit quantized (recommended, ~5GB)
huggingface-cli download argmaxinc/mlx-FLUX.1-schnell-4bit-quantized

# Or full precision
huggingface-cli download argmaxinc/mlx-FLUX.1-schnell

Generate

from pipeline import FluxPipeline

pipe = FluxPipeline()
pipe.load()

result = pipe.generate_and_save(
    prompt="a beautiful sunset over mountains",
    output_path="output.png",
    width=512,
    height=512,
    num_steps=4,
    seed=42,
)
print(f"Generated in {result['elapsed_s']}s")

pipe.unload()

Files

├── pipeline.py          # Main inference pipeline
├── flux_model.py        # FluxTransformer (JointBlock + SingleBlock)
├── t5_encoder.py        # T5-XXL text encoder
├── clip_encoder.py      # CLIP-L text encoder
├── autoencoder.py       # VAE decoder
├── sampler.py           # FlowMatch Euler sampler
├── tokenizers.py        # T5 + CLIP tokenizers
├── weight_loader.py     # Weight loading + key mapping
└── download_weights.py  # HF Hub download helper

Model Source

Inference code is original work. Weights are loaded from the Hugging Face repos argmaxinc/mlx-FLUX.1-schnell (full precision) and argmaxinc/mlx-FLUX.1-schnell-4bit-quantized (4-bit).

License

Apache 2.0
