---
license: apache-2.0
language:
  - en
library_name: mlx
tags:
  - mlx
  - text-to-image
  - apple-silicon
  - image-generation
  - diffusion
  - flux
base_model: black-forest-labs/FLUX.1-schnell
pipeline_tag: text-to-image
---

FLUX.1-schnell MLX Pipeline

Pure MLX (Apple Silicon) inference pipeline for FLUX.1-schnell, a fast text-to-image model by Black Forest Labs.

Zero PyTorch dependency. Runs natively on Apple Silicon via Metal GPU.

Highlights

  • 100% MLX native: no torch, no diffusers needed
  • 4-bit quantization support via argmaxinc/mlx-FLUX.1-schnell-4bit-quantized
  • Fast 4-step generation (FLUX.1-schnell is distilled for speed)
  • T5-XXL + CLIP-L dual text encoders
  • FluxTransformer with 19 Joint Blocks + 38 Single Blocks + N-dim RoPE
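The N-dim RoPE mentioned above splits the 128-dim attention head across three position axes (text index, latent row, latent column). A minimal sketch of the common FLUX-style scheme, computing rotary angles per axis and concatenating them (numpy used for illustration; `rope_frequencies` and `nd_rope` are hypothetical helper names, not this repo's API):

```python
import numpy as np

def rope_frequencies(pos, dim, theta=10000.0):
    """Rotary angles for one axis: shape (len(pos), dim // 2)."""
    scale = np.arange(0, dim, 2) / dim      # exponent per rotation pair
    omega = 1.0 / theta ** scale            # frequency per rotation pair
    return np.outer(pos, omega)             # angle = position * frequency

def nd_rope(ids, axes_dim=(16, 56, 56)):
    """ids is (seq, n_axes) integer coordinates; angles are concatenated
    axis by axis, so each axis rotates its own slice of the head dim."""
    parts = [rope_frequencies(ids[:, i], d) for i, d in enumerate(axes_dim)]
    return np.concatenate(parts, axis=-1)   # (seq, sum(axes_dim) // 2)

# 2x2 latent grid flattened row-major: axis 0 unused, axes 1/2 are row/col.
ids = np.stack([np.zeros(4), np.arange(4) // 2, np.arange(4) % 2], axis=1)
angles = nd_rope(ids)
print(angles.shape)  # (4, 64): 64 rotation pairs = (16 + 56 + 56) / 2
```

Note that sum(axes_dim) = 128 matches the attention head dimension, so every channel pair gets exactly one rotation angle.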

Architecture

FluxPipeline
├── T5-XXL Encoder (24 layers, hidden=4096)
│   └── Relative positional attention + GatedFFN
├── CLIP-L Encoder (12 layers, hidden=768)
│   └── Causal mask + EOS pooling
├── FluxTransformer (DiT)
│   ├── 19 JointTransformerBlock (txt+img joint attention)
│   ├── 38 SingleTransformerBlock (img self-attention)
│   └── N-dim RoPE (axes_dim=[16,56,56])
├── AutoencoderKL Decoder
│   └── Latent channels=16, block_out=[128,256,512,512]
└── FlowMatchEuler Sampler

Quick Start

Install

pip install mlx safetensors sentencepiece tokenizers pillow numpy

Download Weights

# 4-bit quantized (recommended, ~5GB)
huggingface-cli download argmaxinc/mlx-FLUX.1-schnell-4bit-quantized

# Or full precision
huggingface-cli download argmaxinc/mlx-FLUX.1-schnell

Generate

from pipeline import FluxPipeline

pipe = FluxPipeline()
pipe.load()

result = pipe.generate_and_save(
    prompt="a beautiful sunset over mountains",
    output_path="output.png",
    width=512,
    height=512,
    num_steps=4,
    seed=42,
)
print(f"Generated in {result['elapsed_s']}s")

pipe.unload()
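When picking width and height, it helps to know how many image tokens the transformer will process. A sketch assuming the standard FLUX latent packing (the VAE downsamples by 8, then 2x2 latent patches are packed into one token each, so dimensions should be multiples of 16; `token_count` is an illustrative helper, not part of this pipeline):

```python
def token_count(width, height, vae_factor=8, patch=2):
    """Image tokens seen by the transformer for a given output size."""
    lat_w, lat_h = width // vae_factor, height // vae_factor  # latent grid
    return (lat_w // patch) * (lat_h // patch)                # packed tokens

print(token_count(512, 512))    # 1024
print(token_count(1024, 1024))  # 4096
```

Since attention cost grows quadratically in token count, doubling both dimensions roughly quadruples tokens and can be far slower than 4x per step.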

Files

├── pipeline.py          # Main inference pipeline
├── flux_model.py        # FluxTransformer (JointBlock + SingleBlock)
├── t5_encoder.py        # T5-XXL text encoder
├── clip_encoder.py      # CLIP-L text encoder
├── autoencoder.py       # VAE decoder
├── sampler.py           # FlowMatch Euler sampler
├── tokenizers.py        # T5 + CLIP tokenizers
├── weight_loader.py     # Weight loading + key mapping
└── download_weights.py  # HF Hub download helper

Model Source

Inference code is original work. Weights are loaded from argmaxinc/mlx-FLUX.1-schnell-4bit-quantized (4-bit) or argmaxinc/mlx-FLUX.1-schnell (full precision), both converted from black-forest-labs/FLUX.1-schnell.

License

Apache 2.0