---
license: apache-2.0
base_model: Qwen/Qwen-Image
pipeline_tag: text-to-image
library_name: mlx-gen
tags:
- mlx
- mlx-gen
- mflux
- apple-silicon
- 4-bit
- mixed-q4
- mixed-q4-q8
- qwen
- qwen-image
---
# qwen-image-4bit

This repository contains MLX-Gen saved weights for `Qwen/Qwen-Image`. The checkpoint is designed for local Apple Silicon inference with [`mlx-gen`](https://github.com/lpalbou/mlx-gen).

It uses the mflux/MLX saved-weight layout and MLX quantization tensors. It is not a Diffusers or Transformers checkpoint, so it cannot be loaded with `from_pretrained()`.

## Source Model

Original model: [`Qwen/Qwen-Image`](https://huggingface.co/Qwen/Qwen-Image).

## License and Access

This quantized derivative follows the Apache 2.0 license of the source model.

## Quantization

This is a mixed q4/q8 checkpoint for Qwen Image generation and editing. Fully q4 Qwen checkpoints can lose coherent generative behavior, so MLX-Gen uses a mixed policy:

- q4 for most Qwen transformer attention, feed-forward, and projection linears.
- q8 for Qwen `*.img_mod_linear` transformer modulation layers.
- q4 for group64-compatible Qwen text-encoder language linears.
- q8 for group64-compatible Qwen text-encoder visual linears.
- BF16 for the VAE, norms, embeddings, and linears that are not MLX group64-compatible.
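
The policy above can be read as a small decision function. The following is a minimal sketch, not MLX-Gen's actual implementation: the module-name prefixes (`vae.`, `text_encoder.visual.`), the `kind` argument, and the assumption that "group64-compatible" means an input dimension divisible by 64 are all illustrative. Only the `*.img_mod_linear` pattern and the q4/q8/BF16 assignments come from the list.

```python
import fnmatch

GROUP_SIZE = 64  # assumed meaning of "group64"


def is_group64_compatible(in_features: int) -> bool:
    """Assumption: a linear is compatible when its input dim divides by 64."""
    return in_features % GROUP_SIZE == 0


def choose_precision(name: str, kind: str, in_features: int) -> str:
    """Return "q4", "q8", or "bf16" for one layer.

    `name` uses hypothetical prefixes; `kind` is a hypothetical label
    such as "linear", "norm", or "embedding".
    """
    # The VAE, norms, and embeddings stay in BF16.
    if name.startswith("vae.") or kind in {"norm", "embedding"}:
        return "bf16"
    # Anything that is not a group64-compatible linear stays in BF16.
    if kind != "linear" or not is_group64_compatible(in_features):
        return "bf16"
    # Transformer modulation layers get q8.
    if fnmatch.fnmatch(name, "*.img_mod_linear"):
        return "q8"
    # Text-encoder visual linears get q8.
    if name.startswith("text_encoder.visual."):
        return "q8"
    # Remaining transformer and text-encoder language linears get q4.
    return "q4"


if __name__ == "__main__":
    examples = [
        ("transformer.blocks.0.attn.to_q", "linear", 3072),
        ("transformer.blocks.0.img_mod_linear", "linear", 3072),
        ("text_encoder.visual.blocks.0.mlp.fc1", "linear", 1280),
        ("vae.decoder.conv_in", "linear", 128),
    ]
    for layer in examples:
        print(layer[0], "->", choose_precision(*layer))
```

The BF16 checks come first so that an incompatible layer is never quantized, whatever its name matches.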

See the [MLX-Gen quantization docs](https://github.com/lpalbou/mlx-gen/blob/main/docs/quantization.md) for the current mixed q4/q8 policy and compatibility notes.

## Compatibility

Requires `mlx-gen >= 0.18.2`.

Generated with `mlx-gen 0.18.2`.

Use the `mlxgen` command and Python import path for new MLX-Gen projects.
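
To confirm that an environment meets the minimum version before downloading the weights, a check along these lines may help. It is a sketch that uses only the Python standard library and assumes the installed distribution is named `mlx-gen`, as in the `pip install` command in the Usage section; the helper names are illustrative and not part of MLX-Gen.

```python
import itertools
from importlib import metadata

MINIMUM_VERSION = "0.18.2"


def parse_version(text: str) -> tuple:
    """Parse "major.minor.patch" into a tuple of ints, ignoring suffixes like "rc1"."""
    numbers = []
    for piece in text.split(".")[:3]:
        leading = "".join(itertools.takewhile(str.isdigit, piece))
        numbers.append(int(leading) if leading else 0)
    # Pad short versions such as "0.19" to three components.
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def meets_minimum(installed: str, minimum: str = MINIMUM_VERSION) -> bool:
    """True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_version(dist_name: str = "mlx-gen"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("mlx-gen is not installed")
    elif meets_minimum(found):
        print(f"mlx-gen {found} satisfies >= {MINIMUM_VERSION}")
    else:
        print(f"mlx-gen {found} is older than {MINIMUM_VERSION}")
```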

## Usage

```bash
python -m pip install -U mlx-gen

mlxgen download --model AbstractFramework/qwen-image-4bit

mlxgen generate \
  --model AbstractFramework/qwen-image-4bit \
  --prompt "Your prompt here" \
  --steps 20 \
  --seed 42 \
  --output image.png
```
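
To render the same prompt with several seeds, the command above can be driven from a script. This sketch shells out to the `mlxgen` CLI with exactly the flags shown above rather than assuming any Python API; the function names and the output file naming scheme are illustrative.

```python
import shutil
import subprocess

MODEL = "AbstractFramework/qwen-image-4bit"


def build_generate_command(prompt, seed, output, steps=20, model=MODEL):
    """Build the argv list for one `mlxgen generate` call."""
    return [
        "mlxgen", "generate",
        "--model", model,
        "--prompt", prompt,
        "--steps", str(steps),
        "--seed", str(seed),
        "--output", output,
    ]


def generate_variants(prompt, seeds):
    """Run one generation per seed and return the output file names."""
    if shutil.which("mlxgen") is None:
        raise RuntimeError("mlxgen not found on PATH; install mlx-gen first")
    outputs = []
    for seed in seeds:
        output = f"image_seed{seed}.png"  # illustrative naming scheme
        subprocess.run(build_generate_command(prompt, seed, output), check=True)
        outputs.append(output)
    return outputs


if __name__ == "__main__":
    # Print the command instead of running it, so this is safe to try anywhere.
    print(" ".join(build_generate_command("Your prompt here", 42, "image.png")))
```

Passing the arguments as a list avoids shell quoting issues when the prompt contains spaces or quotes.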

## Attribution

MLX-Gen is based on [mflux](https://github.com/filipstrand/mflux) by Filip Strand and the original mflux contributors. This model card is generated by MLX-Gen so derived checkpoints keep that attribution visible.

Quantized and contributed by [@lpalbou](https://huggingface.co/lpalbou).