---
license: other
library_name: diffusers
pipeline_tag: text-to-image
tags:
- diffusers
- image-generation
- quantization
- int8
- torchao
- amd
- rocm
base_model: black-forest-labs/FLUX.2-dev
---
# FLUX.2-dev: Attention-only INT8 Weight-Only Transformer (ROCm)

This repository provides an **INT8 weight-only quantized transformer** for
[`black-forest-labs/FLUX.2-dev`](https://huggingface.co/black-forest-labs/FLUX.2-dev).
It is designed to be:

- ✅ **ROCm-compatible**
- ✅ **Stable on AMD Instinct MI210**
- ✅ **Image-quality preserving**

Only **attention Linear layers (Q/K/V + projections)** are quantized.
All other components remain in **BF16**.

---
## 🔍 What is included

- ✅ Transformer with **attention-only INT8 weight-only quantization**
- ✅ TorchAO-based quantization (no bitsandbytes)
- ✅ Compatible with **Diffusers standard pipelines**

---
## ❌ What is NOT included
- ❌ VAE
- ❌ Text encoders
- ❌ Scheduler

These components are automatically loaded from the base FLUX.2 model.

---
## 💡 Why attention-only INT8?

Full INT8 quantization of FLUX.2 introduces visible artifacts on ROCm.
Quantizing **only attention layers** provides:
- Significant VRAM reduction
- Stable generation
- No "confetti noise" artifacts
- Safe inference on MI210 (64 GB)
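
To make the idea concrete, here is a minimal plain-Python sketch of symmetric per-row INT8 weight-only quantization applied only to layers whose names look like attention projections. It is an illustration under stated assumptions, not the TorchAO code path used to build this checkpoint: the layer names (`to_q`, `to_k`, `to_v`, `to_out`), the toy weight values, and the helper functions are all hypothetical. Each quantized weight is stored as a 1-byte code plus one shared float scale per row, versus 2 bytes per weight in BF16, which is where the memory saving for the quantized layers comes from. Weights are converted back to float before use, so activations never leave floating point.

```python
# Illustrative only: plain-Python symmetric per-row INT8 weight-only
# quantization. This is NOT the TorchAO code path used to build this repo.

ATTENTION_KEYS = ("to_q", "to_k", "to_v", "to_out")  # hypothetical layer names


def is_attention_linear(name):
    """Return True for layers we assume belong to attention."""
    return any(key in name for key in ATTENTION_KEYS)


def quantize_row(row):
    """Quantize one weight row to INT8 codes plus a single float scale."""
    max_abs = max(abs(w) for w in row)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in row]
    return codes, scale


def dequantize_row(codes, scale):
    """Recover approximate float weights; activations stay in float."""
    return [c * scale for c in codes]


# Toy weights standing in for a transformer state dict (made-up values).
weights = {
    "blocks.0.attn.to_q": [[0.50, -1.00, 0.25], [0.10, 0.20, -0.40]],
    "blocks.0.ff.linear_in": [[0.30, -0.70, 0.90]],
}

quantized = {}  # attention layers: one (codes, scale) pair per row
untouched = {}  # everything else keeps its original precision
for name, matrix in weights.items():
    if is_attention_linear(name):
        quantized[name] = [quantize_row(row) for row in matrix]
    else:
        untouched[name] = matrix

# Worst-case reconstruction error over all quantized rows.
max_error = 0.0
for name, rows in quantized.items():
    for (codes, scale), original in zip(rows, weights[name]):
        restored = dequantize_row(codes, scale)
        row_error = max(abs(a - b) for a, b in zip(original, restored))
        max_error = max(max_error, row_error)

print(sorted(quantized), sorted(untouched), max_error)
```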
---
## 🚀 Usage (Diffusers)

```python
import torch
from diffusers import Flux2Pipeline, AutoModel

BASE_MODEL = "black-forest-labs/FLUX.2-dev"
ATTN_INT8 = "AmdGoose/FLUX.2-dev-transformer-attn-int8wo"

dtype = torch.bfloat16
device = "cuda"  # ROCm uses "cuda" in PyTorch

# Load the attention-only INT8 transformer from this repository.
transformer = AutoModel.from_pretrained(
    ATTN_INT8,
    subfolder="transformer_attn_int8wo",
    torch_dtype=dtype,
    use_safetensors=False,  # load the non-safetensors weight files
).to(device)

# VAE, text encoders, and scheduler come from the base model.
pipe = Flux2Pipeline.from_pretrained(
    BASE_MODEL,
    transformer=transformer,
    torch_dtype=dtype,
)

# Memory-saving options.
pipe.enable_attention_slicing()
pipe.vae.enable_tiling()
pipe.enable_model_cpu_offload()

image = pipe(
    prompt="A realistic starter pack figurine in a blister box, studio lighting",
    num_inference_steps=28,
    guidance_scale=4,
    height=1024,
    width=1024,
).images[0]

image.save("out.png")