wangkanai
/

flux-dev-fp8

+---
+license: apache-2.0
+library_name: diffusers
+pipeline_tag: text-to-image
+tags:
+  - text-to-image
+  - flux
+  - flux.1-dev
+  - image-generation
+  - stable-diffusion
+  - fp8
+  - quantized
+  - low-vram
+base_model: black-forest-labs/FLUX.1-dev
+---
+# FLUX.1-dev FP8 Model Collection
+This repository contains the FP8 (8-bit quantized) variant of the FLUX.1-dev text-to-image generation model. This optimized collection is designed for lower VRAM usage with minimal quality loss.
+## Model Description
+FLUX.1-dev is a state-of-the-art text-to-image generation model. This FP8 collection provides efficient inference with approximately 50% size reduction compared to FP16, making it ideal for systems with limited VRAM.
+## Repository Contents
+**Total Size**: ~41GB
+### Diffusion Models
+- `diffusion_models/flux1-dev-fp8.safetensors` (17GB) - FP8 quantized diffusion model
+- `checkpoints/flux1-dev-fp8.safetensors` (12GB) - FP8 checkpoint format
+### Text Encoders
+- `text_encoders/clip_g.safetensors` (1.3GB) - CLIP-G text encoder
+- `text_encoders/clip_l.safetensors` (235MB) - CLIP-L text encoder
+- `text_encoders/clip-vit-large.safetensors` (1.6GB) - CLIP ViT-Large encoder
+- `text_encoders/t5xxl_fp8_e4m3fn.safetensors` (4.6GB) - T5-XXL FP8 quantized encoder
+### Vision Models
+- `clip_vision/clip_vision_h.safetensors` (1.2GB) - CLIP Vision H model
+## Hardware Requirements
+- **VRAM**: 12GB+ recommended
+- **Disk Space**: 41GB
+- **Precision**: FP8 (8-bit quantized)
+- **Memory**: 16GB+ system RAM recommended
+## Usage
+```python
+from diffusers import FluxPipeline
+import torch
+# Load the FP8 model
+pipe = FluxPipeline.from_pretrained(
+    "path/to/flux-dev-fp8",
+    torch_dtype=torch.float8_e4m3fn
+)
+pipe.to("cuda")
+# Generate an image
+image = pipe(
+    prompt="a beautiful mountain landscape at sunset",
+    num_inference_steps=50,
+    guidance_scale=7.5
+).images[0]
+image.save("output.png")
+```
+## Model Precision Trade-offs
+**FP8 (This Collection)**:
+- ~50% smaller than FP16
+- Faster inference
+- Minimal quality loss
+- Lower VRAM requirements (12GB+)
+- Recommended for: Memory-constrained systems, faster generation
+**Alternatives**:
+- FP16: Full precision, best quality, requires 16GB+ VRAM
+- GGUF: Further quantized variants for extreme memory constraints
+## License
+This model is released under the Apache 2.0 license.
+## Citation
+```bibtex
+@software{flux1-dev,
+  author = {Black Forest Labs},
+  title = {FLUX.1-dev},
+  year = {2024},
+  publisher = {Hugging Face},
+  url = {https://huggingface.co/black-forest-labs/FLUX.1-dev}
+}
+```
+## Model Card Contact
+For questions or issues with this model collection, please refer to the original FLUX.1-dev model card and repository.