wangkanai committed on
Commit 24ae175 · verified · 1 Parent(s): 1400f57

Upload folder using huggingface_hub

README.md CHANGED
@@ -8,11 +8,11 @@ tags:
8
  - image-generation
9
  ---
10
 
11
- <!-- README Version: v1.3 -->
12
 
13
  # FLUX.1-dev FP8 Quantized Model Collection
14
 
15
- High-performance 8-bit floating point quantized version of FLUX.1-dev, optimized for reduced VRAM usage while maintaining excellent image generation quality. This collection includes the complete pipeline with text encoders, CLIP models, and IP-Adapter support.
16
 
17
  ## Model Description
18
 
@@ -21,8 +21,8 @@ FLUX.1-dev is a state-of-the-art text-to-image diffusion model developed by Blac
21
  **Key Features**:
22
  - **FP8 Quantization**: Reduced precision for memory efficiency (~46GB total vs 72GB FP16)
23
  - **Complete Pipeline**: Includes all components for text-to-image generation
24
- - **IP-Adapter Support**: Image prompt adapter for style transfer and image-guided generation
25
- - **Multiple Text Encoders**: CLIP-L, CLIP-G, and T5-XXL for comprehensive text understanding
26
  - **Production Ready**: Optimized for inference with minimal quality loss
27
 
28
  ## Repository Contents
@@ -31,20 +31,18 @@ FLUX.1-dev is a state-of-the-art text-to-image diffusion model developed by Blac
31
  flux-dev-fp8/
32
  ├── checkpoints/
33
  │   └── flux/
34
- │       └── flux1-dev-fp8.safetensors (17GB) - Main checkpoint format
35
  ├── diffusion_models/
36
- │   └── flux1-dev-fp8.safetensors (12GB) - Diffusion model only
37
  ├── text_encoders/
38
- │   ├── clip-vit-large.safetensors (1.6GB) - CLIP ViT-L text encoder
39
- │   ├── clip_g.safetensors (1.3GB) - CLIP-G text encoder
40
- │   ├── clip_l.safetensors (235MB) - CLIP-L text encoder
41
- │   └── t5xxl_fp8_e4m3fn.safetensors (4.6GB) - T5-XXL FP8 text encoder
42
  ├── clip/
43
- │   └── t5xxl_fp8.safetensors (4.6GB) - T5-XXL FP8 (duplicate)
44
- ├── clip_vision/
45
- │   └── clip_vision_h.safetensors (1.2GB) - CLIP vision encoder
46
- └── ipadapter-flux/
47
-     └── ip-adapter.bin (5.0GB) - IP-Adapter weights
48
  ```
49
 
50
  **Total Repository Size**: 46GB
@@ -127,42 +125,6 @@ pipe = FluxPipeline.from_single_file(
127
  )
128
  ```
129
 
130
- ### IP-Adapter Image-Guided Generation
131
-
132
- ```python
133
- import torch
134
- from diffusers import FluxPipeline
135
- from PIL import Image
136
-
137
- # Load pipeline with IP-Adapter
138
- pipe = FluxPipeline.from_single_file(
139
- "E:/huggingface/flux-dev-fp8/checkpoints/flux/flux1-dev-fp8.safetensors",
140
- torch_dtype=torch.float8_e4m3fn
141
- )
142
-
143
- # Load IP-Adapter weights
144
- pipe.load_ip_adapter(
145
- "E:/huggingface/flux-dev-fp8/ipadapter-flux",
146
- weight_name="ip-adapter.bin"
147
- )
148
- pipe.set_ip_adapter_scale(0.7)
149
-
150
- # Load reference image
151
- ref_image = Image.open("reference.jpg")
152
-
153
- # Generate with image guidance
154
- prompt = "A portrait in the style of the reference image"
155
- image = pipe(
156
- prompt=prompt,
157
- ip_adapter_image=ref_image,
158
- height=1024,
159
- width=1024,
160
- num_inference_steps=28
161
- ).images[0]
162
-
163
- image.save("styled_output.png")
164
- ```
165
-
166
  ### Memory-Constrained Setup (16GB VRAM)
167
 
168
  ```python
@@ -208,7 +170,8 @@ image = pipe(
208
 
209
  ### Supported Features
210
  - Text-to-image generation up to 2048x2048
211
- - IP-Adapter for image-guided generation
 
212
  - Negative prompts for content control
213
  - CFG (Classifier-Free Guidance) for prompt adherence
214
  - VAE tiling for high-resolution generation
@@ -312,9 +275,9 @@ For FP8 quantization methodology:
312
  - **FLUX Reddit**: https://reddit.com/r/StableDiffusion
313
  - **Discord Community**: https://discord.gg/stablediffusion
314
 
315
- ### Related Models in Repository
316
- - **FLUX.1-dev FP16**: `E:/huggingface/flux-dev-fp16/` - Full precision version (72GB)
317
- - **FLUX Upscale**: `E:/huggingface/flux-upscale/` - Super-resolution models (192MB)
318
 
319
  ## Troubleshooting
320
 
@@ -355,7 +318,7 @@ For issues, questions, or contributions:
355
  ---
356
 
357
  **Model Version**: FLUX.1-dev FP8
358
- **Repository Version**: v1.3
359
- **Last Updated**: 2025-10-14
360
  **Total Size**: 46GB
361
- **Format**: SafeTensors (.safetensors, .bin)
 
8
  - image-generation
9
  ---
10
 
11
+ <!-- README Version: v1.4 -->
12
 
13
  # FLUX.1-dev FP8 Quantized Model Collection
14
 
15
+ High-performance 8-bit floating point quantized version of FLUX.1-dev, optimized for reduced VRAM usage while maintaining excellent image generation quality. This collection includes the complete pipeline with text encoders and CLIP models for production-ready text-to-image generation.
16
 
17
  ## Model Description
18
 
 
21
  **Key Features**:
22
  - **FP8 Quantization**: Reduced precision for memory efficiency (~46GB total vs 72GB FP16)
23
  - **Complete Pipeline**: Includes all components for text-to-image generation
24
+ - **Multiple Text Encoders**: CLIP-L, CLIP-G, CLIP ViT-Large, and T5-XXL for comprehensive text understanding
25
+ - **CLIP Vision Support**: Image understanding capabilities with CLIP-H vision encoder
26
  - **Production Ready**: Optimized for inference with minimal quality loss
27
 
28
  ## Repository Contents
 
31
  flux-dev-fp8/
32
  ├── checkpoints/
33
  │   └── flux/
34
+ │       └── flux1-dev-fp8.safetensors (17GB) - Full checkpoint with all components
35
  ├── diffusion_models/
36
+ │   └── flux1-dev-fp8.safetensors (12GB) - Core diffusion model (FP8)
37
  ├── text_encoders/
38
+ │   ├── clip-vit-large.safetensors (1.6GB) - CLIP ViT-Large text encoder
39
+ │   ├── clip-g.safetensors (1.3GB) - CLIP-G text encoder
40
+ │   ├── clip-l.safetensors (235MB) - CLIP-L text encoder
41
+ │   └── t5xxl-fp8.safetensors (4.6GB) - T5-XXL text encoder (FP8)
42
  ├── clip/
43
+ │   └── t5xxl-fp8.safetensors (4.6GB) - T5-XXL text encoder (alternate location)
44
+ └── clip_vision/
45
+     └── clip-vision-h.safetensors (1.2GB) - CLIP-H vision encoder
 
 
46
  ```
47
 
48
  **Total Repository Size**: 46GB
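
The per-file sizes in the updated tree are rounded, so any total derived from them is approximate. As a rough sanity check, the sketch below simply adds up the figures as annotated in the added (`+`) tree lines. The dictionary is transcribed by hand from those lines, and the README does not say how its 46GB figure was computed, so treat the result as an illustration rather than an authoritative total.

```python
# Sizes in GB exactly as annotated in the updated directory tree.
# These are rounded figures transcribed from the README diff, not measured values.
LISTED_SIZES_GB = {
    "checkpoints/flux/flux1-dev-fp8.safetensors": 17.0,
    "diffusion_models/flux1-dev-fp8.safetensors": 12.0,
    "text_encoders/clip-vit-large.safetensors": 1.6,
    "text_encoders/clip-g.safetensors": 1.3,
    "text_encoders/clip-l.safetensors": 0.235,  # listed as 235MB
    "text_encoders/t5xxl-fp8.safetensors": 4.6,
    "clip/t5xxl-fp8.safetensors": 4.6,
    "clip_vision/clip-vision-h.safetensors": 1.2,
}

# Add up the listed figures for comparison with the stated repository total.
listed_total_gb = sum(LISTED_SIZES_GB.values())
print(f"Sum of listed sizes: {listed_total_gb:.1f} GB")
```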
 
125
  )
126
  ```
127
 
 
128
  ### Memory-Constrained Setup (16GB VRAM)
129
 
130
  ```python
 
170
 
171
  ### Supported Features
172
  - Text-to-image generation up to 2048x2048
173
+ - Multiple text encoder architectures for enhanced prompt understanding
174
+ - CLIP vision encoding for potential multimodal applications
175
  - Negative prompts for content control
176
  - CFG (Classifier-Free Guidance) for prompt adherence
177
  - VAE tiling for high-resolution generation
 
275
  - **FLUX Reddit**: https://reddit.com/r/StableDiffusion
276
  - **Discord Community**: https://discord.gg/stablediffusion
277
 
278
+ ### Related Models in This Repository
279
+ - **FLUX.1-dev FP16**: Available in parent directory - Full precision version (72GB)
280
+ - **FLUX Upscale**: Available in parent directory - Super-resolution models (192MB)
281
 
282
  ## Troubleshooting
283
 
 
318
  ---
319
 
320
  **Model Version**: FLUX.1-dev FP8
321
+ **Repository Version**: v1.4
322
+ **Last Updated**: 2025-10-28
323
  **Total Size**: 46GB
324
+ **Format**: SafeTensors (.safetensors)
clip/t5xxl-fp8.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7d330da4816157540d6bb7838bf63a0f02f573fc48ca4d8de34bb0cbfd514f09
3
+ size 4893934904
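
Each added weight file appears in this diff as a three-line Git LFS pointer (`version`, `oid`, `size`) rather than as the binary itself. A minimal sketch of reading those fields is shown below; `parse_lfs_pointer` is a hypothetical helper written for illustration, not part of `huggingface_hub` or Git LFS, and it assumes the simple `key value` layout seen in the pointers here.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is a key, a single space, then the value.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer content copied from clip/t5xxl-fp8.safetensors in this commit.
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:7d330da4816157540d6bb7838bf63a0f02f573fc48ca4d8de34bb0cbfd514f09
size 4893934904
"""

pointer = parse_lfs_pointer(pointer_text)
print(pointer["hash_algo"], pointer["size"])
```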
clip_vision/clip-vision-h.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:64a7ef761bfccbadbaa3da77366aac4185a6c58fa5de5f589b42a65bcc21f161
3
+ size 1264219396
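
Because a pointer records the SHA-256 digest and byte size of the real file, a downloaded copy can be checked against it. The following is a sketch under stated assumptions: `matches_pointer` is a hypothetical helper, and the local path in the usage comment is a placeholder, not a location defined by this repository.

```python
import hashlib
import os


def matches_pointer(path: str, expected_sha256: str, expected_size: int) -> bool:
    """Return True if the file at `path` has the given size and SHA-256 digest."""
    # Cheap check first: a size mismatch means the file cannot match.
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so multi-gigabyte files are not loaded into memory.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256


# Example usage (placeholder path; values copied from the pointer above):
# ok = matches_pointer(
#     "clip_vision/clip-vision-h.safetensors",
#     "64a7ef761bfccbadbaa3da77366aac4185a6c58fa5de5f589b42a65bcc21f161",
#     1264219396,
# )
```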
text_encoders/clip-g.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ec310df2af79c318e24d20511b601a591ca8cd4f1fce1d8dff822a356bcdb1f4
3
+ size 1389382176
text_encoders/clip-l.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:660c6f5b1abae9dc498ac2d21e1347d2abdb0cf6c0c0c8576cd796491d9a6cdd
3
+ size 246144152
text_encoders/t5xxl-fp8.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7d330da4816157540d6bb7838bf63a0f02f573fc48ca4d8de34bb0cbfd514f09
3
+ size 4893934904
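
The pointers for `clip/t5xxl-fp8.safetensors` and `text_encoders/t5xxl-fp8.safetensors` carry the same `oid` and `size`, which is consistent with the README describing the `clip/` copy as an alternate location for the same file. The sketch below groups the five pointers from this commit by digest to make that visible, and converts byte counts to binary gigabytes; the README's rounded sizes (for example 4.6GB) appear to match that unit, though the README does not state which unit it uses. The helper names are hypothetical.

```python
from collections import defaultdict

# (path, sha256 digest, size in bytes) copied from the pointer files in this commit.
POINTERS = [
    ("clip/t5xxl-fp8.safetensors",
     "7d330da4816157540d6bb7838bf63a0f02f573fc48ca4d8de34bb0cbfd514f09", 4893934904),
    ("clip_vision/clip-vision-h.safetensors",
     "64a7ef761bfccbadbaa3da77366aac4185a6c58fa5de5f589b42a65bcc21f161", 1264219396),
    ("text_encoders/clip-g.safetensors",
     "ec310df2af79c318e24d20511b601a591ca8cd4f1fce1d8dff822a356bcdb1f4", 1389382176),
    ("text_encoders/clip-l.safetensors",
     "660c6f5b1abae9dc498ac2d21e1347d2abdb0cf6c0c0c8576cd796491d9a6cdd", 246144152),
    ("text_encoders/t5xxl-fp8.safetensors",
     "7d330da4816157540d6bb7838bf63a0f02f573fc48ca4d8de34bb0cbfd514f09", 4893934904),
]


def find_duplicates(pointers):
    """Map each digest that occurs more than once to the paths sharing it."""
    groups = defaultdict(list)
    for path, digest, _size in pointers:
        groups[digest].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}


def to_gib(size_bytes: int) -> float:
    """Convert a byte count to binary gigabytes (GiB)."""
    return size_bytes / 1024**3


duplicates = find_duplicates(POINTERS)
for digest, paths in duplicates.items():
    print(f"{digest[:12]}... shared by {paths}")

for path, _digest, size in POINTERS:
    print(f"{path}: {to_gib(size):.2f} GiB")
```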