Upload folder using huggingface_hub

Files changed (6)

README.md +21 -58
clip/t5xxl-fp8.safetensors +3 -0
clip_vision/clip-vision-h.safetensors +3 -0
text_encoders/clip-g.safetensors +3 -0
text_encoders/clip-l.safetensors +3 -0
text_encoders/t5xxl-fp8.safetensors +3 -0

README.md CHANGED

@@ -8,11 +8,11 @@ tags:
   - image-generation
 ---
-<!-- README Version: v1.3 -->
 # FLUX.1-dev FP8 Quantized Model Collection
-High-performance 8-bit floating point quantized version of FLUX.1-dev, optimized for reduced VRAM usage while maintaining excellent image generation quality. This collection includes the complete pipeline with text encoders, CLIP models, and IP-Adapter support.
 ## Model Description
@@ -21,8 +21,8 @@ FLUX.1-dev is a state-of-the-art text-to-image diffusion model developed by Blac
 **Key Features**:
 - **FP8 Quantization**: Reduced precision for memory efficiency (~46GB total vs 72GB FP16)
 - **Complete Pipeline**: Includes all components for text-to-image generation
-- **IP-Adapter Support**: Image prompt adapter for style transfer and image-guided generation
-- **Multiple Text Encoders**: CLIP-L, CLIP-G, and T5-XXL for comprehensive text understanding
 - **Production Ready**: Optimized for inference with minimal quality loss
 ## Repository Contents
@@ -31,20 +31,18 @@ FLUX.1-dev is a state-of-the-art text-to-image diffusion model developed by Blac
 flux-dev-fp8/
 ├── checkpoints/
 │   └── flux/
-│       └── flux1-dev-fp8.safetensors           (17GB) - Main checkpoint format
 ├── diffusion_models/
-│   └── flux1-dev-fp8.safetensors               (12GB) - Diffusion model only
 ├── text_encoders/
-│   ├── clip-vit-large.safetensors              (1.6GB) - CLIP ViT-L text encoder
-│   ├── clip_g.safetensors                      (1.3GB) - CLIP-G text encoder
-│   ├── clip_l.safetensors                      (235MB) - CLIP-L text encoder
-│   └── t5xxl_fp8_e4m3fn.safetensors           (4.6GB) - T5-XXL FP8 text encoder
 ├── clip/
-│   └── t5xxl_fp8.safetensors                   (4.6GB) - T5-XXL FP8 (duplicate)
-├── clip_vision/
-│   └── clip_vision_h.safetensors               (1.2GB) - CLIP vision encoder
-└── ipadapter-flux/
-    └── ip-adapter.bin                          (5.0GB) - IP-Adapter weights
 ```
 **Total Repository Size**: 46GB
@@ -127,42 +125,6 @@ pipe = FluxPipeline.from_single_file(
 )
 ```
-### IP-Adapter Image-Guided Generation
-```python
-import torch
-from diffusers import FluxPipeline
-from PIL import Image
-# Load pipeline with IP-Adapter
-pipe = FluxPipeline.from_single_file(
-    "E:/huggingface/flux-dev-fp8/checkpoints/flux/flux1-dev-fp8.safetensors",
-    torch_dtype=torch.float8_e4m3fn
-)
-# Load IP-Adapter weights
-pipe.load_ip_adapter(
-    "E:/huggingface/flux-dev-fp8/ipadapter-flux",
-    weight_name="ip-adapter.bin"
-)
-pipe.set_ip_adapter_scale(0.7)
-# Load reference image
-ref_image = Image.open("reference.jpg")
-# Generate with image guidance
-prompt = "A portrait in the style of the reference image"
-image = pipe(
-    prompt=prompt,
-    ip_adapter_image=ref_image,
-    height=1024,
-    width=1024,
-    num_inference_steps=28
-).images[0]
-image.save("styled_output.png")
-```
 ### Memory-Constrained Setup (16GB VRAM)
 ```python
@@ -208,7 +170,8 @@ image = pipe(
 ### Supported Features
 - Text-to-image generation up to 2048x2048
-- IP-Adapter for image-guided generation
 - Negative prompts for content control
 - CFG (Classifier-Free Guidance) for prompt adherence
 - VAE tiling for high-resolution generation
@@ -312,9 +275,9 @@ For FP8 quantization methodology:
 - **FLUX Reddit**: https://reddit.com/r/StableDiffusion
 - **Discord Community**: https://discord.gg/stablediffusion
-### Related Models in Repository
-- **FLUX.1-dev FP16**: `E:/huggingface/flux-dev-fp16/` - Full precision version (72GB)
-- **FLUX Upscale**: `E:/huggingface/flux-upscale/` - Super-resolution models (192MB)
 ## Troubleshooting
@@ -355,7 +318,7 @@ For issues, questions, or contributions:
 ---
 **Model Version**: FLUX.1-dev FP8
-**Repository Version**: v1.3
-**Last Updated**: 2025-10-14
 **Total Size**: 46GB
-**Format**: SafeTensors (.safetensors, .bin)

   - image-generation
 ---
+<!-- README Version: v1.4 -->
 # FLUX.1-dev FP8 Quantized Model Collection
+High-performance 8-bit floating point quantized version of FLUX.1-dev, optimized for reduced VRAM usage while maintaining excellent image generation quality. This collection includes the complete pipeline with text encoders and CLIP models for production-ready text-to-image generation.
 ## Model Description
 **Key Features**:
 - **FP8 Quantization**: Reduced precision for memory efficiency (~46GB total vs 72GB FP16)
 - **Complete Pipeline**: Includes all components for text-to-image generation
+- **Multiple Text Encoders**: CLIP-L, CLIP-G, CLIP ViT-Large, and T5-XXL for comprehensive text understanding
+- **CLIP Vision Support**: Image understanding capabilities with CLIP-H vision encoder
 - **Production Ready**: Optimized for inference with minimal quality loss
 ## Repository Contents
 flux-dev-fp8/
 ├── checkpoints/
 │   └── flux/
+│       └── flux1-dev-fp8.safetensors           (17GB)  - Full checkpoint with all components
 ├── diffusion_models/
+│   └── flux1-dev-fp8.safetensors               (12GB)  - Core diffusion model (FP8)
 ├── text_encoders/
+│   ├── clip-vit-large.safetensors              (1.6GB) - CLIP ViT-Large text encoder
+│   ├── clip-g.safetensors                      (1.3GB) - CLIP-G text encoder
+│   ├── clip-l.safetensors                      (235MB) - CLIP-L text encoder
+│   └── t5xxl-fp8.safetensors                   (4.6GB) - T5-XXL text encoder (FP8)
 ├── clip/
+│   └── t5xxl-fp8.safetensors                   (4.6GB) - T5-XXL text encoder (alternate location)
+└── clip_vision/
+    └── clip-vision-h.safetensors               (1.2GB) - CLIP-H vision encoder
 ```
 **Total Repository Size**: 46GB
 )
 ```
 ### Memory-Constrained Setup (16GB VRAM)
 ```python
 ### Supported Features
 - Text-to-image generation up to 2048x2048
+- Multiple text encoder architectures for enhanced prompt understanding
+- CLIP vision encoding for potential multimodal applications
 - Negative prompts for content control
 - CFG (Classifier-Free Guidance) for prompt adherence
 - VAE tiling for high-resolution generation
 - **FLUX Reddit**: https://reddit.com/r/StableDiffusion
 - **Discord Community**: https://discord.gg/stablediffusion
+### Related Models in This Repository
+- **FLUX.1-dev FP16**: Available in parent directory - Full precision version (72GB)
+- **FLUX Upscale**: Available in parent directory - Super-resolution models (192MB)
 ## Troubleshooting
 ---
 **Model Version**: FLUX.1-dev FP8
+**Repository Version**: v1.4
+**Last Updated**: 2025-10-28
 **Total Size**: 46GB
+**Format**: SafeTensors (.safetensors)

clip/t5xxl-fp8.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7d330da4816157540d6bb7838bf63a0f02f573fc48ca4d8de34bb0cbfd514f09
+size 4893934904

clip_vision/clip-vision-h.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:64a7ef761bfccbadbaa3da77366aac4185a6c58fa5de5f589b42a65bcc21f161
+size 1264219396

text_encoders/clip-g.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ec310df2af79c318e24d20511b601a591ca8cd4f1fce1d8dff822a356bcdb1f4
+size 1389382176

text_encoders/clip-l.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:660c6f5b1abae9dc498ac2d21e1347d2abdb0cf6c0c0c8576cd796491d9a6cdd
+size 246144152

text_encoders/t5xxl-fp8.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7d330da4816157540d6bb7838bf63a0f02f573fc48ca4d8de34bb0cbfd514f09
+size 4893934904
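
The five added files are Git LFS pointers, so each one records only a SHA-256 digest and a byte count for the real weights. The sketch below is one way to work with them and is not part of this commit. It parses the pointer text, groups files that share a digest (the two `t5xxl-fp8.safetensors` entries point at the same object), converts the byte counts to binary units for comparison with the sizes in the README tree, and checks a downloaded file against its pointer. The helper names `parse_lfs_pointer`, `binary_size`, and `matches_pointer` are hypothetical, and the comparison assumes the README sizes use binary units (GiB/MiB) rather than decimal GB/MB.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

SPEC = "version https://git-lfs.github.com/spec/v1\n"

# Pointer contents copied from the diff above (path -> pointer text).
POINTERS = {
    "clip/t5xxl-fp8.safetensors": SPEC
    + "oid sha256:7d330da4816157540d6bb7838bf63a0f02f573fc48ca4d8de34bb0cbfd514f09\n"
    + "size 4893934904\n",
    "clip_vision/clip-vision-h.safetensors": SPEC
    + "oid sha256:64a7ef761bfccbadbaa3da77366aac4185a6c58fa5de5f589b42a65bcc21f161\n"
    + "size 1264219396\n",
    "text_encoders/clip-g.safetensors": SPEC
    + "oid sha256:ec310df2af79c318e24d20511b601a591ca8cd4f1fce1d8dff822a356bcdb1f4\n"
    + "size 1389382176\n",
    "text_encoders/clip-l.safetensors": SPEC
    + "oid sha256:660c6f5b1abae9dc498ac2d21e1347d2abdb0cf6c0c0c8576cd796491d9a6cdd\n"
    + "size 246144152\n",
    "text_encoders/t5xxl-fp8.safetensors": SPEC
    + "oid sha256:7d330da4816157540d6bb7838bf63a0f02f573fc48ca4d8de34bb0cbfd514f09\n"
    + "size 4893934904\n",
}


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def binary_size(num_bytes):
    """Format a byte count in GiB or MiB with one decimal place."""
    for unit, factor in (("GiB", 2**30), ("MiB", 2**20)):
        if num_bytes >= factor:
            return f"{num_bytes / factor:.1f} {unit}"
    return f"{num_bytes} B"


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file has the size and SHA-256 digest named in the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first, avoids hashing multi-GB files needlessly
    sha = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest() == pointer["digest"]


# Group paths by digest to find files that refer to the same LFS object.
by_digest = defaultdict(list)
for repo_path, pointer_text in POINTERS.items():
    by_digest[parse_lfs_pointer(pointer_text)["digest"]].append(repo_path)
duplicates = {d: paths for d, paths in by_digest.items() if len(paths) > 1}

for repo_path, pointer_text in POINTERS.items():
    print(repo_path, binary_size(parse_lfs_pointer(pointer_text)["size"]))
print("shared objects:", list(duplicates.values()))
```

After downloading, a call such as `matches_pointer(local_path, parse_lfs_pointer(POINTERS["text_encoders/clip-l.safetensors"]))` would confirm that the local copy matches the object named in the pointer; `local_path` here stands for wherever the file was saved.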