prithivMLmods committed 7b6c5d0 · verified · 1 Parent(s): 142ab71

Update README.md
  - compressed-tensors
---
![1](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/kyWNODHEkGK1tAxe0aUVD.png)

`fp8 @` - [model](https://huggingface.co/prithivMLmods/FireRed-Image-Edit-1.0-fp8/tree/main/transformer)

# **FireRed-Image-Edit-1.0-fp8**

> **FireRed-Image-Edit-1.0-fp8** is an FP8-compressed diffusion-transformer variant of **FireRedTeam/FireRed-Image-Edit-1.0**.
> This release ships **Diffusers-compatible transformer weights only**, reducing memory usage and improving throughput while preserving the high-fidelity, instruction-based image-editing capabilities of the original model.

> [!important]
> This release compresses **only the diffusion transformer module** (`QwenImageTransformer2DModel`), using **FP8 (F8_E4M3) weight quantization with a BF16 compute fallback**. The VAE, scheduler, text encoders, and all other pipeline components are unchanged and must be loaded from the base model. FP8 (8-bit floating-point) weight and activation quantization is hardware-accelerated on supporting GPUs – see [FP8 W8A8](https://docs.vllm.ai/en/stable/features/quantization/fp8/). For the W8A8 FP8-dynamic quantization recipe, see the llm-compressor [examples](https://github.com/vllm-project/llm-compressor/tree/main/examples/quantization_w8a8_fp8).
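For reference, FP8-dynamic recipes in llm-compressor are typically expressed as a small YAML fragment along the following lines. This is an illustrative sketch of the recipe style only; the exact `targets`/`ignore` lists used to produce this checkpoint are assumptions, not the repository's actual recipe:

```yaml
# Sketch of an llm-compressor W8A8 FP8-dynamic recipe (illustrative).
# FP8_DYNAMIC: static per-tensor FP8 weights, dynamic FP8 activations.
quant_stage:
  quant_modifiers:
    QuantizationModifier:
      targets: ["Linear"]     # quantize linear layers (assumed)
      scheme: "FP8_DYNAMIC"
```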
## Diffusers Usage
```python
import torch
from diffusers import QwenImageEditPlusPipeline
from diffusers.models import QwenImageTransformer2DModel
from diffusers.utils import load_image

# Load only the FP8-compressed transformer weights from this repository.
transformer = QwenImageTransformer2DModel.from_pretrained(
    "prithivMLmods/FireRed-Image-Edit-1.0-fp8",
    subfolder="transformer",
    torch_dtype=torch.bfloat16,
)

# All other components (VAE, scheduler, text encoders) come from the base model.
pipeline = QwenImageEditPlusPipeline.from_pretrained(
    "FireRedTeam/FireRed-Image-Edit-1.0",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipeline.to("cuda")

image1 = load_image("grumpycat.png")
prompt = "turn the cat into an orange cat"

inputs = {
    "image": [image1],
    "prompt": prompt,
    "generator": torch.manual_seed(42),
    "true_cfg_scale": 1.0,
    "negative_prompt": " ",
    "num_inference_steps": 40,
    "guidance_scale": 1.0,
    "num_images_per_prompt": 1,
}

output = pipeline(**inputs)
output_image = output.images[0]
output_image.save("output_image_edit_plus.png")
```
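The transformer weights loaded above are stored in the OCP **F8_E4M3** format: 1 sign bit, 4 exponent bits (bias 7), 3 mantissa bits, no infinities, and a maximum finite value of 448. The following is a minimal, framework-free sketch of that number grid (an illustration of the format, not the compressed-tensors implementation); `quantize_e4m3` and its saturate-at-448 behavior are assumptions for demonstration:

```python
def e4m3_values():
    """All finite, non-negative OCP FP8 E4M3 values: 4 exponent bits
    (bias 7) and 3 mantissa bits, with no infinities; exponent=15 with
    mantissa=7 is reserved for NaN, so the largest finite value is 448."""
    vals = [0.0]
    for m in range(1, 8):                 # subnormals: (m/8) * 2^-6
        vals.append(m / 8 * 2.0 ** -6)
    for e in range(1, 16):                # normals: (1 + m/8) * 2^(e-7)
        for m in range(8):
            if e == 15 and m == 7:        # NaN encoding, skip
                continue
            vals.append((1 + m / 8) * 2.0 ** (e - 7))
    return vals

def quantize_e4m3(x):
    """Round x to the nearest representable E4M3 value, saturating at
    the format's maximum magnitude of 448 (illustrative only)."""
    mag = min(abs(x), 448.0)
    q = min(e4m3_values(), key=lambda v: abs(v - mag))
    return -q if x < 0 else q
```

For example, `quantize_e4m3(0.1)` lands on `0.1015625`, the nearest point on the 127-value non-negative grid. This coarseness is why the BF16 compute fallback matters: weights are stored in FP8 but arithmetic runs at higher precision.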
## About the Base Model

**FireRed-Image-Edit-1.0** from FireRedTeam is a state-of-the-art open-source diffusion transformer designed for instruction-based image editing.

It achieves top-tier performance through:

* A **1.6B-sample dataset**, refined to **100M+ high-quality text-to-image and editing pairs**
* Cleaning, stratification, and auto-labeling
* Dual-stage filtering for optimal semantic coverage and instruction alignment
### Multi-Stage Training Pipeline

1. Pre-training
2. Supervised fine-tuning
3. Reinforcement learning
### Key Innovations

* **Multi-Condition Aware Bucket Sampler** for efficient variable-resolution batching
* **Stochastic Instruction Alignment** with dynamic prompt re-indexing
* **Asymmetric Gradient Optimization** for stable DPO
* **DiffusionNFT** with layout-aware OCR rewards for precise text editing
* **Differentiable Consistency Loss** for identity preservation
## Native Capabilities

* Photo restoration
* Multi-image editing such as virtual try-on
* Style transfer with text fidelity
* Complex instruction adherence
* Layout-aware text editing
* Identity-preserving edits
* Professional photorealistic refinements
* Skin texture realism
* Multi-outfit changes in single passes
It achieves strong results across:

* REDEdit-Bench, with 15 editing categories
* ImgEdit
* GEdit

The model performs native editing built on text-to-image diffusion foundations rather than patch-based methods, enabling coherent, high-quality outputs suitable for professional workflows and ComfyUI integration.
## What FP8 Adds

The **FireRed-Image-Edit-1.0-fp8** variant introduces:

* **FP8 (F8_E4M3) transformer weight compression with a BF16 compute fallback**
* Reduced VRAM usage
* Improved throughput
* Faster inference on Hopper and other GPUs with native FP8 support
* Production-friendly deployment without modifying the original pipeline structure
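The VRAM reduction is easy to estimate from the weight width alone: BF16 stores two bytes per parameter, F8_E4M3 one. A back-of-the-envelope sketch, where the 20B parameter count is a hypothetical figure for illustration rather than a measured property of this checkpoint:

```python
def weight_memory_gib(num_params, bytes_per_weight):
    """Approximate weight-only memory in GiB. Ignores activations, the
    unchanged pipeline components (VAE, text encoders, scheduler), and
    the negligible per-tensor FP8 scale factors."""
    return num_params * bytes_per_weight / 2**30

N = 20_000_000_000                    # hypothetical transformer parameter count
bf16_gib = weight_memory_gib(N, 2)    # 16-bit (BF16) weights
fp8_gib = weight_memory_gib(N, 1)     # 8-bit (F8_E4M3) weights
print(f"BF16: {bf16_gib:.1f} GiB, FP8: {fp8_gib:.1f} GiB")
```

FP8 halves the transformer's weight footprint; the total pipeline saving is smaller, since the VAE and text encoders are loaded at their original precision.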
> [!NOTE]
> Only the transformer weights are compressed. All other components must be loaded from the base model, ensuring seamless compatibility with existing Diffusers pipelines. This repository strictly follows the release notes, license, and terms and conditions of the original model: [FireRed-Image-Edit-1.0](https://huggingface.co/FireRedTeam/FireRed-Image-Edit-1.0).
 