Update README.md
README.md
CHANGED
|
@@ -14,113 +14,3 @@ tags:
|
|
| 14 |
- compressed-tensors
|
| 15 |
---
|
| 16 |
|
| 17 |
-

|
| 18 |
-
|
| 19 |
-
`fp8 @` - [model](https://huggingface.co/prithivMLmods/FireRed-Image-Edit-1.0-fp8/tree/main/transformer)
|
| 20 |
-
|
| 21 |
-
# **FireRed-Image-Edit-1.0-fp8**
|
| 22 |
-
|
| 23 |
-
> **FireRed-Image-Edit-1.0-fp8** is an FP8-compressed diffusion transformer variant built on top of **FireRedTeam/FireRed-Image-Edit-1.0**.
|
| 24 |
-
> This release provides **Diffusers-compatible transformer weights only**, enabling reduced memory usage and improved throughput while preserving the high-fidelity instruction-based image editing capabilities of the original model.
|
| 25 |
-
|
| 26 |
-
> [!important]
|
| 27 |
-
> This release compresses **only the diffusion transformer module** (`QwenImageTransformer2DModel`) using **FP8 (F8_E4M3) weight quantization with BF16 compute fallback**. The VAE, scheduler, text encoders, and other pipeline components remain unchanged and must be loaded from the base model. For background on hardware-accelerated FP8 (8-bit floating point) weight and activation quantization on GPUs, see [FP8 W8A8](https://docs.vllm.ai/en/stable/features/quantization/fp8/). For the W8A8 FP8-dynamic quantization recipe, see the llm-compressor [examples](https://github.com/vllm-project/llm-compressor/tree/main/examples/quantization_w8a8_fp8).
|
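To make the F8_E4M3 format named in the note above concrete, here is a minimal pure-Python sketch of per-tensor FP8 weight quantization: scale the tensor so its largest magnitude maps to 448 (the largest finite E4M3 value), round each scaled value to the nearest representable E4M3 number, and multiply by the scale again when computing in higher precision. The helper names (`round_to_e4m3`, `quantize_per_tensor`) and the example weights are hypothetical. This is a simplification for illustration only, not the procedure used to produce this checkpoint; the linked recipe may use a different scaling granularity.

```python
import math

E4M3_MAX = 448.0      # largest finite value in F8_E4M3
E4M3_MIN_EXP = -6     # smallest normal exponent; smaller values are subnormal
MANTISSA_BITS = 3

def round_to_e4m3(x: float) -> float:
    """Round x to the nearest E4M3-representable value, saturating at +/-448."""
    if x == 0.0:
        return 0.0
    _, e = math.frexp(abs(x))              # abs(x) = m * 2**e with 0.5 <= m < 1
    exp = max(e - 1, E4M3_MIN_EXP)         # floor(log2(|x|)), clamped for subnormals
    step = 2.0 ** (exp - MANTISSA_BITS)    # spacing of representable values here
    q = round(x / step) * step             # round() rounds halves to even
    return max(-E4M3_MAX, min(E4M3_MAX, q))

def quantize_per_tensor(weights):
    """Return (fp8_values, scale) such that weights[i] ~= fp8_values[i] * scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / E4M3_MAX if max_abs > 0 else 1.0
    return [round_to_e4m3(w / scale) for w in weights], scale

weights = [0.02, -0.5, 0.3, 1.2]           # made-up example values
q, scale = quantize_per_tensor(weights)
dequantized = [v * scale for v in q]       # what a higher-precision compute path sees
```

With three mantissa bits, the relative rounding error for values in the normal range stays within about 6.25% (2^-4), which is why the scale is chosen to keep the tensor inside that range.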
| 28 |
-
|
| 29 |
-
## Diffusers Usage
|
| 30 |
-
|
| 31 |
-
```python
|
| 32 |
-
import torch
|
| 33 |
-
from diffusers.models import QwenImageTransformer2DModel
|
| 34 |
-
from diffusers import QwenImageEditPlusPipeline
|
| 35 |
-
from diffusers.utils import load_image
|
| 36 |
-
|
| 37 |
-
transformer = QwenImageTransformer2DModel.from_pretrained(
|
| 38 |
-
    "prithivMLmods/FireRed-Image-Edit-1.0-fp8",
|
| 39 |
-
    subfolder="transformer",
|
| 40 |
-
    torch_dtype=torch.bfloat16
|
| 41 |
-
)
|
| 42 |
-
|
| 43 |
-
pipeline = QwenImageEditPlusPipeline.from_pretrained(
|
| 44 |
-
    "FireRedTeam/FireRed-Image-Edit-1.0",
|
| 45 |
-
    transformer=transformer,
|
| 46 |
-
    torch_dtype=torch.bfloat16
|
| 47 |
-
)
|
| 48 |
-
|
| 49 |
-
pipeline.to("cuda")
|
| 50 |
-
|
| 51 |
-
image1 = load_image("grumpycat.png")
|
| 52 |
-
prompt = "turn the cat into an orange cat"
|
| 53 |
-
|
| 54 |
-
inputs = {
|
| 55 |
-
    "image": [image1],
|
| 56 |
-
    "prompt": prompt,
|
| 57 |
-
    "generator": torch.manual_seed(42),
|
| 58 |
-
    "true_cfg_scale": 1.0,
|
| 59 |
-
    "negative_prompt": " ",
|
| 60 |
-
    "num_inference_steps": 40,
|
| 61 |
-
    "guidance_scale": 1.0,
|
| 62 |
-
    "num_images_per_prompt": 1,
|
| 63 |
-
}
|
| 64 |
-
|
| 65 |
-
output = pipeline(**inputs)
|
| 66 |
-
output_image = output.images[0]
|
| 67 |
-
output_image.save("output_image_edit_plus.png")
|
| 68 |
-
```
|
| 69 |
-
|
| 70 |
-
## About the Base Model
|
| 71 |
-
|
| 72 |
-
**FireRed-Image-Edit-1.0** from FireRedTeam is a state-of-the-art open-source diffusion transformer designed for instruction-based image editing.
|
| 73 |
-
|
| 74 |
-
It achieves top-tier performance through:
|
| 75 |
-
|
| 76 |
-
* A **1.6B-sample dataset**, refined to **100M+ high-quality text-to-image and editing pairs**
|
| 77 |
-
* Cleaning, stratification, and auto-labeling
|
| 78 |
-
* Dual-stage filtering for optimal semantic coverage and instruction alignment
|
| 79 |
-
|
| 80 |
-
### Multi-Stage Training Pipeline
|
| 81 |
-
|
| 82 |
-
1. Pre-training
|
| 83 |
-
2. Supervised fine-tuning
|
| 84 |
-
3. Reinforcement learning
|
| 85 |
-
|
| 86 |
-
### Key Innovations
|
| 87 |
-
|
| 88 |
-
* **Multi-Condition Aware Bucket Sampler** for efficient variable-resolution batching
|
| 89 |
-
* **Stochastic Instruction Alignment** with dynamic prompt re-indexing
|
| 90 |
-
* **Asymmetric Gradient Optimization** for stable DPO
|
| 91 |
-
* **DiffusionNFT** with layout-aware OCR rewards for precise text editing
|
| 92 |
-
* **Differentiable Consistency Loss** for identity preservation
|
| 93 |
-
|
| 94 |
-
## Native Capabilities
|
| 95 |
-
|
| 96 |
-
* Photo restoration
|
| 97 |
-
* Multi-image editing such as virtual try-on
|
| 98 |
-
* Style transfer with text fidelity
|
| 99 |
-
* Complex instruction adherence
|
| 100 |
-
* Layout-aware text editing
|
| 101 |
-
* Identity-preserving edits
|
| 102 |
-
* Professional photorealistic refinements
|
| 103 |
-
|
| 104 |
-
* Skin texture realism
|
| 105 |
-
* Multi-outfit changes in single passes
|
| 106 |
-
|
| 107 |
-
It achieves strong results across:
|
| 108 |
-
|
| 109 |
-
* REDEdit-Bench with 15 editing categories
|
| 110 |
-
* ImgEdit
|
| 111 |
-
* GEdit
|
| 112 |
-
|
| 113 |
-
The model supports native editing from text-to-image diffusion foundations rather than patch-based methods, enabling coherent, high-quality outputs suitable for professional workflows and ComfyUI integration.
|
| 114 |
-
|
| 115 |
-
## What FP8 Adds
|
| 116 |
-
|
| 117 |
-
The **FireRed-Image-Edit-1.0-fp8** variant introduces:
|
| 118 |
-
|
| 119 |
-
* **FP8 (F8_E4M3) transformer weight compression with BF16 compute fallback**
|
| 120 |
-
* Reduced VRAM usage
|
| 121 |
-
* Improved throughput
|
| 122 |
-
* Faster inference on Hopper and other GPUs with native FP8 support
|
| 123 |
-
* Production-friendly deployment without modifying the original pipeline structure
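The reduced VRAM usage listed above follows from storage arithmetic: BF16 stores each weight in 2 bytes and FP8 in 1 byte, so the transformer's weight footprint roughly halves. A minimal sketch, assuming a hypothetical parameter count (the README does not state the transformer's size) and ignoring quantization scales, activations, and any layers left unquantized:

```python
def weight_gib(num_params: int, bits_per_param: int) -> float:
    """Approximate weight storage in GiB for a parameter count and bit width."""
    return num_params * bits_per_param / 8 / 2**30

# Placeholder value for illustration only; not the real model size.
HYPOTHETICAL_PARAMS = 20_000_000_000

bf16 = weight_gib(HYPOTHETICAL_PARAMS, 16)
fp8 = weight_gib(HYPOTHETICAL_PARAMS, 8)
print(f"BF16: {bf16:.1f} GiB, FP8: {fp8:.1f} GiB, saved: {bf16 - fp8:.1f} GiB")
```

Only the transformer shrinks this way; the VAE and text encoders are loaded from the base model at their original precision, so total pipeline memory drops by less than half.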
|
| 124 |
-
|
| 125 |
-
> [!NOTE]
|
| 126 |
-
> Only the transformer weights are compressed. All other components must be loaded from the base model, ensuring seamless compatibility with existing Diffusers pipelines. This repository strictly follows the release notes, license, and terms and conditions of the original model: [FireRed-Image-Edit-1.0](https://huggingface.co/FireRedTeam/FireRed-Image-Edit-1.0).
|
|
|
|
| 14 |
- compressed-tensors
|
| 15 |
---
|
| 16 |