codemichaeld/T5Base_fp8

@@ -3,41 +3,48 @@ library_name: diffusers
 tags:
 - fp8
 - safetensors
-- lora
-- low-rank
 - diffusion
 - converted-by-gradio
 ---
-# FP8 Model with Low-Rank LoRA
 - **Source**: `https://huggingface.co/LifuWang/DistillT5`
 - **File**: `model.safetensors`
 - **FP8 Format**: `E5M2`
-- **LoRA Rank**: 64
-- **LoRA File**: `model-lora-r64.safetensors`
 ## Usage (Inference)
 ```python
 from safetensors.torch import load_file
 import torch
-# Load FP8 model
 fp8_state = load_file("model-fp8-e5m2.safetensors")
-lora_state = load_file("model-lora-r64.safetensors")
-# Reconstruct approximate original weights
 reconstructed = {}
 for key in fp8_state:
-    if f"lora_A.{key}" in lora_state and f"lora_B.{key}" in lora_state:
-        A = lora_state[f"lora_A.{key}"].to(torch.float32)
-        B = lora_state[f"lora_B.{key}"].to(torch.float32)
-        lora_weight = B @ A  # (out, rank) @ (rank, in) -> (out, in)
-        fp8_weight = fp8_state[key].to(torch.float32)
-        reconstructed[key] = fp8_weight + lora_weight
     else:
-        reconstructed[key] = fp8_state[key].to(torch.float32)
 ```
-> Requires PyTorch ≥ 2.1 for FP8 support.

 tags:
 - fp8
 - safetensors
+- quantization
+- precision-recovery
 - diffusion
 - converted-by-gradio
 ---
+# FP8 Model with Precision Recovery
 - **Source**: `https://huggingface.co/LifuWang/DistillT5`
 - **File**: `model.safetensors`
 - **FP8 Format**: `E5M2`
+- **Correction Mode**: per_tensor
+- **Correction File**: `model-correction.safetensors`
+- **FP8 File**: `model-fp8-e5m2.safetensors`
 ## Usage (Inference)
 ```python
+import os
 from safetensors.torch import load_file
 import torch
+# Load FP8 model and correction factors
 fp8_state = load_file("model-fp8-e5m2.safetensors")
+correction_state = load_file("model-correction.safetensors") if os.path.exists("model-correction.safetensors") else {}
+# Reconstruct high-precision weights
 reconstructed = {}
 for key in fp8_state:
+    fp8_weight = fp8_state[key].to(torch.float32)
+    # Apply correction if available
+    correction_key = f"correction.{key}"
+    if correction_key in correction_state:
+        correction = correction_state[correction_key].to(torch.float32)
+        reconstructed[key] = fp8_weight + correction
     else:
+        reconstructed[key] = fp8_weight
+# Use reconstructed weights in your model (`model` must already be instantiated with a matching architecture)
+model.load_state_dict(reconstructed)
 ```
+## Correction Modes
+- **Per-Channel**: Computes mean correction per output channel (best for most layers)
+- **Per-Tensor**: Single correction value per tensor (lightweight)
+- **None**: No correction (pure FP8)
+> Requires PyTorch ≥ 2.1 for FP8 support. For best quality, use the correction file during inference.
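
The correction modes listed above can be illustrated with a small sketch. This is not the converter's actual code. It assumes the correction is the mean of `original - fp8`, taken over the whole tensor or per row (treating each row as an output channel). That matches the `fp8_weight + correction` step in the usage snippet but is not confirmed by the source. To stay dependency-free, it approximates E5M2 rounding with `struct` instead of using `torch.float8_e5m2`. The names `to_e5m2` and `mse` and the sample weights are hypothetical.

```python
import struct


def to_e5m2(x: float) -> float:
    """Approximate FP8 E5M2 rounding in pure Python (illustrative only).

    E5M2 shares float16's sign and exponent layout, so this keeps the top
    8 bits of the float16 encoding, rounding to nearest-even. Intended only
    for small finite values such as typical weights.
    """
    bits = struct.unpack("<H", struct.pack("<e", x))[0]
    upper, lower = bits >> 8, bits & 0xFF
    if lower > 0x80 or (lower == 0x80 and upper & 1):
        upper += 1
    return struct.unpack("<e", struct.pack("<H", upper << 8))[0]


def mse(reference, approx):
    """Mean squared error between two equally shaped nested lists."""
    sq = [(r - a) ** 2 for rr, ar in zip(reference, approx) for r, a in zip(rr, ar)]
    return sum(sq) / len(sq)


# Hypothetical 2x4 weight; rows are ASSUMED to be output channels.
weight = [[0.11, 0.23, 0.37, 0.45], [-0.52, -0.61, -0.74, -0.83]]
fp8 = [[to_e5m2(w) for w in row] for row in weight]

# Per-tensor (assumed): one scalar, the mean quantization error of the tensor.
errors = [w - q for wr, qr in zip(weight, fp8) for w, q in zip(wr, qr)]
tensor_corr = sum(errors) / len(errors)

# Per-channel (assumed): one mean quantization error per row.
channel_corr = [
    sum(w - q for w, q in zip(wr, qr)) / len(wr) for wr, qr in zip(weight, fp8)
]

# Reconstruction mirrors the `fp8_weight + correction` step from the usage code.
rec_tensor = [[q + tensor_corr for q in row] for row in fp8]
rec_channel = [[q + c for q in row] for row, c in zip(fp8, channel_corr)]

mse_none = mse(weight, fp8)
mse_tensor = mse(weight, rec_tensor)
mse_channel = mse(weight, rec_channel)
print(mse_none, mse_tensor, mse_channel)
```

Under these assumptions, subtracting a mean error can never increase the mean squared error, so the per-channel result is at most the per-tensor result, which is at most the uncorrected one. How much is actually recovered depends on the real weights and on how the correction file was produced.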