codemichaeld/T5Base_fp8

@@ -3,48 +3,36 @@ library_name: diffusers
 tags:
 - fp8
 - safetensors
-- quantization
-- precision-recovery
 - diffusion
 - converted-by-gradio
 ---
-# FP8 Model with Precision Recovery
 - **Source**: `https://huggingface.co/LifuWang/DistillT5`
 - **File**: `model.safetensors`
 - **FP8 Format**: `E5M2`
-- **Correction Mode**: per_channel
-- **Correction File**: `model-correction.safetensors`
 - **FP8 File**: `model-fp8-e5m2.safetensors`
 ## Usage (Inference)
 ```python
 from safetensors.torch import load_file
 import torch
-# Load FP8 model and correction factors
 fp8_state = load_file("model-fp8-e5m2.safetensors")
-correction_state = load_file("model-correction.safetensors") if os.path.exists("model-correction.safetensors") else {}
-# Reconstruct high-precision weights
 reconstructed = {}
 for key in fp8_state:
-    fp8_weight = fp8_state[key].to(torch.float32)
-    # Apply correction if available
-    correction_key = f"correction.{key}"
-    if correction_key in correction_state:
-        correction = correction_state[correction_key].to(torch.float32)
-        reconstructed[key] = fp8_weight + correction
     else:
-        reconstructed[key] = fp8_weight
-# Use reconstructed weights in your model
-model.load_state_dict(reconstructed)
 ```
-## Correction Modes
-- **Per-Channel**: Computes mean correction per output channel (best for most layers)
-- **Per-Tensor**: Single correction value per tensor (lightweight)
-- **None**: No correction (pure FP8)
-> Requires PyTorch ≥ 2.1 for FP8 support. For best quality, use the correction file during inference.

 tags:
 - fp8
 - safetensors
+- lora
+- low-rank
 - diffusion
 - converted-by-gradio
 ---
+# FP8 Model with Low-Rank LoRA
 - **Source**: `https://huggingface.co/LifuWang/DistillT5`
 - **File**: `model.safetensors`
 - **FP8 Format**: `E5M2`
+- **LoRA Rank**: 128
+- **Architecture**: text_encoder
+- **LoRA File**: `model-lora-r128.safetensors`
 - **FP8 File**: `model-fp8-e5m2.safetensors`
 ## Usage (Inference)
 ```python
 from safetensors.torch import load_file
 import torch
+# Load FP8 model
 fp8_state = load_file("model-fp8-e5m2.safetensors")
+lora_state = load_file("model-lora-r128.safetensors")
+# Reconstruct approximate original weights
 reconstructed = {}
 for key in fp8_state:
+    if f"lora_A.{key}" in lora_state and f"lora_B.{key}" in lora_state:
+        A = lora_state[f"lora_A.{key}"].to(torch.float32)
+        B = lora_state[f"lora_B.{key}"].to(torch.float32)
+        lora_weight = B @ A  # (out_features, rank) @ (rank, in_features) -> (out_features, in_features)
+        fp8_weight = fp8_state[key].to(torch.float32)
+        reconstructed[key] = fp8_weight + lora_weight
     else:
+        reconstructed[key] = fp8_state[key].to(torch.float32)
 ```
+> Requires PyTorch ≥ 2.1 for FP8 support.
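
The reconstruction loop added in this change can be wrapped in a small helper. The following is a minimal sketch, assuming the `lora_A.{key}` / `lora_B.{key}` naming shown in the diff; the function name `reconstruct_state`, the example key names, and the synthetic tensors are hypothetical and only illustrate the shapes involved.

```python
import torch


def reconstruct_state(fp8_state, lora_state):
    """Return float32 weights: the upcast FP8 tensor plus B @ A where both factors exist."""
    reconstructed = {}
    for key, tensor in fp8_state.items():
        weight = tensor.to(torch.float32)
        a = lora_state.get(f"lora_A.{key}")
        b = lora_state.get(f"lora_B.{key}")
        if a is not None and b is not None:
            # (out_features, rank) @ (rank, in_features) -> (out_features, in_features)
            weight = weight + b.to(torch.float32) @ a.to(torch.float32)
        reconstructed[key] = weight
    return reconstructed


# Tiny synthetic check; these key names and shapes are made up for illustration.
fp8_state = {"layer.weight": torch.zeros(4, 3), "layer.bias": torch.ones(4)}
lora_state = {
    "lora_A.layer.weight": torch.ones(2, 3),  # (rank, in_features)
    "lora_B.layer.weight": torch.ones(4, 2),  # (out_features, rank)
}
demo = reconstruct_state(fp8_state, lora_state)
```

With the real files, the two arguments would be the dictionaries returned by `load_file("model-fp8-e5m2.safetensors")` and `load_file("model-lora-r128.safetensors")`. Tensors without a matching pair of factors, such as the bias in the synthetic example, are only upcast to float32.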