---
library_name: diffusers
tags:
- fp8
- safetensors
- lora
- low-rank
- diffusion
- converted-by-gradio
---
# FP8 Model with Low-Rank LoRA

- Source: https://huggingface.co/LifuWang/DistillT5
- File: model.safetensors
- FP8 Format: E5M2
- LoRA Rank: 128
- Architecture: text_encoder
- LoRA File: model-lora-r128.safetensors
- FP8 File: model-fp8-e5m2.safetensors
## Usage (Inference)

```python
import torch
from safetensors.torch import load_file

# Load the FP8 base weights and the low-rank LoRA correction
fp8_state = load_file("model-fp8-e5m2.safetensors")
lora_state = load_file("model-lora-r128.safetensors")

# Reconstruct approximate original weights: W ≈ W_fp8 + B @ A
reconstructed = {}
for key in fp8_state:
    if f"lora_A.{key}" in lora_state and f"lora_B.{key}" in lora_state:
        A = lora_state[f"lora_A.{key}"].to(torch.float32)  # (rank, in_features)
        B = lora_state[f"lora_B.{key}"].to(torch.float32)  # (out_features, rank)
        lora_weight = B @ A  # (out_features, in_features)
        fp8_weight = fp8_state[key].to(torch.float32)
        reconstructed[key] = fp8_weight + lora_weight
    else:
        reconstructed[key] = fp8_state[key].to(torch.float32)
```
Requires PyTorch ≥ 2.1 for FP8 support.