Update README.md
README.md (changed)
@@ -52,7 +52,7 @@ FP8 quantized versions of the [LTX-2.3 22B](https://huggingface.co/Lightricks/LT
 - **Method:** Static per-tensor W8A8 quantization
 - **Scope:** Transformer blocks 1–42 (block 0 and last 5 blocks kept in BF16)
 - **Targets:** All linear projection weight matrices in `attn1`, `attn2`, `audio_attn1`, `audio_attn2`, `audio_to_video_attn`, `video_to_audio_attn`, `ff.net`, `audio_ff.net` — specifically `to_q`, `to_k`, `to_v`, `to_out.0`, `ff.net.0.proj`, `ff.net.2` and their audio equivalents
-- **Scale:** Per-tensor `
+- **Scale:** Per-tensor `weight_scale = max(|W|) / 448` stored as F32 scalar alongside each weight. Static `input_scale = 1.0` placeholder matching the source model format
 - **Non-quantized:** Biases, norms, scale_shift_tables, gate_logits kept as BF16/F32
 - **Quantized tensors:** 1176 / 5947 total (28 patterns × 42 blocks)
 - **Output size:** ~29.94 GB (down from ~46 GB BF16)
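For reference, the scale rule in the added line can be sketched in NumPy. This is a minimal sketch under stated assumptions: 448 is the largest finite value in the FP8 E4M3 format, the actual bit-level FP8 rounding/cast is omitted, and the function names are illustrative, not from the repository.

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value representable in float8_e4m3fn


def quantize_weight_per_tensor(w: np.ndarray):
    """Static per-tensor weight quantization sketch (FP8 cast itself omitted)."""
    # weight_scale = max(|W|) / 448, stored as an F32 scalar alongside the weight
    weight_scale = np.float32(np.abs(w).max() / FP8_E4M3_MAX)
    # After dividing by the scale, the weight fits inside the FP8 E4M3 range
    w_scaled = np.clip(w / weight_scale, -FP8_E4M3_MAX, FP8_E4M3_MAX).astype(np.float32)
    # Static input_scale = 1.0 placeholder, matching the source model format
    input_scale = np.float32(1.0)
    return w_scaled, weight_scale, input_scale


def dequantize(w_scaled: np.ndarray, weight_scale: np.float32) -> np.ndarray:
    return w_scaled * weight_scale


w = np.array([[-896.0, 224.0], [112.0, -56.0]], dtype=np.float32)
w_scaled, weight_scale, input_scale = quantize_weight_per_tensor(w)
# weight_scale == 896 / 448 == 2.0, and max(|w_scaled|) == 448
```

Because the scale maps the tensor's absolute maximum exactly onto the FP8 range limit, dequantizing with `weight_scale` recovers the original values up to FP8 rounding (exact in this sketch, which skips the cast).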