Commit 62cb673 by xycld (verified)
1 parent: 1d58fe0

Add quantization benchmark table to README

Files changed (1)
  1. README.md +14 -0
README.md CHANGED
@@ -45,6 +45,20 @@ tokenizer_path = hf_hub_download("xycld/lyric-align-mms-fa", "tokenizer.json")
  - **Output:** log-probability emission matrix [num_frames, 29 labels] at 50fps (20ms/frame)
  - **ONNX opset:** 18

+ ## Quantized Variants
+
+ | Variant | Size | Compression | Load Time | Inference | MAE | Acc @50ms | Acc @100ms | Acc @200ms | Status |
+ |:--------|-----:|:-----------:|----------:|----------:|----:|----------:|-----------:|-----------:|:-------|
+ | **FP32** (original) | 1,207 MB | 1.0x | 989ms | 424ms/line | 34.7ms | 86.2% | 97.5% | 99.4% | Available |
+ | **FP16** | 605 MB | 2.0x | 2,335ms | 576ms/line | 34.7ms | 86.2% | 97.5% | 99.4% | Not recommended |
+ | **UINT8** | 303 MB | 4.0x | 412ms | 262ms/line | 34.9ms | 86.2% | 97.5% | 99.4% | **Recommended** |
+
+ > Benchmark: Chinese song "错位时空" (362 characters, 53 lines) on CPU.
+
+ **UINT8 is the recommended variant** — 75% smaller, 38% faster inference, with virtually no accuracy loss (MAE +0.2ms).
+
+ FP16 is not recommended for CPU inference (no native FP16 support, slower than FP32). INT8 (QInt8) is incompatible with some ONNX runtimes due to `ConvInteger` operator requirements.
+
  ## Attribution

  This is an ONNX format conversion of Meta's MMS forced alignment model, originally distributed via [torchaudio.pipelines.MMS_FA](https://pytorch.org/audio/stable/generated/torchaudio.pipelines.MMS_FA.html).
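
The summary figures quoted in the added README text (4.0x compression, 75% smaller, 38% faster, MAE +0.2ms) can be recomputed from the benchmark table in this commit. The sketch below does only that arithmetic. The `variants` dictionary and the `compare` helper are illustrative names, not part of the repository, and "38% faster" is read here as a 38% reduction in per-line inference time.

```python
# Figures copied from the benchmark table added in this commit.
variants = {
    "FP32": {"size_mb": 1207, "infer_ms": 424, "mae_ms": 34.7},
    "FP16": {"size_mb": 605, "infer_ms": 576, "mae_ms": 34.7},
    "UINT8": {"size_mb": 303, "infer_ms": 262, "mae_ms": 34.9},
}


def compare(name, baseline="FP32"):
    """Return size, latency, and MAE changes of `name` relative to `baseline`."""
    base, var = variants[baseline], variants[name]
    return {
        # How many times smaller the file is than the baseline.
        "compression": round(base["size_mb"] / var["size_mb"], 1),
        # Percentage of the baseline size that was removed.
        "size_reduction_pct": round(100 * (1 - var["size_mb"] / base["size_mb"])),
        # Negative means less time per line than the baseline.
        "infer_time_change_pct": round(100 * (var["infer_ms"] / base["infer_ms"] - 1)),
        # Change in mean absolute error, in milliseconds.
        "mae_delta_ms": round(var["mae_ms"] - base["mae_ms"], 1),
    }


if __name__ == "__main__":
    for name in ("FP16", "UINT8"):
        print(name, compare(name))
```

With these inputs, UINT8 comes out at 4.0x compression, a 75% size reduction, 38% less inference time per line, and +0.2ms MAE. FP16 takes about 36% longer per line than FP32, which is consistent with the "Not recommended" status for CPU inference.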