Add quantization benchmark table to README
README.md
CHANGED
@@ -45,6 +45,20 @@ tokenizer_path = hf_hub_download("xycld/lyric-align-mms-fa", "tokenizer.json")
- **Output:** log-probability emission matrix [num_frames, 29 labels] at 50fps (20ms/frame)
- **ONNX opset:** 18
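As a rough illustration of the output layout described in the bullets above, a row index in the emission matrix maps to a time offset through the 20 ms frame duration. The helper names below are hypothetical and are not part of the model or any library API.

```python
# Minimal sketch: convert between emission-matrix frame indices and time.
# FRAME_RATE_HZ comes from the README (50 fps); the function names are
# illustrative assumptions, not an existing API.
FRAME_RATE_HZ = 50
FRAME_MS = 1000 // FRAME_RATE_HZ  # 20 ms per frame


def frame_to_ms(frame_index: int) -> int:
    """Start time of the given frame, in milliseconds."""
    return frame_index * FRAME_MS


def ms_to_frame(time_ms: float) -> int:
    """Index of the frame that contains the given time."""
    return int(time_ms // FRAME_MS)
```

For example, `frame_to_ms(50)` is the start of the second that follows the first 50 frames, and `ms_to_frame` floors a timestamp to the frame that covers it.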
+## Quantized Variants
+
+| Variant | Size | Compression | Load Time | Inference | MAE | Acc @50ms | Acc @100ms | Acc @200ms | Status |
+|:--------|-----:|:-----------:|----------:|----------:|----:|----------:|-----------:|-----------:|:-------|
+| **FP32** (original) | 1,207 MB | 1.0x | 989ms | 424ms/line | 34.7ms | 86.2% | 97.5% | 99.4% | Available |
+| **FP16** | 605 MB | 2.0x | 2,335ms | 576ms/line | 34.7ms | 86.2% | 97.5% | 99.4% | Not recommended |
+| **UINT8** | 303 MB | 4.0x | 412ms | 262ms/line | 34.9ms | 86.2% | 97.5% | 99.4% | **Recommended** |
+
+> Benchmark: Chinese song "错位时空" (362 characters, 53 lines) on CPU.
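The README does not define how the MAE and Acc @N ms columns are computed. One plausible reading, sketched here as an assumption, is the mean absolute difference between predicted and reference timestamps, and the fraction of predictions that fall within each tolerance. The function names and the timestamps are made up for illustration and are not the benchmark data.

```python
# Sketch of alignment metrics under an assumed definition (not confirmed
# by the README): per-item absolute timing error against a reference.
def mean_absolute_error(pred_ms, ref_ms):
    """Mean of |predicted - reference| over all aligned items, in ms."""
    errors = [abs(p - r) for p, r in zip(pred_ms, ref_ms)]
    return sum(errors) / len(errors)


def accuracy_within(pred_ms, ref_ms, tolerance_ms):
    """Fraction of items whose absolute error is at most tolerance_ms."""
    hits = sum(1 for p, r in zip(pred_ms, ref_ms) if abs(p - r) <= tolerance_ms)
    return hits / len(pred_ms)


# Hypothetical timestamps, for illustration only.
pred = [100, 480, 1010, 1620]
ref = [120, 500, 1000, 1500]

mae = mean_absolute_error(pred, ref)
acc_50 = accuracy_within(pred, ref, 50)
acc_200 = accuracy_within(pred, ref, 200)
```

A single large miss (120 ms in the last item) raises the MAE noticeably while leaving the 200 ms accuracy untouched, which is why the table reports both kinds of metric.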
+
+**UINT8 is the recommended variant**: it is 75% smaller than FP32 and takes 38% less inference time per line, with virtually no accuracy loss (MAE +0.2ms).
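The headline percentages can be checked directly against the FP32 and UINT8 rows of the table. The sketch below only redoes that arithmetic; the dictionary layout and variable names are illustrative.

```python
# Recompute the summary figures from the table's FP32 and UINT8 rows.
fp32 = {"size_mb": 1207, "infer_ms": 424, "mae_ms": 34.7}
uint8 = {"size_mb": 303, "infer_ms": 262, "mae_ms": 34.9}

size_reduction = 1 - uint8["size_mb"] / fp32["size_mb"]    # fraction of size saved
time_reduction = 1 - uint8["infer_ms"] / fp32["infer_ms"]  # fraction of time saved
speedup = fp32["infer_ms"] / uint8["infer_ms"]             # throughput ratio
mae_delta = uint8["mae_ms"] - fp32["mae_ms"]               # change in MAE, ms

print(f"size: {size_reduction:.1%} smaller")
print(f"time: {time_reduction:.1%} less per line ({speedup:.2f}x speedup)")
print(f"MAE:  {mae_delta:+.1f} ms")
```

Note that a 38% reduction in per-line time corresponds to roughly a 1.6x speedup, so the two ways of stating the gain should not be mixed up.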
+
+FP16 is not recommended for CPU inference: without native FP16 support on the CPU, it loads and runs slower than FP32 in this benchmark. INT8 (QInt8) is incompatible with some ONNX runtimes because it relies on the `ConvInteger` operator.
+
## Attribution

This is an ONNX format conversion of Meta's MMS forced alignment model, originally distributed via [torchaudio.pipelines.MMS_FA](https://pytorch.org/audio/stable/generated/torchaudio.pipelines.MMS_FA.html).