FluidInference/paraformer-large-zh-coreml

@@ -62,9 +62,16 @@ the encoder/decoder pad-masks from the **input tensor's seq dim** (so
 ## Benchmark — AISHELL-1 test (CoreML on ANE)
-| Metric | full-CoreML (ANE) | Official Paraformer-large |
-|--------|-------------------|---------------------------|
-| **CER** | **2.12%** (full test, 7,176 utts) | ~1.95% |
 Reproduces the published Paraformer-large AISHELL-1 number — confirming the
 conversion (front-end + encoder + CIF + decoder) is faithful.

 ## Benchmark — AISHELL-1 test (CoreML on ANE)
+Full test set (7,176 utts), full-CoreML pipeline on M5 Pro ANE:
+| Precision | size (enc+dec) | CER | median RTFx | peak RAM |
+|-----------|----------------|-----|-------------|----------|
+| fp16 (default) | 411 MB | **2.12%** | 85× | 0.38 GB |
+| int8 | 207 MB | **2.12%** | 84× | 0.24 GB |
+Official Paraformer-large AISHELL-1 ≈ 1.95% CER (the ~0.17 pp gap comes from fp16
+and the fixed-shape decoder padding). int8 weight quantization is accuracy-neutral
+(CER unchanged) at about half the model size (207 vs 411 MB) and ~37% lower peak
+RAM (0.24 vs 0.38 GB).
 Reproduces the published Paraformer-large AISHELL-1 number — confirming the
 conversion (front-end + encoder + CIF + decoder) is faithful.
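
For readers who want to check the table's metrics, here is a minimal sketch of how CER and median RTFx are commonly computed. The README does not say how its numbers were produced, so this assumes corpus-level CER (total character edits divided by total reference characters) and RTFx defined as audio duration divided by processing time. The function names `edit_distance`, `corpus_cer`, and `median_rtfx` are hypothetical and not part of this repository.

```python
from statistics import median


def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def corpus_cer(pairs):
    """Assumed definition: total edits / total reference characters."""
    errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    chars = sum(len(ref) for ref, _ in pairs)
    return errors / chars


def median_rtfx(timings):
    """Assumed definition: median over utterances of audio_seconds / processing_seconds."""
    return median(audio / proc for audio, proc in timings)


# Toy data for illustration only; these are not AISHELL-1 results.
pairs = [("今天天气很好", "今天天汽很好"), ("你好", "你好")]
timings = [(10.0, 0.125), (6.0, 0.0625), (8.0, 0.125)]
print(f"CER = {corpus_cer(pairs):.2%}, median RTFx = {median_rtfx(timings):.0f}x")
```

Under these definitions, a higher RTFx means faster-than-real-time decoding, and taking the median keeps a few very short or very long utterances from dominating the figure.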