docs: add int8 variant to model card
README.md
CHANGED
|
@@ -32,12 +32,17 @@ audio-event tags. One forward pass yields all output tokens.
|
|
| 32 |
|
| 33 |
## Files (3-stage pipeline)
|
| 34 |
|
| 35 |
-
| File | Precision | Compute unit | Role |
|
| 36 |
-
|------|-----------|--------------|------|
|
| 37 |
-
| `SenseVoicePreprocessor.mlmodelc` | FLOAT32 | CPU | front-end: waveform → 560-d LFR features |
|
| 38 |
-
| `SenseVoiceSmall.mlmodelc` | FLOAT16 | **`CPU_AND_NE` (ANE)** | **
|
| 39 |
-
| `
|
| 40 |
-
| `
|
| 41 |
|
| 42 |
Pipeline: `waveform → [Preprocessor, fp32/CPU] → features → [encoder+CTC, fp16/ANE] → logits → host greedy-CTC decode`.
|
| 43 |
|
|
|
|
| 32 |
|
| 33 |
## Files (3-stage pipeline)
|
| 34 |
|
| 35 |
+
| File | Precision | Compute unit | Size | Role |
|
| 36 |
+
|------|-----------|--------------|------|------|
|
| 37 |
+
| `SenseVoicePreprocessor.mlmodelc` | FLOAT32 | CPU | 3 MB | front-end: waveform → 560-d LFR features |
|
| 38 |
+
| `SenseVoiceSmall.mlmodelc` | FLOAT16 | **`CPU_AND_NE` (ANE)** | 447 MB | **default** encoder+CTC |
|
| 39 |
+
| `SenseVoiceSmall_int8.mlmodelc` | INT8 (weights) | `CPU_AND_NE` (ANE) | 225 MB | ~half size, accuracy-neutral |
|
| 40 |
+
| `SenseVoiceSmall_fp32.mlmodelc` | FLOAT32 | any | 897 MB | encoder fallback (non-ANE) |
|
| 41 |
+
| `vocab.json` | - | - | - | 25055 SentencePiece tokens (array form) |
|
| 42 |
+
|
| 43 |
+
**int8** is post-training weight quantization (`linear_symmetric`), accuracy-neutral
|
| 44 |
+
vs fp16 on the full canonical sets: LibriSpeech WER 3.27→3.22%, AISHELL CER 3.40→3.43%
|
| 45 |
+
(Δ ≤ 0.05 pp, 0 NaN on ANE). Pick it for ~half the on-disk/memory footprint.
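
As a rough illustration of what symmetric linear weight quantization does, the pure-Python sketch below maps floats to int8 with a single scale and no zero point. The function names, the per-tensor scale, and the [-127, 127] range are assumptions made for illustration. The conversion tooling actually used to produce `SenseVoiceSmall_int8.mlmodelc` is not shown in this card and may, for example, use per-channel scales.

```python
def quantize_linear_symmetric(weights, n_bits=8):
    """Quantize floats to signed integers using one shared scale.

    Assumption: per-tensor scale and a symmetric range of [-127, 127].
    """
    q_max = 2 ** (n_bits - 1) - 1                 # 127 for int8
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / q_max if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the symmetric range.
    q = [max(-q_max, min(q_max, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from integers and the scale."""
    return [v * scale for v in q]


# Toy weights, purely hypothetical values.
weights = [0.4, -1.0, 0.25, 0.0]
q, scale = quantize_linear_symmetric(weights)
restored = dequantize(q, scale)

# Worst-case absolute error per weight is about half a quantization step.
max_err = max(abs(r - w) for r, w in zip(restored, weights))
print(q, scale, max_err)
```

Storing one byte per weight instead of two (fp16) is consistent with the roughly halved file size in the table (447 MB vs 225 MB).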
|
| 46 |
|
| 47 |
Pipeline: `waveform → [Preprocessor, fp32/CPU] → features → [encoder+CTC, fp16/ANE] → logits → host greedy-CTC decode`.
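
The final host-side step of the pipeline can be sketched as follows: greedy CTC decoding takes the highest-scoring token per frame, collapses consecutive repeats, and drops blanks. This is a minimal sketch, not the project's decoder. The function name `greedy_ctc_decode`, the blank id of 0, and the toy vocabulary are assumptions; the only detail taken from the card is that `vocab.json` is an array, so a token id can index it directly.

```python
def greedy_ctc_decode(logits, blank_id=0):
    """Greedy CTC decode.

    logits: list of frames, each a list of per-token scores.
    blank_id: index of the CTC blank token (assumed to be 0 here).
    """
    # Best token index for every frame (argmax over the vocabulary axis).
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in logits]

    decoded = []
    prev = None
    for tok in best:
        # Keep a token only when it differs from the previous frame
        # and is not the blank symbol.
        if tok != prev and tok != blank_id:
            decoded.append(tok)
        prev = tok
    return decoded


# Toy example: 3-token vocabulary, 6 frames (hypothetical values).
toy_vocab = ["<blank>", "a", "b"]
logits = [
    [0.9, 0.1, 0.0],  # blank
    [0.1, 0.8, 0.1],  # a
    [0.2, 0.7, 0.1],  # a (repeat, collapsed)
    [0.6, 0.3, 0.1],  # blank (separates the next "a")
    [0.1, 0.8, 0.1],  # a
    [0.1, 0.2, 0.7],  # b
]
ids = greedy_ctc_decode(logits)
text = "".join(toy_vocab[i] for i in ids)
print(ids, text)
```

The blank between the two runs of `a` is what lets CTC emit the same token twice in a row; without it the repeats would collapse into one.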
|
| 48 |
|