FluidInference/campplus-coreml

Audio Classification

speaker-verification

speaker-diarization

alexwengg committed 11 days ago

Commit daa02e7 · verified · 1 parent: 0c813cd

docs: AISHELL EER 0.48%

Files changed (1)

README.md +9 -1

@@ -31,7 +31,15 @@ waveform → [Preprocessor fp32/CPU] → fbank [1,T,80]
 CAM++ normalizes the fbank internally. The 192-d embedding is used with cosine
 similarity for speaker verification and diarization clustering.
-Parity: torch↔CoreML embedding cosine 0.99998 (random) / 0.99999 (real audio via the preprocessor).
+## Benchmark — AISHELL-1 speaker verification
+| Metric | Value |
+|--------|-------|
+| **EER** | **0.48%** (20 speakers, 6000 same / 6000 diff trials) |
+| same-speaker cosine | 0.805 |
+| different-speaker cosine | 0.256 |
+AISHELL-1 (clean read Mandarin) is easier than the official CN-Celeb (~6-7%). CoreML↔torch embedding cosine 0.9997-0.99999.
 ## License
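
Note on the benchmark row: the commit does not say how the EER was computed, so the following is only a minimal sketch of one common way to get an equal error rate from same-speaker and different-speaker cosine scores. The function names `cosine` and `equal_error_rate` are hypothetical and are not taken from this repository.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two embedding vectors (e.g. 192-d)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def equal_error_rate(same_scores, diff_scores):
    """Return (eer, threshold) for the given trial scores.

    A trial is accepted when score >= threshold.
    FRR = fraction of same-speaker trials rejected.
    FAR = fraction of different-speaker trials accepted.
    The EER is taken at the observed score where |FAR - FRR| is smallest.
    """
    same = np.sort(np.asarray(same_scores, dtype=float))
    diff = np.sort(np.asarray(diff_scores, dtype=float))
    thresholds = np.sort(np.concatenate([same, diff]))

    # Count of scores strictly below each threshold, via binary search.
    frr = np.searchsorted(same, thresholds, side="left") / len(same)
    far = 1.0 - np.searchsorted(diff, thresholds, side="left") / len(diff)

    i = int(np.argmin(np.abs(far - frr)))
    return float((far[i] + frr[i]) / 2.0), float(thresholds[i])


if __name__ == "__main__":
    # Toy scores only; these are not the AISHELL-1 trials.
    eer, thr = equal_error_rate([0.9, 0.8, 0.3], [0.1, 0.2, 0.5])
    print(f"EER={eer:.3f} at threshold {thr:.2f}")
```

With 6000 same and 6000 different trials, as in the table, each error counts for about 0.017 percentage points, so an EER of 0.48% corresponds to roughly 29 errors of each type.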