Commit daa02e7 (verified) by alexwengg · 1 parent: 0c813cd

docs: AISHELL EER 0.48%

Files changed (1)
  1. README.md +9 -1
README.md CHANGED
@@ -31,7 +31,15 @@ waveform → [Preprocessor fp32/CPU] → fbank [1,T,80]
 CAM++ normalizes the fbank internally. The 192-d embedding is used with cosine
 similarity for speaker verification and diarization clustering.
 
-Parity: torch↔CoreML embedding cosine 0.99998 (random) / 0.99999 (real audio via the preprocessor).
+## Benchmark AISHELL-1 speaker verification
+
+| Metric | Value |
+|--------|-------|
+| **EER** | **0.48%** (20 speakers, 6000 same / 6000 diff trials) |
+| same-speaker cosine | 0.805 |
+| different-speaker cosine | 0.256 |
+
+AISHELL-1 (clean read Mandarin) is easier than the official CN-Celeb (~6-7%). CoreML↔torch embedding cosine 0.9997-0.99999.
 
 ## License
 
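
Reviewer note: the EER in the added benchmark table comes from cosine scores over same-speaker and different-speaker trials. The sketch below shows one common way to compute such a number. It is not the script used for this commit, and the names `cosine` and `compute_eer` and the toy scores are hypothetical, chosen only for illustration.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two 1-D embedding vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def compute_eer(same_scores, diff_scores):
    """Return (eer, threshold) at the point where FAR and FRR are closest.

    A trial is accepted when its score is >= threshold.
    """
    same = np.sort(np.asarray(same_scores, dtype=np.float64))
    diff = np.sort(np.asarray(diff_scores, dtype=np.float64))
    # Every observed score is a candidate decision threshold.
    thresholds = np.unique(np.concatenate([same, diff]))
    # FRR: fraction of same-speaker trials scored below the threshold.
    frr = np.searchsorted(same, thresholds, side="left") / same.size
    # FAR: fraction of different-speaker trials scored at or above it.
    far = 1.0 - np.searchsorted(diff, thresholds, side="left") / diff.size
    i = int(np.argmin(np.abs(far - frr)))
    return float((far[i] + frr[i]) / 2.0), float(thresholds[i])


# Toy scores (made up for illustration, not AISHELL-1 results).
toy_same = [0.9, 0.8, 0.7, 0.2]
toy_diff = [0.1, 0.3, 0.25, 0.75]
eer, thr = compute_eer(toy_same, toy_diff)
print(eer, thr)  # 0.25 0.7
```

The same `cosine` helper would also serve for a torch↔CoreML parity check, by comparing the two embeddings produced for the same input. With only a few thousand trials the FAR and FRR curves are step functions, so averaging them at the closest point is an approximation. Other tools interpolate between thresholds and may report a slightly different value.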