Endpointing Model Benchmark Report

Model: /data/smart-turn-v3.1-gpu.onnx

Generated: 2025-12-03 16:21:25 UTC

Accuracy Results

Total Samples: 31,473

Unique Languages: 🇸🇦 Arabic, 🇧🇩 Bengali, 🇩🇰 Danish, 🇩🇪 German, 🇬🇧 🇺🇸 English, 🇫🇮 Finnish, 🇫🇷 French, 🇮🇳 Hindi, 🇮🇩 Indonesian, 🇮🇹 Italian, 🇯🇵 Japanese, 🇰🇷 Korean, 🇮🇳 Marathi, 🇳🇱 Dutch, 🇳🇴 Norwegian, 🇵🇱 Polish, 🇵🇹 Portuguese, 🇷🇺 Russian, 🇪🇸 Spanish, 🇹🇷 Turkish, 🇺🇦 Ukrainian, 🇻🇳 Vietnamese, 🇨🇳 Chinese

Unique Datasets: chirp3_1, chirp3_2, human_5, human_convcollector_1, liva_1, midcentury_1, mundo_1, orpheus_endfiller_1, orpheus_grammar_1, orpheus_midfiller_1, rime_2

Overall Performance

| Metric | Sample Count | Accuracy (%) | False Positives (%) | False Negatives (%) |
|---|---|---|---|---|
| Overall | 31,473 | 93.98 | 3.21 | 2.81 |
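
The three percentage columns sum to 100 up to rounding (93.98 + 3.21 + 2.81), which suggests that false positives and false negatives are reported as a share of all samples rather than as per-class rates. A minimal sketch of that calculation follows. The function name `summarize` and the label convention (1 = turn complete, 0 = turn incomplete) are illustrative assumptions, not taken from the benchmark code.

```python
from typing import Sequence


def summarize(labels: Sequence[int], preds: Sequence[int]) -> dict:
    """Return accuracy and false-positive/false-negative shares in percent.

    Assumed convention (hypothetical): 1 = turn complete, 0 = turn incomplete.
    All three values use the total sample count as the denominator,
    so they add up to 100.
    """
    if len(labels) != len(preds):
        raise ValueError("labels and preds must have the same length")
    total = len(labels)
    if total == 0:
        raise ValueError("no samples")

    # False positive: predicted complete, but the turn was incomplete.
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    # False negative: predicted incomplete, but the turn was complete.
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    correct = total - fp - fn

    return {
        "samples": total,
        "accuracy": 100.0 * correct / total,
        "false_positives": 100.0 * fp / total,
        "false_negatives": 100.0 * fn / total,
    }


# Toy data for illustration only; these are not benchmark samples.
result = summarize([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

With the toy data, three of five predictions are correct, and there is one error of each kind.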

Performance by Language

| Language | Sample Count | Accuracy (%) | False Positives (%) | False Negatives (%) |
|---|---|---|---|---|
| 🇯🇵 Japanese | 834 | 98.08 | 0.84 | 1.08 |
| 🇰🇷 Korean | 890 | 97.98 | 0.79 | 1.24 |
| 🇹🇷 Turkish | 966 | 97.52 | 1.24 | 1.24 |
| 🇳🇱 Dutch | 1,401 | 96.79 | 1.57 | 1.64 |
| 🇩🇪 German | 1,322 | 96.37 | 2.42 | 1.21 |
| 🇫🇷 French | 1,253 | 96.09 | 1.68 | 2.23 |
| 🇵🇹 Portuguese | 1,398 | 95.99 | 2.15 | 1.86 |
| 🇫🇮 Finnish | 1,010 | 95.74 | 2.18 | 2.08 |
| 🇵🇱 Polish | 976 | 95.59 | 2.56 | 1.84 |
| 🇮🇩 Indonesian | 971 | 95.57 | 2.47 | 1.96 |
| 🇬🇧 🇺🇸 English | 7,722 | 95.55 | 2.55 | 1.90 |
| 🇮🇹 Italian | 782 | 95.52 | 2.81 | 1.66 |
| 🇷🇺 Russian | 1,470 | 94.15 | 2.93 | 2.93 |
| 🇩🇰 Danish | 779 | 94.09 | 3.34 | 2.57 |
| 🇺🇦 Ukrainian | 929 | 93.97 | 2.58 | 3.44 |
| 🇳🇴 Norwegian | 1,014 | 93.69 | 3.25 | 3.06 |
| 🇮🇳 Hindi | 1,295 | 93.36 | 3.17 | 3.47 |
| 🇪🇸 Spanish | 1,791 | 90.95 | 5.75 | 3.29 |
| 🇨🇳 Chinese | 945 | 88.99 | 4.34 | 6.67 |
| 🇸🇦 Arabic | 947 | 88.91 | 6.44 | 4.65 |
| 🇮🇳 Marathi | 774 | 88.24 | 6.20 | 5.56 |
| 🇧🇩 Bengali | 1,000 | 85.10 | 7.10 | 7.80 |
| 🇻🇳 Vietnamese | 1,004 | 81.87 | 9.86 | 8.27 |

Performance by Dataset

| Dataset | Sample Count | Accuracy (%) | False Positives (%) | False Negatives (%) |
|---|---|---|---|---|
| midcentury_1 | 1,044 | 99.52 | 0.10 | 0.38 |
| human_5 | 402 | 98.51 | 0.50 | 1.00 |
| orpheus_endfiller_1 | 182 | 98.35 | 0.00 | 1.65 |
| rime_2 | 396 | 97.98 | 0.25 | 1.77 |
| liva_1 | 3,832 | 95.12 | 2.97 | 1.91 |
| chirp3_1 | 16,300 | 95.10 | 2.58 | 2.32 |
| orpheus_grammar_1 | 163 | 93.87 | 4.29 | 1.84 |
| chirp3_2 | 8,428 | 90.76 | 4.84 | 4.40 |
| human_convcollector_1 | 90 | 88.89 | 6.67 | 4.44 |
| orpheus_midfiller_1 | 140 | 87.86 | 5.00 | 7.14 |
| mundo_1 | 496 | 85.69 | 8.87 | 5.44 |
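
The per-dataset rows can be cross-checked against the overall figure. The sketch below assumes the overall accuracy is the sample-weighted mean of the per-dataset accuracies. That is an assumption about how the report aggregates, though the sample counts do add up to the reported total. Variable names are illustrative.

```python
# (dataset, sample count, accuracy %) copied from the per-dataset table.
rows = [
    ("midcentury_1", 1044, 99.52),
    ("human_5", 402, 98.51),
    ("orpheus_endfiller_1", 182, 98.35),
    ("rime_2", 396, 97.98),
    ("liva_1", 3832, 95.12),
    ("chirp3_1", 16300, 95.10),
    ("orpheus_grammar_1", 163, 93.87),
    ("chirp3_2", 8428, 90.76),
    ("human_convcollector_1", 90, 88.89),
    ("orpheus_midfiller_1", 140, 87.86),
    ("mundo_1", 496, 85.69),
]

# Total number of samples across all datasets.
total_samples = sum(count for _, count, _ in rows)

# Weight each dataset's accuracy by its sample count.
weighted_accuracy = sum(count * acc for _, count, acc in rows) / total_samples

# Share of the benchmark contributed by the two largest datasets.
chirp_share = 100.0 * (16300 + 8428) / total_samples

print(f"samples: {total_samples}")
print(f"weighted accuracy: {weighted_accuracy:.2f}%")
print(f"chirp3_1 + chirp3_2 share: {chirp_share:.1f}%")
```

The weighted mean comes out at about 93.98%, matching the Overall row. Because chirp3_1 and chirp3_2 together make up roughly 79% of the samples, the overall accuracy mostly tracks those two datasets, while small sets such as human_convcollector_1 (90 samples) have little influence on it.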