Upload README_zh.md
<div align="center">

<img src="./TenSense/image/Baiji_Team.png" alt="Baiji Team Logo" width="1000" height="500"/>

<br/>

# TurnSense

### 🎯 Lightweight · Accurate · Three-Class: Redefining Voice Turn Detection

<br/>

<center><strong>47M parameters | CPU latency ~55 ms | F1 up to 96.35% | Invalid-utterance filtering</strong></center>

<br/>

[GitHub](https://github.com/Baiji-Team/TurnSense)
[Hugging Face](https://huggingface.co/Baiji-Team/TurnSense)
[License](./LICENSE)

</div>

<br/>

**Language**: [English](./README.md) | **中文**

<br/>

> **⭐ If TurnSense helps you, please give us a star!** It helps us keep improving the model and the documentation.

<br/>

## 📖 Table of Contents

- [Why TurnSense](#-why-turnsense)
- [Overview](#-overview)
- [Key Features](#-key-features)
- [Model Size Comparison](#-model-size-comparison)
- [Benchmark Results](#-benchmark-results)
- [Quick Start](#-quick-start)
- [Evaluation](#-evaluation)
- [Citation](#-citation)
- [Questions and Contact](#-questions-and-contact)
- [License](#-license)

<br/>

---

<br/>

## 🏆 Why TurnSense

<div align="center">

| Dimension | TurnSense |
| :---: | :---: |
| 🎯 **Accuracy** | F1 **96.35%** on easyturn_real_test_ZH, the best among the compared models |
| ⚡ **Inference latency** | CPU p50 ≈ **54.65 ms**, fast enough for real-time interaction |
| 📦 **Model size** | Only **47M** parameters; the INT8 version is only **~50 MB** |
| 🧠 **Classification** | The first in the industry to support **complete / incomplete / invalid** three-class detection |
| 🚫 **Invalid-input filtering** | F1 of **94.34%** on invalid utterances, suppressing false triggers from noise |
| 🤗 **Open-source friendly** | FP32 and INT8 ONNX models, ready to use out of the box |

</div>

<br/>

---

<br/>

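## 📌 Overview

**TurnSense** is a **three-class semantic detection model** for human-machine voice interaction. It targets one core question in dialogue systems:

> **While the user is speaking, should the system respond right away or keep waiting?**

Conventional approaches usually make only a binary "has the turn ended?" decision. **TurnSense goes a step further**: it models both semantic completeness and invalid-input detection, which helps a system hand over turns more naturally in complex real-world conditions and **reduces false interruptions, talking over the user, and spurious triggers**.

<div align="center">
<img src="./TenSense/image/TurnSense.svg" alt="TurnSense three-class diagram" width="820"/>
</div>

<br/>

TurnSense assigns each user input to one of three semantic states:

| State | Meaning | Example |
| :---: | :--- | :--- |
| ✅ **complete** | The user has expressed a complete intent; the system can respond | `"帮我查一下明天上海天气。"` ("Check tomorrow's weather in Shanghai for me.") |
| ⏳ **incomplete** | The utterance is unfinished: it is cut off, or more follows after a pause | `"我想问一下那个订单就是昨天……"` ("I'd like to ask about that order, the one from yesterday...") |
| 🔇 **invalid** | The input carries no meaningful content and should not trigger a response | `"…(持续噪声 / 非语义发声)"` (sustained noise / non-verbal vocalization) |

With these three labels, a system can decide not only **whether it is time to respond** but also **whether the input is worth responding to**, which improves interaction naturalness and system stability in voice assistants, real-time calls, and intelligent customer service.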
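
As a rough illustration of how the three labels could drive turn-taking, the sketch below maps each label to an action. The action names (`respond`, `wait`, `ignore`) and the `decide_action` helper are assumptions made for this example; they are not part of the TurnSense code.

```python
from enum import Enum


class TurnLabel(str, Enum):
    """The three labels named in this README."""
    COMPLETE = "complete"
    INCOMPLETE = "incomplete"
    INVALID = "invalid"


def decide_action(label: str) -> str:
    """Map a predicted label to a hypothetical dialogue action."""
    turn = TurnLabel(label)  # raises ValueError for an unknown label
    if turn is TurnLabel.COMPLETE:
        return "respond"  # intent is complete: the system may take the turn
    if turn is TurnLabel.INCOMPLETE:
        return "wait"     # user is mid-utterance: keep listening
    return "ignore"       # noise or non-verbal sound: do not trigger a response


if __name__ == "__main__":
    for label in ("complete", "incomplete", "invalid"):
        print(label, "->", decide_action(label))
```

A binary end-of-turn detector only distinguishes the first two branches; the third branch is what lets a system drop noise instead of answering it.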

<br/>

---

<br/>

## ✨ Key Features

### 🧠 Semantic three-class detection

Models the three states `complete / incomplete / invalid` together, which matches real conversational behavior more closely than binary classification. It is currently **the only open-source model with native invalid-utterance detection**.

### ⚡ Lightweight and fast

Only **47M** parameters (about 50 MB for the INT8 version). On CPU, inference latency is p50 ≈ **54.65 ms** and p90 ≈ **58.00 ms**, so real-time interaction does not require a GPU.

### 🎯 Leading accuracy

On easyturn_real_test_ZH (300 samples), TurnSense reaches **F1 96.35%** (complete) and **F1 96.32%** (incomplete). On semantic_test_ZH (2,000 samples), it reaches **F1 92.30%** (complete) and **F1 91.62%** (incomplete). In the benchmark tables in this README, these are the highest F1 scores among the compared models.

### 🚫 Invalid-input filtering

On the NonverbalVocalization test set, invalid-utterance detection reaches **100%** precision and **90.37%** recall (F1 = 94.34%), suppressing false triggers caused by non-verbal vocalizations and noise.

### ⚖️ More robust turn decisions

For ambiguous semantics, pauses, and colloquial speech, TurnSense balances precision and recall, reducing both premature responses and missed responses.

### 📊 Reproducible evaluation

Ships with a complete evaluation workflow and scripts that support unified metric comparison and performance regression analysis, so experiments can be reproduced.

### 🤗 Open-source friendly, plug and play

A standardized repository layout with FP32 and INT8 ONNX models; going from installation to inference takes only a few minutes.

<br/>

---

<br/>

## 📐 Model Size Comparison

<div align="center">

| Model | Parameters | Three-class | Link |
| :--- | :---: | :---: | :--- |
| TEN-Turn | **7B** (7 billion) | ❌ | [TEN-framework/TEN_Turn_Detection](https://huggingface.co/TEN-framework/TEN_Turn_Detection) |
| Easy-Turn | 850M | ❌ | [ASLP-lab/Easy-Turn](https://huggingface.co/ASLP-lab/Easy-Turn) |
| NAMO-Turn-Detector (ZH) | 66M | ❌ | [videosdk-live/Namo-Turn-Detector-v1-Multilingual](https://huggingface.co/videosdk-live/Namo-Turn-Detector-v1-Multilingual) |
| **⭐ TurnSense** | **47M** | **✅** | [**Baiji-Team/TurnSense**](https://huggingface.co/Baiji-Team/TurnSense) |
| Smart-Turn-v3 | 8M | ❌ | [pipecat-ai/smart-turn-v3](https://huggingface.co/pipecat-ai/smart-turn-v3) |
| FireRedChat-turn-detector | -- | ❌ | [FireRedTeam/FireRedChat-turn-detector](https://huggingface.co/FireRedTeam/FireRedChat-turn-detector) |

</div>

> 💡 With only **47M** parameters, TurnSense provides three-class detection and strikes a strong balance between accuracy and model size.

<br/>

---

<br/>

## 📊 Benchmark Results

> All results below are on open-source Chinese evaluation sets. Latencies are in milliseconds; values marked `(GPU)` were measured on GPU, and unmarked values were measured on **CPU**.

<br/>

### 📋 easyturn_real_test_ZH (300 samples)

> Source: real-data samples from [Easy-Turn-Testset](https://huggingface.co/datasets/ASLP-lab/Easy-Turn-Testset)

| Model | P (complete) | R (complete) | **F1 (complete)** | P (incomplete) | R (incomplete) | **F1 (incomplete)** | p50 latency (ms) | p90 latency (ms) |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Easy-Turn | 97.26% | 94.67% | 95.95% | 94.81% | 97.33% | 96.05% | 183.87 (GPU) | 300.37 (GPU) |
| Smart-Turn-v3 | 64.97% | 76.67% | 70.34% | 71.54% | 58.67% | 64.47% | 36.84 | 39.10 |
| TEN-Turn | **99.25%** | 88.00% | 93.29% | 89.22% | **99.33%** | 94.01% | 17.66 (GPU) | 19.41 (GPU) |
| FireRedChat | 70.65% | 94.67% | 80.91% | 91.92% | 60.67% | 73.09% | 98.30 | 99.42 |
| NAMO-Turn | 81.53% | 85.33% | 83.39% | 84.62% | 80.67% | 82.59% | 3.60 | 83.44 |
| **⭐ TurnSense** | 96.03% | **96.67%** | **🏆 96.35%** | **96.64%** | 96.00% | **🏆 96.32%** | 54.65 | 58.00 |

> **🔍 Key finding:** TurnSense achieves the **highest F1** on both the complete and incomplete classes, and it is the only model that combines a CPU p50 below 60 ms with an F1 above 96%.
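
For readers who want to check the columns, the sketch below shows how per-class precision, recall, and F1 are conventionally computed from gold and predicted labels. It assumes the table uses the standard harmonic-mean F1; the helper names and the toy label lists are made up for this example and do not come from the benchmark.

```python
from typing import Dict, List


def f1_from_pr(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def per_class_metrics(gold: List[str], pred: List[str], label: str) -> Dict[str, float]:
    """Precision, recall, and F1 for one class, treated as the positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)  # true positives
    fp = sum(1 for g, p in pairs if g != label and p == label)  # false positives
    fn = sum(1 for g, p in pairs if g == label and p != label)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1_from_pr(precision, recall)}


if __name__ == "__main__":
    # Toy labels, invented for illustration only.
    gold = ["complete", "complete", "incomplete", "incomplete"]
    pred = ["complete", "incomplete", "incomplete", "incomplete"]
    print(per_class_metrics(gold, pred, "complete"))

    # Cross-check against the TurnSense row above (P = 96.03%, R = 96.67%).
    print(round(100 * f1_from_pr(0.9603, 0.9667), 2))
```

Plugging the TurnSense precision and recall for the complete class into `f1_from_pr` reproduces the 96.35% in the table, and the incomplete-class pair (96.64%, 96.00%) gives 96.32%.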

<br/>

### 📋 semantic_test_ZH (2,000 samples)

> Source: Chinese test set of [KE-Team/SemanticVAD-Dataset](https://huggingface.co/datasets/KE-Team/SemanticVAD-Dataset)

| Model | P (complete) | R (complete) | **F1 (complete)** | P (incomplete) | R (incomplete) | **F1 (incomplete)** | p50 latency (ms) | p90 latency (ms) |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Easy-Turn | 78.14% | 98.30% | 87.07% | 97.64% | 70.30% | 81.74% | 183.87 (GPU) | 300.37 (GPU) |
| Smart-Turn-v3 | 59.25% | 88.10% | 70.85% | 76.80% | 39.40% | 52.08% | 36.84 | 39.10 |
| TEN-Turn | 85.25% | **99.60%** | 91.87% | **99.52%** | 82.70% | 90.33% | 17.66 (GPU) | 19.41 (GPU) |
| FireRedChat | 66.76% | 99.40% | 79.87% | 98.83% | 50.50% | 66.84% | 98.30 | 99.42 |
| NAMO-Turn | 71.48% | 86.70% | 78.36% | 83.10% | 65.40% | 73.20% | 3.60 | 83.44 |
| **⭐ TurnSense** | **88.96%** | 95.90% | **🏆 92.30%** | 95.55% | **88.00%** | **🏆 91.62%** | 54.65 | 58.00 |

> **🔍 Key finding:** On the larger 2,000-sample test set, TurnSense still has the highest F1 on both classes, which supports the model's ability to generalize.

<br/>

### 📋 NonverbalVocalization_invalid (728 samples)

> Source: OpenSLR [Deeply Nonverbal Vocalization Dataset (SLR99)](https://openslr.elda.org/99/)

| Model | P (invalid) | R (invalid) | **F1 (invalid)** |
| :--- | :---: | :---: | :---: |
| **⭐ TurnSense** | **100.00%** | **90.37%** | **🏆 94.34%** |

> **🔍 Key finding:** Currently, only TurnSense supports invalid-utterance detection. A precision of **100%** means no false positives on this test set, which helps keep noise from triggering a system response.

<br/>

---

<br/>

## 🚀 Quick Start

### 1. Installation

```bash
git clone https://github.com/Baiji-Team/TurnSense.git
cd TurnSense

pip install -U numpy onnxruntime torch librosa soundfile pandas scikit-learn huggingface_hub
```

### 2. Get the model weights

The TurnSense weights are published on Hugging Face: [Baiji-Team/TurnSense](https://huggingface.co/Baiji-Team/TurnSense)

| Version | Size | Best for |
| :--- | :--- | :--- |
| FP32 | ~191 MB | Accuracy first |
| INT8 | ~50 MB | Deployment first (recommended) |
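
The two sizes are roughly what the parameter count alone would predict. The back-of-the-envelope sketch below assumes 4 bytes per FP32 parameter and 1 byte per INT8 parameter, and treats "47M" as exactly 47,000,000; the `weight_size_mb` helper is hypothetical.

```python
def weight_size_mb(num_params: int, bytes_per_param: int) -> float:
    """Approximate size of the raw weights in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bytes_per_param / 1e6


NUM_PARAMS = 47_000_000  # "47M" from this README, taken as a round figure

fp32_mb = weight_size_mb(NUM_PARAMS, 4)  # 32-bit float: 4 bytes per parameter
int8_mb = weight_size_mb(NUM_PARAMS, 1)  # 8-bit integer: 1 byte per parameter

print(f"FP32 weights: ~{fp32_mb:.0f} MB")  # FP32 weights: ~188 MB
print(f"INT8 weights: ~{int8_mb:.0f} MB")  # INT8 weights: ~47 MB
```

The published files (~191 MB and ~50 MB) are slightly larger than these estimates, which may reflect a parameter count a little above 47M plus file metadata.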

**Download options:**

**Option 1: automatic download (recommended)**

The inference script has built-in Hugging Face download logic; on first run it fetches and caches the model automatically.

**Option 2: Git LFS**

```bash
git lfs install
git clone https://huggingface.co/Baiji-Team/TurnSense
```

**Option 3: Hugging Face Hub**

```python
from huggingface_hub import snapshot_download

# Download the whole model repository into the local Hugging Face cache
snapshot_download(repo_id="Baiji-Team/TurnSense")
```

### 3. Inference

```bash
python infer.py
```

Sample output:

```
Loading model from Baiji-Team/TurnSense...
Running inference on: "我想问一下那个订单就是昨天..."

Results:
Input: "我想问一下那个订单就是昨天..."
TurnSense Detection Result: "incomplete"
```

<br/>

---

<br/>

## 🧪 Evaluation

### 1) Evaluation workflow

1. Read the test dataset in `.jsonl` format
2. Warm up each model first (default `warmup_iters=20`)
3. Run inference sample by sample, collecting classification and performance metrics
4. Write the summary and per-sample files automatically

Output files:

| File | Description |
| :--- | :--- |
| `report.md` | Summary evaluation report |
| `results.json` | Structured evaluation results |
| `config.json` | Configuration of this evaluation run |
| `per_sample__*.jsonl` | Per-sample predictions |
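
The workflow above can be sketched in a few lines. This is not the repository's `benchmark.py`; it is a minimal stand-in that assumes a `predict` callable (hypothetical) taking one parsed JSONL record and returning a label, and it uses a nearest-rank percentile for p50 and p90, which may differ from the method the real script uses.

```python
import json
import math
import time
from typing import Callable, Dict, List


def percentile(values: List[float], q: float) -> float:
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def run_benchmark(jsonl_lines: List[str],
                  predict: Callable[[Dict], str],
                  warmup_iters: int = 20) -> Dict[str, float]:
    """Warm up, then time each prediction and score it against the gold label."""
    samples = [json.loads(line) for line in jsonl_lines if line.strip()]
    if not samples:
        raise ValueError("no samples to evaluate")

    # Warm-up calls are executed but not timed or scored.
    for i in range(warmup_iters):
        predict(samples[i % len(samples)])

    latencies_ms: List[float] = []
    correct = 0
    for sample in samples:
        start = time.perf_counter()
        predicted = predict(sample)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
        correct += int(predicted == sample["label"])

    return {
        "accuracy": correct / len(samples),
        "p50_ms": percentile(latencies_ms, 50),
        "p90_ms": percentile(latencies_ms, 90),
    }


if __name__ == "__main__":
    # Toy records and a toy predictor, both invented so the sketch runs end to end.
    lines = [
        '{"audio_path": "/001.wav", "text": "a", "label": "complete"}',
        '{"audio_path": "/002.wav", "text": "b", "label": "incomplete"}',
    ]
    print(run_benchmark(lines, predict=lambda s: "complete", warmup_iters=2))
```

Keeping the warm-up outside the timed loop prevents one-time costs such as model loading and cache population from inflating the reported latencies.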

### 2) Data format (JSONL)

Each line is a JSON object with at least the following fields:

| Field | Description |
| :--- | :--- |
| `audio_path` | Path to the audio file |
| `text` | Text content |
| `label` | Label (`complete` / `incomplete` / `invalid`) |

Example:

```jsonl
{"audio_path":"/001.wav","text":"帮我查一下明天上海天气","label":"complete"}
{"audio_path":"/002.wav","text":"我想问一下那个订单就是昨天...","label":"incomplete"}
{"audio_path":"/003.wav","text":"啊…嗯…(持续噪声)","label":"invalid"}
```
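
Before running an evaluation, it can help to check that a dataset file matches this format. The sketch below validates the three required fields and the label set; `validate_record` and `load_jsonl` are hypothetical helpers written for this example, not functions from the repository.

```python
import json
from typing import Dict, List

REQUIRED_FIELDS = ("audio_path", "text", "label")
VALID_LABELS = {"complete", "incomplete", "invalid"}


def validate_record(line: str, line_no: int) -> Dict[str, str]:
    """Parse one JSONL line and check the required fields and label value."""
    record = json.loads(line)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(record, dict):
        raise ValueError(f"line {line_no}: expected a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"line {line_no}: missing fields {missing}")
    if record["label"] not in VALID_LABELS:
        raise ValueError(f"line {line_no}: unknown label {record['label']!r}")
    return record


def load_jsonl(text: str) -> List[Dict[str, str]]:
    """Validate every non-empty line of a JSONL document."""
    return [
        validate_record(line, line_no)
        for line_no, line in enumerate(text.splitlines(), start=1)
        if line.strip()
    ]


if __name__ == "__main__":
    sample = (
        '{"audio_path":"/001.wav","text":"帮我查一下明天上海天气","label":"complete"}\n'
        '{"audio_path":"/002.wav","text":"我想问一下那个订单就是昨天...","label":"incomplete"}\n'
    )
    for record in load_jsonl(sample):
        print(record["audio_path"], record["label"])
```

Extra fields are allowed, since the format only requires that these three be present.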

### 3) Run the evaluation

```bash
python TurnSense/Turn_benchmark/benchmark.py
```

<br/>

---

<br/>

## 📚 Citation

If you use TurnSense in your research or product, please cite:

```bibtex
@misc{turnsense2026,
  author = {Baiji Team},
  title = {TurnSense: A Three-Class Semantic Detection Model for Complete, Incomplete, and Invalid Utterances},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Baiji-Team/TurnSense}},
}
```

<br/>

<br/>

## ❓ Questions and Contact

If you have questions or suggestions for improvement, you can reach us through the following channels:

| Channel | Contact |
| :--- | :--- |
| 📧 Email | [huan.shen@brgroup.com](mailto:huan.shen@brgroup.com) ・ [yingao.wang@brgroup.com](mailto:yingao.wang@brgroup.com) ・ [wei.zou@brgroup.com](mailto:wei.zou@brgroup.com) |
| 💬 WeChat | h2538406363 |
| 👥 WeChat group | Scan the QR code to join the group<br><img src="TenSense/image/wechat.jpg" alt="WeChat group QR code" width="220" /> |
| 🐛 Issues | [GitHub Issues](https://github.com/Baiji-Team/TurnSense/issues) |
| 🔀 PR | [Pull Requests](https://github.com/Baiji-Team/TurnSense/pulls) |

<br/>

## 📄 License

This project is released under the **Apache License 2.0** with additional specific restrictions. See [LICENSE](./LICENSE) for details.

<br/>

---

<div align="center">

**Made with ❤️ by [Baiji Team](https://github.com/Baiji-Team)**

</div>