Automatic Speech Recognition
Safetensors
Chinese
whisper
shaomei committed on
Commit
037c4fc
·
verified ·
1 Parent(s): c2cc81a

Update README.md


Updates:
- permitted and out of scope use cases
- corresponding paper

Files changed (1)
  1. README.md +10 -6
README.md CHANGED
@@ -23,23 +23,27 @@ The model was fine-tuned on the **AS-70: A Mandarin stuttered speech dataset** f

  ### Authors of the Dataset & Paper

- Rong Gong, Hongfei Xue, Lezhi Wang, Xin Xu, Qisheng Li, Lei Xie, Hui Bu, Shaomei Wu, Jiaming Zhou, Yong Qin, Binbin Zhang, Jun Du, Jia Bin, Ming Li
+ - Dataset: Rong Gong, Hongfei Xue, Lezhi Wang, Xin Xu, Qisheng Li, Lei Xie, Hui Bu, Shaomei Wu, Jiaming Zhou, Yong Qin, Binbin Zhang, Jun Du, Jia Bin, Ming Li
+ - Dataset paper: Gong, R., Xue, H., Wang, L., Xu, X., Li, Q., Xie, L., Bu, H., Wu, S., Zhou, J., Qin, Y., Zhang, B., Du, J., Bin, J., Li, M. (2024) AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection. Proc. Interspeech 2024, 5098-5102, doi: 10.21437/Interspeech.2024-918
+ - Fine-tuning paper: Jingjin Li, Qisheng Li, Rong Gong, Lezhi Wang, and Shaomei Wu. 2025. Our Collective Voices: The Social and Technical Values of a Grassroots Chinese Stuttered Speech Dataset. In Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency (FAccT '25). Association for Computing Machinery, New York, NY, USA, 2768–2783. https://doi.org/10.1145/3715275.3732179
+

  ## Intended Uses

- - Transcribing Mandarin Chinese audio, particularly for speakers who stutter.
- - Research in speech therapy, clinical linguistics, or accessibility applications.
+ - Transcribing Mandarin Chinese spoken language verbatim, particularly for speakers who stutter.
+ - Research in stuttering affirming speech therapy, clinical linguistics, or accessibility applications.

  ### Out-of-Scope Use

  - Non-Chinese languages or highly noisy audio.
  - Real-time transcription without optimization.
- - Sensitive or legal audio without human verification.
+ - Sensitive or legal audio without human verification.
+ - Other use cases that undermine the dignity and quality of life of people who stutter.

  ## Limitations & Risks

  - Accuracy may drop on fast speech, mixed-language speech, or heavy background noise.
- - Stuttering patterns may still cause transcription errors.
+ - Stuttering is highly variable and heterogenous, certain stuttering patterns may still result in high transcription errors.
  - Not recommended to use as sole source for clinical or legal decisions.

  ## How to Use
@@ -81,4 +85,4 @@ model.to(device)
  ## Citation

  **Paper:**
- Gong, Rong, et al. "As-70: A mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection." arXiv preprint arXiv:2406.07256 (2024).
+ Jingjin Li, Qisheng Li, Rong Gong, Lezhi Wang, and Shaomei Wu. 2025. Our Collective Voices: The Social and Technical Values of a Grassroots Chinese Stuttered Speech Dataset. In Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency (FAccT '25). Association for Computing Machinery, New York, NY, USA, 2768–2783. https://doi.org/10.1145/3715275.3732179