Automatic Speech Recognition
Safetensors
Chinese
whisper
Committed by kexinf1
Commit db8340d (verified)
1 Parent(s): 2f2d8cc

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -14,7 +14,7 @@ This model is a version of OpenAI's `whisper-large-v2` fine-tuned on the **AImpo
 
 This model is specifically adapted to provide more accurate and authentic transcriptions for Mandarin-speaking PWS.
 Standard Automatic Speech Recognition (ASR) models often exhibit "fluency bias," where they "smoothen" out or delete stuttered speech patterns like repetitions and interjections.
-This model was fine-tuned on **literal transcriptions** that intentionally preserve these disfluencies.
+This model was fine-tuned on literal transcriptions that intentionally preserve these disfluencies.
 
 The primary goal is to create a more inclusive ASR system that recognizes and respects the natural speech patterns of PWS, reducing deletion errors and improving overall accuracy.
 
@@ -29,7 +29,7 @@ This model is intended for transcribing conversational Mandarin Chinese speech f
 
 ### Limitations
 
-* **Language Specificity:** The model is trained exclusively on Mandarin Chinese and is not intended for other languages.
+* **Language Specificity:** The model is fine-tuned exclusively on Mandarin Chinese and is not intended for other languages.
 * **Data Specificity:** Performance is optimized for speech patterns present in the AImpower/MandarinStutteredSpeech dataset. It may not perform as well on other types of atypical speech or in environments with significant background noise.
 * **Variability:** Stuttering is highly variable. While the model shows significant improvements across severity levels, accuracy may still vary between individuals and contexts.
 
@@ -86,7 +86,7 @@ This dataset was created through a community-led, grassroots effort with Stammer
 * **Optimizer:** AdamW
 * **Batch Size:** 16
 * **Fine-tuning Method:** AdaLora
-
+* **GPU:** Four NVIDIA A100 80G GPUs
 -----
 
 ## Evaluation Results
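
The README text in this diff refers to "deletion errors" caused by fluency bias. As an illustration only (this is not taken from the commit or from the model's evaluation code), the sketch below aligns a literal reference transcription with a hypothesis by minimum edit distance and counts character-level substitutions, deletions, and insertions. The function name `count_edit_ops` and the example strings are hypothetical.

```python
def count_edit_ops(reference: str, hypothesis: str) -> dict:
    """Count character-level edit operations in a minimum-edit alignment.

    Hypothetical helper for illustration; not part of the model repository.
    """
    ref, hyp = list(reference), list(hypothesis)
    n, m = len(ref), len(hyp)

    # cost[i][j] = (total_edits, substitutions, deletions, insertions)
    # for aligning ref[:i] with hyp[:j].
    cost = [[(0, 0, 0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = (i, 0, i, 0)  # hypothesis is empty: delete everything
    for j in range(1, m + 1):
        cost[0][j] = (j, 0, 0, j)  # reference is empty: insert everything

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, a = cost[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (t, s, d, a)          # characters match, no edit
            else:
                diagonal = (t + 1, s + 1, d, a)  # substitution
            t, s, d, a = cost[i - 1][j]
            deletion = (t + 1, s, d + 1, a)      # reference char missing in hypothesis
            t, s, d, a = cost[i][j - 1]
            insertion = (t + 1, s, d, a + 1)     # extra char in hypothesis
            # Tuples compare on total edits first, so this keeps a minimum-cost path.
            cost[i][j] = min(diagonal, deletion, insertion)

    total, subs, dels, ins = cost[n][m]
    return {
        "substitutions": subs,
        "deletions": dels,
        "insertions": ins,
        "cer": total / n if n else 0.0,
    }


# Made-up example: a literal transcription with a repeated syllable,
# and a "smoothed" hypothesis that drops the repetitions.
literal_reference = "我我我想喝水"
smoothed_hypothesis = "我想喝水"
print(count_edit_ops(literal_reference, smoothed_hypothesis))
```

In this made-up example the hypothesis omits two repeated characters, so both edits are counted as deletions. This is the kind of error that literal, disfluency-preserving transcriptions are meant to expose.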