Update README.md
README.md
CHANGED
|
@@ -14,7 +14,7 @@ This model is a version of OpenAI's `whisper-large-v2` fine-tuned on the **AImpo
|
|
| 14 |
|
| 15 |
This model is specifically adapted to provide more accurate and authentic transcriptions for Mandarin-speaking PWS.
|
| 16 |
Standard Automatic Speech Recognition (ASR) models often exhibit "fluency bias," where they smooth out or delete stuttered speech patterns such as repetitions and interjections.
|
| 17 |
-
This model was fine-tuned on
|
| 18 |
|
| 19 |
The primary goal is to create a more inclusive ASR system that recognizes and respects the natural speech patterns of PWS, reducing deletion errors and improving overall accuracy.
|
| 20 |
|
|
@@ -29,7 +29,7 @@ This model is intended for transcribing conversational Mandarin Chinese speech f
|
|
| 29 |
|
| 30 |
### Limitations
|
| 31 |
|
| 32 |
-
* **Language Specificity:** The model is
|
| 33 |
* **Data Specificity:** Performance is optimized for speech patterns present in the AImpower/MandarinStutteredSpeech dataset. It may not perform as well on other types of atypical speech or in environments with significant background noise.
|
| 34 |
* **Variability:** Stuttering is highly variable. While the model shows significant improvements across severity levels, accuracy may still vary between individuals and contexts.
|
| 35 |
|
|
@@ -86,7 +86,7 @@ This dataset was created through a community-led, grassroots effort with Stammer
|
|
| 86 |
* **Optimizer:** AdamW
|
| 87 |
* **Batch Size:** 16
|
| 88 |
* **Fine-tuning Method:** AdaLoRA
|
| 89 |
-
|
| 90 |
-----
|
| 91 |
|
| 92 |
## Evaluation Results
|
|
|
|
| 14 |
|
| 15 |
This model is specifically adapted to provide more accurate and authentic transcriptions for Mandarin-speaking PWS.
|
| 16 |
Standard Automatic Speech Recognition (ASR) models often exhibit "fluency bias," where they smooth out or delete stuttered speech patterns such as repetitions and interjections.
|
| 17 |
+
This model was fine-tuned on literal transcriptions that intentionally preserve these disfluencies.
|
| 18 |
|
| 19 |
The primary goal is to create a more inclusive ASR system that recognizes and respects the natural speech patterns of PWS, reducing deletion errors and improving overall accuracy.
|
| 20 |
|
|
|
|
| 29 |
|
| 30 |
### Limitations
|
| 31 |
|
| 32 |
+
* **Language Specificity:** The model is fine-tuned exclusively on Mandarin Chinese and is not intended for other languages.
|
| 33 |
* **Data Specificity:** Performance is optimized for speech patterns present in the AImpower/MandarinStutteredSpeech dataset. It may not perform as well on other types of atypical speech or in environments with significant background noise.
|
| 34 |
* **Variability:** Stuttering is highly variable. While the model shows significant improvements across severity levels, accuracy may still vary between individuals and contexts.
|
| 35 |
|
|
|
|
| 86 |
* **Optimizer:** AdamW
|
| 87 |
* **Batch Size:** 16
|
| 88 |
* **Fine-tuning Method:** AdaLoRA
|
| 89 |
+
* **GPU:** Four NVIDIA A100 80GB GPUs
|
| 90 |
-----
|
| 91 |
|
| 92 |
## Evaluation Results
|
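The "fluency bias" and "deletion errors" mentioned in the README text above can be made concrete with a small sketch. The function below is not taken from the model's evaluation code; it is a hypothetical helper (`count_edit_ops`) that aligns a hypothesis against a literal reference at the character level and counts substitutions, deletions, and insertions. The example sentence is made up for illustration.

```python
from typing import Dict


def count_edit_ops(reference: str, hypothesis: str) -> Dict[str, int]:
    """Count character-level edit operations needed to turn reference into hypothesis.

    A "deletion" is a reference character missing from the hypothesis, which is
    what happens when an ASR system drops a stuttered repetition.
    """
    n, m = len(reference), len(hypothesis)

    # cost[i][j] = minimum number of edits to turn reference[:i] into hypothesis[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            mismatch = reference[i - 1] != hypothesis[j - 1]
            cost[i][j] = min(
                cost[i - 1][j - 1] + mismatch,  # match or substitution
                cost[i - 1][j] + 1,             # deletion
                cost[i][j - 1] + 1,             # insertion
            )

    # Walk back through the table to attribute each edit to an operation type.
    ops = {"substitutions": 0, "deletions": 0, "insertions": 0}
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            mismatch = reference[i - 1] != hypothesis[j - 1]
            if cost[i][j] == cost[i - 1][j - 1] + mismatch:
                ops["substitutions"] += int(mismatch)
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops["deletions"] += 1
            i -= 1
        else:
            ops["insertions"] += 1
            j -= 1
    return ops


# Illustrative (invented) example: a literal transcript with a repeated
# first syllable versus a "smoothed" hypothesis that drops the repetition.
literal_reference = "我我我想喝水"
smoothed_hypothesis = "我想喝水"
print(count_edit_ops(literal_reference, smoothed_hypothesis))
# {'substitutions': 0, 'deletions': 2, 'insertions': 0}
```

A model that preserves disfluencies would output the repeated characters and score zero deletions against the same literal reference.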