rosyvs committed
Commit 00bd527 · verified · 1 Parent(s): 55e20a2

create model card

Files changed (1)
  1. README.md +24 -14
README.md CHANGED
@@ -1,19 +1,25 @@
 ---
 library_name: transformers
-tags: []
+tags:
+- child_speech
+- classroom_speech
+- asr
+medium_model:
+- openai/whisper-medium
+pipeline_tag: automatic-speech-recognition
+base_model:
+- openai/whisper-medium
 ---
 
 # Model Card for Model ID
 
-<!-- Provide a quick summary of what the model is/does. -->
-
+ASR model tuned for child speech on public corpora
 
 
 ## Model Details
-
 ### Model Description
 
-<!-- Provide a longer summary of what this model is. -->
+K-12 school classrooms have proven to be a challenging environment for Automatic Speech Recognition (ASR) systems, both due to background noise and conversation, and differences in linguistic and acoustic properties from adult speech, on which the majority of ASR systems are trained and evaluated. We report on experiments to improve ASR for child speech in the classroom by training and fine-tuning transformer models on public corpora of adult and child speech augmented with classroom background noise. By tuning OpenAI’s Whisper model we achieve a 38% relative reduction in word error rate (WER) to 9.2% on the public MyST dataset of child speech – the lowest yet reported – and a 7% relative reduction to reach 54% WER on a more challenging classroom speech dataset (ISAT). We also introduce a novel beam hypothesis rescoring method that incorporates a speed-aware term to capture prior knowledge of human speaking rates, as well as a Large Language Model, to select among hypotheses. We demonstrate the effectiveness of this technique on both publicly-available datasets and a classroom speech dataset.
 
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
 
@@ -23,15 +29,11 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 - **Model type:** [More Information Needed]
 - **Language(s) (NLP):** [More Information Needed]
 - **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
+- **Finetuned from model [optional]:** openai/whisper-medium
 
 ### Model Sources [optional]
 
-<!-- Provide the basic links for the model. -->
-
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
+- **Paper:** [Automatic Speech Recognition Tuned for Child Speech in the Classroom](https://ieeexplore.ieee.org/document/10447428)
 
 ## Uses
 
@@ -170,11 +172,19 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
 
 ## Citation [optional]
 
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-
+R. Southwell et al., "Automatic Speech Recognition Tuned for Child Speech in the Classroom," ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Korea, Republic of, 2024, pp. 12291-12295, doi: 10.1109/ICASSP48485.2024.10447428.
 **BibTeX:**
 
-[More Information Needed]
+@INPROCEEDINGS{10447428,
+author={Southwell, Rosy and Ward, Wayne and Trinh, Viet Anh and Clevenger, Charis and Clevenger, Clay and Watts, Emily and Reitman, Jason and D’Mello, Sidney and Whitehill, Jacob},
+booktitle={ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
+title={Automatic Speech Recognition Tuned for Child Speech in the Classroom},
+year={2024},
+volume={},
+number={},
+pages={12291-12295},
+keywords={Training;Oral communication;Signal processing;Linguistics;Transformers;Acoustics;Background noise;Automatic Speech Recognition;Child Speech;Language Modeling;Transfer Learning;Transformers},
+doi={10.1109/ICASSP48485.2024.10447428}}
 
 **APA:**
 
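The model description added by this commit quotes relative WER reductions alongside the resulting absolute scores. As a quick sanity check, the baseline WERs those figures imply can be recovered from the stated numbers; the helper below is purely illustrative (not part of the released model or paper code) and assumes the usual definition new = baseline × (1 − relative_reduction):

```python
def implied_baseline_wer(new_wer: float, relative_reduction: float) -> float:
    """Baseline WER implied by new_wer = baseline * (1 - relative_reduction)."""
    return new_wer / (1.0 - relative_reduction)

# MyST: a 38% relative reduction down to 9.2% WER implies a ~14.8% baseline.
myst_baseline = implied_baseline_wer(9.2, 0.38)
# ISAT: a 7% relative reduction down to 54% WER implies a ~58.1% baseline.
isat_baseline = implied_baseline_wer(54.0, 0.07)
print(round(myst_baseline, 1), round(isat_baseline, 1))
```

These implied baselines are arithmetic consequences of the quoted figures only; the paper itself should be consulted for the actual pre-tuning Whisper-medium scores.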