speech31/XLS-R-english-phoneme

@@ -12,81 +12,40 @@ pipeline_tag: automatic-speech-recognition
 ### Model Description
-<!-- Provide a longer summary of what this model is. -->
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
 ### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 ### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
 ### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
 ## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
 ## Training Details
 ### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
 #### Training Hyperparameters

 ### Model Description
+This model is XLS-R-300m fine-tuned on the TIMIT dataset for English phone recognition.
+During preprocessing, the TIMIT transcripts were transliterated into IPA with Epitran.
+- **Developed by:** Eunjung Yeo
+- **Model type:** Fine-tuned CTC model for phone recognition
+- **Language(s) (SLP):** English
+- **Finetuned from model [optional]:** XLS-R-300m
 ### Model Sources [optional]
 ### Direct Use
+Phone recognition: transcribing English speech into IPA phone sequences.
 ### Downstream Use [optional]
+- Analysis of phonetic transcriptions
+- L2 Pronunciation Assessment (Mispronunciation Detection and Diagnosis)
+- Mispronunciation Assessment for pathological speech
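For the pronunciation-assessment uses listed above, one common approach is to align the recognized phone sequence against a canonical (expected) sequence and inspect the mismatches. The following is a minimal sketch of that idea using a plain Levenshtein alignment. It is not the author's method, and the function name `align_phones` and the example phone sequences are hypothetical, chosen only for illustration.

```python
from typing import List, Optional, Tuple

Op = Tuple[str, Optional[str], Optional[str]]


def align_phones(canonical: List[str], predicted: List[str]) -> List[Op]:
    """Align two phone sequences with unit-cost edit distance.

    Returns (label, canonical_phone, predicted_phone) tuples, where label is
    one of "correct", "substitution", "deletion", or "insertion".
    """
    n, m = len(canonical), len(predicted)
    # dist[i][j] = edit distance between canonical[:i] and predicted[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if canonical[i - 1] == predicted[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # deletion (phone missing in prediction)
                dist[i][j - 1] + 1,         # insertion (extra phone in prediction)
            )

    # Backtrace from the bottom-right corner to recover the edit operations.
    ops: List[Op] = []
    i, j = n, m
    while i > 0 or j > 0:
        same = i > 0 and j > 0 and canonical[i - 1] == predicted[j - 1]
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (0 if same else 1):
            label = "correct" if same else "substitution"
            ops.append((label, canonical[i - 1], predicted[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("deletion", canonical[i - 1], None))
            i -= 1
        else:
            ops.append(("insertion", None, predicted[j - 1]))
            j -= 1
    ops.reverse()
    return ops


# Hypothetical example: the word "think" produced with /s/ instead of /θ/.
canonical = ["θ", "ɪ", "ŋ", "k"]
predicted = ["s", "ɪ", "ŋ", "k"]
ops = align_phones(canonical, predicted)
errors = [op for op in ops if op[0] != "correct"]
phone_error_rate = len(errors) / len(canonical)
print(errors, phone_error_rate)
```

The non-"correct" entries give a per-phone diagnosis, and the error count divided by the canonical length gives a simple phone error rate. A real assessment system would likely need weighted costs or phonetic-feature distances, which this sketch does not attempt.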
 ## How to Get Started with the Model
+from transformers import AutoProcessor, AutoModelForCTC
+processor = AutoProcessor.from_pretrained("speech31/XLS-R-english-phoneme")
+model = AutoModelForCTC.from_pretrained("speech31/XLS-R-english-phoneme")
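The lines above only load the processor and the model. Because the model is loaded with `AutoModelForCTC`, its per-frame predictions have to be collapsed into a phone sequence afterwards. The sketch below illustrates greedy CTC decoding in plain Python: take the most likely ID per frame, merge consecutive repeats, and drop the blank symbol. The vocabulary, the blank index, and the frame IDs are hypothetical and are not taken from this model's tokenizer. In practice the processor's own decoding utilities would normally perform this step.

```python
from typing import Dict, List, Sequence


def ctc_greedy_decode(
    frame_ids: Sequence[int],
    id_to_phone: Dict[int, str],
    blank_id: int = 0,
) -> List[str]:
    """Collapse per-frame argmax IDs into a phone sequence (greedy CTC)."""
    phones: List[str] = []
    prev = None
    for idx in frame_ids:
        # Emit a phone only when the ID changes and is not the blank symbol.
        if idx != prev and idx != blank_id:
            phones.append(id_to_phone[idx])
        prev = idx
    return phones


# Hypothetical vocabulary; ID 0 is assumed to be the CTC blank/padding token.
vocab = {0: "<pad>", 1: "h", 2: "ə", 3: "l", 4: "oʊ"}

# Hypothetical per-frame argmax output for the word "hello".
frames = [0, 1, 1, 0, 2, 2, 3, 3, 0, 4, 4, 0]
decoded = ctc_greedy_decode(frames, vocab)
print(decoded)
```

A blank between two identical IDs keeps them as separate phones, which is how CTC represents a genuinely repeated symbol.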
 ## Training Details
 ### Training Data
+TIMIT dataset (available from the LDC catalog: https://catalog.ldc.upenn.edu/LDC93s1)
+#### Preprocessing
 #### Training Hyperparameters