ikrysinska committed on
Commit 86abc1e · 1 Parent(s): 9092183

Update README.md

Files changed (1)
  1. README.md +43 -11
README.md CHANGED
@@ -13,6 +13,19 @@ pipeline_tag: automatic-speech-recognition
13
  tags:
14
  - phoneme_recognition
15
  - IPA
16
  ---
17
  # Model Card for MultiBridge/wav2vec-LnNor-IPA-ft
18
 
@@ -46,11 +59,6 @@ This model is built for phoneme recognition tasks. It was developed by fine-tuni
46
  - Speech processing applications: Serving as a component in speech processing pipelines or prototyping.
47
 
48
 
49
- ### Out-of-Scope Use
50
-
51
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
52
-
53
- [More Information Needed]
54
 
55
  ## Bias, Risks, and Limitations
56
 
@@ -72,7 +80,32 @@ Evaluate the model's performance for your specific use case.
72
 
73
  Use the code below to get started with the model.
74
 
75
- [More Information Needed]
76
 
77
  ## Training Details
78
 
@@ -105,18 +138,17 @@ The training dataset was filtered. Recordings shorter than 2 seconds or longer t
105
  - optimizer: AdamW
106
  - batch size: 64
107
  - weight decay: 0.001
108
- - epochs: 50
109
 
110
  #### Speeds, Sizes, Times [optional]
111
 
112
  Avg epoch training time: 650s
113
- Number of updates: 36050
114
- Final training loss:
115
- Final validation loss:
116
 
117
  <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
118
 
119
- [More Information Needed]
120
 
121
  ## Evaluation
122
 
 
13
  tags:
14
  - phoneme_recognition
15
  - IPA
16
+ model-index:
17
+ - name: MultiBridge/wav2vec-LnNor-IPA-ft
18
+   results:
19
+   - task:
20
+       type: phoneme-recognition # Required. Example: automatic-speech-recognition
21
+       name: Phoneme Recognition # Optional. Example: Speech Recognition
22
+     dataset:
23
+       type: speech31/timit_english_ipa # Required. Example: common_voice. Use dataset id from https://hf.co/datasets
24
+       name: TIMIT # Required. A pretty name for the dataset. Example: Common Voice (French)
25
+     metrics:
26
+     - type: cer # Required. Example: wer. Use metric id from https://hf.co/metrics
27
+       value: 0.0416 # Required. Example: 20.90
28
+       name: CER # Optional. Example: Test WER
29
  ---
30
  # Model Card for MultiBridge/wav2vec-LnNor-IPA-ft
31
 
 
59
  - Speech processing applications: Serving as a component in speech processing pipelines or prototyping.
60
 
61
 
62
 
63
  ## Bias, Risks, and Limitations
64
 
 
80
 
81
  Use the code below to get started with the model.
82
 
83
+ ```python
84
+ from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC
85
+ from datasets import load_dataset
86
+ import torch
87
+
88
+ # load model and processor
89
+ processor = Wav2Vec2Processor.from_pretrained("MultiBridge/wav2vec-LnNor-IPA-ft")
90
+ model = Wav2Vec2ForCTC.from_pretrained("MultiBridge/wav2vec-LnNor-IPA-ft")
91
+
92
+ # load dummy dataset and read soundfiles
93
+ ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", split="validation")
94
+
95
+ # tokenize
96
+ input_values = processor(ds[0]["audio"]["array"], return_tensors="pt").input_values
97
+
98
+ # retrieve logits
99
+ with torch.no_grad():
100
+     logits = model(input_values).logits
101
+
102
+ # take argmax and decode
103
+ predicted_ids = torch.argmax(logits, dim=-1)
104
+ transcription = processor.batch_decode(predicted_ids)
105
+
106
+ # => should give ['mɪstɝkwɪltɝɪzðəəpɑslʌvðəmɪdəlklæsəzændwiɑəɡlædtəwɛlkəmhɪzɡɑspəl'] for MISTER QUILTER IS THE APOSTLE OF THE MIDDLE CLASSES AND WE ARE GLAD TO WELCOME HIS GOSPEL
107
+
108
+ ```
109
 
110
  ## Training Details
111
 
 
138
  - optimizer: AdamW
139
  - batch size: 64
140
  - weight decay: 0.001
141
+ - epochs: 40
142
 
143
  #### Speeds, Sizes, Times [optional]
144
 
145
  Avg epoch training time: 650s
146
+ Number of updates: 28840
147
+ Final training loss: 0.09713
148
+ Final validation loss: 0.2142
149
 
150
  <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
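
The figures in this section can be cross-checked with a little arithmetic. The sketch below uses only the values stated in the card (40 epochs, 28840 updates, batch size 64, 650 s per epoch). It assumes one optimizer update per batch with no gradient accumulation, which the card does not confirm, so the per-epoch sample count is an upper bound rather than the actual dataset size.

```python
# Values copied from the hyperparameter and timing sections of the card.
EPOCHS = 40
TOTAL_UPDATES = 28_840
BATCH_SIZE = 64
AVG_EPOCH_SECONDS = 650

updates_per_epoch = TOTAL_UPDATES // EPOCHS

# Assumption: one optimizer update per batch (no gradient accumulation).
# The last batch of an epoch may be partial, so this is an upper bound.
max_samples_per_epoch = updates_per_epoch * BATCH_SIZE

total_hours = EPOCHS * AVG_EPOCH_SECONDS / 3600

print(updates_per_epoch)      # 721
print(max_samples_per_epoch)  # 46144
print(round(total_hours, 2))  # 7.22
```

The values this commit replaced (50 epochs, 36050 updates) give the same 721 updates per epoch, so the new update count is consistent with the shorter schedule.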
151
 
 
152
 
153
  ## Evaluation
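
For readers unfamiliar with the CER metric reported in the card metadata, here is a minimal sketch of how a character error rate is commonly defined: the total edit distance divided by the total reference length. This is a generic illustration and not necessarily the evaluation script used for this model. The function names and the sample strings are hypothetical.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level CER: total edit distance over total reference length."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars


# Hypothetical reference/prediction pair, not taken from TIMIT.
print(cer(["mɪstɝ"], ["mɪstə"]))  # 0.2
```

This sketch counts Unicode code points, so IPA symbols written with combining diacritics would count as more than one character. A real evaluation may tokenize phonemes differently.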
154