dungca committed on
Commit 5c2fe2e · verified · 1 Parent(s): c86e069

Update model artifacts from automated training run

README.md CHANGED
@@ -1,95 +1,206 @@
  ---
- language:
- - ja
  tags:
- - automatic-speech-recognition
- - speech
- - whisper
- - peft
  - lora
- - japanese
- library_name: transformers
- base_model: openai/whisper-tiny
- pipeline_tag: automatic-speech-recognition
- datasets:
- - reazon-research/reazonspeech
- metrics:
- - cer
- - loss
- model-index:
- - name: whisper-tiny-ja-lora
-   results:
-   - task:
-       type: automatic-speech-recognition
-       name: Automatic Speech Recognition
-     dataset:
-       name: japanese-asr/ja_asr.reazonspeech_test
-       type: japanese-asr/ja_asr.reazonspeech_test
-       split: test
-     metrics:
-     - type: cer
-       name: Character Error Rate (CER)
-       value: 0.52497
-     - type: loss
-       name: Eval Loss
-       value: 1.17656
  ---

- # Whisper Tiny JA LoRA (ReazonSpeech)

- LoRA adapter fine-tuned from `openai/whisper-tiny` for Japanese ASR.

- ## Model Type

- This repository contains **LoRA adapter weights only**.
- Use it on top of `openai/whisper-tiny`.

- - Base model: `openai/whisper-tiny`
- - Language: Japanese (`ja`)
- - Training method: LoRA (`q_proj`, `v_proj`)
- - Dataset: `reazon-research/reazonspeech` (gated)

- ## Training Setup

- - Epochs (configured): `3`
- - Learning rate: `1e-5`
- - Batch size: `32`
- - LoRA r / alpha / dropout: `16 / 32 / 0.05`
- - Framework: `transformers`, `peft`
- - Runtime: Kaggle GPU P100

- ## Evaluation (Latest W&B Run)

- - `eval/cer`: **0.52497** (52.50%)
- - `eval/loss`: **1.17656**
- - `eval/runtime`: **162.422 s**
- - `eval/samples_per_second`: **12.314**
- - `eval/steps_per_second`: **0.77**
- - `train/global_step`: **3000**
- - `train/epoch`: **1.54719**

- > Note: WER was not logged in this run.

- ## Intended Use

- - Japanese speech-to-text transcription
- - Lightweight adapter training and deployment

- ## Limitations

- - Quality depends on domain/audio condition match with training data
- - Not validated for safety-critical production use
- - Requires accepted access to gated dataset when reproducing training

- ## Load Adapter

- ```python
- from transformers import WhisperForConditionalGeneration, WhisperProcessor
- from peft import PeftModel

- base_model_id = "openai/whisper-tiny"
- adapter_id = "dungca/whisper-tiny-ja-lora"

- processor = WhisperProcessor.from_pretrained(base_model_id)
- base_model = WhisperForConditionalGeneration.from_pretrained(base_model_id)
- model = PeftModel.from_pretrained(base_model, adapter_id)
  ---
+ base_model: openai/whisper-tiny
+ library_name: peft
  tags:
+ - base_model:adapter:openai/whisper-tiny
  - lora
+ - transformers
  ---

+ # Model Card for Model ID
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+ ## Model Details
+
+ ### Model Description
+
+ <!-- Provide a longer summary of what this model is. -->
+
+ - **Developed by:** [More Information Needed]
+ - **Funded by [optional]:** [More Information Needed]
+ - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [More Information Needed]
+ - **Language(s) (NLP):** [More Information Needed]
+ - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [More Information Needed]
+
+ ### Model Sources [optional]
+
+ <!-- Provide the basic links for the model. -->
+
+ - **Repository:** [More Information Needed]
+ - **Paper [optional]:** [More Information Needed]
+ - **Demo [optional]:** [More Information Needed]
+
+ ## Uses
+
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+ ### Direct Use
+
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+
+ [More Information Needed]
+
+ ### Downstream Use [optional]
+
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+
+ [More Information Needed]
+
+ ### Out-of-Scope Use
+
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+
+ [More Information Needed]
+
+ ## Bias, Risks, and Limitations
+
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+ [More Information Needed]
+
+ ### Recommendations
+
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ [More Information Needed]
+
+ ## Training Details
+
+ ### Training Data
+
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ [More Information Needed]
+
+ ### Training Procedure
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+ #### Preprocessing [optional]
+
+ [More Information Needed]
+
+ #### Training Hyperparameters
+
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+ #### Speeds, Sizes, Times [optional]
+
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+ [More Information Needed]
+
+ ## Evaluation
+
+ <!-- This section describes the evaluation protocols and provides the results. -->
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+
+ <!-- This should link to a Dataset Card if possible. -->
+
+ [More Information Needed]
+
+ #### Factors
+
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+ [More Information Needed]
+
+ #### Metrics
+
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+ [More Information Needed]
+
+ ### Results
+
+ [More Information Needed]
+
+ #### Summary
+
+ ## Model Examination [optional]
+
+ <!-- Relevant interpretability work for the model goes here -->
+
+ [More Information Needed]
+
+ ## Environmental Impact
+
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** [More Information Needed]
+ - **Hours used:** [More Information Needed]
+ - **Cloud Provider:** [More Information Needed]
+ - **Compute Region:** [More Information Needed]
+ - **Carbon Emitted:** [More Information Needed]
+
+ ## Technical Specifications [optional]
+
+ ### Model Architecture and Objective
+
+ [More Information Needed]
+
+ ### Compute Infrastructure
+
+ [More Information Needed]
+
+ #### Hardware
+
+ [More Information Needed]
+
+ #### Software
+
+ [More Information Needed]
+
+ ## Citation [optional]
+
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+ **BibTeX:**
+
+ [More Information Needed]
+
+ **APA:**
+
+ [More Information Needed]
+
+ ## Glossary [optional]
+
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+ [More Information Needed]
+
+ ## More Information [optional]
+
+ [More Information Needed]
+
+ ## Model Card Authors [optional]
+
+ [More Information Needed]
+
+ ## Model Card Contact
+
+ [More Information Needed]
+
+ ### Framework versions
+
+ - PEFT 0.18.1
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:399be90d7e41e96d79ec6595ca699f80ea912ad509e8fd4b20b8ac6f1507622a
+ oid sha256:c28b5a190f12421cffc4a96f8e28c478cf2cafac3552d550c526b6bc97619870
  size 1186320
all_results.json CHANGED
@@ -1,13 +1,13 @@
  {
-   "epoch": 1.5471892728210417,
-   "eval_cer": 0.5249730721665935,
-   "eval_loss": 1.176564335823059,
-   "eval_runtime": 162.422,
-   "eval_samples_per_second": 12.314,
-   "eval_steps_per_second": 0.77,
-   "total_flos": 2.4041315192832e+18,
-   "train_loss": 1.2634545720418295,
-   "train_runtime": 7050.8095,
-   "train_samples_per_second": 26.4,
-   "train_steps_per_second": 0.825
+   "epoch": 1.2893243940175347,
+   "eval_cer": 0.5345673594766027,
+   "eval_loss": 1.2403422594070435,
+   "eval_runtime": 156.1144,
+   "eval_samples_per_second": 12.811,
+   "eval_steps_per_second": 0.801,
+   "total_flos": 2.0034345848832e+18,
+   "train_loss": 1.294680369567871,
+   "train_runtime": 5838.1826,
+   "train_samples_per_second": 31.883,
+   "train_steps_per_second": 0.996
  }
eval_results.json CHANGED
@@ -1,8 +1,8 @@
  {
-   "epoch": 1.5471892728210417,
-   "eval_cer": 0.5249730721665935,
-   "eval_loss": 1.176564335823059,
-   "eval_runtime": 162.422,
-   "eval_samples_per_second": 12.314,
-   "eval_steps_per_second": 0.77
+   "epoch": 1.2893243940175347,
+   "eval_cer": 0.5345673594766027,
+   "eval_loss": 1.2403422594070435,
+   "eval_runtime": 156.1144,
+   "eval_samples_per_second": 12.811,
+   "eval_steps_per_second": 0.801
  }
train_results.json CHANGED
@@ -1,8 +1,8 @@
  {
-   "epoch": 1.5471892728210417,
-   "total_flos": 2.4041315192832e+18,
-   "train_loss": 1.2634545720418295,
-   "train_runtime": 7050.8095,
-   "train_samples_per_second": 26.4,
-   "train_steps_per_second": 0.825
+   "epoch": 1.2893243940175347,
+   "total_flos": 2.0034345848832e+18,
+   "train_loss": 1.294680369567871,
+   "train_runtime": 5838.1826,
+   "train_samples_per_second": 31.883,
+   "train_steps_per_second": 0.996
  }