constantinedivis committed
Commit 28866d0 · verified · 1 Parent(s): 0fa573a

constantinedivis/whisper-tiny-rus

Files changed (2)
  1. README.md +22 -33
  2. generation_config.json +1 -11
README.md CHANGED
@@ -4,26 +4,11 @@ license: apache-2.0
  base_model: openai/whisper-tiny
  tags:
  - generated_from_trainer
- datasets:
- - common_voice_17_0
  metrics:
  - wer
  model-index:
  - name: whisper-tiny-rus
-   results:
-   - task:
-       name: Automatic Speech Recognition
-       type: automatic-speech-recognition
-     dataset:
-       name: common_voice_17_0
-       type: common_voice_17_0
-       config: ru
-       split: test
-       args: ru
-     metrics:
-     - name: Wer
-       type: wer
-       value: 39.12070231891154
+   results: []
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -31,11 +16,10 @@ should probably proofread and complete it, then remove this comment. -->
 
  # whisper-tiny-rus
 
- This model is a fine-tuned version of [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) on the common_voice_17_0 dataset.
+ This model is a fine-tuned version of [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.5293
- - Wer Ortho: 44.5430
- - Wer: 39.1207
+ - Loss: 0.4508
+ - Wer: 37.6577
 
  ## Model description
 
@@ -55,25 +39,30 @@ More information needed
 
  The following hyperparameters were used during training:
  - learning_rate: 1e-05
- - train_batch_size: 16
- - eval_batch_size: 16
+ - train_batch_size: 128
+ - eval_batch_size: 128
  - seed: 42
- - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: constant_with_warmup
- - lr_scheduler_warmup_steps: 50
- - training_steps: 500
+ - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_steps: 100
+ - training_steps: 600
  - mixed_precision_training: Native AMP
 
  ### Training results
 
- | Training Loss | Epoch | Step | Validation Loss | Wer Ortho | Wer |
- |:-------------:|:------:|:----:|:---------------:|:---------:|:-------:|
- | 0.301 | 0.3032 | 500 | 0.5293 | 44.5430 | 39.1207 |
+ | Training Loss | Epoch | Step | Validation Loss | Wer |
+ |:-------------:|:------:|:----:|:---------------:|:-------:|
+ | 0.6166 | 0.3984 | 100 | 0.6163 | 45.3709 |
+ | 0.5109 | 0.7968 | 200 | 0.5225 | 41.1251 |
+ | 0.4615 | 1.1952 | 300 | 0.4850 | 39.6391 |
+ | 0.4377 | 1.5936 | 400 | 0.4664 | 38.5069 |
+ | 0.433 | 1.9920 | 500 | 0.4544 | 37.6695 |
+ | 0.4186 | 2.3904 | 600 | 0.4508 | 37.6577 |
 
 
  ### Framework versions
 
- - Transformers 4.48.3
- - Pytorch 2.5.1+cu124
- - Datasets 3.2.0
- - Tokenizers 0.21.0
+ - Transformers 4.49.0
+ - Pytorch 2.1.0+cu118
+ - Datasets 3.4.0
+ - Tokenizers 0.21.1
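
The new hyperparameters switch the scheduler from `constant_with_warmup` to `linear`, with 100 warmup steps out of 600 training steps. The following is a minimal sketch of what that schedule implies for the learning rate, assuming the usual linear-warmup-then-linear-decay shape (rise from 0 to `learning_rate` over the warmup steps, then fall linearly to 0 at the final step). The function names are hypothetical helpers, not part of any library. The second helper estimates optimizer steps per epoch from one row of the new results table.

```python
def linear_warmup_lr(step, peak_lr=1e-05, warmup_steps=100, total_steps=600):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero.

    Defaults are the values listed in the updated model card. The schedule
    shape itself is an assumption about what "linear" means here.
    """
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


def steps_per_epoch(step, epoch):
    """Estimate optimizer steps per epoch from one (Step, Epoch) table row."""
    return step / epoch


if __name__ == "__main__":
    # Learning rate at a few points along the 600-step run.
    for step in (0, 50, 100, 350, 600):
        print(f"step {step:>3}: lr = {linear_warmup_lr(step):.2e}")

    # The first table row pairs step 100 with epoch 0.3984.
    estimate = steps_per_epoch(100, 0.3984)
    print(f"about {estimate:.0f} optimizer steps per epoch")
```

If the reported `train_batch_size` of 128 is also the effective batch size (no gradient accumulation, single device, neither of which the card states), the estimate of roughly 251 steps per epoch corresponds to about 32,000 training examples per epoch.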
generation_config.json CHANGED
@@ -32,16 +32,6 @@
    "bos_token_id": 50257,
    "decoder_start_token_id": 50258,
    "eos_token_id": 50257,
-   "forced_decoder_ids": [
-     [
-       1,
-       null
-     ],
-     [
-       2,
-       50359
-     ]
-   ],
    "is_multilingual": true,
    "lang_to_id": {
      "<|af|>": 50327,
@@ -244,5 +234,5 @@
      "transcribe": 50359,
      "translate": 50358
    },
-   "transformers_version": "4.48.3"
+   "transformers_version": "4.49.0"
  }
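
To see what the removed `forced_decoder_ids` entry encoded, the sketch below maps its token IDs back to task names using the values visible in this diff. It assumes the task mapping lives under a key named `task_to_id` (the key name is not shown in the diff) and that each pair is `[decoder position, token id]`, with `null` meaning no token is forced at that position. `describe_forced_ids` is a hypothetical helper.

```python
import json

# Fragment assembled from values shown in the diff; the real file has many more keys.
# The "task_to_id" key name is an assumption, since the diff only shows its contents.
OLD_FRAGMENT = json.loads("""
{
  "forced_decoder_ids": [[1, null], [2, 50359]],
  "task_to_id": {"transcribe": 50359, "translate": 50358}
}
""")


def describe_forced_ids(config):
    """Return (position, token_id, task_name) for each forced decoder entry.

    task_name is None when the token is unset or is not a known task token.
    """
    id_to_task = {token_id: name for name, token_id in config["task_to_id"].items()}
    described = []
    for position, token_id in config.get("forced_decoder_ids", []):
        described.append((position, token_id, id_to_task.get(token_id)))
    return described


if __name__ == "__main__":
    for position, token_id, task in describe_forced_ids(OLD_FRAGMENT):
        print(position, token_id, task)
```

Under these assumptions, the old config left position 1 unforced and forced the `transcribe` token at position 2. With the entry removed, the task and language would have to come from defaults or be supplied at generation time. Recent Transformers versions accept `language` and `task` arguments to Whisper's `generate`, though this commit does not show how the author intends to set them.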