---
library_name: transformers
license: apache-2.0
base_model: openai/whisper-tiny
tags:
- whisper-event
- generated_from_trainer
datasets:
- asierhv/composite_corpus_eu_v2.1
metrics:
- wer
model-index:
- name: Whisper Tiny Basque
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Mozilla Common Voice 18.0
      type: mozilla-foundation/common_voice_18_0
    metrics:
    - name: Wer
      type: wer
      value: 13.56
language:
- eu
---

# Whisper Tiny Basque

This model is a fine-tuned version of [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) for Automatic Speech Recognition (ASR) in Basque (`eu`). It was trained on the [asierhv/composite_corpus_eu_v2.1](https://huggingface.co/datasets/asierhv/composite_corpus_eu_v2.1) dataset, a composite corpus designed to improve Basque ASR performance.

**Key improvements and results compared to the base model:**

* **Significant WER reduction:** The fine-tuned model achieves a Word Error Rate (WER) of 14.8495 on the validation set of the `asierhv/composite_corpus_eu_v2.1` dataset, demonstrating improved accuracy compared to the base `whisper-tiny` model for Basque.
* **Performance on Common Voice:** When evaluated on the Mozilla Common Voice 18.0 dataset, the model achieved a WER of 13.56. This demonstrates the model's ability to generalize to other Basque speech datasets.

## Model description

This model adapts the Whisper architecture, originally developed by OpenAI, to the specifics of the Basque language. Fine-tuning `whisper-tiny` on a comprehensive Basque speech corpus teaches the model to transcribe spoken Basque accurately. `whisper-tiny` is the smallest of the Whisper checkpoints, offering a favorable trade-off between inference speed and accuracy.
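
The snippet below is a minimal inference sketch using the 🤗 Transformers `pipeline` API. The checkpoint identifier and audio file name are placeholders: substitute the Hub repo where this fine-tuned model is published and a 16 kHz Basque recording of your own.

```python
import torch
from transformers import pipeline

# Placeholder: replace with the Hub repo ID of this fine-tuned checkpoint.
MODEL_ID = "your-username/whisper-tiny-eu"

asr = pipeline(
    "automatic-speech-recognition",
    model=MODEL_ID,
    device=0 if torch.cuda.is_available() else -1,
)

# Force Basque transcription so the multilingual tokenizer does not auto-detect the language.
result = asr(
    "example_eu.wav",  # placeholder audio file
    generate_kwargs={"language": "basque", "task": "transcribe"},
)
print(result["text"])
```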

## Intended uses & limitations

**Intended uses:**

* Automatic transcription of Basque speech.
* Development of Basque speech-based applications.
* Research on Basque speech processing.
* Accessibility tools for Basque speakers.

**Limitations:**

* Performance may vary depending on the quality of the audio input (e.g., background noise, recording quality).
* The model might struggle with highly dialectal or informal speech.
* While the model shows improved performance, it may still produce errors, especially with complex sentences or uncommon words.
* The model is based on the tiny version of Whisper, so accuracy may improve with the larger Whisper variants.

## Training and evaluation data

* **Training dataset:** [asierhv/composite_corpus_eu_v2.1](https://huggingface.co/datasets/asierhv/composite_corpus_eu_v2.1). This dataset is a composite corpus of Basque speech data, designed to improve the performance of Basque ASR systems.
* **Evaluation dataset:** The `test` portion of `asierhv/composite_corpus_eu_v2.1` (a loading sketch follows this list).
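
As an illustration, the corpus can be loaded with the 🤗 Datasets library. The split names and the `audio` column below are assumptions based on the description above; check the dataset card for the exact configuration.

```python
from datasets import Audio, load_dataset

# Split names assumed from the description above; verify against the dataset card.
train_ds = load_dataset("asierhv/composite_corpus_eu_v2.1", split="train")
test_ds = load_dataset("asierhv/composite_corpus_eu_v2.1", split="test")

# Whisper expects 16 kHz input; resample the (assumed) "audio" column on the fly.
train_ds = train_ds.cast_column("audio", Audio(sampling_rate=16_000))
test_ds = test_ds.cast_column("audio", Audio(sampling_rate=16_000))

print(train_ds)
```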

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `Seq2SeqTrainingArguments` sketch follows the list):

* **learning_rate:** 3.75e-05
* **train_batch_size:** 32
* **eval_batch_size:** 16
* **seed:** 42
* **optimizer:** AdamW with betas=(0.9, 0.999) and epsilon=1e-08
* **lr_scheduler_type:** linear
* **lr_scheduler_warmup_steps:** 1000
* **training_steps:** 10000
* **mixed_precision_training:** Native AMP
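
As a rough illustration, the hyperparameters above could be expressed as `Seq2SeqTrainingArguments`. The output directory, evaluation cadence, and other options not listed above are assumptions, not the original training configuration.

```python
from transformers import Seq2SeqTrainingArguments

# Illustrative mapping of the hyperparameters listed above; fields marked "assumed"
# are not part of the documented configuration.
training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-tiny-eu",  # assumed
    learning_rate=3.75e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",             # AdamW with default betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    warmup_steps=1000,
    max_steps=10000,
    fp16=True,                       # native AMP mixed precision
    eval_strategy="steps",           # assumed; the table below reports metrics every 1000 steps
    eval_steps=1000,                 # assumed from the evaluation cadence in the table below
    predict_with_generate=True,      # so WER is computed on generated transcripts
    report_to=["tensorboard"],       # assumed
)
```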

### Training results

| Training Loss | Epoch | Step  | Validation Loss | WER (%)  |
|---------------|-------|-------|-----------------|----------|
| 0.586         | 0.1   | 1000  | 0.6249          | 34.1639  |
| 0.3145        | 0.2   | 2000  | 0.5048          | 25.2591  |
| 0.225         | 0.3   | 3000  | 0.4839          | 22.0557  |
| 0.3003        | 0.4   | 4000  | 0.4540          | 20.3072  |
| 0.132         | 0.5   | 5000  | 0.4574          | 19.0146  |
| 0.1588        | 0.6   | 6000  | 0.4380          | 17.8219  |
| 0.1841        | 0.7   | 7000  | 0.4395          | 16.6667  |
| 0.143         | 0.8   | 8000  | 0.3719          | 15.4490  |
| 0.0967        | 0.9   | 9000  | 0.3685          | 15.1368  |
| 0.1059        | 1.0   | 10000 | 0.3719          | 14.8495  |
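
For reference, a WER figure like those in the table can be computed with the `evaluate` library (which wraps `jiwer`). The strings below are placeholders, and in practice predictions and references are usually normalized (lowercased, punctuation stripped) before scoring.

```python
import evaluate

wer_metric = evaluate.load("wer")

# Placeholder transcripts: replace with model outputs and ground-truth references.
predictions = ["kaixo mundua zer moduz"]
references = ["kaixo mundua zer moduz zaudete"]

# evaluate returns WER as a fraction; multiply by 100 to match the percentage scale above.
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.2f}")
```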

### Framework versions

* Transformers 4.49.0.dev0
* Pytorch 2.6.0+cu124
* Datasets 3.3.1.dev0
* Tokenizers 0.21.0