---
language: he
license: apache-2.0
library_name: transformers
tags:
  - whisper
  - audio
  - automatic-speech-recognition
  - hebrew
datasets:
  - ivrit-ai/whisper-training
base_model: openai/whisper-tiny
pipeline_tag: automatic-speech-recognition
---

# whisper-tiny-he

A version of [Whisper Tiny](https://huggingface.co/openai/whisper-tiny) fine-tuned on Hebrew speech for automatic speech recognition.

## Training

- **Base model**: [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny)
- **Dataset**: [ivrit-ai/whisper-training](https://huggingface.co/datasets/ivrit-ai/whisper-training) (~400h Hebrew)
- **Method**: Supervised fine-tuning with `Seq2SeqTrainer`
- **Steps**: 5,000 (streaming, effective batch size 16)
- **Hardware**: Apple M4 (MPS), fp32
- **Final eval WER**: 0.659 (on 200-sample test split)
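For reference, the WER figure above is the word-level edit distance between hypothesis and reference, divided by the number of reference words. In practice this is usually computed with a library such as `jiwer` or the `evaluate` package's `wer` metric; the standalone sketch below just shows the computation itself.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # One-row dynamic-programming edit distance over words.
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, start=1):
            cur = d[j]
            d[j] = min(
                d[j] + 1,               # deletion (skip a reference word)
                d[j - 1] + 1,           # insertion (extra hypothesis word)
                prev + (r != h),        # substitution (free if words match)
            )
            prev = cur
    return d[len(hyp)] / len(ref)

print(wer("a b c d", "a x c"))  # 1 substitution + 1 deletion over 4 words -> 0.5
```

A WER of 0.659 therefore means roughly two word errors for every three reference words, which is in the expected range for the Tiny checkpoint on a morphologically rich language like Hebrew.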

## Usage

```python
from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("amitkot/whisper-tiny-he")
model = WhisperForConditionalGeneration.from_pretrained("amitkot/whisper-tiny-he")

model.generation_config.language = "he"
model.generation_config.task = "transcribe"

# `audio` should be a 16 kHz mono waveform (1-D float array)
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
predicted_ids = model.generate(inputs.input_features)
text = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
```

## Training pipeline

Trained using [whisper-acft-pipeline](https://github.com/amitkot/whisper-acft-pipeline):

```bash
uv run python scripts/finetune.py --config configs/hebrew_tiny_finetune.yaml
```

## See also

- [amitkot/whisper-tiny-he-acft](https://huggingface.co/amitkot/whisper-tiny-he-acft) — ACFT-optimized version of this model for short audio (FUTO Keyboard)