# whisper-tiny-he
Hebrew fine-tuned Whisper Tiny for automatic speech recognition.
## Training
- Base model: openai/whisper-tiny
- Dataset: ivrit-ai/whisper-training (~400h Hebrew)
- Method: supervised fine-tuning with `Seq2SeqTrainer`
- Steps: 5,000 (streaming, effective batch size 16)
- Hardware: Apple M4 (MPS), fp32
- Final eval WER: 0.659 (65.9%, on a 200-sample test split)
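WER (word error rate) is the word-level edit distance between hypothesis and reference, divided by the reference length, so 0.659 means roughly two of every three reference words are substituted, deleted, or inserted. A minimal sketch of the metric (the function name is illustrative; in practice a library such as `jiwer` is typically used):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Standard dynamic-programming edit distance over words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,        # deletion
                d[i][j - 1] + 1,        # insertion
                d[i - 1][j - 1] + sub,  # match / substitution
            )
    return d[len(ref)][len(hyp)] / len(ref)
```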
## Usage
```python
from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("amitkot/whisper-tiny-he")
model = WhisperForConditionalGeneration.from_pretrained("amitkot/whisper-tiny-he")

# Force Hebrew transcription (instead of language auto-detection or translation)
model.generation_config.language = "he"
model.generation_config.task = "transcribe"
```
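Whisper expects mono float32 audio sampled at 16 kHz. A sketch of normalizing a raw array before handing it to the processor (the helper name and the naive linear resampling are illustrative, not part of this model; use `librosa` or `torchaudio` for proper resampling):

```python
import numpy as np

def to_whisper_input(audio: np.ndarray, sr: int, target_sr: int = 16000) -> np.ndarray:
    """Convert raw audio (possibly int16, stereo, other rate) to mono float32 @ 16 kHz."""
    if audio.dtype == np.int16:               # PCM16 -> floats in [-1, 1]
        audio = audio.astype(np.float32) / 32768.0
    if audio.ndim == 2:                       # stereo -> mono by channel average
        audio = audio.mean(axis=1)
    audio = audio.astype(np.float32)
    if sr != target_sr:                       # naive linear resampling
        n = int(audio.shape[0] * target_sr / sr)
        audio = np.interp(
            np.linspace(0.0, audio.shape[0] - 1, n),
            np.arange(audio.shape[0]),
            audio,
        ).astype(np.float32)
    return audio

# Feeding the prepared array to the model loaded above:
# inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
# ids = model.generate(inputs.input_features)
# text = processor.batch_decode(ids, skip_special_tokens=True)[0]
```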
## Training pipeline
Trained using whisper-acft-pipeline:
```bash
uv run python scripts/finetune.py --config configs/hebrew_tiny_finetune.yaml
```
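The contents of `configs/hebrew_tiny_finetune.yaml` are not shown in this card. A hypothetical sketch of the kind of settings such a config might carry, inferred only from the numbers above (every key name here is an assumption, not the pipeline's real schema):

```yaml
# Hypothetical sketch -- key names are illustrative, not the actual config schema
base_model: openai/whisper-tiny
dataset: ivrit-ai/whisper-training
language: he
task: transcribe
max_steps: 5000
per_device_train_batch_size: 8   # x2 accumulation = effective batch size 16
gradient_accumulation_steps: 2
streaming: true
device: mps
precision: fp32
```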
## See also
- amitkot/whisper-tiny-he-acft — ACFT-optimized version of this model for short audio (FUTO Keyboard)