whisper-tiny-he

Whisper Tiny fine-tuned on Hebrew for automatic speech recognition.

Training

  • Base model: openai/whisper-tiny
  • Dataset: ivrit-ai/whisper-training (~400h Hebrew)
  • Method: Supervised fine-tuning with Seq2SeqTrainer (see the sketch after this list)
  • Steps: 5,000 (streaming data loading, effective batch size 16)
  • Hardware: Apple M4 (MPS), fp32
  • Model size: 37.8M parameters (F32 safetensors)
  • Final eval WER: 0.659 (on a 200-sample held-out test split)
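
The bullets above correspond to a fairly standard Seq2SeqTrainer fine-tune. The following is a minimal sketch of that setup, not this project's actual training script: the transcript column name, per-device batch size, and learning rate are assumptions, and the data collator follows the padding pattern commonly used for Whisper fine-tuning.

from dataclasses import dataclass

from datasets import Audio, load_dataset
from transformers import (
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
    WhisperForConditionalGeneration,
    WhisperProcessor,
)

processor = WhisperProcessor.from_pretrained(
    "openai/whisper-tiny", language="he", task="transcribe"
)
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny")

# Stream the dataset so the ~400h of audio is never fully materialized on disk.
train_ds = load_dataset("ivrit-ai/whisper-training", split="train", streaming=True)
train_ds = train_ds.cast_column("audio", Audio(sampling_rate=16_000))

def prepare(example):
    # Log-mel input features from the waveform, token ids from the transcript.
    audio = example["audio"]
    example["input_features"] = processor(
        audio["array"], sampling_rate=audio["sampling_rate"]
    ).input_features[0]
    example["labels"] = processor.tokenizer(example["text"]).input_ids  # "text" column is an assumption
    return example

train_ds = train_ds.map(prepare)

@dataclass
class DataCollatorSpeechSeq2SeqWithPadding:
    processor: WhisperProcessor

    def __call__(self, features):
        # Pad audio features and label ids separately; replace label padding
        # with -100 so it is ignored by the loss.
        batch = self.processor.feature_extractor.pad(
            [{"input_features": f["input_features"]} for f in features],
            return_tensors="pt",
        )
        labels = self.processor.tokenizer.pad(
            [{"input_ids": f["labels"]} for f in features], return_tensors="pt"
        )
        batch["labels"] = labels["input_ids"].masked_fill(
            labels["attention_mask"].ne(1), -100
        )
        return batch

args = Seq2SeqTrainingArguments(
    output_dir="whisper-tiny-he",
    max_steps=5_000,                 # as reported above; required for a streaming dataset
    per_device_train_batch_size=16,  # assumption: one way to reach the effective batch size of 16
    learning_rate=1e-5,              # assumption: a common Whisper fine-tuning value
    # fp32 is the Trainer default; MPS is picked up automatically on Apple silicon.
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    data_collator=DataCollatorSpeechSeq2SeqWithPadding(processor),
)
trainer.train()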

Usage

from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("amitkot/whisper-tiny-he")
model = WhisperForConditionalGeneration.from_pretrained("amitkot/whisper-tiny-he")

# Pin decoding to Hebrew transcription so generate() does not auto-detect
# the language or fall back to translation.
model.generation_config.language = "he"
model.generation_config.task = "transcribe"
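
With the language and task pinned, transcription is a single generate call. A minimal sketch, assuming a mono audio file loaded via librosa (the file path is a placeholder):

import librosa

# Whisper expects 16 kHz mono input.
speech, sr = librosa.load("recording.wav", sr=16000)

inputs = processor(speech, sampling_rate=sr, return_tensors="pt")
predicted_ids = model.generate(inputs.input_features)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])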

Training pipeline

Trained using whisper-acft-pipeline:

uv run python scripts/finetune.py --config configs/hebrew_tiny_finetune.yaml
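
The eval WER reported above (0.659) is the word-level edit distance between reference and predicted transcripts, divided by the number of reference words, averaged over the 200-sample split. The metric itself can be reproduced with the evaluate library; a sketch with placeholder strings, not the actual eval data:

import evaluate

wer = evaluate.load("wer")

references = ["שלום עולם"]    # ground-truth transcript (placeholder)
predictions = ["שלום לעולם"]  # model output (placeholder)

# One of two reference words is wrong, so this prints 0.5.
print(wer.compute(references=references, predictions=predictions))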
