whisper-he-ipa

This model predicts IPA phoneme transcriptions from Hebrew speech.

It was fine-tuned from ivrit-ai/whisper-large-v3-turbo.

Output format

The model outputs an ASCII encoding of Hebrew IPA, in which ASCII-safe symbols stand in for some IPA characters.

After normalization, the Hebrew IPA phoneme inventory is:

abdefhijklmnopstuvwzɡʁʃʒʔˈχ
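
As a sanity check, a normalized transcription can be tested against this inventory. The sketch below is illustrative and not part of the model's code: the names `INVENTORY` and `unknown_symbols` are hypothetical, and treating whitespace as a word separator is an assumption.

```python
# Inventory copied from the line above (includes the stress mark ˈ).
INVENTORY = set("abdefhijklmnopstuvwzɡʁʃʒʔˈχ")


def unknown_symbols(text: str) -> list[str]:
    """Return the sorted symbols in `text` that are outside the inventory.

    Whitespace is ignored (assumed to separate words).
    """
    return sorted({ch for ch in text if not ch.isspace() and ch not in INVENTORY})


# A normalized string should yield an empty list; raw ASCII model output
# typically will not, because symbols such as "S" and "'" are not IPA.
print(unknown_symbols("ʃalˈom ʔolˈam"))
print(unknown_symbols("Sal'om qol'am"))
```

A non-empty result usually means the string has not been normalized yet, or contains a symbol the inventory does not cover.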

Mapping from model output to normalized IPA:

ASCII_TO_IPA = {
    "dZ": "dʒ",
    "tS": "tʃ",
    "S": "ʃ",
    "Z": "ʒ",
    "q": "ʔ",
    "'": "ˈ",
    "r": "ʁ",
    "x": "χ",
    "g": "ɡ",
}

Example:

Input (speech): שלום עולם ("hello world")
Model output: Sal'om qol'am
Normalized IPA: ʃalˈom ʔolˈam
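
The normalization step can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `normalize_ipa` is hypothetical, and matching longer keys first (so that "dZ" and "tS" are consumed before "Z" and "S") is an assumption about the intended behavior.

```python
import re

# Mapping as given in this model card.
ASCII_TO_IPA = {
    "dZ": "dʒ",
    "tS": "tʃ",
    "S": "ʃ",
    "Z": "ʒ",
    "q": "ʔ",
    "'": "ˈ",
    "r": "ʁ",
    "x": "χ",
    "g": "ɡ",
}

# Build one alternation with the longest keys first, so two-character
# keys win over their one-character suffixes in a single pass.
_PATTERN = re.compile(
    "|".join(re.escape(k) for k in sorted(ASCII_TO_IPA, key=len, reverse=True))
)


def normalize_ipa(text: str) -> str:
    """Convert ASCII model output to normalized IPA (hypothetical helper)."""
    return _PATTERN.sub(lambda m: ASCII_TO_IPA[m.group(0)], text)


print(normalize_ipa("Sal'om qol'am"))  # ʃalˈom ʔolˈam
```

Characters without an entry in the mapping (for example plain vowels and spaces) pass through unchanged.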

Citation

@misc{melichov2026renikud,
  title={ReNikud: Audio-Supervised Hebrew Grapheme-to-Phoneme Conversion},
  author={Maxim Melichov and Yakov Kolani and Morris Alper},
  year={2026},
  note={Code and models forthcoming},
}