---
language:
- dv
- en
- ar
license: apache-2.0
tags:
- whisper
- dhivehi
- code-switching
- automatic-speech-recognition
base_model: openai/whisper-small
pipeline_tag: automatic-speech-recognition
---

# Whisper Dhivehi Code-Switching ASR

A fine-tune of `openai/whisper-small` for code-switched Dhivehi speech, where Dhivehi is mixed with English and Arabic.
The tokenizer is extended with a custom `<|dv|>` language token.

## Usage

~~~python
from transformers import pipeline

asr = pipeline(
    task="automatic-speech-recognition",
    model="Serialtechlab/whisper-dhivehi-code-switch-v2",
    device=0,  # first CUDA GPU; use device="cpu" if no GPU is available
    chunk_length_s=10,  # split long audio into 10-second chunks
    stride_length_s=(1, 1),  # 1 second of overlap on each side of a chunk
    generate_kwargs={"num_beams": 3, "repetition_penalty": 1.05},
)

result = asr("audio.wav")  # path to a local audio file
print(result["text"])
~~~
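
To make the `chunk_length_s` and `stride_length_s` settings in the usage example more concrete, here is a minimal sketch of how 10-second chunks with 1-second strides would tile a recording. It assumes a simplified view in which each chunk starts `chunk_length_s - left_stride - right_stride` seconds after the previous one. `chunk_windows` is a hypothetical helper written for illustration; it is not part of `transformers`, which does this bookkeeping on audio samples rather than seconds.

```python
from typing import List, Tuple


def chunk_windows(
    duration_s: float,
    chunk_length_s: float = 10.0,
    stride_s: Tuple[float, float] = (1.0, 1.0),
) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds of overlapping chunks.

    Illustrative approximation only: consecutive chunks advance by the
    chunk length minus the left and right strides.
    """
    left, right = stride_s
    step = chunk_length_s - left - right
    if step <= 0:
        raise ValueError("chunk_length_s must exceed the sum of both strides")

    windows = []
    start = 0.0
    while True:
        # The final chunk is clipped to the end of the recording.
        end = min(start + chunk_length_s, duration_s)
        windows.append((start, end))
        if start + chunk_length_s >= duration_s:
            break
        start += step
    return windows


# A 25-second file is covered by three chunks that overlap by 2 seconds.
print(chunk_windows(25.0))  # [(0.0, 10.0), (8.0, 18.0), (16.0, 25.0)]
```

Under this assumption, larger strides give the model more shared context at chunk boundaries at the cost of more chunks to decode for the same audio.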

## Training data

Fine-tuned on a synthetic code-switched dataset that combines:
- Dhivehi: `Serialtechlab/dhivehi-mms-v5-combined`, `dhivehi-tts-preprocessed`, `dv-syn-female2-for-tts`
- English and Arabic loanwords: `google/fleurs` (`en_us`, `ar_eg`)

Training ran for 20,000 steps starting from the `whisper-small` base checkpoint, with the custom `<|dv|>` language token added.