metadata
language:
  - dv
  - en
  - ar
license: apache-2.0
tags:
  - whisper
  - dhivehi
  - code-switching
  - automatic-speech-recognition
base_model: openai/whisper-small
pipeline_tag: automatic-speech-recognition

Whisper Dhivehi Code-Switching ASR

A fine-tune of openai/whisper-small for Dhivehi speech that code-switches into English and Arabic. The tokenizer is extended with a custom <|dv|> language token.

Usage

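import torch
from transformers import pipeline

# Use the first GPU when available; otherwise fall back to CPU.
device = 0 if torch.cuda.is_available() else -1

asr = pipeline(
    task="automatic-speech-recognition",
    model="Serialtechlab/whisper-dhivehi-code-switch-v2",
    device=device,
    chunk_length_s=10,        # split long audio into 10-second chunks
    stride_length_s=(1, 1),   # 1 second of overlap on each side of a chunk
    generate_kwargs={"num_beams": 3, "repetition_penalty": 1.05},
)

# Transcribe a local audio file and print the recognized text.
result = asr("audio.wav")
print(result["text"])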
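
The chunk_length_s=10 and stride_length_s=(1, 1) settings determine how long audio is split before decoding. The sketch below is a simplified model of that windowing, not the library's actual code: the helper name chunk_windows is hypothetical, and it assumes each chunk advances by the chunk length minus both strides, with the overlapping edges discarded except at the very start and end of the audio.

```python
def chunk_windows(total_s, chunk_length_s=10.0, stride_s=(1.0, 1.0)):
    """Return (start, end, keep_start, keep_end) tuples, all in seconds.

    start/end bound the audio fed to the model for one chunk;
    keep_start/keep_end bound the part whose text is kept after the
    overlapping stride regions are dropped. Hypothetical helper for
    illustration only.
    """
    left, right = stride_s
    step = chunk_length_s - left - right
    if step <= 0:
        raise ValueError("chunk_length_s must exceed the combined stride")

    windows = []
    start = 0.0
    while True:
        end = min(start + chunk_length_s, total_s)
        is_first = start == 0.0
        is_last = start + chunk_length_s >= total_s
        # The first chunk has no left overlap; the last has no right overlap.
        keep_start = start if is_first else start + left
        keep_end = end if is_last else end - right
        windows.append((start, end, keep_start, keep_end))
        if is_last:
            break
        start += step
    return windows


# Windows for a 25-second clip with the settings used in the usage example.
windows = chunk_windows(25.0)
for start, end, keep_start, keep_end in windows:
    print(f"decode {start:>4.1f}-{end:>4.1f}s, keep {keep_start:>4.1f}-{keep_end:>4.1f}s")
```

Under these assumptions the kept regions tile the clip without gaps, which is why a small stride helps avoid words being cut at chunk boundaries. Larger strides give the model more context at each boundary at the cost of decoding more overlapping audio.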

Training data

Fine-tuned on a synthetic code-switched dataset combining:

  • Dhivehi: Serialtechlab/dhivehi-mms-v5-combined, dhivehi-tts-preprocessed, dv-syn-female2-for-tts
  • English/Arabic loan words: google/fleurs (en_us, ar_eg)

Training ran for 20,000 steps starting from the openai/whisper-small checkpoint, with the custom <|dv|> language token added to the tokenizer.