metadata
language:
  - ar
  - en
tags:
  - whisper
  - speech-recognition
  - arabic
  - saudi-arabic
  - code-switching
  - fine-tuned
base_model: openai/whisper-large-v3-turbo

Whisper Large V3 Turbo - Saudi Arabic + Code-Switching

A fine-tuned version of openai/whisper-large-v3-turbo for transcribing Saudi Arabic dialect speech and Arabic-English code-switched speech.

Training Data

  • Rabe3/SAD22_Cleaned (102k Saudi Arabic samples)
  • MohamedRashad/arabic-english-code-switching (12k samples)

Training Details

  • Base model: openai/whisper-large-v3-turbo
  • Fine-tuning framework: whisper-finetune
  • Epochs: 3
  • Learning rate: 1e-5
  • Batch size: 16
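
As a rough sanity check on the schedule these settings imply, the sketch below estimates the number of optimizer steps. It assumes that the rounded sample counts listed under Training Data are all used for training, that the batch size of 16 is the effective batch size (no gradient accumulation or multi-GPU scaling), and that the last partial batch is kept. None of this is confirmed by the card, and the names are illustrative only.

```python
import math

# Rounded figures taken from this card; exact split sizes are not stated.
SAUDI_SAMPLES = 102_000
CODE_SWITCH_SAMPLES = 12_000
BATCH_SIZE = 16   # assumed to be the effective batch size
EPOCHS = 3


def estimate_steps(num_samples, batch_size, epochs):
    """Return (steps per epoch, total steps), keeping the last partial batch."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


total = SAUDI_SAMPLES + CODE_SWITCH_SAMPLES
steps_per_epoch, total_steps = estimate_steps(total, BATCH_SIZE, EPOCHS)
code_switch_share = CODE_SWITCH_SAMPLES / total

print(f"~{steps_per_epoch} steps per epoch, ~{total_steps} steps in total")
print(f"code-switching share of the mix: {code_switch_share:.1%}")
```

Under these assumptions the code-switching set makes up roughly a tenth of the training mix, which may be worth keeping in mind when judging performance on mixed Arabic-English audio.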

Usage

import whisper
import torch
from safetensors.torch import load_file
from huggingface_hub import hf_hub_download

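# Use the GPU when available; fall back to CPU otherwise
device = "cuda" if torch.cuda.is_available() else "cpu"

# Download the fine-tuned weights and load them into the base architecture
path = hf_hub_download(repo_id="Rabe3/saudi_stt", filename="model.safetensors")
model = whisper.load_model("large-v3-turbo", device=device)
state_dict = load_file(path, device=device)
model.load_state_dict(state_dict)
model.eval()

# Transcribe
result = model.transcribe(
    "audio.wav",
    language="ar",
    fp16=(device == "cuda"),  # half precision on GPU, FP32 on CPU
    beam_size=5,              # beam search is used at temperature 0.0
    best_of=5,                # only applies when sampling (temperature > 0)
    temperature=0.0,
)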
print(result["text"])
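
To transcribe several recordings with the same settings, a small helper along these lines may be convenient. This is a minimal sketch, not part of the model repository: `transcribe_folder` is a hypothetical name, and it only assumes an object with a `transcribe(path, **options)` method that returns a dict with a `"text"` key, as the loaded model above does.

```python
from pathlib import Path


def transcribe_folder(model, folder, pattern="*.wav", **decode_options):
    """Transcribe every file in `folder` matching `pattern`.

    Returns a dict mapping file name to stripped transcript text,
    in sorted file-name order. `decode_options` are passed through
    unchanged to `model.transcribe`.
    """
    results = {}
    for audio_path in sorted(Path(folder).glob(pattern)):
        result = model.transcribe(str(audio_path), **decode_options)
        results[audio_path.name] = result["text"].strip()
    return results
```

For example, `transcribe_folder(model, "recordings", language="ar", temperature=0.0)` would return one transcript per `.wav` file, where `recordings` is a placeholder directory name.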