---
language:
- ar
- en
tags:
- whisper
- speech-recognition
- arabic
- saudi-arabic
- code-switching
- fine-tuned
base_model: openai/whisper-large-v3-turbo
---
# Whisper Large V3 Turbo - Saudi Arabic + Code-Switching
A fine-tuned version of [openai/whisper-large-v3-turbo](https://huggingface.co/openai/whisper-large-v3-turbo) for transcribing Saudi Arabic dialect speech and Arabic-English code-switching.
## Training Data
- Rabe3/SAD22_Cleaned (102k Saudi Arabic samples)
- MohamedRashad/arabic-english-code-switching (12k samples)
## Training Details
- Base model: openai/whisper-large-v3-turbo
- Fine-tuning framework: whisper-finetune
- Epochs: 3
- Learning rate: 1e-5
- Batch size: 16
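As a rough sanity check on these settings, the number of optimizer steps can be estimated from the dataset sizes and the batch size. This is a minimal sketch that assumes the rounded sample counts listed under Training Data are exact, that both datasets were used in full for training, and that 16 is the effective batch size (no gradient accumulation). None of these assumptions is confirmed by the training setup itself.

```python
import math

# Rounded sample counts from the Training Data section (assumed exact here)
saudi_samples = 102_000
code_switching_samples = 12_000

# Hyperparameters from the Training Details section
batch_size = 16  # assumed to be the effective batch size
epochs = 3

total_samples = saudi_samples + code_switching_samples
steps_per_epoch = math.ceil(total_samples / batch_size)
total_steps = steps_per_epoch * epochs

print(f"samples: {total_samples}")
print(f"steps per epoch: {steps_per_epoch}")
print(f"total steps: {total_steps}")
```

Under these assumptions the run comes to roughly 21k steps; the real figure will differ if a validation split was held out or gradients were accumulated.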
## Usage
```python
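import whisper
import torch
from safetensors.torch import load_file
from huggingface_hub import hf_hub_download

# Use the GPU when available; fall back to CPU otherwise
device = "cuda" if torch.cuda.is_available() else "cpu"

# Download the fine-tuned weights and load them into the base architecture
path = hf_hub_download(repo_id="Rabe3/saudi_stt", filename="model.safetensors")
model = whisper.load_model("large-v3-turbo", device=device)
state_dict = load_file(path, device=device)
model.load_state_dict(state_dict)
model.eval()

# Transcribe with beam search; fp16 is only supported on GPU
result = model.transcribe(
    "audio.wav",
    language="ar",
    fp16=(device == "cuda"),
    beam_size=5,
    best_of=5,  # only used when temperature > 0
    temperature=0.0,
)
print(result["text"])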
```
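Besides `"text"`, the dictionary returned by `model.transcribe` also contains a `"segments"` list whose entries carry `start` and `end` times (in seconds) and the segment `text`. The sketch below formats those into timestamped lines. `format_segments` is a hypothetical helper name, and `example_result` is made-up data standing in for a real transcription result.

```python
def format_segments(result):
    """Return one "[start -> end] text" line per segment of a Whisper result."""
    lines = []
    for seg in result.get("segments", []):
        lines.append(
            f"[{seg['start']:6.2f}s -> {seg['end']:6.2f}s] {seg['text'].strip()}"
        )
    return lines


# Made-up stand-in for the dictionary returned by model.transcribe(...)
example_result = {
    "text": " hello world",
    "segments": [
        {"start": 0.0, "end": 1.5, "text": " hello"},
        {"start": 1.5, "end": 3.25, "text": " world"},
    ],
}

for line in format_segments(example_result):
    print(line)
```

With a real model, pass the `result` from the transcription call instead of `example_result`.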