---
language:
- ar
- en
tags:
- whisper
- speech-recognition
- arabic
- saudi-arabic
- code-switching
- fine-tuned
base_model: openai/whisper-large-v3-turbo
---

# Whisper Large V3 Turbo - Saudi Arabic + Code-Switching

A fine-tuned version of [openai/whisper-large-v3-turbo](https://huggingface.co/openai/whisper-large-v3-turbo) for transcribing Saudi Arabic dialect speech and Arabic-English code-switched speech.

## Training Data

- Rabe3/SAD22_Cleaned (102k Saudi Arabic samples)
- MohamedRashad/arabic-english-code-switching (12k samples)

## Training Details

- Base model: openai/whisper-large-v3-turbo
- Fine-tuning framework: whisper-finetune
- Epochs: 3
- Learning rate: 1e-5
- Batch size: 16
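
As a rough sanity check on the scale of the run, the figures above imply the data mix and step count computed below. This is a back-of-the-envelope sketch, not a record of the actual run: it assumes the rounded sample counts are exact, that every sample was used for training with no held-out split, and that the batch size of 16 is the effective batch size with no gradient accumulation. None of these assumptions is stated in this card.

```python
import math

# Rounded sample counts from the Training Data section
saudi_samples = 102_000
code_switch_samples = 12_000

# Hyperparameters from the Training Details section
epochs = 3
batch_size = 16  # assumed to be the effective (global) batch size

total_samples = saudi_samples + code_switch_samples
code_switch_share = code_switch_samples / total_samples

# Assumes every sample is used for training and a final partial batch is kept
steps_per_epoch = math.ceil(total_samples / batch_size)
total_steps = steps_per_epoch * epochs

print(f"total samples: {total_samples}")
print(f"code-switching share: {code_switch_share:.1%}")
print(f"steps per epoch: {steps_per_epoch}")
print(f"total optimizer steps: {total_steps}")
```

Under these assumptions, the code-switching set makes up about 10.5% of the training data, and three epochs come to roughly 21k optimizer steps.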

## Usage

```python
import whisper
import torch
from safetensors.torch import load_file
from huggingface_hub import hf_hub_download

# Download the fine-tuned weights from the Hub
path = hf_hub_download(repo_id="Rabe3/saudi_stt", filename="model.safetensors")

# Build the base architecture, then overwrite its weights with the fine-tuned ones
model = whisper.load_model("large-v3-turbo", device="cuda")
state_dict = load_file(path, device="cuda")
model.load_state_dict(state_dict)
model.eval()

# Transcribe
result = model.transcribe(
    "audio.wav",
    language="ar",
    fp16=True,        # half precision; requires a GPU
    beam_size=5,
    best_of=5,        # only used when sampling (temperature > 0)
    temperature=0.0,  # a single value disables temperature fallback
)
print(result["text"])
```
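
The snippet above assumes a CUDA GPU. On a machine without one, the device and `fp16` settings need to change together, since Whisper falls back to FP32 on CPU. A minimal sketch of that choice follows; `pick_runtime` is a hypothetical helper written for illustration, not part of `whisper` or of this repository.

```python
def pick_runtime(cuda_available: bool) -> dict:
    """Return device and precision settings for the usage example above.

    Hypothetical helper for illustration; not part of whisper.
    """
    if cuda_available:
        # Half precision is only useful on a GPU
        return {"device": "cuda", "fp16": True}
    # CPU inference: keep full precision
    return {"device": "cpu", "fp16": False}


# Example: settings for a CPU-only machine
print(pick_runtime(False))
```

In practice, one would pass `torch.cuda.is_available()` as the argument, use the returned `device` for both `whisper.load_model` and `load_file`, and pass the returned `fp16` value to `model.transcribe`.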