---
language:
- ar
- en
tags:
- whisper
- speech-recognition
- arabic
- saudi-arabic
- code-switching
- fine-tuned
base_model: openai/whisper-large-v3-turbo
---

# Whisper Large V3 Turbo - Saudi Arabic + Code-Switching

A fine-tuned version of [openai/whisper-large-v3-turbo](https://huggingface.co/openai/whisper-large-v3-turbo) for transcribing Saudi Arabic dialect speech and Arabic-English code-switched speech.

## Training Data
- `Rabe3/SAD22_Cleaned` (102k Saudi Arabic samples)
- `MohamedRashad/arabic-english-code-switching` (12k samples)

## Training Details
- Base model: `openai/whisper-large-v3-turbo`
- Fine-tuning framework: whisper-finetune
- Epochs: 3
- Learning rate: 1e-5
- Batch size: 16
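
As a rough sanity check on the numbers above, the sketch below estimates the data mix and the number of optimizer steps. It assumes the rounded sample counts from the Training Data section, that every sample is used for training, and that each batch is one optimizer step (no gradient accumulation, single device); none of these assumptions is confirmed by the model card.

```python
import math

# Approximate sample counts from the Training Data section.
saudi_samples = 102_000
code_switch_samples = 12_000

# Hyperparameters from the Training Details section.
epochs = 3
batch_size = 16

total_samples = saudi_samples + code_switch_samples
code_switch_share = code_switch_samples / total_samples

# Assumption: one optimizer step per batch (no gradient accumulation).
steps_per_epoch = math.ceil(total_samples / batch_size)
total_steps = steps_per_epoch * epochs

print(f"total samples: {total_samples}")
print(f"code-switching share: {code_switch_share:.1%}")
print(f"steps per epoch: {steps_per_epoch}, total steps: {total_steps}")
```

Under these assumptions, code-switched audio makes up roughly a tenth of the training mix, which is worth keeping in mind when evaluating on heavily mixed Arabic-English speech.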

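## Usage
The snippet below uses the `openai-whisper` package together with `safetensors` and `huggingface_hub`.

```python
import torch
import whisper
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

# Use the GPU when one is available; fall back to CPU otherwise.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Download the fine-tuned weights and load them into the base architecture.
path = hf_hub_download(repo_id="Rabe3/saudi_stt", filename="model.safetensors")
model = whisper.load_model("large-v3-turbo", device=device)
state_dict = load_file(path, device=device)
model.load_state_dict(state_dict)
model.eval()

# Transcribe
result = model.transcribe(
    "audio.wav",
    language="ar",
    fp16=(device == "cuda"),  # Whisper falls back to FP32 on CPU
    beam_size=5,
    best_of=5,  # only takes effect when temperature > 0
    temperature=0.0,
)
print(result["text"])
```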
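
Besides `result["text"]`, the dictionary returned by `transcribe` also carries a `"segments"` list whose entries include `start`, `end`, and `text`. The following is a minimal sketch of turning those segments into timestamped lines. The helper names `format_timestamp` and `format_segments` are hypothetical, and `example_result` is a made-up stand-in for real model output, not a transcription produced by this model.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def format_segments(result: dict) -> list[str]:
    """Build one '[start -> end] text' line per segment in a transcribe() result."""
    lines = []
    for seg in result.get("segments", []):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} -> {end}] {seg['text'].strip()}")
    return lines


# Made-up stand-in for the dictionary returned by model.transcribe(...).
example_result = {
    "text": " hello world",
    "segments": [
        {"start": 0.0, "end": 2.5, "text": " hello"},
        {"start": 2.5, "end": 4.0, "text": " world"},
    ],
}

for line in format_segments(example_result):
    print(line)
```

In practice, `format_segments(result)` would be called on the `result` from the usage snippet instead of `example_result`.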