---
base_model: openai/whisper-small
library_name: peft
tags:
- whisper
- lora
- transformers
- speech-to-text
- persian
- stt
- fine-tune
- adapter
datasets:
- vhdm/persian-voice-v1
language:
- fa
metrics:
- cer
- wer
---
# Whisper-Small Persian STT — LoRA Fine-Tuned
A fine-tuned version of **openai/whisper-small** for **Persian speech-to-text (ASR)** using **LoRA**.
This model is optimized for Persian conversational speech and clean, dataset-quality audio.
---
## Model Details
### Model Description
This model is a **LoRA fine-tuned Whisper Small** focused on **Persian (fa)** speech recognition.
It is intended to improve transcription accuracy over the base model on standard Persian audio segments (16 kHz, mono, normalized WAV).
- **Developed by:** *Mehdi Pouladrag*
- **Model type:** Speech-to-Text (ASR) — Whisper Small (Seq2Seq Transformer)
- **Language(s):** Persian (fa)
- **License:** MIT
- **Finetuned from:** `openai/whisper-small`
- **Dataset:** `vhdm/persian-voice-v1` (single dataset)
- **Training technique:** LoRA (Low-Rank Adaptation)
### Model Sources
- **Repository:** https://github.com/Mehdipoladrag/Fine-tuning-Whisper-Model
---
## Uses
### Direct Use
- Convert Persian speech to text
- Subtitle generation for Persian audio
- Conversational ASR
- Podcast / video transcription
- General Persian content recognition
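
For the uses listed above, a minimal loading and transcription sketch might look like the following. It assumes the `transformers` and `peft` packages are installed. `ADAPTER_ID` is a hypothetical placeholder, not a confirmed repository name, and the helper names `load_model` and `transcribe` are illustrative rather than part of the published code.

```python
# Hypothetical placeholder: replace with the actual adapter repo or a local path.
ADAPTER_ID = "path/to/this-lora-adapter"
BASE_ID = "openai/whisper-small"


def load_model(adapter_id=ADAPTER_ID, base_id=BASE_ID):
    """Load the base Whisper model and attach the LoRA adapter weights."""
    # Heavy dependencies are imported lazily so this module imports quickly.
    from peft import PeftModel
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    processor = WhisperProcessor.from_pretrained(base_id)
    base = WhisperForConditionalGeneration.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return processor, model


def transcribe(processor, model, samples, sampling_rate=16_000):
    """Transcribe one utterance given as a 1-D float array of mono audio."""
    inputs = processor(samples, sampling_rate=sampling_rate, return_tensors="pt")
    # Pin the language and task so Whisper does not auto-detect or translate.
    predicted_ids = model.generate(
        input_features=inputs.input_features,
        language="fa",
        task="transcribe",
    )
    return processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
```

Here `samples` is expected to be 16 kHz mono audio already loaded into memory. Passing `language` and `task` to `generate` requires a reasonably recent `transformers` release.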
### Downstream Use
- Integrate into ASR pipelines
- Use in real-time Persian voice applications
- Further fine-tuning on custom Persian domains (medical, legal, etc.)
### Out-of-Scope Use
- Non-Persian audio
- Low-quality or noisy audio with overlapping speakers
- Misuse for surveillance or unethical monitoring
---
## Bias, Risks, and Limitations
- Whisper may still struggle with dialect-heavy, noisy, or low-quality audio.
- The dataset used is relatively small (~6,099 audio–subtitle pairs), so:
  - Certain accents may be underrepresented.
  - The model may hallucinate or mis-transcribe in rare cases.
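
Given these limitations, it can help to measure error rates on your own audio. The sketch below computes WER and CER (the metrics listed for this model) from a plain Levenshtein distance. The function names are illustrative, and collapsing whitespace before computing CER is an assumption; evaluation toolkits may normalize text differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (free if equal)
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate over whitespace-collapsed text."""
    ref_chars = list(" ".join(reference.split()))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(" ".join(hypothesis.split()))
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


# One substituted word out of four gives a WER of 0.25.
print(wer("من به خانه رفتم", "من به مدرسه رفتم"))
```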
### Recommendations
Users should:
- Provide clean 16 kHz mono WAV audio
- Use domain-specific fine-tuning if necessary
- Validate outputs before critical use
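
As a quick way to follow the first recommendation, the sketch below checks a WAV file's sample rate and channel count using only the Python standard library. The helper name `check_wav` is illustrative, and it only inspects the header of uncompressed PCM WAV files; it does not judge noise or audio quality.

```python
import wave


def check_wav(path, expected_rate=16_000, expected_channels=1):
    """Return a list of format problems; an empty list means the file matches."""
    problems = []
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    if channels != expected_channels:
        problems.append(f"{channels} channel(s), expected {expected_channels}")
    return problems
```

Files that fail the check can be resampled and downmixed with any audio tool before being passed to the model.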