Whisper-Base Fine-tuned for Bahraini Arabic
Fine-tuned version of openai/whisper-base on the Bahraini Speech Dataset for Bahraini Arabic dialect transcription with Arabic-English code-switching support.
Developed as part of the Nota AI Meeting Assistant senior project at the University of Bahrain.
Model Details
| Property | Value |
|---|---|
| Base model | openai/whisper-base |
| Parameters | 72.6M |
| Language | Bahraini Arabic + English code-switching |
| Task | Automatic Speech Recognition |
| Format | Quantized ONNX (INT8) |
| Model size | 74 MB |
Training Details
| Hyperparameter | Value |
|---|---|
| Learning rate | 3e-5 |
| Batch size | 32 |
| Max steps | 8,000 |
| Warmup steps | 500 |
| Precision | bf16 |
| GPU | NVIDIA A100 (40 GB) |
| Training time | ~2.5 hours |
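As a rough sketch, the hyperparameters in the table can be written as a keyword dictionary. The key names follow the keyword arguments of `transformers.Seq2SeqTrainingArguments`, but the card does not say which trainer was used, and mapping "Batch size" to `per_device_train_batch_size` (single GPU, no gradient accumulation) is an assumption. The epoch estimate uses the 69,224 training clips listed under Dataset and holds only under that same assumption.

```python
# Hyperparameters copied from the table above. Key names mirror
# transformers.Seq2SeqTrainingArguments keyword arguments; treating
# "Batch size" as the per-device batch size is an assumption.
training_kwargs = {
    "learning_rate": 3e-5,
    "per_device_train_batch_size": 32,  # assumption: one GPU, no accumulation
    "max_steps": 8000,
    "warmup_steps": 500,
    "bf16": True,
}

NUM_TRAIN_CLIPS = 69_224  # from the Dataset section of this card

# Share of the run spent warming up the learning rate.
warmup_fraction = training_kwargs["warmup_steps"] / training_kwargs["max_steps"]

# Approximate passes over the data, valid only if the effective batch is 32.
samples_seen = training_kwargs["max_steps"] * training_kwargs["per_device_train_batch_size"]
approx_epochs = samples_seen / NUM_TRAIN_CLIPS

print(f"warmup covers {warmup_fraction:.2%} of training")
print(f"roughly {approx_epochs:.1f} epochs over the training clips")
```

If the dictionary matches your setup, it can be unpacked into the trainer arguments together with an output directory of your choosing.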
Dataset
- Name: Hishambarakat/Bahraini_Speech_Dataset
- Size: 69,224 training clips (~42 hours after filtering)
- Filtering: Removed clips shorter than 1.5 seconds
- Augmentation: Gaussian noise, time stretching, pitch shifting
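The duration filter and the Gaussian-noise augmentation listed above could look roughly like the sketch below. The function names, the noise standard deviation, and the use of plain Python lists are illustrative assumptions, not the project's actual pipeline. Time stretching and pitch shifting are omitted because they usually rely on an audio library.

```python
import random

MIN_DURATION_S = 1.5  # threshold from the Filtering bullet above


def keep_clip(num_samples: int, sample_rate: int) -> bool:
    """Return True if the clip lasts at least MIN_DURATION_S seconds."""
    return num_samples / sample_rate >= MIN_DURATION_S


def add_gaussian_noise(samples, noise_std=0.005, seed=None):
    """Return a copy of `samples` with zero-mean Gaussian noise added.

    noise_std is a hypothetical value; the card does not state the
    noise level used during training.
    """
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, noise_std) for s in samples]


# Usage: a 1.0 s clip at 16 kHz is dropped, a 2.0 s clip is kept and augmented.
clips = [[0.0] * 16000, [0.0] * 32000]
kept = [c for c in clips if keep_clip(len(c), 16000)]
augmented = [add_gaussian_noise(c, seed=0) for c in kept]
```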
Preprocessing
- Audio resampled from 24 kHz to 16 kHz
- Arabic diacritics removed
- Bahraini dialect spelling preserved (no MSA normalization)
- English words preserved as-is
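The text side of these steps can be sketched as follows. This is a minimal illustration that assumes "diacritics" means the Arabic combining marks U+064B to U+0652 plus the superscript alef U+0670; the exact character set used by the project is not stated. Letters are left untouched, so dialect spellings and English words pass through unchanged.

```python
import re

# Assumed diacritic set: tanwin, short vowels, shadda, sukun (U+064B-U+0652)
# and superscript alef (U+0670). Base letters are never modified.
_ARABIC_DIACRITICS = re.compile("[\u064B-\u0652\u0670]")


def strip_diacritics(text: str) -> str:
    """Remove Arabic diacritic marks while keeping every base character."""
    return _ARABIC_DIACRITICS.sub("", text)


# Usage: mixed Arabic-English input; only the combining marks are removed.
sample = "مَرْحَبًا meeting"
print(strip_diacritics(sample))
```

For the audio side, any standard resampler can convert 24 kHz input to 16 kHz; for example, `librosa.resample(y, orig_sr=24000, target_sr=16000)` is one option, though the card does not say which tool was used.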
Evaluation Results
| Model | WER |
|---|---|
| whisper-base (no fine-tuning) | ~88% |
| This model (V1, 6,000 steps) | 58.3% |
| This model (V2, 8,000 steps + filtering + augmentation) | 54.4% |
Note: WER is measured against model-generated test labels, not human annotations. True WER against human ground truth is estimated to be 5-10 points lower.
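For reference, WER is the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The sketch below is a generic implementation of that definition with whitespace tokenization; it is not necessarily the tool or the text normalization used to produce the numbers above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)


# Usage: one substituted word out of four reference words.
print(word_error_rate("a b c d", "a x c d"))
```

Because insertions are counted, WER can exceed 100% when the hypothesis is much longer than the reference.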
Usage with Transformers.js (Browser)
import { pipeline } from '@huggingface/transformers';
const transcriber = await pipeline(
  'automatic-speech-recognition',
  'Fatimaa75/whisper-base-bahraini'
);
const result = await transcriber(audioData, {
  language: 'arabic',
  task: 'transcribe',
});
console.log(result.text);
Usage with Python
from transformers import WhisperProcessor, WhisperForConditionalGeneration
import torch
processor = WhisperProcessor.from_pretrained("Fatimaa75/whisper-base-bahraini")
model = WhisperForConditionalGeneration.from_pretrained("Fatimaa75/whisper-base-bahraini")
# audio_array is not defined here: supply a 1-D float array of mono samples at 16 kHz
inputs = processor(audio_array, sampling_rate=16000, return_tensors="pt")
# Inference only, so gradients are disabled
with torch.no_grad():
    predicted_ids = model.generate(
        inputs.input_features,
        language="arabic",
        task="transcribe",
        no_repeat_ngram_size=3,
    )
# Decode token IDs to text, dropping special tokens
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
print(transcription)
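The Python snippet above transcribes a single clip. Whisper models operate on input windows of up to 30 seconds, so a meeting-length recording has to be split before it is passed to the processor. The helper below is a naive fixed-window splitter offered only as a sketch; its name and defaults are hypothetical, and it can cut words at window boundaries.

```python
WINDOW_S = 30        # Whisper's input window length in seconds
SAMPLE_RATE = 16000  # sampling rate expected by the processor


def split_into_windows(samples, sample_rate=SAMPLE_RATE, window_s=WINDOW_S):
    """Split a 1-D sequence of samples into consecutive fixed-length windows.

    The final window holds whatever remains and may be shorter.
    """
    step = sample_rate * window_s
    return [samples[i:i + step] for i in range(0, len(samples), step)]


# Usage: 65 seconds of silence yields two full windows and one 5-second tail.
audio = [0.0] * (SAMPLE_RATE * 65)
windows = split_into_windows(audio)
print([len(w) / SAMPLE_RATE for w in windows])
```

Each window can then be transcribed as shown above and the texts joined in order. A very short final window may fall under the 1.5-second length that the Limitations section flags as unreliable.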
Limitations
- Model capacity: whisper-base (72.6M parameters) has a WER ceiling for this dialect; whisper-small is recommended for production
- Short clips: Clips under 1.5 seconds remain unreliable
- Heavy dialect: Non-standard Bahraini words cause failures
- Code-switching: English word preservation is inconsistent
- Test labels: Evaluation uses model-generated labels, not human annotations
Intended Use
Designed for integration into Nota, a privacy-first AI meeting assistant. Best suited for:
- Meeting transcription (post-processing)
- Transcript search and summarization
- Bahraini Arabic speech with mixed English terminology
Not recommended for:
- Real-time live captioning where accuracy is critical
- Single-word or very short utterance recognition
Citation
If you use this model, please cite:
- University of Bahrain - ITCS 499 Senior Project 2025-2026
- Nota AI Meeting Assistant
- Fine-tuned Whisper-base for Bahraini Arabic ASR