🗣 Whisper-Small Malayalam → English (LoRA Fine-Tuned)
This model is a LoRA-fine-tuned version of OpenAI’s Whisper-Small, trained for
Malayalam → English speech translation using the
Be-win/malayalam-speech-with-english-translation-10h dataset.
📊 Evaluation Results
These scores are computed on the 10% test split of the dataset.
| Metric | Score | Notes |
|---|---|---|
| WER | 75.76% | Word Error Rate (lower is better) |
| BLEU | 22.44 | N-gram overlap with the reference translation (higher is better) |
| COMET | 0.7483 | Learned semantic similarity to the reference translation (higher is better) |
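For readers unfamiliar with the WER figure, the sketch below shows how a word error rate can be computed from one reference and one hypothesis: the word-level edit distance divided by the number of reference words. It is only an illustration. The function name `word_error_rate`, the lowercasing, and the whitespace tokenization are assumptions, and the card does not state which text normalization or library produced the reported score.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: text is lowercased and split on whitespace. The
    normalization used for the reported score is not documented here.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# One substitution (sat -> sit) and one deletion (the) over six reference words.
score = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER: {score:.2%}")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.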
🧠 Usage Example
import torch
from transformers import WhisperProcessor, WhisperForConditionalGeneration, pipeline
from peft import PeftModel
from datasets import load_dataset
# --- Config ---
REPO_ID = "Be-win/whisper-small-mal-en-lora"
BASE_MODEL = "openai/whisper-small"
DEVICE = "cuda:0" if torch.cuda.is_available() else "cpu"
# --- 1. Load Model + Processor ---
base_model = WhisperForConditionalGeneration.from_pretrained(BASE_MODEL)
peft_model = PeftModel.from_pretrained(base_model, REPO_ID)
peft_model.to(DEVICE)
processor = WhisperProcessor.from_pretrained(REPO_ID)
# --- 2. Create Pipeline ---
# The pipeline handles all preprocessing and postprocessing
speech_translator = pipeline(
    "automatic-speech-recognition",
    model=peft_model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    device=DEVICE,
)
# --- 3. Define Generation Arguments ---
# These arguments force the model to perform the translation task
gen_kwargs = {"task": "translate", "language": "malayalam"}
# --- 4. Run Inference ---
# Load a sample audio file (e.g., from the test dataset)
ds = load_dataset("Be-win/malayalam-speech-with-english-translation-10h", split="test")
sample = ds[0]["audio"] # Example: {'path': '...', 'array': ..., 'sampling_rate': 16000}
# Transcribe and translate
result = speech_translator(sample["array"], generate_kwargs=gen_kwargs)
print(f"Translation: {result['text']}")
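The example above passes the bare array, which relies on the audio already being sampled at 16 kHz, the rate Whisper's feature extractor expects. For audio from other sources, the pipeline also accepts a dict with `raw` and `sampling_rate` keys, so it can see the real rate. The helpers below are a minimal sketch of that conversion. The names `to_pipeline_input` and `needs_resampling` are hypothetical, and the input is assumed to be a `datasets`-style audio dict like the one shown in the comment above.

```python
WHISPER_SAMPLING_RATE = 16_000  # Whisper's feature extractor expects 16 kHz audio


def to_pipeline_input(audio: dict) -> dict:
    """Convert a datasets-style audio dict into the pipeline's raw-audio dict.

    Hypothetical helper: it only renames keys and validates the input,
    it does not resample anything itself.
    """
    missing = [key for key in ("array", "sampling_rate") if key not in audio]
    if missing:
        raise KeyError(f"audio dict is missing keys: {missing}")
    return {"raw": audio["array"], "sampling_rate": int(audio["sampling_rate"])}


def needs_resampling(audio: dict) -> bool:
    """Return True when the audio is not already at Whisper's 16 kHz rate."""
    return int(audio["sampling_rate"]) != WHISPER_SAMPLING_RATE
```

With these in place, the call becomes `speech_translator(to_pipeline_input(sample), generate_kwargs=gen_kwargs)`. When the rates differ, the pipeline resamples internally, which to my knowledge requires `torchaudio` to be installed.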