
Whisper Small — Vietnamese (LoRA Fine-tuned)

Fine-tuned version of openai/whisper-small on Vietnamese speech using LoRA adapters and the Mozilla Common Voice 11 dataset.

Training Results

Metric	Value
Training Loss	0.9382
Epochs	5
Global Steps	470
Samples/sec	7.37
Total FLOPs	4.60e+18

Model Details

Base model: openai/whisper-small (244M params)
Method: LoRA (Low-Rank Adaptation)
Trainable params: ~13M (5.09% of all parameters, adapter included)
Target modules: q_proj, v_proj, k_proj, out_proj, fc1, fc2
LoRA rank: 32 · alpha: 64 · dropout: 0.05
Language: Vietnamese (vi)
Task: Transcription
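
The trainable-parameter figure above can be reproduced from the rank and the target modules alone. The sketch below is plain arithmetic, not the author's code. It assumes the public whisper-small dimensions (hidden size 768, feed-forward size 3072, 12 encoder and 12 decoder layers) and a base count of 241,734,912 parameters as counted by Transformers, which the card rounds to 244M.

```python
# Whisper-small dimensions (assumed from the public openai/whisper-small config).
D_MODEL = 768
FFN_DIM = 3072
ENCODER_LAYERS = 12
DECODER_LAYERS = 12
RANK = 32  # LoRA rank stated in this card


def lora_params(in_features: int, out_features: int, rank: int = RANK) -> int:
    """A LoRA pair adds A (rank x in) and B (out x rank) to one linear layer."""
    return rank * (in_features + out_features)


# q_proj, k_proj, v_proj, out_proj of one attention block
attn_block = 4 * lora_params(D_MODEL, D_MODEL)
# fc1 and fc2 of one feed-forward block
mlp_block = lora_params(D_MODEL, FFN_DIM) + lora_params(FFN_DIM, D_MODEL)

encoder_layer = attn_block + mlp_block      # self-attention + MLP
decoder_layer = 2 * attn_block + mlp_block  # self- and cross-attention + MLP

trainable = ENCODER_LAYERS * encoder_layer + DECODER_LAYERS * decoder_layer

# Assumption: base parameter count as counted by Transformers.
BASE_PARAMS = 241_734_912
share = 100 * trainable / (BASE_PARAMS + trainable)

print(f"{trainable:,} trainable parameters, {share:.2f}% of base + adapter")
```

Under these assumptions the count comes to 12,976,128, and the 5.09% share is obtained when the adapter weights are included in the denominator.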

Training Details

Dataset: Mozilla Common Voice 11.0 (vi)
Learning rate: 1e-4 with linear warmup (500 steps)
Batch size: 8 × 2 gradient accumulation = effective 16
Precision: FP16
Framework: 🤗 Transformers + PEFT
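
To make the schedule numbers concrete, here is a small sketch of linear warmup and the effective batch size. It assumes that the 500 warmup steps and the 470 global steps both count optimizer steps, and it holds the rate at its peak after warmup only as a placeholder, since the card does not say what follows warmup. The name `warmup_lr` is hypothetical.

```python
PEAK_LR = 1e-4        # learning rate from this card
WARMUP_STEPS = 500    # linear warmup length from this card
GLOBAL_STEPS = 470    # global steps from the training results
PER_DEVICE_BATCH = 8
GRAD_ACCUM = 2


def warmup_lr(step: int, peak_lr: float = PEAK_LR, warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at an optimizer step: linear ramp from 0 to peak_lr.

    Assumption: the rate stays at peak_lr once warmup is over.
    """
    return peak_lr * min(1.0, step / warmup_steps)


effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM  # samples per optimizer step
final_lr = warmup_lr(GLOBAL_STEPS)               # rate at the last step of the run

print(f"effective batch: {effective_batch}, LR at step {GLOBAL_STEPS}: {final_lr:.2e}")
```

If both step counts are in the same units, the run ends at 94% of the warmup ramp, so the learning rate would peak near 9.4e-5 rather than 1e-4.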

Data augmentation applied:

Speed perturbation ±10% (p=0.3)
Additive Gaussian noise (p=0.3)
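
The card does not include the augmentation code, so the following is only a minimal NumPy sketch of the two transforms. The function names, the linear-interpolation resampler, and the 20 dB signal-to-noise ratio are all assumptions for illustration; only the ±10% speed range and the 0.3 probabilities come from the list above.

```python
import numpy as np


def speed_perturb(audio: np.ndarray, factor: float) -> np.ndarray:
    """Resample so playback is `factor` times faster (factor > 1 shortens the clip).

    Uses linear interpolation as a simple stand-in for a proper resampler.
    """
    n_out = int(round(len(audio) / factor))
    positions = np.linspace(0, len(audio) - 1, n_out)
    return np.interp(positions, np.arange(len(audio)), audio).astype(audio.dtype)


def add_gaussian_noise(audio: np.ndarray, snr_db: float, rng: np.random.Generator) -> np.ndarray:
    """Add white Gaussian noise scaled to the requested signal-to-noise ratio."""
    signal_power = float(np.mean(audio.astype(np.float64) ** 2))
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=audio.shape)
    return (audio + noise).astype(audio.dtype)


def augment(audio: np.ndarray, rng: np.random.Generator, p: float = 0.3) -> np.ndarray:
    """Apply each transform independently with probability p."""
    if rng.random() < p:
        audio = speed_perturb(audio, rng.uniform(0.9, 1.1))  # +/-10% speed
    if rng.random() < p:
        audio = add_gaussian_noise(audio, snr_db=20.0, rng=rng)  # hypothetical SNR
    return audio
```

Changing speed by resampling also shifts pitch; whether the original pipeline preserved pitch is not stated.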

Usage

from peft import PeftModel
from transformers import WhisperForConditionalGeneration, WhisperProcessor
import torch

base = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
model = PeftModel.from_pretrained(base, "LakoreAI/whisper-small-vi-lora")
processor = WhisperProcessor.from_pretrained("LakoreAI/whisper-small-vi-lora")

# Optional: merge LoRA for faster inference
model = model.merge_and_unload()
model.eval()

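# Inference: audio_array is a 1-D float waveform at 16 kHz, the rate Whisper expects
def transcribe(audio_array, sampling_rate=16000):
    # Convert the waveform to log-mel input features
    inputs = processor(audio_array, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        ids = model.generate(
            # Passed by keyword, which some PEFT wrappers require if the merge step is skipped
            input_features=inputs.input_features,
            language="vietnamese",
            task="transcribe",
            max_new_tokens=225,
        )
    # Decode the first (only) sequence and drop special tokens
    return processor.tokenizer.decode(ids[0], skip_special_tokens=True)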

Limitations

Optimized for Vietnamese only; accuracy on other languages will likely degrade significantly
Common Voice data skews toward read speech; spontaneous/accented speech may perform worse
Short clips (<1s) or clipped audio may cause hallucinations
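
One way to act on the short-clip limitation is to check duration before transcribing. This is a hedged sketch: `long_enough` is a hypothetical helper, and the 1-second threshold is taken from the limitation above rather than from any measured cutoff.

```python
def long_enough(audio_array, sampling_rate: int = 16000, min_seconds: float = 1.0) -> bool:
    """Return True when the clip lasts at least min_seconds."""
    return len(audio_array) / sampling_rate >= min_seconds


# Example: half a second of audio at 16 kHz is flagged as too short.
half_second = [0.0] * 8000
print(long_enough(half_second))
```

Clips that fail the check could be skipped or padded with context before being passed to a transcription function.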
