How to use SiangLao/hubert-lao-asr with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("automatic-speech-recognition", model="SiangLao/hubert-lao-asr")
# Load model directly
from transformers import AutoProcessor, AutoModelForCTC
processor = AutoProcessor.from_pretrained("SiangLao/hubert-lao-asr")
model = AutoModelForCTC.from_pretrained("SiangLao/hubert-lao-asr")

A HuBERT-Large model fine-tuned for Lao automatic speech recognition (ASR). It achieves a character error rate (CER) of 25.37% on the test split.
This model is fine-tuned from facebook/hubert-large-ll60k using the SiangLao/lao-asr-thesis-dataset.
| Split | CER | Loss |
|---|---|---|
| Test | 25.37% | 0.652 |
| Validation | 25.16% | 0.668 |
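For readers unfamiliar with the metric, CER is conventionally the character-level edit distance between the predicted and reference transcripts, divided by the number of reference characters. The sketch below is a minimal illustration of that definition, not the evaluation script behind the numbers above. The function names `edit_distance` and `cer` are hypothetical, and it assumes corpus-level aggregation and counting by Unicode code point, neither of which this card specifies.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in code points."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference characters into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # reference character missing from hypothesis
                curr[j - 1] + 1,     # extra character in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(references, hypotheses) -> float:
    """Assumed corpus-level CER: total edits / total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    return total_edits / total_chars


# One substitution in a four-character reference gives a CER of 0.25.
print(cer(["abcd"], ["abxd"]))
```

Because Lao vowel signs and tone marks are separate code points, counting by code point rather than by visual character cluster can shift the score, so figures are only comparable when computed the same way.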
from transformers import HubertForCTC, Wav2Vec2Processor
import torch
import librosa
# Load model and processor
model = HubertForCTC.from_pretrained("SiangLao/hubert-lao-asr")
processor = Wav2Vec2Processor.from_pretrained("SiangLao/hubert-lao-asr")
# Load audio; librosa resamples it to 16 kHz, the rate the model requires
audio, sr = librosa.load("audio.wav", sr=16000)
# Generate prediction
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)[0]
# Clean transcription
transcription = transcription.replace("<unk>", " ").strip()
print(transcription)
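The argmax and `batch_decode` steps above amount to greedy CTC decoding: take the most likely label per frame, merge consecutive repeats, then drop the blank label. The following is a simplified sketch of that idea in plain Python, not the tokenizer's actual implementation. The function name `ctc_greedy_decode`, the toy vocabulary, the blank id of 0, and the `|` word delimiter are all assumptions for illustration; the real mapping comes from the checkpoint's vocabulary.

```python
def ctc_greedy_decode(frame_ids, id_to_token, blank_id=0, word_delimiter="|"):
    """Merge consecutive repeated labels, drop blanks, and join the tokens."""
    tokens = []
    prev = None
    for idx in frame_ids:
        # Emit a token only when the label changes and is not the blank.
        if idx != prev and idx != blank_id:
            tokens.append(id_to_token[idx])
        prev = idx
    # Assumed convention: the delimiter token marks a word boundary.
    return "".join(tokens).replace(word_delimiter, " ")


# Hypothetical toy vocabulary; id 0 plays the role of the CTC blank.
toy_vocab = {0: "<pad>", 1: "a", 2: "b", 3: "|"}
print(ctc_greedy_decode([1, 1, 0, 1, 3, 3, 2], toy_vocab))
```

The blank between the two runs of label 1 is what lets a genuinely doubled character survive the merge step, which is why blanks are removed only after repeats are collapsed.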
@thesis{naovalath2025lao,
title={Lao Automatic Speech Recognition using Transfer Learning},
author={Souphaxay Naovalath and Sounmy Chanthavong},
advisor={Dr. Somsack Inthasone},
school={National University of Laos, Faculty of Natural Sciences, Computer Science Department},
year={2025}
}