Whisper-Medium-KsponSpeech

The Whisper-medium model fine-tuned on KsponSpeech.

Model Description

  • Developed by : yw0nam
  • Shared by : yw0nam
  • Model type : ASR (automatic speech recognition)
  • License : apache-2.0

Uses


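import librosa
from transformers import WhisperProcessor, WhisperForConditionalGeneration

# Load the processor from the base checkpoint and the fine-tuned weights (a CUDA GPU is required as written)
processor = WhisperProcessor.from_pretrained("openai/whisper-medium", language="ko", task="transcribe")
model = WhisperForConditionalGeneration.from_pretrained('spow12/whisper-medium-zeroth_korean').cuda()

# wav_path is the path to the audio file to transcribe; librosa resamples it to 16 kHz
data, _ = librosa.load(wav_path, sr=16000)
input_features = processor(data, sampling_rate=16000, return_tensors="pt").input_features.cuda()

# Generate token IDs and decode them into text
predicted_ids = model.generate(input_features)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]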
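
The snippet above passes the whole recording to the processor in one call. Whisper works on 30-second input windows, so a longer recording would likely need to be split first. The following is a minimal sketch of one way to do that, not part of the original card; `split_into_chunks`, `SAMPLE_RATE`, and `CHUNK_SECONDS` are hypothetical names introduced here for illustration.

```python
SAMPLE_RATE = 16000      # matches sr=16000 in the usage snippet
CHUNK_SECONDS = 30       # Whisper's input window length


def split_into_chunks(samples, sample_rate=SAMPLE_RATE, chunk_seconds=CHUNK_SECONDS):
    """Split a 1-D sequence of audio samples into consecutive fixed-length chunks.

    Hypothetical helper: works on lists and NumPy arrays alike, since it only
    relies on len() and slicing. The final chunk may be shorter than the rest.
    """
    chunk_len = sample_rate * chunk_seconds
    if chunk_len <= 0:
        raise ValueError("sample_rate and chunk_seconds must be positive")
    return [samples[i:i + chunk_len] for i in range(0, len(samples), chunk_len)]
```

Each chunk could then go through `processor(...)` and `model.generate(...)` as in the snippet above, with the decoded strings joined afterwards. Fixed boundaries can cut a word in half, so overlapping chunks or splitting on silence may give better transcripts.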

Metrics

Metric    Result
WER       3.96
CER       1.71
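
The card does not say how these scores were computed or normalized. As a rough illustration of what the two metrics measure, here is a minimal sketch that derives word error rate and character error rate from the Levenshtein edit distance. It assumes the scores are percentages and that CER ignores spaces; both are assumptions rather than details from the card, and the function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance (substitutions, insertions, deletions) between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent, using whitespace tokenization."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent, ignoring spaces (an assumption)."""
    ref_chars = list(reference.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(hypothesis.replace(" ", ""))
    return 100.0 * edit_distance(ref_chars, hyp_chars) / len(ref_chars)


reference = "안녕하세요 반갑습니다"
hypothesis = "안녕하세요 반갑습니당"
print(wer(reference, hypothesis))  # 50.0 (one of two words differs)
print(cer(reference, hypothesis))  # 10.0 (one of ten characters differs)
```

For Korean, CER is often the more informative of the two, because one wrong syllable inside a long space-delimited word counts as a full word error in WER.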

Model size : 0.8B params (Safetensors, tensor type F32)
