How to use INo0121/whisper-base-ko-callvoice with Transformers:

    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("automatic-speech-recognition", model="INo0121/whisper-base-ko-callvoice")

    # Load model directly
    from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq

    processor = AutoProcessor.from_pretrained("INo0121/whisper-base-ko-callvoice")
    model = AutoModelForSpeechSeq2Seq.from_pretrained("INo0121/whisper-base-ko-callvoice")
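The loading snippet for this model stops once the pipeline or model object has been created. As a minimal sketch of the next step, the helper below wraps the same pipeline call and returns the transcript of a single file. The names `load_asr` and `transcribe` and the `chunk_length_s=30` setting are illustrative assumptions, not part of the model card, and decoding an audio file from a path assumes ffmpeg is installed.

```python
from functools import lru_cache

MODEL_ID = "INo0121/whisper-base-ko-callvoice"


@lru_cache(maxsize=1)
def load_asr():
    """Build the speech-recognition pipeline once and reuse it afterwards."""
    # Imported lazily so that importing this module does not require transformers.
    from transformers import pipeline

    return pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        chunk_length_s=30,  # assumption: split longer recordings into 30-second chunks
    )


def transcribe(audio_path: str) -> str:
    """Return the recognized text for one audio file (hypothetical helper)."""
    result = load_asr()(audio_path)
    return result["text"]
```

Calling `transcribe("sample_call.wav")` with a path to a real recording (the file name here is hypothetical) downloads the model on first use and returns the transcript as a string.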
Whisper Base for Korean Low Quality Call Voices
This model is a fine-tuned version of openai/whisper-base on the Korean Low Quality Call Voices dataset. It achieves the following results on the evaluation set:
- Loss: 0.4941
- Cer: 30.7538
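"Cer" is the character error rate: the number of character-level edits (substitutions, insertions, deletions) needed to turn the model output into the reference transcript, divided by the number of reference characters. The sketch below shows one common way to compute it. It assumes the value above is reported as a percentage and that whitespace counts as a character; the card does not state either, and the evaluation script used for this model may normalize text differently.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer_percent(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level CER as a percentage (assumed reporting convention)."""
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return 100.0 * total_edits / total_chars


# One wrong character out of five reference characters.
print(cer_percent(["안녕하세요"], ["안녕하세오"]))  # 20.0
```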
Model description
This model was fine-tuned for project use. Starting from OpenAI's Whisper-Base model, it was fine-tuned to improve recognition accuracy on Korean low-quality voice call data. The data used is a subset of AI-HUB's "low-quality telephone network voice recognition data": 240,771.06 seconds of audio (about 5.296 seconds per file on average) and 1,696,414 characters of transcript text.
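To put these figures in more familiar units, the short calculation below derives the total duration in hours, an approximate file count, and the average transcript density. It uses only the three numbers quoted above. Because the per-file average is rounded, the file count is an estimate rather than a figure from the card.

```python
TOTAL_SECONDS = 240_771.06      # total audio duration from the description
MEAN_SECONDS_PER_FILE = 5.296   # average clip length from the description
TOTAL_CHARACTERS = 1_696_414    # transcript size from the description

total_hours = TOTAL_SECONDS / 3600
approx_files = round(TOTAL_SECONDS / MEAN_SECONDS_PER_FILE)  # estimate only
chars_per_second = TOTAL_CHARACTERS / TOTAL_SECONDS

print(f"{total_hours:.1f} hours of audio")
print(f"about {approx_files:,} files")
print(f"{chars_per_second:.2f} transcript characters per second")
```

This works out to roughly 66.9 hours of audio across about 45,000 short clips, at around 7 transcript characters per second of speech.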
Intended uses & limitations
Both the base model and the dataset used for fine-tuning were used for learning purposes, so this model may likewise be used only for learning purposes.
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 8000
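The scheduler settings above describe a learning rate that ramps up over the first 500 steps and then decays linearly to zero at step 8000. The sketch below reproduces that shape in plain Python. It assumes the usual linear-with-warmup behavior (linear ramp from 0 to the peak, then linear decay to 0); the function name `linear_schedule_lr` is illustrative.

```python
PEAK_LR = 1e-5          # learning_rate
WARMUP_STEPS = 500      # lr_scheduler_warmup_steps
TRAINING_STEPS = 8000   # training_steps


def linear_schedule_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        # Warmup: ramp linearly from 0 up to the peak rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay: fall linearly from the peak rate to 0 at the final step.
    remaining = max(0, TRAINING_STEPS - step)
    return PEAK_LR * remaining / (TRAINING_STEPS - WARMUP_STEPS)


for step in (0, 250, 500, 4250, 8000):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the rate is 0 at step 0, half the peak at step 250, the full 1e-05 at step 500, half the peak again at step 4250, and 0 at step 8000.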
Training results
| Training Loss | Epoch | Step | Validation Loss | Cer |
|---|---|---|---|---|
| 0.6416 | 0.44 | 1000 | 0.6564 | 64.1489 |
| 0.5914 | 0.88 | 2000 | 0.5688 | 37.4957 |
| 0.435 | 1.32 | 3000 | 0.5349 | 32.6734 |
| 0.4056 | 1.76 | 4000 | 0.5124 | 30.9065 |
| 0.3368 | 2.2 | 5000 | 0.5057 | 32.6925 |
| 0.3107 | 2.64 | 6000 | 0.4979 | 32.8315 |
| 0.3016 | 3.08 | 7000 | 0.4947 | 29.3060 |
| 0.2979 | 3.52 | 8000 | 0.4941 | 30.7538 |
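Validation loss and Cer do not reach their minimum at the same step in this table. The short sketch below copies the rows into Python and picks the best checkpoint under each criterion. The `Checkpoint` type is only an illustrative container for the values in the table.

```python
from typing import NamedTuple


class Checkpoint(NamedTuple):
    step: int
    validation_loss: float
    cer: float


# Values copied from the training results table.
checkpoints = [
    Checkpoint(1000, 0.6564, 64.1489),
    Checkpoint(2000, 0.5688, 37.4957),
    Checkpoint(3000, 0.5349, 32.6734),
    Checkpoint(4000, 0.5124, 30.9065),
    Checkpoint(5000, 0.5057, 32.6925),
    Checkpoint(6000, 0.4979, 32.8315),
    Checkpoint(7000, 0.4947, 29.3060),
    Checkpoint(8000, 0.4941, 30.7538),
]

best_by_cer = min(checkpoints, key=lambda c: c.cer)
best_by_loss = min(checkpoints, key=lambda c: c.validation_loss)

print("lowest Cer:", best_by_cer.step, best_by_cer.cer)
print("lowest validation loss:", best_by_loss.step, best_by_loss.validation_loss)
```

The lowest Cer (29.3060) occurs at step 7000, while the lowest validation loss (0.4941) occurs at the final step 8000, whose Cer of 30.7538 is the figure reported at the top of the card.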
Framework versions
- Transformers 4.34.0.dev0
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.13.3