whisper-medium-komixv2-mlx
This model is seastar105/whisper-medium-komixv2 converted to the MLX format. It can be used directly with mlx-whisper on Apple Silicon (M1/M2/M3/M4).
Original model information
| Item | Details |
|---|---|
| Base model | openai/whisper-medium (769M parameters) |
| Training data | AI-Hub Korean speech (meetings, telephone, broadcast, multi-domain, and others) |
| Training environment | Google TPU Research Cloud, 50,000 steps |
| Original author | seastar105 |
Conversion details
| Item | Value |
|---|---|
| Conversion path | Flax (flax_model.msgpack) → NumPy → safetensors |
| Weight dtype | float16 |
| Number of weight keys | 946 |
| File size | ~1.4 GB |
| Conv1d axis transpose | (out, in, kernel) → (out, kernel, in) |
| Key remapping | HuggingFace → OpenAI → MLX naming |
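The Conv1d axis change and the float16 cast from the table above can be sketched in NumPy as follows. This is a minimal illustration, not the conversion script used for this model: the function names and the name-based test for picking out convolution weights are hypothetical, and the key remapping step is left out because the card does not list the mapping.

```python
import numpy as np


def convert_conv1d_weight(w: np.ndarray) -> np.ndarray:
    """Move a Conv1d kernel from (out, in, kernel) to (out, kernel, in)."""
    if w.ndim != 3:
        raise ValueError(f"expected a 3-D Conv1d weight, got shape {w.shape}")
    # Swap the last two axes; copy so the result is contiguous in memory.
    return np.ascontiguousarray(w.transpose(0, 2, 1))


def convert_weights(weights: dict) -> dict:
    """Transpose Conv1d kernels and cast every array to float16.

    Assumption: convolution kernels are the 3-D arrays whose name contains
    "conv" and ends in "weight". This heuristic is hypothetical.
    """
    converted = {}
    for name, w in weights.items():
        w = np.asarray(w)
        if w.ndim == 3 and "conv" in name and name.endswith("weight"):
            w = convert_conv1d_weight(w)
        converted[name] = w.astype(np.float16)
    return converted


# Dummy tensors shaped like the first encoder convolution of whisper-medium:
# 1024 output channels, 80 mel input channels, kernel width 3.
demo = {
    "encoder.conv1.weight": np.zeros((1024, 80, 3), dtype=np.float32),
    "encoder.conv1.bias": np.zeros((1024,), dtype=np.float32),
}
converted = convert_weights(demo)
print(converted["encoder.conv1.weight"].shape)  # (1024, 3, 80)
```

The transpose only reorders axes, so it is lossless; the float16 cast is the step that changes numerical precision.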
Evaluation results (Zeroth-Korean test set)
Measured on 46 samples (10% of the 457-sample test set) with greedy decoding:
| Model | WER (↓) | CER (↓) | Processing time |
|---|---|---|---|
| whisper-medium-komixv2-mlx (this model) | 21.81% | 6.14% | 49 s |
| mlx-community/whisper-medium-mlx (general-purpose) | 25.25% | 6.89% | 48 s |
On Korean CER, this model improves on the general-purpose model by 0.75 percentage points (6.89% to 6.14%).
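For readers unfamiliar with the two metrics, the sketch below shows one common way to compute WER and CER from a Levenshtein edit distance. It is an illustration only: the card does not say which scoring tool or text normalization produced the numbers above, and dropping spaces before computing CER is an assumption made here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert/delete/substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate with spaces removed (an assumption of this sketch)."""
    ref_chars = list(reference.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(hypothesis.replace(" ", ""))
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


# One dropped particle counts as a whole wrong word but only one wrong character.
reference = "오늘 날씨가 좋다"
hypothesis = "오늘 날씨 좋다"
print(f"WER = {wer(reference, hypothesis):.3f}, CER = {cer(reference, hypothesis):.3f}")
```

The example also hints at why WER sits well above CER for Korean: a single missing particle turns a whole space-delimited word into an error.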
Usage
pip install mlx-whisper
import mlx_whisper
result = mlx_whisper.transcribe(
    "audio.wav",
    path_or_hf_repo="youngouk/whisper-medium-komixv2-mlx",
    language="ko",
)
print(result["text"])
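Beyond the full text, the result can also be read segment by segment. The sketch below assumes the returned dictionary carries a "segments" list whose entries have "start", "end", and "text" fields, as in OpenAI Whisper's output; check this against the installed mlx-whisper version. The helper names are hypothetical, and the sample dictionary is hand-written so the snippet runs without the model.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def format_segments(result: dict) -> list:
    """Turn a transcription result into one timestamped line per segment."""
    lines = []
    for seg in result.get("segments", []):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} -> {end}] {seg['text'].strip()}")
    return lines


# Hand-written stand-in for the dictionary returned by transcribe().
sample_result = {
    "text": " 안녕하세요. 반갑습니다.",
    "segments": [
        {"start": 0.0, "end": 1.5, "text": " 안녕하세요."},
        {"start": 1.5, "end": 3.0, "text": " 반갑습니다."},
    ],
}
for line in format_segments(sample_result):
    print(line)
```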
Model configuration (config.json)
{
  "n_mels": 80,
  "n_audio_ctx": 1500,
  "n_audio_state": 1024,
  "n_audio_head": 16,
  "n_audio_layer": 24,
  "n_vocab": 51865,
  "n_text_ctx": 448,
  "n_text_state": 1024,
  "n_text_head": 16,
  "n_text_layer": 24,
  "model_type": "whisper"
}
}
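A few quantities implied by these values can be checked with a short script. The sketch below embeds the configuration as a string so that it runs on its own. The 30-second window, 10 ms frame hop, stride-2 convolution, and the 51865 vocabulary threshold for multilingual checkpoints come from the reference OpenAI Whisper implementation, not from this card.

```python
import json

CONFIG_JSON = """
{
  "n_mels": 80, "n_audio_ctx": 1500, "n_audio_state": 1024,
  "n_audio_head": 16, "n_audio_layer": 24, "n_vocab": 51865,
  "n_text_ctx": 448, "n_text_state": 1024, "n_text_head": 16,
  "n_text_layer": 24, "model_type": "whisper"
}
"""
cfg = json.loads(CONFIG_JSON)

# The attention width must split evenly across heads.
if cfg["n_audio_state"] % cfg["n_audio_head"] != 0:
    raise ValueError("n_audio_state is not divisible by n_audio_head")
head_dim = cfg["n_audio_state"] // cfg["n_audio_head"]

# A 30 s window at 100 mel frames per second gives 3000 frames, and the
# stride-2 convolution in the encoder halves that to the encoder context.
mel_frames = 30 * 100
encoder_positions = mel_frames // 2

# Multilingual Whisper checkpoints use a vocabulary of at least 51865 tokens.
is_multilingual = cfg["n_vocab"] >= 51865

print(f"head_dim={head_dim}, encoder_positions={encoder_positions}, "
      f"multilingual={is_multilingual}")
```

The encoder position count matches n_audio_ctx, and the vocabulary size indicates a multilingual tokenizer, which is what lets language="ko" be passed at transcription time.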
License
- Base model (OpenAI Whisper): MIT License
- Original model (seastar105/whisper-medium-komixv2): trained on AI-Hub Korean speech data
- The AI-Hub data terms of use may apply: AI-Hub usage policy
Acknowledgments
- seastar105: training of the Korean Whisper model
- ml-explore/mlx-examples: MLX Whisper implementation
- AI-Hub: provision of the Korean speech datasets