whisper-medium-komixv2-mlx

This model is seastar105/whisper-medium-komixv2 converted to the MLX format. It can be used directly with mlx-whisper on Apple Silicon (M1/M2/M3/M4).

Original model information

| Item | Details |
|---|---|
| Base model | openai/whisper-medium (769M parameters) |
| Training data | AI-Hub Korean speech (meetings, phone calls, broadcasts, addresses, multi-domain) |
| Training environment | Google TPU Research Cloud, 50,000 steps |
| Original author | seastar105 |

Conversion details

| Item | Value |
|---|---|
| Conversion path | Flax (flax_model.msgpack) → NumPy → safetensors |
| Weight dtype | float16 |
| Number of weight keys | 946 |
| File size | ~1.4 GB |
| Conv1d axis reordering | (out, in, kernel) → (out, kernel, in) |
| Key remapping | HuggingFace → OpenAI → MLX naming |
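The Conv1d axis reordering and float16 cast listed above can be sketched with NumPy as follows. This is a minimal illustration, not the actual conversion script: the helper name `to_mlx_conv1d` is hypothetical, and the example shape is an assumption based on Whisper medium's first encoder convolution (80 mel bins in, 1024 channels out, kernel width 3).

```python
import numpy as np


def to_mlx_conv1d(weight: np.ndarray) -> np.ndarray:
    """Reorder a Conv1d kernel from (out, in, kernel) to (out, kernel, in).

    Hypothetical helper: also casts to float16, matching the weight dtype
    listed in the conversion table.
    """
    if weight.ndim != 3:
        raise ValueError(f"expected a 3-D Conv1d kernel, got shape {weight.shape}")
    # Swap the last two axes, then store contiguously as float16.
    return np.ascontiguousarray(weight.transpose(0, 2, 1)).astype(np.float16)


# Assumed example shape: (out=1024, in=80, kernel=3).
w = np.zeros((1024, 80, 3), dtype=np.float32)
converted = to_mlx_conv1d(w)
print(converted.shape, converted.dtype)  # (1024, 3, 80) float16
```

Only the 3-D convolution kernels need this reordering; other tensors would pass through with just the dtype cast.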

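Evaluation results (Zeroth-Korean test set)

Measured on 46 samples (10% of the 457-sample test set) with greedy decoding:

| Model | WER (↓) | CER (↓) | Processing time |
|---|---|---|---|
| whisper-medium-komixv2-mlx (this model) | 21.81% | 6.14% | 49 s |
| mlx-community/whisper-medium-mlx (general-purpose) | 25.25% | 6.89% | 48 s |

Relative to the general-purpose model, Korean CER improves by 0.75 percentage points (6.89% → 6.14%) and WER by 3.44 percentage points (25.25% → 21.81%).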
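For readers unfamiliar with the two metrics, here is a minimal sketch of how WER and CER are commonly computed from an edit distance. It is not the evaluation script behind the numbers above: the function names are hypothetical, and details such as text normalization and ignoring spaces for CER are assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute/insert/delete cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate, ignoring spaces (an assumption of this sketch)."""
    ref_chars = reference.replace(" ", "")
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hypothesis.replace(" ", "")) / len(ref_chars)


ref = "오늘 날씨가 좋습니다"
hyp = "오늘 날씨는 좋습니다"
print(f"WER={wer(ref, hyp):.2%}  CER={cer(ref, hyp):.2%}")
```

A single wrong particle counts as one full word error but only one character error, which is one reason Korean WER tends to sit well above CER, as in the table.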

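Usage

Install the package:

pip install mlx-whisper

Then transcribe an audio file:

import mlx_whisper

# Transcribe a local audio file with the converted weights from the Hugging Face repo.
result = mlx_whisper.transcribe(
    "audio.wav",
    path_or_hf_repo="youngouk/whisper-medium-komixv2-mlx",
    language="ko",  # decode as Korean instead of auto-detecting the language
)
print(result["text"])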
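Beyond the full transcript, per-segment timestamps are often useful. The sketch below assumes that the returned dictionary also carries a `"segments"` list whose entries have `"start"`, `"end"`, and `"text"` fields, mirroring OpenAI Whisper's output layout; that layout is an assumption here, and `format_segments` is a hypothetical helper. A stand-in dictionary is used so the example runs without a model.

```python
def format_segments(result: dict) -> list[str]:
    """Render each segment as '[start -> end] text' (times in seconds)."""
    lines = []
    for seg in result.get("segments", []):
        lines.append(f"[{seg['start']:.2f} -> {seg['end']:.2f}] {seg['text'].strip()}")
    return lines


# Stand-in for a transcription result (assumed layout, not real model output).
fake_result = {
    "text": "안녕하세요 반갑습니다",
    "segments": [
        {"start": 0.0, "end": 1.5, "text": " 안녕하세요"},
        {"start": 1.5, "end": 3.0, "text": " 반갑습니다"},
    ],
}

for line in format_segments(fake_result):
    print(line)
```

If the assumption holds, passing the `result` from `mlx_whisper.transcribe` in place of `fake_result` would print one timestamped line per segment.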

Model configuration (config.json)

{
  "n_mels": 80,
  "n_audio_ctx": 1500,
  "n_audio_state": 1024,
  "n_audio_head": 16,
  "n_audio_layer": 24,
  "n_vocab": 51865,
  "n_text_ctx": 448,
  "n_text_state": 1024,
  "n_text_head": 16,
  "n_text_layer": 24,
  "model_type": "whisper"
}
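A few derived quantities make these dimensions easier to read. The sketch below parses the same values and computes the per-head width and the encoder frame rate. The 30-second input window is general Whisper background rather than something stated in this card, and the helper name `head_dim` is hypothetical.

```python
import json

CONFIG_JSON = """
{
  "n_mels": 80,
  "n_audio_ctx": 1500,
  "n_audio_state": 1024,
  "n_audio_head": 16,
  "n_audio_layer": 24,
  "n_vocab": 51865,
  "n_text_ctx": 448,
  "n_text_state": 1024,
  "n_text_head": 16,
  "n_text_layer": 24,
  "model_type": "whisper"
}
"""

config = json.loads(CONFIG_JSON)


def head_dim(state: int, heads: int) -> int:
    """Width of one attention head; the state size must split evenly across heads."""
    if state % heads != 0:
        raise ValueError(f"state {state} is not divisible by {heads} heads")
    return state // heads


audio_head_dim = head_dim(config["n_audio_state"], config["n_audio_head"])
text_head_dim = head_dim(config["n_text_state"], config["n_text_head"])

# Assumption: Whisper encodes a 30-second window, so 1500 positions
# correspond to 50 encoder frames per second of audio.
WINDOW_SECONDS = 30
frames_per_second = config["n_audio_ctx"] / WINDOW_SECONDS

print(audio_head_dim, text_head_dim, frames_per_second)  # 64 64 50.0
```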

License

  • Base model (OpenAI Whisper): MIT License
  • Original model (seastar105/whisper-medium-komixv2): trained on AI-Hub Korean speech data
  • The AI-Hub data terms of use may apply: AI-Hub usage policy

Acknowledgements

  • seastar105: training the Korean Whisper model
  • ml-explore/mlx-examples: the MLX Whisper implementation
  • AI-Hub: providing the Korean speech datasets