batisay-ko-base 1.0: Free Korean STT (Apache 2.0)

Whisper Large v3 Turbo (809M) fine-tuned for Korean. Apache 2.0 · no gating · free to use, including external resale. A free base model that is robust on real-world long-form calls, and the default call backup model in the BatiFlow App.

CER benchmark

| Measurement | batisay-ko-base | Comparison |
| --- | --- | --- |
| General speech (KsponSpeech clean, RTZR-strict, N=500) | 7.77% | Whisper Large v3 raw 17.03% / RTZR API 5.91-6.18% |
| Real-world calls (2-51 min long-form, 5 recordings) | 19.11% | Stable across diverse speakers and environments (turbo 1.1 scores 14.85%) |

The strength of base is robust generalization to real calls. Training used mostly general speech, but the model shows no overfitting, so it holds up acceptably on real-world calls.
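For readers who want to score their own transcripts, the sketch below computes a character error rate as edit distance divided by reference length. It is an illustration only: the function names are hypothetical, and the whitespace-stripping step is an assumption, since the exact "RTZR-strict" normalization behind the numbers above is not described in this card.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference char missing in hyp)
                curr[j - 1] + 1,     # insertion (extra char in hyp)
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance / number of reference characters."""
    # Assumption: all whitespace is removed before scoring. The card's own
    # normalization rules are not specified and may differ.
    ref_chars = "".join(ref.split())
    hyp_chars = "".join(hyp.split())
    if not ref_chars:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)
```

Each precomposed Hangul syllable is a single code point in Python, so one wrong syllable counts as one character error under this definition.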

ํŒŒ์ผ

ggml-batisay-ko-base.bin         1.6 GB  (F32, ์ตœ๊ณ  quality)
ggml-batisay-ko-base-q5_0.bin    547 MB  (Q5, balanced) โญ ๊ถŒ์žฅ
ggml-batisay-ko-base-q4_0.bin    452 MB  (Q4, Mac 8GB)
model.safetensors                1.6 GB  (transformers / MLX ์†Œ์Šค)
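As a rough sanity check on the quantized file sizes, the arithmetic below estimates size from the 809M parameter count. It assumes the usual ggml block layout for Q4_0 and Q5_0 (32 weights per block plus one 16-bit scale) and assumes every tensor is quantized, which is a simplification; `estimated_mb` is a hypothetical helper.

```python
PARAMS = 809_000_000  # parameter count stated in this card

# Assumption: ggml Q4_0 / Q5_0 store 32 weights per block together with
# one 16-bit scale, giving 4.5 and 5.5 bits per weight respectively.
BITS_PER_WEIGHT = {
    "q4_0": (32 * 4 + 16) / 32,
    "q5_0": (32 * 5 + 16) / 32,
}


def estimated_mb(params: int, fmt: str) -> float:
    """Approximate file size in decimal megabytes, ignoring file headers."""
    return params * BITS_PER_WEIGHT[fmt] / 8 / 1e6


for fmt in ("q5_0", "q4_0"):
    print(f"{fmt}: ~{estimated_mb(PARAMS, fmt):.0f} MB")
```

The estimates come out near 556 MB and 455 MB, within a few percent of the 547 MB and 452 MB files listed.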

Usage: whisper.cpp / BatiFlow App

huggingface-cli download batiai/batisay-ko-base ggml-batisay-ko-base-q5_0.bin --local-dir .
./whisper-cli -m ggml-batisay-ko-base-q5_0.bin -l ko -f audio.wav --output-txt
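whisper.cpp has traditionally expected 16 kHz, 16-bit mono WAV input, so recordings in other formats usually need converting first. The sketch below wraps that conversion and the command above in Python. It assumes `ffmpeg` is on the PATH and that `--output-txt` writes the transcript next to the input as `<input>.txt`. The helper names are hypothetical.

```python
import subprocess
from pathlib import Path


def ffmpeg_cmd(src: str, dst: str) -> list[str]:
    """ffmpeg arguments that resample any input to 16 kHz mono 16-bit WAV."""
    return ["ffmpeg", "-y", "-i", src,
            "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", dst]


def whisper_cmd(model: str, wav: str, cli: str = "./whisper-cli") -> list[str]:
    """The whisper-cli invocation from this card, as an argument list."""
    return [cli, "-m", model, "-l", "ko", "-f", wav, "--output-txt"]


def transcribe_with_whisper_cpp(
    src: str, model: str = "ggml-batisay-ko-base-q5_0.bin"
) -> str:
    src_path = Path(src)
    wav = str(src_path.with_name(src_path.stem + "_16k.wav"))
    subprocess.run(ffmpeg_cmd(src, wav), check=True)
    subprocess.run(whisper_cmd(model, wav), check=True)
    # Assumption: the transcript lands in "<wav path>.txt".
    return Path(wav + ".txt").read_text(encoding="utf-8")
```

Passing argument lists rather than a shell string avoids quoting problems with file names that contain spaces.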

Usage: Python (transformers)

from transformers import WhisperForConditionalGeneration, WhisperProcessor
model = WhisperForConditionalGeneration.from_pretrained('batiai/batisay-ko-base')
processor = WhisperProcessor.from_pretrained('batiai/batisay-ko-base', language='Korean', task='transcribe')
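The lines above only load the model and processor. One way to get from there to an actual transcript is the transformers ASR pipeline, sketched below. The 30-second chunk length is an assumption rather than a setting recommended by this card, decoding a file path needs `ffmpeg` installed, and both function names are hypothetical.

```python
def transcribe(path: str, model_id: str = "batiai/batisay-ko-base") -> dict:
    """Run long-form transcription and return the pipeline's result dict."""
    # Imported lazily so format_chunks() works without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # assumption: 30 s windows for long recordings
    )
    return asr(
        path,
        return_timestamps=True,
        generate_kwargs={"language": "korean", "task": "transcribe"},
    )


def format_chunks(result: dict) -> str:
    """Render timestamped chunks as one '[start-end] text' line each."""
    lines = []
    for chunk in result.get("chunks", []):
        start, end = chunk["timestamp"]
        # The end time of the final chunk can be missing.
        end_s = f"{end:.1f}" if end is not None else "?"
        lines.append(f"[{start:.1f}-{end_s}] {chunk['text'].strip()}")
    return "\n".join(lines)
```

A typical call would be `print(format_chunks(transcribe("audio.wav")))`. The full text without timestamps is available under the result's `"text"` key.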

Training

  • Base: openai/whisper-large-v3-turbo (809M params)
  • Data: KsponSpeech 1000h + Zeroth-Korean 50h
  • 3 epochs, LR 1e-5 (linear schedule), 2-GPU DDP (A6000 48GB), 35.8 h training time
  • Trained: 2026-05-28 · Internal build: V7
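The "LR 1e-5 linear" setting can be read as a learning rate that decays linearly from its peak to zero over training. The sketch below shows that shape. The warmup length and total step count are not stated in this card, so both are placeholder parameters, and `linear_lr` is a hypothetical helper rather than the training code itself.

```python
def linear_lr(step: int, total_steps: int,
              peak_lr: float = 1e-5, warmup_steps: int = 0) -> float:
    """Linear warmup (optional) followed by linear decay to zero."""
    if warmup_steps and step < warmup_steps:
        # Ramp up from 0 to peak_lr during warmup.
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    # Decay from peak_lr at the end of warmup to 0 at total_steps.
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: 1000 total steps, purely for illustration.
for step in (0, 250, 500, 1000):
    print(step, linear_lr(step, total_steps=1000))
```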

๋ผ์ด์„ ์Šค

Apache 2.0 (BatiAI Open Tier 1) โ€” ๊ฒŒ์ดํŒ… ์—†์Œ, ์ƒ์—…ยท์™ธ๋ถ€ SaaS ํฌํ•จ ์ œ์•ฝ ์—†์Œ.

Related models (batisay STT lineup)

| Model | Version | License | Notes |
| --- | --- | --- | --- |
| batisay-ko-base (this model) | 1.0 | Apache 2.0 (free) | Free and robust. General 7.77% / real calls 19.11% |
| batisay-ko-turbo | 1.1 | Community v2 (gated) | Strengthened across call, meeting, and conversation domains. Real calls 14.85% (ahead of base), general 6.99% |
| batisay-ko-large | 1.0 | Community v2 (gated) | Specialized for high-quality document / clean transcription |

→ Free or external SaaS = base (this model) / call and multi-domain accuracy = turbo 1.1 / document transcription = large

๋ฌธ์˜: support@bati.ai ยท https://flow.bati.ai
