---
license: apache-2.0
language:
- ko
base_model:
- naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-0.5B
tags:
- speech-to-text
- korean
- llama
- audio
- voice
- bigdefence
- HyperCLOVAX
- naver
pipeline_tag: audio-text-to-text
---
## 🎧 Bigvox
- **Bigvox** is a high-performance, low-latency speech-language multimodal model specialized for Korean speech recognition. It is built on [naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-0.5B](https://huggingface.co/naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-0.5B). 🚀
- It uses an **end-to-end** speech multimodal architecture: speech input through text output is handled in a single pipeline, with no additional intermediate model.
![image/png](https://cdn-uploads.huggingface.co/production/uploads/653494138bde2fae198fe89e/NwonFS__hErgVy0p2Weu4.png)
### 📂 Model Access
- **GitHub**: [bigdefence/bigvox-hyperclovax](https://github.com/bigdefence/bigvox-hyperclovax) 🌐
- **HuggingFace**: [bigdefence/Bigvox-HyperCLOVAX-Audio](https://huggingface.co/bigdefence/Bigvox-HyperCLOVAX-Audio) 🤗
- **Model size**: 1B parameters 📊
## 🌟 Key Features
- **🇰🇷 Korean-specialized**: Optimized for Korean speech patterns and linguistic characteristics
- **⚡ Lightweight**: Efficient inference at 1B parameters
- **🎯 High accuracy**: Strong performance across diverse Korean speech environments
- **🔧 Practical**: Suitable for real-time speech recognition applications
## 📋 Model Information
| Item | Details |
|------|----------|
| **Base model** | naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-0.5B |
| **Language** | Korean |
| **Model size** | ~1B parameters |
| **Task type** | Speech-to-text (speech multimodal) |
| **License** | Apache 2.0 |
### 🔧 Repository Download and Environment Setup
To get started with **Bigvox**, clone the repository and set up the environment as follows. 🛠️
1. **Clone the repository**:
```bash
git clone https://github.com/bigdefence/bigvox-hyperclovax
cd bigvox-hyperclovax
```
2. **Install dependencies**:
```bash
bash setting.sh
```
### 📥 Download Methods
**Using the Hugging Face CLI**:
```bash
pip install -U huggingface_hub
huggingface-cli download bigdefence/Bigvox-HyperCLOVAX-Audio --local-dir ./checkpoints
```
**Using Snapshot Download**:
```bash
pip install -U huggingface_hub
```
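```python
from huggingface_hub import snapshot_download

# Download the full model repository into ./checkpoints.
# Interrupted downloads resume automatically in recent huggingface_hub
# releases, so the deprecated resume_download flag is omitted.
snapshot_download(
    repo_id="bigdefence/Bigvox-HyperCLOVAX-Audio",
    local_dir="./checkpoints",
)
```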
**Using Git**:
```bash
git lfs install
git clone https://huggingface.co/bigdefence/Bigvox-HyperCLOVAX-Audio
```
### 🛠️ Dependency Models
- **Speech Encoder**: [Whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) 🎤
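
As a minimal sketch, both the Bigvox checkpoint and this speech encoder could be fetched in one step with `huggingface_hub`. The repo IDs come from this README; `./checkpoints` follows the download commands above, and `./models/speech_encoder` is the directory named in the local inference steps. The names `TARGETS` and `fetch_all` are hypothetical, and whether the inference scripts expect the Hugging Face snapshot layout inside the encoder directory is an assumption to confirm against the GitHub repository.

```python
from pathlib import Path

# Repo IDs come from this README; the target directories follow its download
# commands (./checkpoints) and local-inference steps (./models/speech_encoder).
TARGETS = {
    "bigdefence/Bigvox-HyperCLOVAX-Audio": "./checkpoints",
    "openai/whisper-large-v3": "./models/speech_encoder",
}


def fetch_all(targets=TARGETS, download=None):
    """Download each repo into its directory and return {repo_id: local_path}.

    `fetch_all` and `TARGETS` are hypothetical names, not part of the Bigvox repo.
    """
    if download is None:
        # Imported lazily so the helper can be exercised with a stub downloader.
        from huggingface_hub import snapshot_download as download

    paths = {}
    for repo_id, local_dir in targets.items():
        # Create the target directory up front so a missing parent is not an error.
        Path(local_dir).mkdir(parents=True, exist_ok=True)
        paths[repo_id] = download(repo_id=repo_id, local_dir=str(local_dir))
    return paths
```

Calling `fetch_all()` with no arguments performs the real downloads, which total several gigabytes. Passing a different `targets` mapping changes where each repository lands.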
### 🔄 Local Inference
To run inference with **Bigvox**, follow the steps below to set up the model and run it locally. 📡
1. **Prepare the models**:
- Download **Bigvox** from [HuggingFace](https://huggingface.co/bigdefence/Bigvox-HyperCLOVAX-Audio) 📦
- Download the **Whisper-large-v3** speech encoder from [HuggingFace](https://huggingface.co/openai/whisper-large-v3) and place it in the `./models/speech_encoder/` directory 🎤
2. **Run inference**:
- **Speech-to-text (S2T)** inference:
- **Non-Streaming**
```bash
python3 omni_speech/infer/bigvox.py --query_audio test_audio.wav
```
- **Streaming**
```bash
python3 omni_speech/infer/bigvox_streaming.py --query_audio test_audio.wav
```
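
To transcribe several files, the commands above could be wrapped in a small Python driver. The sketch below is illustrative only: `build_command` and `transcribe_all` are hypothetical helpers, not part of the repository. It uses only the `--query_audio` flag shown above, assumes it is run from the repository root, and captures whatever each script prints to stdout without assuming any particular output format.

```python
import subprocess
import sys
from pathlib import Path

# Script paths exactly as given in the commands above, keyed by streaming mode.
SCRIPTS = {
    False: "omni_speech/infer/bigvox.py",
    True: "omni_speech/infer/bigvox_streaming.py",
}


def build_command(audio_path, streaming=False):
    """Return the argv list for one inference run (hypothetical helper)."""
    return [sys.executable, SCRIPTS[streaming], "--query_audio", str(audio_path)]


def transcribe_all(audio_dir, streaming=False, run=subprocess.run):
    """Run inference on every .wav file in audio_dir.

    Returns {file_name: raw stdout}, with None for runs that exited non-zero.
    """
    results = {}
    for wav in sorted(Path(audio_dir).glob("*.wav")):
        proc = run(build_command(wav, streaming), capture_output=True, text=True)
        results[wav.name] = proc.stdout.strip() if proc.returncode == 0 else None
    return results
```

Each file is run in its own process, so the model is reloaded per file. That keeps the sketch independent of the scripts' internals at the cost of startup time.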
## 🔧 Training Details
### Training Configuration
- **Base Model**: naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-0.5B
- **Hardware**: 1x NVIDIA RTX 6000A GPU
- **Training Time**: 3 hours
## ⚠️ Limitations
- Performance may degrade in environments with heavy background noise
- Recognition accuracy may drop for very fast or mumbled speech
- Recognition accuracy for technical terms and proper nouns may vary by domain
## 📜 License
This model is distributed under the Apache 2.0 license. Commercial use is permitted; see the [LICENSE](LICENSE) file for details.
## 📞 Contact
- **Developed by**: BigDefence
## 📈 Update Log
### v1.0.0 (2024.12)
- 🎉 **Initial model release**: Bigvox made public
- 🇰🇷 **Korean-specialized**: Korean speech-to-text multimodal model based on HyperCLOVAX-SEED-Text-Instruct-0.5B
---
## 🤝 Contributing
If you would like to contribute to the **Bigvox** project:
---
Build the future of Korean AI speech recognition with **BigDefence**! 🚀🇰🇷
*"Every voice matters, every word counts."*