gorani-3B / README.md
aripos1's picture
Update README.md
b7d1204 verified
---
license: apache-2.0
datasets:
- aripos1/gorani_dataset
language:
- ko
- en
- ja
base_model:
- unsloth/Llama-3.2-3B-Instruct-bnb-4bit
pipeline_tag: text-generation
library_name: transformers
---
# Gorani Model Card
## ์†Œ๊ฐœ (Introduce)
์ด ๋ชจ๋ธ์€ ๋ฒˆ์—ญ์„ ์œ„ํ•œ ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค. ํ•œ๊ตญ ๊ณ ์œ ์–ด์˜ ์ •ํ™•ํ•œ ๋ฒˆ์—ญ์„ ์ƒ์„ฑํ•˜๊ธฐ ์œ„ํ•ด ํ•œ๊ตญ์–ด, ์˜์–ด, ์ผ๋ณธ์–ด์˜ ์–ธ์–ด ๋ฐ์ดํ„ฐ๋ฅผ ํ˜ผํ•ฉํ•˜์—ฌ **unsloth/Llama-3.2-3B-Instruct-bnb-4bit**์„ ํ•™์Šต์‹œ์ผœ ์ƒ์„ฑ๋œ **gorani-1B** ์ž…๋‹ˆ๋‹ค.
gorani๋Š” ํ˜„์žฌ **ํ•œ๊ตญ์–ด, ์˜์–ด, ์ผ๋ณธ์–ด**๋งŒ ๋ฒˆ์—ญ์„ ์ง€์›ํ•ฉ๋‹ˆ๋‹ค.
### ๋ชจ๋ธ ์ •๋ณด
- **๊ฐœ๋ฐœ์ž**: airpos1
- **๋ชจ๋ธ ์œ ํ˜•**: **llama**๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•˜๋Š” **3B** ๋งค๊ฐœ๋ณ€์ˆ˜ ๋ชจ๋ธ์ธ **gorani-3B**
- **์ง€์› ์–ธ์–ด**: ํ•œ๊ตญ์–ด, ์˜์–ด, ์ผ๋ณธ์–ด
- **๋ผ์ด์„ผ์Šค**: **llama**
## Training Hyperparameters
- **per_device_train_batch_size**: 8
- **gradient_accumulation_steps**: 1
- **warmup_steps**: 5
- **learning_rate**: 2e-4
- **fp16**: `not is_bfloat16_supported()`
- **num_train_epochs**: 3
- **weight_decay**: 0.01
- **lr_scheduler_type**: "linear"
## ํ•™์Šต ๋ฐ์ดํ„ฐ
[๋ฐ์ดํ„ฐ์…‹ ๋งํฌ](https://huggingface.co/datasets/aripos1/gorani_dataset)
## ํ•™์Šต ์„ฑ๋Šฅ ๋น„๊ต
![image/png](https://cdn-uploads.huggingface.co/production/uploads/676f7b45ffba1987fabb1586/yyzKBbmmHTJtYovU2g4xM.png)
## Training Results
![image/png](https://cdn-uploads.huggingface.co/production/uploads/676f7b45ffba1987fabb1586/QO6QprIrjlzS3eh50UGfa.png)