gorani-3B / README.md
aripos1's picture
Update README.md
b7d1204 verified
metadata
license: apache-2.0
datasets:
  - aripos1/gorani_dataset
language:
  - ko
  - en
  - ja
base_model:
  - unsloth/Llama-3.2-3B-Instruct-bnb-4bit
pipeline_tag: text-generation
library_name: transformers

Gorani Model Card

์†Œ๊ฐœ (Introduce)

์ด ๋ชจ๋ธ์€ ๋ฒˆ์—ญ์„ ์œ„ํ•œ ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค. ํ•œ๊ตญ ๊ณ ์œ ์–ด์˜ ์ •ํ™•ํ•œ ๋ฒˆ์—ญ์„ ์ƒ์„ฑํ•˜๊ธฐ ์œ„ํ•ด ํ•œ๊ตญ์–ด, ์˜์–ด, ์ผ๋ณธ์–ด์˜ ์–ธ์–ด ๋ฐ์ดํ„ฐ๋ฅผ ํ˜ผํ•ฉํ•˜์—ฌ unsloth/Llama-3.2-3B-Instruct-bnb-4bit์„ ํ•™์Šต์‹œ์ผœ ์ƒ์„ฑ๋œ gorani-1B ์ž…๋‹ˆ๋‹ค.
gorani๋Š” ํ˜„์žฌ ํ•œ๊ตญ์–ด, ์˜์–ด, ์ผ๋ณธ์–ด๋งŒ ๋ฒˆ์—ญ์„ ์ง€์›ํ•ฉ๋‹ˆ๋‹ค.

๋ชจ๋ธ ์ •๋ณด

  • ๊ฐœ๋ฐœ์ž: airpos1
  • ๋ชจ๋ธ ์œ ํ˜•: llama๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•˜๋Š” 3B ๋งค๊ฐœ๋ณ€์ˆ˜ ๋ชจ๋ธ์ธ gorani-3B
  • ์ง€์› ์–ธ์–ด: ํ•œ๊ตญ์–ด, ์˜์–ด, ์ผ๋ณธ์–ด
  • ๋ผ์ด์„ผ์Šค: llama

Training Hyperparameters

  • per_device_train_batch_size: 8
  • gradient_accumulation_steps: 1
  • warmup_steps: 5
  • learning_rate: 2e-4
  • fp16: not is_bfloat16_supported()
  • num_train_epochs: 3
  • weight_decay: 0.01
  • lr_scheduler_type: "linear"

ํ•™์Šต ๋ฐ์ดํ„ฐ

๋ฐ์ดํ„ฐ์…‹ ๋งํฌ

ํ•™์Šต ์„ฑ๋Šฅ ๋น„๊ต

image/png

Training Results

image/png