aripos1
/

gorani-3B

Text Generation

text-generation-inference

4-bit precision

Model card Files Files and versions

gorani-3B / README.md

aripos1's picture

Update README.md

b7d1204 verified 12 months ago

|

history blame contribute delete

1.5 kB

	---
	license: apache-2.0
	datasets:
	- aripos1/gorani_dataset
	language:
	- ko
	- en
	- ja
	base_model:
	- unsloth/Llama-3.2-3B-Instruct-bnb-4bit
	pipeline_tag: text-generation
	library_name: transformers
	---
	# Gorani Model Card

	## 소개 (Introduce)
	이 모델은 번역을 위한 모델입니다. 한국 고유어의 정확한 번역을 생성하기 위해 한국어, 영어, 일본어의 언어 데이터를 혼합하여 unsloth/Llama-3.2-3B-Instruct-bnb-4bit을 학습시켜 생성된 gorani-1B 입니다.
	gorani는 현재 한국어, 영어, 일본어만 번역을 지원합니다.

	### 모델 정보
	- 개발자: airpos1
	- 모델 유형: llama를 기반으로 하는 3B 매개변수 모델인 gorani-3B
	- 지원 언어: 한국어, 영어, 일본어
	- 라이센스: llama

	## Training Hyperparameters
	- per_device_train_batch_size: 8
	- gradient_accumulation_steps: 1
	- warmup_steps: 5
	- learning_rate: 2e-4
	- fp16: `not is_bfloat16_supported()`
	- num_train_epochs: 3
	- weight_decay: 0.01
	- lr_scheduler_type: "linear"

	## 학습 데이터
	[데이터셋 링크](https://huggingface.co/datasets/aripos1/gorani_dataset)

	## 학습 성능 비교
	![image/png](https://cdn-uploads.huggingface.co/production/uploads/676f7b45ffba1987fabb1586/yyzKBbmmHTJtYovU2g4xM.png)

	## Training Results
	![image/png](https://cdn-uploads.huggingface.co/production/uploads/676f7b45ffba1987fabb1586/QO6QprIrjlzS3eh50UGfa.png)