Update README.md
README.md
CHANGED
@@ -6,20 +6,22 @@ language:

 # gemma-7b non IT version, fine-tuned for chat

-
-

 ## Training info
--
 - GPU : RTX 3090 24G x 1
 - optimizer : adamw_torch
 - lr scheduler type : cosine
--
--
 - train loss : 0.8991
 - eval loss : 0.7305

-## Usage
 ```
 from transformers import AutoTokenizer, AutoModelForCausalLM
 import transformers

 # gemma-7b non IT version, fine-tuned for chat

+A version fine-tuned on simple chat-format data.
+
+## history
+- 0.1 : 2024-04-05 Initial SFT version uploaded; DPO is under consideration

 ## Training info
+- Dataset used : maywell/koVast, modified to fit the philschmid/gemma-tokenizer-chatml format
 - GPU : RTX 3090 24G x 1
 - optimizer : adamw_torch
 - lr scheduler type : cosine
+- Training time : 140 hours
+- Epochs : 1
 - train loss : 0.8991
 - eval loss : 0.7305

+## Usage (bfloat16, requires about 17 GB of GPU memory)
 ```
 from transformers import AutoTokenizer, AutoModelForCausalLM
 import transformers
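
The figure of about 17 GB in the new usage heading is consistent with simple arithmetic on the weight size. The sketch below is illustrative only: the parameter count (roughly 8.5 billion for gemma-7b, embeddings included) is an assumption that this README does not state, and the helper name `weight_memory_gb` is hypothetical. It counts the weights alone; activations and the KV cache need memory on top of that.

```python
# Rough estimate of the memory needed just to hold model weights.
# ASSUMPTION: ~8.5e9 parameters for gemma-7b (approximate, not from this README).
ASSUMED_PARAM_COUNT = 8.5e9
BYTES_PER_BF16 = 2  # bfloat16 stores each value in 16 bits


def weight_memory_gb(param_count: float, bytes_per_param: int = BYTES_PER_BF16) -> float:
    """Return weight memory in decimal gigabytes (1 GB = 1e9 bytes)."""
    return param_count * bytes_per_param / 1e9


# Weights only; runtime overhead is not included.
print(f"{weight_memory_gb(ASSUMED_PARAM_COUNT):.1f} GB")  # prints 17.0 GB
```

Under the same assumption, float32 weights (4 bytes per parameter) would need about twice as much, which would not fit on the single 24 GB RTX 3090 listed under training info.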