Update README.md
Batch size was maximized on 4× A6000 GPUs using DeepSpeed off-load.

- Warmup min LR 1e-6
- ZeRO Stage 3 off-load
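The settings above (ZeRO Stage 3 off-load, warmup min LR 1e-6) could be expressed in a DeepSpeed config along these lines. This is a sketch, not the project's actual config: every value not stated in the README (max LR, warmup steps, precision, batch size) is purely illustrative.

```python
# Hypothetical DeepSpeed config matching the README's stated settings.
# Only warmup_min_lr=1e-6 and ZeRO Stage 3 off-load come from the README;
# all other values are illustrative placeholders.
ds_config = {
    "train_micro_batch_size_per_gpu": "auto",
    "zero_optimization": {
        "stage": 3,                            # ZeRO Stage 3
        "offload_optimizer": {"device": "cpu"},  # off-load optimizer state to CPU
        "offload_param": {"device": "cpu"},      # off-load parameters to CPU
    },
    "scheduler": {
        "type": "WarmupLR",
        "params": {
            "warmup_min_lr": 1e-6,    # stated in the README
            "warmup_max_lr": 2e-5,    # illustrative
            "warmup_num_steps": 100,  # illustrative
        },
    },
    "bf16": {"enabled": True},  # illustrative
}
```

Off-loading optimizer state and parameters to CPU frees GPU memory, which is what allows the larger batch size at the cost of extra host–device traffic.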
**Perplexity** PPL was evaluated on Korean statute (법령) data.

- KoSaul-8B - 2.649
- Open-Llama3-8B (beomi/Llama-3-Open-Ko-8B) - 3.529
- Open-Llama2-7B (beomi/llama-2-ko-7b) - 3.393
- Solar-10.7B (chihoonlee10/T3Q-ko-solar-dpo-v1.0) - 3.161
- EEVE-10.8B (yanolja/EEVE-Korean-Instruct-10.8B-v1.0) - 3.505
- KULLM3 (nlpai-lab/KULLM3) - 2.903
- MLP-KTLim (MLP-KTLim/Bllossom) - 4.385
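For reference, the perplexity reported above is the exponential of the mean per-token negative log-likelihood. A minimal sketch of the metric itself (the log-probabilities below are made up for illustration, not taken from any of the models listed):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood per token)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Hypothetical per-token natural-log probabilities for a short sequence.
logps = [-0.9, -1.2, -0.7, -1.1]
print(round(perplexity(logps), 3))  # → 2.651
```

Lower is better: a PPL of 2.649 means the model is, on average, about as uncertain as a uniform choice among ~2.6 tokens at each step on this data.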
**Model Architecture** Llama 3 is an auto-regressive language model.