Update README.md
Batch size was maximized on 4× A6000 GPUs using DeepSpeed off-load.

- Warmup min LR 1e-6
- ZeRO Stage 3 off-load
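A minimal DeepSpeed config sketch matching the settings above (ZeRO Stage 3 with CPU off-load, WarmupLR scheduler with a 1e-6 minimum LR). The exact config used for training is not in this README, so the `"auto"` values and off-load targets here are assumptions:

```json
{
  "train_micro_batch_size_per_gpu": "auto",
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu" },
    "offload_param": { "device": "cpu" }
  },
  "scheduler": {
    "type": "WarmupLR",
    "params": {
      "warmup_min_lr": 1e-6,
      "warmup_max_lr": "auto",
      "warmup_num_steps": "auto"
    }
  }
}
```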
**Perplexity**

| Model | Repository | Perplexity |
|---|---|---|
| Solar-10.7B | chihoonlee10/T3Q-ko-solar-dpo-v1.0 | 3.161 |
| EEVE-10.8B | yanolja/EEVE-Korean-Instruct-10.8B-v1.0 | 3.505 |
| KULLM3 | nlpai-lab/KULLM3 | 2.903 |
| MLP-KTLim | MLP-KTLim/Bllossom | 4.385 |
| Open-Llama2-7B | beomi/llama-2-ko-7b | 3.393 |
| Open-Llama3-8B | beomi/Llama-3-Open-Ko-8B | 3.529 |
| KoSaul-8B | | 2.649 |
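The scores above are perplexities, i.e. the exponential of the mean per-token negative log-likelihood. As a reminder of the metric only, here is a minimal pure-Python sketch (the token NLLs are hypothetical, not from these models):

```python
import math

def perplexity(nlls):
    """Perplexity = exp(mean per-token negative log-likelihood)."""
    return math.exp(sum(nlls) / len(nlls))

# Hypothetical per-token NLLs; a mean NLL of ~1.17 gives PPL ~3.21,
# roughly the range reported above.
print(perplexity([1.0, 1.2, 1.3]))
```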
**Model Architecture** Llama 3 is an auto-regressive language model.

**Model Release Date** 2024.05.08.