Update README.md

README.md CHANGED

```diff
@@ -132,10 +132,7 @@ As shown in the table, our SeaLLM model outperforms most 13B baselines and reach
 | Llama-2-13b-chat | 61.17 | 43.29 | 39.97 | 35.50 | 23.74
 | Polylm-13b-chat | 32.23 | 29.26 | 29.01 | 25.36 | 18.08
 | Qwen-PolyLM-7b-chat | 53.65 | 61.58 | 39.26 | 33.69 | 29.02
-| SeaLLM-13b
-| SeaLLM-13bChat/SFT/v1 | 63.53 | 45.47 | 50.25 | 39.85 | 36.07
-| SeaLLM-13bChat/SFT/v2 | 62.35 | 45.81 | 49.92 | 40.04 | 36.49
-
+| SeaLLM-13b-chat | 63.53 | 46.31 | 49.25 | 40.61 | 36.30
 
 
 ### MMLU - Preserving English-based knowledge
@@ -164,8 +161,7 @@ As shown in the table below, the 1-shot reading comprehension performance is sig
 |-----------| ------- | ------- | ------- | ------- | ------- | ------- | ------- |
 | Llama-2-13b | 83.22 | 78.02 | 71.03 | 59.31 | 30.73 | 64.46 | 59.77
 | Llama-2-13b-chat | 80.46 | 70.54 | 62.87 | 63.05 | 25.73 | 60.93 | 51.21
-| SeaLLM-13b-chat
-| SeaLLM-13b-chat-v2 | 81.51 | 76.10 | 73.64 | 69.11 | 64.54 | 72.98 | 69.10
+| SeaLLM-13b-chat | 75.23 | 75.65 | 72.86 | 64.37 | 61.37 | 69.90 | 66.20
 
 
 #### Translation
@@ -174,12 +170,12 @@ For translation tasks, we evaluate our models with the [FloRes-200](https://gith
 
 Similarly observed, our SeaLLM models outperform Llama-2 significantly in the new languages.
 
+
 | FloRes-200 (chrF++) | En-Zh | En-Vi | En-Id | En-Th | En->X | Zh-En | Vi-En | Id-En | Th-En | X->En
 |-------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
-| Llama-2-13b
-| Llama-2-13b-chat
-| SeaLLM-13b-chat
-| SeaLLM-13b-chat-v2 | 22.75 | 58.78 | 65.90 | 42.60 | 55.76 | 53.34 | 60.80 | 65.44 | 57.05 | 61.10
+| Llama-2-13b | 24.36 | 53.20 | 60.41 | 22.16 | 45.26 | 53.20 | 59.10 | 63.42 | 38.48 | 53.55
+| Llama-2-13b-chat | 19.58 | 51.70 | 57.14 | 21.18 | 37.40 | 52.27 | 54.32 | 60.55 | 30.18 | 49.33
+| SeaLLM-13b-chat | 23.12 | 53.67 | 59.00 | 60.93 | 66.16 | 65.66 | 43.33 | 57.39
 
 Our models are also performing competitively with ChatGPT for translation between SEA languages without English pivoting.
 
@@ -197,7 +193,7 @@ Lastly, in 2-shot [XL-sum summarization tasks](https://aclanthology.org/2021.fin
 |-------- | ---- | ---- | ---- | ---- | ---- |
 | Llama-2-13b | 32.57 | 34.37 | 18.61 | 25.14 | 16.91
 | Llama-2-13b-chat | 25.11 | 31.13 | 18.29 | 22.45 | 17.51
-| SeaLLM-13b-chat
+| SeaLLM-13b-chat | 26.88 | 33.39 | 19.39 | 25.96 | 21.37
 
 ## Acknowledge our linguists
 
```
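For context on the FloRes-200 rows: the scores are chrF++, an F-score over character n-grams (plus word 1- and 2-grams) with recall weighted more heavily than precision. Below is a minimal pure-Python sketch of the character n-gram part only — an illustration, not the implementation behind the table; the reported numbers come from the standard chrF++ tooling (e.g. sacrebleu), and `chrf_like` with its defaults is a hypothetical simplification:

```python
from collections import Counter

def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams with whitespace removed, as chrF does."""
    s = text.replace(" ", "")
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def chrf_like(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Average character n-gram F-beta score over n = 1..max_n, scaled to 0-100.

    Simplified sketch: no word n-grams and no corpus-level aggregation, so it
    only approximates real chrF++ scores.
    """
    f_scores = []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # strings too short for this n-gram order
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        prec = overlap / sum(hyp.values())
        rec = overlap / sum(ref.values())
        if prec + rec == 0:
            f_scores.append(0.0)
            continue
        # beta = 2 weights recall twice as much as precision, as in chrF
        f_scores.append((1 + beta**2) * prec * rec / (beta**2 * prec + rec))
    return 100.0 * sum(f_scores) / len(f_scores) if f_scores else 0.0
```

An identical hypothesis and reference score 100, fully disjoint strings score 0, and partial overlaps land in between; real chrF++ behaves the same way at these boundaries.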