Update README.md
README.md (changed):
```diff
@@ -109,13 +109,10 @@ We use GPT-4 as an evaluator to rate the comparison between our models versus ChatGPT
 
 Compared with [PolyLM-13b-chat](https://arxiv.org/pdf/2307.06018.pdf), a recent multilingual model, our model significantly outperforms across all languages and categories.
 
+
 <div class="row" style="display: flex; clear: both;">
-  <div class="column" style="float: left; width: 49%">
-    <img src="seallm_vs_polylm_by_lang.png" alt="Snow" style="width:100%">
-  </div>
-  <div class="column" style="float: left; width: 49%">
-    <img src="seallm_vs_polylm_by_cat_sea.png" alt="Forest" style="width:100%">
-  </div>
+  <img src="seallm_vs_polylm_by_lang.png" alt="Snow" style="float: left; width: 48%">
+  <img src="seallm_vs_polylm_by_cat_sea.png" alt="Forest" style="float: left; width: 48%">
 </div>
 
 Compared with Llama-2-13b-chat, our SeaLLM-13b performs significantly better in all SEA languages,
@@ -123,13 +120,10 @@ despite the fact that Llama-2 was already trained on a decent data amount of Vi,
 In english, our model is 46% as good as Llama-2-13b-chat, even though it did not undergo complex human-labor intensive RLHF.
 
 
+
 <div class="row" style="display: flex; clear: both;">
-  <div class="column" style="float: left; width: 49%">
-    <img src="seallm_vs_llama2_by_lang.png" alt="Snow" style="width:100%">
-  </div>
-  <div class="column" style="float: left; width: 49%">
-    <img src="seallm_vs_llama2_by_cat_sea.png" alt="Forest" style="width:100%">
-  </div>
+  <img src="seallm_vs_llama2_by_lang.png" alt="Snow" style="float: left; width: 48%">
+  <img src="seallm_vs_llama2_by_cat_sea.png" alt="Forest" style="float: left; width: 48%">
 </div>
 
 Compared with ChatGPT-3.5, our SeaLLM-13b model is performing 45% as good as ChatGPT for Thai.
@@ -137,12 +131,8 @@ For important aspects such as Safety and Task-Solving, our model nearly on par w
 
 
 <div class="row" style="display: flex; clear: both;">
-  <div class="column" style="float: left; width: 49%">
-    <img src="seallm_vs_chatgpt_by_lang.png" alt="Snow" style="width:100%">
-  </div>
-  <div class="column" style="float: left; width: 49%">
-    <img src="seallm_vs_chatgpt_by_cat_sea.png" alt="Forest" style="width:100%">
-  </div>
+  <img src="seallm_vs_chatgpt_by_lang.png" alt="Snow" style="float: left; width: 48%">
+  <img src="seallm_vs_chatgpt_by_cat_sea.png" alt="Forest" style="float: left; width: 48%">
 </div>
 
 ### M3Exam - World Knowledge in Regional Languages
```
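The recurring change in each hunk drops the two floated `div.column` wrappers and places the chart images directly inside the flex `div.row`. A minimal sketch of the resulting pattern, using the PolyLM comparison's filenames from the diff:

```html
<!-- Two comparison charts side by side: the flex container lays out
     its children in a row, so no per-column wrapper divs are needed. -->
<div class="row" style="display: flex; clear: both;">
  <img src="seallm_vs_polylm_by_lang.png" alt="Snow" style="float: left; width: 48%">
  <img src="seallm_vs_polylm_by_cat_sea.png" alt="Forest" style="float: left; width: 48%">
</div>
```

Note that `float` has no effect on flex items, so the `display: flex` container alone controls the side-by-side placement; the `float: left` in the new `<img>` styles is redundant but harmless.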