Update README.md
<div style="display: inline-block;">

<a rel="noopener nofollow" href="https://www.modelscope.cn/organization/sustc/">
<img src="https://img.shields.io/badge/🤖ModelScope-sustc-blue" style="margin: 0 0;">
</a>

</div>
# News

- 2023-12-06: Try the [SUS-Chat-34B chat-ui](https://huggingface.co/spaces/SUSTech/SUS-Chat-34B).

- 2023-12-05: SUS-Chat-34B is now available on [ModelScope🤖](https://www.modelscope.cn/models/SUSTC/SUS-Chat-34B/summary).

- 2023-12-05: SUS-Chat-34B is ranked 2nd on the [Open LLM leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard), surpassing all models under 70B.

- 2023-12-01: SUS-Chat-34B is now available on [HuggingFace🤗](https://huggingface.co/SUSTech/SUS-Chat-34B).
# Introduction

<img src="https://hackmd.io/_uploads/HJlDtzhBa.png" id="fig-sus"
alt="Figure 1: DALL·E 2023-12-01 11.03.28 - An imposing, majestic wild boar combined with elements of a futuristic transformer robot. The boar itself should be intricately blended with these tra" />

**SUS-Chat-34B** is a 34B bilingual Chinese-English dialogue model, jointly released by the **[Southern University of Science and Technology](https://huggingface.co/SUSTech)** and **[IDEA-CCNL](https://huggingface.co/IDEA-CCNL)**. It is based on [`01-ai/Yi-34B`](https://huggingface.co/01-ai/Yi-34B) and has been fine-tuned on millions of high-quality, multilingual instruction examples. While maintaining the strong language capabilities of the base model, SUS-Chat-34B improves its responses to human instructions through high-quality instruction fine-tuning and excels at imitating human thought processes through chains of thought. It introduces inter-instruction attention sharing for long texts, expanding the context window from 4K to 8K tokens and significantly enhancing the usability of multi-turn dialogues.
It has surpassed all models of the same size in almost all benchmark tests and is better suited to meeting the practical needs of complex multilingual tasks. Compared to larger models, SUS-Chat-34B remains highly competitive and has achieved state-of-the-art performance in our comprehensive evaluations.

SUS-Chat-34B has the following highlights:

1. Large-scale complex instruction-following data: trained on 1.4 billion tokens of high-quality, complex instruction data covering Chinese and English, multi-turn dialogues, mathematics, reasoning, and various other types of instructions.
2. Strong performance on general tasks: SUS-Chat-34B excels at numerous mainstream Chinese and English tasks, surpassing other open-source instruction fine-tuned models of the same parameter scale and remaining competitive with models of larger parameter scales.
3. Longer context window and excellent multi-turn dialogue capabilities: SUS-Chat-34B currently supports an 8K context window and is trained on a large amount of multi-turn instruction data mixed with single-turn instructions, demonstrating a remarkable ability to stay focused on dialogue information and follow up on instructions across long conversations.

SUS-Chat powerfully demonstrates that through the right instruction fine-tuning, academic institutions can achieve better performance without increasing model parameters, using open-source datasets and
replication and comparison by other researchers.

In TLEM, we utilized various benchmark tests including MMLU, CMMLU, C-Eval, BBH, GSM-8K, and MATH to measure the model's knowledge and reasoning capabilities. On these metrics, the SUS-Chat-34B model achieved state-of-the-art performance. Additionally, we incorporated [lm-eval](https://github.com/EleutherAI/lm-evaluation-harness) to test SUS-Chat and similar models on winogrande, hellaswag, arc, and truthful-qa, assessing the model's common-sense reasoning ability and susceptibility to hallucinations.
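As a concrete illustration, a harness run for one of these tasks might look like the sketch below. This is a hypothetical invocation, not taken from this README: flag spellings and task names vary across lm-evaluation-harness versions, so consult `lm_eval --help` before running.

``` shell
# Hypothetical lm-evaluation-harness invocation (assumed flags/task names);
# requires GPU hardware capable of hosting a 34B model.
lm_eval --model hf \
  --model_args pretrained=SUSTech/SUS-Chat-34B \
  --tasks winogrande \
  --num_fewshot 5
```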
Overall, the SUS-Chat-34B model significantly outperformed models of a similar scale and achieved state-of-the-art overall performance.
<img
src="https://github.com/SUSTech-IDEA/SUS-Chat/raw/main/assets/radar.png"
id="fig-bench" alt="Figure 2: Benchmark" />
<div>

<table>
<colgroup>
<col style="width: 50%" />
<col style="width: 50%" />
</colgroup>
<tbody>
<tr class="odd">
<td style="text-align: center;"><div width="50.0%" data-layout-align="center">
<h2 id="english-understanding">English Understanding</h2>
<table>
<thead>
<tr class="header">
<th style="text-align: right;">Model</th>
<th style="text-align: center;">mmlu (0-shot)</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td style="text-align: right;">GPT-4</td>
<td style="text-align: center;">83</td>
</tr>
<tr class="even">
<td style="text-align: right;">SUS-Chat-34B</td>
<td style="text-align: center;"><span class="math inline">$\underline{74.35}$</span></td>
</tr>
<tr class="odd">
<td style="text-align: right;">Qwen-72B-Chat</td>
<td style="text-align: center;"><strong>74.52</strong></td>
</tr>
<tr class="even">
<td style="text-align: right;">DeepSeek-67B-Chat</td>
<td style="text-align: center;">69.43</td>
</tr>
<tr class="odd">
<td style="text-align: right;">OrionStar-Yi-34B-Chat</td>
<td style="text-align: center;">68.51</td>
</tr>
<tr class="even">
<td style="text-align: right;">Yi-34B-Chat</td>
<td style="text-align: center;">66.96</td>
</tr>
</tbody>
</table>
</div></td>
<td style="text-align: center;"><div width="50.0%" data-layout-align="center">
<h2 id="chinese-capabilities">Chinese Capabilities</h2>
<table>
<colgroup>
<col style="width: 34%" />
<col style="width: 32%" />
<col style="width: 32%" />
</colgroup>
<thead>
<tr class="header">
<th style="text-align: right;">Model</th>
<th style="text-align: center;">cmmlu (0-shot)</th>
<th style="text-align: center;">C-Eval (0-shot)<a href="#fn1" class="footnote-ref" id="fnref1" role="doc-noteref"><sup>1</sup></a></th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td style="text-align: right;">GPT-4</td>
<td style="text-align: center;">71</td>
<td style="text-align: center;">69.9</td>
</tr>
<tr class="even">
<td style="text-align: right;">SUS-Chat-34B</td>
<td style="text-align: center;"><strong>78.68</strong></td>
<td style="text-align: center;"><strong>82.42</strong></td>
</tr>
<tr class="odd">
<td style="text-align: right;">Qwen-72B-Chat</td>
<td style="text-align: center;"><span class="math inline">$\underline{77.02}$</span></td>
<td style="text-align: center;"><span class="math inline">$\underline{77.22}$</span></td>
</tr>
<tr class="even">
<td style="text-align: right;">DeepSeek-67B-Chat</td>
<td style="text-align: center;">48.51</td>
<td style="text-align: center;">59.7</td>
</tr>
<tr class="odd">
<td style="text-align: right;">OrionStar-Yi-34B-Chat</td>
<td style="text-align: center;">66.88</td>
<td style="text-align: center;">65.13</td>
</tr>
<tr class="even">
<td style="text-align: right;">Yi-34B-Chat</td>
<td style="text-align: center;">55.16</td>
<td style="text-align: center;">77.16</td>
</tr>
</tbody>
</table>
</div></td>
</tr>
</tbody>
</table>
<section id="footnotes" class="footnotes footnotes-end-of-document" role="doc-endnotes">
<hr />
<ol>
<li id="fn1"><p>C-Eval results are evaluated on the validation datasets.<a href="#fnref1" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
</ol>
</section>

</div>

## Math & Reasoning

| Model                 | gsm8k (0-shot)      | MATH (0-shot)       | BBH (0-shot)        |
|----------------------:|:-------------------:|:-------------------:|:-------------------:|
| GPT-4                 | 91.4                | 45.8                | 86.7                |
| SUS-Chat-34B          | **80.06**           | 28.7                | 67.62               |
| Qwen-72B-Chat         | $\underline{76.57}$ | **35.9**            | **72.63**           |
| DeepSeek-67B-Chat     | 74.45               | $\underline{29.56}$ | $\underline{69.73}$ |
| OrionStar-Yi-34B-Chat | 54.36               | 12.8                | 62.88               |
| Yi-34B-Chat           | 63.76               | 10.02               | 61.54               |

## More Tasks

| Model                 | winogrande (5-shot) | arc (25-shot)       | hellaswag (10-shot) | TruthfulQA mc1 (0-shot) | TruthfulQA mc2 (0-shot) |
|----------------------:|:-------------------:|:-------------------:|:-------------------:|:-----------------------:|:-----------------------:|
| GPT-4                 | —                   | 94.5                | 91.4                | 59.00                   | —                       |
| SUS-Chat-34B          | **81.22**           | $\underline{81.54}$ | 83.79               | **40.64**               | **57.47**               |
| Qwen-72B-Chat         | 76.09               | **82.10**           | $\underline{86.06}$ | 39.17                   | $\underline{56.37}$     |
| DeepSeek-67B-Chat     | $\underline{80.58}$ | 81.29               | **87.02**           | $\underline{40.02}$     | 50.64                   |
| OrionStar-Yi-34B-Chat | 77.27               | 80.19               | 84.54               | 36.47                   | 53.24                   |
| Yi-34B-Chat           | 76.64               | 70.66               | 82.29               | 38.19                   | 54.57                   |

## Overall

| Model                 | Average   |
|----------------------:|:---------:|
| SUS-Chat-34B          | **69.05** |
| Qwen-72B-Chat         | 68.41     |
| DeepSeek-67B-Chat     | 62.91     |
| OrionStar-Yi-34B-Chat | 60.21     |
| Yi-34B-Chat           | 59.72     |

To reproduce the results, please start a corresponding vllm server and refer to [here](https://sustech-tlem.static.hf.space/index.html#start-evaluating-your-model-in-3-line).
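For context, launching an OpenAI-compatible vllm server for this model typically looks like the following sketch. The exact flags are an assumption about the deployment setup (not specified in this README); adjust them to your hardware.

``` shell
# Hypothetical vllm server launch; a 34B model needs multiple GPUs or
# a single GPU with sufficient memory.
python -m vllm.entrypoints.openai.api_server \
  --model SUSTech/SUS-Chat-34B \
  --tensor-parallel-size 4
```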
# Usage

SUS-Chat-34B is a standard LLaMA model and should be seamlessly compatible with the LLaMA ecosystem. We provide the following example to demonstrate how it can be used for multi-turn dialogues.

Feel free to [open an issue](https://github.com/SUSTech-IDEA/SUS-Chat/issues) if you have any questions.

``` python
from transformers import AutoModelForCausalLM, AutoTokenizer  # 🤗 Transformers, or
# from modelscope import AutoModelForCausalLM, AutoTokenizer  # 🤖 ModelScope


def chat_template(messages):
    history = ""
    ...
```
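The body of `chat_template` is cut off by this diff. A minimal, self-contained sketch of a prompt builder in the same spirit is shown below; the `### Human:` / `### Assistant:` markers are an assumption made for illustration and should be verified against the full model card.

``` python
def chat_template(messages):
    """Flatten a list of {"role", "content"} messages into one prompt string.

    NOTE: the "### Human:" / "### Assistant:" markers below are assumed for
    illustration; check the full README for the exact template.
    """
    history = ""
    for message in messages:
        if message["role"] == "user":
            # Each user turn opens a new Human block and cues the Assistant.
            history += f"### Human: {message['content']}\n\n### Assistant: "
        elif message["role"] == "assistant":
            # Assistant turns are appended verbatim after their cue.
            history += message["content"]
    return history


# Build a prompt for a single-turn request.
prompt = chat_template([{"role": "user", "content": "What is 1 + 1?"}])
print(prompt)
```

With a prompt built this way, generation proceeds through the usual `tokenizer` / `model.generate` calls from the imports above.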
This model is developed entirely for academic research and free commercial use, but it must adhere to the [license](https://github.com/01-ai/Yi/blob/main/MODEL_LICENSE_AGREEMENT.txt) from [01-ai](https://huggingface.co/01-ai).