zyq committed
Commit af9d8af · 1 Parent(s): 7522ec2

add readme

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -13,9 +13,9 @@ library_name: transformers
  <div align="center">

  <br>
- <h1> InnoMegrez2 </h1>
+ <h1> InnoMegrez2-Preview </h1>

- <a href="https://github.com/sii-research/InnoMegrez2">
+ <a href="https://github.com/sii-research/InnoMegrez2-Preview">
  <b>🔗 Github</b>
  </a> &nbsp;|&nbsp;
  <a href="https://github.com/sii-research/InnoMegrez2/blob/main/docs/tech_report.pdf">
@@ -33,7 +33,7 @@ library_name: transformers

  ## Introduction

- InnoMegrez2 is a device-native large language model. Megrez2 takes advantage of both the accuracy of the Mixture-of-Experts (MoE) architecture and the compact size of dense models. This preview model was trained on 5T tokens of data. The official release, with more training data and stronger reasoning and agent capabilities, will follow later this year.
+ InnoMegrez2-Preview is a device-native large language model. Megrez2 takes advantage of both the accuracy of the Mixture-of-Experts (MoE) architecture and the compact size of dense models. This preview model was trained on 5T tokens of data. The official release, with more training data and stronger reasoning and agent capabilities, will follow later this year.

  ## Model Card

@@ -62,7 +62,7 @@ InnoMegrez2 is a device native large language model. Megrez2 takes advantages of

  ## Performance

- We evaluated InnoMegrez2 using the open-source evaluation tool [OpenCompass](https://github.com/open-compass/opencompass) on several important benchmarks. Some of the evaluation results are shown in the table below.
+ We evaluated InnoMegrez2-Preview using the open-source evaluation tool [OpenCompass](https://github.com/open-compass/opencompass) on several important benchmarks. Some of the evaluation results are shown in the table below.

  <div align="center">
  <table>
@@ -70,7 +70,7 @@ We evaluated InnoMegrez2 using the open-source evaluation tool [OpenCompass](htt
  <tr>
  <th align="center">Benchmark</th>
  <th align="center">Metric</th>
- <th align="center"><sup>InnoMegrez2</sup></th>
+ <th align="center"><sup>InnoMegrez2-Preview</sup></th>
  <th align="center"><sup>Qwen2.5-3B</sup></th>
  <th align="center"><sup>Qwen2.5-7B</sup></th>
  <th align="center"><sup>Qwen3-4B</sup></th>
@@ -210,7 +210,7 @@ The following contains a code snippet illustrating how to use the model generate
  from transformers import AutoModelForCausalLM, AutoTokenizer
  import torch

- path = "sii-research/InnoMegrez2"
+ path = "sii-research/InnoMegrez2-Preview"
  device = "cuda"

  tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
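The hunk above shows only the top of the README's usage snippet; the final hunk's context (`print(responses)`) shows where it ends. Here is a minimal sketch, assuming the standard `transformers` chat-template API, of how such a snippet is typically completed. The prompt and generation settings are illustrative, not the README's.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

path = "sii-research/InnoMegrez2-Preview"
device = "cuda"

tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
# trust_remote_code pulls in the custom Megrez2 architecture.
model = AutoModelForCausalLM.from_pretrained(
    path, torch_dtype=torch.bfloat16, trust_remote_code=True
).to(device)

# Illustrative prompt; assumes the tokenizer ships a chat template.
messages = [{"role": "user", "content": "What is the capital of France?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
responses = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(responses)
```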
@@ -240,7 +240,7 @@ print(responses)

  ## How to Deploy

- InnoMegrez2 supports `vLLM` and `SGLang` as inference backends. For more information, please visit the [GitHub repository](https://github.com/sii-research/InnoMegrez2).
+ InnoMegrez2-Preview supports `vLLM` and `SGLang` as inference backends. For more information, please visit the [GitHub repository](https://github.com/sii-research/InnoMegrez2).

  ## Best Practice

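The hunk above only names the backends. As a minimal sketch of offline inference with vLLM's Python API, assuming vLLM can load the custom Megrez2 architecture via `trust_remote_code` (the repository's own deployment recipes should take precedence):

```python
from vllm import LLM, SamplingParams

# Assumes the custom architecture loads via trust_remote_code.
llm = LLM(model="sii-research/InnoMegrez2-Preview", trust_remote_code=True)
params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)

# Illustrative prompt and sampling settings.
outputs = llm.generate(["What is the capital of France?"], params)
for out in outputs:
    print(out.outputs[0].text)
```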
 
 