Mxode/NanoTranslator-XS


Mxode committed on Sep 13, 2024

Commit a544a5e · verified · 1 Parent(s): ccfb337

Update README_zh-CN.md

Files changed (1)

README_zh-CN.md +16 -13


@@ -1,26 +1,29 @@
-# **NanoTranslator-S**
 [English](README.md) | 简体中文
 ## Introduction
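-This is the Small model of NanoTranslator. It currently supports **English-to-Chinese** translation only. The repository also provides an ONNX version of the model.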
-| Size | Params. |  V.  |  H.  |  I.  |  L.  | Att. H. | KV H. | Tie Emb. |
-| :--: | :-----: | :--: | :--: | :--: | :--: | :-----: | :---: | :------: |
-|  XL  |  50 M   | 8000 | 320  | 1792 |  24  |   16    |   4   |   True   |
-|  L   |  22 M   | 8000 | 256  | 1408 |  16  |   16    |   4   |   True   |
-|  M   |  9 M  | 4000 | 168 | 896 |  16  |   12    |   4   |   True   |
-|  S   | 2 M  | 2000 |  96  | 512  |  12  |   12    |   4   |   True   |
 - **V.** - vocab size
 - **H.** - hidden size
 - **I.** - intermediate size
 - **L.** - num layers
-- **Att. H.** - num attention heads
-- **KV H.** - num kv heads
-- **Tie Emb.** - tie word embeddings
@@ -38,7 +41,7 @@ The prompt format is as follows:
 import torch
 from transformers import AutoTokenizer, AutoModelForCausalLM
-model_path = 'Mxode/NanoTranslator-S'
 tokenizer = AutoTokenizer.from_pretrained(model_path)
 model = AutoModelForCausalLM.from_pretrained(model_path)
@@ -75,7 +78,7 @@ print(response)
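 In practical tests, inference with the ONNX model is **2 to 10 times faster** than inference directly with transformers.
-To use the ONNX model, you need to manually switch to the [onnx branch](https://huggingface.co/Mxode/NanoTranslator-S/tree/onnx) and load the model locally.
 Reference documentation: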

+# **NanoTranslator-XS**
 [English](README.md) | 简体中文
 ## Introduction
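+This is the **X-Small** model of NanoTranslator. It currently supports **English-to-Chinese** translation only. The repository also provides an ONNX version of the model.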
+| Size | P. | Arch. | Act. |  V.  |  H.  |  I.  |  L.  | A.H. | K.H. | Tie |
+| :--: | :-----: | :--: | :--: | :--: | :-----: | :---: | :------: | ---- | ---- | :--: |
+|  XL  |  100  |  LLaMA  |  SwiGLU  | 16000 | 768  | 4096 |  8   |  24  |  8   | True |
+|  L   |  78  | LLaMA | GeGLU  | 16000 | 768  | 4096 |  6   |  24  |  8   | True |
+| M2 | 22 | Qwen2 | GeGLU | 4000  | 432  | 2304 |  6   |  24  |  8   | True |
+|  M   |  22  |  LLaMA  |  SwiGLU  | 8000  | 256  | 1408 |  16  |  16  |  4   | True |
+|  S   | 9 | LLaMA | SwiGLU | 4000  | 168  | 896  |  16  |  12  |  4   | True |
+| XS | 2 | LLaMA | SwiGLU | 2000 | 96 | 512 | 12 | 12 | 4 | True |
+- **P.** - parameters (in millions)
 - **V.** - vocab size
 - **H.** - hidden size
 - **I.** - intermediate size
 - **L.** - num layers
+- **A.H.** - num attention heads
+- **K.H.** - num kv heads
+- **Tie** - tie word embeddings
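
As a rough sanity check on the **P.** column, the parameter count of a row can be estimated from its other columns. The sketch below is not taken from the repository. It assumes a standard LLaMA-style decoder: no bias terms, grouped-query attention, a three-matrix gated MLP (SwiGLU or GeGLU), two norm weight vectors per layer plus a final one, and tied input/output embeddings. The function name `estimate_params` is hypothetical.

```python
def estimate_params(vocab, hidden, inter, layers, heads, kv_heads, tie=True):
    """Approximate parameter count for a LLaMA-style decoder (assumed layout)."""
    head_dim = hidden // heads
    kv_dim = kv_heads * head_dim  # width of the K and V projections under GQA

    # Q and O projections are hidden x hidden; K and V are hidden x kv_dim.
    attn = 2 * hidden * hidden + 2 * hidden * kv_dim
    # Gated MLP: gate, up, and down projections.
    mlp = 3 * hidden * inter
    # Two norm weight vectors per layer (pre-attention and pre-MLP).
    norms = 2 * hidden

    embed = vocab * hidden
    lm_head = 0 if tie else vocab * hidden  # tied embeddings reuse `embed`
    final_norm = hidden
    return embed + lm_head + layers * (attn + mlp + norms) + final_norm


# XS row: V=2000, H=96, I=512, L=12, A.H.=12, K.H.=4
xs = estimate_params(2000, 96, 512, 12, 12, 4)
print(f"{xs:,} parameters (~{xs / 1e6:.2f} M)")  # 2,258,784 parameters (~2.26 M)
```

Under these assumptions the XS, S, and M rows come out near 2 M, 9 M, and 22 M, in line with the table. The M2 row uses the Qwen2 architecture, which adds bias terms to the Q, K, and V projections, so this estimate would run slightly low there.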
 import torch
 from transformers import AutoTokenizer, AutoModelForCausalLM
+model_path = 'Mxode/NanoTranslator-XS'
 tokenizer = AutoTokenizer.from_pretrained(model_path)
 model = AutoModelForCausalLM.from_pretrained(model_path)
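 In practical tests, inference with the ONNX model is **2 to 10 times faster** than inference directly with transformers.
+To use the ONNX model, you need to manually switch to the [onnx branch](https://huggingface.co/Mxode/NanoTranslator-XS/tree/onnx) and load the model locally.
 Reference documentation: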