Add link to the paper, Github repository and project page
#8 opened by nielsr (HF Staff)

README.md CHANGED
@@ -16,6 +16,8 @@ tags:
 
 # MiniCPM-2B-128k
 
+This repository contains the model based on CPM-2B, trained as described in the paper [MiniCPM4: Ultra-Efficient LLMs on End Devices](https://huggingface.co/papers/2506.07900).
+
 <!-- Provide a quick summary of what the model is/does. -->
 
 [OpenBMB Technical Blog Series](https://openbmb.vercel.app/)
@@ -26,7 +28,10 @@ To our best knowledge, MiniCPM-2B-128k is the first long context(>=128k) SLM smaller than 3B
 In comparison with the previous released [MiniCPM-2B](https://huggingface.co/openbmb/MiniCPM-2B-sft-bf16), the improvements include:
 
 - Supports 128k context, achieving the best score under 7B on the comprehensive long-text evaluation InfiniteBench, but performance drops within 4k context
-- To facilitate community developers, the model has updated the <user>{}<AI> directive template to chatml format (<|im_start|>user{}<|im_end|><|im_start|>assistant) during alignment, which also aids users in deploying and using the vllm openai compatible server mode.
+- To facilitate community developers, the model has updated the <user>{}<AI> directive template to chatml format (<|im_start|>user
+{}<|im_end|>
+<|im_start|>assistant
+) during alignment, which also aids users in deploying and using the vllm openai compatible server mode.
 - Due to the parallel mechanism requirement, removed tie_embedding and expanded the vocabulary to 127660.
 
 For more details, please refer to the [GitHub repo](https://github.com/OpenBMB/MiniCPM) and [Blog](https://openbmb.vercel.app/minicpm-2b-128k-en).
@@ -36,7 +41,10 @@ MiniCPM 是面壁与清华大学自然语言处理实验室共同开源的系列
 MiniCPM-2B-128k 是一次基于 [MiniCPM-2B](https://huggingface.co/openbmb/MiniCPM-2B-sft-bf16) 的长度扩展尝试,也是第一个 3B 以下的长文本模型。相对于之前发布的版本,改进如下:
 
 - 支持 128k 上下文,在综合长文本评测 [InfiniteBench](https://github.com/OpenBMB/InfiniteBench) 上取得 7B 以下最佳成绩,但在 4k 以内性能有下降
-- 为方便社区开发者使用,该模型在对齐时将 <用户>{}<AI> 指令模板更新为了 chatml 格式(<|im_start|>user{}<|im_end|><|im_start|>assistant),这也有助于用户使用 vllm openai compatible server 模式部署和使用。
+- 为方便社区开发者使用,该模型在对齐时将 <用户>{}<AI> 指令模板更新为了 chatml 格式(<|im_start|>user
+{}<|im_end|>
+<|im_start|>assistant
+),这也有助于用户使用 vllm openai compatible server 模式部署和使用。
 - 由于并行机制需要,去除了 tie_embedding,并扩展词表到 127660。
 
 更多细节请参考 [GitHub repo](https://github.com/OpenBMB/MiniCPM) 和 [Blog](https://openbmb.vercel.app/minicpm-2b-128k)
@@ -90,4 +98,4 @@ model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16, d
 
 responds, history = model.chat(tokenizer, "山东省最高的山是哪座山, 它比黄山高还是矮?差距多少?", temperature=0.8, top_p=0.8)
 print(responds)
-```
+```
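The chatml template discussed in the diff can be sketched as a plain prompt builder. This is a minimal sketch, assuming the special tokens shown above (`<|im_start|>`, `<|im_end|>`) are inserted verbatim; in practice `model.chat` (or the tokenizer's chat template) applies this formatting for you, and `build_chatml_prompt` is a hypothetical helper name used only for illustration:

```python
def build_chatml_prompt(user_message: str) -> str:
    """Wrap a single user turn in the chatml format the model was aligned with.

    Assumption: tokens mirror the template shown in the diff, i.e.
    <|im_start|>user\n{message}<|im_end|>\n<|im_start|>assistant\n
    """
    return (
        "<|im_start|>user\n"
        f"{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


# The generated text that follows this prompt is the assistant's reply.
prompt = build_chatml_prompt("山东省最高的山是哪座山?")
print(prompt)
```

Using a standard template like chatml is what makes the model drop into OpenAI-compatible serving stacks such as vLLM without a custom prompt adapter.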
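On the tie_embedding removal: untying means the input embedding matrix and the output lm_head no longer share weights, so the model carries one extra vocab-by-hidden matrix. A back-of-envelope sketch, assuming a hidden size of 2304 (as in MiniCPM-2B; not stated in this diff) and the vocabulary size of 127660 from the model card:

```python
# Rough parameter cost of untying embeddings.
# Assumption: hidden_size = 2304 (MiniCPM-2B); vocab_size from the card.
vocab_size = 127660
hidden_size = 2304

tied_params = vocab_size * hidden_size        # one shared matrix
untied_params = 2 * vocab_size * hidden_size  # separate embedding + lm_head
extra = untied_params - tied_params

print(f"{extra / 1e6:.0f}M extra parameters")  # → 294M extra parameters
```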