OpenBMB-LLM committed on
Commit 9da0a79 · verified · 1 Parent(s): 94c7e15

Add model card README

Files changed (1)
  1. README.md +11 -15
README.md CHANGED
@@ -35,7 +35,7 @@ datasets:
35
 
36
  ## Highlights
37
 
38
- We are releasing **MiniCPM5-1B**, the first model in the **MiniCPM5** series. It is a dense 1B Transformer built for on-device, local deployment, and resource-constrained scenarios, reaching 1B-class open-source SOTA on public evaluations. The model uses the standard `LlamaForCausalLM` architecture, supports native 128K context, and is released in BF16, GGUF, MLX, AWQ, and GPTQ variants.
39
 
40
  🏆 **1B-class open-source SOTA**: compared with strong open-source models in the same size class, MiniCPM5-1B reaches SOTA within this comparison set. Its advantage is most visible in agentic tool use, code generation, and difficult reasoning.
41
 
@@ -63,23 +63,19 @@ Use this directory to choose the model format that matches your runtime:
63
 
64
  ## Model Information
65
 
66
- | Item | Description |
67
- | --- | --- |
68
- | Model name | MiniCPM5-1B |
69
- | Model type | Dense decoder-only language model |
70
- | Architecture | Standard `LlamaForCausalLM` |
71
- | Parameters | ~1.0B total parameters |
72
- | Layers | 24 Transformer layers |
73
- | Attention | GQA, 16 query heads / 2 KV heads |
74
- | Context length | Native 128K context (`max_position_embeddings = 131,072`) |
75
- | RoPE | `rope_theta = 5e6`, no extra RoPE scaling required |
76
- | Chat modes | Think / No Think via `enable_thinking` |
77
- | Main scenarios | Local assistants, coding agents, tool assistants, reasoning assistants, and resource-constrained deployment |
78
- | License | Apache-2.0 |
79
 
80
  ## Introduction
81
 
82
- MiniCPM5-1B is a compact dense decoder-only Transformer trained to improve output quality at the 1B scale. It keeps the standard `LlamaForCausalLM` architecture (24 layers, GQA 8:1, native 128K context, ~1.0B total params) so it runs on mainstream inference engines (Transformers, vLLM, SGLang, llama.cpp, MLX, Ollama, LM Studio) without custom kernels.
83
 
84
  For full architecture details and per-component parameter breakdown, see the [Transformers deployment cookbook](https://github.com/OpenBMB/MiniCPM/blob/minicpm5/docs/deployment/transformers.md).
85
 
 
35
 
36
  ## Highlights
37
 
38
+ We are releasing **MiniCPM5-1B**, the first model in the **MiniCPM5** series. It is a dense 1B Transformer built for on-device, local deployment, and resource-constrained scenarios, reaching 1B-class open-source SOTA on the benchmark suite.
39
 
40
  🏆 **1B-class open-source SOTA**: compared with strong open-source models in the same size class, MiniCPM5-1B reaches SOTA within this comparison set. Its advantage is most visible in agentic tool use, code generation, and difficult reasoning.
41
 
 
63
 
64
  ## Model Information
65
 
66
+ MiniCPM5-1B has the following features:
67
+
68
+ - **Type**: Causal Language Model
69
+ - **Architecture**: Standard `LlamaForCausalLM`
70
+ - **Number of Parameters**: 1,080,632,832
71
+ - **Number of Non-Embedding Parameters**: 679,552,512
72
+ - **Number of Layers**: 24
73
+ - **Number of Attention Heads (GQA)**: 16 for Q and 2 for KV
74
+ - **Context Length**: 131,072
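
The figures in the new list can be cross-checked with a little arithmetic. The sketch below is only a sanity check on the numbers quoted above. It assumes that "embedding parameters" means the total minus the non-embedding count; the card does not say whether that difference also covers an output head, so the variable name is illustrative rather than authoritative.

```python
# Figures copied from the Model Information list above.
total_params = 1_080_632_832
non_embedding_params = 679_552_512

# Assumption: whatever is not counted as "non-embedding" is embedding-related.
embedding_params = total_params - non_embedding_params
embedding_share = embedding_params / total_params

# GQA: 16 query heads share 2 key/value heads.
q_heads, kv_heads = 16, 2
gqa_ratio = q_heads // kv_heads

# Context length expressed in units of 1024 tokens ("K").
context_length = 131_072
context_k = context_length // 1024

print(f"embedding-related parameters: {embedding_params:,}")
print(f"share of total: {embedding_share:.1%}")
print(f"GQA ratio: {gqa_ratio}:1")
print(f"context: {context_k}K tokens")
```

Under that assumption, about 401M of the 1.08B parameters sit outside the non-embedding count. The 16/2 head split matches the "GQA 8:1" wording in the Introduction, and 131,072 tokens is the "128K" context quoted there.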
 
 
 
 
75
 
76
  ## Introduction
77
 
78
+ MiniCPM5-1B is a compact dense decoder-only Transformer trained to improve output quality at the 1B scale. It keeps the standard `LlamaForCausalLM` architecture (24 layers, GQA 8:1, native 128K context, 1,080,632,832 parameters) so it runs on mainstream inference engines (Transformers, vLLM, SGLang, llama.cpp, MLX, Ollama, LM Studio) without custom kernels.
79
 
80
  For full architecture details and per-component parameter breakdown, see the [Transformers deployment cookbook](https://github.com/OpenBMB/MiniCPM/blob/minicpm5/docs/deployment/transformers.md).
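
To illustrate why the GQA layout matters for long-context, on-device use, the sketch below estimates key/value cache size at full context. The layer count, KV head count, and context length come from the card. The head dimension of 128 and the 2-byte (BF16) cache element are assumptions made for illustration, not values stated in this commit, and `kv_cache_bytes` is a hypothetical helper rather than part of any library.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Rough KV cache size for one sequence (hypothetical helper).

    The leading factor 2 accounts for one key tensor and one value
    tensor per layer. Runtime overhead and paging are ignored.
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_elem


NUM_LAYERS = 24        # from the model card
SEQ_LEN = 131_072      # from the model card
HEAD_DIM = 128         # ASSUMPTION: not stated in this commit

# 2 KV heads, as in the card's GQA configuration.
gqa_bytes = kv_cache_bytes(NUM_LAYERS, 2, HEAD_DIM, SEQ_LEN)
# Hypothetical comparison: one KV head per query head (16), i.e. no GQA.
mha_bytes = kv_cache_bytes(NUM_LAYERS, 16, HEAD_DIM, SEQ_LEN)

print(f"GQA (2 KV heads):  {gqa_bytes / 2**30:.1f} GiB")
print(f"MHA (16 KV heads): {mha_bytes / 2**30:.1f} GiB")
print(f"reduction: {mha_bytes // gqa_bytes}x")
```

The absolute sizes depend on the assumed head dimension and cache dtype, but the 8x reduction follows directly from the 16:2 head ratio and holds for any head dimension.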
81