HaoxingChen committed
Commit 666feaf · verified · 1 Parent(s): 623eea7

Update README.md

Files changed (1)
  1. README.md +8 -6
README.md CHANGED
@@ -14,12 +14,14 @@ We introduce **GroveMoE**, a new sparse architecture using **adjugate experts**
  - **Sparse Activation**: 33 B params total, only **3.14–3.28 B** active per token.
  - **Training**: Mid-training + SFT, up-cycled from Qwen3-30B-A3B-Base; preserves prior knowledge while adding new capabilities.
 
- ## Model Lists
- | GroveMoE Series | Download
- |---|---
- GroveMoE-Base | 🤗 [HuggingFace](https://huggingface.co/inclusionAI/GroveMoE-Base)
- GroveMoE-Inst | 🤗 [HuggingFace](https://huggingface.co/inclusionAI/GroveMoE-Inst)
+ ## Model Downloads
+ <div align="center">
+ | **Model** | **#Total Params** | **#Activated Params** | **Download** |
+ | :----------------: | :---------------: | :-------------------: | :----------: |
+ | GroveMoE-Base | 33B | 3.14~3.28B | [🤗 HuggingFace](https://huggingface.co/inclusionAI/GroveMoE-Base) |
+ | GroveMoE-Inst | 33B | 3.14~3.28B | [🤗 HuggingFace](https://huggingface.co/inclusionAI/GroveMoE-Inst) |
+ </div>
 
  ## Performance
 
@@ -32,7 +34,7 @@ GroveMoE-Inst | 🤗 [HuggingFace](https://huggingface.co/inclusionAI/GroveMoE-
  |Mistral-Small-3.2| 24B | 68.1 | 37.5 | 59.9 | 61.9 | 33.4 | 28.1 | 69.5 | 32.2 |
  |GroveMoE-Inst|3.14~3.28B | <font color=#FBD98D>**72.8**</font> | <font color=#FBD98D>**47.7**</font> | <font color=#FBD98D>**61.3**</font> | <font color=#FBD98D>**71.2**</font> | <font color=#FBD98D>**43.5**</font> | <font color=#FBD98D>**44.4**</font> | <font color=#FBD98D>**74.5**</font> | <font color=#FBD98D>**34.6**</font> |
 
- We bold the top-1 scores separately for all models.
+ We bold the top-1 scores separately for all models. More details will be reported in our [technical report](https://arxiv.org/abs/2508.07785).
 
  ## Usage
  Below are some code snippets showing how to quickly get started with the model. First, install the Transformers library.
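For reference, here is a minimal sketch of the kind of snippet that Usage section describes, using the standard 🤗 Transformers chat-template API. The model id is taken from the table above; `trust_remote_code=True` is an assumption on our part, since GroveMoE is a custom architecture that may not ship in mainline Transformers.

```python
# Minimal sketch, not the README's verbatim example.
# Install first: pip install transformers torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "inclusionAI/GroveMoE-Inst"  # instruct model from the table above

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",      # use the checkpoint's native precision where available
    device_map="auto",       # spread the 33 B total params across available devices
    trust_remote_code=True,  # assumption: custom GroveMoE modeling code
)

# Build a chat prompt via the tokenizer's chat template, then generate.
messages = [{"role": "user", "content": "Briefly explain mixture-of-experts models."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True))
```

Note that `device_map="auto"` matters here: the checkpoint holds all 33 B parameters even though only ~3 B are active per token, so the full model must still fit in memory.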