Add pipeline_tag and link to technical report
#1 opened by nielsr (HF Staff)

README.md CHANGED
```diff
@@ -1,8 +1,9 @@
 ---
-license: mit
 base_model:
 - XiaomiMiMo/MiMo-V2-Flash-Base
 library_name: transformers
+license: mit
+pipeline_tag: text-generation
 ---
 
 <br/><br/>
```
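The change that gives the PR its title: `pipeline_tag: text-generation` files the model under the text-generation task on the Hub, and `transformers` can resolve the task from that metadata. A minimal sketch of what the tag enables, assuming the repo loads through the standard pipeline API (whether `trust_remote_code` is actually required is an assumption here, and the full checkpoint needs far more memory than a single GPU):

```python
# Sketch: with pipeline_tag set in the model card, the task argument
# can be omitted; transformers resolves it from the Hub metadata.
from transformers import pipeline

generator = pipeline(
    model="XiaomiMiMo/MiMo-V2-Flash",
    trust_remote_code=True,  # assumption: custom architecture code in the repo
)
print(generator("Explain Mixture-of-Experts in one sentence:", max_new_tokens=64))
```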
```diff
@@ -20,10 +21,12 @@ library_name: transformers
 
 <a href="https://huggingface.co/XiaomiMiMo/MiMo-V2-Flash" target="_blank">🤗 HuggingFace</a>
 
-<a href="https://
+<a href="https://huggingface.co/papers/2601.02780" target="_blank">📄 Technical Report </a>
 
 <a href="https://mimo.xiaomi.com/blog/mimo-v2-flash" target="_blank">📰 Blog </a>
 
+<a href="https://github.com/XiaomiMiMo/MiMo-V2-Flash" target="_blank">💻 GitHub </a>
+
 <br/><br/>
 <strong>Play around!</strong>
 <a href="https://aistudio.xiaomimimo.com" target="_blank">🎨 Xiaomi MiMo Studio </a>
```
```diff
@@ -36,6 +39,8 @@ library_name: transformers
 
 **MiMo-V2-Flash** is a Mixture-of-Experts (MoE) language model with **309B total parameters** and **15B active parameters**. Designed for high-speed reasoning and agentic workflows, it utilizes a novel hybrid attention architecture and Multi-Token Prediction (MTP) to achieve state-of-the-art performance while significantly reducing inference costs.
 
+The model was presented in the [MiMo-V2-Flash Technical Report](https://huggingface.co/papers/2601.02780).
+
 <p align="center">
 <img width="80%" src="https://github.com/XiaomiMiMo/MiMo-V2-Flash/raw/main/figures/MiMo-v2-flash-performance.jpg?raw=true">
 </p>
```
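The paragraph above is the context for the new report link: only about 15B of the 309B parameters are active per token (roughly 5% of the weights per forward pass), which is where the inference-cost claim comes from. For readers arriving from the paper page, a hedged loading sketch; the dtype and device settings are illustrative assumptions, and the full MoE checkpoint still needs a multi-GPU node even though few parameters fire at once:

```python
# Sketch: load the checkpoint this card describes. All experts must be
# resident (or offloaded) even though only ~15B params are active per token.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "XiaomiMiMo/MiMo-V2-Flash"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # illustrative; check the repo's recommendation
    device_map="auto",            # shards/offloads across available devices
    trust_remote_code=True,       # assumption: hybrid attention / MTP use custom code
)

inputs = tokenizer("Multi-Token Prediction speeds up decoding by", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```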
````diff
@@ -304,7 +309,7 @@ If you find our work helpful, please cite our technical report:
 title={MiMo-V2-Flash Technical Report},
 author={LLM-Core Xiaomi},
 year={2025},
-url={https://
+url={https://huggingface.co/papers/2601.02780}
 }
 ```
 
````
```diff
@@ -317,4 +322,4 @@ Please contact us at [mimo@xiaomi.com](mailto:mimo@xiaomi.com), join our WeChat
 <img src="https://github.com/XiaomiMiMo/MiMo-V2-Flash/raw/main/figures/wechat_group/wechat2.jpg?raw=true" width="20%" />
 <img src="https://github.com/XiaomiMiMo/MiMo-V2-Flash/raw/main/figures/wechat_group/wechat3.jpg?raw=true" width="20%" />
 <img src="https://github.com/XiaomiMiMo/MiMo-V2-Flash/raw/main/figures/wechat_group/wechat4.jpg?raw=true" width="20%" />
-</p>
+</p>
```