Link model to paper and add library metadata
Hi! I'm Niels from the Hugging Face community science team.
This PR improves the model card metadata by:
- Adding the `arxiv` ID to link the model repository with its [research paper](https://huggingface.co/papers/2512.23808) on the Hub.
- Adding `library_name: transformers` since the configuration indicates compatibility with the Transformers library.
- Updating paper links in the README to point to the Hugging Face paper page.
These changes help improve the discoverability of your work and enable better integration with the Hugging Face ecosystem.
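As a rough illustration of the metadata change, the sketch below reads the updated front matter and checks that the new fields are present. `README_HEAD` and `read_front_matter` are hypothetical names introduced here, and the parser only handles the simple `key: value` and `- item` subset of YAML seen in this model card; it is not how the Hub itself parses metadata.

```python
# Front matter as it appears after this PR (copied from the diff below).
README_HEAD = """\
---
license: mit
pipeline_tag: any-to-any
library_name: transformers
tags:
- Audio-to-Text
- Text-to-Audio
- Audio-to-Audio
- Text-to-Text
- Audio-Text-to-Text
arxiv: 2512.23808
---

<div align="center">
"""


def read_front_matter(text):
    """Parse a minimal YAML subset: `key: value` pairs and `- item` lists.

    Hypothetical helper for illustration only; a real project would
    normally use a full YAML parser instead.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter block at the top of the file
    meta = {}
    current_list_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # closing delimiter of the front matter
        if stripped.startswith("- ") and current_list_key is not None:
            # List item belonging to the most recent key with an empty value.
            meta[current_list_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                meta[key] = value
                current_list_key = None
            else:
                meta[key] = []
                current_list_key = key
    return meta


metadata = read_front_matter(README_HEAD)
print(metadata["library_name"])
print(metadata["arxiv"])
print(len(metadata["tags"]))
```

Values are kept as strings, so the arXiv ID stays `"2512.23808"` rather than being interpreted as a number.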
README.md
CHANGED
@@ -1,13 +1,16 @@
 ---
 license: mit
 pipeline_tag: any-to-any
+library_name: transformers
 tags:
 - Audio-to-Text
 - Text-to-Audio
 - Audio-to-Audio
 - Text-to-Text
 - Audio-Text-to-Text
+arxiv: 2512.23808
 ---
+
 <div align="center">
 <picture>
 <source srcset="https://github.com/XiaomiMiMo/MiMo-VL/raw/main/figures/Xiaomi_MiMo_darkmode.png?raw=true" media="(prefers-color-scheme: dark)">
@@ -32,7 +35,7 @@ tags:
 
 <a href="https://github.com/XiaomiMiMo/MiMo-Audio" target="_blank">🤖 GitHub</a>
 
-<a href="https://
+<a href="https://huggingface.co/papers/2512.23808" target="_blank">📄 Paper</a>
 
 <a href="https://xiaomimimo.github.io/MiMo-Audio-Demo" target="_blank">📰 Blog</a>
 
@@ -73,7 +76,7 @@ MiMo-Audio couples a patch encoder, an LLM, and a patch decoder to improve model
 ## Explore MiMo-Audio Now! 🚀🚀🚀
 - 🎧 **Try the Hugging Face demo:** [MiMo-Audio Demo](https://huggingface.co/spaces/XiaomiMiMo/mimo_audio_chat)
 - 📰 **Read the Official Blog:** [MiMo-Audio Blog](https://xiaomimimo.github.io/MiMo-Audio-Demo)
-- 📖 **Dive into the Technical Report:** [MiMo-Audio Technical Report](https://
+- 📖 **Dive into the Technical Report:** [MiMo-Audio Technical Report](https://huggingface.co/papers/2512.23808)
 
 
 ## Model Download
@@ -110,7 +113,7 @@ pip install -r requirements.txt
 pip install flash-attn==2.7.4.post1
 ```
 
->
+> [!Note]
 > If the compilation of flash-attn takes too long, you can download the precompiled wheel and install it manually:
 >
 > * [Download Precompiled Wheel](https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.4.post1/flash_attn-2.7.4.post1+cu12torch2.6cxx11abiFALSE-cp312-cp312-linux_x86_64.whl)
@@ -158,7 +161,7 @@ This toolkit is designed to evaluate MiMo-Audio and other recent audio LLMs as m
 title={MiMo-Audio: Audio Language Models are Few-Shot Learners},
 author={LLM-Core-Team Xiaomi},
 year={2025},
-url={
+url={https://github.com/XiaomiMiMo/MiMo-Audio},
 }
 ```
 