How to use XiaomiMiMo/MiMo-V2.5-ASR with Transformers:

    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("automatic-speech-recognition", model="XiaomiMiMo/MiMo-V2.5-ASR")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("XiaomiMiMo/MiMo-V2.5-ASR")
    model = AutoModelForCausalLM.from_pretrained("XiaomiMiMo/MiMo-V2.5-ASR")
In auto mode, the language tag tends to output Chinese
#1
by funnyice - opened
In auto mode, the language tag tends to come out as <chinese> even when the audio is English, as in the example below. The transcription itself is correct.
Language: Auto (tag='')
Text channel: <chinese> um, and then I'll be coming to you ...
FINAL_TEXT: um, and then I'll be coming to you ...
Yes, we suspect this happens because we assigned the <chinese> tag to code-switching data. Whichever tag is emitted, though, transcription in auto mode is already sufficiently accurate. The language tag is meant as an optional hint: when you already know the language of the audio, setting the tag gives the model stronger conditioning and leads to higher accuracy.
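For anyone post-processing the output in the meantime, here is a minimal sketch of how the leading tag could be separated from the transcript, with a language you already know taking priority over the auto-mode tag. It assumes the text channel starts with at most one lowercase tag in angle brackets, as in the log above. The helper names `split_language_tag` and `resolve_language` are hypothetical and not part of the model's code, and `english` is an assumed label.

```python
import re
from typing import Optional, Tuple

# Assumption: the text channel begins with an optional tag such as "<chinese>",
# followed by the transcript. Only the <chinese> tag appears in the thread.
_TAG_RE = re.compile(r"^\s*<([a-z_]+)>\s*")


def split_language_tag(text_channel: str) -> Tuple[Optional[str], str]:
    """Return (tag, transcript); tag is None when no leading tag is present."""
    match = _TAG_RE.match(text_channel)
    if match is None:
        return None, text_channel.strip()
    return match.group(1), text_channel[match.end():].strip()


def resolve_language(predicted: Optional[str], known: Optional[str] = None) -> Optional[str]:
    """Prefer a language supplied by the caller over the auto-mode tag."""
    return known if known else predicted


tag, text = split_language_tag("<chinese> um, and then I'll be coming to you ...")
print(tag)   # chinese
print(text)  # um, and then I'll be coming to you ...

# "english" is a hypothetical label supplied by the caller.
print(resolve_language(tag, known="english"))  # english
```

This treats the auto-mode tag only as a fallback label, which matches the reply above: the transcript is used as-is, and the tag is not relied on for language identification.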