---
language:
- en
- zh
license: apache-2.0
pipeline_tag: audio-classification
tags:
- Language Identification
- LID
- Audio Classification
- VoxLingua107
- audio
- automatic-speech-recognition
- asr
---

<div align="center">
<h1>
FireRedASR2S - FireRedLID
<br>
A SOTA Industrial-Grade Spoken Language Identification System
</h1>
</div>

[[Paper]](https://huggingface.co/papers/2603.10420)
[[Code]](https://github.com/FireRedTeam/FireRedASR2S)
[[Blog]](https://fireredteam.github.io/demos/firered_asr/)
[[Demo]](https://huggingface.co/spaces/FireRedTeam/FireRedASR)

FireRedLID is the Spoken Language Identification (LID) module of **FireRedASR2S**, a state-of-the-art (SOTA), industrial-grade, all-in-one ASR system. It supports 100+ languages and 20+ Chinese dialects/accents, achieves 97.18% accuracy on the FLEURS benchmark, and outperforms Whisper and SpeechBrain-LID.

This model was introduced in the paper [FireRedASR2S: A State-of-the-Art Industrial-Grade All-in-One Automatic Speech Recognition System](https://huggingface.co/papers/2603.10420).

## 🔥 News
- [2026.02.12] We release FireRedASR2S (FireRedASR2-AED, FireRedVAD, FireRedLID, and FireRedPunc) with model weights and inference code.

## Evaluation
### FireRedLID
Metric: Utterance-level LID Accuracy (%). Higher is better.

| Testset \ Model      | Languages                    | FireRedLID | Whisper | SpeechBrain | Dolphin |
|:--------------------:|:----------------------------:|:----------:|:-------:|:-----------:|:-------:|
| FLEURS test          | 82 languages                 | **97.18**  | 79.41   | 92.91       | -       |
| CommonVoice test     | 74 languages                 | **92.07**  | 80.81   | 78.75       | -       |
| KeSpeech + MagicData | 20+ Chinese dialects/accents | **88.47**  | -       | -           | 69.01   |

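Utterance-level accuracy here means one language decision per utterance, scored against the reference label. A minimal sketch of the scoring (the helper below is illustrative, not part of the released code):

```python
def utterance_accuracy(hyp_langs, ref_langs):
    """Percentage of utterances whose predicted language matches the reference."""
    assert len(hyp_langs) == len(ref_langs) and ref_langs
    correct = sum(h == r for h, r in zip(hyp_langs, ref_langs))
    return 100.0 * correct / len(ref_langs)

# utterance_accuracy(["zh", "en", "fr"], ["zh", "en", "de"]) -> ~66.67
```
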
## Sample Usage

To use this module independently, first clone the [GitHub repository](https://github.com/FireRedTeam/FireRedASR2S) and install the dependencies.

### Python API Usage
```python
from fireredasr2s.fireredlid import FireRedLid, FireRedLidConfig

# Utterance IDs and the corresponding WAV files to identify
batch_uttid = ["hello_zh", "hello_en"]
batch_wav_path = ["assets/hello_zh.wav", "assets/hello_en.wav"]

# use_gpu selects GPU inference; use_half toggles half-precision inference
config = FireRedLidConfig(use_gpu=True, use_half=False)
model = FireRedLid.from_pretrained("FireRedTeam/FireRedLID", config)

# One result dict per utterance: predicted language, confidence, duration, RTF
results = model.process(batch_uttid, batch_wav_path)
print(results)
# [{'uttid': 'hello_zh', 'lang': 'zh mandarin', 'confidence': 0.996, 'dur_s': 2.32, 'rtf': '0.0741', 'wav': 'assets/hello_zh.wav'}, {'uttid': 'hello_en', 'lang': 'en', 'confidence': 0.996, 'dur_s': 2.24, 'rtf': '0.0741', 'wav': 'assets/hello_en.wav'}]
```
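
The returned list is plain Python data, so it can be filtered or reshaped directly. As a sketch (the confidence threshold below is an illustrative value, not a project default), you might keep only confident predictions and map each utterance ID to its language:

```python
# Keep only confident predictions; the 0.9 cutoff is a hypothetical value.
MIN_CONFIDENCE = 0.9

lang_by_uttid = {
    r["uttid"]: r["lang"]
    for r in results
    if r["confidence"] >= MIN_CONFIDENCE
}
print(lang_by_uttid)  # {'hello_zh': 'zh mandarin', 'hello_en': 'en'}
```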

## Citation
```bibtex
@article{xu2026fireredasr2s,
  title={FireRedASR2S: A State-of-the-Art Industrial-Grade All-in-One Automatic Speech Recognition System},
  author={Xu, Kaituo and Jia, Yan and Huang, Kai and Chen, Junjie and Li, Wenpeng and Liu, Kun and Xie, Feng-Long and Tang, Xu and Hu, Yao},
  journal={arXiv preprint arXiv:2603.10420},
  year={2026}
}
```