Add paper link and pipeline tag metadata

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +19 -10
README.md CHANGED
@@ -1,15 +1,13 @@
  ---
- license: apache-2.0
  language:
- - en
- - zh
+ - en
+ - zh
+ license: apache-2.0
+ pipeline_tag: voice-activity-detection
  tags:
  - voice-activity-detection
- - Voice Acticity Detection
- - voice activity detection
- - speech activity detection
- - Audio Event Detection
- - audio event detection
+ - speech-activity-detection
+ - audio-event-detection
  - vad
  - aed
  - streaming
@@ -19,7 +17,6 @@ tags:
  - asr
  ---

-
  <div align="center">
  <h1>
  FireRedVAD: A SOTA Industrial-Grade
@@ -29,17 +26,19 @@ Voice Activity Detection & Audio Event Detection

  </div>

+ [[Paper]](https://huggingface.co/papers/2603.10420)
  [[Code]](https://github.com/FireRedTeam/FireRedVAD)
  [[HuggingFace]](https://huggingface.co/FireRedTeam/FireRedVAD)
  [[ModelScope]](https://www.modelscope.cn/models/xukaituo/FireRedVAD)


- FireRedVAD is a state-of-the-art (SOTA) industrial-grade Voice Activity Detection (VAD) and Audio Event Detection (AED) solution.
+ FireRedVAD is a state-of-the-art (SOTA) industrial-grade Voice Activity Detection (VAD) and Audio Event Detection (AED) solution. It was introduced as part of [FireRedASR2S](https://huggingface.co/papers/2603.10420).

  FireRedVAD supports non-streaming/streaming VAD and non-streaming AED. It supports speech/singing/music detection in 100+ languages. Non-streaming VAD achieves 97.57% F1 on FLEURS-VAD-102, outperforming Silero-VAD, TEN-VAD, FunASR-VAD and WebRTC-VAD.


  ## 🔥 News
+ - [2026.03.12] 🔥 We release FireRedASR2S technical report. See [arXiv](https://arxiv.org/abs/2603.10420).
  - [2026.03.03] We release FireRedVAD as a standalone repository, along with model weights and inference code.
  - [2026.02.12] We release [FireRedASR2S](https://github.com/FireRedTeam/FireRedASR2S) (FireRedASR2-AED, FireRedVAD, FireRedLID, and FireRedPunc) with model weights and inference code.

@@ -214,3 +213,13 @@ print(result)
  **Q: What audio format is supported?**

  16kHz 16-bit mono PCM wav. Use ffmpeg to convert other formats: `ffmpeg -i <input_audio_path> -ar 16000 -ac 1 -acodec pcm_s16le -f wav <output_wav_path>`
+
+ ## Citation
+ ```bibtex
+ @article{xu2026fireredasr2s,
+ title={FireRedASR2S: A State-of-the-Art Industrial-Grade All-in-One Automatic Speech Recognition System},
+ author={Xu, Kaituo and Jia, Yan and Huang, Kai and Chen, Junjie and Li, Wenpeng and Liu, Kun and Xie, Feng-Long and Tang, Xu and Hu, Yao},
+ journal={arXiv preprint arXiv:2603.10420},
+ year={2026}
+ }
+ ```
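
The FAQ entry in the diff above states that input must be 16kHz 16-bit mono PCM wav. A minimal sketch for checking a file against that requirement before running inference, assuming only Python's standard `wave` module; `check_wav_format` is a hypothetical helper name and is not part of FireRedVAD:

```python
import wave


def check_wav_format(path):
    """Return a list of mismatches against 16 kHz, 16-bit, mono PCM WAV.

    Hypothetical helper for illustration; an empty list means the file
    matches the format described in the README FAQ.
    """
    problems = []
    # wave.open raises wave.Error for files it cannot parse as PCM WAV.
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()  # bytes per sample; 2 bytes = 16 bits
        channels = wav.getnchannels()
    if rate != 16000:
        problems.append(f"sample rate is {rate} Hz, expected 16000 Hz")
    if width != 2:
        problems.append(f"sample width is {8 * width} bits, expected 16 bits")
    if channels != 1:
        problems.append(f"channel count is {channels}, expected 1 (mono)")
    return problems
```

If the returned list is non-empty, the file can be converted with the ffmpeg command quoted in the FAQ entry.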