	---
	license: apache-2.0
	language:
	- zh
	- en
	base_model:
	- speechbrain/spkrec-ecapa-voxceleb
	tags:
	- agent
	- voice-activity-detection
	---

	<div align="center">
	<h1>FireRedChat-pVAD</h1>
	</div>
	<div align="center">
	<a href="https://fireredteam.github.io/demos/firered_chat/">Demo</a> •
	<a href="https://arxiv.org/pdf/2509.06502">Paper</a> •
	<a href="https://huggingface.co/FireRedTeam">Hugging Face</a>
	</div>

	## Description
	FireRedChat's personalized Voice Activity Detection (pVAD) model is an open-weight model that detects voice activity and supports speaker embedding updates. A [LiveKit plugin](https://github.com/fireredchat-submodules/livekit-plugins-fireredchat-pvad) is available.

	- Supports speaker embedding updates for improved voice activity detection.
	- The plugin requires a compatible LiveKit Agents fork, or a modified LiveKit Agents build, that calls `update_speaker` on the first user utterance.

	## Roadmap
	- [x] 2025/09
	    - [x] Release the pVAD model weights and LiveKit plugin.

	## Usage
	For inference, please use the [LiveKit plugin](https://github.com/fireredchat-submodules/livekit-plugins-fireredchat-pvad). Install and configure as follows:

	```python
	from livekit.agents import JobProcess
	from livekit.plugins import fireredchat_pvad as pvad

	def prewarm(proc: JobProcess):
	    # Load the pVAD model once per process and store it in the process userdata.
	    proc.userdata["vad"] = pvad.VAD.load(activation_threshold=0.5)

	# After the first utterance (or when the primary speaker switches, judged by RMS),
	# call update_speaker() on the VADStream to update the speaker embedding.
	```
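
The comment above mentions deciding on a speaker switch "based on RMS" but does not say how. The following is a minimal sketch of one possible heuristic, not the plugin's actual logic. It computes the RMS level of a 16-bit PCM frame and flags an update when no speaker is enrolled yet or when a new utterance is much louder than the enrolled one. The names `frame_rms` and `should_update_speaker` and the `ratio` parameter are hypothetical, and the signature of `update_speaker()` is not shown in this README, so the sketch stops short of calling it.

```python
import math
from array import array
from typing import Optional


def frame_rms(pcm: bytes) -> float:
    """Return the RMS level of a frame of 16-bit signed PCM samples.

    Assumes native byte order and a whole number of 2-byte samples.
    """
    samples = array("h")
    samples.frombytes(pcm)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def should_update_speaker(
    utterance_rms: float,
    enrolled_rms: Optional[float],
    ratio: float = 2.0,  # hypothetical loudness margin, not taken from the plugin
) -> bool:
    """Decide whether to refresh the speaker embedding (illustrative heuristic)."""
    if enrolled_rms is None:
        # No speaker enrolled yet: the first utterance always triggers an update.
        return True
    # Otherwise, treat a clearly louder utterance as a new primary speaker.
    return utterance_rms > ratio * enrolled_rms
```

When `should_update_speaker` returns `True`, the application would then call the VADStream's `update_speaker()` as described above, with whatever arguments the plugin requires.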

	## License
	The model weights and plugin code are licensed under the Apache-2.0 license.

	## Acknowledgment
	- Speaker embedding model: speechbrain/spkrec-ecapa-voxceleb

	---