---
license: apache-2.0
tags:
- turn-detection
- end-of-utterance
- voice-agent
- livekit
- onnx
---

# Turn Detector V4 (Fine-tuned)

This is a fine-tuned version of the LiveKit Turn Detector model, adapted with LoRA on 1735 production conversation records for end-of-utterance (EOU) detection in voice agents.

## Model Description

- **Base Model**: Qwen2-0.5B-Instruct
- **Task**: End-of-Utterance (EOU) detection for voice agents
- **Format**: ONNX (INT8 quantized)
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Training Data**: 1735 production conversation records

## Performance

- **Accuracy**: 79.25% @ threshold 0.38
- **Dataset**: 1735 annotated production records
- **Improvement**: +13.08% over LiveKit v1.2.2-en baseline
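
The accuracy figure above is tied to a decision threshold of 0.38. As a minimal sketch of what that means in practice, the snippet below turns an EOU probability into a yes/no decision and scores a batch of decisions against labels. The function names, the `>=` comparison, and the toy data are assumptions for illustration; they are not taken from the evaluation code behind the reported number.

```python
from typing import Sequence

# Threshold reported in the Performance section.
EOU_THRESHOLD = 0.38


def is_end_of_utterance(eou_probability: float, threshold: float = EOU_THRESHOLD) -> bool:
    """Return True when the EOU probability meets or exceeds the threshold.

    Using >= (rather than >) at the boundary is an assumption.
    """
    return eou_probability >= threshold


def accuracy_at_threshold(
    probabilities: Sequence[float],
    labels: Sequence[bool],
    threshold: float = EOU_THRESHOLD,
) -> float:
    """Fraction of examples whose thresholded prediction matches its label."""
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have the same length")
    if not labels:
        raise ValueError("at least one example is required")
    correct = sum(
        is_end_of_utterance(p, threshold) == y
        for p, y in zip(probabilities, labels)
    )
    return correct / len(labels)


# Hypothetical toy data, not drawn from the 1735 production records.
toy_probabilities = [0.91, 0.40, 0.12, 0.37]
toy_labels = [True, False, False, True]
toy_accuracy = accuracy_at_threshold(toy_probabilities, toy_labels)
print(f"toy accuracy at {EOU_THRESHOLD}: {toy_accuracy:.2f}")
```

Lowering the threshold makes the agent respond sooner at the cost of more interruptions; raising it does the opposite, so the threshold is worth re-tuning on your own data.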

## Usage

```python
from livekit.agents import turn_detector

# Use with LiveKit agents
detector = turn_detector.EOUModel.load(
    model_id="Vurtnec/turn-detector",
    download_files=["model.onnx"]
)
```

## Model Files

- `model.onnx`: ONNX Runtime optimized model (250 MB)
- Tokenizer files: Standard Qwen2 tokenizer configuration

## Training Details

- **Base Model**: LiveKit Turn Detector v1.2.2-en
- **Fine-tuning Approach**: LoRA with rank=8, alpha=16
- **Training Dataset**: 1735 production EOU examples
- **Validation Split**: 10%
- **Training Date**: December 2024
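
To make the LoRA settings above concrete, here is a small sketch of the arithmetic they imply, assuming the standard LoRA formulation in which a frozen weight `W` is updated as `W + (alpha / rank) * B @ A`. The `LoraSettings` class and the 896 x 896 matrix size are hypothetical and chosen only for illustration; which layers were actually adapted is not stated in this card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """LoRA hyperparameters as listed in Training Details."""

    rank: int = 8
    alpha: int = 16

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update B @ A by alpha / rank.
        return self.alpha / self.rank

    def added_parameters(self, d_in: int, d_out: int) -> int:
        # A has shape (rank, d_in) and B has shape (d_out, rank).
        return self.rank * (d_in + d_out)


settings = LoraSettings()

# Hypothetical square projection size, used only to show the arithmetic.
HIDDEN = 896
added = settings.added_parameters(HIDDEN, HIDDEN)
full = HIDDEN * HIDDEN

print(f"scaling factor: {settings.scaling}")
print(f"trainable LoRA parameters per matrix: {added}")
print(f"share of the full matrix: {added / full:.2%}")
```

With rank 8, each adapted matrix trains a small fraction of the parameters a full fine-tune would touch, which is what makes fine-tuning on a dataset of this size practical.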

## Citation

If you use this model, please cite:

```bibtex
@misc{turn-detector-v4,
  author = {Vurtnec},
  title = {Turn Detector V4 - Fine-tuned EOU Model},
  year = {2024},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/Vurtnec/turn-detector}}
}
```

## License

Apache 2.0