---
language:
- ar
license: apache-2.0
tags:
- eou-detection
- arabic
- saudi-dialect
- conversation
- livekit
metrics:
- f1
- precision
- recall
pipeline_tag: text-classification
---

# Arabic End-of-Utterance (EOU) Detection Model

## Model Description

A fine-tuned text-classification model for Arabic End-of-Utterance (EOU) detection, optimized for Saudi-dialect conversations.
It is designed for real-time integration with LiveKit voice agents.

## Performance Metrics (Step 2400)

| Metric | Value |
|--------|-------|
| F1 Score | 0.534 |
| Precision | 0.431 |
| Recall | 0.702 |
| False Positive Rate (FPR) | 0.150 |

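As a quick consistency check, the reported F1 score can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The sketch below is illustrative only; the helper name `f1_from_precision_recall` is an assumption and not part of the model's code.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for step 2400.
f1 = f1_from_precision_recall(0.431, 0.702)
print(round(f1, 3))  # 0.534
```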
## Intended Use

- Real-time voice agent turn detection
- Arabic conversational AI systems
- Saudi dialect speech processing

## Training Details

- Base Model: [specify your base model]
- Training Steps: 2400
- Validation Loss: 0.462
- Training Date: December 2024

## Usage

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Replace the placeholder with the actual Hub repository ID.
model = AutoModelForSequenceClassification.from_pretrained("{username}/{repo_name}")
tokenizer = AutoTokenizer.from_pretrained("{username}/{repo_name}")
model.eval()  # disable dropout for inference

# Example inference ("Yes, I understand what you mean")
text = "نعم، أنا أفهم ما تقصد"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():  # no gradients needed at inference time
    outputs = model(**inputs)

# Probability of class index 1, used here as the end-of-utterance class
eou_probability = torch.softmax(outputs.logits, dim=-1)[0][1].item()
```

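To make the last step of the usage example concrete, the following dependency-free sketch shows what the softmax does to a single row of two logits. The logit values are made up for illustration, and treating index 1 as the EOU class simply mirrors the usage example rather than a documented label mapping.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one row of logits."""
    m = max(logits)  # subtract the max to avoid overflow in exp
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one input, ordered as [not-EOU, EOU].
example_logits = [-0.4, 1.1]
probs = softmax(example_logits)
eou_probability = probs[1]
print(eou_probability)
```

With two classes, this is equivalent to applying a sigmoid to the difference between the two logits.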
## Limitations

- Optimized for Saudi dialect
- May require threshold tuning for specific use cases
- Designed for conversational contexts

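One possible way to approach the threshold tuning mentioned above is to sweep candidate thresholds over a labeled validation set and compare the resulting metrics. The sketch below is a minimal, dependency-free illustration; the function names, the candidate thresholds, and the validation data are all hypothetical and not taken from the model's training setup.

```python
from typing import Dict, Sequence


def metrics_at_threshold(
    probs: Sequence[float], labels: Sequence[int], threshold: float
) -> Dict[str, float]:
    """Precision, recall, F1 and FPR when predicting EOU for prob >= threshold."""
    tp = fp = fn = tn = 0
    for p, y in zip(probs, labels):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return {"threshold": threshold, "precision": precision,
            "recall": recall, "f1": f1, "fpr": fpr}


def best_threshold(
    probs: Sequence[float], labels: Sequence[int], candidates: Sequence[float]
) -> Dict[str, float]:
    """Return the metrics of the candidate threshold with the highest F1."""
    results = [metrics_at_threshold(probs, labels, t) for t in candidates]
    return max(results, key=lambda r: r["f1"])


# Made-up validation data for illustration only (1 = end of utterance).
val_probs = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
val_labels = [1, 1, 0, 1, 0, 0]
best = best_threshold(val_probs, val_labels, [0.3, 0.5, 0.7])
print(best)
```

Maximizing F1 is only one choice of objective. For a voice agent that should rarely interrupt the user, selecting the threshold under a cap on the false positive rate may be a better fit.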
## Citation

```bibtex
@misc{arabic-eou-2024,
  author = {Your Name},
  title = {Arabic EOU Detection Model},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/{username}/{repo_name}}
}
```