---
language:
- ar
license: apache-2.0
tags:
- arabic
- end-of-utterance
- eou
- turn-detection
- conversational-ai
- livekit
- bert
- arabert
datasets:
- arabic-eou-detection-10k
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: Arabic End-of-Utterance Detector
  results:
  - task:
      type: text-classification
      name: End-of-Utterance Detection
    dataset:
      name: Arabic EOU Detection
      type: arabic-eou-detection-10k
    metrics:
    - type: accuracy
      value: 0.90
      name: Accuracy
    - type: f1
      value: 0.92
      name: F1 Score (EOU)
    - type: precision
      value: 0.90
      name: Precision (EOU)
    - type: recall
      value: 0.93
      name: Recall (EOU)
---

# Arabic End-of-Utterance (EOU) Detector

**Detect when a speaker has finished their utterance in Arabic conversations.**

This model is fine-tuned from [AraBERT v2](https://huggingface.co/aubmindlab/bert-base-arabertv2) for binary classification of Arabic text, determining whether an utterance is complete (EOU) or incomplete (No EOU).

## Model Description

- **Model Type**: BERT-based binary classifier
- **Base Model**: [aubmindlab/bert-base-arabertv2](https://huggingface.co/aubmindlab/bert-base-arabertv2)
- **Language**: Arabic (ar)
- **Task**: End-of-Utterance Detection
- **License**: Apache 2.0

## Performance

| Metric | Value |
|--------|-------|
| **Accuracy** | 90% |
| **Precision (EOU)** | 0.90 |
| **Recall (EOU)** | 0.93 |
| **F1-Score (EOU)** | 0.92 |
| **Test Samples** | 1,001 |

### Confusion Matrix

```
                  Predicted
               No EOU    EOU
Actual No EOU     333     62   (84.3% correct)
       EOU         42    564   (93.1% correct)
```

## Available Formats

This repository includes three model formats:

1. **PyTorch** (`pytorch_model.bin` or `model.safetensors`) - For training and fine-tuning
2. **ONNX** (`model.onnx`) - For optimized CPU/GPU inference (~2-3x faster)
3. **Quantized ONNX** (`model_quantized.onnx`) - For production (75% smaller, 2-3x faster)

## Quick Start

### Installation

```bash
pip install transformers torch onnxruntime
```

### PyTorch Inference

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model and tokenizer
model_name = "your-username/arabic-eou-detector"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # disable dropout for inference

# Inference
def predict_eou(text: str):
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)
    is_eou = torch.argmax(probs, dim=-1).item() == 1
    confidence = probs[0, 1].item()
    return is_eou, confidence

# Test
text = "مرحبا كيف حالك"
is_eou, conf = predict_eou(text)
print(f"Is EOU: {is_eou}, Confidence: {conf:.4f}")
```

### ONNX Inference (Recommended for Production)

```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

# Load tokenizer
model_name = "your-username/arabic-eou-detector"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load ONNX model (use model_quantized.onnx for best performance)
session = ort.InferenceSession(
    "model_quantized.onnx",  # or "model.onnx"
    providers=["CPUExecutionProvider"]
)

# Inference
def predict_eou(text: str):
    inputs = tokenizer(
        text,
        padding="max_length",
        max_length=512,
        truncation=True,
        return_tensors="np"
    )
    outputs = session.run(
        None,
        {
            "input_ids": inputs["input_ids"].astype(np.int64),
            "attention_mask": inputs["attention_mask"].astype(np.int64)
        }
    )
    logits = outputs[0]
    # Numerically stable softmax: subtract the row max before exponentiating
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted) / np.sum(np.exp(shifted), axis=-1, keepdims=True)
    is_eou = np.argmax(probs, axis=-1)[0] == 1
    confidence = float(probs[0, 1])
    return is_eou, confidence

# Test
text = "مرحبا كيف حالك"
is_eou, conf = predict_eou(text)
print(f"Is EOU: {is_eou}, Confidence: {conf:.4f}")
```
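Both `predict_eou` variants return a boolean plus a confidence, and a caller will typically gate the turn-ending decision on that confidence rather than the argmax alone. The sketch below shows that pattern; `predict_eou_stub` is a hypothetical stand-in for the real model call (so the snippet runs without downloading anything), and the 0.7 threshold is illustrative, mirroring the `unlikely_threshold` used in the LiveKit integration further down.

```python
# Sketch: confidence-gated turn ending. predict_eou_stub is a toy
# heuristic standing in for the model's predict_eou() shown above.

def predict_eou_stub(text: str):
    """Hypothetical stand-in: treats text ending in a terminal word as
    complete. Swap in the real PyTorch/ONNX predict_eou() in practice."""
    complete_endings = ("حالك", "شكرا", "نعم")
    is_eou = text.strip().endswith(complete_endings)
    return is_eou, 0.95 if is_eou else 0.20

def should_end_turn(text: str, predictor=predict_eou_stub,
                    threshold: float = 0.7) -> bool:
    """End the user's turn only when the model both predicts EOU and is
    confident; otherwise keep listening for more speech."""
    is_eou, confidence = predictor(text)
    return is_eou and confidence >= threshold

print(should_end_turn("مرحبا كيف حالك"))  # complete greeting -> True
print(should_end_turn("أريد أن"))          # trailing "I want to..." -> False
```

Gating on confidence instead of the raw argmax lets an agent trade responsiveness for fewer premature interruptions by tuning a single threshold.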
## Use Cases

- **Voice Assistants**: Detect when a user has finished speaking
- **Conversational AI**: Improve turn-taking in Arabic chatbots
- **LiveKit Agents**: Custom turn detection for Arabic conversations
- **Speech Recognition**: Post-processing for better utterance segmentation

## Integration with LiveKit

```python
from huggingface_hub import hf_hub_download
from livekit.agents import AgentSession
from livekit.plugins.arabic_turn_detector import ArabicTurnDetector

# Download model from HuggingFace
model_path = hf_hub_download(
    repo_id="your-username/arabic-eou-detector",
    filename="model_quantized.onnx"
)

# Create turn detector
turn_detector = ArabicTurnDetector(
    model_path=model_path,
    unlikely_threshold=0.7
)

# Use in agent
session = AgentSession(
    turn_detector=turn_detector,
    # ... other config
)
```

## Training Details

### Training Data

- **Dataset**: Arabic EOU Detection (10,072 samples)
- **Train/Val/Test Split**: 80/10/10
- **Classes**:
  - `0`: Incomplete utterance (No EOU)
  - `1`: Complete utterance (EOU)

### Training Hyperparameters

- **Base Model**: aubmindlab/bert-base-arabertv2
- **Learning Rate**: 2e-5
- **Batch Size**: 32
- **Epochs**: 10
- **Optimizer**: AdamW
- **Weight Decay**: 0.01
- **Max Sequence Length**: 512

### Preprocessing

- AraBERT normalization (diacritics removal, character normalization)
- Tokenization with the AraBERT tokenizer
- Padding to max length (512 tokens)

## Limitations

- **Language**: Optimized for Modern Standard Arabic (MSA)
- **Domain**: Trained on conversational Arabic text
- **Sequence Length**: Maximum 512 tokens
- **Dialects**: May have reduced accuracy on dialectal Arabic

## Citation

If you use this model, please cite:

```bibtex
@misc{arabic-eou-detector,
  author = {Your Name},
  title = {Arabic End-of-Utterance Detector},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/your-username/arabic-eou-detector}}
}
```

## License

Apache 2.0

## Acknowledgments

- **AraBERT**: [aubmindlab/bert-base-arabertv2](https://huggingface.co/aubmindlab/bert-base-arabertv2)
- **HuggingFace Transformers**: Model training and inference
- **ONNX Runtime**: Model optimization and deployment

## Contact

For issues or questions, please open an issue on the [GitHub repository](https://github.com/Ahmed-Ezzat20/hams_task).