	---
	language:
	- ar
	tags:
	- text-classification
	- eou
	- end-of-utterance
	- turn-detection
	- arabic
	- saudi-dialect
	- marbert
	base_model: UBC-NLP/MARBERT
	license: apache-2.0
	metrics:
	- accuracy
	- f1
	- precision
	- recall
	---

	# MARBERT Arabic End-of-Utterance Detection

	Fine-tuned MARBERT model for Arabic End-of-Utterance (EOU) detection in real-time voice agents.

	## Model Description

	- Base Model: UBC-NLP/MARBERT (163M parameters)
	- Task: Binary sequence classification (complete vs incomplete utterance)
	- Language: Arabic (emphasis on Saudi/Gulf dialect)
	- Training Data: 125K samples from the SADA22 dataset
	- Inference Speed: ~30 ms average latency per utterance on CPU
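
The CPU latency figure above depends on hardware and input length, so it is worth re-measuring locally. The following is a minimal timing sketch, not the procedure used for this model card. `measure_latency_ms` and `fake_predict` are hypothetical names introduced for illustration; in practice the real prediction function would be passed in place of the stand-in.

```python
import statistics
import time
from typing import Callable, Sequence


def measure_latency_ms(predict: Callable[[str], float],
                       texts: Sequence[str],
                       warmup: int = 3) -> dict:
    """Time `predict` on each text and summarize latency in milliseconds."""
    # Warm-up calls are excluded so one-time setup costs do not skew the result.
    for text in texts[:warmup]:
        predict(text)

    timings = []
    for text in texts:
        start = time.perf_counter()
        predict(text)
        timings.append((time.perf_counter() - start) * 1000.0)

    return {
        "mean_ms": statistics.mean(timings),
        "median_ms": statistics.median(timings),
        "max_ms": max(timings),
    }


# Hypothetical stand-in predictor so the sketch runs without downloading the model.
def fake_predict(text: str) -> float:
    return 0.5


stats = measure_latency_ms(fake_predict, ["مرحبا", "كيف حالك", "شكرا"] * 5)
print(stats)
```

Reporting the median alongside the mean helps show whether a few slow calls dominate the average.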

	## Performance

	| Metric | Score |
	|--------|-------|
	| F1 Score | 0.8174 |
	| Accuracy | 0.7995 |
	| Precision | 0.7506 |
	| Recall | 0.8971 |
	| AUC-ROC | 0.8249 |

	Test Set: 31,289 samples (50% complete, 50% incomplete)
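
As a quick sanity check on the table, F1 is the harmonic mean of precision and recall, so it can be recomputed from the two reported values. The sketch below uses only the numbers from the table. Because those inputs are already rounded to four decimals, the result is expected to match the reported F1 only up to rounding (about 0.817).

```python
# Values copied from the performance table above.
precision = 0.7506
recall = 0.8971
reported_f1 = 0.8174

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"Recomputed F1: {f1:.4f} (reported: {reported_f1})")
```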

	## Usage
	```python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification
	import torch

	model = AutoModelForSequenceClassification.from_pretrained("azeddinShr/marbert-arabic-eou")
	tokenizer = AutoTokenizer.from_pretrained("azeddinShr/marbert-arabic-eou")

	def predict_eou(text):
	    """Return the probability that `text` is a complete utterance."""
	    # Truncate to the same 128-token limit used during training.
	    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
	    with torch.no_grad():  # no gradients needed for inference
	        outputs = model(**inputs)
	    # Softmax over the two classes; index 1 is read as the EOU (complete) class.
	    probs = torch.softmax(outputs.logits, dim=-1)
	    eou_prob = probs[0][1].item()
	    return eou_prob

	# Example
	text = "شكرا جزيلا على المساعدة"
	prob = predict_eou(text)
	is_complete = prob > 0.5
	print(f"EOU Probability: {prob:.3f} - {'Complete' if is_complete else 'Incomplete'}")
	```

	## Training Details

	- Epochs: 6
	- Batch Size: 16 (train), 32 (eval)
	- Learning Rate: 2e-5
	- Optimizer: AdamW
	- Max Length: 128 tokens
	- Training Time: ~2 minutes (GPU)
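
To put the hyperparameters above in context, the number of optimizer steps can be estimated from the dataset size, batch size, and epoch count. This is rough arithmetic under stated assumptions: "125K" is treated as exactly 125,000 samples, and training is assumed to run on a single device without gradient accumulation, neither of which the model card confirms.

```python
import math

train_samples = 125_000   # "125K" from the model card, treated as exact (assumption)
train_batch_size = 16     # from the training details
epochs = 6                # from the training details

# One optimizer step per batch; the last partial batch still counts as a step.
steps_per_epoch = math.ceil(train_samples / train_batch_size)
total_steps = steps_per_epoch * epochs

print(f"{steps_per_epoch} steps per epoch, {total_steps} steps in total")
```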

	## Use Cases

	- Real-time Arabic voice agents
	- Turn-taking detection in conversations
	- Streaming speech-to-text applications
	- Voice assistant interrupt handling
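
For the streaming use cases above, one possible integration is to accumulate partial transcripts and emit a turn whenever the EOU probability crosses a threshold. The sketch below illustrates that idea only. `detect_turns` and `fake_predict` are hypothetical names, the 0.5 threshold is an assumption that would need tuning, and the stand-in predictor uses a toy rule so the example runs without the model.

```python
from typing import Callable, Iterable, Iterator, List


def detect_turns(partials: Iterable[str],
                 predict: Callable[[str], float],
                 threshold: float = 0.5) -> Iterator[str]:
    """Yield an utterance each time the accumulated text is judged complete."""
    buffer: List[str] = []
    for chunk in partials:
        buffer.append(chunk)
        text = " ".join(buffer)
        if predict(text) > threshold:
            yield text
            buffer.clear()
    # Any text left in the buffer is not emitted here; a real agent would
    # likely fall back on another signal, such as a silence timeout.


# Hypothetical stand-in for the model: treats text ending in one word as complete.
def fake_predict(text: str) -> float:
    return 0.9 if text.endswith("المساعدة") else 0.1


chunks = ["شكرا", "جزيلا", "على", "المساعدة", "ممكن"]
turns = list(detect_turns(chunks, fake_predict))
print(turns)
```

Passing the predictor as a callable keeps the turn logic independent of how the model is loaded, which also makes it easy to test.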

	## Limitations

	- Best performance on Saudi/Gulf Arabic dialects
	- Requires Arabic text input (not audio)
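
Since the model expects Arabic text, a caller may want to screen inputs before running inference. The following is a rough heuristic sketch, not part of the model. `arabic_ratio` and `looks_arabic` are hypothetical helpers, the 0.5 cutoff is an arbitrary assumption, and only the basic Arabic Unicode block is checked. It cannot distinguish dialects.

```python
import re

# Basic Arabic Unicode block (U+0600 to U+06FF); extended blocks are ignored here.
_ARABIC_CHAR = re.compile(r"[\u0600-\u06FF]")


def arabic_ratio(text: str) -> float:
    """Fraction of non-whitespace characters in the basic Arabic block."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(1 for c in chars if _ARABIC_CHAR.match(c)) / len(chars)


def looks_arabic(text: str, min_ratio: float = 0.5) -> bool:
    """Heuristic check that the text is mostly Arabic script."""
    return arabic_ratio(text) >= min_ratio


print(looks_arabic("شكرا جزيلا على المساعدة"))
print(looks_arabic("thank you for the help"))
```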

	## Citation
	```bibtex
	@misc{marbert-arabic-eou,
	  author = {azeddinShr},
	  title = {MARBERT Arabic End-of-Utterance Detection},
	  year = {2025},
	  publisher = {HuggingFace},
	  url = {https://huggingface.co/azeddinShr/marbert-arabic-eou}
	}
	```

	## Dataset

	Training dataset: [azeddinShr/arabic-eou-sada22](https://huggingface.co/datasets/azeddinShr/arabic-eou-sada22)