Arabic End-of-Turn (EOU) Detection Model — MARBERT Fine-Tuned

This model fine-tunes MARBERT to detect end-of-utterance (EOU) boundaries, that is, the end of a speaker's turn, in Arabic dialogue.
Given a user message, it predicts whether the speaker will continue or has finished the turn.

Repository: nihad-ask/Arabert-EOU-detection-model
Task: Binary End-of-Utterance Classification
Language: Arabic (MSA + Saudi dialect)
Base Model: UBC-NLP/MARBERT

🚦 Task Definition

This is a binary classification task:

Label	Meaning
0	Speaker will continue (NOT end of turn)
1	End of turn (EOU detected)

This helps conversational agents decide whether the user has finished their turn or is likely to continue, so the agent knows when to respond and when to wait.
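
As a rough illustration of how an agent might act on these labels, the sketch below buffers incoming message fragments and releases them as one turn once a classifier reports label 1. The `TurnBuffer` class and `stub_classifier` function are hypothetical names introduced here, and the stub is a punctuation heuristic standing in for the model so the example runs without downloading anything. Whether to classify the accumulated text or only the latest fragment is also an assumption; the model card does not specify it.

```python
from typing import Callable, List, Optional


class TurnBuffer:
    """Accumulates message fragments until the classifier signals end of turn."""

    def __init__(self, is_end_of_turn: Callable[[str], bool]):
        self._is_end_of_turn = is_end_of_turn
        self._fragments: List[str] = []

    def add(self, fragment: str) -> Optional[str]:
        """Add a fragment; return the full turn if it is complete, else None."""
        self._fragments.append(fragment.strip())
        text = " ".join(self._fragments)
        if self._is_end_of_turn(text):
            self._fragments.clear()
            return text
        return None


def stub_classifier(text: str) -> bool:
    """Stand-in for the model (assumption): trailing punctuation means label 1."""
    return text.endswith(("?", "؟", ".", "!"))


buffer = TurnBuffer(stub_classifier)
for fragment in ["I wanted to ask", "about my order", "where is it?"]:
    turn = buffer.add(fragment)
    action = ("respond to: " + turn) if turn else "wait"
    print(repr(fragment), "->", action)
```

In a real system, `stub_classifier` would be replaced by a function that runs the fine-tuned model and returns `True` when the predicted label is 1.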

📌 Use Cases

Conversational AI / Chatbots
Dialogue Systems
Turn-taking prediction
Speech-to-text segmentation
Customer support automation

📊 Evaluation

Balanced Validation Set

Accuracy: 0.9098

Class	Precision	Recall	F1-score	Support
0 – Continue	0.9058	0.9148	0.9103	1702
1 – End of Turn	0.9139	0.9048	0.9094	1702

Overall:

Metric	Score
Accuracy	0.9098
Macro Avg F1	0.9098
Weighted Avg F1	0.9098
Total Samples	3404
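
The reported numbers can be cross-checked from the per-class precision, recall, and support. The sketch below recomputes F1 as the harmonic mean of precision and recall, averages it across the two classes, and estimates accuracy from the number of correct predictions implied by recall × support. Because the inputs are already rounded to four decimals, recomputed values may differ from the table in the last digit. With equal support per class, the macro and weighted averages coincide.

```python
def f1(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation table above
report = {
    "0 - Continue": {"precision": 0.9058, "recall": 0.9148, "support": 1702},
    "1 - End of Turn": {"precision": 0.9139, "recall": 0.9048, "support": 1702},
}

f1_scores = {name: f1(r["precision"], r["recall"]) for name, r in report.items()}
macro_f1 = sum(f1_scores.values()) / len(f1_scores)

# recall * support approximates the count of correctly classified samples per class
correct = sum(round(r["recall"] * r["support"]) for r in report.values())
total = sum(r["support"] for r in report.values())
accuracy = correct / total

for name, score in f1_scores.items():
    print(f"{name}: F1 = {score:.4f}")
print(f"Macro F1 = {macro_f1:.4f}")
print(f"Accuracy = {correct}/{total} = {accuracy:.4f}")
```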

🧪 How to Use

Python (PyTorch)

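from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "nihad-ask/marbert-EOU-detection-model"

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # inference mode: disables dropout

# Example input, roughly "OK, and then?"
text = "تمام و بعدين؟"

# Tokenize, truncating inputs that exceed the model's maximum length
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# No gradients are needed for inference
with torch.no_grad():
    outputs = model(**inputs)

# Pick the higher-scoring class: 0 = continue, 1 = end of turn
prediction = torch.argmax(outputs.logits, dim=1).item()

if prediction == 1:
    print("End of turn")
else:
    print("Speaker will continue")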
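
The example above takes the argmax of the logits, which is equivalent to a 0.5 probability threshold. An application that prefers to wait longer before responding might instead convert the logits to a probability and apply a stricter threshold. The following is a minimal sketch of that idea in plain Python; the function names, the 0.7 threshold, and the example logits are illustrative assumptions and not part of the model or its outputs.

```python
import math
from typing import Sequence, Tuple


def eou_probability(logits: Sequence[float]) -> float:
    """Softmax over [continue, end-of-turn] logits; returns P(label 1)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return exps[1] / sum(exps)


def decide(logits: Sequence[float], threshold: float = 0.5) -> Tuple[str, float]:
    """Return a decision string and the end-of-turn probability."""
    p = eou_probability(logits)
    label = "End of turn" if p >= threshold else "Speaker will continue"
    return label, p


# With the model loaded as above, one row of logits per text can be obtained with:
#   inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
#   with torch.no_grad():
#       rows = model(**inputs).logits.tolist()
rows = [[2.0, -1.0], [-0.5, 1.5]]  # made-up logits for illustration, not model output

for row in rows:
    label, p = decide(row, threshold=0.7)
    print(f"{label} (P(end of turn) = {p:.3f})")
```

A higher threshold trades recall on label 1 for fewer premature responses; the right value depends on the application and would need to be tuned on held-out data.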


📖 Citation

@misc{marbert_eou_2025,
  author = {Nihad Askri},
  title = {MARBERT Arabic End-of-Utterance Detection},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/nihad-ask/marbert-arabic-EOU-detection-model}}
}