Arabic End-of-Turn (EOU) Detection Model: MARBERT Fine-Tuned

This model is a fine-tuned MARBERT classifier that detects end-of-utterance (EOU) boundaries, also called end-of-turn boundaries, in Arabic dialogue.
Given a user message, it predicts whether the speaker will continue or has finished the turn.

  • Repository: nihad-ask/Arabert-EOU-detection-model
  • Task: Binary End-of-Utterance Classification
  • Language: Arabic (MSA + Saudi dialect)
  • Base Model: UBC-NLP/MARBERT

🚦 Task Definition

This is a binary classification task:

| Label | Meaning |
|-------|---------|
| 0 | Speaker will continue (NOT end of turn) |
| 1 | End of turn (EOU detected) |

This helps conversational agents determine if the user has finished typing or is likely to continue.
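
As a rough illustration of how an agent might act on these labels, the sketch below buffers incoming messages and only closes a turn once the classifier returns label 1. The names `group_into_turns` and `toy_classify` are hypothetical, the toy classifier is a punctuation-based stand-in for the real model, and classifying the joined buffer (rather than only the latest message) is an assumption, not something this model card specifies.

```python
from typing import Callable, Iterable, List

CONTINUE, END_OF_TURN = 0, 1  # label ids as defined in the task definition

def group_into_turns(messages: Iterable[str],
                     classify: Callable[[str], int]) -> List[str]:
    """Merge consecutive messages until the classifier reports an end of turn."""
    turns: List[str] = []
    buffer: List[str] = []
    for message in messages:
        buffer.append(message)
        # Assumption: the classifier sees everything buffered so far.
        if classify(" ".join(buffer)) == END_OF_TURN:
            turns.append(" ".join(buffer))
            buffer = []
    if buffer:  # trailing fragment that never reached an end of turn
        turns.append(" ".join(buffer))
    return turns

def toy_classify(text: str) -> int:
    """Hypothetical stand-in for the model: terminal punctuation means EOU."""
    ends = (".", "?", "!", "؟")
    return END_OF_TURN if text.rstrip().endswith(ends) else CONTINUE

print(group_into_turns(["I wanted to ask", "about my order.", "Thanks!"],
                       toy_classify))
```

In a real deployment, `classify` would wrap the fine-tuned model, and the agent would reply only after a turn is closed.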


📌 Use Cases

  • Conversational AI / Chatbots
  • Dialogue Systems
  • Turn-taking prediction
  • Speech-to-text segmentation
  • Customer support automation

📊 Evaluation

Balanced Validation Set

Accuracy: 0.9098

| Class | Precision | Recall | F1-score | Support |
|-------|-----------|--------|----------|---------|
| 0 – Continue | 0.9058 | 0.9148 | 0.9103 | 1702 |
| 1 – End of Turn | 0.9139 | 0.9048 | 0.9094 | 1702 |

Overall:

| Metric | Score |
|--------|-------|
| Accuracy | 0.9098 |
| Macro Avg F1 | 0.9098 |
| Weighted Avg F1 | 0.9098 |
| Total Samples | 3404 |
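
The overall scores can be cross-checked against the per-class numbers. The sketch below recomputes F1, macro F1, and accuracy from the reported precision, recall, and support. Because the inputs are already rounded to four decimals, the recomputed values may differ from the reported ones in the last digit. The dictionary layout and variable names are illustrative only.

```python
# Per-class precision, recall, and support as reported in the evaluation.
report = {
    "continue":    {"precision": 0.9058, "recall": 0.9148, "support": 1702},
    "end_of_turn": {"precision": 0.9139, "recall": 0.9048, "support": 1702},
}

def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

f1_scores = {name: f1(r["precision"], r["recall"]) for name, r in report.items()}

# Macro F1 is the unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1_scores.values()) / len(f1_scores)

# Accuracy = correct / total = sum(recall_c * support_c) / total support.
total = sum(r["support"] for r in report.values())
accuracy = sum(r["recall"] * r["support"] for r in report.values()) / total

print(f"total={total} accuracy={accuracy:.4f} macro_f1={macro_f1:.4f}")
```

With equal support per class, accuracy reduces to the mean of the two recalls, and the macro and weighted averages coincide, which matches the identical overall scores reported.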

🧪 How to Use

Python (PyTorch)

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "nihad-ask/marbert-EOU-detection-model"

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = "تمام و بعدين؟"  # "OK, and then?"

inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():  # gradients are not needed for inference
    outputs = model(**inputs)
prediction = torch.argmax(outputs.logits, dim=1).item()

# Label 1 = end of turn, label 0 = speaker will continue.
if prediction == 1:
    print("End of turn")
else:
    print("Speaker will continue")
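
If a confidence score is more useful than a hard label, the two logits can be turned into an end-of-turn probability with a softmax. The following is a minimal sketch in plain Python. The helper names `eou_probability` and `is_end_of_turn` and the threshold values are hypothetical and not part of the model or the `transformers` API.

```python
import math
from typing import Sequence

def eou_probability(logits: Sequence[float]) -> float:
    """Softmax probability of label 1 (end of turn) from the two raw logits."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    return exps[1] / sum(exps)

def is_end_of_turn(logits: Sequence[float], threshold: float = 0.5) -> bool:
    """Report an end of turn only when its probability reaches the threshold."""
    return eou_probability(logits) >= threshold

# Example with made-up logits in the order [continue, end of turn].
print(is_end_of_turn([-1.0, 2.0], threshold=0.9))
```

With the snippet above, the input would be `outputs.logits[0].tolist()`. A threshold of 0.5 behaves like the `argmax` decision for two classes, while a higher value makes the agent wait longer before treating a message as complete.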


@misc{marbert_eou_2025,
  author = {Nihad Askri},
  title = {MARBERT Arabic End-of-Utterance Detection},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/nihad-ask/marbert-arabic-EOU-detection-model}}
}