# Arabic End-of-Utterance (EOU) Classifier

## Overview
This repository contains a custom PyTorch model for **End-of-Utterance (EOU) detection** in Arabic conversational text.  
The model predicts whether a given text segment represents the end of a speaker’s turn.

This is a **custom architecture** (not a Hugging Face `AutoModel`) and is intended for research and development use.

---

## Task
Given an input text segment, the model outputs a binary prediction:

- `0` → The speaker is expected to continue speaking  
- `1` → The speaker has finished their turn  
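
The label convention above can be captured in a small lookup for downstream code. This is a minimal sketch: `EOU_LABELS` and `describe_prediction` are hypothetical names chosen for illustration and are not part of this repository.

```python
# Hypothetical helper; the names below are illustrative, not part of the repository.
EOU_LABELS = {
    0: "continue",     # the speaker is expected to keep speaking
    1: "end_of_turn",  # the speaker has finished their turn
}


def describe_prediction(label: int) -> str:
    """Return a readable name for a binary EOU label (0 or 1)."""
    if label not in EOU_LABELS:
        raise ValueError(f"Expected label 0 or 1, got {label!r}")
    return EOU_LABELS[label]
```

For example, `describe_prediction(1)` returns `"end_of_turn"`.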

---

## Model Details
- Framework: PyTorch
- Architecture: Custom `EOUClassifier`
- Task: Binary classification (EOU detection)
- Language: Arabic

---

## Tokenizer
This model uses the tokenizer from:

`Omartificial-Intelligence-Space/SA-BERT-V1`

The tokenizer is **not included** in this repository and must be loaded separately.

---

## Files
- `model.py` — Model architecture (`EOUClassifier`)
- `model.pt` — Trained model weights
- `config.json` — Model configuration
- `README.md` — This file

---

## Loading the Model
```python
import torch
from transformers import AutoTokenizer
from model import EOUClassifier

tokenizer = AutoTokenizer.from_pretrained(
    "Omartificial-Intelligence-Space/SA-BERT-V1"
)

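# Use a GPU when one is available, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = EOUClassifier()
model.load_state_dict(
    torch.load("model.pt", map_location="cpu")
)
model.to(device)
model.eval()  # switch to inference mode (disables dropout)

# Roughly: "What I mean by this is that" (unfinished) and
# "I hope you can help me" (finished).
examples = ["مقصدي من الموضوع انه", "اتمنى تقدر تساعدني"]

# Tokenize the batch and move its tensors to the same device as the model.
batch = tokenizer(examples, padding=True, truncation=True, return_tensors="pt")
batch = batch.to(device)

# Run inference without tracking gradients.
with torch.no_grad():
    out = model(batch["input_ids"], batch["attention_mask"])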
```
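
This README does not state the shape or meaning of `out`, so turning it into `0`/`1` labels depends on how `EOUClassifier` is defined in `model.py`. The sketch below is a hypothetical helper (`decode_eou_output` is not part of this repository) that covers two common conventions: two logits per example, or a single logit per example. Check `model.py` before relying on either assumption.

```python
from typing import List

import torch


def decode_eou_output(out: torch.Tensor, threshold: float = 0.5) -> List[int]:
    """Hypothetical helper: map raw model output to 0/1 EOU labels.

    Assumes `out` holds unnormalized logits, which this README does not confirm.
    """
    if out.dim() == 2 and out.size(-1) == 2:
        # Assumed case A: two logits per example, so pick the larger one.
        return out.argmax(dim=-1).tolist()
    # Assumed case B: one logit per example, so apply a sigmoid and threshold.
    probs = torch.sigmoid(out.reshape(-1))
    return (probs >= threshold).long().tolist()
```

If one of these assumptions holds, `decode_eou_output(out)` returns one label per input string, in the same order as `examples`.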

---

## License

MIT