A fine-tuned DistilBERT model for End-of-Utterance (EOU) Detection in conversational Hindi. This model identifies whether a Hindi dialogue phrase marks the end of a speaker's turn, making it suitable for voice assistants, dialogue systems, or turn-taking logic in chatbots.
Base model: distilbert-base-multilingual-cased
Labels:
- 1: End of Utterance (EOU)
- 0: Not End of Utterance (NOT_EOU)
This model was fine-tuned on the hindi-conversational-eou dataset, a balanced collection of 1000 Hindi conversational phrases labeled for end-of-turn detection.
Each example in the dataset is a short Hindi phrase labeled with:
- "text": The utterance string
- "label": 0 or 1 (as defined above)
(Note: These are example metrics; replace with your actual numbers if available.)
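This model is ideal for voice assistants, dialogue systems, and turn-taking logic in chatbots. A minimal usage example with the Transformers pipeline: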
from transformers import pipeline

# Load the fine-tuned EOU classifier from the Hugging Face Hub
classifier = pipeline("text-classification", model="yashsoni78/distilbert-hindi-eou-detector")

# Example phrases
examples = [
    "क्या तुम मेरे साथ चलोगे?",  # "Will you come with me?"
    "अगर हम वहाँ जाते तो",  # "If we went there, then..."
]

for text in examples:
    result = classifier(text)
    print(f"{text} => {result}")
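For turn-taking logic, the classifier output usually has to be reduced to a yes/no decision. The sketch below shows one way to do that, assuming each prediction is a dict with "label" and "score" keys, as the text-classification pipeline returns. The label strings in EOU_LABELS are an assumption: the actual names depend on the id2label mapping in this model's config, so check one real prediction first. The helper name and the threshold parameter are hypothetical and not part of the model.

```python
# Label strings treated as "end of utterance". These are assumptions:
# the real names come from the model config's id2label mapping.
EOU_LABELS = {"LABEL_1", "EOU", "1"}


def is_end_of_utterance(prediction, threshold=0.5):
    """Return True if a single pipeline prediction signals end of turn.

    prediction: dict with "label" and "score", e.g. classifier(text)[0].
    threshold: minimum score required to accept an EOU prediction
               (hypothetical tuning knob; raise it to interrupt less often).
    """
    label = str(prediction["label"]).upper()
    score = float(prediction["score"])
    return label in EOU_LABELS and score >= threshold


# Hand-written predictions in the pipeline's output format, for illustration only
print(is_end_of_utterance({"label": "LABEL_1", "score": 0.93}))
print(is_end_of_utterance({"label": "LABEL_0", "score": 0.93}))
```

With the classifier loaded as above, a call such as is_end_of_utterance(classifier(text)[0], threshold=0.8) would make a voice assistant wait for a more confident end-of-turn signal before responding.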
🔍 Limitations
- Trained on a small dataset (1000 examples); may not generalize to complex or domain-specific Hindi.
- Only binary EOU detection, no deeper semantic understanding.
- Assumes input is in colloquial conversational Hindi.
🧾 Citation
If you use this model in your research or application, please cite:
@misc{distilbert_hindi_eou_2025,
title = {distilbert-hindi-eou-detector},
author = {Yash Soni},
year = {2025},
howpublished = {\url{https://huggingface.co/yashsoni78/distilbert-hindi-eou-detector}},
note = {Fine-tuned model for Hindi end-of-utterance detection}
}
📄 License
This model is released under the MIT License. You are free to use, modify, and distribute it with attribution.
🙏 Acknowledgements
- Base model: distilbert-base-multilingual-cased
- Dataset: hindi-end-of-utterance-detection
- Created with the help of 🤗 Transformers
Base model
distilbert/distilbert-base-uncased