language: tr
license: other
license_name: siriusai-premium-v1
license_link: LICENSE
tags:
  - turkish
  - text-classification
  - bert
  - nlp
  - transformers
  - siriusai
  - production-ready
  - enterprise
base_model: dbmdz/bert-base-turkish-uncased
datasets:
  - custom
metrics:
  - f1
  - precision
  - recall
  - accuracy
  - mcc
library_name: transformers
pipeline_tag: text-classification
model-index:
  - name: turn-detector
    results:
      - task:
          type: text-classification
          name: Text Classification
        metrics:
          - type: f1
            value: 0.9924276856095726
            name: Macro F1
          - type: mcc
            value: 0.9848560799888242

turn-detector - Turkish Text Classification Model

This model classifies Turkish conversational text into two turn-taking categories: agent_response and backchannel.

Developed by SiriusAI Tech Brain Team

Mission

To enhance conversational AI by accurately detecting turn-taking dynamics in Turkish dialogues, enabling more natural and engaging interactions.

The turn-detector model classifies utterances in Turkish conversations into two categories: agent_response and backchannel. Voice assistants and dialogue systems need this distinction to tell a substantive reply from a brief acknowledgment. The model fine-tunes dbmdz/bert-base-turkish-uncased with a BertForSequenceClassification head and reaches 99.32% accuracy on its held-out test set.

Why This Model Matters

High Accuracy: 99.32% accuracy (Macro F1 0.9924, MCC 0.9849) on the 11,246-sample test set.
Enterprise-Grade Performance: Designed for production use, with inference in roughly 10-15 ms on GPU and 40-50 ms on CPU.
NLP Expertise: Built on BERTurk, a BERT model pretrained on Turkish text.
Scalable Solution: Easy to integrate into existing systems through the standard Transformers API.
Robust Training: Trained on 44,982 samples and evaluated on a separate 11,246-sample test split.

Model Overview

Property	Value
Architecture	BertForSequenceClassification
Base Model	`dbmdz/bert-base-turkish-uncased`
Task	Text Classification
Language	Turkish (tr)
Categories	2 labels
Model Size	~110M parameters
Inference Time	~10-15ms (GPU) / ~40-50ms (CPU)
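
The inference times above depend on hardware, input length, and batch size, so it is worth measuring them on the target machine. The following is a minimal sketch, assuming a hypothetical helper `measure_latency_ms` (not part of this model's code) that times any callable, such as the `predict` function from Quick Start. A stand-in `fake_predict` is used here so the sketch runs without downloading the model.

```python
import statistics
import time
from typing import Callable, Sequence


def measure_latency_ms(fn: Callable[[str], object],
                       texts: Sequence[str],
                       warmup: int = 3) -> dict:
    """Time fn(text) for each text and summarize the results in milliseconds.

    Hypothetical helper for illustration; pass the model's predict function as fn.
    """
    # Warm-up calls are not timed, so one-time costs (lazy init, caches) do not skew results.
    for text in texts[:warmup]:
        fn(text)

    timings = []
    for text in texts:
        start = time.perf_counter()
        fn(text)
        timings.append((time.perf_counter() - start) * 1000.0)

    return {
        "runs": len(timings),
        "median_ms": statistics.median(timings),
        "max_ms": max(timings),
    }


# Stand-in callable (an assumption for this sketch); replace with the real predict function.
def fake_predict(text: str) -> dict:
    return {"category": "backchannel", "confidence": 1.0}


stats = measure_latency_ms(fake_predict, ["Evet", "Anladım", "Merhaba, nasılsınız?"])
print(stats)
```

On a CUDA device, GPU work is asynchronous, so calling `torch.cuda.synchronize()` before reading the timer gives more trustworthy numbers.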

Performance Metrics

Final Evaluation Results

Metric	Score	Description
Macro F1	0.9924	Unweighted mean of the per-class F1 scores
MCC	0.9849	Matthews Correlation Coefficient (ranges from -1 to 1)
Accuracy	99.3242%	Ratio of correctly predicted instances to total instances

Per-Class Performance

Category	Accuracy (per-class recall)	Correct	Total
agent_response	99.5%	7,429	7,464
backchannel	98.9%	3,741	3,782
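
Because there are only two labels, the counts in the table above determine the full confusion matrix: every missed agent_response was predicted as backchannel, and vice versa. The sketch below recomputes the headline metrics from those counts as a consistency check. Treating backchannel as the positive class is a choice made here for illustration; Macro F1 and MCC do not depend on it.

```python
import math

# Counts taken from the Per-Class Performance table.
correct = {"agent_response": 7429, "backchannel": 3741}
total = {"agent_response": 7464, "backchannel": 3782}

# With two labels, a miss on one class is a prediction of the other class.
tp = correct["backchannel"]            # backchannel predicted as backchannel
fn = total["backchannel"] - tp         # backchannel predicted as agent_response
tn = correct["agent_response"]         # agent_response predicted as agent_response
fp = total["agent_response"] - tn      # agent_response predicted as backchannel

accuracy = (tp + tn) / (tp + tn + fp + fn)


def f1(true_pos: int, false_pos: int, false_neg: int) -> float:
    """F1 score for one class from its confusion-matrix counts."""
    return 2 * true_pos / (2 * true_pos + false_pos + false_neg)


f1_backchannel = f1(tp, fp, fn)
f1_agent = f1(tn, fn, fp)              # roles swap when agent_response is the positive class
macro_f1 = (f1_backchannel + f1_agent) / 2

mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

print(f"accuracy={accuracy:.4f} macro_f1={macro_f1:.4f} mcc={mcc:.4f}")
# accuracy=0.9932 macro_f1=0.9924 mcc=0.9849
```

The recomputed values agree with the reported Macro F1, MCC, and accuracy.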

Dataset

Dataset Statistics

Split	Samples	Purpose
Train	44,982	Model training
Test	11,246	Model evaluation
Total	56,228	Complete dataset
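
The split sizes above correspond to roughly an 80/20 train/test split. The short calculation below makes this explicit and also derives the class balance of the test set from the Per-Class Performance totals, which gives a majority-class baseline to compare the reported accuracy against. The baseline is a derived illustration, not a figure reported by the authors.

```python
# Split sizes from the Dataset Statistics table.
train_samples = 44_982
test_samples = 11_246
total_samples = train_samples + test_samples

test_fraction = test_samples / total_samples

# Test-set class totals from the Per-Class Performance table.
test_totals = {"agent_response": 7_464, "backchannel": 3_782}

# Accuracy of a trivial classifier that always predicts the most frequent class.
majority_baseline = max(test_totals.values()) / sum(test_totals.values())

print(f"total={total_samples} test_fraction={test_fraction:.3f} "
      f"majority_baseline={majority_baseline:.3f}")
```

Always predicting agent_response would score about 66% accuracy on this test set, which puts the reported 99.32% in context.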

Category Distribution

Category	Samples	Percentage	Description
turn_action	56,228	100.0%	Single top-level category; every sample carries one of its two subcategory labels

Subcategory Breakdown

Category	Subcategories
turn_action	agent_response, backchannel

Label Definitions

Label	ID	Description	Turkish Examples
agent_response	0	Represents a direct response from the agent in a conversation	"Merhaba, size nasıl yardımcı olabilirim?"
backchannel	1	Indicates acknowledgment or encouragement from the listener	"Evet", "Anladım"

Important: Category Boundaries

The distinction between agent_response and backchannel is critical. An agent_response represents a substantive reply to a query, while backchannel responses are brief acknowledgments that do not provide new information.

Training Procedure

Hyperparameters

Parameter	Value
Base Model	`dbmdz/bert-base-turkish-uncased`
Max Sequence Length	128 tokens
Batch Size	16
Learning Rate	2e-5
Epochs	3
Optimizer	AdamW
Weight Decay	0.01
Loss Function	CrossEntropyLoss / Focal Loss
Problem Type	Single-label classification (2 classes)
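
To make the scale of the training run concrete, the sketch below derives the number of optimizer steps from the hyperparameters above and the 44,982-sample training split. It assumes a single device, no gradient accumulation, and that the last partial batch of each epoch is kept; none of these are stated in this card. `TrainConfig` and `optimizer_steps` are hypothetical names used only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters from the table above (hypothetical container)."""
    max_seq_length: int = 128
    batch_size: int = 16
    learning_rate: float = 2e-5
    epochs: int = 3
    weight_decay: float = 0.01


TRAIN_SAMPLES = 44_982  # from the Dataset Statistics table


def optimizer_steps(cfg: TrainConfig, n_samples: int, grad_accum: int = 1) -> int:
    """Total optimizer updates, keeping the final partial batch of each epoch."""
    batches_per_epoch = math.ceil(n_samples / cfg.batch_size)
    steps_per_epoch = math.ceil(batches_per_epoch / grad_accum)
    return steps_per_epoch * cfg.epochs


cfg = TrainConfig()
print(optimizer_steps(cfg, TRAIN_SAMPLES))  # 2812 steps per epoch x 3 epochs = 8436
```

When training with the Transformers `Trainer`, these values would map to the `learning_rate`, `per_device_train_batch_size`, `num_train_epochs`, and `weight_decay` fields of `TrainingArguments`.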

Training Environment

Resource	Specification
Hardware	Apple Silicon (MPS) / CUDA GPU
Framework	PyTorch + Transformers
Training Time	Varies based on dataset size

Usage

Installation

pip install transformers torch

Quick Start

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "hayatiali/turn-detector"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

# Label order follows the model's class IDs: 0 = agent_response, 1 = backchannel.
LABELS = ["agent_response", "backchannel"]

def predict(text):
    # Truncate to the 128-token limit used during training.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        outputs = model(**inputs)
        # Softmax turns the two logits into probabilities that sum to 1.
        probs = torch.softmax(outputs.logits, dim=-1)[0]

    scores = {label: float(prob) for label, prob in zip(LABELS, probs)}
    primary = max(scores, key=scores.get)
    return {"category": primary, "confidence": scores[primary], "all_scores": scores}

# Examples
print(predict("Merhaba, nasılsınız?"))

Production Class

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification


class TurnDetectorClassifier:
    # Label order follows the model's class IDs: 0 = agent_response, 1 = backchannel.
    LABELS = ["agent_response", "backchannel"]

    def __init__(self, model_path="hayatiali/turn-detector"):
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.model = AutoModelForSequenceClassification.from_pretrained(model_path)
        # Use the GPU when available; otherwise fall back to CPU.
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.model.to(self.device).eval()

    def predict(self, text: str) -> dict:
        inputs = self.tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
        inputs = {k: v.to(self.device) for k, v in inputs.items()}

        with torch.no_grad():
            logits = self.model(**inputs).logits
            probs = torch.softmax(logits, dim=-1)[0].cpu().tolist()

        # tolist() yields plain Python floats, so the result is JSON-serializable.
        scores = dict(zip(self.LABELS, probs))
        category = max(scores, key=scores.get)
        return {"category": category, "confidence": scores[category], "scores": scores}

Batch Inference

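# Reuses torch, tokenizer, model, and LABELS from Quick Start.
def predict_batch(texts: list, batch_size: int = 32) -> list:
    # Run on whichever device the model already lives on.
    device = next(model.parameters()).device
    results = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]
        # padding=True pads each batch to its longest sequence.
        inputs = tokenizer(batch, return_tensors="pt", truncation=True, max_length=128, padding=True)
        inputs = {k: v.to(device) for k, v in inputs.items()}

        with torch.no_grad():
            probs = torch.softmax(model(**inputs).logits, dim=-1).cpu().tolist()

        # One {label: probability} dict per input text, in input order.
        for prob in probs:
            results.append(dict(zip(LABELS, prob)))
    return results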

Limitations & Known Issues

⚠️ Model Limitations

Limitation	Details	Impact
Dataset Bias	Model performance may vary on conversational data outside the training set.	Could lead to inaccuracies in specific domains.
Language Nuance	Captures standard Turkish but may struggle with dialects or highly informal speech.	Reduced accuracy in non-standard language use.
Context Understanding	Limited ability to understand context beyond single-turn interactions.	May misclassify responses that rely on previous context.

⚠️ Production Deployment Considerations

Consideration	Details	Recommendation
Model Size	At ~110M parameters, the model may be too large for resource-constrained environments.	Consider model distillation or quantization for constrained environments.
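
As a rough guide to what quantization could save, the sketch below estimates the storage needed for the weights alone at different numeric precisions, using the approximate 110M parameter count from this card. It is back-of-the-envelope arithmetic, not a measurement: activations, the tokenizer, and runtime overhead are excluded, and `weight_size_mb` is a hypothetical helper.

```python
# Approximate parameter count from the Model Overview table.
PARAMS = 110_000_000

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_size_mb(n_params: int, dtype: str) -> float:
    """Estimated weight storage in megabytes (1 MB = 1,000,000 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1_000_000


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_size_mb(PARAMS, dtype):.0f} MB")
# float32: ~440 MB
# float16: ~220 MB
# int8: ~110 MB
```

In practice an int8 model is usually larger than this estimate, since quantization schemes often leave some layers (for example embeddings) in higher precision.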

Not Suitable For

Real-time critical applications without human oversight.
Scenarios requiring high levels of contextual understanding across multiple turns.
Use cases in non-Turkish languages without adaptation.

Ethical Considerations

Intended Use

Conversational AI applications.
Voice assistants and chatbots.
Customer service automation.

Risks

Bias in Training Data: If the training data is biased, the model may perpetuate those biases in its predictions.
Misuse of Technology: The model could be used in ethically sensitive contexts, such as surveillance or deceptive practices.

Recommendations

Human Oversight: Always implement human oversight in applications that utilize the model.
Monitoring: Continuously monitor model outputs for unexpected or biased behavior.
Updates: Regularly update the model with new data to improve accuracy and mitigate biases.
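
One simple way to put the oversight and monitoring recommendations into practice is to flag low-confidence predictions for review instead of acting on them automatically. The sketch below assumes a score dictionary shaped like the one returned by the usage examples in this card. The `gate_prediction` helper and the 0.9 threshold are hypothetical; a suitable threshold should be chosen from validation data for the target application.

```python
def gate_prediction(scores: dict, threshold: float = 0.9) -> dict:
    """Pick the top label and flag it for review when confidence is below threshold.

    Hypothetical helper: `scores` maps each label to its probability.
    """
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    if not scores:
        raise ValueError("scores must not be empty")

    label = max(scores, key=scores.get)
    confidence = float(scores[label])
    return {
        "category": label,
        "confidence": confidence,
        "needs_review": confidence < threshold,
    }


# Made-up score dictionaries for illustration only.
confident = gate_prediction({"agent_response": 0.02, "backchannel": 0.98})
uncertain = gate_prediction({"agent_response": 0.55, "backchannel": 0.45})
print(confident["needs_review"], uncertain["needs_review"])  # False True
```

Logging the flagged cases also provides a stream of hard examples for monitoring and for future retraining.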

Technical Specifications

Model Architecture

BertForSequenceClassification(
  (bert): BertModel(
    (embeddings): BertEmbeddings
    (encoder): BertEncoder (12 layers)
    (pooler): BertPooler
  )
  (dropout): Dropout(p=0.1)
  (classifier): Linear(in_features=768, out_features=2)
)

Total Parameters: ~110M
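
The ~110M figure can be roughly reproduced from the architecture. The sketch below counts weights and biases layer by layer. The hidden size (768), layer count (12), vocabulary size (32k), and label count (2) come from this card; the intermediate size (3072), maximum positions (512), and token type count (2) are standard BERT-base values assumed here, not confirmed by the card.

```python
# Values stated in this model card.
VOCAB = 32_000
HIDDEN = 768
LAYERS = 12
NUM_LABELS = 2

# Standard BERT-base values, assumed for this estimate.
INTERMEDIATE = 3_072
MAX_POSITIONS = 512
TYPE_VOCAB = 2


def linear(n_in: int, n_out: int) -> int:
    """Parameters of a dense layer: weight matrix plus bias."""
    return n_in * n_out + n_out


def layer_norm(n: int) -> int:
    """Parameters of a LayerNorm: scale plus shift."""
    return 2 * n


# Word, position, and token-type embeddings followed by a LayerNorm.
embeddings = (VOCAB + MAX_POSITIONS + TYPE_VOCAB) * HIDDEN + layer_norm(HIDDEN)

# Query, key, value, and output projections plus the attention LayerNorm.
attention = 4 * linear(HIDDEN, HIDDEN) + layer_norm(HIDDEN)

# Two dense layers plus the output LayerNorm.
feed_forward = (
    linear(HIDDEN, INTERMEDIATE) + linear(INTERMEDIATE, HIDDEN) + layer_norm(HIDDEN)
)

encoder = LAYERS * (attention + feed_forward)
pooler = linear(HIDDEN, HIDDEN)
classifier = linear(HIDDEN, NUM_LABELS)

total_params = embeddings + encoder + pooler + classifier
print(f"{total_params:,}")  # 110,618,882
```

Under these assumptions the total comes to about 110.6M, consistent with the stated ~110M. The classification head itself contributes only 1,538 parameters.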

Input/Output

Input: Turkish text (max 128 tokens)
Output: 2 logits, one per label (agent_response, backchannel); applying softmax yields a 2-dimensional probability vector
Tokenizer: BERTurk WordPiece (32k vocab)
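
The step from raw model output to a labeled prediction can be illustrated without loading the model. The sketch below applies a numerically stable softmax to a pair of logits and maps the winning index to its label, using the label IDs from the Label Definitions table. The logit values are made up for illustration, and `softmax` and `decode` are hypothetical helper names.

```python
import math

# Label IDs from the Label Definitions table.
ID2LABEL = {0: "agent_response", 1: "backchannel"}


def softmax(logits: list) -> list:
    """Convert logits to probabilities; subtracting the max avoids overflow."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


def decode(logits: list) -> dict:
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"category": ID2LABEL[best], "confidence": probs[best], "probs": probs}


# Made-up logits for illustration; real values come from the model's output.
example_logits = [-2.0, 3.0]
result = decode(example_logits)
print(result["category"], round(result["confidence"], 4))
```

With two labels, softmax over the logit pair is equivalent to a sigmoid of their difference, so the confidence depends only on how far apart the two logits are.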

Citation

@misc{turn-detector-2025,
  title={turn-detector - Turkish Text Classification Model},
  author={SiriusAI Tech Brain Team},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/hayatiali/turn-detector}},
  note={Fine-tuned from dbmdz/bert-base-turkish-uncased}
}

Model Card Authors

SiriusAI Tech Brain Team

Contact

Email: info@siriusaitech.com
Repository: GitHub

Changelog

v1.0 (Current)

Initial release
2-category text classification
Macro F1: 0.9924, MCC: 0.9849

License: SiriusAI Tech Premium License v1.0

Commercial Use: Requires Premium License. Contact: info@siriusaitech.com

Free Use Allowed For:

Academic research and education
Non-profit organizations (with approval)
Evaluation (30 days)

Disclaimer: This model is designed for text classification applications. Always implement with appropriate safeguards and human oversight. Model predictions should inform decisions, not replace human judgment.