hayatiali committed · Commit 29ec639 · verified · 1 parent: df72c1f

Upload model via Fine-tune Assistant
README.md ADDED
@@ -0,0 +1,362 @@
---
language: tr
license: other
license_name: siriusai-premium-v1
license_link: LICENSE
tags:
- turkish
- text-classification
- bert
- nlp
- transformers
- siriusai
- production-ready
- enterprise
base_model: dbmdz/bert-base-turkish-uncased
datasets:
- custom
metrics:
- f1
- precision
- recall
- accuracy
- mcc
library_name: transformers
pipeline_tag: text-classification
model-index:
- name: turn-detector
  results:
  - task:
      type: text-classification
      name: Text Classification
    metrics:
    - type: f1
      value: 0.9924276856095726
      name: Macro F1
    - type: mcc
      value: 0.9848560799888242
---

# turn-detector - Turkish Text Classification Model

<p align="center">
  <a href="https://huggingface.co/hayatiali/turn-detector"><img src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-turn--detector-yellow" alt="Hugging Face"></a>
  <a href="https://huggingface.co/hayatiali/turn-detector"><img src="https://img.shields.io/badge/Model-Production%20Ready-brightgreen" alt="Production Ready"></a>
  <img src="https://img.shields.io/badge/Language-Turkish-blue" alt="Turkish">
  <img src="https://img.shields.io/badge/Task-Text%20Classification-orange" alt="Text Classification">
</p>

This model classifies Turkish text into turn-taking categories in a conversation.

*Developed by SiriusAI Tech Brain Team*

---

## Mission

> **To enhance conversational AI by accurately detecting turn-taking dynamics in Turkish dialogues, enabling more natural and engaging interactions.**

The `turn-detector` model classifies responses in Turkish conversations into two categories: **agent_response** and **backchannel**. This distinction is crucial for building voice assistants and dialogue systems that better understand human interaction. Built on the `BertForSequenceClassification` architecture, the model achieves high accuracy and reliability.

### Why This Model Matters

- **High Accuracy**: With over 99% accuracy, the model delivers reliable classifications in real-world applications.
- **Enterprise-Grade Performance**: Designed for production use, it meets the stringent requirements of enterprise clients.
- **NLP Expertise**: Built with state-of-the-art natural language processing techniques for understanding Turkish conversations.
- **Scalable Solution**: Integrates easily into existing systems for seamless deployment across applications.
- **Robust Training**: Trained on a substantial dataset, ensuring effectiveness across diverse conversational contexts.

---

## Model Overview

| Property | Value |
|----------|-------|
| **Architecture** | BertForSequenceClassification |
| **Base Model** | `dbmdz/bert-base-turkish-uncased` |
| **Task** | Text Classification |
| **Language** | Turkish (tr) |
| **Categories** | 2 labels |
| **Model Size** | ~110M parameters |
| **Inference Time** | ~10-15ms (GPU) / ~40-50ms (CPU) |

---

## Performance Metrics

### Final Evaluation Results

| Metric | Score | Description |
|--------|-------|-------------|
| **Macro F1** | **0.9924** | Unweighted mean of the per-class F1 scores (each the harmonic mean of precision and recall) |
| **MCC** | **0.9849** | Matthews Correlation Coefficient |
| **Accuracy** | **99.3242%** | Ratio of correctly predicted instances to total instances |

### Per-Class Performance

| Category | Accuracy | Correct | Total |
|----------|----------|---------|-------|
| **agent_response** | 99.5% | 7,429 | 7,464 |
| **backchannel** | 98.9% | 3,741 | 3,782 |

---

## Dataset

### Dataset Statistics

| Split | Samples | Purpose |
|-------|---------|---------|
| **Train** | 44,982 | Model training |
| **Test** | 11,246 | Model evaluation |
| **Total** | 56,228 | Complete dataset |

### Category Distribution

| Category | Samples | Percentage | Description |
|----------|---------|------------|-------------|
| **turn_action** | 56,228 | 100.0% | Parent category covering all samples |

### Subcategory Breakdown

| Category | Subcategories |
|----------|---------------|
| **turn_action** | agent_response, backchannel |

---

## Label Definitions

| Label | ID | Description | Turkish Examples |
|-------|-----|-------------|------------------|
| **agent_response** | 0 | A direct, substantive response from the agent in a conversation | "Merhaba, size nasıl yardımcı olabilirim?" |
| **backchannel** | 1 | A brief acknowledgment or encouragement from the listener | "Evet", "Anladım" |

### Important: Category Boundaries

The distinction between **agent_response** and **backchannel** is critical. An **agent_response** is a substantive reply to a query, while a **backchannel** is a brief acknowledgment that does not add new information.

---

## Training Procedure

### Hyperparameters

| Parameter | Value |
|-----------|-------|
| **Base Model** | `dbmdz/bert-base-turkish-uncased` |
| **Max Sequence Length** | 128 tokens |
| **Batch Size** | 16 |
| **Learning Rate** | 2e-5 |
| **Epochs** | 3 |
| **Optimizer** | AdamW |
| **Weight Decay** | 0.01 |
| **Loss Function** | CrossEntropyLoss / Focal Loss |
| **Problem Type** | Single-label / Multi-label Classification |

### Training Environment

| Resource | Specification |
|----------|---------------|
| **Hardware** | Apple Silicon (MPS) / CUDA GPU |
| **Framework** | PyTorch + Transformers |
| **Training Time** | Varies based on dataset size |

---

## Usage

### Installation

```bash
pip install transformers torch
```

### Quick Start

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "hayatiali/turn-detector"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

LABELS = ["agent_response", "backchannel"]

def predict(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        outputs = model(**inputs)
        probs = torch.softmax(outputs.logits, dim=-1)[0]

    scores = {label: float(prob) for label, prob in zip(LABELS, probs)}
    primary = max(scores, key=scores.get)
    return {"category": primary, "confidence": scores[primary], "all_scores": scores}

# Examples
print(predict("Merhaba, nasılsınız?"))
```

### Production Class

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

class TurnDetectorClassifier:
    LABELS = ["agent_response", "backchannel"]

    def __init__(self, model_path="hayatiali/turn-detector"):
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        self.model = AutoModelForSequenceClassification.from_pretrained(model_path)
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.model.to(self.device).eval()

    def predict(self, text: str) -> dict:
        inputs = self.tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
        inputs = {k: v.to(self.device) for k, v in inputs.items()}

        with torch.no_grad():
            logits = self.model(**inputs).logits
            probs = torch.softmax(logits, dim=-1)[0].cpu().numpy()

        scores = dict(zip(self.LABELS, probs))
        return {"category": max(scores, key=scores.get), "confidence": float(max(scores.values())), "scores": scores}
```

### Batch Inference

```python
# Reuses `tokenizer`, `model`, and `LABELS` from the Quick Start snippet above
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

def predict_batch(texts: list, batch_size: int = 32) -> list:
    results = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]
        inputs = tokenizer(batch, return_tensors="pt", truncation=True, max_length=128, padding=True)
        inputs = {k: v.to(device) for k, v in inputs.items()}

        with torch.no_grad():
            probs = torch.softmax(model(**inputs).logits, dim=-1).cpu().numpy()

        for prob in probs:
            scores = dict(zip(LABELS, prob))
            results.append(scores)
    return results
```

---

## Limitations & Known Issues

### ⚠️ Model Limitations

| Limitation | Details | Impact |
|------------|---------|--------|
| **Dataset Bias** | Model performance may vary on conversational data outside the training set. | Could lead to inaccuracies in specific domains. |
| **Language Nuance** | Captures standard Turkish but may struggle with dialects or highly informal speech. | Reduced accuracy in non-standard language use. |
| **Context Understanding** | Limited ability to understand context beyond single-turn interactions. | May misclassify responses that rely on previous context. |

### ⚠️ Production Deployment Considerations

| Consideration | Details | Recommendation |
|---------------|---------|----------------|
| **Model Size** | Large model size may impact deployment on limited-resource environments. | Consider model distillation or quantization for constrained environments. |

### Not Suitable For

- Real-time critical applications without human oversight.
- Scenarios requiring high levels of contextual understanding across multiple turns.
- Use cases in non-Turkish languages without adaptation.

---

## Ethical Considerations

### Intended Use

- Conversational AI applications.
- Voice assistants and chatbots.
- Customer service automation.

### Risks

- **Bias in Training Data**: If the training data is biased, the model may perpetuate those biases in its predictions.
- **Misuse of Technology**: Potential for the model to be used in contexts that require ethical considerations, such as surveillance or deceptive practices.

### Recommendations

1. **Human Oversight**: Always implement human oversight in applications that utilize the model.
2. **Monitoring**: Continuously monitor model outputs for unexpected or biased behavior.
3. **Updates**: Regularly update the model with new data to improve accuracy and mitigate biases.

---

## Technical Specifications

### Model Architecture

```
BertForSequenceClassification(
  (bert): BertModel(
    (embeddings): BertEmbeddings
    (encoder): BertEncoder (12 layers)
    (pooler): BertPooler
  )
  (dropout): Dropout(p=0.1)
  (classifier): Linear(in_features=768, out_features=2)
)

Total Parameters: ~110M
```

### Input/Output

- **Input**: Turkish text (max 128 tokens)
- **Output**: 2-dimensional probability vector
- **Tokenizer**: BERTurk WordPiece (32k vocab)

---

## Citation

```bibtex
@misc{turn-detector-2025,
  title={turn-detector - Turkish Text Classification Model},
  author={SiriusAI Tech Brain Team},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/hayatiali/turn-detector}},
  note={Fine-tuned from dbmdz/bert-base-turkish-uncased}
}
```

---

## Model Card Authors

**SiriusAI Tech Brain Team**

## Contact

- **Email**: info@siriusaitech.com
- **Repository**: [GitHub](https://github.com/sirius-tedarik)

---

## Changelog

### v1.0 (Current)
- Initial release
- 2-category text classification
- Macro F1: 0.9924, MCC: 0.9849

---

**License**: SiriusAI Tech Premium License v1.0

**Commercial Use**: Requires Premium License. Contact: info@siriusaitech.com

**Free Use Allowed For**:
- Academic research and education
- Non-profit organizations (with approval)
- Evaluation (30 days)

**Disclaimer**: This model is designed for text classification applications. Always implement with appropriate safeguards and human oversight. Model predictions should inform decisions, not replace human judgment.
benchmark/adversarial_samples.csv ADDED
@@ -0,0 +1,81 @@
text,expected_label,predicted_label,difficulty,confidence,is_correct
"Ömer, nasıl yardımcı olabilirim?",agent_response,agent_response,baseline,0.8812354207038879,True
"Merhaba, hangi konuda yardım edebilirim?",agent_response,agent_response,baseline,0.9911393523216248,True
"Tabii ki, size bununla ilgili bilgi verebilirim.",agent_response,agent_response,baseline,0.5156869292259216,True
"Elbette, bu konuda size destek olacağım.",agent_response,backchannel,baseline,0.5563545823097229,False
"Anladım, hemen kontrol ediyorum.",agent_response,backchannel,baseline,0.5459677577018738,False
"Lütfen bekleyin, birazdan yanıt vereceğim.",agent_response,agent_response,baseline,0.8637070655822754,True
Bu konuda yardımcı olmaktan memnuniyet duyarım.,agent_response,agent_response,baseline,0.6278860569000244,True
Hemen sizin için araştırıyorum.,agent_response,agent_response,baseline,0.7357267737388611,True
"Endişelenmeyin, bu konuyu halledeceğiz.",agent_response,agent_response,baseline,0.6491527557373047,True
"Herhangi başka bir sorunuz varsa, sormaktan çekinmeyin.",agent_response,agent_response,baseline,0.9041098952293396,True
totes agree lol,agent_response,backchannel,length_noise,0.9879427552223206,False
yup yup yup yup yup,agent_response,backchannel,length_noise,0.988431453704834,False
"OMG cant believe u did that, like seriously, i mean come on, its just too much, you know what i mean? cuz if you dont then idk what to say, like seriously",agent_response,agent_response,length_noise,0.909318745136261,True
nah bro,agent_response,backchannel,length_noise,0.9873980283737183,False
yasss that's wassup,agent_response,backchannel,length_noise,0.974721372127533,False
okay okay okay i get it already no need to repeat urself over and over again like i'm not deaf or whatever,agent_response,agent_response,length_noise,0.9450967907905579,True
omg thts crazee,agent_response,backchannel,length_noise,0.9885514974594116,False
u r kidding right?,agent_response,backchannel,length_noise,0.9817968606948853,False
"wow just wow, i mean, wow! i never thought that this would happen, like ever, not in a million years, and yet here we are, unbelievable, just totally unbelievable, you feel me?",agent_response,agent_response,length_noise,0.8823995590209961,True
hah lol whatevs,agent_response,backchannel,length_noise,0.9895368814468384,False
"Ah, anlıyorum. Devam edebilir misiniz?",agent_response,agent_response,semantic_overlap,0.8250168561935425,True
"Hmm, bunu biraz daha açabilir misiniz?",agent_response,agent_response,semantic_overlap,0.745111882686615,True
"Evet, bu gerçekten ilginç. Daha fazla bilgi verebilir misiniz?",agent_response,agent_response,semantic_overlap,0.9849535226821899,True
Bu konuda düşündüğünüz başka bir şey var mı?,agent_response,agent_response,semantic_overlap,0.9519035220146179,True
"Hımm, pekala. Başka bir açıdan bakacak olursak?",agent_response,backchannel,semantic_overlap,0.903683066368103,False
"Evet, kesinlikle. Peki başka hangi yönlerini ele alabiliriz?",agent_response,agent_response,semantic_overlap,0.9927364587783813,True
"Tamam, peki buna ek olarak ne söyleyebilirsiniz?",agent_response,agent_response,semantic_overlap,0.9534065127372742,True
"Anladım, devam etmek ister misiniz?",agent_response,agent_response,semantic_overlap,0.974102795124054,True
"Evet, peki başka bir detaya dikkat çekmek ister misiniz?",agent_response,agent_response,semantic_overlap,0.9879535436630249,True
"Hmm, çok iyi bir nokta. Bunu biraz daha açar mısınız?",agent_response,agent_response,semantic_overlap,0.9757851362228394,True
"Oh great, another software update that will surely make everything run faster, just like last time.",agent_response,agent_response,edge_cases,0.895721971988678,True
"I'm sure the server downtime at exactly 5 PM on a Friday was purely coincidental, and not at all inconvenient.",agent_response,agent_response,edge_cases,0.849263072013855,True
"Yeah, because deleting the database with a single command is exactly what everyone wants, right?",agent_response,agent_response,edge_cases,0.7744247317314148,True
"I just love it when my AI assistant corrects me even when I'm right, it's like having a personal grammar teacher.",agent_response,agent_response,edge_cases,0.5396984815597534,True
"No, I absolutely don't need any more disk space. Who needs to store files anyway?",agent_response,agent_response,edge_cases,0.9811112284660339,True
"Sure, let's implement the new feature without any testing. What could possibly go wrong?",agent_response,agent_response,edge_cases,0.9612233638763428,True
"Oh, another meeting about meetings? This is exactly why I got into tech.",agent_response,agent_response,edge_cases,0.9544288516044617,True
I'm really looking forward to debugging this code at 2 AM again. It's the highlight of my week.,agent_response,agent_response,edge_cases,0.8809834122657776,True
The best part of working with AI is when it confidently gives you the wrong answer.,agent_response,agent_response,edge_cases,0.8558328151702881,True
"Of course, let’s deploy the untested code on a Friday evening, I have nothing better to do.",agent_response,agent_response,edge_cases,0.7736720442771912,True
"Evet, seni anlıyorum.",backchannel,backchannel,baseline,0.8567759990692139,True
"Hmm, ilginç.",backchannel,backchannel,baseline,0.985055685043335,True
"Evet, devam et.",backchannel,backchannel,baseline,0.8956389427185059,True
Gerçekten mi?,backchannel,backchannel,baseline,0.9868144989013672,True
"Tamam, bu mantıklı.",backchannel,backchannel,baseline,0.7614496946334839,True
Anladım.,backchannel,backchannel,baseline,0.9884626269340515,True
"Evet, bu doğru.",backchannel,backchannel,baseline,0.8082573413848877,True
"Ah, şimdi anlıyorum.",backchannel,backchannel,baseline,0.9578026533126831,True
Bu ilginç bir nokta.,backchannel,backchannel,baseline,0.6748051643371582,True
"Evet, buna katılıyorum.",backchannel,backchannel,baseline,0.8088875412940979,True
yaaaa broooo,backchannel,backchannel,length_noise,0.9909811615943909,True
huh? r u srz??,backchannel,backchannel,length_noise,0.9855925440788269,True
OMG this is like the most amazing thing ever I mean I can't even begin to explain how incredible this whole situation is because it's just that awesome you know what I mean like seriously wow just wow ok???,backchannel,agent_response,length_noise,0.7402034997940063,False
idk wat u mean,backchannel,backchannel,length_noise,0.9897193908691406,True
sure sure sure sure sure,backchannel,backchannel,length_noise,0.9763302206993103,True
omg totally 100% agree with you on that one no doubt about it in fact I was just thinking the same thing the other day and it's crazy how we're like on the same wavelength all the time isn't it?,backchannel,agent_response,length_noise,0.9482101798057556,False
no wayyyy,backchannel,backchannel,length_noise,0.991447925567627,True
"heyyy, u ther?",backchannel,backchannel,length_noise,0.990699827671051,True
wow cant believe it happened like that i mean who would have thought that everything would turn out this way after all the planning we did it just goes to show that sometimes things have a way of working out on their own despite all the odds and challenges we faced right from the start,backchannel,agent_response,length_noise,0.971515953540802,False
kk thx bye,backchannel,backchannel,length_noise,0.99072265625,True
"Hmm, ilginç bir nokta.",backchannel,backchannel,semantic_overlap,0.9318225979804993,True
"Anladım, peki ya sonra?",backchannel,backchannel,semantic_overlap,0.9160572290420532,True
"Hmm, o konuda biraz daha bilgi verir misin?",backchannel,agent_response,semantic_overlap,0.7073332667350769,False
Gerçekten mi? Daha fazla duymak isterim.,backchannel,backchannel,semantic_overlap,0.7160800099372864,True
"Bu mantıklı, başka neler oldu?",backchannel,agent_response,semantic_overlap,0.812181293964386,False
"Hmm, bunu daha önce duymamıştım.",backchannel,backchannel,semantic_overlap,0.8978190422058105,True
"Bir dakika, bunu doğru mu anlıyorum?",backchannel,agent_response,semantic_overlap,0.7696111798286438,False
"Peki, sonra ne yaptılar?",backchannel,backchannel,semantic_overlap,0.6477120518684387,True
Gerçekten mi? Bu beni düşündürdü.,backchannel,backchannel,semantic_overlap,0.9161955714225769,True
"İlginç, devam et lütfen.",backchannel,backchannel,semantic_overlap,0.7439655661582947,True
"Evet evet, tabii ki de tebrik ederim, dünya harikası bir iş çıkardın (!)",backchannel,agent_response,edge_cases,0.877510666847229,False
"Çok güzel, bu kadar net bir çözüm bulduğunu(!) hiç düşünmemiştim doğrusu.",backchannel,agent_response,edge_cases,0.9826798439025879,False
"Ah, tabii ki! Çünkü herkes daima müşteri hizmetlerinin ne kadar hızlı olduğunu söyler (!)",backchannel,agent_response,edge_cases,0.5608082413673401,False
"Eğer bu kadar 'yaratıcı' bir fikir daha duyar mıyım diye düşünüyordum, teşekkürler!",backchannel,agent_response,edge_cases,0.9653686881065369,False
Bir işin en iyi nasıl yapılmaması gerektiğini görmek için harika (!) bir örnekti.,backchannel,agent_response,edge_cases,0.9630967378616333,False
"Evet, kesinlikle bugünkü toplantıda hiçbir şey anlaşılmadı diyemem.",backchannel,agent_response,edge_cases,0.8947334289550781,False
"Harika, seninki gibi bir çözüm sayesinde sorunlarımız iki katına çıkacak (!)",backchannel,agent_response,edge_cases,0.8286226987838745,False
"Tabii ki de, Türk çayı yurt dışında sudan bile ucuzdur (!).",backchannel,backchannel,edge_cases,0.8883209228515625,True
"Bu kadar ‘detaylı’ bir analiz için üç cümle yeterli oldu, harikasın!",backchannel,agent_response,edge_cases,0.610821545124054,False
"Elbette, herkesin sabırsızlıkla beklediği o 'harika' PowerPoint sunumunu bir daha görelim.",backchannel,agent_response,edge_cases,0.9590893387794495,False
benchmark/benchmark_results.json ADDED
@@ -0,0 +1,873 @@
{
  "model_name": "turn-detector",
  "generated_at": "2025-12-14T21:37:22.477273",
  "difficulty_results": {
    "baseline": {
      "total": 20,
      "correct": 18,
      "accuracy": 0.9
    },
    "length_noise": {
      "total": 20,
      "correct": 10,
      "accuracy": 0.5
    },
    "semantic_overlap": {
      "total": 20,
      "correct": 16,
      "accuracy": 0.8
    },
    "edge_cases": {
      "total": 20,
      "correct": 11,
      "accuracy": 0.55
    }
  },
  "overall_accuracy": 0.6875,
  "total_samples": 80,
  "correct_samples": 55,
  "samples": [
    {
      "text": "Ömer, nasıl yardımcı olabilirim?",
      "expected_label": "agent_response",
      "difficulty": "baseline",
      "predicted_label": "agent_response",
      "confidence": 0.8812354207038879,
      "is_correct": true
    },
    {
      "text": "Merhaba, hangi konuda yardım edebilirim?",
      "expected_label": "agent_response",
      "difficulty": "baseline",
      "predicted_label": "agent_response",
      "confidence": 0.9911393523216248,
      "is_correct": true
    },
    {
      "text": "Tabii ki, size bununla ilgili bilgi verebilirim.",
      "expected_label": "agent_response",
      "difficulty": "baseline",
      "predicted_label": "agent_response",
      "confidence": 0.5156869292259216,
      "is_correct": true
    },
    {
      "text": "Elbette, bu konuda size destek olacağım.",
      "expected_label": "agent_response",
      "difficulty": "baseline",
      "predicted_label": "backchannel",
      "confidence": 0.5563545823097229,
      "is_correct": false
    },
    {
      "text": "Anladım, hemen kontrol ediyorum.",
      "expected_label": "agent_response",
      "difficulty": "baseline",
      "predicted_label": "backchannel",
      "confidence": 0.5459677577018738,
      "is_correct": false
    },
    {
      "text": "Lütfen bekleyin, birazdan yanıt vereceğim.",
      "expected_label": "agent_response",
      "difficulty": "baseline",
      "predicted_label": "agent_response",
      "confidence": 0.8637070655822754,
      "is_correct": true
    },
    {
      "text": "Bu konuda yardımcı olmaktan memnuniyet duyarım.",
      "expected_label": "agent_response",
      "difficulty": "baseline",
      "predicted_label": "agent_response",
      "confidence": 0.6278860569000244,
      "is_correct": true
    },
    {
      "text": "Hemen sizin için araştırıyorum.",
      "expected_label": "agent_response",
      "difficulty": "baseline",
      "predicted_label": "agent_response",
      "confidence": 0.7357267737388611,
      "is_correct": true
    },
    {
      "text": "Endişelenmeyin, bu konuyu halledeceğiz.",
      "expected_label": "agent_response",
      "difficulty": "baseline",
      "predicted_label": "agent_response",
      "confidence": 0.6491527557373047,
      "is_correct": true
    },
    {
      "text": "Herhangi başka bir sorunuz varsa, sormaktan çekinmeyin.",
      "expected_label": "agent_response",
      "difficulty": "baseline",
      "predicted_label": "agent_response",
      "confidence": 0.9041098952293396,
      "is_correct": true
    },
    {
      "text": "totes agree lol",
      "expected_label": "agent_response",
      "difficulty": "length_noise",
      "predicted_label": "backchannel",
      "confidence": 0.9879427552223206,
      "is_correct": false
    },
    {
      "text": "yup yup yup yup yup",
      "expected_label": "agent_response",
      "difficulty": "length_noise",
      "predicted_label": "backchannel",
      "confidence": 0.988431453704834,
      "is_correct": false
    },
    {
      "text": "OMG cant believe u did that, like seriously, i mean come on, its just too much, you know what i mean? cuz if you dont then idk what to say, like seriously",
      "expected_label": "agent_response",
      "difficulty": "length_noise",
      "predicted_label": "agent_response",
      "confidence": 0.909318745136261,
      "is_correct": true
    },
    {
      "text": "nah bro",
      "expected_label": "agent_response",
      "difficulty": "length_noise",
      "predicted_label": "backchannel",
      "confidence": 0.9873980283737183,
      "is_correct": false
    },
    {
      "text": "yasss that's wassup",
      "expected_label": "agent_response",
      "difficulty": "length_noise",
      "predicted_label": "backchannel",
      "confidence": 0.974721372127533,
      "is_correct": false
    },
    {
      "text": "okay okay okay i get it already no need to repeat urself over and over again like i'm not deaf or whatever",
      "expected_label": "agent_response",
      "difficulty": "length_noise",
      "predicted_label": "agent_response",
      "confidence": 0.9450967907905579,
      "is_correct": true
    },
    {
      "text": "omg thts crazee",
      "expected_label": "agent_response",
      "difficulty": "length_noise",
      "predicted_label": "backchannel",
      "confidence": 0.9885514974594116,
      "is_correct": false
    },
    {
      "text": "u r kidding right?",
      "expected_label": "agent_response",
      "difficulty": "length_noise",
      "predicted_label": "backchannel",
      "confidence": 0.9817968606948853,
      "is_correct": false
    },
    {
      "text": "wow just wow, i mean, wow! i never thought that this would happen, like ever, not in a million years, and yet here we are, unbelievable, just totally unbelievable, you feel me?",
      "expected_label": "agent_response",
      "difficulty": "length_noise",
      "predicted_label": "agent_response",
      "confidence": 0.8823995590209961,
      "is_correct": true
    },
    {
      "text": "hah lol whatevs",
      "expected_label": "agent_response",
      "difficulty": "length_noise",
      "predicted_label": "backchannel",
      "confidence": 0.9895368814468384,
      "is_correct": false
    },
    {
      "text": "Ah, anlıyorum. Devam edebilir misiniz?",
      "expected_label": "agent_response",
      "difficulty": "semantic_overlap",
      "predicted_label": "agent_response",
      "confidence": 0.8250168561935425,
      "is_correct": true
    },
    {
      "text": "Hmm, bunu biraz daha açabilir misiniz?",
      "expected_label": "agent_response",
      "difficulty": "semantic_overlap",
      "predicted_label": "agent_response",
      "confidence": 0.745111882686615,
      "is_correct": true
    },
    {
      "text": "Evet, bu gerçekten ilginç. Daha fazla bilgi verebilir misiniz?",
      "expected_label": "agent_response",
      "difficulty": "semantic_overlap",
      "predicted_label": "agent_response",
      "confidence": 0.9849535226821899,
      "is_correct": true
    },
    {
      "text": "Bu konuda düşündüğünüz başka bir şey var mı?",
      "expected_label": "agent_response",
      "difficulty": "semantic_overlap",
      "predicted_label": "agent_response",
      "confidence": 0.9519035220146179,
      "is_correct": true
    },
    {
      "text": "Hımm, pekala. Başka bir açıdan bakacak olursak?",
      "expected_label": "agent_response",
      "difficulty": "semantic_overlap",
      "predicted_label": "backchannel",
      "confidence": 0.903683066368103,
      "is_correct": false
    },
    {
      "text": "Evet, kesinlikle. Peki başka hangi yönlerini ele alabiliriz?",
      "expected_label": "agent_response",
      "difficulty": "semantic_overlap",
      "predicted_label": "agent_response",
      "confidence": 0.9927364587783813,
      "is_correct": true
    },
    {
      "text": "Tamam, peki buna ek olarak ne söyleyebilirsiniz?",
      "expected_label": "agent_response",
      "difficulty": "semantic_overlap",
      "predicted_label": "agent_response",
      "confidence": 0.9534065127372742,
      "is_correct": true
    },
    {
      "text": "Anladım, devam etmek ister misiniz?",
      "expected_label": "agent_response",
      "difficulty": "semantic_overlap",
      "predicted_label": "agent_response",
      "confidence": 0.974102795124054,
      "is_correct": true
    },
    {
      "text": "Evet, peki başka bir detaya dikkat çekmek ister misiniz?",
256
+ "expected_label": "agent_response",
257
+ "difficulty": "semantic_overlap",
258
+ "predicted_label": "agent_response",
259
+ "confidence": 0.9879535436630249,
260
+ "is_correct": true
261
+ },
262
+ {
263
+ "text": "Hmm, çok iyi bir nokta. Bunu biraz daha açar mısınız?",
264
+ "expected_label": "agent_response",
265
+ "difficulty": "semantic_overlap",
266
+ "predicted_label": "agent_response",
267
+ "confidence": 0.9757851362228394,
268
+ "is_correct": true
269
+ },
270
+ {
271
+ "text": "Oh great, another software update that will surely make everything run faster, just like last time.",
272
+ "expected_label": "agent_response",
273
+ "difficulty": "edge_cases",
274
+ "predicted_label": "agent_response",
275
+ "confidence": 0.895721971988678,
276
+ "is_correct": true
277
+ },
278
+ {
279
+ "text": "I'm sure the server downtime at exactly 5 PM on a Friday was purely coincidental, and not at all inconvenient.",
280
+ "expected_label": "agent_response",
281
+ "difficulty": "edge_cases",
282
+ "predicted_label": "agent_response",
283
+ "confidence": 0.849263072013855,
284
+ "is_correct": true
285
+ },
286
+ {
287
+ "text": "Yeah, because deleting the database with a single command is exactly what everyone wants, right?",
288
+ "expected_label": "agent_response",
289
+ "difficulty": "edge_cases",
290
+ "predicted_label": "agent_response",
291
+ "confidence": 0.7744247317314148,
292
+ "is_correct": true
293
+ },
294
+ {
295
+ "text": "I just love it when my AI assistant corrects me even when I'm right, it's like having a personal grammar teacher.",
296
+ "expected_label": "agent_response",
297
+ "difficulty": "edge_cases",
298
+ "predicted_label": "agent_response",
299
+ "confidence": 0.5396984815597534,
300
+ "is_correct": true
301
+ },
302
+ {
303
+ "text": "No, I absolutely don't need any more disk space. Who needs to store files anyway?",
304
+ "expected_label": "agent_response",
305
+ "difficulty": "edge_cases",
306
+ "predicted_label": "agent_response",
307
+ "confidence": 0.9811112284660339,
308
+ "is_correct": true
309
+ },
310
+ {
311
+ "text": "Sure, let's implement the new feature without any testing. What could possibly go wrong?",
312
+ "expected_label": "agent_response",
313
+ "difficulty": "edge_cases",
314
+ "predicted_label": "agent_response",
315
+ "confidence": 0.9612233638763428,
316
+ "is_correct": true
317
+ },
318
+ {
319
+ "text": "Oh, another meeting about meetings? This is exactly why I got into tech.",
320
+ "expected_label": "agent_response",
321
+ "difficulty": "edge_cases",
322
+ "predicted_label": "agent_response",
323
+ "confidence": 0.9544288516044617,
324
+ "is_correct": true
325
+ },
326
+ {
327
+ "text": "I'm really looking forward to debugging this code at 2 AM again. It's the highlight of my week.",
328
+ "expected_label": "agent_response",
329
+ "difficulty": "edge_cases",
330
+ "predicted_label": "agent_response",
331
+ "confidence": 0.8809834122657776,
332
+ "is_correct": true
333
+ },
334
+ {
335
+ "text": "The best part of working with AI is when it confidently gives you the wrong answer.",
336
+ "expected_label": "agent_response",
337
+ "difficulty": "edge_cases",
338
+ "predicted_label": "agent_response",
339
+ "confidence": 0.8558328151702881,
340
+ "is_correct": true
341
+ },
342
+ {
343
+ "text": "Of course, let’s deploy the untested code on a Friday evening, I have nothing better to do.",
344
+ "expected_label": "agent_response",
345
+ "difficulty": "edge_cases",
346
+ "predicted_label": "agent_response",
347
+ "confidence": 0.7736720442771912,
348
+ "is_correct": true
349
+ },
350
+ {
351
+ "text": "Evet, seni anlıyorum.",
352
+ "expected_label": "backchannel",
353
+ "difficulty": "baseline",
354
+ "predicted_label": "backchannel",
355
+ "confidence": 0.8567759990692139,
356
+ "is_correct": true
357
+ },
358
+ {
359
+ "text": "Hmm, ilginç.",
360
+ "expected_label": "backchannel",
361
+ "difficulty": "baseline",
362
+ "predicted_label": "backchannel",
363
+ "confidence": 0.985055685043335,
364
+ "is_correct": true
365
+ },
366
+ {
367
+ "text": "Evet, devam et.",
368
+ "expected_label": "backchannel",
369
+ "difficulty": "baseline",
370
+ "predicted_label": "backchannel",
371
+ "confidence": 0.8956389427185059,
372
+ "is_correct": true
373
+ },
374
+ {
375
+ "text": "Gerçekten mi?",
376
+ "expected_label": "backchannel",
377
+ "difficulty": "baseline",
378
+ "predicted_label": "backchannel",
379
+ "confidence": 0.9868144989013672,
380
+ "is_correct": true
381
+ },
382
+ {
383
+ "text": "Tamam, bu mantıklı.",
384
+ "expected_label": "backchannel",
385
+ "difficulty": "baseline",
386
+ "predicted_label": "backchannel",
387
+ "confidence": 0.7614496946334839,
388
+ "is_correct": true
389
+ },
390
+ {
391
+ "text": "Anladım.",
392
+ "expected_label": "backchannel",
393
+ "difficulty": "baseline",
394
+ "predicted_label": "backchannel",
395
+ "confidence": 0.9884626269340515,
396
+ "is_correct": true
397
+ },
398
+ {
399
+ "text": "Evet, bu doğru.",
400
+ "expected_label": "backchannel",
401
+ "difficulty": "baseline",
402
+ "predicted_label": "backchannel",
403
+ "confidence": 0.8082573413848877,
404
+ "is_correct": true
405
+ },
406
+ {
407
+ "text": "Ah, şimdi anlıyorum.",
408
+ "expected_label": "backchannel",
409
+ "difficulty": "baseline",
410
+ "predicted_label": "backchannel",
411
+ "confidence": 0.9578026533126831,
412
+ "is_correct": true
413
+ },
414
+ {
415
+ "text": "Bu ilginç bir nokta.",
416
+ "expected_label": "backchannel",
417
+ "difficulty": "baseline",
418
+ "predicted_label": "backchannel",
419
+ "confidence": 0.6748051643371582,
420
+ "is_correct": true
421
+ },
422
+ {
423
+ "text": "Evet, buna katılıyorum.",
424
+ "expected_label": "backchannel",
425
+ "difficulty": "baseline",
426
+ "predicted_label": "backchannel",
427
+ "confidence": 0.8088875412940979,
428
+ "is_correct": true
429
+ },
430
+ {
431
+ "text": "yaaaa broooo",
432
+ "expected_label": "backchannel",
433
+ "difficulty": "length_noise",
434
+ "predicted_label": "backchannel",
435
+ "confidence": 0.9909811615943909,
436
+ "is_correct": true
437
+ },
438
+ {
439
+ "text": "huh? r u srz??",
440
+ "expected_label": "backchannel",
441
+ "difficulty": "length_noise",
442
+ "predicted_label": "backchannel",
443
+ "confidence": 0.9855925440788269,
444
+ "is_correct": true
445
+ },
446
+ {
447
+ "text": "OMG this is like the most amazing thing ever I mean I can't even begin to explain how incredible this whole situation is because it's just that awesome you know what I mean like seriously wow just wow ok???",
448
+ "expected_label": "backchannel",
449
+ "difficulty": "length_noise",
450
+ "predicted_label": "agent_response",
451
+ "confidence": 0.7402034997940063,
452
+ "is_correct": false
453
+ },
454
+ {
455
+ "text": "idk wat u mean",
456
+ "expected_label": "backchannel",
457
+ "difficulty": "length_noise",
458
+ "predicted_label": "backchannel",
459
+ "confidence": 0.9897193908691406,
460
+ "is_correct": true
461
+ },
462
+ {
463
+ "text": "sure sure sure sure sure",
464
+ "expected_label": "backchannel",
465
+ "difficulty": "length_noise",
466
+ "predicted_label": "backchannel",
467
+ "confidence": 0.9763302206993103,
468
+ "is_correct": true
469
+ },
470
+ {
471
+ "text": "omg totally 100% agree with you on that one no doubt about it in fact I was just thinking the same thing the other day and it's crazy how we're like on the same wavelength all the time isn't it?",
472
+ "expected_label": "backchannel",
473
+ "difficulty": "length_noise",
474
+ "predicted_label": "agent_response",
475
+ "confidence": 0.9482101798057556,
476
+ "is_correct": false
477
+ },
478
+ {
479
+ "text": "no wayyyy",
480
+ "expected_label": "backchannel",
481
+ "difficulty": "length_noise",
482
+ "predicted_label": "backchannel",
483
+ "confidence": 0.991447925567627,
484
+ "is_correct": true
485
+ },
486
+ {
487
+ "text": "heyyy, u ther?",
488
+ "expected_label": "backchannel",
489
+ "difficulty": "length_noise",
490
+ "predicted_label": "backchannel",
491
+ "confidence": 0.990699827671051,
492
+ "is_correct": true
493
+ },
494
+ {
495
+ "text": "wow cant believe it happened like that i mean who would have thought that everything would turn out this way after all the planning we did it just goes to show that sometimes things have a way of working out on their own despite all the odds and challenges we faced right from the start",
496
+ "expected_label": "backchannel",
497
+ "difficulty": "length_noise",
498
+ "predicted_label": "agent_response",
499
+ "confidence": 0.971515953540802,
500
+ "is_correct": false
501
+ },
502
+ {
503
+ "text": "kk thx bye",
504
+ "expected_label": "backchannel",
505
+ "difficulty": "length_noise",
506
+ "predicted_label": "backchannel",
507
+ "confidence": 0.99072265625,
508
+ "is_correct": true
509
+ },
510
+ {
511
+ "text": "Hmm, ilginç bir nokta.",
512
+ "expected_label": "backchannel",
513
+ "difficulty": "semantic_overlap",
514
+ "predicted_label": "backchannel",
515
+ "confidence": 0.9318225979804993,
516
+ "is_correct": true
517
+ },
518
+ {
519
+ "text": "Anladım, peki ya sonra?",
520
+ "expected_label": "backchannel",
521
+ "difficulty": "semantic_overlap",
522
+ "predicted_label": "backchannel",
523
+ "confidence": 0.9160572290420532,
524
+ "is_correct": true
525
+ },
526
+ {
527
+ "text": "Hmm, o konuda biraz daha bilgi verir misin?",
528
+ "expected_label": "backchannel",
529
+ "difficulty": "semantic_overlap",
530
+ "predicted_label": "agent_response",
531
+ "confidence": 0.7073332667350769,
532
+ "is_correct": false
533
+ },
534
+ {
535
+ "text": "Gerçekten mi? Daha fazla duymak isterim.",
536
+ "expected_label": "backchannel",
537
+ "difficulty": "semantic_overlap",
538
+ "predicted_label": "backchannel",
539
+ "confidence": 0.7160800099372864,
540
+ "is_correct": true
541
+ },
542
+ {
543
+ "text": "Bu mantıklı, başka neler oldu?",
544
+ "expected_label": "backchannel",
545
+ "difficulty": "semantic_overlap",
546
+ "predicted_label": "agent_response",
547
+ "confidence": 0.812181293964386,
548
+ "is_correct": false
549
+ },
550
+ {
551
+ "text": "Hmm, bunu daha önce duymamıştım.",
552
+ "expected_label": "backchannel",
553
+ "difficulty": "semantic_overlap",
554
+ "predicted_label": "backchannel",
555
+ "confidence": 0.8978190422058105,
556
+ "is_correct": true
557
+ },
558
+ {
559
+ "text": "Bir dakika, bunu doğru mu anlıyorum?",
560
+ "expected_label": "backchannel",
561
+ "difficulty": "semantic_overlap",
562
+ "predicted_label": "agent_response",
563
+ "confidence": 0.7696111798286438,
564
+ "is_correct": false
565
+ },
566
+ {
567
+ "text": "Peki, sonra ne yaptılar?",
568
+ "expected_label": "backchannel",
569
+ "difficulty": "semantic_overlap",
570
+ "predicted_label": "backchannel",
571
+ "confidence": 0.6477120518684387,
572
+ "is_correct": true
573
+ },
574
+ {
575
+ "text": "Gerçekten mi? Bu beni düşündürdü.",
576
+ "expected_label": "backchannel",
577
+ "difficulty": "semantic_overlap",
578
+ "predicted_label": "backchannel",
579
+ "confidence": 0.9161955714225769,
580
+ "is_correct": true
581
+ },
582
+ {
583
+ "text": "İlginç, devam et lütfen.",
584
+ "expected_label": "backchannel",
585
+ "difficulty": "semantic_overlap",
586
+ "predicted_label": "backchannel",
587
+ "confidence": 0.7439655661582947,
588
+ "is_correct": true
589
+ },
590
+ {
591
+ "text": "Evet evet, tabii ki de tebrik ederim, dünya harikası bir iş çıkardın (!)",
592
+ "expected_label": "backchannel",
593
+ "difficulty": "edge_cases",
594
+ "predicted_label": "agent_response",
595
+ "confidence": 0.877510666847229,
596
+ "is_correct": false
597
+ },
598
+ {
599
+ "text": "Çok güzel, bu kadar net bir çözüm bulduğunu(!) hiç düşünmemiştim doğrusu.",
600
+ "expected_label": "backchannel",
601
+ "difficulty": "edge_cases",
602
+ "predicted_label": "agent_response",
603
+ "confidence": 0.9826798439025879,
604
+ "is_correct": false
605
+ },
606
+ {
607
+ "text": "Ah, tabii ki! Çünkü herkes daima müşteri hizmetlerinin ne kadar hızlı olduğunu söyler (!)",
608
+ "expected_label": "backchannel",
609
+ "difficulty": "edge_cases",
610
+ "predicted_label": "agent_response",
611
+ "confidence": 0.5608082413673401,
612
+ "is_correct": false
613
+ },
614
+ {
615
+ "text": "Eğer bu kadar 'yaratıcı' bir fikir daha duyar mıyım diye düşünüyordum, teşekkürler!",
616
+ "expected_label": "backchannel",
617
+ "difficulty": "edge_cases",
618
+ "predicted_label": "agent_response",
619
+ "confidence": 0.9653686881065369,
620
+ "is_correct": false
621
+ },
622
+ {
623
+ "text": "Bir işin en iyi nasıl yapılmaması gerektiğini görmek için harika (!) bir örnekti.",
624
+ "expected_label": "backchannel",
625
+ "difficulty": "edge_cases",
626
+ "predicted_label": "agent_response",
627
+ "confidence": 0.9630967378616333,
628
+ "is_correct": false
629
+ },
630
+ {
631
+ "text": "Evet, kesinlikle bugünkü toplantıda hiçbir şey anlaşılmadı diyemem.",
632
+ "expected_label": "backchannel",
633
+ "difficulty": "edge_cases",
634
+ "predicted_label": "agent_response",
635
+ "confidence": 0.8947334289550781,
636
+ "is_correct": false
637
+ },
638
+ {
639
+ "text": "Harika, seninki gibi bir çözüm sayesinde sorunlarımız iki katına çıkacak (!)",
640
+ "expected_label": "backchannel",
641
+ "difficulty": "edge_cases",
642
+ "predicted_label": "agent_response",
643
+ "confidence": 0.8286226987838745,
644
+ "is_correct": false
645
+ },
646
+ {
647
+ "text": "Tabii ki de, Türk çayı yurt dışında sudan bile ucuzdur (!).",
648
+ "expected_label": "backchannel",
649
+ "difficulty": "edge_cases",
650
+ "predicted_label": "backchannel",
651
+ "confidence": 0.8883209228515625,
652
+ "is_correct": true
653
+ },
654
+ {
655
+ "text": "Bu kadar ‘detaylı’ bir analiz için üç cümle yeterli oldu, harikasın!",
656
+ "expected_label": "backchannel",
657
+ "difficulty": "edge_cases",
658
+ "predicted_label": "agent_response",
659
+ "confidence": 0.610821545124054,
660
+ "is_correct": false
661
+ },
662
+ {
663
+ "text": "Elbette, herkesin sabırsızlıkla beklediği o 'harika' PowerPoint sunumunu bir daha görelim.",
664
+ "expected_label": "backchannel",
665
+ "difficulty": "edge_cases",
666
+ "predicted_label": "agent_response",
667
+ "confidence": 0.9590893387794495,
668
+ "is_correct": false
669
+ }
670
+ ],
671
+ "misclassifications": [
672
+ {
673
+ "text": "Elbette, bu konuda size destek olacağım.",
674
+ "expected_label": "agent_response",
675
+ "difficulty": "baseline",
676
+ "predicted_label": "backchannel",
677
+ "confidence": 0.5563545823097229,
678
+ "is_correct": false
679
+ },
680
+ {
681
+ "text": "Anladım, hemen kontrol ediyorum.",
682
+ "expected_label": "agent_response",
683
+ "difficulty": "baseline",
684
+ "predicted_label": "backchannel",
685
+ "confidence": 0.5459677577018738,
686
+ "is_correct": false
687
+ },
688
+ {
689
+ "text": "totes agree lol",
690
+ "expected_label": "agent_response",
691
+ "difficulty": "length_noise",
692
+ "predicted_label": "backchannel",
693
+ "confidence": 0.9879427552223206,
694
+ "is_correct": false
695
+ },
696
+ {
697
+ "text": "yup yup yup yup yup",
698
+ "expected_label": "agent_response",
699
+ "difficulty": "length_noise",
700
+ "predicted_label": "backchannel",
701
+ "confidence": 0.988431453704834,
702
+ "is_correct": false
703
+ },
704
+ {
705
+ "text": "nah bro",
706
+ "expected_label": "agent_response",
707
+ "difficulty": "length_noise",
708
+ "predicted_label": "backchannel",
709
+ "confidence": 0.9873980283737183,
710
+ "is_correct": false
711
+ },
712
+ {
713
+ "text": "yasss that's wassup",
714
+ "expected_label": "agent_response",
715
+ "difficulty": "length_noise",
716
+ "predicted_label": "backchannel",
717
+ "confidence": 0.974721372127533,
718
+ "is_correct": false
719
+ },
720
+ {
721
+ "text": "omg thts crazee",
722
+ "expected_label": "agent_response",
723
+ "difficulty": "length_noise",
724
+ "predicted_label": "backchannel",
725
+ "confidence": 0.9885514974594116,
726
+ "is_correct": false
727
+ },
728
+ {
729
+ "text": "u r kidding right?",
730
+ "expected_label": "agent_response",
731
+ "difficulty": "length_noise",
732
+ "predicted_label": "backchannel",
733
+ "confidence": 0.9817968606948853,
734
+ "is_correct": false
735
+ },
736
+ {
737
+ "text": "hah lol whatevs",
738
+ "expected_label": "agent_response",
739
+ "difficulty": "length_noise",
740
+ "predicted_label": "backchannel",
741
+ "confidence": 0.9895368814468384,
742
+ "is_correct": false
743
+ },
744
+ {
745
+ "text": "Hımm, pekala. Başka bir açıdan bakacak olursak?",
746
+ "expected_label": "agent_response",
747
+ "difficulty": "semantic_overlap",
748
+ "predicted_label": "backchannel",
749
+ "confidence": 0.903683066368103,
750
+ "is_correct": false
751
+ },
752
+ {
753
+ "text": "OMG this is like the most amazing thing ever I mean I can't even begin to explain how incredible this whole situation is because it's just that awesome you know what I mean like seriously wow just wow ok???",
754
+ "expected_label": "backchannel",
755
+ "difficulty": "length_noise",
756
+ "predicted_label": "agent_response",
757
+ "confidence": 0.7402034997940063,
758
+ "is_correct": false
759
+ },
760
+ {
761
+ "text": "omg totally 100% agree with you on that one no doubt about it in fact I was just thinking the same thing the other day and it's crazy how we're like on the same wavelength all the time isn't it?",
762
+ "expected_label": "backchannel",
763
+ "difficulty": "length_noise",
764
+ "predicted_label": "agent_response",
765
+ "confidence": 0.9482101798057556,
766
+ "is_correct": false
767
+ },
768
+ {
769
+ "text": "wow cant believe it happened like that i mean who would have thought that everything would turn out this way after all the planning we did it just goes to show that sometimes things have a way of working out on their own despite all the odds and challenges we faced right from the start",
770
+ "expected_label": "backchannel",
771
+ "difficulty": "length_noise",
772
+ "predicted_label": "agent_response",
773
+ "confidence": 0.971515953540802,
774
+ "is_correct": false
775
+ },
776
+ {
777
+ "text": "Hmm, o konuda biraz daha bilgi verir misin?",
778
+ "expected_label": "backchannel",
779
+ "difficulty": "semantic_overlap",
780
+ "predicted_label": "agent_response",
781
+ "confidence": 0.7073332667350769,
782
+ "is_correct": false
783
+ },
784
+ {
785
+ "text": "Bu mantıklı, başka neler oldu?",
786
+ "expected_label": "backchannel",
787
+ "difficulty": "semantic_overlap",
788
+ "predicted_label": "agent_response",
789
+ "confidence": 0.812181293964386,
790
+ "is_correct": false
791
+ },
792
+ {
793
+ "text": "Bir dakika, bunu doğru mu anlıyorum?",
794
+ "expected_label": "backchannel",
795
+ "difficulty": "semantic_overlap",
796
+ "predicted_label": "agent_response",
797
+ "confidence": 0.7696111798286438,
798
+ "is_correct": false
799
+ },
800
+ {
801
+ "text": "Evet evet, tabii ki de tebrik ederim, dünya harikası bir iş çıkardın (!)",
802
+ "expected_label": "backchannel",
803
+ "difficulty": "edge_cases",
804
+ "predicted_label": "agent_response",
805
+ "confidence": 0.877510666847229,
806
+ "is_correct": false
807
+ },
808
+ {
809
+ "text": "Çok güzel, bu kadar net bir çözüm bulduğunu(!) hiç düşünmemiştim doğrusu.",
810
+ "expected_label": "backchannel",
811
+ "difficulty": "edge_cases",
812
+ "predicted_label": "agent_response",
813
+ "confidence": 0.9826798439025879,
814
+ "is_correct": false
815
+ },
816
+ {
817
+ "text": "Ah, tabii ki! Çünkü herkes daima müşteri hizmetlerinin ne kadar hızlı olduğunu söyler (!)",
818
+ "expected_label": "backchannel",
819
+ "difficulty": "edge_cases",
820
+ "predicted_label": "agent_response",
821
+ "confidence": 0.5608082413673401,
822
+ "is_correct": false
823
+ },
824
+ {
825
+ "text": "Eğer bu kadar 'yaratıcı' bir fikir daha duyar mıyım diye düşünüyordum, teşekkürler!",
826
+ "expected_label": "backchannel",
827
+ "difficulty": "edge_cases",
828
+ "predicted_label": "agent_response",
829
+ "confidence": 0.9653686881065369,
830
+ "is_correct": false
831
+ },
832
+ {
833
+ "text": "Bir işin en iyi nasıl yapılmaması gerektiğini görmek için harika (!) bir örnekti.",
834
+ "expected_label": "backchannel",
835
+ "difficulty": "edge_cases",
836
+ "predicted_label": "agent_response",
837
+ "confidence": 0.9630967378616333,
838
+ "is_correct": false
839
+ },
840
+ {
841
+ "text": "Evet, kesinlikle bugünkü toplantıda hiçbir şey anlaşılmadı diyemem.",
842
+ "expected_label": "backchannel",
843
+ "difficulty": "edge_cases",
844
+ "predicted_label": "agent_response",
845
+ "confidence": 0.8947334289550781,
846
+ "is_correct": false
847
+ },
848
+ {
849
+ "text": "Harika, seninki gibi bir çözüm sayesinde sorunlarımız iki katına çıkacak (!)",
850
+ "expected_label": "backchannel",
851
+ "difficulty": "edge_cases",
852
+ "predicted_label": "agent_response",
853
+ "confidence": 0.8286226987838745,
854
+ "is_correct": false
855
+ },
856
+ {
857
+ "text": "Bu kadar ‘detaylı’ bir analiz için üç cümle yeterli oldu, harikasın!",
858
+ "expected_label": "backchannel",
859
+ "difficulty": "edge_cases",
860
+ "predicted_label": "agent_response",
861
+ "confidence": 0.610821545124054,
862
+ "is_correct": false
863
+ },
864
+ {
865
+ "text": "Elbette, herkesin sabırsızlıkla beklediği o 'harika' PowerPoint sunumunu bir daha görelim.",
866
+ "expected_label": "backchannel",
867
+ "difficulty": "edge_cases",
868
+ "predicted_label": "agent_response",
869
+ "confidence": 0.9590893387794495,
870
+ "is_correct": false
871
+ }
872
+ ]
873
+ }
config.json ADDED
@@ -0,0 +1,33 @@
+ {
+ "architectures": [
+ "BertForSequenceClassification"
+ ],
+ "attention_probs_dropout_prob": 0.1,
+ "classifier_dropout": null,
+ "hidden_act": "gelu",
+ "hidden_dropout_prob": 0.1,
+ "hidden_size": 768,
+ "id2label": {
+ "0": "agent_response",
+ "1": "backchannel"
+ },
+ "initializer_range": 0.02,
+ "intermediate_size": 3072,
+ "label2id": {
+ "agent_response": 0,
+ "backchannel": 1
+ },
+ "layer_norm_eps": 1e-12,
+ "max_position_embeddings": 512,
+ "model_type": "bert",
+ "num_attention_heads": 12,
+ "num_hidden_layers": 12,
+ "pad_token_id": 0,
+ "position_embedding_type": "absolute",
+ "problem_type": "multi_label_classification",
+ "torch_dtype": "float32",
+ "transformers_version": "4.52.4",
+ "type_vocab_size": 2,
+ "use_cache": true,
+ "vocab_size": 32000
+ }
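The `id2label` mapping in config.json is what turns the classifier's raw logits into the two label strings. A minimal decoding sketch, with no transformers dependency (the logit values below are hypothetical, chosen only to illustrate the argmax-plus-softmax step that produces the predicted label and confidence):

```python
import math

# id2label mapping from the config.json above
id2label = {0: "agent_response", 1: "backchannel"}

def softmax(logits):
    # numerically stable softmax over a list of floats
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode(logits):
    # pick the argmax class and return (label, confidence)
    probs = softmax(logits)
    idx = probs.index(max(probs))
    return id2label[idx], probs[idx]

# hypothetical logits for a short acknowledgement such as "Anladım."
label, conf = decode([-1.2, 3.4])
```

Here `decode` returns `"backchannel"` with confidence around 0.99, which is the same label/confidence shape reported in the evaluation details above.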
evaluation_results.json ADDED
@@ -0,0 +1,25 @@
+ {
+ "overall": {
+ "macro_f1": 0.9924276856095726,
+ "micro_f1": 0.9932420416147963,
+ "mcc": 0.9848560799888242,
+ "accuracy": 99.32420416147963
+ },
+ "per_class": {
+ "agent_response": {
+ "accuracy": 99.53108252947482,
+ "correct": 7429,
+ "total": 7464
+ },
+ "backchannel": {
+ "accuracy": 98.91591750396616,
+ "correct": 3741,
+ "total": 3782
+ }
+ },
+ "labels": [
+ "agent_response",
+ "backchannel"
+ ],
+ "evaluated_at": "2025-12-14T21:35:53.969061"
+ }
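The overall figures in evaluation_results.json can be cross-checked from the per-class counts. A quick sketch (treating `agent_response` as the positive class, which is an assumption about orientation, though MCC is symmetric under swapping the classes in the binary case):

```python
import math

# confusion counts reconstructed from the "per_class" block above
tp = 7429          # agent_response classified correctly
fn = 7464 - 7429   # agent_response missed
tn = 3741          # backchannel classified correctly
fp = 3782 - 3741   # backchannel misread as agent_response

# overall accuracy = correct predictions / all predictions
accuracy = (tp + tn) / (tp + fn + tn + fp)

# Matthews correlation coefficient for the binary case
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)
```

Both values land on the reported figures (accuracy ≈ 0.99324, MCC ≈ 0.98486), confirming the summary metrics are consistent with the per-class counts.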
label_config.json ADDED
@@ -0,0 +1,17 @@
+ {
+ "labels": [
+ "agent_response",
+ "backchannel"
+ ],
+ "id2label": {
+ "0": "agent_response",
+ "1": "backchannel"
+ },
+ "label2id": {
+ "agent_response": 0,
+ "backchannel": 1
+ },
+ "num_labels": 2,
+ "base_model": "dbmdz/bert-base-turkish-uncased",
+ "trained_at": "2025-12-14T21:35:28.772117"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0045c21ba8bef6fde11ce20a30700a63b715c2d0c40cf82c1d7bcab17adab137
+ size 442499064
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+ "cls_token": "[CLS]",
+ "mask_token": "[MASK]",
+ "pad_token": "[PAD]",
+ "sep_token": "[SEP]",
+ "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,59 @@
+ {
+ "added_tokens_decoder": {
+ "0": {
+ "content": "[PAD]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "1": {
+ "content": "[UNK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "2": {
+ "content": "[CLS]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "3": {
+ "content": "[SEP]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "4": {
+ "content": "[MASK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ }
+ },
+ "clean_up_tokenization_spaces": true,
+ "cls_token": "[CLS]",
+ "do_basic_tokenize": true,
+ "do_lower_case": true,
+ "extra_special_tokens": {},
+ "mask_token": "[MASK]",
+ "max_len": 512,
+ "model_max_length": 512,
+ "never_split": null,
+ "pad_token": "[PAD]",
+ "sep_token": "[SEP]",
+ "strip_accents": null,
+ "tokenize_chinese_chars": true,
+ "tokenizer_class": "BertTokenizer",
+ "unk_token": "[UNK]"
+ }
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1502a401bd9d1d7469710211efa36b287addea220a6762248357b0afb9e79f51
+ size 5841
vocab.txt ADDED
The diff for this file is too large to render. See raw diff