@@ -1,59 +1,143 @@
 ---
 base_model: HuggingFaceTB/SmolLM2-135M
-library_name: transformers
-model_name: eot-detector-smollm2
 tags:
-- generated_from_trainer
-- trl
-- hf_jobs
-- sft
-licence: license
 ---
-# Model Card for eot-detector-smollm2
-This model is a fine-tuned version of [HuggingFaceTB/SmolLM2-135M](https://huggingface.co/HuggingFaceTB/SmolLM2-135M).
-It has been trained using [TRL](https://github.com/huggingface/trl).
-## Quick start
-```python
-from transformers import pipeline
-question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
-generator = pipeline("text-generation", model="Vurtnec/eot-detector-smollm2", device="cuda")
-output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
-print(output["generated_text"])
 ```
-## Training procedure
-This model was trained with SFT.
-### Framework versions
-- TRL: 0.25.1
-- Transformers: 4.57.3
-- Pytorch: 2.9.1
-- Datasets: 4.4.1
-- Tokenizers: 0.22.1
-## Citations
-Cite TRL as:
-```bibtex
-@misc{vonwerra2022trl,
-	title        = {{TRL: Transformer Reinforcement Learning}},
-	author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
-	year         = 2020,
-	journal      = {GitHub repository},
-	publisher    = {GitHub},
-	howpublished = {\url{https://github.com/huggingface/trl}}
-}
-```

 ---
+license: apache-2.0
 base_model: HuggingFaceTB/SmolLM2-135M
 tags:
+- end-of-turn-detection
+- turn-taking
+- voice-ai
+- lora
+- peft
+datasets:
+- Vurtnec/eot-detection-dataset
+language:
+- en
+pipeline_tag: text-generation
+model-index:
+- name: eot-detector-smollm2
+  results:
+  - task:
+      type: text-classification
+      name: End-of-Turn Detection
+    dataset:
+      name: EOT Detection Test Set
+      type: Vurtnec/eot-detection-testset
+    metrics:
+    - name: Accuracy
+      type: accuracy
+      value: 0.7667
+    - name: Precision
+      type: precision
+      value: 1.0
+    - name: Recall
+      type: recall
+      value: 0.5333
+    - name: F1
+      type: f1
+      value: 0.6957
 ---
+# EOT Detector - SmolLM2 135M
+A fine-tuned model for **End-of-Turn (EOT) detection** in conversations, based on [SmolLM2-135M](https://huggingface.co/HuggingFaceTB/SmolLM2-135M).
+## Model Description
+This model predicts whether a user has finished speaking in a conversation (end-of-turn) or is still continuing. It's designed for voice AI applications where accurate turn-taking is critical to avoid interrupting users.
+### Key Features
+- **Base Model**: SmolLM2-135M (135M parameters)
+- **Fine-tuning Method**: LoRA (r=4, alpha=8)
+- **Task**: Binary classification (complete vs incomplete turn)
+- **Inference Speed**: ~10ms on CPU
+## Training Details
+| Parameter | Value |
+|-----------|-------|
+| Base Model | HuggingFaceTB/SmolLM2-135M |
+| LoRA Rank | 4 |
+| LoRA Alpha | 8 |
+| Learning Rate | 2e-4 |
+| Epochs | 3 |
+| Training Samples | 50 |
+| Hardware | T4 GPU |
+## Evaluation Results
+Evaluated on [Vurtnec/eot-detection-testset](https://huggingface.co/datasets/Vurtnec/eot-detection-testset) (30 samples):
+| Metric | Value |
+|--------|-------|
+| **Accuracy** | 76.67% |
+| **Precision** | 100% |
+| **Recall** | 53.33% |
+| **F1 Score** | 69.57% |
+### Classification Report
+```
+              precision    recall  f1-score   support
+  Incomplete       0.68      1.00      0.81        15
+    Complete       1.00      0.53      0.70        15
+    accuracy                           0.77        30
+   macro avg       0.84      0.77      0.75        30
 ```
+### Analysis
+- **High Precision (100%)**: On this test set, every "complete" prediction was correct (no false positives)
+- **Lower Recall (53%)**: The model is conservative and missed 7 of the 15 completed turns, labeling them "incomplete"
+- For voice AI this trade-off is usually preferable: waiting slightly longer costs less than interrupting the user
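The reported figures can be checked with a little arithmetic. The sketch below is not taken from the evaluation script: the confusion counts are an assumption, inferred from the classification report (15 samples per class, precision 1.00 and recall 0.53 for "complete"), and `binary_metrics` is a hypothetical helper name.

```python
# Confusion counts inferred from the reported metrics (an assumption,
# not read from evaluation logs). "Complete" is the positive class.
tp = 8   # complete turns predicted as complete (8 / 15 = 0.5333 recall)
fn = 7   # complete turns predicted as incomplete
fp = 0   # incomplete turns predicted as complete (precision is 1.0)
tn = 15  # incomplete turns predicted as incomplete


def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 for the positive class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


metrics = binary_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts the helper gives the same accuracy, precision, recall, and F1 as the table, which suggests the headline numbers describe the "complete" class only.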
+## Usage
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+from peft import PeftModel
+# Load model
+base_model = "HuggingFaceTB/SmolLM2-135M"
+adapter_model = "Vurtnec/eot-detector-smollm2"
+tokenizer = AutoTokenizer.from_pretrained(base_model)
+model = AutoModelForCausalLM.from_pretrained(base_model)
+model = PeftModel.from_pretrained(model, adapter_model)
+# Format input
+def format_conversation(messages):
+    text = ""
+    for msg in messages:
+        text += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
+    text += "<|im_start|>label\n"
+    return text
+# Example
+messages = [
+    {"role": "user", "content": "Hi, I need help"},
+    {"role": "assistant", "content": "Sure, what do you need?"},
+    {"role": "user", "content": "Well, um..."}
+]
+input_text = format_conversation(messages)
+inputs = tokenizer(input_text, return_tensors="pt")
+outputs = model.generate(**inputs, max_new_tokens=10)
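+# Decode only the newly generated tokens, not the prompt
+new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
+result = tokenizer.decode(new_tokens)
+# <|eot|> means the turn is complete; <|continue|> means it is incomplete
+is_complete = "<|eot|>" in result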
+```
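Turning the decoded generation into a decision could look like the sketch below. It is only an illustration: `parse_label` and `should_respond` are hypothetical helpers, not part of this repository, and the rule of responding only on an explicit `<|eot|>` label is an assumed policy that mirrors the model's conservative behavior.

```python
EOT_TOKEN = "<|eot|>"            # "complete" label named in the usage example
CONTINUE_TOKEN = "<|continue|>"  # "incomplete" label named in the usage example


def parse_label(generated_text):
    """Return 'complete', 'incomplete', or None if neither label appears.

    If both labels appear, the one that occurs first wins.
    """
    eot_pos = generated_text.find(EOT_TOKEN)
    cont_pos = generated_text.find(CONTINUE_TOKEN)
    if eot_pos == -1 and cont_pos == -1:
        return None
    if cont_pos == -1 or (eot_pos != -1 and eot_pos < cont_pos):
        return "complete"
    return "incomplete"


def should_respond(generated_text):
    """Assumed policy: respond only when the turn is explicitly complete."""
    return parse_label(generated_text) == "complete"


print(parse_label("<|continue|><|im_end|>"))
print(should_respond("<|eot|><|im_end|>"))
```

Treating a missing label the same as "incomplete" keeps the assistant from interrupting when the output is malformed. The cost is that it may wait longer than necessary.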
+## Datasets
+- **Training**: [Vurtnec/eot-detection-dataset](https://huggingface.co/datasets/Vurtnec/eot-detection-dataset) (50 samples)
+- **Testing**: [Vurtnec/eot-detection-testset](https://huggingface.co/datasets/Vurtnec/eot-detection-testset) (30 samples)
+## Limitations
+- Trained on only 50 English samples and evaluated on 30, so the reported metrics carry wide uncertainty
+- May not generalize well to domain-specific conversations
+- Conservative prediction style (prefers "incomplete" when uncertain)
+## License
+Apache 2.0