How to use Vurtnec/eot-detector-smollm2 with PEFT:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M")
model = PeftModel.from_pretrained(base_model, "Vurtnec/eot-detector-smollm2")
```

A fine-tuned model for End-of-Turn (EOT) detection in conversations, based on SmolLM2-135M.
This model predicts whether a user has finished speaking in a conversation (end-of-turn) or is still continuing. It's designed for voice AI applications where accurate turn-taking is critical to avoid interrupting users.
| Parameter | Value |
|---|---|
| Base Model | HuggingFaceTB/SmolLM2-135M |
| LoRA Rank | 4 |
| LoRA Alpha | 8 |
| Learning Rate | 2e-4 |
| Epochs | 3 |
| Training Samples | 50 |
| Hardware | T4 GPU |
Evaluated on Vurtnec/eot-detection-testset (30 samples):
| Metric | Value |
|---|---|
| Accuracy | 76.67% |
| Precision | 100% |
| Recall | 53.33% |
| F1 Score | 69.57% |
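Classification report:

| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| Incomplete | 0.68 | 1.00 | 0.81 | 15 |
| Complete | 1.00 | 0.53 | 0.70 | 15 |
| Accuracy | | | 0.77 | 30 |
| Macro avg | 0.84 | 0.77 | 0.75 | 30 |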
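The reported figures can be cross-checked with a short calculation. The sketch below treats Complete as the positive class; the confusion counts are not published in this card and are inferred here from the reported precision, recall, and the 15/15 class split, so treat them as an assumption.

```python
# Confusion counts inferred from the reported metrics (an assumption, not
# published by the author). Positive class = Complete (end-of-turn).
tp, fn = 8, 7    # 15 Complete samples: 8 detected, 7 missed
fp, tn = 0, 15   # 15 Incomplete samples: none flagged as Complete

accuracy = (tp + tn) / (tp + fn + fp + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

# Precision of the Incomplete class: 22 samples are predicted Incomplete
# (15 true negatives plus 7 missed Complete turns), 15 of them correctly.
incomplete_precision = tn / (tn + fn)

print(f"accuracy={accuracy:.2%} precision={precision:.2%} "
      f"recall={recall:.2%} f1={f1:.2%}")
print(f"incomplete precision={incomplete_precision:.2f}")
```

Under these counts the model never flags an unfinished turn as finished (no false positives), but it misses 7 of the 15 finished turns, which is why precision is perfect while recall is close to one half.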
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the LoRA adapter
base_model = "HuggingFaceTB/SmolLM2-135M"
adapter_model = "Vurtnec/eot-detector-smollm2"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, adapter_model)

# Format the conversation as ChatML-style turns, ending with an open "label" turn
def format_conversation(messages):
    text = ""
    for msg in messages:
        text += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
    text += "<|im_start|>label\n"
    return text

# Example
messages = [
    {"role": "user", "content": "Hi, I need help"},
    {"role": "assistant", "content": "Sure, what do you need?"},
    {"role": "user", "content": "Well, um..."}
]

input_text = format_conversation(messages)
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
result = tokenizer.decode(outputs[0])

# Check for <|eot|> (complete) or <|continue|> (incomplete)
is_complete = "<|eot|>" in result
```
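The final check can be made a little more robust by looking only at the text generated after the label marker. The helper below is a minimal sketch: `parse_turn_label` is a hypothetical name, and it assumes the decoded string keeps the `<|im_start|>label` marker and the `<|eot|>` / `<|continue|>` strings verbatim, as in the prompt format above.

```python
EOT_TOKEN = "<|eot|>"
CONTINUE_TOKEN = "<|continue|>"
LABEL_MARKER = "<|im_start|>label\n"

def parse_turn_label(decoded):
    """Return "complete", "incomplete", or "unknown" for a decoded generation.

    Hypothetical helper: assumes the label tokens appear verbatim in the text.
    """
    # Keep only the text after the last label marker (the model's answer).
    # If the marker is absent, rpartition returns the whole string as the tail.
    tail = decoded.rpartition(LABEL_MARKER)[2]
    eot_pos = tail.find(EOT_TOKEN)
    cont_pos = tail.find(CONTINUE_TOKEN)
    if eot_pos == -1 and cont_pos == -1:
        return "unknown"
    # If both appear, the one generated first wins.
    if cont_pos == -1 or (eot_pos != -1 and eot_pos < cont_pos):
        return "complete"
    return "incomplete"

sample = "<|im_start|>user\nWell, um...<|im_end|>\n<|im_start|>label\n<|continue|>"
print(parse_turn_label(sample))  # incomplete
```

Returning "unknown" when neither token is present lets the caller decide what to do, for example keep listening, instead of silently treating a malformed generation as a finished turn.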
License: Apache 2.0