🇷🇺 Qwen3-0.6B Turn Detection (Probability-Based)

This model is a specialized conversational boundary detector for Russian real-estate dialogues.

It predicts the probability that a user has finished their turn (<|im_end|>) versus continuing their sentence. It is fine-tuned using Single-Token Loss Masking on a balanced dataset of ~20k complete and incomplete conversational turns.

🚀 Key Features

  • Base Model: unsloth/Qwen3-0.6B (fast, efficient, good Russian support).
  • Method: Probability-based Turn Detection. Instead of a binary classifier head, it uses the model's intrinsic next-token prediction.
  • Performance:
    • Complete Turns: Predicts <|im_end|> with high confidence (>90%).
    • Incomplete Turns: Predicts the continuation word (next token), assigning near-zero probability to <|im_end|>.
  • Latency: Extremely fast inference on CPU or GPU thanks to its 0.6B parameter count.

📊 Training Data

Trained on RAS1981/turn-detection-probability-balanced.

  • Contrastive Pairs: Each complete sentence has a corresponding incomplete version.
  • Balanced: 50% complete turns, 50% incomplete turns.
  • Domain: Russian real-estate inquiries (renting, buying, viewing).
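
To make the pair structure concrete, here is an illustrative contrastive pair (hypothetical records sketched for this card, not actual dataset rows): the complete version targets the end-of-turn token, while the truncated version targets the word that would come next.

```python
# Illustrative contrastive pair (hypothetical example, not an actual dataset row).
# "Я хочу снять двухкомнатную квартиру." = "I want to rent a two-room apartment."
pair = {
    # Complete turn: the training target is the end-of-turn token.
    "complete": {
        "text": "Я хочу снять двухкомнатную квартиру.",
        "target": "<|im_end|>",
    },
    # Incomplete turn: the training target is the actual continuation word.
    "incomplete": {
        "text": "Я хочу снять двухкомнатную",
        "target": "квартиру",
    },
}
```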

🛠️ How to Use (Inference)

1. Load Model & Tokenizer

from unsloth import FastLanguageModel
import torch

model_name = "RAS1981/qwen3-0.6b-turn-detection-probability-balanced"

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
)
EOS_ID = tokenizer.eos_token_id # 151645 for Qwen

2. Predict Turn Completion Probability

The core idea is to check the probability of the End-of-Sequence (EOS) token.

@torch.no_grad()
def get_eos_prob(text):
    # Prepare chat template
    messages = [
        # System prompt (RU): "You determine the end of the user's utterance by its meaning."
        {"role": "system", "content": "Ты определяешь конец реплики пользователя по смыслу."},
        {"role": "user", "content": text}
    ]
    
    # Format prompt WITHOUT generation prompt
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=False)
    
    # Tokenize and STRIP trailing EOS if present (critical step!)
    prompt_ids = tokenizer(prompt, add_special_tokens=False).input_ids
    
    # Qwen adds <|im_end|>\n automatically. Strip them to predict the boundary.
    if len(prompt_ids) > 2 and prompt_ids[-1] == 198 and prompt_ids[-2] == 151645:
        prompt_ids = prompt_ids[:-2]
    elif len(prompt_ids) > 1 and prompt_ids[-1] == 151645:
        prompt_ids = prompt_ids[:-1]
        
    inputs = torch.tensor([prompt_ids]).to(model.device)  # use the model's device rather than hard-coding "cuda"
    
    # Get logits for the LAST token position
    logits = model(inputs).logits[:, -1, :]
    
    # Calculate probability of EOS token
    prob = torch.softmax(logits, dim=-1)[0, EOS_ID].item()
    return prob

# Example Usage
print(get_eos_prob("До свидания."))          # "Goodbye." -> high prob (e.g., 0.96): turn complete
print(get_eos_prob("Я хотел бы узнать...")) # "I would like to find out..." -> low prob (~0.00): turn incomplete

📈 Evaluation Results

| Phrase | Type | EOS Probability | Interpretation |
|---|---|---|---|
| "До свидания." ("Goodbye.") | Complete | 0.9626 | CONFIDENT END |
| "Алло, здравствуйте" ("Hello, hi") | Ambiguous | 0.2599 | WAIT (user likely continues) |
| "Я хотел бы узнать про" ("I'd like to ask about") | Incomplete | 0.0000 | CONFIDENT CONTINUE |
| "Нет, вы знаете, я наверное" ("No, you know, I probably") | Incomplete | 0.0000 | CONFIDENT CONTINUE |

Threshold Recommendation

  • Turn Complete: prob > 0.5 (Safe default)
  • Turn Incomplete: prob <= 0.5
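
The threshold rule can be wrapped in a small decision helper (a sketch; `is_turn_complete` is a name introduced here, meant to be paired with the `get_eos_prob` function above as `is_turn_complete(get_eos_prob(text))`):

```python
def is_turn_complete(eos_prob: float, threshold: float = 0.5) -> bool:
    """Map an EOS probability to a turn-complete decision.

    The 0.5 default follows the recommendation above; raise it to
    be more conservative about cutting the user off.
    """
    return eos_prob > threshold

# Decisions for the evaluation examples above
print(is_turn_complete(0.9626))  # True  -> end of turn
print(is_turn_complete(0.2599))  # False -> keep listening
```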

🧠 Methodology: Single-Token Loss Masking

We trained the model by computing the loss only on the final token of each example.

  • For complete examples, the target label is <|im_end|>.
  • For incomplete examples, the target label is the actual next word.
  • All previous tokens are masked with -100 in the loss function.

This forces the model to focus purely on the boundary condition: "Given this context, does the turn end here or continue?"
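
The masking step can be sketched as follows (a minimal illustration using plain Python lists; actual training builds batched label tensors from tokenizer ids, and `-100` is the index that PyTorch/Transformers cross-entropy losses ignore):

```python
def mask_all_but_last(input_ids):
    """Build labels for single-token loss masking.

    Every position except the last is set to -100 (the ignore index
    for cross-entropy loss), so gradient only flows from the final
    (boundary) token: <|im_end|> for complete turns, the next word
    for incomplete ones.
    """
    labels = [-100] * len(input_ids)
    labels[-1] = input_ids[-1]
    return labels

# Example: only the final token (here <|im_end|>, id 151645) contributes to the loss
print(mask_all_but_last([100, 200, 300, 151645]))
# [-100, -100, -100, 151645]
```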

📜 License

Apache 2.0
