Russian Addressee Detection Model
A fine-tuned RuBERT model that detects whether an utterance in a Russian voice conversation is directed at the system.
Model Description
This model classifies whether a spoken utterance is addressed to the voice assistant or is ambient speech (self-talk, speech to someone else in the room, or mumbling). It is designed for real-time voice chat applications, where it helps prevent false triggers.
Base Model: DeepPavlov/rubert-base-cased
Task: Binary classification (addressed to system / not addressed to system)
Language: Russian
Performance
| Metric | Score |
|---|---|
| Accuracy | 95.16% |
| Precision | 88.89% |
| Recall | 94.12% |
| F1-Score | 91.43% |
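The four scores are tied together by the confusion-matrix counts. The sketch below applies the standard formulas. The counts are hypothetical: they were chosen only because they reproduce the reported figures, and the source does not state the real evaluation set size or confusion matrix.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary-classification metrics; class 1 = addressed to system."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted "addressed" that is correct
    recall = tp / (tp + fn)      # share of truly addressed utterances that is found
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts (an assumption, not taken from the model card).
metrics = classification_metrics(tp=16, fp=2, fn=1, tn=43)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Precision below recall implies more false positives than false negatives, so on the evaluation data the model more often accepts ambient speech as addressed than it misses an addressed utterance.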
Usage
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("Silxxor/Russian-Addressee-detector")
model = AutoModelForSequenceClassification.from_pretrained("Silxxor/Russian-Addressee-detector")

text = "покажи мне погоду на завтра"  # "show me the weather for tomorrow"
inputs = tokenizer(text, return_tensors="pt", max_length=64, truncation=True, padding=True)

# Inference only, so gradient tracking is disabled
with torch.no_grad():
    outputs = model(**inputs)

prediction = torch.argmax(outputs.logits, dim=-1).item()
# 0 = not addressed to system, 1 = addressed to system
print("Addressed to system" if prediction == 1 else "Not addressed to system")
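Since the model is meant to prevent false triggers, it may help to act on the class-1 probability rather than on the arg-max alone. A minimal sketch, assuming the two logits are passed as a plain list such as `outputs.logits[0].tolist()`. The helper names and the 0.8 threshold are illustrative assumptions, not values from the model card.

```python
import math

def addressed_probability(logits: list[float]) -> float:
    """Softmax probability of class 1 (addressed to system) from two logits."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    return exps[1] / sum(exps)

def is_addressed(logits: list[float], threshold: float = 0.8) -> bool:
    """Hypothetical decision rule: accept only when class 1 is confident enough."""
    return addressed_probability(logits) >= threshold

# Made-up logits in the order [not addressed, addressed].
print(is_addressed([-1.2, 2.3]))
print(is_addressed([0.1, 0.4]))
```

Raising the threshold trades recall for precision, so a suitable value would need to be tuned on real transcripts.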
Training Data
- Source: Synthetically generated dataset based on Russian conversational patterns
- Wake words: "Люси" (Lucy), "ассистент" (assistant), "компьютер" (computer)
- Mixed dataset of direct commands, questions with wake words, and ambient speech
- Addressed utterances (containing wake words or direct commands) labeled as 1
- Non-addressed utterances (self-talk, background conversation) labeled as 0
- Balanced dataset
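The labeling scheme above can be pictured as (text, label) pairs. In the sketch below, three of the utterances are invented for illustration and one is the sentence from the usage example; none are samples from the actual dataset. The `class_counts` helper is a hypothetical way to check class balance.

```python
from collections import Counter

# Illustrative examples only; 1 = addressed to system, 0 = not addressed.
examples = [
    ("Люси, включи свет", 1),            # wake word plus command: "Lucy, turn on the light"
    ("покажи мне погоду на завтра", 1),  # direct command without a wake word
    ("куда я положил ключи", 0),         # self-talk: "where did I put the keys"
    ("мам, ужин готов?", 0),             # speech to someone else: "mom, is dinner ready?"
]

def class_counts(pairs):
    """Count how many examples carry each label."""
    return Counter(label for _, label in pairs)

counts = class_counts(examples)
print(counts)
```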
Training Details
- Epochs: 3
- Batch size: 16
- Learning rate: 2e-5
- Weight decay: 0.01
- Optimizer: AdamW
- Best model selection: F1 score
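The hyperparameters and the best-model rule can be summarized in a few lines of plain Python. This is only a sketch: the dictionary keys, the `select_best_epoch` helper, and the per-epoch F1 values are hypothetical, and the source does not say which training framework was used.

```python
# Hyperparameters as listed in the model card; key names are illustrative.
hyperparameters = {
    "epochs": 3,
    "batch_size": 16,
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "optimizer": "AdamW",
}

def select_best_epoch(f1_by_epoch: dict[int, float]) -> int:
    """Return the epoch with the highest validation F1 (earliest epoch wins ties)."""
    return max(sorted(f1_by_epoch), key=lambda epoch: f1_by_epoch[epoch])

# Made-up validation F1 scores, one per epoch.
f1_by_epoch = {1: 0.88, 2: 0.91, 3: 0.90}
best = select_best_epoch(f1_by_epoch)
print(f"keep checkpoint from epoch {best}")
```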
Limitations
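- Trained on synthetically generated data rather than natural speech, so accuracy on real conversations may differ
- Optimized for the specific wake words "Люси" (Lucy), "ассистент" (assistant), and "компьютер" (computer)
- May not generalize well to other assistant names or contexts
- Operates on text, so performance depends on the quality of the upstream ASR transcription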
- Cultural and contextual nuances may affect accuracy
Intended Use
Voice assistants and conversational AI systems that need to distinguish speech directed at them from ambient conversation in Russian.