Russian Addressee Detection Model

Fine-tuned RuBERT for detecting whether speech in Russian voice conversations is directed at the system.

Model Description

This model classifies a spoken utterance as either addressed to the voice assistant or ambient speech (self-talk, speech directed at someone else in the room, or mumbling). It is designed for real-time voice chat applications, where it helps prevent false triggers.

Base Model: DeepPavlov/rubert-base-cased
Task: Binary classification (addressed to system / not addressed to system)
Language: Russian

Performance

Metric      Score
Accuracy    95.16%
Precision   88.89%
Recall      94.12%
F1-Score    91.43%
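
As a quick consistency check, the reported F1 score can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The sketch below uses only the rounded figures from the table above; the variable names are illustrative.

```python
# Rounded values taken from the table above.
precision = 0.8889
recall = 0.9412

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 = {f1:.2%}")  # prints: F1 = 91.43%
```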

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("Silxxor/Russian-Addressee-detector")
model = AutoModelForSequenceClassification.from_pretrained("Silxxor/Russian-Addressee-detector")

text = "покажи мне погоду на завтра"
inputs = tokenizer(text, return_tensors="pt", max_length=64, truncation=True, padding=True)

with torch.no_grad():
    outputs = model(**inputs)
    prediction = torch.argmax(outputs.logits, dim=-1).item()

# 0 = not addressed to system, 1 = addressed to system
print("Addressed to system" if prediction == 1 else "Not addressed to system")
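
The snippet above takes the argmax of the two logits, which is equivalent to a 0.5 probability cutoff. Where a confidence score or a stricter cutoff is useful (for example, to reduce false triggers further), the logits can be turned into a probability first. The following is a minimal sketch, not part of the model's own API: the helper names `addressed_probability` and `is_addressed` and the `threshold` parameter are hypothetical, and the example logits are made up.

```python
import math

def addressed_probability(logits):
    """Softmax probability of class 1 (addressed to system) from two raw logits."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return exps[1] / sum(exps)

def is_addressed(logits, threshold=0.5):
    """Return True if P(addressed) meets the threshold (hypothetical tuning knob)."""
    return addressed_probability(logits) >= threshold

# Made-up logits for illustration only. With the usage snippet above,
# the real values would come from outputs.logits[0].tolist().
example_logits = [-1.2, 2.3]
print(addressed_probability(example_logits), is_addressed(example_logits, threshold=0.9))
```

Raising the threshold trades recall for precision; a suitable value would have to be chosen on held-out data.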

Training Data

  • Source: Synthetically generated dataset based on Russian conversational patterns
  • Wake words: "Люси" (Lucy), "ассистент" (assistant), "компьютер" (computer)
  • Mixed dataset of direct commands, questions with wake words, and ambient speech
  • Addressed utterances (containing wake words or direct commands) labeled as 1
  • Non-addressed utterances (self-talk, background conversation) labeled as 0
  • Balanced dataset

Training Details

  • Epochs: 3
  • Batch size: 16
  • Learning rate: 2e-5
  • Weight decay: 0.01
  • Optimizer: AdamW
  • Best model selection: F1 score
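
The hyperparameters listed above can be written down as a plain dictionary. This sketch assumes the model was trained with the Hugging Face `Trainer`, which the card does not state; the keys follow `TrainingArguments` parameter names, and treating the batch size as per-device (single-device training) is also an assumption.

```python
# Hyperparameters from the list above, keyed by TrainingArguments parameter names.
# Assumption: training used the Hugging Face Trainer on a single device.
training_kwargs = {
    "num_train_epochs": 3,
    "per_device_train_batch_size": 16,
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "optim": "adamw_torch",           # AdamW optimizer
    "load_best_model_at_end": True,   # keep the best checkpoint...
    "metric_for_best_model": "f1",    # ...as ranked by F1 score
}

for name, value in training_kwargs.items():
    print(f"{name} = {value}")
```

To use these with `TrainingArguments`, an output directory plus matching evaluation and save strategies would also be needed, along with a `compute_metrics` function that reports a metric named `f1`.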

Limitations

  • Trained on synthetically generated data, not natural speech patterns
  • Optimized for specific wake words (Lucy, assistant, computer)
  • May not generalize well to other assistant names or contexts
  • Performance depends on ASR quality
  • Cultural and contextual nuances may affect accuracy

Intended Use

Voice assistants and conversational AI systems that need to distinguish speech directed at them from ambient conversation in Russian.
