---
language: ru
license: mit
tags:
- text-classification
- intent-classification
- onnx
- rubert
- chatbot
- rag
datasets:
- custom
metrics:
- f1
- accuracy
pipeline_tag: text-classification
model-index:
- name: intent-classifier-rubert-tiny2
results:
- task:
type: text-classification
name: Intent Classification
metrics:
- name: F1 (weighted)
type: f1
value: 0.90
- name: Accuracy
type: accuracy
value: 0.90
---
# Intent Classifier (ruBERT-tiny2)
Fine-tuned **cointegrated/rubert-tiny2** for classifying Russian chatbot messages into 3 intents.
## Use Case
RAG (Retrieval-Augmented Generation) chatbots need to classify each user message before processing it:
- **rag** - the user wants to search documents / the knowledge base
- **chat** - greetings, small talk, questions about the bot itself
- **followup** - a clarification of the previous answer

This model replaces LLM API calls (300-2000 ms, ~$0.001/req) with local inference (3.7 ms, $0).
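A quick back-of-envelope calculation using the figures above (the daily volume and the mid-range API latency are illustrative assumptions, not measurements from this project):

```python
# Assumptions: 100,000 messages/day, 1,000 ms as a mid-range API latency.
messages_per_day = 100_000

api_cost = messages_per_day * 0.001            # ~$100/day at ~$0.001/req
api_latency_s = messages_per_day * 1.0         # ~100,000 s of cumulative latency
local_cost = 0.0                               # local inference is free per request
local_latency_s = messages_per_day * 0.0037    # ~370 s cumulative at 3.7 ms/req

print(f"API:   ${api_cost:.0f}/day, {api_latency_s:.0f} s total latency")
print(f"Local: ${local_cost:.0f}/day, {local_latency_s:.0f} s total latency")
```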
## Results
| Class | Precision | Recall | F1 |
|-------|-----------|--------|-----|
| rag | 0.94 | 0.98 | 0.96 |
| chat | 0.87 | 0.90 | 0.88 |
| followup | 0.86 | 0.73 | 0.79 |
| **Overall** | | | **0.90** |
## Quick Start (ONNX)
```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer
session = ort.InferenceSession("model.onnx")
tokenizer = AutoTokenizer.from_pretrained("Gleckus/intent-classifier-rubert-tiny2")
LABELS = ["rag", "chat", "followup"]
def classify(text):
    inputs = tokenizer(text, return_tensors="np", padding="max_length", truncation=True, max_length=128)
    outputs = session.run(None, {"input_ids": inputs["input_ids"], "attention_mask": inputs["attention_mask"]})
    logits = outputs[0][0]
    # Numerically stable softmax: subtract the max logit before exponentiating
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return LABELS[np.argmax(probs)], float(probs.max())
label, conf = classify("какие условия возврата?")
print(f"{label} ({conf:.1%})") # rag (95.2%)
```
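In a real pipeline you may want to route low-confidence predictions to a safe default instead of trusting the top label. A minimal sketch (the `route` helper and the 0.6 threshold are illustrative assumptions, not part of the released model):

```python
def route(label: str, confidence: float, threshold: float = 0.6) -> str:
    """Map a (label, confidence) prediction to a handler name.

    Predictions below the threshold fall back to "rag", which is
    the safest default for a knowledge-base bot: a wrongly routed
    search still returns something useful, a wrongly routed small-talk
    reply does not.
    """
    if confidence < threshold:
        return "rag"
    return label

# Usage with classify() from the Quick Start:
#   handler = route(*classify("какие условия возврата?"))
```

Tune the threshold on a held-out set; per the table above, `followup` has the lowest recall (0.73), so it benefits most from an explicit fallback.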
## Training
- **Base model:** cointegrated/rubert-tiny2 (29M params)
- **Dataset:** 2,877 synthetic examples (template-based + augmented)
- **Training:** 5 epochs, batch 32, lr 2e-5, Google Colab T4 GPU
- **Export:** ONNX format, ~111MB
## Links
- [GitHub Repository](https://github.com/GleckusZeroFive/intent-classifier) - full code, dataset, documentation