---
language: ru
license: mit
tags:
- text-classification
- intent-classification
- onnx
- rubert
- chatbot
- rag
datasets:
- custom
metrics:
- f1
- accuracy
pipeline_tag: text-classification
model-index:
- name: intent-classifier-rubert-tiny2
  results:
  - task:
      type: text-classification
      name: Intent Classification
    metrics:
    - name: F1 (weighted)
      type: f1
      value: 0.90
    - name: Accuracy
      type: accuracy
      value: 0.90
---

# Intent Classifier (ruBERT-tiny2)

Fine-tuned **cointegrated/rubert-tiny2** for classifying Russian chatbot messages into 3 intents.

## Use Case

RAG (Retrieval-Augmented Generation) chatbots need to classify a user message before processing it:

- **rag** - the user wants to search the documents / knowledge base
- **chat** - greetings, small talk, questions about the bot
- **followup** - a clarification of the previous answer

This model replaces LLM API calls (300-2000 ms, ~$0.001/request) with local inference (3.7 ms, $0).
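
One way to wire the three intents into a RAG pipeline is a small dispatcher over the predicted label and its confidence. A minimal sketch; the branch names and the 0.5 threshold below are illustrative assumptions, not part of the model:

```python
def route(label: str, confidence: float, threshold: float = 0.5) -> str:
    """Map a predicted intent to a pipeline branch (branch names are hypothetical).

    Falls back to the retrieval branch when confidence is low, since a
    retrieval-backed answer usually degrades more gracefully than a
    missed search.
    """
    if confidence < threshold or label == "rag":
        return "search_knowledge_base"
    if label == "followup":
        return "answer_with_previous_context"
    return "chat_without_retrieval"  # label == "chat"
```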

## Results

| Class | Precision | Recall | F1 |
|----------|-----------|--------|------|
| rag | 0.94 | 0.98 | 0.96 |
| chat | 0.87 | 0.90 | 0.88 |
| followup | 0.86 | 0.73 | 0.79 |
| **Overall** | | | **0.90** |
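
The overall score is a support-weighted average of the per-class F1 values. With hypothetical class counts (the real test-set supports are not listed in this card), the weighting works out as:

```python
# Support-weighted F1: sum(n_i * f1_i) / sum(n_i).
# The class counts below are assumed purely to illustrate the formula.
f1_per_class = {"rag": 0.96, "chat": 0.88, "followup": 0.79}
support = {"rag": 250, "chat": 200, "followup": 125}  # hypothetical counts

total = sum(support.values())
weighted_f1 = sum(f1_per_class[c] * support[c] for c in support) / total
# weighted_f1 ~ 0.90 with these assumed counts
```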

## Quick Start (ONNX)

```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

session = ort.InferenceSession("model.onnx")
tokenizer = AutoTokenizer.from_pretrained("Gleckus/intent-classifier-rubert-tiny2")
LABELS = ["rag", "chat", "followup"]

def classify(text):
    inputs = tokenizer(text, return_tensors="np", padding="max_length", truncation=True, max_length=128)
    logits = session.run(None, {"input_ids": inputs["input_ids"], "attention_mask": inputs["attention_mask"]})[0][0]
    exp = np.exp(logits - logits.max())  # subtract the max for a numerically stable softmax
    probs = exp / exp.sum()
    return LABELS[np.argmax(probs)], float(probs.max())

label, conf = classify("какие условия возврата?")  # "what are the return conditions?"
print(f"{label} ({conf:.1%})")  # rag (95.2%)
```

## Training

- **Base model:** cointegrated/rubert-tiny2 (29M params)
- **Dataset:** 2,877 synthetic examples (template-based + augmented)
- **Training:** 5 epochs, batch size 32, lr 2e-5, on a Google Colab T4 GPU
- **Export:** ONNX format, ~111MB
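
As a sanity check, and assuming float32 weights (the default for an un-quantized ONNX export), the 29M parameters are consistent with the ~111 MB file size:

```python
# 29M float32 parameters at 4 bytes each, expressed in MiB.
params = 29_000_000
size_mib = params * 4 / 2**20
print(f"{size_mib:.0f} MiB")  # 111 MiB
```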

## Links

- [GitHub Repository](https://github.com/GleckusZeroFive/intent-classifier) - full code, dataset, documentation