---
language: ru
license: mit
tags:
- text-classification
- intent-classification
- onnx
- rubert
- chatbot
- rag
datasets:
- custom
metrics:
- f1
- accuracy
pipeline_tag: text-classification
model-index:
- name: intent-classifier-rubert-tiny2
  results:
  - task:
      type: text-classification
      name: Intent Classification
    metrics:
    - name: F1 (weighted)
      type: f1
      value: 0.90
    - name: Accuracy
      type: accuracy
      value: 0.90
---

# Intent Classifier (ruBERT-tiny2)

A fine-tuned **cointegrated/rubert-tiny2** that classifies Russian chatbot messages into 3 intents.

## Use Case

RAG (Retrieval-Augmented Generation) chatbots need to classify each user message before processing it:
- **rag** - the user wants to search the documents / knowledge base
- **chat** - greetings, small talk, questions about the bot
- **followup** - a clarification of the previous answer

This model replaces LLM API calls (300-2000ms, ~$0.001/req) with local inference (3.7ms, $0).
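
The three intents above can gate a RAG pipeline before any expensive retrieval runs. A minimal routing sketch (the handler names `search_documents`, `answer_with_history`, and `small_talk` are hypothetical, not part of this model):

```python
def route(intent: str, message: str) -> str:
    # Sketch: dispatch a classified message to a hypothetical handler.
    if intent == "rag":
        return "search_documents"     # query the knowledge base
    if intent == "followup":
        return "answer_with_history"  # reuse the previous answer's context
    return "small_talk"               # plain chat reply, no retrieval

print(route("rag", "какие условия возврата?"))  # search_documents
```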

## Results

| Class | Precision | Recall | F1 |
|-------|-----------|--------|-----|
| rag | 0.94 | 0.98 | 0.96 |
| chat | 0.87 | 0.90 | 0.88 |
| followup | 0.86 | 0.73 | 0.79 |
| **Overall (weighted)** | | | **0.90** |

## Quick Start (ONNX)

```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

session = ort.InferenceSession("model.onnx")
tokenizer = AutoTokenizer.from_pretrained("Gleckus/intent-classifier-rubert-tiny2")
LABELS = ["rag", "chat", "followup"]

def classify(text):
    inputs = tokenizer(text, return_tensors="np", padding="max_length", truncation=True, max_length=128)
    outputs = session.run(None, {"input_ids": inputs["input_ids"], "attention_mask": inputs["attention_mask"]})
    probs = np.exp(outputs[0][0]) / np.exp(outputs[0][0]).sum()  # softmax over logits
    return LABELS[np.argmax(probs)], float(probs.max())

label, conf = classify("какие условия возврата?")  # "what are the return conditions?"
print(f"{label} ({conf:.1%})")  # rag (95.2%)
```
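
The probability line in the snippet above is a plain softmax over the three logits. The same computation in pure Python (no model needed), with the usual max-subtraction for numerical stability, looks like:

```python
import math

LABELS = ["rag", "chat", "followup"]

def softmax(logits):
    # Subtracting the max keeps exp() from overflowing; the result is
    # mathematically identical to exp(x) / sum(exp(x)).
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([3.1, 0.4, -1.2])  # made-up logits, not real model output
print(LABELS[probs.index(max(probs))])  # rag
```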

## Training

- **Base model:** cointegrated/rubert-tiny2 (29M params)
- **Dataset:** 2,877 synthetic examples (template-based + augmented)
- **Training:** 5 epochs, batch size 32, lr 2e-5, Google Colab T4 GPU
- **Export:** ONNX format, ~111MB
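
The hyperparameters listed above map onto a `transformers.TrainingArguments` configuration roughly as follows. This is a sketch under the assumption that the standard `Trainer` API was used; `output_dir` and every option not listed above are assumptions, not the authors' actual script:

```python
from transformers import TrainingArguments

# Sketch only: the listed hyperparameters as TrainingArguments.
args = TrainingArguments(
    output_dir="intent-classifier-rubert-tiny2",  # assumed path
    num_train_epochs=5,
    per_device_train_batch_size=32,
    learning_rate=2e-5,
)
```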

## Links

- [GitHub Repository](https://github.com/GleckusZeroFive/intent-classifier) - full code, dataset, documentation