---
language: ru
license: mit
tags:
- text-classification
- intent-classification
- onnx
- rubert
- chatbot
- rag
datasets:
- custom
metrics:
- f1
- accuracy
pipeline_tag: text-classification
model-index:
- name: intent-classifier-rubert-tiny2
  results:
  - task:
      type: text-classification
      name: Intent Classification
    metrics:
    - name: F1 (weighted)
      type: f1
      value: 0.90
    - name: Accuracy
      type: accuracy
      value: 0.90
---

# Intent Classifier (ruBERT-tiny2)

A fine-tuned **cointegrated/rubert-tiny2** that classifies Russian chatbot messages into 3 intents.

## Use Case

RAG (Retrieval-Augmented Generation) chatbots need to classify each user message before processing it:
- **rag** - the user wants to search the documents / knowledge base
- **chat** - greetings, small talk, questions about the bot
- **followup** - a clarification of the previous answer

This model replaces LLM API calls (300-2000ms, ~$0.001/req) with local inference (3.7ms, $0).
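
The three intents above can gate a RAG pipeline before any expensive retrieval runs. A minimal routing sketch (the handler names `search_documents`, `answer_with_history`, and `small_talk` are hypothetical, not part of this model):

```python
def route(intent: str, message: str) -> str:
    # Sketch: dispatch a classified message to a hypothetical handler.
    if intent == "rag":
        return "search_documents"     # query the knowledge base
    if intent == "followup":
        return "answer_with_history"  # reuse the previous answer's context
    return "small_talk"               # plain chat reply, no retrieval

print(route("rag", "какие условия возврата?"))  # search_documents
```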

## Results

| Class | Precision | Recall | F1 |
|-------|-----------|--------|-----|
| rag | 0.94 | 0.98 | 0.96 |
| chat | 0.87 | 0.90 | 0.88 |
| followup | 0.86 | 0.73 | 0.79 |
| **Overall (weighted)** | | | **0.90** |

## Quick Start (ONNX)

```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

session = ort.InferenceSession("model.onnx")
tokenizer = AutoTokenizer.from_pretrained("Gleckus/intent-classifier-rubert-tiny2")
LABELS = ["rag", "chat", "followup"]

def classify(text):
    inputs = tokenizer(text, return_tensors="np", padding="max_length", truncation=True, max_length=128)
    outputs = session.run(None, {"input_ids": inputs["input_ids"], "attention_mask": inputs["attention_mask"]})
    probs = np.exp(outputs[0][0]) / np.exp(outputs[0][0]).sum()  # softmax over logits
    return LABELS[np.argmax(probs)], float(probs.max())

label, conf = classify("какие условия возврата?")  # "what are the return conditions?"
print(f"{label} ({conf:.1%})")  # rag (95.2%)
```
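
The probability line in the snippet above is a plain softmax over the three logits. The same computation in pure Python (no model needed), with the usual max-subtraction for numerical stability, looks like:

```python
import math

LABELS = ["rag", "chat", "followup"]

def softmax(logits):
    # Subtracting the max keeps exp() from overflowing; the result is
    # mathematically identical to exp(x) / sum(exp(x)).
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([3.1, 0.4, -1.2])  # made-up logits, not real model output
print(LABELS[probs.index(max(probs))])  # rag
```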

## Training

- **Base model:** cointegrated/rubert-tiny2 (29M params)
- **Dataset:** 2,877 synthetic examples (template-based + augmented)
- **Training:** 5 epochs, batch size 32, lr 2e-5, Google Colab T4 GPU
- **Export:** ONNX format, ~111MB
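
The hyperparameters listed above map onto a `transformers.TrainingArguments` configuration roughly as follows. This is a sketch under the assumption that the standard `Trainer` API was used; `output_dir` and every option not listed above are assumptions, not the authors' actual script:

```python
from transformers import TrainingArguments

# Sketch only: the listed hyperparameters as TrainingArguments.
args = TrainingArguments(
    output_dir="intent-classifier-rubert-tiny2",  # assumed path
    num_train_epochs=5,
    per_device_train_batch_size=32,
    learning_rate=2e-5,
)
```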

## Links

- [GitHub Repository](https://github.com/GleckusZeroFive/intent-classifier) - full code, dataset, documentation