---
language: ru
license: mit
tags:
- text-classification
- intent-classification
- onnx
- rubert
- chatbot
- rag
datasets:
- custom
metrics:
- f1
- accuracy
pipeline_tag: text-classification
model-index:
- name: intent-classifier-rubert-tiny2
  results:
  - task:
      type: text-classification
      name: Intent Classification
    metrics:
    - name: F1 (weighted)
      type: f1
      value: 0.90
    - name: Accuracy
      type: accuracy
      value: 0.90
---
| |
# Intent Classifier (ruBERT-tiny2)
|
|
Fine-tuned **cointegrated/rubert-tiny2** for classifying Russian chatbot messages into three intents.
|
|
## Use Case
|
|
RAG (Retrieval-Augmented Generation) chatbots need to classify each user message before processing it:
- **rag** - the user wants to search documents / the knowledge base
- **chat** - greetings, small talk, questions about the bot itself
- **followup** - clarification of the previous answer
|
|
This model replaces LLM API calls (300-2000 ms, ~$0.001/request) with local inference (3.7 ms, $0).
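Once the intent is predicted, the chatbot can dispatch to the matching handler. A minimal routing sketch; the handler functions here are illustrative placeholders, not part of this repository:

```python
# Hypothetical dispatch layer on top of the classifier.
# Handler bodies are stubs for illustration only.
def handle_rag(text):
    return f"searching knowledge base for: {text}"

def handle_chat(text):
    return f"small talk reply to: {text}"

def handle_followup(text, last_answer):
    return f"clarifying '{last_answer}' for: {text}"

def route(text, label, last_answer=None):
    # Map the predicted intent to a handler; unknown labels fall back to RAG,
    # as does a followup with no previous answer to clarify.
    if label == "chat":
        return handle_chat(text)
    if label == "followup" and last_answer is not None:
        return handle_followup(text, last_answer)
    return handle_rag(text)

print(route("привет!", "chat"))  # small talk reply to: привет!
```

In a real bot, `route` would be called with the label returned by the classifier from the Quick Start section below.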
|
|
## Results
|
|
| | Class | Precision | Recall | F1 | |
| |-------|-----------|--------|-----| |
| | rag | 0.94 | 0.98 | 0.96 | |
| | chat | 0.87 | 0.90 | 0.88 | |
| | followup | 0.86 | 0.73 | 0.79 | |
| | **Overall** | | | **0.90** | |
|
|
## Quick Start (ONNX)
|
|
```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

session = ort.InferenceSession("model.onnx")
tokenizer = AutoTokenizer.from_pretrained("Gleckus/intent-classifier-rubert-tiny2")
LABELS = ["rag", "chat", "followup"]

def classify(text):
    inputs = tokenizer(text, return_tensors="np", padding="max_length", truncation=True, max_length=128)
    outputs = session.run(None, {"input_ids": inputs["input_ids"], "attention_mask": inputs["attention_mask"]})
    logits = outputs[0][0]
    # Numerically stable softmax: subtract the max logit before exponentiating
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return LABELS[int(np.argmax(probs))], float(probs.max())

label, conf = classify("какие условия возврата?")  # "what are the return conditions?"
print(f"{label} ({conf:.1%})")  # rag (95.2%)
```
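In production it can help to fall back to a safe default when the classifier is unsure. A small wrapper sketch; the 0.6 threshold and the `rag` default are assumptions to tune on your own traffic:

```python
def classify_with_fallback(classify_fn, text, threshold=0.6, default="rag"):
    """Return the predicted label, or `default` when confidence is below `threshold`."""
    label, conf = classify_fn(text)
    if conf < threshold:
        return default, conf
    return label, conf

# Stub standing in for classify() above, for illustration only
stub = lambda text: ("followup", 0.41)
print(classify_with_fallback(stub, "а точнее?"))  # ('rag', 0.41)
```

Routing low-confidence messages to `rag` is a conservative choice here: a document search on a misclassified greeting is cheaper than a small-talk reply to a real search query.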
|
|
## Training
|
|
- **Base model:** cointegrated/rubert-tiny2 (29M params)
- **Dataset:** 2,877 synthetic examples (template-based + augmented)
- **Training:** 5 epochs, batch size 32, learning rate 2e-5, Google Colab T4 GPU
- **Export:** ONNX format, ~111 MB
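To reproduce the latency figure on your own hardware, a minimal timing harness can be used; the run counts below are arbitrary, and the warm-up matters because the first calls pay one-time costs (session init, caches):

```python
import time

def mean_latency_ms(fn, *args, runs=100, warmup=10):
    # Warm-up runs are excluded from the measurement
    for _ in range(warmup):
        fn(*args)
    start = time.perf_counter()
    for _ in range(runs):
        fn(*args)
    return (time.perf_counter() - start) * 1000 / runs

# Example with a trivial stand-in; substitute classify() from the Quick Start
print(f"{mean_latency_ms(lambda: sum(range(1000))):.3f} ms")
```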
|
|
## Links
|
|
- [GitHub Repository](https://github.com/GleckusZeroFive/intent-classifier) - full code, dataset, and documentation
|
|