---
language: ru
license: mit
tags:
- text-classification
- intent-classification
- onnx
- rubert
- chatbot
- rag
datasets:
- custom
metrics:
- f1
- accuracy
pipeline_tag: text-classification
model-index:
- name: intent-classifier-rubert-tiny2
  results:
  - task:
      type: text-classification
      name: Intent Classification
    metrics:
    - name: F1 (weighted)
      type: f1
      value: 0.90
    - name: Accuracy
      type: accuracy
      value: 0.90
---

# Intent Classifier (ruBERT-tiny2)

Fine-tuned **cointegrated/rubert-tiny2** for classifying Russian chatbot messages into 3 intents.

## Use Case

RAG (Retrieval-Augmented Generation) chatbots need to classify user messages before processing:

- **rag** - the user wants to search documents / the knowledge base
- **chat** - greetings, small talk, questions about the bot
- **followup** - clarification of the previous answer

This model replaces LLM API calls (300-2000 ms, ~$0.001/req) with local inference (3.7 ms, $0).

## Results

| Class | Precision | Recall | F1 |
|-------------|-----------|--------|------|
| rag | 0.94 | 0.98 | 0.96 |
| chat | 0.87 | 0.90 | 0.88 |
| followup | 0.86 | 0.73 | 0.79 |
| **Overall** | | | **0.90** |

## Quick Start (ONNX)

```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

session = ort.InferenceSession("model.onnx")
tokenizer = AutoTokenizer.from_pretrained("Gleckus/intent-classifier-rubert-tiny2")
LABELS = ["rag", "chat", "followup"]

def classify(text):
    inputs = tokenizer(text, return_tensors="np", padding="max_length",
                       truncation=True, max_length=128)
    outputs = session.run(None, {"input_ids": inputs["input_ids"],
                                 "attention_mask": inputs["attention_mask"]})
    logits = outputs[0][0]
    # Numerically stable softmax: subtract the max before exponentiating.
    exps = np.exp(logits - logits.max())
    probs = exps / exps.sum()
    return LABELS[np.argmax(probs)], float(probs.max())

label, conf = classify("какие условия возврата?")  # "what are the return conditions?"
print(f"{label} ({conf:.1%})")  # rag (95.2%)
```

## Training

- **Base model:** cointegrated/rubert-tiny2 (29M params)
- **Dataset:** 2,877 synthetic examples (template-based + augmented)
- **Training:** 5 epochs, batch size 32, learning rate 2e-5, Google Colab T4 GPU
- **Export:** ONNX format, ~111 MB

## Links

- [GitHub Repository](https://github.com/GleckusZeroFive/intent-classifier) - full code, dataset, documentation
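## Notes

The ONNX head in the Quick Start returns raw logits, which the snippet converts to probabilities with a softmax. A minimal standalone sketch of the max-subtraction trick used there, shown on made-up logits (the values are illustrative, not model outputs):

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    # Subtracting the max before exponentiating keeps exp() in a
    # safe range; the result is mathematically unchanged.
    exps = np.exp(logits - logits.max())
    return exps / exps.sum()

logits = np.array([1000.0, 999.0, 998.0])  # naive np.exp(logits) would overflow
probs = softmax(logits)
assert np.isfinite(probs).all() and abs(probs.sum() - 1.0) < 1e-9
```

Without the subtraction, `np.exp(1000.0)` overflows to `inf` and every probability becomes `nan`; with it, the same inputs produce a valid distribution.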
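Since `classify` returns a confidence score alongside the label, a chatbot can route low-confidence predictions to a safe default branch. A sketch of such a router; the `route` function, the 0.7 threshold, and the "rag" fallback are illustrative assumptions, not part of the released model:

```python
CONFIDENCE_THRESHOLD = 0.7  # assumed cutoff; tune on your own traffic

def route(label: str, confidence: float) -> str:
    """Pick a pipeline branch from the classifier output.

    Low-confidence predictions fall back to the document-search
    branch, on the assumption that retrieval is the safest default
    for a RAG bot.
    """
    if confidence < CONFIDENCE_THRESHOLD:
        return "rag"  # assumed fallback branch
    return label

print(route("chat", 0.92))      # high confidence: keep the predicted label
print(route("followup", 0.55))  # low confidence: fall back to search
```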