---
language: ru
license: mit
tags:
  - text-classification
  - intent-classification
  - onnx
  - rubert
  - chatbot
  - rag
datasets:
  - custom
metrics:
  - f1
  - accuracy
pipeline_tag: text-classification
model-index:
  - name: intent-classifier-rubert-tiny2
    results:
      - task:
          type: text-classification
          name: Intent Classification
        metrics:
          - name: F1 (weighted)
            type: f1
            value: 0.9
          - name: Accuracy
            type: accuracy
            value: 0.9
---

# Intent Classifier (ruBERT-tiny2)

A fine-tuned `cointegrated/rubert-tiny2` model that classifies Russian chatbot messages into three intents.

## Use Case

RAG (Retrieval-Augmented Generation) chatbots need to classify user messages before processing:

- `rag` — the user wants to search documents / the knowledge base
- `chat` — greeting, small talk, questions about the bot itself
- `followup` — clarification of the previous answer

This model replaces an LLM API call (300-2000 ms, ~$0.001/request) with local inference (3.7 ms, $0).
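As a sketch of how these intents might drive routing in a RAG chatbot (the `route` helper and handler names here are hypothetical, not part of this model's API; the low-confidence fallback to `rag` is an assumption, not a documented behavior):

```python
def route(message, classify, handlers, threshold=0.5):
    """Dispatch a message to the handler registered for its predicted intent.

    classify: a callable returning (label, confidence), e.g. the classify()
    function from the Quick Start section.
    Falls back to the "rag" handler when confidence is below the threshold,
    on the assumption that a knowledge-base search is the safest default.
    """
    label, confidence = classify(message)
    if confidence < threshold:
        label = "rag"
    return handlers[label](message)

# Hypothetical handlers, for illustration only:
handlers = {
    "rag": lambda m: f"searching documents for: {m}",
    "chat": lambda m: f"small-talk reply to: {m}",
    "followup": lambda m: f"clarifying the previous answer for: {m}",
}
```

A sensible threshold depends on your traffic; measuring it on held-out low-confidence examples is advisable before relying on the fallback.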

## Results

| Class | Precision | Recall | F1 |
|---|---|---|---|
| rag | 0.94 | 0.98 | 0.96 |
| chat | 0.87 | 0.90 | 0.88 |
| followup | 0.86 | 0.73 | 0.79 |
| **Overall (weighted)** | | | **0.90** |

## Quick Start (ONNX)

```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

session = ort.InferenceSession("model.onnx")
tokenizer = AutoTokenizer.from_pretrained("Gleckus/intent-classifier-rubert-tiny2")
LABELS = ["rag", "chat", "followup"]

def classify(text):
    inputs = tokenizer(text, return_tensors="np", padding="max_length",
                       truncation=True, max_length=128)
    # The ONNX graph takes input_ids and attention_mask and returns logits.
    logits = session.run(None, {"input_ids": inputs["input_ids"],
                                "attention_mask": inputs["attention_mask"]})[0][0]
    # Numerically stable softmax: subtract the max before exponentiating.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return LABELS[int(np.argmax(probs))], float(probs.max())

label, conf = classify("какие условия возврата?")  # "what are the return conditions?"
print(f"{label} ({conf:.1%})")  # rag (95.2%)
```
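
For throughput-sensitive pipelines, the single-text `classify` above can be generalized to score many messages in one ONNX session call. This `classify_batch` helper is a sketch (it is not part of the published card) and assumes the same `session` and `tokenizer` objects as the Quick Start:

```python
import numpy as np

def classify_batch(texts, session, tokenizer, labels=("rag", "chat", "followup")):
    """Classify a batch of messages in a single ONNX session call.

    Batching amortizes per-call overhead compared with calling
    classify() once per message.
    """
    enc = tokenizer(list(texts), return_tensors="np",
                    padding=True, truncation=True, max_length=128)
    # Logits for the whole batch: shape (n_texts, n_labels).
    logits = session.run(None, {"input_ids": enc["input_ids"],
                                "attention_mask": enc["attention_mask"]})[0]
    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=-1, keepdims=True)
    best = probs.argmax(axis=-1)
    return [(labels[i], float(probs[n, i])) for n, i in enumerate(best)]
```

With dynamic padding (`padding=True`), short batches avoid the fixed 128-token cost of `padding="max_length"`.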

## Training

- Base model: `cointegrated/rubert-tiny2` (29M parameters)
- Dataset: 2,877 synthetic examples (template-based + augmented)
- Training: 5 epochs, batch size 32, learning rate 2e-5, Google Colab T4 GPU
- Export: ONNX format, ~111 MB

## Links