---
language: ru
license: mit
tags:
- text-classification
- intent-classification
- onnx
- rubert
- chatbot
- rag
datasets:
- custom
metrics:
- f1
- accuracy
pipeline_tag: text-classification
model-index:
- name: intent-classifier-rubert-tiny2
  results:
  - task:
      type: text-classification
      name: Intent Classification
    metrics:
    - name: F1 (weighted)
      type: f1
      value: 0.90
    - name: Accuracy
      type: accuracy
      value: 0.90
---
| |
# Intent Classifier (ruBERT-tiny2)
|
|
Fine-tuned **cointegrated/rubert-tiny2** for classifying Russian chatbot messages into three intents.
|
|
## Use Case
|
|
RAG (Retrieval-Augmented Generation) chatbots need to classify each user message before processing it:
- **rag** - the user wants to search documents / the knowledge base
- **chat** - greetings, small talk, questions about the bot itself
- **followup** - clarification of the previous answer
|
|
This model replaces LLM API calls (300-2000 ms, ~$0.001/request) with local inference (3.7 ms, $0).
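Once the intent is predicted, the chatbot can dispatch to the matching handler. A minimal routing sketch; the handler functions here are illustrative placeholders, not part of this repository:

```python
# Hypothetical dispatch layer on top of the classifier.
# Handler bodies are stubs for illustration only.
def handle_rag(text):
    return f"searching knowledge base for: {text}"

def handle_chat(text):
    return f"small talk reply to: {text}"

def handle_followup(text, last_answer):
    return f"clarifying '{last_answer}' for: {text}"

def route(text, label, last_answer=None):
    # Map the predicted intent to a handler; unknown labels fall back to RAG,
    # as does a followup with no previous answer to clarify.
    if label == "chat":
        return handle_chat(text)
    if label == "followup" and last_answer is not None:
        return handle_followup(text, last_answer)
    return handle_rag(text)

print(route("привет!", "chat"))  # small talk reply to: привет!
```

In a real bot, `route` would be called with the label returned by the classifier from the Quick Start section below.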
|
|
## Results
|
|
| | Class | Precision | Recall | F1 | |
| |-------|-----------|--------|-----| |
| | rag | 0.94 | 0.98 | 0.96 | |
| | chat | 0.87 | 0.90 | 0.88 | |
| | followup | 0.86 | 0.73 | 0.79 | |
| | **Overall** | | | **0.90** | |
|
|
## Quick Start (ONNX)
|
|
```python
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

session = ort.InferenceSession("model.onnx")
tokenizer = AutoTokenizer.from_pretrained("Gleckus/intent-classifier-rubert-tiny2")
LABELS = ["rag", "chat", "followup"]

def classify(text):
    inputs = tokenizer(text, return_tensors="np", padding="max_length", truncation=True, max_length=128)
    outputs = session.run(None, {"input_ids": inputs["input_ids"], "attention_mask": inputs["attention_mask"]})
    logits = outputs[0][0]
    # Numerically stable softmax: subtract the max logit before exponentiating
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return LABELS[int(np.argmax(probs))], float(probs.max())

label, conf = classify("какие условия возврата?")  # "what are the return conditions?"
print(f"{label} ({conf:.1%})")  # rag (95.2%)
```
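In production it can help to fall back to a safe default when the classifier is unsure. A small wrapper sketch; the 0.6 threshold and the `rag` default are assumptions to tune on your own traffic:

```python
def classify_with_fallback(classify_fn, text, threshold=0.6, default="rag"):
    """Return the predicted label, or `default` when confidence is below `threshold`."""
    label, conf = classify_fn(text)
    if conf < threshold:
        return default, conf
    return label, conf

# Stub standing in for classify() above, for illustration only
stub = lambda text: ("followup", 0.41)
print(classify_with_fallback(stub, "а точнее?"))  # ('rag', 0.41)
```

Routing low-confidence messages to `rag` is a conservative choice here: a document search on a misclassified greeting is cheaper than a small-talk reply to a real search query.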
|
|
## Training
|
|
- **Base model:** cointegrated/rubert-tiny2 (29M params)
- **Dataset:** 2,877 synthetic examples (template-based + augmented)
- **Training:** 5 epochs, batch size 32, learning rate 2e-5, Google Colab T4 GPU
- **Export:** ONNX format, ~111 MB
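To reproduce the latency figure on your own hardware, a minimal timing harness can be used; the run counts below are arbitrary, and the warm-up matters because the first calls pay one-time costs (session init, caches):

```python
import time

def mean_latency_ms(fn, *args, runs=100, warmup=10):
    # Warm-up runs are excluded from the measurement
    for _ in range(warmup):
        fn(*args)
    start = time.perf_counter()
    for _ in range(runs):
        fn(*args)
    return (time.perf_counter() - start) * 1000 / runs

# Example with a trivial stand-in; substitute classify() from the Quick Start
print(f"{mean_latency_ms(lambda: sum(range(1000))):.3f} ms")
```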
|
|
## Links
|
|
- [GitHub Repository](https://github.com/GleckusZeroFive/intent-classifier) - full code, dataset, and documentation
|
|