RuBERT v2 Tiny (INT8, ONNX)
This repository contains an INT8-quantized version of RuBERT v2 Tiny, converted to the ONNX format for efficient CPU inference.
Based on the original model: https://huggingface.co/cointegrated/rubert-tiny2
- Post-training INT8 quantization
- Optimized for fast, lightweight CPU inference
- Suitable for embeddings, semantic search, and text classification
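A minimal sketch of computing sentence embeddings with this model via onnxruntime. The file name `model.onnx`, the output ordering, and the mean-pooling-plus-normalization recipe are assumptions, not guaranteed by this repository; the tokenizer is loaded from the original base model.

```python
import numpy as np

def mean_pool(last_hidden, attention_mask):
    """Average token vectors over real (non-padding) positions."""
    mask = attention_mask[..., None].astype(np.float32)   # (batch, seq, 1)
    summed = (last_hidden * mask).sum(axis=1)             # (batch, hidden)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)        # avoid div-by-zero
    return summed / counts

def embed(texts, session, tokenizer):
    """Run the ONNX session and return L2-normalized sentence embeddings."""
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="np")
    # Feed only the tensors the exported graph actually declares as inputs.
    names = {i.name for i in session.get_inputs()}
    inputs = {k: v for k, v in enc.items() if k in names}
    last_hidden = session.run(None, inputs)[0]            # (batch, seq, hidden)
    emb = mean_pool(last_hidden, enc["attention_mask"])
    return emb / np.linalg.norm(emb, axis=1, keepdims=True)

if __name__ == "__main__":
    import onnxruntime as ort
    from transformers import AutoTokenizer

    tok = AutoTokenizer.from_pretrained("cointegrated/rubert-tiny2")
    sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
    print(embed(["привет мир"], sess, tok).shape)
```

The resulting normalized vectors can be compared with a plain dot product for semantic search.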
Note: This is a derivative work; the only changes from the original model are the ONNX format conversion and INT8 quantization, with no additional training.
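For reference, a quantized copy like this one can be produced with onnxruntime's post-training dynamic quantization. This is a hedged recipe fragment, not the exact command used by the author; the file paths are illustrative.

```python
from onnxruntime.quantization import QuantType, quantize_dynamic

# Dynamic quantization stores weights as INT8 and computes activation
# scales at runtime, so no calibration dataset is required.
quantize_dynamic(
    model_input="model.onnx",        # illustrative path to the FP32 export
    model_output="model_int8.onnx",  # illustrative output path
    weight_type=QuantType.QInt8,
)
```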
Model tree for TrendHD/rubert-tiny2-int8
- Base model: cointegrated/rubert-tiny2