# DeBERTa v3 MNLI (ONNX, int8 Quantized)

Production-ready ONNX conversion of cross-encoder/nli-deberta-v3-base for in-browser zero-shot classification — no server cost, no network round-trips, and input text never leaves the device.
## Highlights
- Zero-shot classification — classify text into any custom categories without fine-tuning
- ~243 MB quantized — DeBERTa v3 architecture for state-of-the-art NLI performance
- transformers.js compatible — drop-in `pipeline('zero-shot-classification')`
- Trained on MultiNLI + SNLI — 943k premise-hypothesis pairs
## Quick Start

```js
import { pipeline } from '@huggingface/transformers';

const classifier = await pipeline(
  'zero-shot-classification',
  'affectively-ai/deberta-v3-base-mnli-onnx',
  { dtype: 'q8' }
);

const result = await classifier(
  'I just got promoted at work and I feel incredible!',
  ['joy', 'career', 'stress', 'health']
);
// { labels: ['joy', 'career', ...], scores: [0.92, 0.85, ...] }
```
## Conversion Details
| Property | Value |
|---|---|
| Base model | cross-encoder/nli-deberta-v3-base |
| Training data | MultiNLI (393k) + SNLI (550k) |
| Export | PyTorch → ONNX via Optimum |
| Quantization | int8 dynamic (ORTQuantizer, avx512_vnni) |
| Quantized size | ~243 MB |
## Use Cases
This model powers flexible classification in Edgework.ai — bringing fast, cheap, and private inference as close to the user as possible. Ideal for:
- Dynamic emotion categorization with user-defined labels
- Intent detection without per-intent training data
- Topic tagging for journal entries
- Content routing based on custom taxonomies
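For intent detection, it can help to rephrase each label as a natural-language hypothesis. The zero-shot pipeline accepts a `hypothesis_template` option for this; the template, labels, and example utterance below are illustrative:

```js
import { pipeline } from '@huggingface/transformers';

const classifier = await pipeline(
  'zero-shot-classification',
  'affectively-ai/deberta-v3-base-mnli-onnx',
  { dtype: 'q8' }
);

// Each label is substituted into the template to form the NLI
// hypothesis, e.g. 'The user wants to reschedule.' — a template
// phrased for intents often works better than the generic default.
const result = await classifier(
  'Can you move my dentist appointment to Friday?',
  ['reschedule', 'cancel', 'book new', 'small talk'],
  { hypothesis_template: 'The user wants to {}.' }
);

const intent = result.labels[0]; // labels are sorted by score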
## About
Published by AFFECTIVELY · Managed by @buley
We convert, quantize, and publish production-ready ONNX models for edge and in-browser inference. Every release is tested for correctness and stability before publication.
## Model tree for affectively-ai/deberta-v3-base-mnli-onnx

- Base model: microsoft/deberta-v3-base
- Quantized from: cross-encoder/nli-deberta-v3-base