DeBERTa v3 MNLI (ONNX, int8 Quantized)

Production-ready ONNX conversion of cross-encoder/nli-deberta-v3-base for in-browser zero-shot classification — zero server cost, no network latency, complete privacy.

Highlights

  • Zero-shot classification — classify text into any custom categories without fine-tuning
  • ~243 MB quantized — DeBERTa v3 architecture for state-of-the-art NLI performance
  • transformers.js compatible — drop-in pipeline('zero-shot-classification')
  • Trained on MultiNLI + SNLI — 943k premise-hypothesis pairs

Quick Start

```javascript
import { pipeline } from '@huggingface/transformers';

const classifier = await pipeline(
  'zero-shot-classification',
  'affectively-ai/deberta-v3-base-mnli-onnx',
  { dtype: 'q8' }
);

const result = await classifier(
  'I just got promoted at work and I feel incredible!',
  ['joy', 'career', 'stress', 'health']
);
// { labels: ['joy', 'career', ...], scores: [0.92, 0.85, ...] }
```
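Under the hood, zero-shot classification recasts each candidate label as an NLI hypothesis (e.g. "This example is joy.") and scores the input text as the premise; the per-label entailment logits are then softmax-normalized into the score distribution the pipeline returns. A minimal sketch of that final scoring step in plain JavaScript — the logit values here are illustrative, not actual model output:

```javascript
// Softmax-normalize a set of logits into a probability distribution.
function softmax(logits) {
  const max = Math.max(...logits); // subtract max for numerical stability
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Hypothetical entailment logits, one per candidate label.
const labels = ['joy', 'career', 'stress', 'health'];
const entailmentLogits = [3.1, 2.4, -1.2, -0.8];

// Normalize across labels and sort descending, mirroring the pipeline's
// { labels, scores } output shape.
const scores = softmax(entailmentLogits);
const ranked = labels
  .map((label, i) => ({ label, score: scores[i] }))
  .sort((a, b) => b.score - a.score);
```

With these illustrative logits, 'joy' ranks first because its entailment logit dominates the softmax.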

Conversion Details

| Property | Value |
|----------|-------|
| Base model | cross-encoder/nli-deberta-v3-base |
| Training data | MultiNLI (393k) + SNLI (550k) |
| Export | PyTorch → ONNX via Optimum |
| Quantization | int8 dynamic (ORTQuantizer, avx512_vnni) |
| Quantized size | ~243 MB |

Use Cases

This model powers flexible classification in Edgework.ai — bringing fast, cheap, and private inference as close to the user as possible. Ideal for:

  • Dynamic emotion categorization with user-defined labels
  • Intent detection without per-intent training data
  • Topic tagging for journal entries
  • Content routing based on custom taxonomies
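Several of these use cases call for non-exclusive categories, where more than one label can apply at once. In that mode (transformers.js exposes it via a `multi_label` option on the pipeline call) each label is scored independently: the softmax is taken over only that label's entailment and contradiction logits, so scores need not sum to 1. A sketch of that per-label scoring, again with illustrative logits rather than real model output:

```javascript
// Independent per-label score: softmax over (entailment, contradiction)
// restricted to one label's own logits, so several labels can score near 1.
function multiLabelScore(entailmentLogit, contradictionLogit) {
  const e = Math.exp(entailmentLogit);
  const c = Math.exp(contradictionLogit);
  return e / (e + c);
}

// Hypothetical [entailment, contradiction] logit pairs per label.
const perLabelLogits = {
  joy: [2.8, -2.1],
  career: [2.2, -1.5],
  stress: [-1.9, 2.4],
};

const scores = Object.fromEntries(
  Object.entries(perLabelLogits).map(([label, [ent, con]]) => [
    label,
    multiLabelScore(ent, con),
  ])
);
```

Because each score is computed in isolation, 'joy' and 'career' can both come out high here while 'stress' stays low — exactly the behavior you want for tagging a journal entry with multiple emotions.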

About

Published by AFFECTIVELY · Managed by @buley

We convert, quantize, and publish production-ready ONNX models for edge and in-browser inference. Every release is tested for correctness and stability before publication.
