DeBERTa v3 MNLI (ONNX, int8 Quantized)

Production-ready ONNX conversion of cross-encoder/nli-deberta-v3-base for in-browser zero-shot classification — zero server cost, no network latency, complete privacy.

Highlights

  • Zero-shot classification — classify text into any custom categories without fine-tuning
  • ~243 MB quantized — DeBERTa v3 architecture for state-of-the-art NLI performance
  • transformers.js compatible — drop-in pipeline('zero-shot-classification')
  • Trained on MultiNLI + SNLI — 943k premise-hypothesis pairs

Quick Start

```javascript
import { pipeline } from '@huggingface/transformers';

const classifier = await pipeline(
  'zero-shot-classification',
  'affectively-ai/deberta-v3-base-mnli-onnx',
  { dtype: 'q8' }
);

const result = await classifier(
  'I just got promoted at work and I feel incredible!',
  ['joy', 'career', 'stress', 'health']
);
// { labels: ['joy', 'career', ...], scores: [0.92, 0.85, ...] }
```
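Under the hood, zero-shot classification recasts each candidate label as an NLI hypothesis (e.g. "This example is joy.") and scores the input text as the premise; the per-label entailment logits are then softmax-normalized into the score distribution the pipeline returns. A minimal sketch of that final scoring step in plain JavaScript — the logit values here are illustrative, not actual model output:

```javascript
// Softmax-normalize a set of logits into a probability distribution.
function softmax(logits) {
  const max = Math.max(...logits); // subtract max for numerical stability
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Hypothetical entailment logits, one per candidate label.
const labels = ['joy', 'career', 'stress', 'health'];
const entailmentLogits = [3.1, 2.4, -1.2, -0.8];

// Normalize across labels and sort descending, mirroring the pipeline's
// { labels, scores } output shape.
const scores = softmax(entailmentLogits);
const ranked = labels
  .map((label, i) => ({ label, score: scores[i] }))
  .sort((a, b) => b.score - a.score);
```

With these illustrative logits, 'joy' ranks first because its entailment logit dominates the softmax.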

Conversion Details

| Property | Value |
|----------|-------|
| Base model | cross-encoder/nli-deberta-v3-base |
| Training data | MultiNLI (393k) + SNLI (550k) |
| Export | PyTorch → ONNX via Optimum |
| Quantization | int8 dynamic (ORTQuantizer, avx512_vnni) |
| Quantized size | ~243 MB |

Use Cases

This model powers flexible classification in Edgework.ai — bringing fast, cheap, and private inference as close to the user as possible. Ideal for:

  • Dynamic emotion categorization with user-defined labels
  • Intent detection without per-intent training data
  • Topic tagging for journal entries
  • Content routing based on custom taxonomies
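Several of these use cases call for non-exclusive categories, where more than one label can apply at once. In that mode (transformers.js exposes it via a `multi_label` option on the pipeline call) each label is scored independently: the softmax is taken over only that label's entailment and contradiction logits, so scores need not sum to 1. A sketch of that per-label scoring, again with illustrative logits rather than real model output:

```javascript
// Independent per-label score: softmax over (entailment, contradiction)
// restricted to one label's own logits, so several labels can score near 1.
function multiLabelScore(entailmentLogit, contradictionLogit) {
  const e = Math.exp(entailmentLogit);
  const c = Math.exp(contradictionLogit);
  return e / (e + c);
}

// Hypothetical [entailment, contradiction] logit pairs per label.
const perLabelLogits = {
  joy: [2.8, -2.1],
  career: [2.2, -1.5],
  stress: [-1.9, 2.4],
};

const scores = Object.fromEntries(
  Object.entries(perLabelLogits).map(([label, [ent, con]]) => [
    label,
    multiLabelScore(ent, con),
  ])
);
```

Because each score is computed in isolation, 'joy' and 'career' can both come out high here while 'stress' stays low — exactly the behavior you want for tagging a journal entry with multiple emotions.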

About

Published by AFFECTIVELY · Managed by @buley

We convert, quantize, and publish production-ready ONNX models for edge and in-browser inference. Every release is tested for correctness and stability before publication.
