CLINC150 Contrastive MPNet Max-Similarity Classifier

This model is a contrastive learning–based intent classifier with built-in out-of-scope (OOS) detection, trained on the CLINC150 dataset.

Instead of a traditional softmax classifier, it uses embedding-based retrieval with a contrastively fine-tuned encoder, enabling robust generalization and strong OOS handling.


🚀 Key Features

  • Encoder: sentence-transformers/all-mpnet-base-v2 (contrastively fine-tuned)
  • Training Objective: Triplet loss with hard negative mining
  • Inference Method: k-NN retrieval over labeled exemplars
  • Voting Strategy: Max similarity (top match)
  • OOS Detection: Confidence threshold on cosine similarity
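
To illustrate the training objective listed above, here is a minimal NumPy sketch of a triplet loss that pairs each anchor with its hardest in-batch negative. It is not the training code for this model: the cosine-distance formulation, the in-batch mining scheme, the `margin` value of 0.2, and the function name are all assumptions made for illustration.

```python
import numpy as np

def triplet_loss_hard_negatives(embs, labels, margin=0.2):
    """Mean triplet loss where each (anchor, positive) pair is matched
    with the anchor's hardest (closest) in-batch negative.
    The margin value is an assumption, not taken from the model card."""
    embs = np.asarray(embs, dtype=np.float64)
    embs = embs / np.linalg.norm(embs, axis=1, keepdims=True)
    dist = 1.0 - embs @ embs.T              # pairwise cosine distance
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    losses = []
    for i in range(len(labels)):
        neg_mask = ~same[i]
        pos_mask = same[i].copy()
        pos_mask[i] = False                 # an anchor is not its own positive
        if not neg_mask.any() or not pos_mask.any():
            continue                        # no valid triplet for this anchor
        hardest_neg = dist[i][neg_mask].min()
        for d_pos in dist[i][pos_mask]:
            losses.append(max(0.0, d_pos - hardest_neg + margin))
    return float(np.mean(losses)) if losses else 0.0

# Toy 2-d batch: two tight clusters.
batch = np.array([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99]])
separated_loss = triplet_loss_hard_negatives(batch, [0, 0, 1, 1])  # labels match clusters
mixed_loss = triplet_loss_hard_negatives(batch, [0, 1, 0, 1])      # labels cut across clusters
```

When the labels agree with the clusters, every positive is already closer than the hardest negative by more than the margin, so the loss is zero. When the labels cut across the clusters, the loss is positive and would push the encoder to reorganize the embedding space.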

🧠 How It Works

  1. Input text is encoded into a dense embedding
  2. The embedding is compared against a bank of labeled exemplars
  3. The most similar example (max similarity) determines the predicted intent
  4. If similarity < threshold → prediction is OOS (out-of-scope)

This approach replaces probability-based classification with metric learning + retrieval, which improves robustness on unseen inputs.
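
The four steps above can be sketched with plain NumPy as follows. This is a minimal illustration, not the model's actual inference code: the function names, the toy 3-dimensional vectors, and the intent labels are hypothetical, and in practice the embeddings would come from the fine-tuned encoder. The default threshold of 0.71 is the value reported in this card.

```python
import numpy as np

def normalize(x):
    """L2-normalize along the last axis so a dot product equals cosine similarity."""
    x = np.asarray(x, dtype=np.float64)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def predict_intent(query_emb, exemplar_embs, exemplar_labels,
                   threshold=0.71, oos_label="oos"):
    """Return (predicted label, best cosine similarity) for one query embedding."""
    q = normalize(query_emb)
    bank = normalize(exemplar_embs)
    sims = bank @ q                      # cosine similarity to every exemplar
    best = int(np.argmax(sims))          # max-similarity (top match) decision
    best_sim = float(sims[best])
    if best_sim < threshold:             # too far from every exemplar: reject
        return oos_label, best_sim
    return exemplar_labels[best], best_sim

# Toy exemplar bank (hypothetical vectors and labels).
exemplar_embs = np.array([[1.0, 0.0, 0.0],
                          [0.0, 1.0, 0.0],
                          [0.9, 0.1, 0.0]])
exemplar_labels = ["check_balance", "book_flight", "check_balance"]

in_scope = predict_intent(np.array([0.95, 0.05, 0.0]), exemplar_embs, exemplar_labels)
out_of_scope = predict_intent(np.array([0.0, 0.0, 1.0]), exemplar_embs, exemplar_labels)
```

The first query lies close to the `check_balance` exemplars and is accepted; the second is orthogonal to every exemplar, so its best similarity falls below the threshold and it is labeled OOS.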


📊 Performance (CLINC150 Test Set)

  • Accuracy: 0.937
  • Macro F1: 0.953
  • OOS Recall: 0.803
  • OOS Precision: 0.935
  • OOS F1: 0.864
  • False OOS Rate: 0.012

The model achieves a strong balance between:

  • high in-scope classification quality
  • accurate OOS detection
  • low false rejection rate
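
For readers who want to reproduce the OOS metrics on their own predictions, the sketch below shows one way to compute them. The definition of the false OOS rate used here (the share of in-scope queries that are wrongly rejected as OOS) is an assumption, since this card does not spell it out; the function name and toy labels are hypothetical.

```python
def oos_metrics(y_true, y_pred, oos_label="oos"):
    """OOS precision, recall, F1, and false OOS rate from label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == oos_label and p == oos_label for t, p in pairs)
    fp = sum(t != oos_label and p == oos_label for t, p in pairs)  # in-scope rejected
    fn = sum(t == oos_label and p != oos_label for t, p in pairs)  # OOS accepted
    n_in_scope = sum(t != oos_label for t, _ in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Assumed definition: wrongly rejected in-scope queries / all in-scope queries.
    false_oos_rate = fp / n_in_scope if n_in_scope else 0.0
    return {"oos_precision": precision, "oos_recall": recall,
            "oos_f1": f1, "false_oos_rate": false_oos_rate}

# Toy example with hypothetical labels.
toy = oos_metrics(
    y_true=["a", "a", "b", "oos", "oos", "b"],
    y_pred=["a", "oos", "b", "oos", "b", "b"],
)

# Consistency check: F1 recomputed from the reported OOS precision and recall.
reported_f1 = 2 * 0.935 * 0.803 / (0.935 + 0.803)
```

Recomputing F1 from the reported precision (0.935) and recall (0.803) gives about 0.864, which agrees with the OOS F1 listed above.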

βš™οΈ Threshold Configuration

  • Similarity threshold: 0.71

  • Selected via OOS-aware constrained optimization:

    • maximize macro F1
    • constrain false OOS rate ≤ 3%
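
One plausible reading of this selection procedure is a sweep over candidate thresholds on a validation set, keeping the one with the highest macro F1 among those that respect the false OOS rate cap. The sketch below follows that reading; the exact procedure used for this model is not documented here, so the function names, the candidate grid, the toy data, and the choice to include the OOS class in macro F1 are all assumptions.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen (OOS included)."""
    pairs = list(zip(y_true, y_pred))
    scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def select_threshold(best_sims, best_labels, y_true, candidates,
                     max_false_oos=0.03, oos_label="oos"):
    """Return (threshold, macro_f1, false_oos_rate) for the best feasible
    candidate, or None if no candidate satisfies the constraint."""
    n_in_scope = sum(t != oos_label for t in y_true)
    best = None
    for th in candidates:
        y_pred = [lab if sim >= th else oos_label
                  for sim, lab in zip(best_sims, best_labels)]
        false_oos = sum(t != oos_label and p == oos_label
                        for t, p in zip(y_true, y_pred)) / n_in_scope
        if false_oos > max_false_oos:
            continue                      # violates the false OOS rate cap
        score = macro_f1(y_true, y_pred)
        if best is None or score > best[1]:
            best = (th, score, false_oos)
    return best

# Toy validation set: top-match similarity and label per query (hypothetical).
best_sims = [0.95, 0.90, 0.85, 0.80, 0.60, 0.50]
best_labels = ["a", "b", "a", "b", "a", "b"]
y_true = ["a", "b", "a", "b", "oos", "oos"]

chosen = select_threshold(best_sims, best_labels, y_true, candidates=[0.4, 0.7, 0.9])
```

In the toy data, a threshold of 0.4 accepts both OOS queries, 0.9 rejects half of the in-scope queries and so breaks the constraint, and 0.7 separates the two groups cleanly.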

📦 Use Cases

  • Intent classification with rejection (chatbots, assistants)
  • Production systems requiring safe fallback on unknown inputs
  • Retrieval-based classification pipelines
  • Few-shot / exemplar-based systems

⚠️ Limitations

  • Requires storing exemplar embeddings (memory tradeoff)
  • Performance depends on exemplar coverage
  • Threshold tuning may need adjustment for new domains
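
To give a rough sense of the memory tradeoff, the sketch below estimates the size of an exemplar bank stored as dense float32 vectors. The 768-dimensional embedding size matches all-mpnet-base-v2; the exemplar count of 15,000 is a hypothetical figure, since this card does not state how many exemplars are stored.

```python
def exemplar_bank_bytes(n_exemplars, dim=768, bytes_per_value=4):
    """Bytes needed for n_exemplars dense embeddings (float32 by default)."""
    return n_exemplars * dim * bytes_per_value

# Hypothetical bank size; the real number of exemplars is not given in this card.
bank_bytes = exemplar_bank_bytes(15_000)
bank_mib = bank_bytes / (1024 ** 2)
```

Under these assumptions the bank takes about 46 MB (roughly 44 MiB), and the cost grows linearly with the number of exemplars.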

🧩 Architecture Summary

Component       Choice
Encoder         MPNet (contrastively fine-tuned)
Training        Triplet loss + hard negatives
Retrieval       k-NN (k=5)
Decision rule   Max similarity
OOS detection   Threshold-based

💡 Why This Model?

This model demonstrates that contrastive retrieval + thresholding can outperform traditional softmax classifiers for:

  • OOS detection
  • generalization
  • robustness to distribution shift