CLINC150 Contrastive MPNet Max-Similarity Classifier

This model is a contrastive learning–based intent classifier with built-in out-of-scope (OOS) detection, trained on the CLINC150 dataset.

Instead of a traditional softmax classifier, it uses embedding-based retrieval with a contrastively fine-tuned encoder, enabling robust generalization and strong OOS handling.

🚀 Key Features

Encoder: sentence-transformers/all-mpnet-base-v2 (contrastively fine-tuned)
Training Objective: Triplet loss with hard negative mining
Inference Method: k-NN retrieval (k=5) over labeled exemplars
Voting Strategy: Max similarity (top match)
OOS Detection: Confidence threshold on cosine similarity
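
The training objective listed above can be illustrated with a minimal sketch. This card does not state the distance function, the margin, or the mining strategy, so the cosine distance, the `margin` value, and the "closest wrong-label example" rule below are assumptions chosen for illustration, not the actual training code. The vectors are toy 2-D embeddings rather than real MPNet outputs.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def hardest_negative(anchor, anchor_label, embeddings, labels):
    """Pick the embedding with a different label that lies closest to the anchor."""
    candidates = [e for e, l in zip(embeddings, labels) if l != anchor_label]
    return min(candidates, key=lambda e: cosine_distance(anchor, e))

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin (margin is an assumed value)."""
    return max(0.0, cosine_distance(anchor, positive)
                    - cosine_distance(anchor, negative) + margin)

# Toy example: anchor and positive share an intent; two other intents act as negatives.
anchor, positive = [1.0, 0.0], [0.8, 0.6]
pool = [[0.6, 0.8], [0.0, 1.0]]
pool_labels = ["intent_b", "intent_c"]  # hypothetical label names

negative = hardest_negative(anchor, "intent_a", pool, pool_labels)
loss = triplet_loss(anchor, positive, negative)  # approximately 0.3
```

The hard negative is the one that contributes a non-zero loss here; the easy negative `[0.0, 1.0]` would already satisfy the margin and give no training signal.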

🧠 How It Works

Input text is encoded into a dense embedding
The embedding is compared against a bank of labeled exemplars
The most similar example (max similarity) determines the predicted intent
If similarity < threshold → prediction is OOS (out-of-scope)

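This approach replaces probability-based classification with metric learning + retrieval: the decision depends on how close an input lies to known exemplars, so inputs that are far from every exemplar can be rejected instead of being forced into one of the 150 intents.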
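
The steps above can be sketched in a few lines. This is an illustration of the decision rule only: it uses toy 2-D vectors in place of encoder outputs, and the function name `classify`, the `"oos"` label string, and the example intents are hypothetical. In practice the query and exemplar embeddings would come from the fine-tuned encoder.

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def classify(query_emb, exemplar_embs, exemplar_labels, threshold=0.71, k=5):
    """Retrieve the k most similar exemplars and let the top match decide.

    Returns (label, similarity); label is "oos" when the best similarity
    falls below the threshold.
    """
    sims = [cosine_similarity(query_emb, e) for e in exemplar_embs]
    top_k = sorted(zip(sims, exemplar_labels), key=lambda p: p[0], reverse=True)[:k]
    best_sim, best_label = top_k[0]
    if best_sim < threshold:
        return "oos", best_sim
    return best_label, best_sim

# Toy exemplar bank with two intents.
bank = [[1.0, 0.0], [0.0, 1.0]]
bank_labels = ["book_flight", "weather"]

in_scope = classify([0.9, 0.1], bank, bank_labels)   # close to the first exemplar
rejected = classify([1.0, 1.0], bank, bank_labels)   # similarity ~0.707 to both, below 0.71
```

With max-similarity voting only the single best match affects the prediction; the remaining retrieved neighbors are useful mainly for inspection or for swapping in a different voting rule.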

📊 Performance (CLINC150 Test Set)

Accuracy: 0.937
Macro F1: 0.953
OOS Recall: 0.803
OOS Precision: 0.935
OOS F1: 0.864
False OOS Rate: 0.012

The model achieves a strong balance between:

high in-scope classification quality
accurate OOS detection (OOS F1 of 0.864)
low false rejection rate (False OOS Rate of 0.012)
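
For readers who want to reproduce the OOS metrics, a minimal sketch follows. The card does not define "False OOS Rate"; the code assumes it means the fraction of in-scope queries that were rejected as OOS. The `"oos"` label string and the function name are likewise assumptions.

```python
def oos_metrics(y_true, y_pred, oos_label="oos"):
    """Compute OOS precision, recall, F1 and the (assumed) false OOS rate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == oos_label and p == oos_label for t, p in pairs)
    fp = sum(t != oos_label and p == oos_label for t, p in pairs)
    fn = sum(t == oos_label and p != oos_label for t, p in pairs)
    n_in_scope = sum(t != oos_label for t in y_true)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    false_oos_rate = fp / n_in_scope if n_in_scope else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "false_oos_rate": false_oos_rate}

# Toy labels: one in-scope query is wrongly rejected, one OOS query is missed.
y_true = ["a", "a", "oos", "oos", "b"]
y_pred = ["a", "oos", "oos", "b", "b"]
metrics = oos_metrics(y_true, y_pred)
```

As a sanity check on the table above, the harmonic mean of the reported OOS precision (0.935) and recall (0.803) is about 0.864, which matches the reported OOS F1.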

⚙️ Threshold Configuration

Similarity threshold: 0.71
Selected via OOS-aware constrained optimization:
- maximize macro F1
- constrain false OOS rate ≤ 3%
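
One way to carry out this kind of constrained selection is a simple sweep over candidate thresholds on a validation set. The sketch below is an assumption about how such a search could look, not the procedure actually used: the candidate grid, the inclusion of the OOS class in macro F1, and all function names are hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def select_threshold(best_sims, best_labels, y_true, candidates,
                     max_false_oos=0.03, oos_label="oos"):
    """Return (threshold, macro_f1) maximizing macro F1 subject to the false OOS cap."""
    best = None
    for t in candidates:
        y_pred = [l if s >= t else oos_label for s, l in zip(best_sims, best_labels)]
        in_scope_preds = [p for y, p in zip(y_true, y_pred) if y != oos_label]
        false_oos = (sum(p == oos_label for p in in_scope_preds) / len(in_scope_preds)
                     if in_scope_preds else 0.0)
        if false_oos > max_false_oos:
            continue  # violates the constraint, skip this candidate
        score = macro_f1(y_true, y_pred)
        if best is None or score > best[1]:
            best = (t, score)
    return best

# Toy validation data: top-match similarity and label per query, plus gold labels.
best_sims = [0.9, 0.8, 0.6, 0.5]
best_labels = ["a", "b", "a", "b"]
y_true = ["a", "b", "oos", "oos"]
chosen = select_threshold(best_sims, best_labels, y_true, candidates=[0.4, 0.7, 0.85])
```

In this toy run the lowest candidate accepts every OOS query, the highest one rejects half of the in-scope queries and is excluded by the constraint, and the middle candidate is selected.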

📦 Use Cases

Intent classification with rejection (chatbots, assistants)
Production systems requiring safe fallback on unknown inputs
Retrieval-based classification pipelines
Few-shot / exemplar-based systems

⚠️ Limitations

Requires storing exemplar embeddings (memory tradeoff)
Performance depends on exemplar coverage
Threshold tuning may need adjustment for new domains
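
To put the memory tradeoff in perspective, here is a back-of-the-envelope estimate. It assumes the exemplar bank holds one float32 embedding per in-scope CLINC150 training utterance (150 intents × 100 utterances = 15,000) at the 768 dimensions that all-mpnet-base-v2 produces. The card does not state the actual bank size or storage precision, so treat both as assumptions.

```python
def exemplar_bank_bytes(num_exemplars, dim=768, bytes_per_value=4):
    """Raw size of a dense embedding matrix, ignoring labels and index overhead."""
    return num_exemplars * dim * bytes_per_value

size_bytes = exemplar_bank_bytes(15_000)   # 46,080,000 bytes
size_mb = size_bytes / 1_000_000           # about 46 MB
```

Under these assumptions the bank is small compared with the encoder weights, but the cost grows linearly with the number of exemplars, which matters when the same approach is applied to much larger intent inventories.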

🧩 Architecture Summary

Component	Choice
Encoder	MPNet (contrastively fine-tuned)
Training	Triplet loss + hard negatives
Retrieval	k-NN (k=5)
Decision rule	Max similarity
OOS detection	Threshold-based

💡 Why This Model?

This model demonstrates that contrastive retrieval + thresholding can outperform traditional softmax classifiers for:

OOS detection
generalization
robustness to distribution shift
