Upload README.md with huggingface_hub
README.md
ADDED
|
@@ -0,0 +1,89 @@
| 1 |
+
---
|
| 2 |
+
language: en
|
| 3 |
+
tags:
|
| 4 |
+
- text-classification
|
| 5 |
+
- sentence-transformers
|
| 6 |
+
- prompt-classification
|
| 7 |
+
- ai-safety
|
| 8 |
+
- llm
|
| 9 |
+
license: apache-2.0
|
| 10 |
+
datasets:
|
| 11 |
+
- belrem/llm-prompt-intent
|
| 12 |
+
---
|
| 13 |
+
|
| 14 |
+
# LLM Prompt Intent Classifier
|
| 15 |
+
|
| 16 |
+
Classifies user prompts sent to LLMs into four intent categories using
|
| 17 |
+
[all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
|
| 18 |
+
sentence embeddings and a **Logistic Regression** classification head.
|
| 19 |
+
|
| 20 |
+
## Labels
|
| 21 |
+
|
| 22 |
+
| ID | Label | Description |
|
| 23 |
+
|----|-------|-------------|
|
| 24 |
+
| 0 | `creative` | Fiction, brainstorming, roleplay, poetry |
|
| 25 |
+
| 1 | `informational` | Factual questions, explanations, definitions |
|
| 26 |
+
| 2 | `task` | Code, translation, summarisation, editing |
|
| 27 |
+
| 3 | `adversarial` | Jailbreaks, prompt injection, manipulation |
|
| 28 |
+
|
| 29 |
+
## Classifier comparison
|
| 30 |
+
|
| 31 |
+
| Classifier | Accuracy | F1 macro | F1 weighted |
|
| 32 |
+
|---|---|---|---|
|
| 33 |
+
| Logistic Regression | 0.8218 | 0.8222 | 0.8209 |
|
| 34 |
+
| Linear SVM | 0.7816 | 0.7824 | 0.7816 |
|
| 35 |
+
| MLP | 0.8103 | 0.8090 | 0.8100 |
|
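A comparison like the one in the table above could be produced along the following lines. This is a minimal sketch, not the script used for this model card: the data is synthetic (random 384-dimensional vectors standing in for all-MiniLM-L6-v2 embeddings), and every hyperparameter shown is an assumption, since the README does not state the settings used.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import LinearSVC

# Stand-in data: 384-dim vectors (the all-MiniLM-L6-v2 embedding size), 4 classes.
# With real data, X would be embedder.encode(prompts) and y the intent IDs.
X, y = make_classification(
    n_samples=800, n_features=384, n_informative=40,
    n_classes=4, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hyperparameters are illustrative assumptions, not the card's settings.
candidates = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Linear SVM": LinearSVC(),
    "MLP": MLPClassifier(hidden_layer_sizes=(128,), max_iter=300, random_state=0),
}

results = {}
for name, clf in candidates.items():
    clf.fit(X_train, y_train)
    pred = clf.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "f1_macro": f1_score(y_test, pred, average="macro"),
        "f1_weighted": f1_score(y_test, pred, average="weighted"),
    }

for name, scores in results.items():
    print(f"{name:20s} " + "  ".join(f"{k}={v:.4f}" for k, v in scores.items()))
```

The scores printed for the synthetic data are not comparable to the table; only the evaluation procedure carries over.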
| 36 |
+
|
| 37 |
+
## Best model: Logistic Regression
|
| 38 |
+
|
| 39 |
+
```
|
| 40 |
+
               precision    recall  f1-score   support
|
| 41 |
+
|
| 42 |
+
     creative       0.78      0.89      0.83        45
|
| 43 |
+
informational       0.84      0.77      0.80        48
|
| 44 |
+
         task       0.80      0.89      0.85        37
|
| 45 |
+
  adversarial       0.87      0.75      0.80        44
|
| 46 |
+
|
| 47 |
+
     accuracy                           0.82       174
|
| 48 |
+
    macro avg       0.82      0.83      0.82       174
|
| 49 |
+
 weighted avg       0.83      0.82      0.82       174
|
| 50 |
+
|
| 51 |
+
```
|
| 52 |
+
|
| 53 |
+
## Confusion matrix (best model)
|
| 54 |
+
|
| 55 |
+
```
|
| 56 |
+
Predicted →
|
| 57 |
+
               creative  info  task  adversarial
|
| 58 |
+
creative             40     2     3            0
|
| 59 |
+
informational         3    37     5            3
|
| 60 |
+
task                  0     2    33            2
|
| 61 |
+
adversarial           8     3     0           33
|
| 62 |
+
```
|
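The per-class and aggregate scores reported in this card can be recomputed from the confusion matrix alone. The sketch below assumes rows are true labels and columns are predicted labels, which is consistent with the row sums matching the support column of the classification report.

```python
import numpy as np

labels = ["creative", "informational", "task", "adversarial"]

# Rows: true label, columns: predicted label (values copied from the matrix above).
cm = np.array([
    [40,  2,  3,  0],
    [ 3, 37,  5,  3],
    [ 0,  2, 33,  2],
    [ 8,  3,  0, 33],
])

true_pos = np.diag(cm)          # correctly classified prompts per class
support = cm.sum(axis=1)        # number of true examples per class
predicted = cm.sum(axis=0)      # number of predictions per class

precision = true_pos / predicted
recall = true_pos / support
f1 = 2 * precision * recall / (precision + recall)

accuracy = true_pos.sum() / cm.sum()
f1_macro = f1.mean()                           # unweighted mean over classes
f1_weighted = np.average(f1, weights=support)  # mean weighted by class support

for name, p, r, f, s in zip(labels, precision, recall, f1, support):
    print(f"{name:13s}  P={p:.2f}  R={r:.2f}  F1={f:.2f}  n={s}")
print(f"accuracy={accuracy:.4f}  f1_macro={f1_macro:.4f}  f1_weighted={f1_weighted:.4f}")
```

The last line rounds to accuracy 0.8218, macro F1 0.8222 and weighted F1 0.8209, matching the Logistic Regression row of the comparison table.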
| 63 |
+
|
| 64 |
+
## Inference
|
| 65 |
+
|
| 66 |
+
```python
|
| 67 |
+
from sentence_transformers import SentenceTransformer
|
| 68 |
+
import joblib
|
| 69 |
+
from huggingface_hub import hf_hub_download
|
| 70 |
+
|
| 71 |
+
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
|
| 72 |
+
clf_path = hf_hub_download(repo_id="belrem/llm-prompt-intent-classifier", filename="classifier.joblib")
|
| 73 |
+
clf = joblib.load(clf_path)
|
| 74 |
+
|
| 75 |
+
prompt = "Write a poem about the ocean."
|
| 76 |
+
vec = embedder.encode([prompt])
|
| 77 |
+
label_id = clf.predict(vec)[0]
|
| 78 |
+
labels = ["creative", "informational", "task", "adversarial"]
|
| 79 |
+
print(labels[label_id]) # → creative
|
| 80 |
+
```
|
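For more than one prompt, the same steps can be wrapped in a small helper. This is a sketch only: `classify_prompts` and `LABELS` are hypothetical names, not part of this repository, and the helper assumes nothing beyond an `encode` method on the embedder and a `predict` method on the classifier, as used in the snippet above.

```python
from typing import Sequence

# Label order follows the ID column of the Labels table.
LABELS = ["creative", "informational", "task", "adversarial"]


def classify_prompts(prompts: Sequence[str], embedder, clf) -> list[str]:
    """Return one intent label per prompt (hypothetical helper).

    `embedder` must provide `encode(list_of_str)` and `clf` must provide
    `predict(vectors)` returning integer label IDs in the range 0..3.
    """
    if not prompts:
        return []
    vectors = embedder.encode(list(prompts))   # one embedding per prompt
    label_ids = clf.predict(vectors)           # one integer ID per prompt
    return [LABELS[int(i)] for i in label_ids]


# Usage with the objects loaded in the snippet above:
# classify_prompts(["Write a poem about the ocean.", "What is a transformer?"], embedder, clf)
```

Encoding all prompts in one `encode` call lets the embedder batch the work instead of running once per prompt.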
| 81 |
+
|
| 82 |
+
## Limitations
|
| 83 |
+
|
| 84 |
+
- Adversarial prompts have the lowest recall (0.75): sophisticated jailbreaks using
|
| 85 |
+
  creative or hypothetical framing may be misclassified, most often as `creative` (8 of 44 adversarial test prompts, versus 3 as `informational` and 0 as `task`).
|
| 86 |
+
- Intent is inherently ambiguous: a prompt can be both creative and a task at
|
| 87 |
+
  once. The model predicts only the single dominant intent.
|
| 88 |
+
- Dataset skew: adversarial examples from AdvBench may not reflect real-world
|
| 89 |
+
jailbreak distributions.
|
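One way to surface the ambiguity noted above is to look at class probabilities rather than only the top prediction. The sketch below is an illustration under assumptions: it presumes the stored classifier is a scikit-learn `LogisticRegression` (which exposes `predict_proba`) with classes ordered 0 to 3, and both the helper name `plausible_intents` and the 0.25 threshold are hypothetical choices, not values from this repository.

```python
import numpy as np

# Label order follows the ID column of the Labels table.
LABELS = ["creative", "informational", "task", "adversarial"]


def plausible_intents(probs, threshold=0.25):
    """Return (label, probability) pairs at or above `threshold`, highest first."""
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(probs)[::-1]  # class indices sorted by descending probability
    return [(LABELS[i], float(probs[i])) for i in order if probs[i] >= threshold]


# Made-up probability row for illustration; with the real classifier this
# would be something like clf.predict_proba(vec)[0] (assumed, see above).
example = [0.48, 0.07, 0.40, 0.05]
print(plausible_intents(example))
```

A prompt that returns two labels, as in this made-up example, could be routed to a stricter check instead of being trusted as purely `creative`.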