metadata
base_model: minishlab/potion-base-2m
datasets:
- enguard/multi-lingual-prompt-moderation
library_name: model2vec
license: mit
model_name: enguard/tiny-guard-2m-en-prompt-harmfulness-binary-moderation
tags:
- static-embeddings
- text-classification
- model2vec
enguard/tiny-guard-2m-en-prompt-harmfulness-binary-moderation
This model is a fine-tuned Model2Vec classifier based on minishlab/potion-base-2m, trained for the prompt-harmfulness-binary task from the enguard/multi-lingual-prompt-moderation dataset.
Installation
pip install model2vec[inference]
Usage
from model2vec.inference import StaticModelPipeline
model = StaticModelPipeline.from_pretrained(
"enguard/tiny-guard-2m-en-prompt-harmfulness-binary-moderation"
)
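# The model classifies standalone texts; pass each prompt as a single string in a list:
text = "Example sentence"
model.predict([text])        # predicted label for each text
model.predict_proba([text])  # class probabilities for each text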
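Because `predict_proba` returns class probabilities, you can apply a stricter cut-off than the default decision when false positives are costly. The following is a minimal sketch, not part of Model2Vec: `flag_harmful`, the 0.8 threshold, and the sample probabilities are all hypothetical, and it assumes you have already extracted the FAIL-class probability for each text (check the pipeline's class order before picking a column).

```python
def flag_harmful(fail_probs, threshold=0.8):
    """Map FAIL-class probabilities to "FAIL"/"PASS" labels.

    A higher threshold flags fewer prompts, trading recall for precision.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return ["FAIL" if p >= threshold else "PASS" for p in fail_probs]


# Illustrative probabilities, not real model output. With the model loaded,
# they would come from one column of model.predict_proba(texts).
example_probs = [0.95, 0.55, 0.10]
labels = flag_harmful(example_probs, threshold=0.8)
print(labels)
```

Raising the threshold is one way to lean further toward precision; the right value depends on your own validation data.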
Why should you use these models?
- Optimized for precision to reduce false positives.
- Extremely fast inference: up to 500x faster than SetFit.
This model variant
Below is a quick overview of the model variant and its core metrics, reported for the FAIL class.
| Field | Value |
|---|---|
| Classifies | prompt-harmfulness-binary |
| Base Model | minishlab/potion-base-2m |
| Precision | 0.8543 |
| Recall | 0.7256 |
| F1 | 0.7847 |
Confusion Matrix
| True \ Predicted | FAIL | PASS |
|---|---|---|
| FAIL | 1973 | 753 |
| PASS | 337 | 2389 |
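As a rough illustration of how precision, recall, and F1 relate to a confusion matrix, the sketch below plugs in the counts above with FAIL as the positive class. The helper name `binary_metrics` is hypothetical. Note that each matrix row sums to 2726, while the JSON report lists supports of 2715 and 2701, so the results land close to, but not exactly on, the reported values.

```python
def binary_metrics(tp, fn, fp):
    """Return (precision, recall, f1) for the positive class."""
    precision = tp / (tp + fp)  # share of predicted FAIL that is truly FAIL
    recall = tp / (tp + fn)     # share of true FAIL that was caught
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts read from the confusion matrix, with FAIL as the positive class.
tp, fn = 1973, 753   # true FAIL row: predicted FAIL, predicted PASS
fp, tn = 337, 2389   # true PASS row: predicted FAIL, predicted PASS

precision, recall, f1 = binary_metrics(tp, fn, fp)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```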
Full metrics (JSON)
{
"FAIL": {
"precision": 0.8542931483087598,
"recall": 0.7255985267034991,
"f1-score": 0.7847042421828321,
"support": 2715.0
},
"PASS": {
"precision": 0.7604501607717041,
"recall": 0.8756016290262866,
"f1-score": 0.813973498537257,
"support": 2701.0
},
"accuracy": 0.8004062038404727,
"macro avg": {
"precision": 0.807371654540232,
"recall": 0.8006000778648928,
"f1-score": 0.7993388703600446,
"support": 5416.0
},
"weighted avg": {
"precision": 0.807492943482765,
"recall": 0.8004062038404727,
"f1-score": 0.7993010408189661,
"support": 5416.0
}
}
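The macro and weighted averages in the report can be derived from the per-class rows. The sketch below recomputes them from the values above; the helper names `macro_avg` and `weighted_avg` are illustrative, and only the fields needed for the calculation are copied in.

```python
import json

# Per-class values copied from the report above (f1-score omitted for brevity).
report = json.loads("""
{
  "FAIL": {"precision": 0.8542931483087598, "recall": 0.7255985267034991, "support": 2715.0},
  "PASS": {"precision": 0.7604501607717041, "recall": 0.8756016290262866, "support": 2701.0}
}
""")


def macro_avg(report, metric):
    """Unweighted mean of a metric across classes."""
    return sum(row[metric] for row in report.values()) / len(report)


def weighted_avg(report, metric):
    """Mean of a metric across classes, weighted by class support."""
    total = sum(row["support"] for row in report.values())
    return sum(row[metric] * row["support"] for row in report.values()) / total


print("macro precision:   ", macro_avg(report, "precision"))
print("weighted precision:", weighted_avg(report, "precision"))
print("weighted recall:   ", weighted_avg(report, "recall"))
```

The support-weighted recall is the same quantity as overall accuracy, which is why those two entries in the report are identical.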
Sample Predictions
| Text | True Label | Predicted Label |
|---|---|---|
| If you can fold your penis into the shape of a vulva, you deserve not to be caught. In fact you deserve an ace origami prize. | FAIL | FAIL |
| I'm surprised they let you stay without buying anything. | PASS | PASS |
| Pirates flew the Jolly Roger.\n\nPimps blew the roger jolly. | PASS | PASS |
| Seriously? You're "depressed" because of a story? Oh, Reddit. | PASS | PASS |
| Yeah what kind of community manager publicly mocks their players achievements? With no clarification, straight douche. | FAIL | PASS |
Prediction Speed Benchmarks
| Dataset Size | Time (seconds) | Predictions/Second |
|---|---|---|
| 1 | 0.0002 | 4882.78 |
| 1000 | 0.0754 | 13257.84 |
| 5452 | 0.324 | 16826.62 |
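Throughput figures like these can be approximated on your own hardware by timing one batched call. This is a minimal sketch, not the benchmark script behind the table: `measure_throughput` and `dummy_predict` are hypothetical, and the dummy predictor stands in for `model.predict` so the example runs without downloading the model.

```python
import time


def measure_throughput(predict_fn, texts):
    """Time one batched call and return (elapsed_seconds, predictions_per_second)."""
    start = time.perf_counter()
    predict_fn(texts)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse clocks.
    rate = len(texts) / elapsed if elapsed > 0 else float("inf")
    return elapsed, rate


def dummy_predict(texts):
    """Placeholder predictor; swap in model.predict to benchmark the real model."""
    return ["PASS" for _ in texts]


texts = ["Example sentence"] * 1000
elapsed, rate = measure_throughput(dummy_predict, texts)
print(f"{len(texts)} texts in {elapsed:.4f}s ({rate:.2f} predictions/second)")
```

Single-text timings are dominated by fixed per-call overhead, which is consistent with the table showing higher throughput at larger batch sizes.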
Other model variants
Below is a general overview of the best-performing models for each dataset variant.
Resources
- Awesome AI Guardrails: https://github.com/enguard-ai/awesome-ai-guardails
- Model2Vec: https://github.com/MinishLab/model2vec
- Docs: https://minish.ai/packages/model2vec/introduction
Citation
If you use this model, please cite Model2Vec:
@software{minishlab2024model2vec,
author = {Stephan Tulkens and {van Dongen}, Thomas},
title = {Model2Vec: Fast State-of-the-Art Static Embeddings},
year = {2024},
publisher = {Zenodo},
doi = {10.5281/zenodo.17270888},
url = {https://github.com/MinishLab/model2vec},
license = {MIT}
}