Spamo-v1
A fine-tuned version of distilbert-base-uncased for binary spam classification (spam vs. ham). Spamo-v1 was trained on the UCI SMS Spam Collection dataset to detect spam messages with high precision (98.51% on the test set).
Model Performance
| Metric | Validation | Test |
|---|---|---|
| Accuracy | 99.46% | 98.92% |
| F1 | 98.04% | 95.65% |
| Precision | 100.00% | 98.51% |
| Recall | 96.15% | 92.96% |
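The F1 row can be cross-checked against the precision and recall rows, since F1 is their harmonic mean. The snippet below is a minimal sketch of that check; the function name `f1_from_precision_recall` is illustrative, and the inputs are the rounded percentages from the table, so small rounding differences are possible in general.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values copied from the table above.
validation_f1 = f1_from_precision_recall(1.0000, 0.9615)
test_f1 = f1_from_precision_recall(0.9851, 0.9296)

print(f"Validation F1: {validation_f1:.2%}")  # 98.04%
print(f"Test F1: {test_f1:.2%}")              # 95.65%
```

Both results agree with the reported F1 values to two decimal places.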
Usage
from transformers import pipeline
classifier = pipeline("text-classification", model="ereniko/Spamo-v1")
classifier("Congratulations! You've won a free iPhone. Click here to claim now!")
# [{'label': 'SPAM', 'score': 0.999}]
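To screen several messages at once, the pipeline output can be post-processed with a score threshold. The sketch below assumes the output format shown above (a list of dicts with `label` and `score` keys, with `SPAM` as the spam label). The helper names `flag_spam` and `load_classifier` and the 0.9 cutoff are hypothetical and not part of the model or the library.

```python
from typing import Callable, Dict, List


def flag_spam(
    classifier: Callable[[List[str]], List[Dict]],
    messages: List[str],
    threshold: float = 0.9,  # hypothetical cutoff; tune it on your own data
) -> List[bool]:
    """Return True for each message labelled 'SPAM' with score >= threshold."""
    results = classifier(messages)
    return [r["label"] == "SPAM" and r["score"] >= threshold for r in results]


def load_classifier():
    """Load Spamo-v1 from the Hub; transformers is imported lazily here."""
    from transformers import pipeline

    return pipeline("text-classification", model="ereniko/Spamo-v1")
```

A call would look like `flag_spam(load_classifier(), ["Free prize, reply now!", "See you at 6?"])`. Raising the threshold trades recall for precision, which may suit filters where blocking a legitimate message is costly.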
Training Details
- Base model: distilbert-base-uncased
- Dataset: ucirvine/sms_spam (4459 train / 557 val / 558 test)
- Epochs: 4
- Batch size: 32
- Weight decay: 0.01
- Warmup steps: 100
- Class weights: Ham 0.577 / Spam 3.731 (to handle class imbalance)
- Hardware: NVIDIA A100 40GB
- Training time: ~50 seconds
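The card does not say how the class weights were derived. One common choice is inverse-frequency ("balanced") weighting, where each class weight is the number of samples divided by the number of classes times that class's count. The sketch below illustrates that scheme only; the function name `balanced_class_weights` and the example counts are assumptions, not the actual Spamo-v1 training statistics.

```python
from typing import Dict


def balanced_class_weights(class_counts: Dict[str, int]) -> Dict[str, float]:
    """Inverse-frequency weights: n_samples / (n_classes * count_for_class)."""
    n_samples = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in class_counts.items()
    }


# Illustrative counts only; NOT the real Spamo-v1 training split.
example_counts = {"ham": 900, "spam": 100}
weights = balanced_class_weights(example_counts)
print(weights)
```

The rarer class receives the larger weight, so in a weighted cross-entropy loss a misclassified spam message costs more than a misclassified ham message. This is the same pattern as the Ham 0.577 / Spam 3.731 weights listed above.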
Limitations
- Trained on short SMS-style messages, so it may be less effective on long-form emails
- Phishing-style spam with subtle phrasing may slip through
- English only