Spamo-v1

A fine-tuned version of distilbert-base-uncased for binary spam classification. Spamo-v1 was trained on the UCI SMS Spam Collection dataset to detect spam messages with high precision.

Model Performance

Metric      Validation   Test
Accuracy    99.46%       98.92%
F1          98.04%       95.65%
Precision   100.00%      98.51%
Recall      96.15%       92.96%
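
As a quick consistency check on the figures above, F1 is the harmonic mean of precision and recall, so the reported F1 values can be recomputed from the other two rows. The sketch below assumes the standard binary F1 definition with spam as the positive class; the helper name f1_score is illustrative and not part of the model's code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the performance table.
reported = {
    "validation": (1.0000, 0.9615),
    "test": (0.9851, 0.9296),
}

for split, (p, r) in reported.items():
    print(f"{split}: F1 = {f1_score(p, r):.2%}")
# validation: F1 = 98.04%
# test: F1 = 95.65%
```

Both recomputed values match the table, which suggests the reported metrics are internally consistent.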

Usage

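from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
classifier = pipeline("text-classification", model="ereniko/Spamo-v1")

# Classify a single message; the result is a list with one label/score dict.
classifier("Congratulations! You've won a free iPhone. Click here to claim now!")
# [{'label': 'SPAM', 'score': 0.999}]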
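
For filtering several messages at once, one possible approach is to pass a list to the pipeline and keep only messages labeled SPAM above a confidence cutoff. This is a minimal sketch: the helper flag_spam and the 0.9 threshold are illustrative assumptions, not part of the model or of transformers, and the only label name taken from this card is 'SPAM'.

```python
from typing import Callable, Dict, List


def flag_spam(
    classifier: Callable[[List[str]], List[Dict]],
    messages: List[str],
    threshold: float = 0.9,  # hypothetical cutoff; tune it on your own data
) -> List[str]:
    """Return the messages labeled 'SPAM' with a score at or above the threshold."""
    # A text-classification pipeline called with a list of strings returns
    # one {'label': ..., 'score': ...} dict per input, in the same order.
    results = classifier(messages)
    return [
        message
        for message, result in zip(messages, results)
        if result["label"] == "SPAM" and result["score"] >= threshold
    ]


# Example with the real model (downloads the weights on first use):
# from transformers import pipeline
# classifier = pipeline("text-classification", model="ereniko/Spamo-v1")
# flag_spam(classifier, ["Win a free prize now!", "See you at lunch?"])
```

Raising the threshold trades recall for precision, so fewer legitimate messages are flagged at the cost of missing more borderline spam.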

Training Details

  • Base model: distilbert-base-uncased
  • Dataset: ucirvine/sms_spam (4459 train / 557 val / 558 test)
  • Epochs: 4
  • Batch size: 32
  • Weight decay: 0.01
  • Warmup steps: 100
  • Class weights: Ham 0.577 / Spam 3.731 (to handle class imbalance)
  • Hardware: NVIDIA A100 40GB
  • Training time: ~50 seconds
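
The class weights listed above are consistent with the common "balanced" heuristic, weight = n_samples / (n_classes * class_count), applied to the full SMS Spam Collection (4,827 ham and 747 spam messages). The sketch below reproduces the numbers under that assumption; the card does not state how the weights were actually computed, so treat this as a plausible reconstruction rather than the training code.

```python
# Assumed class counts for the full UCI SMS Spam Collection (5,574 messages).
counts = {"ham": 4827, "spam": 747}

total = sum(counts.values())
n_classes = len(counts)

# Balanced weighting: rarer classes receive proportionally larger weights.
weights = {label: total / (n_classes * n) for label, n in counts.items()}

for label, weight in weights.items():
    print(f"{label}: {weight:.3f}")
# ham: 0.577
# spam: 3.731
```

Weights like these are typically passed to a weighted cross-entropy loss so that errors on the minority spam class count more during training.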

Limitations

  • Trained on short SMS-style messages, so it may be less effective on long-form emails
  • Phishing-style spam with subtle phrasing may slip through
  • English only