Spamo-v1

A fine-tuned version of distilbert-base-uncased for binary spam classification (spam vs. ham). Spamo-v1 was trained on the UCI SMS Spam Collection dataset and is tuned to favor precision: it reaches 98.51% precision on the test split.

Model Performance

Metric	Validation	Test
Accuracy	99.46%	98.92%
F1	98.04%	95.65%
Precision	100.00%	98.51%
Recall	96.15%	92.96%
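
The F1 rows follow from the precision and recall rows, since F1 is their harmonic mean. A minimal sketch to check that the table is internally consistent (the function name `f1_from_pr` is illustrative and not part of the model's code):

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values taken from the table above, converted from percentages to fractions.
val_f1 = f1_from_pr(1.0000, 0.9615)
test_f1 = f1_from_pr(0.9851, 0.9296)

print(f"{val_f1:.2%}")   # 98.04%
print(f"{test_f1:.2%}")  # 95.65%
```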

Usage

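from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
classifier = pipeline("text-classification", model="ereniko/Spamo-v1")

# Classify a single message; the result is a list with one label/score dict.
classifier("Congratulations! You've won a free iPhone. Click here to claim now!")
# [{'label': 'SPAM', 'score': 0.999}]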
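
A caller who cares most about precision may want to act only on confident predictions. The sketch below post-processes pipeline output in the format shown above. The helper name `flag_spam`, the 0.9 threshold, and the `'HAM'` label string are assumptions made for illustration; none of them comes from the model author.

```python
from typing import Dict, List


def flag_spam(predictions: List[Dict], threshold: float = 0.9) -> List[bool]:
    """Return True for each prediction labeled 'SPAM' with score >= threshold."""
    return [
        p["label"] == "SPAM" and p["score"] >= threshold
        for p in predictions
    ]


# Hand-written stand-in for pipeline output so the sketch runs offline.
# With the real model this would be: predictions = classifier(list_of_messages)
# The 'HAM' label name is an assumption; check the model's config for the actual labels.
sample = [
    {"label": "SPAM", "score": 0.999},
    {"label": "SPAM", "score": 0.62},
    {"label": "HAM", "score": 0.98},
]

print(flag_spam(sample))  # [True, False, False]
```

Raising the threshold trades recall for precision, so a suitable value depends on how costly a missed spam message is compared with a blocked legitimate one.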

Training Details

Base model: distilbert-base-uncased
Dataset: ucirvine/sms_spam (4459 train / 557 val / 558 test)
Epochs: 4
Batch size: 32
Weight decay: 0.01
Warmup steps: 100
Class weights: Ham 0.577 / Spam 3.731 (to handle class imbalance)
Hardware: NVIDIA A100 40GB
Training time: ~50 seconds
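
The card does not say how the class weights were derived. They are consistent with the common "balanced" heuristic, weight = n_samples / (n_classes * class_count), which is sketched below as one plausible reading. The function name and the per-1000 class counts are hypothetical, chosen only to show that a split of roughly 87% ham and 13% spam yields weights of the listed size.

```python
from typing import Dict


def balanced_class_weights(counts: Dict[str, int]) -> Dict[str, float]:
    """Weight each class by n_samples / (n_classes * class_count)."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Hypothetical counts per 1000 messages (not the actual training split).
weights = balanced_class_weights({"ham": 866, "spam": 134})
rounded = {label: round(w, 3) for label, w in weights.items()}

print(rounded)  # {'ham': 0.577, 'spam': 3.731}
```

Weights like these are typically passed to a weighted cross-entropy loss so that errors on the rarer spam class count more during training.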

Limitations

Trained on short SMS-style messages, so it may be less effective on long-form emails
Phishing-style spam with subtle phrasing may slip through
English only
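
One possible workaround for the long-form limitation is to split a long text into SMS-sized chunks and score each chunk separately. This is an untested idea, not something the model author recommends. In the sketch below, the helper names, the 40-word chunk size, and the stand-in classifier are all hypothetical; with the real model, `classify` would be the pipeline object from the Usage section.

```python
from typing import Callable, Dict, List


def chunk_words(text: str, max_words: int = 40) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def max_spam_score(
    text: str,
    classify: Callable[[List[str]], List[Dict]],
    max_words: int = 40,
) -> float:
    """Return the highest 'SPAM' score over all chunks, or 0.0 if none is spam."""
    chunks = chunk_words(text, max_words)
    if not chunks:
        return 0.0
    results = classify(chunks)
    return max((r["score"] for r in results if r["label"] == "SPAM"), default=0.0)


def fake_classify(chunks: List[str]) -> List[Dict]:
    """Stand-in classifier for offline use: flags any chunk containing 'free'."""
    return [
        {"label": "SPAM", "score": 0.95} if "free" in c.lower()
        else {"label": "HAM", "score": 0.90}
        for c in chunks
    ]


long_text = "hello " * 50 + "claim your free prize now"
print(len(chunk_words(long_text)))               # 2
print(max_spam_score(long_text, fake_classify))  # 0.95
```

Taking the maximum flags a document if any single chunk looks like spam, which may raise false positives on long legitimate emails.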

Model Details

Format: Safetensors
Model size: 67M params
Tensor type: F32
