Heimdall SMS Guard

Lightweight SMS spam classifier. Character n-gram TF-IDF features fed into a class-balanced Logistic Regression, with a tuned decision threshold.

Small, fast, and CPU-only. Suitable for demos, baselines, and educational use.

Model Architecture

Component	Value
Vectorizer	`TfidfVectorizer`
Analyzer	`char_wb`
n-gram range	`(3, 5)`
Classifier	`LogisticRegression`
Class weight	`balanced`
Solver	`liblinear`
Decision threshold	`0.41`

The pipeline is serialized with joblib as a dict with three entries: `pipeline`, `threshold`, and `config`.
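
As a rough illustration of how the components in the table fit together, the sketch below builds a comparable scikit-learn pipeline, saves it in a dict with the three entries named above, and applies the threshold to the spam-class probability. The toy messages, the `build_pipeline` helper, the exact key names, the contents of `config`, and the output path are assumptions for illustration; the actual training script is not part of this card.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

THRESHOLD = 0.41  # decision threshold from the table above


def build_pipeline() -> Pipeline:
    """Character n-gram TF-IDF followed by class-balanced logistic regression."""
    return Pipeline([
        ("tfidf", TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5))),
        ("clf", LogisticRegression(class_weight="balanced", solver="liblinear")),
    ])


# Toy data for illustration only (hypothetical, not the real training set).
texts = [
    "Free entry to win a prize. Text WIN to 12345.",
    "URGENT! You have won a cash prize, call now to claim.",
    "Are we still meeting for lunch tomorrow?",
    "Can you pick up milk on the way home?",
]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = ham

pipeline = build_pipeline().fit(texts, labels)

# Assumed bundle layout: a plain dict with these three keys.
bundle = {
    "pipeline": pipeline,
    "threshold": THRESHOLD,
    "config": {"analyzer": "char_wb", "ngram_range": (3, 5)},
}
bundle_path = os.path.join(tempfile.mkdtemp(), "model_sketch.joblib")
joblib.dump(bundle, bundle_path)

# Reload and apply the threshold to the spam-class probability (column 1).
loaded = joblib.load(bundle_path)
spam_probability = loaded["pipeline"].predict_proba(["Win a free prize now"])[0, 1]
is_spam = spam_probability >= loaded["threshold"]
```

Thresholding `predict_proba` directly, instead of calling `predict`, is what lets a tuned cutoff such as 0.41 replace the default 0.5.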

Test Metrics

Evaluated on a held-out stratified test split (774 messages, 12.40% spam).

Metric	Value
Accuracy	0.9884
Spam Precision	0.9485
Spam Recall	0.9583
Spam F1	0.9534
False Positives	5
False Negatives	4
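
The reported figures can be cross-checked from the confusion counts. The sketch below derives the counts from the split size and spam rate quoted above; the spam count of 96 is inferred from 12.40% of 774 and is not a number stated in this card.

```python
# Cross-check the reported test metrics from the confusion counts.
total = 774             # test messages
false_positives = 5     # ham flagged as spam
false_negatives = 4     # spam that was missed

# Assumption: 12.40% of 774 corresponds to 96 spam messages.
spam_total = round(total * 0.1240)
true_positives = spam_total - false_negatives
true_negatives = total - spam_total - false_positives

precision = true_positives / (true_positives + false_positives)
recall = true_positives / (true_positives + false_negatives)
f1 = 2 * precision * recall / (precision + recall)
accuracy = (true_positives + true_negatives) / total

print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
# accuracy=0.9884 precision=0.9485 recall=0.9583 f1=0.9534
```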

Intended Use

SMS spam classification demos
Baseline benchmark for stronger spam models
Education / coursework on text classification
Research on lightweight NLP pipelines

Not Intended Use

Live consumer SMS filtering at scale
Anti-fraud or anti-phishing in financial messaging
Regulated communications filtering
Multilingual or non-English SMS
Image / MMS / RCS content

Limitations

See limitations.md for full detail.

Trained on an older English SMS spam dataset
May not generalize to modern spam patterns
Weak against unicode obfuscation, emoji-heavy spam, shortened links, callback scams
May produce false positives on transactional messages (OTP, 2FA, bank alerts, delivery notifications)
No drift detection or live monitoring included
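
To illustrate the unicode obfuscation weakness, the sketch below compares the character n-grams of a word with those of a look-alike variant. `char_wb_ngrams` is a simplified, hypothetical stand-in for the `char_wb` analyzer, written only for this example; it is not the vectorizer's actual code.

```python
def char_wb_ngrams(text: str, n_min: int = 3, n_max: int = 5) -> set[str]:
    """Approximate char_wb-style n-grams: pad each word with spaces, then slice."""
    grams = set()
    for word in text.lower().split():
        padded = f" {word} "
        for n in range(n_min, n_max + 1):
            for i in range(len(padded) - n + 1):
                grams.add(padded[i:i + n])
    return grams


plain = "free"
# Same-looking word with Cyrillic "е" (U+0435) in place of both Latin "e" characters.
obfuscated = "fr\u0435\u0435"

plain_grams = char_wb_ngrams(plain)
obfuscated_grams = char_wb_ngrams(obfuscated)
shared = plain_grams & obfuscated_grams

print(len(plain_grams), len(obfuscated_grams), sorted(shared))
# 9 9 [' fr']
```

Only one of the nine n-grams survives the substitution, so most of the features that would mark the word as spam-like never fire on the obfuscated form.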

Local Inference

pip install -r requirements.txt
python inference.py

Programmatic use:

from inference import predict

result = predict("Free entry to win a prize. Text WIN to 12345.")
print(result)
# {'label': 'spam', 'label_id': 1, 'spam_probability': 0.97, 'threshold': 0.41}
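
For readers who want to see how a result like this could be produced, here is a minimal sketch of the thresholding step. `inference.py` itself is not reproduced in this card, so `predict_with_bundle`, the `StubPipeline` class, the `>=` comparison, and the two-decimal rounding are all assumptions rather than the shipped implementation.

```python
from typing import Any


def predict_with_bundle(bundle: dict[str, Any], text: str) -> dict[str, Any]:
    """Score one message and apply the stored decision threshold.

    Assumes `bundle` has "pipeline" (exposing predict_proba) and "threshold"
    entries, and that class index 1 is spam.
    """
    spam_probability = float(bundle["pipeline"].predict_proba([text])[0][1])
    threshold = float(bundle["threshold"])
    label_id = int(spam_probability >= threshold)  # assumption: >= counts as spam
    return {
        "label": "spam" if label_id else "ham",
        "label_id": label_id,
        "spam_probability": round(spam_probability, 2),
        "threshold": threshold,
    }


class StubPipeline:
    """Hypothetical stand-in so the sketch runs without model.joblib."""

    def predict_proba(self, texts):
        return [[0.03, 0.97] for _ in texts]


stub_bundle = {"pipeline": StubPipeline(), "threshold": 0.41}
result = predict_with_bundle(stub_bundle, "Free entry to win a prize. Text WIN to 12345.")
print(result)
# {'label': 'spam', 'label_id': 1, 'spam_probability': 0.97, 'threshold': 0.41}
```

With the real artifact, the bundle would presumably come from `joblib.load("model.joblib")` in place of the stub.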

Status

Demo-ready. Not production-ready.

Production deployment would require additional work: monitoring, drift detection, a fallback strategy, hardened input validation, threshold recalibration on live traffic, and privacy-aware logging.
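
As one possible shape for the threshold recalibration step, the sketch below sweeps candidate thresholds over a labeled sample and keeps the one with the best spam F1. This card does not say how 0.41 was tuned, so the F1 objective, the candidate grid, the function names, and the sample data are all assumptions.

```python
def f1_at_threshold(y_true: list[int], scores: list[float], threshold: float) -> float:
    """Spam-class F1 when every score >= threshold is labeled spam."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(y_true: list[int], scores: list[float], candidates: list[float]) -> float:
    """Return the candidate with the highest F1; ties go to the lowest threshold."""
    return max(candidates, key=lambda t: (f1_at_threshold(y_true, scores, t), -t))


# Hypothetical labeled sample of live traffic: 1 = spam, 0 = ham.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.95, 0.80, 0.45, 0.50, 0.20, 0.10, 0.05, 0.35]
candidates = [i / 100 for i in range(5, 100, 5)]  # 0.05, 0.10, ..., 0.95

chosen = best_threshold(y_true, scores, candidates)
print(chosen, round(f1_at_threshold(y_true, scores, chosen), 3))
# 0.25 0.889
```

A deployment that cares more about false positives on transactional messages might optimize precision at a fixed recall instead of F1.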

Files

File	Purpose
`model.joblib`	Trained pipeline + threshold + config
`inference.py`	Loader + `predict()` API
`requirements.txt`	Runtime dependencies
`metrics.json`	Final test metrics
`example_inputs.json`	Sample ham + spam messages
`limitations.md`	Full limitations writeup

License

MIT
