tags:
- text-classification
- spam-detection
- scikit-learn
- logistic-regression
metrics:
- accuracy

Spam Detector

This model is a lightweight spam vs. ham text classifier built with Logistic Regression on TF-IDF features, reaching roughly 94% accuracy on its test set. It is designed to flag promotional spam, phishing attempts, and other unwanted messages in real time.

Model Details

Algorithm: Logistic Regression
Feature Extraction: TF-IDF (Term Frequency-Inverse Document Frequency)
Training Data: Spam Collection Dataset
Language: English
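
To illustrate how these components fit together, the sketch below trains a TF-IDF vectorizer and a Logistic Regression classifier on a tiny made-up corpus. The toy messages, the default hyperparameters, the `stop_words="english"` setting, and the 0 = spam / 1 = ham label encoding (chosen to match the usage snippet in this card) are assumptions for illustration, not the settings used to train this model.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy corpus (hypothetical); label 0 = spam, 1 = ham
texts = [
    "Win a free prize now, click this link",
    "Claim your free gift card reward today",
    "Free cash prize, claim now",
    "Are we still meeting for lunch tomorrow?",
    "Can you send me the report by Friday?",
    "See you at the gym tonight",
]
labels = [0, 0, 0, 1, 1, 1]

# Learn the vocabulary and IDF weights, then turn each message into a TF-IDF vector
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
features = vectorizer.fit_transform(texts)

# Fit a linear classifier on the TF-IDF vectors
model = LogisticRegression()
model.fit(features, labels)

train_predictions = model.predict(features)
```

New text has to go through `vectorizer.transform` (not `fit_transform`) so that it is mapped onto the vocabulary learned during training.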

Performance

The model achieves the following accuracy on the training and test splits:

Training Accuracy: ~96.7%
Test Accuracy: ~94%
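
Accuracy here is the fraction of messages whose predicted label matches the true label. A minimal sketch of how such a figure could be computed with scikit-learn follows; the label lists are made-up placeholders, not this model's actual predictions.

```python
from sklearn.metrics import accuracy_score

# Hypothetical labels for illustration only (0 = spam, 1 = ham)
y_true = [0, 1, 1, 0, 1, 1, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 1, 1, 0, 1, 1, 1]

# Fraction of positions where the prediction equals the true label
accuracy = accuracy_score(y_true, y_pred)
print(f"Accuracy: {accuracy:.1%}")
```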

Files Included

spam_model.pkl: The trained Logistic Regression model (learned weights), loadable with joblib.
vectorizer.pkl: The TF-IDF vocabulary and IDF weights used to transform text.
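
Both files are loaded with joblib in the usage example, so they were presumably written with `joblib.dump`. The sketch below shows one way a fitted vectorizer and model could be saved under these file names and reloaded. The toy training data and the temporary output directory are assumptions for illustration.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Fit a tiny stand-in model (hypothetical data; 0 = spam, 1 = ham)
texts = ["free prize claim now", "lunch at noon tomorrow"]
labels = [0, 1]
vectorizer = TfidfVectorizer()
model = LogisticRegression().fit(vectorizer.fit_transform(texts), labels)

# Save both artifacts: the model's weights are indexed by the
# vectorizer's vocabulary, so one is not usable without the other.
out_dir = tempfile.mkdtemp()
model_path = os.path.join(out_dir, "spam_model.pkl")
vectorizer_path = os.path.join(out_dir, "vectorizer.pkl")
joblib.dump(model, model_path)
joblib.dump(vectorizer, vectorizer_path)

# Reload and check that the feature dimensions still line up
loaded_model = joblib.load(model_path)
loaded_vectorizer = joblib.load(vectorizer_path)
n_features = loaded_vectorizer.transform(["free prize"]).shape[1]
```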

How to use locally

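```python
import joblib

# Load the trained model and the fitted TF-IDF vectorizer
model = joblib.load('spam_model.pkl')
vectorizer = joblib.load('vectorizer.pkl')

# Transform the raw text with the same vectorizer used in training, then predict
text = ["You have won a gift card. Claim your reward at this link."]
features = vectorizer.transform(text)
prediction = model.predict(features)

# As written in this card, label 0 denotes spam; any other label is treated as ham
print("Spam" if prediction[0] == 0 else "Ham")
```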
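
Where a confidence score is more useful than a hard label, `predict_proba` can be used alongside `predict`. The helper below is a sketch, not part of this repository: the `classify` function, its `spam_label=0` default (taken from the snippet above), and the stand-in model trained on toy data are all assumptions made so the example runs without the `.pkl` files.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def classify(messages, model, vectorizer, spam_label=0):
    """Return (label, spam_probability) pairs for a list of messages."""
    features = vectorizer.transform(messages)
    # Find the predict_proba column that corresponds to the spam class
    spam_column = list(model.classes_).index(spam_label)
    spam_probabilities = model.predict_proba(features)[:, spam_column]
    predictions = model.predict(features)
    return [
        ("Spam" if pred == spam_label else "Ham", float(prob))
        for pred, prob in zip(predictions, spam_probabilities)
    ]


# Stand-in model so the sketch is runnable (hypothetical data; 0 = spam, 1 = ham)
texts = ["win a free prize now", "claim your free reward",
         "see you at lunch", "meeting moved to friday"]
labels = [0, 0, 1, 1]
vectorizer = TfidfVectorizer()
model = LogisticRegression().fit(vectorizer.fit_transform(texts), labels)

results = classify(["free prize reward", "lunch on friday"], model, vectorizer)
for label, prob in results:
    print(f"{label}: spam probability {prob:.2f}")
```

With the files from this repository, `model` and `vectorizer` would instead come from `joblib.load`, as in the snippet above.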
