tags:
- text-classification
- spam-detection
- scikit-learn
- logistic-regression
metrics:
- accuracy

Spam Detector

This model is a lightweight spam vs. ham classifier built with Logistic Regression on TF-IDF features. It is designed to flag promotional spam, phishing attempts, and other unwanted messages in real time.

Model Details

  • Algorithm: Logistic Regression
  • Feature Extraction: TF-IDF (Term Frequency-Inverse Document Frequency)
  • Training Data: Spam Collection Dataset
  • Language: English
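
As a rough illustration of the feature extraction step, the sketch below computes TF-IDF weights in plain Python. The example corpus, the tokenizer, the smoothed IDF formula ln((1 + n) / (1 + df)) + 1, and the L2 normalization are assumptions chosen to mirror scikit-learn's `TfidfVectorizer` defaults; the settings actually stored in vectorizer.pkl are not documented here and may differ.

```python
import math
import re
from collections import Counter

# Illustrative corpus only; not taken from the model's training data.
docs = [
    "win a free prize now",
    "are we still meeting for lunch",
    "free entry claim your prize",
]

def tokenize(text):
    # Lowercase and keep word tokens of two or more characters (assumed tokenizer).
    return re.findall(r"\b\w\w+\b", text.lower())

def tfidf(docs):
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency (assumed formula).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights = {t: count * idf[t] for t, count in tf.items()}
        # L2-normalize each document vector; an empty document keeps norm 1.0.
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({t: w / norm for t, w in weights.items()})
    return vectors

vectors = tfidf(docs)
for doc, vec in zip(docs, vectors):
    print(doc, {t: round(w, 3) for t, w in vec.items()})
```

Terms that appear in fewer documents (such as "win") receive a larger weight than terms shared across documents (such as "free"), which is what lets a linear classifier pick out distinctive words.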

Performance

The model achieves the following accuracy on the training and test sets:

  • Training Accuracy: ~96.7%
  • Test Accuracy: ~94%
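
Accuracy here means the fraction of messages whose predicted label matches the true label. The sketch below shows that calculation on made-up labels; the `accuracy` helper and the sample values are hypothetical and are not the evaluation script used for this model. The labels follow the convention used in the usage snippet in this card (0 = spam, 1 = ham).

```python
def accuracy(y_true, y_pred):
    """Return the share of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Illustrative labels only: 0 = spam, 1 = ham.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

print(accuracy(y_true, y_pred))  # 0.8 (4 of 5 predictions are correct)
```

scikit-learn's `accuracy_score` computes the same quantity for real evaluation runs.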

Files Included

  • spam_model.pkl: The trained Logistic Regression weights.
  • vectorizer.pkl: The TF-IDF vocabulary and IDF weights used to transform text.

How to use locally

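```python
import joblib

# Load the trained model and the fitted TF-IDF vectorizer
model = joblib.load('spam_model.pkl')
vectorizer = joblib.load('vectorizer.pkl')

# Transform the raw text with the loaded vectorizer, then predict
text = ["You have won a gift card. Claim your reward at this link."]
features = vectorizer.transform(text)
prediction = model.predict(features)

# Label 0 is treated as spam; any other label is reported as ham
print("Spam" if prediction[0] == 0 else "Ham")
```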
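
To classify several messages at once, the steps above could be wrapped in a small helper. This is a minimal sketch: the `classify` function and the `LABELS` mapping are hypothetical names, and the mapping assumes the same encoding as the snippet above (0 = spam, 1 = ham).

```python
# Label convention assumed from the usage snippet: 0 = spam, 1 = ham.
LABELS = {0: "Spam", 1: "Ham"}

def classify(texts, model, vectorizer):
    """Return a list of (text, label) pairs for an iterable of messages.

    `model` needs a predict() method and `vectorizer` a transform() method,
    as the objects loaded from spam_model.pkl and vectorizer.pkl do.
    """
    texts = list(texts)
    if not texts:
        return []
    features = vectorizer.transform(texts)
    predictions = model.predict(features)
    return [(text, LABELS[int(pred)]) for text, pred in zip(texts, predictions)]
```

With the files loaded as shown earlier, a call such as `classify(["Free entry, claim now", "See you at 6?"], model, vectorizer)` returns one label per message.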