tags:
- text-classification
- spam-detection
- scikit-learn
- logistic-regression
metrics:
- accuracy
Spam Detector
This model is a lightweight Spam vs. Ham classifier built with Logistic Regression on TF-IDF features. It is designed to identify promotional spam, phishing attempts, and other unwanted messages in real time.
Model Details
- Algorithm: Logistic Regression
- Feature Extraction: TF-IDF (Term Frequency-Inverse Document Frequency)
- Training Data: Spam Collection Dataset
- Language: English
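To make the feature-extraction step concrete, here is a minimal pure-Python sketch of TF-IDF weighting. It assumes scikit-learn's default smoothed IDF, ln((1 + n) / (1 + df)) + 1, followed by L2 normalisation. The card does not state the vectorizer settings actually used, and the whitespace tokenizer, the `tfidf_vectors` helper, and the three toy messages are illustrative only.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return (vectors, idf) for a list of whitespace-tokenised documents.

    Each vector is a dict mapping term -> L2-normalised TF-IDF weight.
    """
    tokenised = [doc.lower().split() for doc in documents]
    n_docs = len(tokenised)

    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenised for term in set(tokens))

    # Smoothed IDF (assumed to match scikit-learn's default settings).
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1
           for term, count in df.items()}

    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)  # raw term counts within this document
        weights = {term: count * idf[term] for term, count in tf.items()}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: w / norm for t, w in weights.items()} if norm else {})
    return vectors, idf


# Toy messages, invented for illustration (not taken from the training data).
docs = [
    "claim your free prize now",
    "are we meeting for lunch",
    "free prize inside claim now",
]
vectors, idf = tfidf_vectors(docs)
```

Terms that occur in fewer documents (such as `lunch` here) receive a larger IDF than terms shared across messages (such as `free`), which is what lets a linear model weight distinctive vocabulary more heavily.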
Performance
The model achieves the following approximate accuracy on the training and test sets:
- Training Accuracy: ~96.7%
- Test Accuracy: ~94%
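For reference, accuracy is the fraction of messages whose predicted label matches the true label. The sketch below shows the arithmetic on made-up labels; the `accuracy` helper and the toy label lists are hypothetical and do not reproduce the figures reported above.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty list")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy labels, invented for illustration only.
y_true = [0, 1, 1, 1, 0, 1, 1, 1, 1, 1]
y_pred = [0, 1, 1, 1, 1, 1, 1, 1, 1, 1]  # one mistake out of ten
toy_accuracy = accuracy(y_true, y_pred)
print(f"{toy_accuracy:.1%}")
```

With the real model, the same calculation would be applied once to predictions on the training split and once to predictions on the held-out test split.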
Files Included
- spam_model.pkl: The trained Logistic Regression weights.
- vectorizer.pkl: The TF-IDF vocabulary and IDF weights used to transform text.
How to use locally
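```python
import joblib

# Load the trained classifier and the fitted TF-IDF vectorizer
model = joblib.load('spam_model.pkl')
vectorizer = joblib.load('vectorizer.pkl')

# Vectorize the message and predict its label
text = ["You have won a gift card. Claim your reward at this link."]
features = vectorizer.transform(text)
prediction = model.predict(features)

# In this example, label 0 is treated as spam and any other label as ham
print("Spam" if prediction[0] == 0 else "Ham")
```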
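For batches of messages, the steps in the snippet above could be wrapped in small helpers. This is a sketch rather than part of the released files: `classify_messages`, `spam_probabilities`, and `SPAM_LABEL` are hypothetical names, and the assumption that label 0 means spam is taken from the snippet above. `predict_proba` and `classes_` are standard attributes of a fitted scikit-learn `LogisticRegression`.

```python
SPAM_LABEL = 0  # assumption: label 0 means spam, as in the snippet above


def classify_messages(texts, model, vectorizer, spam_label=SPAM_LABEL):
    """Return "Spam" or "Ham" for each message in texts."""
    features = vectorizer.transform(texts)
    predictions = model.predict(features)
    return ["Spam" if label == spam_label else "Ham" for label in predictions]


def spam_probabilities(texts, model, vectorizer, spam_label=SPAM_LABEL):
    """Return the predicted probability of the spam class for each message."""
    features = vectorizer.transform(texts)
    # predict_proba columns follow the order of model.classes_
    spam_column = list(model.classes_).index(spam_label)
    return [row[spam_column] for row in model.predict_proba(features)]
```

After loading `model` and `vectorizer` with `joblib.load` as shown earlier, calling `classify_messages(["Free entry, reply now"], model, vectorizer)` returns a one-element list of labels, and `spam_probabilities` can be used to apply a stricter or looser threshold than the default decision.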