---

language: en
license: mit
tags:
- spam-detection
- text-classification
- sms
- bert
- transformers
datasets:
- sms-spam-collection
metrics:
- accuracy
- precision
- recall
- f1
widget:
- text: "Congratulations! You've won a $1000 gift card. Click here to claim now!"
  example_title: "Spam Example"
- text: "Hey, are we still meeting for lunch tomorrow at 12?"
  example_title: "Ham Example"
- text: "URGENT! Your account has been suspended. Verify now to restore access."
  example_title: "Spam Example 2"
- text: "Thanks for your help today. I really appreciate it!"
  example_title: "Ham Example 2"
---


# SMS Spam Detection with BERT

🎯 A high-performance SMS spam classifier built with BERT achieving **99.16% accuracy**.

## Model Description

This model is a fine-tuned BERT classifier designed to detect spam messages in SMS text. It can classify messages as either:
- **HAM** (legitimate message)
- **SPAM** (unwanted/spam message)

## Performance Metrics

| Metric | Score |
|--------|-------|
| **Accuracy** | 99.16% |
| **Precision** | 97.30% |
| **Recall** | 96.43% |
| **F1-Score** | 96.86% |
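
The F1 score in the table is the harmonic mean of precision and recall. A minimal check of that arithmetic, using only the values reported above (the variable names are illustrative):

```python
# Values reported in the metrics table above.
precision = 0.9730
recall = 0.9643

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"F1: {f1:.2%}")  # F1: 96.86%
```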

## Quick Start

### Using Transformers Pipeline

```python
from transformers import pipeline

# Load the model
classifier = pipeline("text-classification", model="niru-nny/SMS_Spam_Detection")

# Classify a message
result = classifier("Congratulations! You've won a $1000 gift card!")
print(result)
# Output: [{'label': 'SPAM', 'score': 0.9987}]
```

### Using AutoModel and AutoTokenizer

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "niru-nny/SMS_Spam_Detection"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Prepare input
text = "Hey, are we still meeting for lunch tomorrow?"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

# Get prediction
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(predictions, dim=-1).item()

# Map to label
labels = ["HAM", "SPAM"]
print(f"Prediction: {labels[predicted_class]} (confidence: {predictions[0][predicted_class]:.4f})")
```

## Training Details

### Dataset
- **Source:** SMS Spam Collection Dataset
- **Total Messages:** 5,574
- **Ham Messages:** 4,827 (86.6%)
- **Spam Messages:** 747 (13.4%)
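
Because the classes are imbalanced, accuracy alone can look flattering: a classifier that always predicts HAM would already be right most of the time. A small sketch of that baseline, computed from the counts above (the variable names are illustrative):

```python
# Class counts from the dataset summary above.
ham_count = 4827
spam_count = 747
total = ham_count + spam_count

# Accuracy of a trivial classifier that labels every message as HAM.
majority_baseline = ham_count / total

print(f"Total messages: {total}")
print(f"Majority-class baseline accuracy: {majority_baseline:.1%}")
```

This is why precision, recall, and F1 for the SPAM class are reported alongside accuracy.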

### Training Configuration
- **Base Model:** `bert-base-uncased`
- **Max Sequence Length:** 128 tokens
- **Batch Size:** 16
- **Learning Rate:** 2e-5
- **Epochs:** 3
- **Optimizer:** AdamW

### Data Split
- **Training:** 80%
- **Validation:** 20%
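
As a rough sketch, the split and training configuration above imply the following set sizes and number of optimizer steps. This assumes the training size is rounded down and the final partial batch is kept; the card does not state how the split was actually performed.

```python
import math

# Figures taken from the Dataset and Training Configuration sections.
total_messages = 5574
train_fraction = 0.80
batch_size = 16
epochs = 3

# Assumption: the training set size is rounded down to a whole message.
train_size = int(total_messages * train_fraction)
val_size = total_messages - train_size

# Assumption: the last, smaller batch of each epoch is not dropped.
steps_per_epoch = math.ceil(train_size / batch_size)
total_steps = steps_per_epoch * epochs

print(f"Train: {train_size}, validation: {val_size}")
print(f"Steps per epoch: {steps_per_epoch}, total steps: {total_steps}")
```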

## Model Architecture

```
Input Text → BERT Tokenizer → BERT Encoder (12 layers) → [CLS] Token → Classification Head → Output (HAM/SPAM)
```

## Use Cases

✅ **Spam Filtering**: Automatically filter spam messages in messaging applications  
✅ **SMS Gateway Protection**: Protect users from phishing and scam attempts  
✅ **Content Moderation**: Pre-screen messages in communication platforms  
✅ **Fraud Detection**: Identify suspicious messages in financial apps  

## Limitations

- Model is trained specifically on English SMS messages
- May not generalize well to other languages or message formats
- Performance may vary on messages with heavy slang or abbreviations
- Trained on historical data; new spam patterns may emerge

## Ethical Considerations

⚠️ **Privacy**: Ensure compliance with data protection regulations when processing user messages  
⚠️ **False Positives**: Important legitimate messages might be incorrectly flagged as spam  
⚠️ **Bias**: Model may reflect biases present in training data  
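
One way to reduce false positives is to flag a message only when the SPAM score clears a stricter threshold. The sketch below works on dictionaries shaped like the pipeline output shown in Quick Start. The helper `is_spam` and the `0.9` threshold are hypothetical, not part of this model, and the sample predictions are hand-written rather than real model output.

```python
def is_spam(prediction: dict, threshold: float = 0.9) -> bool:
    """Return True only for a SPAM label whose score meets the threshold."""
    return prediction["label"] == "SPAM" and prediction["score"] >= threshold


# Hand-written examples in the pipeline's output format (not real model output).
predictions = [
    {"label": "SPAM", "score": 0.9987},
    {"label": "SPAM", "score": 0.62},
    {"label": "HAM", "score": 0.98},
]

flags = [is_spam(p) for p in predictions]
print(flags)  # [True, False, False]
```

A higher threshold trades recall for precision, so the right value depends on how costly a missed spam message is compared with a blocked legitimate one.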

## Citation

If you use this model, please cite:

```bibtex
@misc{sms_spam_detection_bert_2026,
  title={SMS Spam Detection with BERT},
  author={niru-nny},
  year={2026},
  url={https://huggingface.co/niru-nny/SMS_Spam_Detection}
}
```

## License

MIT License

## Contact

For questions or feedback, please open a discussion on the [model repository](https://huggingface.co/niru-nny/SMS_Spam_Detection/discussions).

---

**Model Card:** For detailed information about model development, evaluation, and responsible AI considerations, see the complete model card in the repository.