---
license: mit
---

# bert-sms-detector

**bert-sms-detector** is a fine-tuned BERT-based model for **SMS spam detection**.
It classifies input text messages as **spam** or **ham** (not spam).

---

## Model Details

- **Base Model**: `bert-base-uncased`
- **Task**: Text classification (spam detection)
- **Dataset**: [UC Irvine SMS Spam Collection](https://huggingface.co/datasets/ucirvine/sms_spam)
- **Fine-tuning**: The model has been fine-tuned to detect spam messages from SMS text.

---

## Usage

```python
# Install the dependencies first (shell command, not Python):
#   pip install transformers torch

from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_name = "alanjoshua2005/bert-sms-detector"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

texts = [
    "Congratulations! You've won a free ticket to the Bahamas.",
    "Hey, are we still meeting tomorrow at 5 PM?",
    "Free entry in 2 tickets to the concert. Text WIN to 80088.",
]

# Map the model's generic output labels to readable names
label_map = {"LABEL_0": "Not Spam", "LABEL_1": "Spam"}

# Each result is a dict with a "label" and a "score"
results = classifier(texts)

for text, result in zip(texts, results):
    print(f"Text: {text}\nPrediction: {label_map[result['label']]}\n")
```
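
The usage example prints only the label, but each pipeline result also carries a `score`. The sketch below shows one possible way to keep that score and flag uncertain predictions. The helper name `summarize` and the `min_score` cutoff are hypothetical, introduced here for illustration, and the sample results are hand-written stand-ins, not real output from this model.

```python
# Label mapping taken from the usage example above.
LABEL_MAP = {"LABEL_0": "Not Spam", "LABEL_1": "Spam"}


def summarize(texts, results, min_score=0.5):
    """Pair each text with a readable label and a low-confidence flag.

    `results` is expected in the shape a text-classification pipeline
    returns: a list of dicts with "label" and "score" keys.
    `min_score` is a hypothetical cutoff, not a tuned value.
    """
    rows = []
    for text, result in zip(texts, results):
        rows.append({
            "text": text,
            # Fall back to the raw label if it is not in the mapping.
            "prediction": LABEL_MAP.get(result["label"], result["label"]),
            "score": result["score"],
            "low_confidence": result["score"] < min_score,
        })
    return rows


# Hand-written stand-in for pipeline output (illustrative values only),
# so the sketch runs without downloading the model.
example_texts = [
    "Free entry in 2 tickets to the concert. Text WIN to 80088.",
    "Hey, are we still meeting tomorrow at 5 PM?",
]
example_results = [
    {"label": "LABEL_1", "score": 0.9},
    {"label": "LABEL_0", "score": 0.55},
]

for row in summarize(example_texts, example_results, min_score=0.6):
    print(row)
```

With the real model, `classifier(texts)` from the usage example can be passed in place of `example_results`.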