---
language:
- en
metrics:
- accuracy
pipeline_tag: text-classification
library_name: bertopic
tags:
- code
---

# SpamHunter Model

SpamHunter is a BERT model fine-tuned to detect spam in English email text.

## Model Details

- **Base Model**: `bert-base-uncased`
- **Dataset**: Custom spam emails dataset
- **Training**: 3 epochs
- **Validation Accuracy**: ~99%
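
For reference, validation accuracy is the fraction of held-out examples whose predicted label matches the true label. The snippet here is a minimal sketch of that calculation, assuming label 1 means spam and 0 means not spam (matching the `prediction == 1` check in the usage code). The function name `accuracy` and the toy data are hypothetical and are not taken from the actual training or evaluation code.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Return the fraction of predictions that equal their true label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count positions where the predicted class matches the true class.
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical toy data (assumption): 1 = spam, 0 = not spam.
toy_labels = [1, 0, 0, 1, 0]
toy_predictions = [1, 0, 1, 1, 0]
print(accuracy(toy_predictions, toy_labels))  # 4 of 5 correct -> 0.8
```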

## How to Use

### Direct Integration with Transformers

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Load model and tokenizer
tokenizer = BertTokenizer.from_pretrained("ar4min/SpamHunter")
model = BertForSequenceClassification.from_pretrained("ar4min/SpamHunter")

# Example
text = "Congratulations! You've won a $1000 gift card. Click here to claim now."

# Truncate long emails to BERT's 512-token input limit
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Inference only, so skip gradient tracking
with torch.no_grad():
    outputs = model(**inputs)

# Pick the class with the highest logit (1 is treated as spam)
prediction = outputs.logits.argmax(-1).item()

print("Spam" if prediction == 1 else "Not Spam")