udit-k/HamSpamBERT

Text Classification

Generated from Trainer

udit-k committed on Apr 13, 2024

Commit e6a29ab (verified) · 1 Parent(s): 64319dc

Update README.md

Files changed (1)

README.md +22 -5

@@ -17,7 +17,7 @@ should probably proofread and complete it, then remove this comment. -->
 # HamSpamBERT
-This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on an unknown dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.0072
 - Accuracy: 0.9991
@@ -25,19 +25,36 @@ It achieves the following results on the evaluation set:
 - Recall: 0.9933
 - F1: 0.9966
 ## Model description
-More information needed
 ## Intended uses & limitations
-More information needed
 ## Training and evaluation data
-More information needed
-## Training procedure
 ### Training hyperparameters

 # HamSpamBERT
+This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the [Spam-Ham](https://huggingface.co/datasets/SalehAhmad/Spam-Ham) dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.0072
 - Accuracy: 0.9991
 - Recall: 0.9933
 - F1: 0.9966
+```python
+from transformers import pipeline, BertTokenizer, BertForSequenceClassification
+
+# Load the fine-tuned tokenizer and model from the Hugging Face Hub
+tokenizer = BertTokenizer.from_pretrained("udit-k/HamSpamBERT")
+model = BertForSequenceClassification.from_pretrained("udit-k/HamSpamBERT")
+
+# "text-classification" is the canonical task name; "sentiment-analysis" is an alias for it
+classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
+
+text = "Call this number to win FREE IPL FINAL tickets!!!"
+result = classifier(text)
+print(result)
+```
+```
+[{'label': 'LABEL_1', 'score': 0.9999189376831055}]
+```
 ## Model description
+This model is a fine-tuned version of the [BERT](https://huggingface.co/bert-base-uncased) model on the [Spam-Ham](https://huggingface.co/datasets/SalehAhmad/Spam-Ham) dataset for spam detection, a binary text-classification task. It returns one of two labels:
+- `LABEL_0` = Ham (not spam)
+- `LABEL_1` = Spam
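The raw `LABEL_0` / `LABEL_1` names can be mapped to readable ones after inference. The following is a minimal sketch, not part of the commit: `LABEL_NAMES` and `readable` are hypothetical names introduced here, and the sample input is the output shown in the README example above.

```python
# Mapping taken from the model card: LABEL_0 = Ham, LABEL_1 = Spam.
LABEL_NAMES = {"LABEL_0": "ham", "LABEL_1": "spam"}


def readable(results):
    """Replace raw pipeline labels with readable names (hypothetical helper).

    `results` is a list of {"label": ..., "score": ...} dicts, the shape
    printed by the README example. Unknown labels are passed through as-is.
    """
    return [
        {"label": LABEL_NAMES.get(r["label"], r["label"]), "score": r["score"]}
        for r in results
    ]


# Sample output copied from the README example above.
raw = [{"label": "LABEL_1", "score": 0.9999189376831055}]
print(readable(raw))
```

Keeping the mapping outside the model leaves the published checkpoint untouched; the helper only relabels results on the caller's side.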
 ## Intended uses & limitations
+This model can be used to detect spam texts. Its primary limitation is the small dataset: it was trained on about 4700 rows and evaluated on around 1200 rows, so the reported metrics may not carry over to text that differs from this corpus.
 ## Training and evaluation data
+- Training corpus = 80%
+- Evaluation corpus = 20%
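The card gives only the 80/20 proportions, not how the split was produced. As an illustration only, the sketch below assumes a shuffled random split with a fixed seed; `split_80_20`, the seed value, and the toy `rows` list are all hypothetical and not taken from the commit.

```python
import random


def split_80_20(rows, seed=42):
    """Shuffle a copy of `rows` and split it 80% / 20% (assumed procedure)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = (len(shuffled) * 8) // 10        # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


# Toy stand-in for the dataset rows.
rows = [f"message {i}" for i in range(10)]
train, evaluation = split_80_20(rows)
print(len(train), len(evaluation))
```

Applied to the roughly 5900 rows implied by the figures above, the same proportions give about 4700 training rows and 1200 evaluation rows.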
 ### Training hyperparameters