St3w31/BiLSTMSpamClassifier

@@ -1,10 +1,29 @@
 # BiLSTM Text Classifier
-Simple BiLSTM model PyTorch trained for SPAM detection on SMS Span collection (Almeida, Tiago and Jos Hidalgo. 2011. SMS Spam Collection. UCI Machine Learning Repository. https://doi.org/10.24432/C5CC84.).
 ## Important Notes
-- The model returns the logits as output, so in order to get the probability pass the output to `torch.sigmoid`.
-- The model use `bert-base-uncased` tokenizer
 ## Files
 - `BiLSTMClassifier.safetensors`: trained weights
@@ -29,10 +48,10 @@ state_dict = load_file("BiLSTMClassifier.safetensors")
 model.load_state_dict(state_dict)
 model.eval()
-sample_text = "URGENT HIRING! Earn $500/day working from home. No experience needed. Apply here: www.somenthing.io/hiring"
 tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
 tokens = tokenizer(sample_text, return_tensors="pt")
 logits = model(tokens["input_ids"])
-p = torch.sigmoid(logits)
-```

+---
+license: mit
+library_name: pytorch
+tags:
+  - bilstm
+  - lstm
+  - pytorch
+  - text-classification
+  - spam-detection
+task_categories:
+  - text-classification
+datasets:
+  - ucirvine/sms_spam
+language:
+  - en
+---
 # BiLSTM Text Classifier
+Simple BiLSTM model in PyTorch, trained for spam detection on the SMS Spam Collection
+(Almeida, Tiago and José Hidalgo. 2011. *SMS Spam Collection*.
+UCI Machine Learning Repository. https://doi.org/10.24432/C5CC84).
 ## Important Notes
+- The model returns **logits** as output; to obtain probabilities, apply `torch.sigmoid`.
+- The model uses the `bert-base-uncased` tokenizer **only for tokenization** (the encoder is NOT BERT).
 ## Files
 - `BiLSTMClassifier.safetensors`: trained weights
 model.load_state_dict(state_dict)
 model.eval()
+sample_text = "URGENT HIRING! Earn $500/day working from home. No experience needed."
 tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
 tokens = tokenizer(sample_text, return_tensors="pt")
 logits = model(tokens["input_ids"])
+prob = torch.sigmoid(logits)
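
The note about logits can be illustrated without loading the model. The sketch below is a minimal pure-Python version of what `torch.sigmoid` computes for a single value, followed by a thresholding step. The helper names `sigmoid` and `label_from_logit`, the 0.5 threshold, and the convention that the positive class means spam are all assumptions for illustration; the README does not state them.

```python
import math


def sigmoid(logit: float) -> float:
    """Map a raw logit to a probability in (0, 1).

    Branching on the sign avoids overflow in math.exp for large |logit|.
    """
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


def label_from_logit(logit: float, threshold: float = 0.5) -> str:
    """Turn a logit into a label.

    ASSUMPTION: the positive class is "spam" and 0.5 is a sensible cutoff;
    neither is confirmed by the model card.
    """
    return "spam" if sigmoid(logit) >= threshold else "ham"


if __name__ == "__main__":
    for raw in (-2.0, 0.0, 2.0):
        print(raw, round(sigmoid(raw), 4), label_from_logit(raw))
```

If the model returns a single-element tensor for one input text (also an assumption here), `label_from_logit(logits.item())` would apply the same rule to its output.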