St3w31 committed on
Commit
0292bfe
·
verified ·
1 Parent(s): 95732b3

Upload README.md with huggingface_hub

  1. README.md +25 -6
README.md CHANGED
@@ -1,10 +1,29 @@
+---
+license: mit
+library_name: pytorch
+tags:
+- bilstm
+- lstm
+- pytorch
+- text-classification
+- spam-detection
+task_categories:
+- text-classification
+datasets:
+- ucirvine/sms_spam
+language:
+- en
+---
+
 # BiLSTM Text Classifier
 
-Simple BiLSTM model PyTorch trained for SPAM detection on SMS Span collection (Almeida, Tiago and Jos Hidalgo. 2011. SMS Spam Collection. UCI Machine Learning Repository. https://doi.org/10.24432/C5CC84.).
+Simple BiLSTM model PyTorch trained for SPAM detection on SMS Spam Collection
+(Almeida, Tiago and Jos Hidalgo. 2011. *SMS Spam Collection*.
+UCI Machine Learning Repository. https://doi.org/10.24432/C5CC84).
 
 ## Important Notes
-- The model returns the logits as output, so in order to get the probability pass the output to `torch.sigmoid`.
-- The model use `bert-base-uncased` tokenizer
+- The model returns **logits** as output; to obtain probabilities, apply `torch.sigmoid`.
+- The model uses the `bert-base-uncased` tokenizer **only for tokenization** (the encoder is NOT BERT).
 
 ## Files
 - `BiLSTMClassifier.safetensors`: trained weights
@@ -29,10 +48,10 @@ state_dict = load_file("BiLSTMClassifier.safetensors")
 model.load_state_dict(state_dict)
 model.eval()
 
-sample_text = "URGENT HIRING! Earn $500/day working from home. No experience needed. Apply here: www.somenthing.io/hiring"
+sample_text = "URGENT HIRING! Earn $500/day working from home. No experience needed."
 
 tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
 tokens = tokenizer(sample_text, return_tensors="pt")
+
 logits = model(tokens["input_ids"])
-p = torch.sigmoid(logits)
-```
+prob = torch.sigmoid(logits)
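
The README's usage snippet stops at `prob = torch.sigmoid(logits)`. As a minimal sketch of what that step computes and how a probability might become a label, the plain-Python example below applies the same logistic function to a single logit. It uses only the standard library, so it runs without the model weights. The helper names `sigmoid` and `classify`, the `0.5` threshold, and the assumption that the positive class means spam are illustrative and are not taken from the repository.

```python
import math


def sigmoid(logit: float) -> float:
    """Logistic function, the same mapping torch.sigmoid applies element-wise."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


def classify(logit: float, threshold: float = 0.5) -> tuple[str, float]:
    """Turn one raw logit into a (label, probability) pair.

    Assumption: the positive class is spam and 0.5 is the cut-off;
    neither is stated in the README.
    """
    prob = sigmoid(logit)
    label = "spam" if prob >= threshold else "ham"
    return label, prob


if __name__ == "__main__":
    # Hypothetical logits standing in for the model's output.
    for raw in (-3.0, 0.0, 2.5):
        label, prob = classify(raw)
        print(f"logit={raw:+.1f} -> p(spam)={prob:.3f} -> {label}")
```

With the real model, the equivalent would presumably be to read the scalar out of `prob` (for example with `.item()`) and compare it against a threshold chosen on validation data.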