jongador committed · Commit 2a1cd60 · verified · 1 Parent(s): f48f8b1

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +70 -3
README.md CHANGED
@@ -1,3 +1,70 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ language: en
+ library_name: transformers
+ pipeline_tag: text-classification
+ datasets:
+ - imdb
+ metrics:
+ - accuracy
+ tags:
+ - text-classification
+ - sentiment-analysis
+ - bert
+ - imdb
+ - adversarial-nlp
+ - textattack
+ - deprecated
+ ---
+
+ # bert-imdb-256
+
+ > ⚠️ **DEPRECATED — kept for legacy compatibility.**
+ > This model was trained with `max_seq_length=256` on a constrained-VRAM laptop GPU as a preliminary step. For new work, use [`jongador/bert-imdb-512`](https://huggingface.co/jongador/bert-imdb-512), which covers ~95–98% of IMDB reviews (vs. ~85–90%) and achieves higher accuracy (94.14% vs. 92.20%).
+
+ `bert-base-uncased` fine-tuned on the IMDB sentiment classification dataset with `max_seq_length=256`. Trained as a victim model for adversarial NLP research (TextBugger / TextFooler / DeepWordBug-style attacks).
+
+ ## Model Details
+
+ - **Architecture**: `bert-base-uncased` (12 layers, 768 hidden, 12 heads, ~110M parameters)
+ - **Tokenization**: WordPiece (subwords)
+ - **Max sequence length**: 256 tokens
+ - **Task**: Binary sentiment classification (positive / negative)
+
+ ## Training
+
+ Trained from `bert-base-uncased` on the IMDB train split (25,000 examples) using [TextAttack](https://github.com/QData/TextAttack) 0.3.x.
+
+ | Hyperparameter | Value |
+ | --- | --- |
+ | Epochs | 5 |
+ | Per-device batch size | 2 |
+ | Gradient accumulation | 8 (effective batch 16) |
+ | Learning rate | 2e-5 |
+ | Weight decay | 0.01 |
+ | Warmup steps | 500 |
+ | Random seed | 786 |
+ | Hardware | NVIDIA RTX 3050 Laptop (4 GB VRAM) |
+
+ The aggressive gradient accumulation (batch 2 × accum 8) was a workaround for the 4 GB VRAM ceiling on the laptop GPU used for this preliminary run. The newer [`jongador/bert-imdb-512`](https://huggingface.co/jongador/bert-imdb-512) variant uses batch 8 × accum 2 on a cluster RTX 3090 (24 GB).
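
The batch arithmetic in the paragraph above can be checked with a short sketch. It assumes a single GPU and that one optimizer step is taken per accumulation window, with a trailing partial window rounded up. The README does not say how TextAttack counts steps, so the per-epoch step count is illustrative only. The helper names are hypothetical.

```python
import math

TRAIN_EXAMPLES = 25_000  # IMDB train split size, as stated in the README


def effective_batch(per_device_batch: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of examples that contribute to each optimizer step."""
    return per_device_batch * accumulation_steps * num_devices


def optimizer_steps_per_epoch(num_examples: int, eff_batch: int) -> int:
    """Assumption: a trailing partial window still triggers one optimizer step."""
    return math.ceil(num_examples / eff_batch)


laptop = effective_batch(2, 8)    # bert-imdb-256: batch 2 x accum 8
cluster = effective_batch(8, 2)   # bert-imdb-512: batch 8 x accum 2
steps = optimizer_steps_per_epoch(TRAIN_EXAMPLES, laptop)

print(laptop, cluster, steps)  # 16 16 1563
```

Both configurations reach the same effective batch of 16. The laptop run gets there with more forward/backward passes per update, which keeps peak memory lower.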
+
+ ## Evaluation
+
+ Best epoch checkpoint on the IMDB test split (25,000 examples):
+
+ | Metric | Value |
+ | --- | --- |
+ | Accuracy | 92.20% |
+
+ ## How to Use
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+
+ tokenizer = AutoTokenizer.from_pretrained("jongador/bert-imdb-256")
+ model = AutoModelForSequenceClassification.from_pretrained("jongador/bert-imdb-256")
+ ```
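
The snippet above stops after loading the tokenizer and model. A minimal inference sketch follows, assuming PyTorch is installed and that class index 0 means negative and 1 means positive. That mapping is the usual IMDB convention but is not confirmed by this README, so check `model.config.id2label` before relying on it. `LABELS`, `label_from_logits`, and `classify` are hypothetical helper names.

```python
import math

MODEL_ID = "jongador/bert-imdb-256"
MAX_LENGTH = 256  # matches the max_seq_length stated in the README
LABELS = ("negative", "positive")  # assumption: index 0 = negative, 1 = positive


def label_from_logits(logits):
    """Apply a softmax to raw logits and return (label, probability) for the top class."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


def classify(text):
    """Classify one review. Reloads the model on every call, for brevity only."""
    # Imported lazily so the pure-Python helper above works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()
    # Truncate to the sequence length the model was fine-tuned with.
    inputs = tokenizer(text, truncation=True, max_length=MAX_LENGTH, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return label_from_logits(logits)
```

Calling `classify("A surprisingly moving film.")` returns a `(label, probability)` tuple. Reviews longer than 256 tokens are truncated, so sentiment expressed only near the end of a long review may be missed.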
+
+ ## License
+
+ MIT