ByteMeHarder-404/bert-imdb-ensemble

ByteMeHarder-404 committed on Sep 14, 2025

Commit

70879f7

·

verified ·

1 Parent(s): 9c4e6c0

Create README.md

Files changed (1)

README.md +95 -0

README.md ADDED

	@@ -0,0 +1,95 @@

+---
+language:
+- en
+library_name: transformers
+tags:
+- ensemble
+- text-classification
+- sentiment-analysis
+- imdb
+license: apache-2.0
+datasets:
+- imdb
+metrics:
+- accuracy
+- f1
+pipeline_tag: text-classification
+base_model:
+- bert-base-uncased
+model-index:
+- name: BERT IMDb Ensemble for Sentiment Analysis
+  results:
+  - task:
+      type: text-classification
+      name: Sentiment Classification
+    dataset:
+      name: IMDb
+      type: imdb
+      split: test
+    metrics:
+    - name: Accuracy
+      type: accuracy
+      value: 0.939
+    - name: F1
+      type: f1
+      value: 0.939
+---
+# BERT IMDb Ensemble for Sentiment Analysis 🎬🎭
+## Model description
+This is an **ensemble of 3 BERT-base-uncased models** fine-tuned on the IMDb dataset for **binary sentiment classification** (positive vs. negative reviews).
+Each model was trained with a different random seed, and the models' logits are combined by weighted or unweighted (equal-weight) averaging, which makes predictions less sensitive to any single seed.
+- **Base model:** `bert-base-uncased`
+- **Task:** Sentiment classification (binary: 0 = negative, 1 = positive)
+- **Ensembling strategy:** Weighted averaging of logits
+---
+## Training procedure
+- **Dataset:** IMDb (train/test split from Hugging Face `datasets`)
+- **Preprocessing:**
+  - Tokenization with `bert-base-uncased`
+  - Truncation at 512 tokens
+- **Hyperparameters:**
+  - Epochs: 2
+  - Batch size: 8
+  - Optimizer: AdamW (default in `Trainer`)
+  - FP16: Enabled
+  - Seeds: `[42, 123, 999]`
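The settings listed above could be wired into `Trainer` roughly as follows. This is a minimal sketch, not the author's training script: the function names `build_training_kwargs` and `train_one_seed` and the output directory are hypothetical, the batch size is assumed to be per device, and the IMDb test split is assumed to be the evaluation set. The heavy imports sit inside the training function so the configuration helper can be inspected without them.

```python
SEEDS = [42, 123, 999]  # seeds listed in the model card


def build_training_kwargs(seed):
    """Return the card's hyperparameters as keyword arguments for TrainingArguments."""
    return {
        "output_dir": f"bert-imdb-seed-{seed}",  # hypothetical output path
        "num_train_epochs": 2,
        "per_device_train_batch_size": 8,  # assumption: the card's batch size is per device
        "fp16": True,  # mixed precision; requires a CUDA GPU
        "seed": seed,
    }


def train_one_seed(seed):
    """Fine-tune one bert-base-uncased classifier on IMDb with the given seed."""
    # Imported here so the helper above works without these packages installed.
    from datasets import load_dataset
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        DataCollatorWithPadding,
        Trainer,
        TrainingArguments,
    )

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    dataset = load_dataset("imdb")

    # Tokenize and truncate at 512 tokens, as described in the card.
    tokenized = dataset.map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
        batched=True,
    )

    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2
    )
    trainer = Trainer(
        model=model,
        args=TrainingArguments(**build_training_kwargs(seed)),
        train_dataset=tokenized["train"],
        eval_dataset=tokenized["test"],  # assumption: evaluation uses the test split
        data_collator=DataCollatorWithPadding(tokenizer),  # pads each batch dynamically
    )
    trainer.train()
    return trainer
```

Calling `train_one_seed(seed)` once per entry in `SEEDS` would yield three checkpoints under these assumptions. Each call downloads the dataset and base model and needs a GPU for FP16.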
+---
+## Evaluation results
+Across the three models, results are very consistent:
+| Model (Seed) | Epochs | Val. Accuracy | Val. Macro F1 |
+|--------------|--------|---------------|---------------|
+| 42           | 2      | 93.74%        | 0.9374        |
+| 123          | 2      | 93.84%        | 0.9383        |
+| 999          | 2      | 93.98%        | 0.9398        |
+**Ensemble performance:** combining the three models by weighted averaging (example weights `[0.2, 0.2, 0.6]`) improves stability by reducing the seed-to-seed variance visible in the table above.
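To make the weighted averaging concrete, here is a small sketch of how three models' logits for one review might be combined. It is written in plain Python so it runs without downloading any model. The logit values are invented for illustration, and the function names are hypothetical rather than part of this repository. Only the weights `[0.2, 0.2, 0.6]` come from the card.

```python
import math


def weighted_logit_average(per_model_logits, weights):
    """Combine one example's logits from several models into a single logit vector."""
    if len(per_model_logits) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)  # normalize so the weights need not sum to 1
    num_classes = len(per_model_logits[0])
    return [
        sum(w * logits[c] for w, logits in zip(weights, per_model_logits)) / total
        for c in range(num_classes)
    ]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    denom = sum(exps)
    return [e / denom for e in exps]


# Hypothetical [negative, positive] logits from the three seed models.
example_logits = [[-1.0, 2.0], [-0.5, 1.5], [-2.0, 3.0]]
weights = [0.2, 0.2, 0.6]  # example weights from the card

combined = weighted_logit_average(example_logits, weights)
probs = softmax(combined)
predicted_label = max(range(len(probs)), key=probs.__getitem__)  # 0 = negative, 1 = positive
```

Setting all weights equal gives the unweighted variant. Averaging logits before the softmax is one common choice; averaging probabilities instead is an alternative that this sketch does not cover.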
+---
+## How to use
+```python
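+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+import torch
+
+# Load the tokenizer and classifier from the Hub repository
+tokenizer = AutoTokenizer.from_pretrained("ByteMeHarder-404/bert-imdb-ensemble")
+model = AutoModelForSequenceClassification.from_pretrained("ByteMeHarder-404/bert-imdb-ensemble")
+model.eval()  # from_pretrained already returns eval mode; kept explicit for clarity
+
+# Truncate at 512 tokens to match the training preprocessing
+inputs = tokenizer("This movie was an absolute masterpiece!", return_tensors="pt", truncation=True, max_length=512)
+with torch.no_grad():
+    outputs = model(**inputs)
+    probs = torch.nn.functional.softmax(outputs.logits, dim=-1)
+# Columns are [negative, positive]; the values shown are illustrative
+print(probs)  # e.g. tensor([[0.01, 0.99]]) -> positive sentiment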