---
license: apache-2.0
datasets:
- nyu-mll/glue
- stanfordnlp/sst2
base_model:
- google-bert/bert-base-uncased
tags:
- sentiment-analysis
- text-classification
- transformers
- pytorch
- bert
- sst2
- glue
pipeline_tag: text-classification
---
# BERT-base-uncased fine-tuned on SST-2 (GLUE)
This repository contains a `bert-base-uncased` model fine-tuned for **binary sentiment classification** on the [GLUE/SST-2](https://huggingface.co/datasets/glue/viewer/sst2) dataset.
## Model summary
- **Task**: sentiment analysis (binary classification)
- **Labels**: negative (`0`), positive (`1`)
- **Base model**: `bert-base-uncased`
- **Library**: Transformers (`Trainer` API)
- **Note**: In the training notebook, the model was fine-tuned on a small subset (640 training / 640 validation examples) for demonstration purposes. For production use, fine-tune on the full dataset and validate thoroughly.
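## How to use

The model can be loaded through the `pipeline` API. A minimal sketch — the repo id passed to `load_classifier` is a placeholder, substitute this repository's actual Hub id:

```python
from transformers import pipeline

# SST-2 label convention used by this model.
ID2LABEL = {0: "negative", 1: "positive"}

def load_classifier(model_id: str):
    """Build a sentiment-analysis pipeline for a Hub repo id (downloads on first use)."""
    return pipeline("text-classification", model=model_id)

# Usage (requires network access to the Hub):
# clf = load_classifier("<this-repo-id>")
# clf("a gorgeous, witty, seductive movie")
```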
## Intended uses
### ✅ Supported
- Quick demos of sentiment classification on English sentences
- Educational examples of fine-tuning with `Trainer`
- Baseline experiments on SST-2-like sentiment data
### ⚠️ Not recommended
- High-stakes or safety-critical decisions (medical, legal, hiring, etc.)
- Domains significantly different from SST-2 (e.g., clinical notes, finance news) without further fine-tuning
- Non-English text (model and data are English-focused)
## Limitations and biases
- **Dataset bias**: SST-2 reflects movie review sentiment distribution and language patterns; performance may degrade on other domains.
- **Small fine-tuning subset**: the model was fine-tuned on only 640 examples, so the reported results are not representative of the full SST-2 benchmark.
- **Short-text behavior**: very short/ambiguous or sarcastic statements can be misclassified.
- **Offensive/toxic content**: the model may output confident predictions on harmful text; it does not provide safety filtering.
## Training data
Fine-tuning used the GLUE benchmark dataset configuration **SST-2** (Stanford Sentiment Treebank v2 as used in GLUE).
- **Dataset**: `glue`, config `sst2`
- **Text field**: `sentence`
- **Label field**: `label` (`0`/`1`)
In the provided Colab:
- `train`: first 640 examples, via `select(range(640))`
- `validation`: first 640 examples, via `select(range(640))`
- `test`: predictions generated on the unlabeled GLUE test split
## Training procedure
### Preprocessing
- Tokenizer: `AutoTokenizer.from_pretrained("bert-base-uncased")`
- Truncation enabled (`truncation=True`)
- Dynamic padding via `DataCollatorWithPadding`
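Put together, the preprocessing step looks roughly like this (the helper name is illustrative; padding is deferred to the collator so each batch is padded only to its own longest sequence):

```python
from transformers import AutoTokenizer, DataCollatorWithPadding

def build_preprocessing(model_name: str = "bert-base-uncased"):
    """Return a tokenize function for dataset.map() and a dynamic-padding collator."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)  # downloads on first use

    def tokenize(batch):
        # Truncate to the model's max length; no padding here.
        return tokenizer(batch["sentence"], truncation=True)

    return tokenize, DataCollatorWithPadding(tokenizer=tokenizer)

# tokenize_fn, collator = build_preprocessing()
# tokenized = dataset.map(tokenize_fn, batched=True)
```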
### Hyperparameters (from Colab)
- `epochs`: 3
- `learning_rate`: 2e-5
- `batch_size`: 16 (per device)
- `weight_decay`: 0.01
- `evaluation`: each epoch
- `checkpointing`: each epoch
- `best model selection`: accuracy on validation
- `logging`: disabled (`report_to="none"`)
## Results (validation)
- **Accuracy**: 0.8625
- **Loss**: 0.3392
> *(Optional: add confusion matrix, F1, etc. if available)*
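Validation accuracy of this kind is typically computed from the `Trainer`'s logits with a small `compute_metrics` callback; a sketch using plain NumPy:

```python
import numpy as np

def compute_metrics(eval_pred):
    """Accuracy from the (logits, labels) pair the Trainer passes at eval time."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)  # predicted class per example
    return {"accuracy": float((preds == labels).mean())}

# Example: two correct predictions out of three -> accuracy 2/3.
logits = np.array([[2.0, -1.0], [0.5, 1.5], [1.0, 0.0]])
labels = np.array([0, 1, 1])
metrics = compute_metrics((logits, labels))
```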