# bert-amazon-reviews_teacher
This model is a fine-tuned version of google-bert/bert-base-uncased on the Amazon Reviews dataset. Fine-tuning used 50,000 examples, with 10,000 held out for validation and 10,000 for testing. It achieves the following results on the evaluation set:
- Loss: 0.2498
- Accuracy: 0.899
- AUC: 0.964
## Model description
The model is based on BERT (Bidirectional Encoder Representations from Transformers), originally introduced by Google AI. It is the uncased variant, meaning all text is lowercased before tokenization and the vocabulary does not differentiate between uppercase and lowercase letters. The base model was pre-trained on BookCorpus (800M words) and English Wikipedia (2,500M words) using a masked language modeling (MLM) objective and next sentence prediction (NSP). This fine-tuned version adapts the general-purpose BERT representations to binary sentiment classification of Amazon product reviews.
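A minimal inference sketch using the `transformers` pipeline API. The model id is taken from this card; the task string and the exact label names the classifier returns are assumptions, not confirmed above:

```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hub (model id from this card).
clf = pipeline("text-classification", model="peeyush01/bert-amazon-reviews_teacher")

# The returned label names depend on the model's id2label config (assumption).
result = clf("This product exceeded my expectations!")[0]
print(result["label"], round(result["score"], 3))
```

The uncased tokenizer lowercases the input automatically, so no manual preprocessing is needed.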
## Training and evaluation data
The model was fine-tuned on the Amazon Reviews sentiment classification dataset.
- Source: link to dataset
- Task: binary sentiment classification
- Languages: English
- Labels: [1, 2] (1 = negative, 2 = positive)
- Size: 516.93 MB
- Training set: 3,600,000 samples
- Test set: 400,000 samples
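Because the raw labels are 1/2 rather than the zero-indexed 0/1 most classification heads expect, a remapping step is typically needed before training. A minimal sketch (the helper names are hypothetical; the label direction follows the card):

```python
# Map raw Amazon Reviews labels (1 = negative, 2 = positive)
# to the zero-indexed ids expected by most classification heads.
LABEL_MAP = {1: 0, 2: 1}
ID2LABEL = {0: "negative", 1: "positive"}

def remap(example):
    """Rewrite the 'label' field of a dataset example in place."""
    example["label"] = LABEL_MAP[example["label"]]
    return example

sample = {"text": "Arrived broken.", "label": 1}
print(remap(sample))  # {'text': 'Arrived broken.', 'label': 0}
```

With the `datasets` library, the same function can be applied lazily via `dataset.map(remap)`.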
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 10
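These hyperparameters can be cross-checked against the Step column in the results table below: 50,000 training examples at batch size 8 give 6,250 optimizer steps per epoch. A quick arithmetic check:

```python
import math

train_examples = 50_000  # fine-tuning subset reported above
batch_size = 8
num_epochs = 10

steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * num_epochs
print(steps_per_epoch, total_steps)  # 6250 62500
```

This matches the table: 6,250 steps after epoch 1 and 62,500 after epoch 10.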
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | AUC |
|---|---|---|---|---|---|
| 0.3359 | 1.0 | 6250 | 0.2891 | 0.882 | 0.957 |
| 0.3157 | 2.0 | 12500 | 0.2799 | 0.888 | 0.958 |
| 0.3071 | 3.0 | 18750 | 0.2742 | 0.891 | 0.96 |
| 0.2997 | 4.0 | 25000 | 0.2733 | 0.891 | 0.96 |
| 0.2961 | 5.0 | 31250 | 0.2634 | 0.896 | 0.961 |
| 0.2934 | 6.0 | 37500 | 0.2712 | 0.892 | 0.962 |
| 0.289 | 7.0 | 43750 | 0.2644 | 0.897 | 0.963 |
| 0.284 | 8.0 | 50000 | 0.2494 | 0.9 | 0.964 |
| 0.2825 | 9.0 | 56250 | 0.2516 | 0.897 | 0.964 |
| 0.2796 | 10.0 | 62500 | 0.2498 | 0.899 | 0.964 |
## Framework versions
- Transformers 4.55.0
- PyTorch 2.7.1+cu126
- Datasets 4.0.0
- Tokenizers 0.21.4
## Model tree for peeyush01/bert-amazon-reviews_teacher
Base model: google-bert/bert-base-uncased