bert-amazon-reviews_teacher

This model is a fine-tuned version of google-bert/bert-base-uncased on the Amazon Reviews dataset. 50,000 examples were used for fine-tuning, with 10,000 for validation and 10,000 for testing. It achieves the following results on the evaluation set:

  • Loss: 0.2498
  • Accuracy: 0.899
  • AUC: 0.964

Model description

The model is based on BERT (Bidirectional Encoder Representations from Transformers), originally introduced by Google AI. It is the uncased variant, meaning all text is lowercased before tokenization and the vocabulary does not differentiate between uppercase and lowercase letters. The base model was pre-trained on BookCorpus (800M words) and English Wikipedia (2,500M words) using a masked language modeling (MLM) objective and next sentence prediction (NSP). This fine-tuned version adapts the general-purpose BERT representations to sentiment classification.
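Because the model is uncased, differently cased inputs collapse to the same tokens. A minimal sketch of that normalization step (the function name is illustrative, not part of the tokenizer API):

```python
# BERT-uncased lowercases text before wordpiece tokenization, so "Great"
# and "great" map to the same vocabulary entry.
def normalize_uncased(text: str) -> str:
    return text.lower()

# Both variants produce identical input to the tokenizer:
print(normalize_uncased("Great Product!") == normalize_uncased("great product!"))
```

In practice this is handled automatically by the tokenizer when `do_lower_case` is enabled, which is the default for the uncased checkpoints.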

Training and evaluation data

The model was fine-tuned on the Amazon Reviews dataset for sentiment classification.

Source: link to dataset

Task: binary sentiment classification

Languages: English

Labels: [1, 2], where 1 is negative and 2 is positive

Size: 516.93 MB

Training set: 3600000 samples

Test set: 400000 samples
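The fine-tuning split described above (50k train / 10k validation / 10k test) is far smaller than the full 3.6M-example training set, so the data must be subsampled, and the dataset's 1/2 labels are typically remapped to 0/1 for a binary classifier head. A hedged sketch of both steps (the exact sampling procedure used for this card is not documented; this is one plausible approach with the card's seed of 42):

```python
import random

def remap_label(label: int) -> int:
    # Dataset convention: 1 = negative, 2 = positive.
    # Classifier convention: 0 = negative, 1 = positive.
    return label - 1

random.seed(42)
# Draw train + validation indices from the full 3.6M-example training set.
indices = random.sample(range(3_600_000), 50_000 + 10_000)
train_idx, val_idx = indices[:50_000], indices[50_000:]
```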

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 10
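Collected as a plain dict, the hyperparameters above also give a quick sanity check: 50,000 training examples at batch size 8 is 6,250 steps per epoch, which matches the step column in the results table below. (The key names follow the `transformers` `TrainingArguments` convention; this is a summary, not the exact training script.)

```python
hparams = {
    "learning_rate": 2e-4,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
}

# 50k examples / batch size 8 = 6,250 optimizer steps per epoch.
steps_per_epoch = 50_000 // hparams["per_device_train_batch_size"]
```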

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy | AUC   |
|---------------|-------|-------|-----------------|----------|-------|
| 0.3359        | 1.0   | 6250  | 0.2891          | 0.882    | 0.957 |
| 0.3157        | 2.0   | 12500 | 0.2799          | 0.888    | 0.958 |
| 0.3071        | 3.0   | 18750 | 0.2742          | 0.891    | 0.960 |
| 0.2997        | 4.0   | 25000 | 0.2733          | 0.891    | 0.960 |
| 0.2961        | 5.0   | 31250 | 0.2634          | 0.896    | 0.961 |
| 0.2934        | 6.0   | 37500 | 0.2712          | 0.892    | 0.962 |
| 0.2890        | 7.0   | 43750 | 0.2644          | 0.897    | 0.963 |
| 0.2840        | 8.0   | 50000 | 0.2494          | 0.900    | 0.964 |
| 0.2825        | 9.0   | 56250 | 0.2516          | 0.897    | 0.964 |
| 0.2796        | 10.0  | 62500 | 0.2498          | 0.899    | 0.964 |
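The accuracy and AUC columns are standard binary-classification metrics; in practice they are computed with a library such as scikit-learn, but a dependency-free sketch shows what each number means (toy values below are illustrative, not from this model):

```python
def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true label.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def auc(y_true, scores):
    # ROC AUC as a rank statistic: the probability that a randomly chosen
    # positive example gets a higher score than a randomly chosen negative
    # one (ties count as half a win).
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: two negatives, two positives.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # -> 0.75
```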

Framework versions

  • Transformers 4.55.0
  • Pytorch 2.7.1+cu126
  • Datasets 4.0.0
  • Tokenizers 0.21.4