jysh1023/tiny-bert-sst2-distilled

Task: Text Classification

tiny-bert-sst2-distilled

This model was trained from scratch on the GLUE dataset (the model name points to its SST-2 task). It achieves the following results on the evaluation set:

Loss: 0.6749
Accuracy: 0.8200
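
Accuracy here is the fraction of evaluation examples classified correctly. As a rough sanity check, the sketch below converts the reported value back into a count of correct predictions. It assumes the evaluation set is the GLUE SST-2 validation split of 872 examples; the card only says "glue", so this is inferred from the model name. The helper names are hypothetical.

```python
# Assumption: the evaluation set is the GLUE SST-2 validation split,
# which has 872 examples. The card itself only names "glue".
EVAL_SET_SIZE = 872


def correct_count(reported_accuracy: float, n_examples: int = EVAL_SET_SIZE) -> int:
    """Return the whole number of correct predictions closest to a reported accuracy."""
    return round(reported_accuracy * n_examples)


def accuracy(n_correct: int, n_examples: int = EVAL_SET_SIZE) -> float:
    """Accuracy is the number of correct predictions divided by the total."""
    return n_correct / n_examples


n_correct = correct_count(0.8200)
print(n_correct, f"{accuracy(n_correct):.4f}")
```

Under that assumption, the reported 0.8200 is consistent with 715 of 872 examples being classified correctly.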

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 6e-05
train_batch_size: 128
eval_batch_size: 128
seed: 33
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 7
mixed_precision_training: Native AMP
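
To make the `lr_scheduler_type: linear` entry concrete, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists none) and 21 total steps, taken from the training results table. `linear_lr` is a hypothetical helper written for illustration, not a Transformers function.

```python
BASE_LR = 6e-05    # learning_rate from the card
TOTAL_STEPS = 21   # 7 epochs x 3 steps, per the training results table
WARMUP_STEPS = 0   # assumption: the card does not list any warmup


def linear_lr(
    step: int,
    base_lr: float = BASE_LR,
    total_steps: int = TOTAL_STEPS,
    warmup_steps: int = WARMUP_STEPS,
) -> float:
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the learning rate at the start, after each of a few epochs, and at the end.
for step in (0, 3, 12, 21):
    print(step, f"{linear_lr(step):.2e}")
```

With these assumptions the rate starts at 6e-05 and reaches zero at step 21; a nonzero warmup would change the early steps only.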

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
0.1125	1.0	3	0.6731	0.8177
0.0984	2.0	6	0.6756	0.8188
0.1273	3.0	9	0.6754	0.8177
0.0758	4.0	12	0.6751	0.8188
0.1188	5.0	15	0.6754	0.8188
0.0936	6.0	18	0.6749	0.8200
0.0781	7.0	21	0.6748	0.8200
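
The Step column also hints at the size of the training set. Each epoch takes 3 optimizer steps at a batch size of 128, so the training set would hold between 257 and 384 examples. This assumes a single device, no gradient accumulation, and a final partial batch that is kept, none of which the card confirms. The sketch below derives those bounds and picks out the best rows of the table; the helper name is hypothetical and the rows are copied from the table above.

```python
import math

# Rows copied from the training results table:
# (training loss, epoch, step, validation loss, accuracy)
ROWS = [
    (0.1125, 1.0, 3, 0.6731, 0.8177),
    (0.0984, 2.0, 6, 0.6756, 0.8188),
    (0.1273, 3.0, 9, 0.6754, 0.8177),
    (0.0758, 4.0, 12, 0.6751, 0.8188),
    (0.1188, 5.0, 15, 0.6754, 0.8188),
    (0.0936, 6.0, 18, 0.6749, 0.8200),
    (0.0781, 7.0, 21, 0.6748, 0.8200),
]

TRAIN_BATCH_SIZE = 128  # train_batch_size from the card


def train_size_bounds(steps_per_epoch: int, batch_size: int) -> tuple[int, int]:
    """Smallest and largest n with ceil(n / batch_size) == steps_per_epoch."""
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


steps_per_epoch = ROWS[0][2]  # step count at the end of epoch 1
low, high = train_size_bounds(steps_per_epoch, TRAIN_BATCH_SIZE)

# Both bounds must round up to the same number of batches per epoch.
assert math.ceil(low / TRAIN_BATCH_SIZE) == steps_per_epoch
assert math.ceil(high / TRAIN_BATCH_SIZE) == steps_per_epoch

best_by_loss = min(ROWS, key=lambda row: row[3])  # lowest validation loss
best_by_acc = max(ROWS, key=lambda row: row[4])   # first row with highest accuracy

print(f"training set size between {low} and {high} examples")
print(f"lowest validation loss at epoch {best_by_loss[1]:.0f}")
print(f"highest accuracy first reached at epoch {best_by_acc[1]:.0f}")
```

Validation loss and accuracy move very little across the seven epochs: the lowest validation loss is in the first row, and accuracy varies only between 0.8177 and 0.8200.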

Framework versions

Transformers 4.35.2
Pytorch 2.1.0+cu118
Datasets 2.15.0
Tokenizers 0.15.0

Model size: 4.39M params (Safetensors)
Tensor type: F32
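
The parameter count and tensor type together give a rough size for the stored weights. This is a back-of-the-envelope sketch, assuming every parameter is stored as a 4-byte F32 value and ignoring file metadata. The count is the rounded figure from the card, and `weights_size_mb` is a hypothetical helper.

```python
PARAMS = 4.39e6      # "4.39M params" from the card (rounded)
BYTES_PER_F32 = 4    # a 32-bit float occupies 4 bytes


def weights_size_mb(n_params: float, bytes_per_param: int = BYTES_PER_F32) -> float:
    """Approximate weight storage in megabytes (1 MB = 10**6 bytes)."""
    return n_params * bytes_per_param / 1e6


print(f"{weights_size_mb(PARAMS):.2f} MB")
```

Under these assumptions the weights take roughly 17.56 MB; storing them in a 2-byte format would halve that.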

Evaluation results

Accuracy on glue validation set (self-reported): 0.820