tinybert_base_train_book_ent_15p_s_init_kd_complete_stsb
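This model is a fine-tuned version of gokulsrinivasagan/tinybert_base_train_book_ent_15p_s_init_kd_complete on the GLUE STSB dataset. On the evaluation set it achieves the following results, which match the epoch 7 checkpoint (step 161), the row with the lowest validation loss in the training results table: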
- Loss: 0.7455
- Pearson: 0.8187
- Spearmanr: 0.8154
- Combined Score: 0.8170
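The metrics above can be reproduced from raw predictions with a few lines of plain Python. The following is a minimal sketch, not the evaluation code used for this model: the helper names (`pearson`, `average_ranks`, `spearman`, `combined_score`) are hypothetical, and treating Combined Score as the arithmetic mean of Pearson and Spearmanr is an assumption, although it is consistent with the reported numbers.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def combined_score(pearson_value, spearman_value):
    """Assumed definition: arithmetic mean of the two correlations."""
    return (pearson_value + spearman_value) / 2


# The reported Pearson and Spearmanr average to about 0.817.
print(round(combined_score(0.8187, 0.8154), 3))
```

Both correlations are computed between the model's predicted similarity scores and the gold STSB scores; Spearmanr only looks at rank order, so it is insensitive to any monotonic rescaling of the predictions.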
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 256
- eval_batch_size: 256
- seed: 10
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 50 (the training results table logs only the first 12 epochs)
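To make the `linear` scheduler setting concrete, here is a minimal sketch of a linear learning-rate decay in plain Python. It is an illustration under stated assumptions, not the training code: the function name `linear_lr` is hypothetical, warmup is assumed to be zero because no warmup setting is listed, and `total_steps=1150` assumes 23 optimizer steps per epoch (as logged in the training results) over the full 50 configured epochs.

```python
def linear_lr(step, base_lr=5e-05, total_steps=1150, warmup_steps=0):
    """Learning rate at a given optimizer step under linear warmup and decay.

    Assumptions: warmup_steps=0 (no warmup is listed in the hyperparameters)
    and total_steps = 23 steps/epoch * 50 epochs.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, at the last logged step, and at the end.
for step in (0, 276, 1150):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate at the last logged step (276) would still be about 76% of its initial value, since the schedule is laid out over all 50 configured epochs.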
Training results
| Training Loss | Epoch | Step | Validation Loss | Pearson | Spearmanr | Combined Score |
|---|---|---|---|---|---|---|
| 2.6257 | 1.0 | 23 | 2.5060 | 0.1353 | 0.1309 | 0.1331 |
| 1.8313 | 2.0 | 46 | 1.6575 | 0.6385 | 0.6265 | 0.6325 |
| 1.1806 | 3.0 | 69 | 1.3400 | 0.7092 | 0.7236 | 0.7164 |
| 0.8641 | 4.0 | 92 | 1.0966 | 0.7756 | 0.7786 | 0.7771 |
| 0.7115 | 5.0 | 115 | 0.8398 | 0.7931 | 0.7898 | 0.7914 |
| 0.6248 | 6.0 | 138 | 0.7820 | 0.8130 | 0.8090 | 0.8110 |
| 0.5846 | 7.0 | 161 | 0.7455 | 0.8187 | 0.8154 | 0.8170 |
| 0.4653 | 8.0 | 184 | 0.8070 | 0.8201 | 0.8177 | 0.8189 |
| 0.4188 | 9.0 | 207 | 0.7894 | 0.8156 | 0.8131 | 0.8143 |
| 0.3692 | 10.0 | 230 | 0.8148 | 0.8154 | 0.8138 | 0.8146 |
| 0.3428 | 11.0 | 253 | 1.1896 | 0.8115 | 0.8174 | 0.8145 |
| 0.3529 | 12.0 | 276 | 0.9953 | 0.8173 | 0.8180 | 0.8176 |
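The table can be scanned programmatically to see which epoch would be picked under different selection criteria. The sketch below copies the epoch, step, validation loss, and combined score columns from the table; the variable names are hypothetical, and the card itself does not state which criterion was used to choose the released checkpoint.

```python
# (epoch, step, validation_loss, combined_score), copied from the table above.
rows = [
    (1, 23, 2.5060, 0.1331),
    (2, 46, 1.6575, 0.6325),
    (3, 69, 1.3400, 0.7164),
    (4, 92, 1.0966, 0.7771),
    (5, 115, 0.8398, 0.7914),
    (6, 138, 0.7820, 0.8110),
    (7, 161, 0.7455, 0.8170),
    (8, 184, 0.8070, 0.8189),
    (9, 207, 0.7894, 0.8143),
    (10, 230, 0.8148, 0.8146),
    (11, 253, 1.1896, 0.8145),
    (12, 276, 0.9953, 0.8176),
]

# Lowest validation loss versus highest combined score.
best_by_loss = min(rows, key=lambda row: row[2])
best_by_score = max(rows, key=lambda row: row[3])

print("lowest validation loss:", best_by_loss)
print("highest combined score:", best_by_score)
```

The two criteria disagree by one epoch: validation loss is lowest at epoch 7, while the combined score peaks at epoch 8. The headline results of this card match the epoch 7 row.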
Framework versions
- Transformers 4.51.2
- Pytorch 2.6.0+cu126
- Datasets 3.5.0
- Tokenizers 0.21.1
Model tree for gokulsrinivasagan/tinybert_base_train_book_ent_15p_s_init_kd_complete_stsb
- Base model: google/bert_uncased_L-4_H-512_A-8
- Training dataset: GLUE STSB
Evaluation results
- Spearmanr on GLUE STSB (self-reported): 0.815