# 687e517828ebafbeab0f56adcb2f91d8
This model is a fine-tuned version of meta-llama/Llama-3.1-8B on the dair-ai/emotion [split] dataset. It achieves the following results on the evaluation set:
- Loss: 2.4531
- Data Size: 1.0
- Epoch Runtime: 562.5043
- Accuracy: 0.9158
- F1 Macro: 0.8640
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
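The per-device and effective batch sizes above are related by a simple product over the data-parallel devices. A minimal sketch of that arithmetic (pure Python; the variable names are illustrative, not taken from the training script):

```python
# Hypothetical recomputation of the effective batch sizes from the
# per-device settings listed above (names are illustrative).
train_batch_size = 8   # per-device train batch size
eval_batch_size = 8    # per-device eval batch size
num_devices = 4        # multi-GPU data parallelism

total_train_batch_size = train_batch_size * num_devices  # 8 * 4 = 32
total_eval_batch_size = eval_batch_size * num_devices    # 8 * 4 = 32

print(total_train_batch_size, total_eval_batch_size)  # → 32 32
```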
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Accuracy | F1 Macro |
|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 17.5178 | 0 | 7.0141 | 0.0811 | 0.0791 |
| No log | 1 | 500 | 19.0941 | 0.0078 | 11.0949 | 0.2087 | 0.0995 |
| No log | 2 | 1000 | 11.1537 | 0.0156 | 22.8170 | 0.2908 | 0.0751 |
| No log | 3 | 1500 | 7.6085 | 0.0312 | 45.1487 | 0.3488 | 0.0862 |
| No log | 4 | 2000 | 6.3485 | 0.0625 | 79.0063 | 0.3488 | 0.0862 |
| 0.5337 | 5 | 2500 | 6.9886 | 0.125 | 115.9590 | 0.2908 | 0.0751 |
| 6.1782 | 6 | 3000 | 5.0944 | 0.25 | 211.3386 | 0.4748 | 0.2820 |
| 0.4917 | 7 | 3500 | 2.4945 | 0.5 | 317.8534 | 0.8095 | 0.6641 |
| 1.3205 | 8 | 4000 | 1.3265 | 1.0 | 590.8614 | 0.8851 | 0.8442 |
| 0.7872 | 9 | 4500 | 1.2189 | 1.0 | 562.3368 | 0.9052 | 0.8589 |
| 0.7396 | 10 | 5000 | 1.1344 | 1.0 | 552.5370 | 0.9088 | 0.8625 |
| 0.6374 | 11 | 5500 | 2.1038 | 1.0 | 555.9663 | 0.8952 | 0.8525 |
| 0.5037 | 12 | 6000 | 2.2902 | 1.0 | 553.9349 | 0.8997 | 0.8348 |
| 0.4258 | 13 | 6500 | 1.9858 | 1.0 | 559.1593 | 0.9128 | 0.8663 |
| 0.2505 | 14 | 7000 | 2.4531 | 1.0 | 562.5043 | 0.9158 | 0.8640 |
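The F1 Macro column above is the unweighted mean of per-class F1 scores, so minority emotion classes count as much as majority ones (which is why it trails Accuracy). A minimal sketch of the metric in pure Python (the toy labels are made up for illustration, not taken from the evaluation set):

```python
from collections import defaultdict

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in the labels."""
    tp = defaultdict(int)  # true positives per class
    fp = defaultdict(int)  # false positives per class
    fn = defaultdict(int)  # false negatives per class
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    f1s = []
    for c in set(y_true) | set(y_pred):
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Toy example with 3 classes (illustrative labels only)
print(round(macro_f1([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0]), 4))  # → 0.6556
```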
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.3.0
- Tokenizers 0.22.1