# BoolQ_Llama-3.2-1B-9pmfiw8i
This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on the BoolQ dataset. It achieves the following results on the evaluation set (a short note on how these metrics relate to each other follows the list):
- Loss: 0.4059
- Model Preparation Time: 0.006 s
- Mdl: 1914.7894
- Accumulated Loss: 1327.2309
- Correct Preds: 2774.0
- Total Preds: 3270.0
- Accuracy: 0.8483
- Correct Gen Preds: 2621.0
- Gen Accuracy: 0.8015
- Correct Gen Preds 9642: 1695.0
- Correct Preds 9642: 1778.0
- Total Labels 9642: 2026.0
- Accuracy 9642: 0.8776
- Gen Accuracy 9642: 0.8366
- Correct Gen Preds 2822: 916.0
- Correct Preds 2822: 996.0
- Total Labels 2822: 1231.0
- Accuracy 2822: 0.8091
- Gen Accuracy 2822: 0.7441
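The metrics suffixed with 9642 and 2822 break the totals down by answer class; the suffixes are most likely the token IDs of the two answer labels, though the card does not document this. The summary metrics are internally consistent: accuracy is correct predictions over total predictions, the accumulated loss is the mean per-example loss summed over the evaluation set, and the MDL value is that sum converted from nats to bits. A minimal sketch verifying these relationships (variable names are illustrative, not taken from the training script):

```python
import math

# Headline evaluation metrics copied from the card.
loss = 0.4059                 # mean per-example cross-entropy, in nats
accumulated_loss = 1327.2309  # summed cross-entropy over the eval set, in nats
mdl = 1914.7894               # minimum description length, in bits
correct_preds, total_preds = 2774, 3270

# Accuracy = correct predictions / total predictions.
print(correct_preds / total_preds)     # ~0.8483

# Accumulated loss ~= mean loss * number of evaluation examples.
print(loss * total_preds)              # ~1327.3

# MDL (bits) = accumulated loss (nats) / ln 2.
print(accumulated_loss / math.log(2))  # ~1914.79
```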
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a sketch mapping them to `TrainingArguments` follows the list):
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 120
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 100
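These values follow the naming conventions of the Hugging Face `transformers` Trainer. A minimal sketch of how they might map onto `TrainingArguments` (the actual training script is not part of this card, so treat this reconstruction as an assumption):

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the configuration listed above.
training_args = TrainingArguments(
    output_dir="BoolQ_Llama-3.2-1B-9pmfiw8i",
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=120,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
)
```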
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.006 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.3127 | 1.0 | 295 | 0.4749 | 0.006 | 2240.5980 | 1553.0642 | 2632.0 | 3270.0 | 0.8049 | 2386.0 | 0.7297 | 1628.0 | 1712.0 | 2026.0 | 0.8450 | 0.8036 | 749.0 | 920.0 | 1231.0 | 0.7474 | 0.6084 |
| 0.3406 | 2.0 | 590 | 0.4059 | 0.006 | 1914.7894 | 1327.2309 | 2774.0 | 3270.0 | 0.8483 | 2621.0 | 0.8015 | 1695.0 | 1778.0 | 2026.0 | 0.8776 | 0.8366 | 916.0 | 996.0 | 1231.0 | 0.8091 | 0.7441 |
| 0.0461 | 3.0 | 885 | 0.7790 | 0.006 | 3675.0574 | 2547.3556 | 2700.0 | 3270.0 | 0.8257 | 2591.0 | 0.7924 | 1682.0 | 1709.0 | 2026.0 | 0.8435 | 0.8302 | 901.0 | 991.0 | 1231.0 | 0.8050 | 0.7319 |
| 0.2643 | 4.0 | 1180 | 1.1030 | 0.006 | 5203.4628 | 3606.7656 | 2706.0 | 3270.0 | 0.8275 | 2694.0 | 0.8239 | 1868.0 | 1876.0 | 2026.0 | 0.9260 | 0.9220 | 820.0 | 830.0 | 1231.0 | 0.6742 | 0.6661 |
| 0.0001 | 5.0 | 1475 | 1.0459 | 0.006 | 4934.3817 | 3420.2528 | 2767.0 | 3270.0 | 0.8462 | 2751.0 | 0.8413 | 1807.0 | 1814.0 | 2026.0 | 0.8954 | 0.8919 | 937.0 | 953.0 | 1231.0 | 0.7742 | 0.7612 |
| 0.0003 | 6.0 | 1770 | 1.0837 | 0.006 | 5112.5263 | 3543.7332 | 2769.0 | 3270.0 | 0.8468 | 2767.0 | 0.8462 | 1790.0 | 1798.0 | 2026.0 | 0.8875 | 0.8835 | 969.0 | 971.0 | 1231.0 | 0.7888 | 0.7872 |
| 0.0 | 7.0 | 2065 | 1.3123 | 0.006 | 6190.9951 | 4291.2708 | 2763.0 | 3270.0 | 0.8450 | 2768.0 | 0.8465 | 1799.0 | 1799.0 | 2026.0 | 0.8880 | 0.8880 | 963.0 | 964.0 | 1231.0 | 0.7831 | 0.7823 |
| 0.0 | 8.0 | 2360 | 1.2884 | 0.006 | 6078.2692 | 4213.1351 | 2764.0 | 3270.0 | 0.8453 | 2770.0 | 0.8471 | 1820.0 | 1820.0 | 2026.0 | 0.8983 | 0.8983 | 944.0 | 944.0 | 1231.0 | 0.7669 | 0.7669 |
| 0.6191 | 9.0 | 2655 | 1.3592 | 0.006 | 6411.9574 | 4444.4302 | 2766.0 | 3270.0 | 0.8459 | 2771.0 | 0.8474 | 1821.0 | 1821.0 | 2026.0 | 0.8988 | 0.8988 | 944.0 | 945.0 | 1231.0 | 0.7677 | 0.7669 |
| 0.0 | 10.0 | 2950 | 1.3659 | 0.006 | 6443.7488 | 4466.4663 | 2768.0 | 3270.0 | 0.8465 | 2772.0 | 0.8477 | 1824.0 | 1824.0 | 2026.0 | 0.9003 | 0.9003 | 942.0 | 944.0 | 1231.0 | 0.7669 | 0.7652 |
| 0.0 | 11.0 | 3245 | 1.3656 | 0.006 | 6442.3540 | 4465.4995 | 2763.0 | 3270.0 | 0.8450 | 2768.0 | 0.8465 | 1817.0 | 1817.0 | 2026.0 | 0.8968 | 0.8968 | 945.0 | 946.0 | 1231.0 | 0.7685 | 0.7677 |
| 0.0 | 12.0 | 3540 | 1.3850 | 0.006 | 6533.8980 | 4528.9530 | 2761.0 | 3270.0 | 0.8443 | 2766.0 | 0.8459 | 1810.0 | 1810.0 | 2026.0 | 0.8934 | 0.8934 | 950.0 | 951.0 | 1231.0 | 0.7725 | 0.7717 |
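The headline metrics at the top of this card match the epoch-2 row, which has the lowest validation loss (0.4059); although num_epochs was set to 100, logging stops after epoch 12, so training appears to have been stopped early. A minimal inference sketch for the released checkpoint (the prompt template used during fine-tuning is not documented, so the format below is an assumption):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "donoway/BoolQ_Llama-3.2-1B-9pmfiw8i"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Hypothetical BoolQ-style prompt; the actual training template is unknown.
prompt = (
    "Passage: The Llama models are open-weight language models released by Meta.\n"
    "Question: are llama models released by meta?\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=3)

# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```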
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1