qwen25-0.5b-prefix-tuning
This model is a fine-tuned version of Qwen/Qwen2.5-0.5B; the training dataset is not documented. At the final evaluation (step 1500, epoch 12.0), it achieves the following results on the evaluation set:
- Loss: 2.0820
- Bleu: 0.0062
- Rougel: 0.0738
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 2
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 12
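As a rough sketch, the values listed above could be expressed as keyword arguments for `transformers.TrainingArguments`. The original training script is not part of this card, so several things here are assumptions: that the reported `train_batch_size` and `eval_batch_size` are per-device sizes, the names `TRAINING_KWARGS` and `make_training_arguments`, and the `output_dir` default.

```python
# Hyperparameters from the list above, keyed by the corresponding
# transformers.TrainingArguments parameter names.
# Assumption: the reported batch sizes are per-device batch sizes.
TRAINING_KWARGS = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 2,
    "seed": 42,
    "optim": "adamw_torch",        # OptimizerNames.ADAMW_TORCH
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 12,
}


def make_training_arguments(output_dir="outputs"):
    """Build a TrainingArguments object from the values above.

    The import is deferred so the dictionary can be inspected without
    transformers installed. `output_dir` is a hypothetical path.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

Evaluation cadence, logging, and any PEFT configuration are not listed among the hyperparameters, so they are left out of the sketch rather than guessed.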
Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu | Rougel |
|---|---|---|---|---|---|
| No log | 0 | 0 | 3.1247 | 0.0014 | 0.0523 |
| 1.8148 | 0.16 | 20 | 3.0069 | 0.0019 | 0.0538 |
| 1.6696 | 0.32 | 40 | 2.9177 | 0.0021 | 0.0594 |
| 1.7065 | 0.48 | 60 | 2.8417 | 0.0021 | 0.0595 |
| 1.6933 | 0.64 | 80 | 2.7770 | 0.0021 | 0.0608 |
| 1.7361 | 0.8 | 100 | 2.7165 | 0.0021 | 0.0704 |
| 1.6731 | 0.96 | 120 | 2.6617 | 0.0013 | 0.0708 |
| 1.7361 | 1.12 | 140 | 2.6140 | 0.0013 | 0.0713 |
| 1.684 | 1.28 | 160 | 2.5721 | 0.0019 | 0.0717 |
| 1.5104 | 1.44 | 180 | 2.5317 | 0.0017 | 0.0682 |
| 1.674 | 1.6 | 200 | 2.4960 | 0.0017 | 0.0696 |
| 1.5632 | 1.76 | 220 | 2.4623 | 0.0027 | 0.0699 |
| 1.5437 | 1.92 | 240 | 2.4329 | 0.0028 | 0.0754 |
| 1.4255 | 2.08 | 260 | 2.4085 | 0.0028 | 0.0748 |
| 1.6271 | 2.24 | 280 | 2.3852 | 0.0041 | 0.0752 |
| 1.4799 | 2.4 | 300 | 2.3632 | 0.0041 | 0.0744 |
| 1.5331 | 2.56 | 320 | 2.3437 | 0.0040 | 0.0735 |
| 1.514 | 2.72 | 340 | 2.3266 | 0.0041 | 0.0740 |
| 1.4895 | 2.88 | 360 | 2.3120 | 0.0046 | 0.0754 |
| 1.4881 | 3.04 | 380 | 2.2989 | 0.0051 | 0.0753 |
| 1.4331 | 3.2 | 400 | 2.2879 | 0.0051 | 0.0753 |
| 1.4564 | 3.36 | 420 | 2.2765 | 0.0051 | 0.0753 |
| 1.3368 | 3.52 | 440 | 2.2671 | 0.0056 | 0.0756 |
| 1.4613 | 3.68 | 460 | 2.2583 | 0.0060 | 0.0759 |
| 1.4115 | 3.84 | 480 | 2.2504 | 0.0065 | 0.0762 |
| 1.505 | 4.0 | 500 | 2.2427 | 0.0073 | 0.0769 |
| 1.4828 | 4.16 | 520 | 2.2347 | 0.0073 | 0.0765 |
| 1.4893 | 4.32 | 540 | 2.2285 | 0.0072 | 0.0761 |
| 1.5359 | 4.48 | 560 | 2.2214 | 0.0072 | 0.0759 |
| 1.4248 | 4.64 | 580 | 2.2151 | 0.0072 | 0.0759 |
| 1.5041 | 4.8 | 600 | 2.2087 | 0.0072 | 0.0756 |
| 1.5033 | 4.96 | 620 | 2.2031 | 0.0072 | 0.0754 |
| 1.4361 | 5.12 | 640 | 2.1981 | 0.0071 | 0.0747 |
| 1.4457 | 5.28 | 660 | 2.1928 | 0.0072 | 0.0747 |
| 1.4021 | 5.44 | 680 | 2.1872 | 0.0071 | 0.0751 |
| 1.3328 | 5.6 | 700 | 2.1821 | 0.0071 | 0.0751 |
| 1.5165 | 5.76 | 720 | 2.1769 | 0.0071 | 0.0751 |
| 1.339 | 5.92 | 740 | 2.1722 | 0.0071 | 0.0749 |
| 1.3154 | 6.08 | 760 | 2.1672 | 0.0071 | 0.0749 |
| 1.3402 | 6.24 | 780 | 2.1630 | 0.0071 | 0.0751 |
| 1.5663 | 6.4 | 800 | 2.1584 | 0.0071 | 0.0751 |
| 1.3663 | 6.56 | 820 | 2.1538 | 0.0071 | 0.0749 |
| 1.4353 | 6.72 | 840 | 2.1489 | 0.0071 | 0.0749 |
| 1.3013 | 6.88 | 860 | 2.1447 | 0.0063 | 0.0746 |
| 1.2856 | 7.04 | 880 | 2.1405 | 0.0063 | 0.0746 |
| 1.4434 | 7.2 | 900 | 2.1364 | 0.0063 | 0.0746 |
| 1.4239 | 7.36 | 920 | 2.1324 | 0.0063 | 0.0742 |
| 1.3253 | 7.52 | 940 | 2.1287 | 0.0063 | 0.0742 |
| 1.2715 | 7.68 | 960 | 2.1254 | 0.0063 | 0.0742 |
| 1.404 | 7.84 | 980 | 2.1222 | 0.0063 | 0.0742 |
| 1.3504 | 8.0 | 1000 | 2.1188 | 0.0063 | 0.0744 |
| 1.3449 | 8.16 | 1020 | 2.1154 | 0.0063 | 0.0744 |
| 1.3085 | 8.32 | 1040 | 2.1121 | 0.0063 | 0.0742 |
| 1.3783 | 8.48 | 1060 | 2.1093 | 0.0063 | 0.0744 |
| 1.4059 | 8.64 | 1080 | 2.1065 | 0.0063 | 0.0744 |
| 1.3948 | 8.8 | 1100 | 2.1040 | 0.0063 | 0.0742 |
| 1.4517 | 8.96 | 1120 | 2.1015 | 0.0063 | 0.0744 |
| 1.4014 | 9.12 | 1140 | 2.0992 | 0.0063 | 0.0744 |
| 1.4002 | 9.28 | 1160 | 2.0971 | 0.0063 | 0.0746 |
| 1.2871 | 9.44 | 1180 | 2.0950 | 0.0063 | 0.0746 |
| 1.4064 | 9.6 | 1200 | 2.0930 | 0.0063 | 0.0746 |
| 1.4157 | 9.76 | 1220 | 2.0914 | 0.0062 | 0.0742 |
| 1.364 | 9.92 | 1240 | 2.0900 | 0.0062 | 0.0742 |
| 1.4199 | 10.08 | 1260 | 2.0887 | 0.0062 | 0.0742 |
| 1.4118 | 10.24 | 1280 | 2.0876 | 0.0062 | 0.0738 |
| 1.3544 | 10.4 | 1300 | 2.0867 | 0.0062 | 0.0738 |
| 1.3636 | 10.56 | 1320 | 2.0859 | 0.0062 | 0.0738 |
| 1.4293 | 10.72 | 1340 | 2.0850 | 0.0062 | 0.0738 |
| 1.3867 | 10.88 | 1360 | 2.0842 | 0.0062 | 0.0738 |
| 1.2971 | 11.04 | 1380 | 2.0836 | 0.0062 | 0.0738 |
| 1.3792 | 11.2 | 1400 | 2.0832 | 0.0062 | 0.0738 |
| 1.3477 | 11.36 | 1420 | 2.0827 | 0.0062 | 0.0738 |
| 1.3216 | 11.52 | 1440 | 2.0824 | 0.0062 | 0.0738 |
| 1.3645 | 11.68 | 1460 | 2.0822 | 0.0062 | 0.0737 |
| 1.3225 | 11.84 | 1480 | 2.0821 | 0.0062 | 0.0738 |
| 1.2991 | 12.0 | 1500 | 2.0820 | 0.0062 | 0.0738 |
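The table supports a few back-of-the-envelope checks. The sketch below derives steps per epoch from the logged totals, evaluates a linear learning-rate decay at a given step, and converts a validation loss to perplexity. It rests on assumptions the card does not state: no warmup steps, no gradient accumulation, a single device, and a loss that is the mean per-token cross-entropy in nats. All function names are illustrative.

```python
import math

# Values taken from the hyperparameter list and the last table row.
PEAK_LR = 1e-5
TRAIN_BATCH_SIZE = 8
TOTAL_STEPS = 1500
NUM_EPOCHS = 12


def steps_per_epoch(total_steps=TOTAL_STEPS, num_epochs=NUM_EPOCHS):
    """Optimizer steps per epoch, from the final step and epoch count."""
    return total_steps // num_epochs


def approx_train_examples(batch_size=TRAIN_BATCH_SIZE):
    """Rough training-set size; an upper bound if the last batch is partial.

    Assumes one device and no gradient accumulation.
    """
    return steps_per_epoch() * batch_size


def linear_lr(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS):
    """Learning rate under linear decay to zero, assuming no warmup."""
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / total_steps


def perplexity(loss):
    """Perplexity, assuming `loss` is mean per-token cross-entropy in nats."""
    return math.exp(loss)


if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch())
    print("approx. training examples:", approx_train_examples())
    print("lr at step 750:", linear_lr(750))
    print("perplexity at step 0:", perplexity(3.1247))
    print("perplexity at step 1500:", perplexity(2.0820))
```

With the logged values this gives 125 steps per epoch, which agrees with step 20 landing at epoch 0.16, and roughly 1000 training examples under the stated assumptions.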
Framework versions
- PEFT 0.14.0
- Transformers 4.51.1
- Pytorch 2.5.1+cu124
- Datasets 3.5.0
- Tokenizers 0.21.0
Model tree for rtweera/qwen25-0.5b-prefix-tuning
Base model
Qwen/Qwen2.5-0.5B
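Given the base model named above and the PEFT version listed under framework versions, the repository most likely holds a PEFT adapter rather than full model weights. Under that assumption, a minimal loading sketch might look like the following. The function names are hypothetical, the code has not been run against this repository, and it assumes the tokenizer is taken from the base model and that the adapter was saved for causal language modeling.

```python
def load_prefix_tuned_model(
    adapter_id="rtweera/qwen25-0.5b-prefix-tuning",
    base_id="Qwen/Qwen2.5-0.5B",
):
    """Load the base model and attach the adapter (assumed to be a PEFT adapter).

    Imports are deferred so this file can be imported without the heavy
    dependencies installed. Calling the function downloads weights.
    """
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def generate_text(tokenizer, model, prompt, max_new_tokens=64):
    """Greedy-decode a continuation of `prompt` and return it as a string."""
    import torch

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Since the card does not document the training data or the intended prompt format, and the reported Bleu and Rougel scores are low, any generated output should be treated as exploratory.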