Magpie-Align/Magpie-Pro-MT-300K-v0.1
fblgit/pancho-v1-qw25-3B-UNAMGS is a fine-tuned version of Qwen/Qwen2.5-3B-Instruct. Its evaluation-set results appear in the Validation Loss column of the table below; the last logged value is 0.6555 at step 665.
Trained with MagPie:
UNA on MLPs 4, 10, 16, 22, 28
MGS on 3 Scales.
Following the findings of https://arxiv.org/abs/2410.21228.
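The UNA layer indices listed above are evenly spaced, which a few lines of Python make explicit. This is only an illustrative sketch: the card does not say whether the indices are zero-based or how UNA is applied, and the `model.layers.{i}.mlp` name template is an assumption about how the decoder blocks are named in the checkpoint, not something stated in this card.

```python
# UNA layer indices exactly as listed in the model card.
UNA_MLP_LAYERS = [4, 10, 16, 22, 28]


def common_stride(indices):
    """Return the shared gap between consecutive indices, or None if uneven."""
    gaps = {b - a for a, b in zip(indices, indices[1:])}
    return gaps.pop() if len(gaps) == 1 else None


def mlp_module_prefixes(indices, template="model.layers.{i}.mlp"):
    """Build module-name prefixes for the given layer indices.

    The default template is an ASSUMPTION for illustration; check it against
    the parameter names of the checkpoint you actually load.
    """
    return [template.format(i=i) for i in indices]


stride = common_stride(UNA_MLP_LAYERS)
prefixes = mlp_module_prefixes(UNA_MLP_LAYERS)

print("stride between UNA layers:", stride)
for name in prefixes:
    print(name)
```

A list like `prefixes` could be used to filter a model's named parameters when inspecting which blocks were targeted, provided the template matches the real parameter names.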
Any derivative (SFT, merges, etc.) that uses ANY layer from this model MUST include UNA, MGS, or PANCHO in its model name in order to obtain a LICENSE for derivatives of this model.
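The naming requirement above can be checked mechanically before publishing a derivative. The sketch below is a hypothetical helper, not part of any official tooling; it assumes a case-insensitive substring match is acceptable (this model's own name uses lowercase `pancho`), which the license text does not spell out.

```python
# Tags named in the license clause of this model card.
REQUIRED_TAGS = ("UNA", "MGS", "PANCHO")


def satisfies_naming_rule(model_name: str) -> bool:
    """Return True if the name contains at least one required tag.

    ASSUMPTION: matching is a case-insensitive substring test. This is naive:
    an unrelated word that happens to contain "una" would also pass.
    """
    upper = model_name.upper()
    return any(tag in upper for tag in REQUIRED_TAGS)


# The second name is a made-up example of a derivative that omits the tags.
for name in ("fblgit/pancho-v1-qw25-3B-UNAMGS", "example-org/my-merge-3B"):
    print(name, satisfies_naming_rule(name))
```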
Training results, with validation loss evaluated every 35 steps:
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 1.2127 | 0.0015 | 1 | 0.8711 |
| 0.9905 | 0.0509 | 35 | 0.7338 |
| 0.9685 | 0.1019 | 70 | 0.7114 |
| 0.9554 | 0.1528 | 105 | 0.6994 |
| 0.9077 | 0.2037 | 140 | 0.6915 |
| 0.9149 | 0.2547 | 175 | 0.6859 |
| 0.9363 | 0.3056 | 210 | 0.6795 |
| 0.8975 | 0.3566 | 245 | 0.6745 |
| 0.9095 | 0.4075 | 280 | 0.6709 |
| 0.9216 | 0.4584 | 315 | 0.6681 |
| 0.9143 | 0.5094 | 350 | 0.6666 |
| 0.8879 | 0.5603 | 385 | 0.6645 |
| 0.9194 | 0.6112 | 420 | 0.6625 |
| 0.9123 | 0.6622 | 455 | 0.6615 |
| 0.9056 | 0.7131 | 490 | 0.6591 |
| 0.9172 | 0.7641 | 525 | 0.6578 |
| 0.8860 | 0.8150 | 560 | 0.6566 |
| 0.9155 | 0.8659 | 595 | 0.6568 |
| 0.9029 | 0.9169 | 630 | 0.6560 |
| 0.8942 | 0.9678 | 665 | 0.6555 |
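A few summary figures can be derived from the table above. The sketch below copies the logged rows into a list and computes the best validation loss, the relative improvement from the first to the last evaluation, an estimate of steps per epoch, and any steps where validation loss went up. The helper names are illustrative, and the steps-per-epoch figure is only an estimate because the Epoch column is rounded.

```python
# Rows copied from the table: (training_loss, epoch, step, validation_loss).
ROWS = [
    (1.2127, 0.0015, 1, 0.8711), (0.9905, 0.0509, 35, 0.7338),
    (0.9685, 0.1019, 70, 0.7114), (0.9554, 0.1528, 105, 0.6994),
    (0.9077, 0.2037, 140, 0.6915), (0.9149, 0.2547, 175, 0.6859),
    (0.9363, 0.3056, 210, 0.6795), (0.8975, 0.3566, 245, 0.6745),
    (0.9095, 0.4075, 280, 0.6709), (0.9216, 0.4584, 315, 0.6681),
    (0.9143, 0.5094, 350, 0.6666), (0.8879, 0.5603, 385, 0.6645),
    (0.9194, 0.6112, 420, 0.6625), (0.9123, 0.6622, 455, 0.6615),
    (0.9056, 0.7131, 490, 0.6591), (0.9172, 0.7641, 525, 0.6578),
    (0.8860, 0.8150, 560, 0.6566), (0.9155, 0.8659, 595, 0.6568),
    (0.9029, 0.9169, 630, 0.6560), (0.8942, 0.9678, 665, 0.6555),
]


def best_validation(rows):
    """Return (step, validation_loss) for the lowest validation loss."""
    _, _, step, val = min(rows, key=lambda r: r[3])
    return step, val


def relative_drop(rows):
    """Fractional decrease in validation loss from first to last row."""
    first, last = rows[0][3], rows[-1][3]
    return (first - last) / first


def validation_regressions(rows):
    """Steps at which validation loss rose compared with the previous row."""
    return [cur[2] for prev, cur in zip(rows, rows[1:]) if cur[3] > prev[3]]


def estimated_steps_per_epoch(rows):
    """Estimate steps per epoch from the last logged row (epoch is rounded)."""
    _, epoch, step, _ = rows[-1]
    return step / epoch


best_step, best_loss = best_validation(ROWS)
print(f"best validation loss {best_loss} at step {best_step}")
print(f"relative drop in validation loss: {relative_drop(ROWS):.1%}")
print(f"steps where validation loss rose: {validation_regressions(ROWS)}")
print(f"estimated steps per epoch: {estimated_steps_per_epoch(ROWS):.0f}")
```

Validation loss decreases at every evaluation except one, so the last logged checkpoint is also the best one in this table.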