# llama-3.1-coherence-reg-adapter
This model is a fine-tuned version of `meta-llama/Llama-3.1-8B-Instruct` on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.3659
- MSE: 0.3659
- RMSE: 0.6049
- MAE: 0.4649
- R²: 0.1011
- Rounded Accuracy: 0.6873
- MAE Class 0: 2.9010
- MSE Class 0: 8.5636
- MAE Class 1: 2.2816
- MSE Class 1: 5.3301
- MAE Class 2: 1.4010
- MSE Class 2: 2.0718
- MAE Class 3: 0.6129
- MSE Class 3: 0.4212
- MAE Class 4: 0.3256
- MSE Class 4: 0.1442
- Pred Count 0: 0
- Pred Percent 0: 0.0
- Pred Count 1: 1
- Pred Percent 1: 0.0246
- Pred Count 2: 14
- Pred Percent 2: 0.3444
- Pred Count 3: 671
- Pred Percent 3: 16.5068
- Pred Count 4: 3379
- Pred Percent 4: 83.1242
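The per-class breakdown and the prediction distribution are easiest to read together: 83.1% of rounded predictions land on class 4, which is why the errors on the low-coherence classes (e.g. MAE 2.90 on class 0) are so large. Below is a minimal sketch of how metrics in this style can be computed, assuming per-class MAE/MSE are grouped by the gold label, scores live on a 0-4 scale, and the pred counts/percents describe the rounded predictions; the function name `coherence_metrics` is illustrative, not taken from the training code.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def coherence_metrics(y_true, y_pred, classes=(0, 1, 2, 3, 4)):
    """Overall and per-class regression metrics plus the rounded-prediction distribution."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rounded = np.clip(np.round(y_pred), min(classes), max(classes))

    metrics = {
        "mse": mean_squared_error(y_true, y_pred),
        "rmse": mean_squared_error(y_true, y_pred) ** 0.5,
        "mae": mean_absolute_error(y_true, y_pred),
        "r2": r2_score(y_true, y_pred),
        "rounded_accuracy": float((rounded == y_true).mean()),
    }
    for c in classes:
        mask = y_true == c  # examples whose gold label is c (assumption)
        if mask.any():
            metrics[f"mae_class_{c}"] = mean_absolute_error(y_true[mask], y_pred[mask])
            metrics[f"mse_class_{c}"] = mean_squared_error(y_true[mask], y_pred[mask])
        n = int((rounded == c).sum())  # how often the model predicts class c
        metrics[f"pred_count_{c}"] = n
        metrics[f"pred_percent_{c}"] = 100.0 * n / len(y_pred)
    return metrics
```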
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 8
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
- mixed_precision_training: Native AMP
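For reference, here is a hedged sketch of a `transformers.TrainingArguments` configuration matching the list above. The `output_dir` and the surrounding `Trainer`/PEFT setup are assumptions, and "Native AMP" is mapped to `fp16=True` (it could equally have been `bf16=True`):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama-3.1-coherence-reg-adapter",  # assumed, not from the card
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=8,   # total train batch size: 1 x 8 = 8
    seed=42,
    optim="adamw_torch",             # AdamW with default betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=1,
    fp16=True,                       # "Native AMP"; bf16 is also plausible
)
```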
### Training results
| Training Loss | Epoch | Step | Validation Loss | MSE | RMSE | MAE | R² | Rounded Accuracy | MAE Class 0 | MSE Class 0 | MAE Class 1 | MSE Class 1 | MAE Class 2 | MSE Class 2 | MAE Class 3 | MSE Class 3 | MAE Class 4 | MSE Class 4 | Pred Count 0 | Pred Percent 0 | Pred Count 1 | Pred Percent 1 | Pred Count 2 | Pred Percent 2 | Pred Count 3 | Pred Percent 3 | Pred Count 4 | Pred Percent 4 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.4377 | 0.2460 | 500 | 0.5783 | 0.5783 | 0.7604 | 0.6132 | -0.4208 | 0.4396 | 3.1517 | 10.6746 | 2.3930 | 6.0712 | 1.3554 | 1.9757 | 0.4875 | 0.4073 | 0.5703 | 0.4337 | 11 | 0.2706 | 10 | 0.2460 | 40 | 0.9840 | 2396 | 58.9422 | 1608 | 39.5572 |
| 0.5224 | 0.4920 | 1000 | 0.4044 | 0.4044 | 0.6359 | 0.5216 | 0.0063 | 0.6369 | 2.9176 | 8.7381 | 2.3312 | 5.5281 | 1.3939 | 2.0149 | 0.5432 | 0.3448 | 0.4252 | 0.2213 | 1 | 0.0246 | 7 | 0.1722 | 24 | 0.5904 | 1108 | 27.2571 | 2925 | 71.9557 |
| 0.3067 | 0.7381 | 1500 | 0.3962 | 0.3962 | 0.6294 | 0.5328 | 0.0265 | 0.5808 | 2.6475 | 7.1262 | 2.1129 | 4.5911 | 1.2537 | 1.6802 | 0.4870 | 0.2874 | 0.4709 | 0.2670 | 0 | 0.0 | 2 | 0.0492 | 30 | 0.7380 | 1598 | 39.3112 | 2435 | 59.9016 |
| 0.3516 | 0.9841 | 2000 | 0.3659 | 0.3659 | 0.6049 | 0.4649 | 0.1011 | 0.6873 | 2.9010 | 8.5636 | 2.2816 | 5.3301 | 1.4010 | 2.0718 | 0.6129 | 0.4212 | 0.3256 | 0.1442 | 0 | 0.0 | 1 | 0.0246 | 14 | 0.3444 | 671 | 16.5068 | 3379 | 83.1242 |
### Framework versions
- PEFT 0.13.2
- Transformers 4.49.0
- PyTorch 2.5.1+cu124
- Datasets 3.3.2
- Tokenizers 0.21.1
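Since the adapter was trained with PEFT, it can presumably be loaded on top of the base model with `peft`. A minimal sketch, assuming the adapter wraps a single-output sequence-classification (regression) head with `num_labels=1` and that the checkpoint includes the head's weights; if the actual task head differs, adjust the model class accordingly:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.1-8B-Instruct"
adapter_id = "Jennny/llama-3.1-coherence-reg-adapter"

tokenizer = AutoTokenizer.from_pretrained(base_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # Llama ships without a pad token

base = AutoModelForSequenceClassification.from_pretrained(
    base_id, num_labels=1, torch_dtype=torch.bfloat16
)
base.config.pad_token_id = tokenizer.pad_token_id
model = PeftModel.from_pretrained(base, adapter_id).eval()

inputs = tokenizer("Text whose coherence we want to score.", return_tensors="pt")
with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()  # scalar regression output
print(f"predicted coherence: {score:.3f}")
```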