---
library_name: peft
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
tags:
- llama-factory
- prefix-tuning
- generated_from_trainer
model-index:
- name: train_wsc_1756729607
  results: []
---


# train_wsc_1756729607

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the wsc dataset.
It achieves the following results on the evaluation set:
- Loss: 0.3526
- Num Input Tokens Seen: 437760
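
A minimal loading sketch, assuming the adapter is stored in standard PEFT format (the card's metadata lists `peft` as the library). The `adapter_path` argument is a placeholder, not a confirmed repository name: pass the Hub repo ID or local directory that actually holds the adapter files. The base model is gated on the Hub, so access must already be granted.

```python
def load_adapter(adapter_path, base_id="meta-llama/Meta-Llama-3-8B-Instruct"):
    """Attach the adapter to the base model and return (model, tokenizer).

    adapter_path is a placeholder supplied by the caller: the Hub repo ID
    or local directory containing this adapter's files.
    """
    # Imported inside the function so the heavy dependencies load only on use.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")

    # Wrap the frozen base model with the trained adapter weights.
    model = PeftModel.from_pretrained(base, adapter_path)
    model.eval()
    return model, tokenizer
```

Calling `load_adapter("path/to/adapter")` with a real path returns a model ready for `generate`; the path shown here is illustrative only.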

## Model description

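According to this card's metadata, this is a prefix-tuning adapter for Meta-Llama-3-8B-Instruct, trained with LLaMA-Factory and packaged for the PEFT library. More information needed.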

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 123
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10.0
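
To illustrate how `learning_rate`, `lr_scheduler_type: cosine`, and `lr_scheduler_warmup_ratio` interact, here is a rough sketch of the learning rate over time. It assumes the usual shape of a cosine schedule (linear warmup followed by a half-cosine decay to zero) and a total of about 2490 optimizer steps, which is an estimate derived from the results table (step 125 at epoch 0.5020, so roughly 249 steps per epoch over 10 epochs). The names `cosine_lr` and `TOTAL_STEPS` are illustrative and not taken from the training code.

```python
import math

BASE_LR = 5e-05      # learning_rate reported in this card
WARMUP_RATIO = 0.1   # lr_scheduler_warmup_ratio reported in this card
TOTAL_STEPS = 2490   # assumption: ~249 steps per epoch x 10 epochs


def cosine_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_ratio=WARMUP_RATIO):
    """Learning rate at a given optimizer step (sketch, not the Trainer's code)."""
    # Assumption: the warmup length is the ratio times total steps, rounded up.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Half-cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for s in (0, 125, 250, 1250, 2490):
        print(s, cosine_lr(s))
```

Under these assumptions the rate peaks at 5e-05 once warmup ends (around step 249) and falls back to zero at the final step.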

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| 0.5349        | 0.5020 | 125  | 1.2114          | 22304             |
| 0.4306        | 1.0040 | 250  | 0.4652          | 44064             |
| 0.3766        | 1.5060 | 375  | 0.3868          | 65808             |
| 0.3159        | 2.0080 | 500  | 0.3830          | 88048             |
| 0.3958        | 2.5100 | 625  | 0.3676          | 109696            |
| 0.3521        | 3.0120 | 750  | 0.3507          | 131872            |
| 0.3677        | 3.5141 | 875  | 0.3535          | 154416            |
| 0.3426        | 4.0161 | 1000 | 0.3507          | 176048            |
| 0.3393        | 4.5181 | 1125 | 0.3546          | 198432            |
| 0.3601        | 5.0201 | 1250 | 0.3592          | 219680            |
| 0.3422        | 5.5221 | 1375 | 0.3506          | 241136            |
| 0.3609        | 6.0241 | 1500 | 0.3502          | 263616            |
| 0.3457        | 6.5261 | 1625 | 0.3554          | 285424            |
| 0.3150        | 7.0281 | 1750 | 0.3651          | 307792            |
| 0.3149        | 7.5301 | 1875 | 0.3626          | 329840            |
| 0.3441        | 8.0321 | 2000 | 0.3485          | 351552            |
| 0.3574        | 8.5341 | 2125 | 0.3516          | 373424            |
| 0.3673        | 9.0361 | 2250 | 0.3545          | 395616            |
| 0.3419        | 9.5382 | 2375 | 0.3475          | 417520            |
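
The evaluation loss reported at the top of this card (0.3526) is not the lowest value in the table. The short sketch below copies the step and validation-loss columns from the table and picks out the best logged checkpoint; it also estimates the number of optimizer steps per epoch from the first row. The variable names are illustrative.

```python
# (step, validation loss) pairs copied from the training results table.
eval_log = [
    (125, 1.2114), (250, 0.4652), (375, 0.3868), (500, 0.3830),
    (625, 0.3676), (750, 0.3507), (875, 0.3535), (1000, 0.3507),
    (1125, 0.3546), (1250, 0.3592), (1375, 0.3506), (1500, 0.3502),
    (1625, 0.3554), (1750, 0.3651), (1875, 0.3626), (2000, 0.3485),
    (2125, 0.3516), (2250, 0.3545), (2375, 0.3475),
]

# Logged evaluation point with the lowest validation loss.
best_step, best_loss = min(eval_log, key=lambda row: row[1])

# Rough steps-per-epoch estimate from the first row (step 125 at epoch 0.5020).
steps_per_epoch = round(125 / 0.5020)

if __name__ == "__main__":
    print(best_step, best_loss)   # 2375 0.3475
    print(steps_per_epoch)        # 249
```

After roughly epoch 3 the validation loss stays within a narrow band (about 0.35 to 0.37), so the differences between late checkpoints are small.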


### Framework versions

- PEFT 0.15.2
- Transformers 4.51.3
- PyTorch 2.8.0+cu128
- Datasets 3.6.0
- Tokenizers 0.21.1