---
license: llama3.1
base_model: meta-llama/Llama-3.1-8B-Instruct
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: llama3_rm
  results: []
---


# llama3_rm

This model is a fine-tuned version of [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.3742
- Accuracy: 0.8948

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-06
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 8
- total_train_batch_size: 256
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 2
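
The batch-size totals and the warmup length follow from the values listed above. The sketch below reproduces that arithmetic and the general shape of a linear-warmup, cosine-decay schedule. It is an illustration, not the training code: `ASSUMED_TOTAL_STEPS` is an estimate read off the results table (step 280 is logged at epoch 1.9736), and decay to zero is assumed, since the card states neither.

```python
import math

# Values copied from the hyperparameter list above.
PER_DEVICE_TRAIN_BATCH = 4
PER_DEVICE_EVAL_BATCH = 4
NUM_DEVICES = 8
GRAD_ACCUM_STEPS = 8
PEAK_LR = 2e-6
WARMUP_RATIO = 0.03

# Assumption: roughly 284 optimizer steps over 2 epochs, estimated from the
# results table. The card does not state the exact total.
ASSUMED_TOTAL_STEPS = 284

# Effective batch sizes: gradient accumulation applies to training only.
total_train_batch = PER_DEVICE_TRAIN_BATCH * NUM_DEVICES * GRAD_ACCUM_STEPS
total_eval_batch = PER_DEVICE_EVAL_BATCH * NUM_DEVICES

# Warmup length implied by the warmup ratio, rounded up.
warmup_steps = math.ceil(ASSUMED_TOTAL_STEPS * WARMUP_RATIO)


def cosine_lr(step, total_steps=ASSUMED_TOTAL_STEPS,
              warmup=warmup_steps, peak=PEAK_LR):
    """Linear warmup to `peak`, then half-cosine decay to zero (assumed)."""
    if step < warmup:
        return peak * step / max(1, warmup)
    progress = (step - warmup) / max(1, total_steps - warmup)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch)
    print("total_eval_batch_size:", total_eval_batch)
    print("warmup_steps (estimated):", warmup_steps)
    for step in (0, warmup_steps, 140, ASSUMED_TOTAL_STEPS):
        print(step, cosine_lr(step))
```

Under these assumptions, warmup covers only the first few optimizer steps, so nearly all of training runs on the decaying part of the curve.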

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy |
|:-------------:|:------:|:----:|:---------------:|:--------:|
| 0.6626        | 0.0705 | 10   | 0.5923          | 0.6719   |
| 0.5039        | 0.1410 | 20   | 0.5139          | 0.7568   |
| 0.4865        | 0.2115 | 30   | 0.4816          | 0.7646   |
| 0.4814        | 0.2819 | 40   | 0.4566          | 0.7891   |
| 0.4932        | 0.3524 | 50   | 0.4449          | 0.7979   |
| 0.4723        | 0.4229 | 60   | 0.4267          | 0.8031   |
| 0.3906        | 0.4934 | 70   | 0.4042          | 0.8208   |
| 0.3418        | 0.5639 | 80   | 0.3907          | 0.8245   |
| 0.4427        | 0.6344 | 90   | 0.3736          | 0.8359   |
| 0.4022        | 0.7048 | 100  | 0.3578          | 0.8484   |
| 0.3738        | 0.7753 | 110  | 0.3470          | 0.8542   |
| 0.3619        | 0.8458 | 120  | 0.3328          | 0.8609   |
| 0.3266        | 0.9163 | 130  | 0.3256          | 0.8651   |
| 0.2786        | 0.9868 | 140  | 0.3245          | 0.8693   |
| 0.1685        | 1.0573 | 150  | 0.4035          | 0.8786   |
| 0.0421        | 1.1278 | 160  | 0.4395          | 0.8823   |
| 0.0655        | 1.1982 | 170  | 0.3843          | 0.8828   |
| 0.0734        | 1.2687 | 180  | 0.3645          | 0.8823   |
| 0.1441        | 1.3392 | 190  | 0.4277          | 0.8833   |
| 0.1176        | 1.4097 | 200  | 0.4040          | 0.8896   |
| 0.0826        | 1.4802 | 210  | 0.3609          | 0.8870   |
| 0.0560        | 1.5507 | 220  | 0.3542          | 0.8891   |
| 0.0487        | 1.6211 | 230  | 0.3668          | 0.8927   |
| 0.0815        | 1.6916 | 240  | 0.3735          | 0.8938   |
| 0.0842        | 1.7621 | 250  | 0.3751          | 0.8943   |
| 0.0868        | 1.8326 | 260  | 0.3758          | 0.8932   |
| 0.0743        | 1.9031 | 270  | 0.3753          | 0.8938   |
| 0.0874        | 1.9736 | 280  | 0.3742          | 0.8948   |
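
The table can be scanned programmatically to see which logged step is best under each metric. The short sketch below copies the step, validation loss, and accuracy columns verbatim and picks the row with the lowest loss and the row with the highest accuracy. The variable names are illustrative only.

```python
# (step, validation_loss, accuracy) rows copied from the table above.
ROWS = [
    (10, 0.5923, 0.6719), (20, 0.5139, 0.7568), (30, 0.4816, 0.7646),
    (40, 0.4566, 0.7891), (50, 0.4449, 0.7979), (60, 0.4267, 0.8031),
    (70, 0.4042, 0.8208), (80, 0.3907, 0.8245), (90, 0.3736, 0.8359),
    (100, 0.3578, 0.8484), (110, 0.3470, 0.8542), (120, 0.3328, 0.8609),
    (130, 0.3256, 0.8651), (140, 0.3245, 0.8693), (150, 0.4035, 0.8786),
    (160, 0.4395, 0.8823), (170, 0.3843, 0.8828), (180, 0.3645, 0.8823),
    (190, 0.4277, 0.8833), (200, 0.4040, 0.8896), (210, 0.3609, 0.8870),
    (220, 0.3542, 0.8891), (230, 0.3668, 0.8927), (240, 0.3735, 0.8938),
    (250, 0.3751, 0.8943), (260, 0.3758, 0.8932), (270, 0.3753, 0.8938),
    (280, 0.3742, 0.8948),
]

# Row with the lowest validation loss and row with the highest accuracy.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_accuracy = max(ROWS, key=lambda row: row[2])

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss)
    print("highest accuracy:", best_by_accuracy)
```

The two criteria disagree: validation loss is lowest at step 140 (0.3245), near the end of the first epoch, while accuracy peaks at the final logged step 280 (0.8948). The reported evaluation results correspond to step 280.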


### Framework versions

- Transformers 4.43.4
- PyTorch 2.1.2+cu121
- Datasets 4.4.1
- Tokenizers 0.19.1
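
To compare a local environment against the versions listed above, a small check along these lines may help. It is a sketch: the PyPI package names in `EXPECTED` (for example `torch` for PyTorch) are assumptions about how the libraries were installed, and the helper names are hypothetical.

```python
from importlib import metadata

# Versions listed in this card, keyed by assumed PyPI package name.
EXPECTED = {
    "transformers": "4.43.4",
    "torch": "2.1.2+cu121",
    "datasets": "4.4.1",
    "tokenizers": "0.19.1",
}


def installed_versions(packages):
    """Return {package: installed version, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(expected, installed):
    """Return {package: (expected, installed)} for versions that differ."""
    return {
        name: (want, installed.get(name))
        for name, want in expected.items()
        if installed.get(name) != want
    }


if __name__ == "__main__":
    mismatches = find_mismatches(EXPECTED, installed_versions(EXPECTED))
    for name, (want, got) in mismatches.items():
        print(f"{name}: card lists {want}, found {got}")
    if not mismatches:
        print("All listed versions match.")
```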