---
tags:
- generated_from_trainer
model-index:
- name: my_run
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# my_run

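This model is a fine-tuned version of an unspecified base model on an unspecified dataset; the Trainer recorded neither when it generated this card.
It achieves the following results on the evaluation set at the final training step (1,000,000):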
- Loss: 6.4043
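
For readers who prefer perplexity, the loss above can be converted with a one-line formula. This is a minimal sketch that assumes the reported loss is a mean token-level cross-entropy in nats, which this card does not confirm. If the objective is something else, the number is not meaningful. The `perplexity` helper is a hypothetical name introduced only for illustration.

```python
import math

EVAL_LOSS = 6.4043  # final evaluation loss reported in this card


def perplexity(mean_nll_nats: float) -> float:
    """Convert a mean negative log-likelihood (in nats) to perplexity."""
    return math.exp(mean_nll_nats)


if __name__ == "__main__":
    # Only meaningful under the cross-entropy assumption stated above.
    print(f"perplexity ~ {perplexity(EVAL_LOSS):.1f}")
```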

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 128
- total_eval_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 1000000
- mixed_precision_training: Native AMP
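
The relationship between the per-device and total batch sizes, and the shape of the linear schedule, can be sketched in plain Python. This is an illustration under stated assumptions, not the code used for this run: it assumes no gradient accumulation and no warmup, since neither is listed above, and the helper names `total_batch_size` and `linear_lr` are hypothetical.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-5
PER_DEVICE_BATCH_SIZE = 32
NUM_DEVICES = 4
TRAINING_STEPS = 1_000_000


def total_batch_size(per_device: int, num_devices: int, grad_accum: int = 1) -> int:
    """Examples consumed per optimizer step across all devices.

    grad_accum defaults to 1 (an assumption: the card lists no accumulation).
    """
    return per_device * num_devices * grad_accum


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              total_steps: int = TRAINING_STEPS,
              warmup_steps: int = 0) -> float:
    """Linear warmup followed by linear decay to zero.

    warmup_steps defaults to 0 (an assumption: the card lists no warmup).
    """
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    print("total batch size:", total_batch_size(PER_DEVICE_BATCH_SIZE, NUM_DEVICES))
    for step in (0, 250_000, 500_000, 1_000_000):
        print(step, linear_lr(step))
```

With the listed values, 32 examples per device on 4 devices gives the total batch size of 128 shown above, and under the no-warmup assumption the learning rate falls from 1e-05 at step 0 to zero at step 1,000,000.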

### Training results

| Training Loss | Epoch | Step    | Validation Loss |
|:-------------:|:-----:|:-------:|:---------------:|
| 7.1589        | 1.81  | 50000   | 6.8011          |
| 6.7421        | 3.62  | 100000  | 6.6767          |
| 6.6503        | 5.42  | 150000  | 6.6086          |
| 6.5951        | 7.23  | 200000  | 6.5605          |
| 6.5566        | 9.04  | 250000  | 6.5271          |
| 6.5292        | 10.85 | 300000  | 6.5037          |
| 6.5069        | 12.65 | 350000  | 6.4839          |
| 6.4900        | 14.46 | 400000  | 6.4667          |
| 6.4759        | 16.27 | 450000  | 6.4552          |
| 6.4648        | 18.08 | 500000  | 6.4442          |
| 6.4551        | 19.88 | 550000  | 6.4365          |
| 6.4475        | 21.69 | 600000  | 6.4278          |
| 6.4412        | 23.50 | 650000  | 6.4222          |
| 6.4358        | 25.31 | 700000  | 6.4185          |
| 6.4313        | 27.11 | 750000  | 6.4134          |
| 6.4275        | 28.92 | 800000  | 6.4110          |
| 6.4250        | 30.73 | 850000  | 6.4079          |
| 6.4229        | 32.54 | 900000  | 6.4063          |
| 6.4218        | 34.35 | 950000  | 6.4047          |
| 6.4204        | 36.15 | 1000000 | 6.4043          |
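
Two things can be read off the table with simple arithmetic: roughly how many optimizer steps make up one epoch, and how quickly the validation loss is flattening. The sketch below uses only numbers from the table and the total batch size of 128. The dataset-size figure is an estimate that assumes no gradient accumulation and ignores any partial final batch; the variable names are hypothetical.

```python
TOTAL_TRAIN_BATCH_SIZE = 128  # from the hyperparameter list

# Final row of the table: (epoch, step).
FINAL_EPOCH, FINAL_STEP = 36.15, 1_000_000

# Validation loss every 50,000 steps, copied from the table in order.
VAL_LOSS = [
    6.8011, 6.6767, 6.6086, 6.5605, 6.5271,
    6.5037, 6.4839, 6.4667, 6.4552, 6.4442,
    6.4365, 6.4278, 6.4222, 6.4185, 6.4134,
    6.4110, 6.4079, 6.4063, 6.4047, 6.4043,
]

# Steps per epoch, estimated from the last row (least affected by rounding).
steps_per_epoch = FINAL_STEP / FINAL_EPOCH

# Approximate training-set size, assuming one optimizer step per batch.
approx_train_examples = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

# Improvement in validation loss over each 50,000-step interval.
deltas = [prev - curr for prev, curr in zip(VAL_LOSS, VAL_LOSS[1:])]

if __name__ == "__main__":
    print(f"steps per epoch   ~ {steps_per_epoch:,.0f}")
    print(f"training examples ~ {approx_train_examples:,.0f}")
    print(f"first interval improvement: {deltas[0]:.4f}")
    print(f"last interval improvement:  {deltas[-1]:.4f}")
```

The validation loss decreases at every checkpoint, but the gain per 50,000 steps shrinks from 0.1244 in the first interval to 0.0004 in the last.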


### Framework versions

- Transformers 4.38.2
- PyTorch 2.3.0a0+ebedce2
- Datasets 2.17.1
- Tokenizers 0.15.2
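
To compare a local environment against the versions listed above, something like the following sketch may help. It uses only the standard library. The distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumed to be the usual PyPI names, and the PyTorch entry is a pre-release build string, so an exact match is unlikely outside the original environment. The helper names are hypothetical.

```python
from importlib import metadata

# Versions copied from the list above; the keys are assumed PyPI names.
CARD_VERSIONS = {
    "transformers": "4.38.2",
    "torch": "2.3.0a0+ebedce2",
    "datasets": "2.17.1",
    "tokenizers": "0.15.2",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_report(expected=CARD_VERSIONS):
    """Map each package to (version in card, version installed, exact match?)."""
    report = {}
    for package, wanted in expected.items():
        found = installed_version(package)
        report[package] = (wanted, found, found == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, found, ok) in version_report().items():
        status = "match" if ok else "differs"
        print(f"{package}: card={wanted} installed={found} ({status})")
```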