---
tags:
- generated_from_trainer
datasets:
- generator
model-index:
- name: bert-concat
  results: []
---

# bert-concat

This model was fine-tuned on the `generator` dataset; the card does not record the base checkpoint.
At the final evaluation (step 27500), it achieves the following results on the evaluation set:
- Loss: 5.9507
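
The loss above can be read as a perplexity, which is often easier to compare across runs. The sketch below assumes the reported value is a mean token-level cross-entropy in nats; this card does not confirm that, so treat the result (roughly 384) as an estimate under that assumption.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 5.9507

# Assumption: the loss is a mean cross-entropy in nats per predicted token.
# Under that assumption, perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.0f}")
```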

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 1000
- num_epochs: 14
- mixed_precision_training: Native AMP
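
To make the `cosine` scheduler with 1000 warmup steps concrete, here is a minimal pure-Python sketch of a linear warmup followed by a half-cosine decay to zero. It is meant to mirror the usual shape of such a schedule, not to reproduce the exact training code. `TOTAL_STEPS` is an assumption: it is estimated from the results table (27500 steps at epoch 13.9, so about 1,978 steps per epoch over 14 epochs), and `lr_at` is a hypothetical helper name.

```python
import math

BASE_LR = 5e-4        # learning_rate from the list above
WARMUP_STEPS = 1000   # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 27_700  # assumption: ~14 epochs at ~1,978 steps per epoch


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then half-cosine decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to BASE_LR over the warmup steps.
        return BASE_LR * step / WARMUP_STEPS
    # Fraction of the post-warmup schedule that has elapsed, in [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Start, mid-warmup, peak, midpoint of the decay, and end of training.
for step in (0, 500, 1000, 14_350, 27_700):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the rate peaks at 5e-4 at step 1000, falls to half of that midway through the decay, and reaches zero at the final step.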

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 7.3397        | 0.25  | 500   | 6.6405          |
| 6.5835        | 0.51  | 1000  | 6.5183          |
| 6.4967        | 0.76  | 1500  | 6.4926          |
| 6.4510        | 1.01  | 2000  | 6.4507          |
| 6.4104        | 1.26  | 2500  | 6.4097          |
| 6.3868        | 1.52  | 3000  | 6.4019          |
| 6.3717        | 1.77  | 3500  | 6.3789          |
| 6.3361        | 2.02  | 4000  | 6.3596          |
| 6.3099        | 2.28  | 4500  | 6.3345          |
| 6.2807        | 2.53  | 5000  | 6.3050          |
| 6.2578        | 2.78  | 5500  | 6.2843          |
| 6.2356        | 3.03  | 6000  | 6.2735          |
| 6.2017        | 3.29  | 6500  | 6.2527          |
| 6.1837        | 3.54  | 7000  | 6.2277          |
| 6.1682        | 3.79  | 7500  | 6.2102          |
| 6.1443        | 4.04  | 8000  | 6.1917          |
| 6.1128        | 4.30  | 8500  | 6.1767          |
| 6.1034        | 4.55  | 9000  | 6.1678          |
| 6.0838        | 4.80  | 9500  | 6.1552          |
| 6.0641        | 5.06  | 10000 | 6.1401          |
| 6.0417        | 5.31  | 10500 | 6.1350          |
| 6.0247        | 5.56  | 11000 | 6.1123          |
| 6.0125        | 5.81  | 11500 | 6.1082          |
| 6.0028        | 6.07  | 12000 | 6.1022          |
| 5.9788        | 6.32  | 12500 | 6.0895          |
| 5.9739        | 6.57  | 13000 | 6.0828          |
| 5.9545        | 6.83  | 13500 | 6.0687          |
| 5.9441        | 7.08  | 14000 | 6.0652          |
| 5.9230        | 7.33  | 14500 | 6.0567          |
| 5.9115        | 7.58  | 15000 | 6.0492          |
| 5.9106        | 7.84  | 15500 | 6.0466          |
| 5.8943        | 8.09  | 16000 | 6.0315          |
| 5.8726        | 8.34  | 16500 | 6.0339          |
| 5.8665        | 8.59  | 17000 | 6.0243          |
| 5.8548        | 8.85  | 17500 | 6.0193          |
| 5.8431        | 9.10  | 18000 | 6.0111          |
| 5.8218        | 9.35  | 18500 | 6.0053          |
| 5.8193        | 9.61  | 19000 | 6.0026          |
| 5.8174        | 9.86  | 19500 | 5.9927          |
| 5.7954        | 10.11 | 20000 | 5.9873          |
| 5.7779        | 10.36 | 20500 | 5.9823          |
| 5.7749        | 10.62 | 21000 | 5.9799          |
| 5.7739        | 10.87 | 21500 | 5.9784          |
| 5.7582        | 11.12 | 22000 | 5.9757          |
| 5.7415        | 11.38 | 22500 | 5.9686          |
| 5.7467        | 11.63 | 23000 | 5.9650          |
| 5.7448        | 11.88 | 23500 | 5.9648          |
| 5.7372        | 12.13 | 24000 | 5.9585          |
| 5.7207        | 12.39 | 24500 | 5.9596          |
| 5.7264        | 12.64 | 25000 | 5.9546          |
| 5.7212        | 12.89 | 25500 | 5.9516          |
| 5.7142        | 13.14 | 26000 | 5.9553          |
| 5.7103        | 13.40 | 26500 | 5.9551          |
| 5.7093        | 13.65 | 27000 | 5.9527          |
| 5.7183        | 13.90 | 27500 | 5.9507          |
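
As a rough way to read the end of the table, the sketch below takes the last five rows and computes how much the validation loss still moved and how far it sits above the training loss. The numbers are copied from the table; the logged training loss is whatever the Trainer reported at each logging step, so the gap is only an approximate indicator.

```python
# (step, training loss, validation loss) for the last five evaluations in the table.
rows = [
    (25500, 5.7212, 5.9516),
    (26000, 5.7142, 5.9553),
    (26500, 5.7103, 5.9551),
    (27000, 5.7093, 5.9527),
    (27500, 5.7183, 5.9507),
]

first_step, _, first_val = rows[0]
last_step, last_train, last_val = rows[-1]

val_improvement = first_val - last_val  # drop in validation loss across these rows
train_val_gap = last_val - last_train   # validation minus training loss at the end

print(f"validation loss change over steps {first_step}-{last_step}: {val_improvement:.4f}")
print(f"final train/validation gap: {train_val_gap:.4f}")
```

The validation loss changes by less than 0.001 over the last 2000 steps, while the final gap to the training loss is about 0.23.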


### Framework versions

- Transformers 4.26.1
- Pytorch 1.11.0+cu113
- Datasets 2.13.0
- Tokenizers 0.13.3