NasimB/bert-dp-second

+---
+tags:
+- generated_from_trainer
+datasets:
+- generator
+model-index:
+- name: bert-dp-second
+  results: []
+---
+# bert-dp-second
+This model is a fine-tuned version of an unspecified base checkpoint (the auto-generated link was left empty) on the generator dataset.
+It achieves the following results on the evaluation set:
+- Loss: 3.5640
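The reported loss is easier to compare across models when converted to a perplexity. The sketch below assumes the value is a mean per-token cross-entropy in nats, which the card does not state explicitly; the variable names are illustrative.

```python
import math

# Final evaluation loss reported in the card.
eval_loss = 3.5640

# Assumption: the loss is a mean per-token cross-entropy in nats,
# so exponentiating it gives a perplexity.
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.1f}")
```

Under that assumption the final checkpoint has a perplexity of roughly 35 on the evaluation set.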
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0005
+- train_batch_size: 64
+- eval_batch_size: 64
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_steps: 1000
+- num_epochs: 10
+- mixed_precision_training: Native AMP
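To make the `cosine` scheduler and the 1000 warmup steps concrete, here is a minimal sketch of the learning rate as a function of the optimizer step: linear warmup from zero, then half-cosine decay to zero. The total step count is an assumption extrapolated from the results table (step 21000 is logged at epoch 9.8), and `lr_at` is a hypothetical helper, not part of any library.

```python
import math

BASE_LR = 5e-4        # learning_rate from the card
WARMUP_STEPS = 1000   # lr_scheduler_warmup_steps from the card
# Assumption: about 21,430 optimizer steps in total, extrapolated from the
# results table (step 21000 is logged at epoch 9.8, and num_epochs is 10).
TOTAL_STEPS = 21430


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, cosine decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


# Print the schedule at a few representative steps.
for step in (0, 500, 1000, 10000, 21000):
    print(step, f"{lr_at(step):.2e}")
```

The peak rate of 0.0005 is reached exactly at step 1000, and by the last logged step the rate has decayed close to zero, which is consistent with the flattening loss values at the end of training.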
+### Training results
+| Training Loss | Epoch | Step  | Validation Loss |
+|:-------------:|:-----:|:-----:|:---------------:|
+| 7.3416        | 0.23  | 500   | 6.6532          |
+| 6.5752        | 0.47  | 1000  | 6.5275          |
+| 6.4866        | 0.7   | 1500  | 6.4720          |
+| 6.4273        | 0.93  | 2000  | 6.4540          |
+| 6.4036        | 1.17  | 2500  | 6.4236          |
+| 6.3779        | 1.4   | 3000  | 6.4018          |
+| 6.3528        | 1.63  | 3500  | 6.3768          |
+| 6.3258        | 1.87  | 4000  | 6.3679          |
+| 6.3009        | 2.1   | 4500  | 6.3305          |
+| 6.2646        | 2.33  | 5000  | 6.3142          |
+| 6.2583        | 2.57  | 5500  | 6.3004          |
+| 6.2223        | 2.8   | 6000  | 6.2605          |
+| 6.1941        | 3.03  | 6500  | 6.2353          |
+| 6.1382        | 3.27  | 7000  | 6.2095          |
+| 6.1301        | 3.5   | 7500  | 6.1774          |
+| 6.09          | 3.73  | 8000  | 6.1480          |
+| 6.0624        | 3.97  | 8500  | 6.1061          |
+| 6.0056        | 4.2   | 9000  | 6.0655          |
+| 5.9444        | 4.43  | 9500  | 5.9461          |
+| 5.7101        | 4.67  | 10000 | 5.2594          |
+| 5.005         | 4.9   | 10500 | 4.7348          |
+| 4.6127        | 5.13  | 11000 | 4.4626          |
+| 4.3907        | 5.37  | 11500 | 4.2862          |
+| 4.241         | 5.6   | 12000 | 4.1701          |
+| 4.1286        | 5.83  | 12500 | 4.0673          |
+| 4.0151        | 6.07  | 13000 | 3.9967          |
+| 3.934         | 6.3   | 13500 | 3.9292          |
+| 3.8789        | 6.53  | 14000 | 3.8707          |
+| 3.8231        | 6.77  | 14500 | 3.8222          |
+| 3.7696        | 7.0   | 15000 | 3.7800          |
+| 3.7078        | 7.23  | 15500 | 3.7424          |
+| 3.6671        | 7.47  | 16000 | 3.7093          |
+| 3.6446        | 7.7   | 16500 | 3.6780          |
+| 3.6069        | 7.93  | 17000 | 3.6476          |
+| 3.5782        | 8.17  | 17500 | 3.6283          |
+| 3.5384        | 8.4   | 18000 | 3.6098          |
+| 3.5245        | 8.63  | 18500 | 3.5942          |
+| 3.5209        | 8.87  | 19000 | 3.5841          |
+| 3.4948        | 9.1   | 19500 | 3.5728          |
+| 3.4877        | 9.33  | 20000 | 3.5692          |
+| 3.4818        | 9.57  | 20500 | 3.5641          |
+| 3.4844        | 9.8   | 21000 | 3.5640          |
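The table shows a long plateau near a loss of 6 followed by a sharp drop. As a rough illustration, the sketch below takes a handful of rows copied from the table, locates the largest drop in validation loss between consecutive evaluations, and estimates the number of optimizer steps per epoch. The epoch column is rounded to two decimals, so the per-epoch figure is only an estimate.

```python
# (epoch, step, validation_loss) for consecutive evaluations around the
# transition, copied from the results table.
window = [
    (4.20, 9000, 6.0655),
    (4.43, 9500, 5.9461),
    (4.67, 10000, 5.2594),
    (4.90, 10500, 4.7348),
    (5.13, 11000, 4.4626),
]
# Last logged row of the table.
last_epoch, last_step, last_val_loss = (9.80, 21000, 3.5640)

# Largest decrease in validation loss between two adjacent evaluations.
drops = [
    (prev[2] - cur[2], cur[1])
    for prev, cur in zip(window, window[1:])
]
largest_drop, drop_step = max(drops)

# Estimated optimizer steps per epoch from the last row.
steps_per_epoch = last_step / last_epoch

print(f"largest drop: {largest_drop:.4f}, ending at step {drop_step}")
print(f"steps per epoch (estimate): {steps_per_epoch:.0f}")
```

Within these rows, the steepest improvement falls in the interval ending at step 10000, partway through the fifth epoch.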
+### Framework versions
+- Transformers 4.26.1
+- Pytorch 1.11.0+cu113
+- Datasets 2.13.0
+- Tokenizers 0.13.3
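When reproducing the run, it can help to check the local environment against the versions listed above. The following is a minimal sketch using only the standard library; `PINNED`, `installed_version`, and `find_mismatches` are hypothetical names, and the PyPI distribution names (for example `torch` for Pytorch) are assumptions. The local build suffix reported for `torch` may differ from `+cu113` depending on how it was installed.

```python
from importlib import metadata

# Versions listed in the card, keyed by (assumed) PyPI distribution name.
PINNED = {
    "transformers": "4.26.1",
    "torch": "1.11.0+cu113",
    "datasets": "2.13.0",
    "tokenizers": "0.13.3",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    # Report every package that is missing or at a different version.
    for name, (wanted, found) in find_mismatches(PINNED).items():
        print(f"{name}: card lists {wanted}, found {found}")
```

The `lookup` parameter exists so the comparison can be exercised without touching the real environment, for example by passing the `get` method of a plain dictionary.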