---
library_name: transformers
license: mit
base_model: openai-community/gpt2-medium
tags:
- generated_from_trainer
model-index:
- name: tinystories_scrambled_words
  results: []
---

[Visualize in Weights & Biases](https://wandb.ai/ptsvil/tom-training/runs/fxhgekth)

# tinystories_scrambled_words

This model is a fine-tuned version of [openai-community/gpt2-medium](https://huggingface.co/openai-community/gpt2-medium) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 3.7107

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1
- num_epochs: 3
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 4.3727        | 0.1051 | 400   | 4.3237          |
| 4.1849        | 0.2102 | 800   | 4.1539          |
| 4.1369        | 0.3153 | 1200  | 4.0841          |
| 4.0574        | 0.4204 | 1600  | 4.0165          |
| 4.0123        | 0.5255 | 2000  | 3.9716          |
| 3.9874        | 0.6306 | 2400  | 3.9359          |
| 3.9459        | 0.7357 | 2800  | 3.9111          |
| 3.9004        | 0.8408 | 3200  | 3.8865          |
| 3.9493        | 0.9459 | 3600  | 3.8587          |
| 3.8599        | 1.0510 | 4000  | 3.8427          |
| 3.8753        | 1.1561 | 4400  | 3.8281          |
| 3.8352        | 1.2612 | 4800  | 3.8156          |
| 3.8194        | 1.3663 | 5200  | 3.8007          |
| 3.8334        | 1.4714 | 5600  | 3.7884          |
| 3.8279        | 1.5765 | 6000  | 3.7814          |
| 3.8233        | 1.6816 | 6400  | 3.7728          |
| 3.7066        | 1.7867 | 6800  | 3.7629          |
| 3.7625        | 1.8918 | 7200  | 3.7544          |
| 3.7696        | 1.9969 | 7600  | 3.7465          |
| 3.7041        | 2.1020 | 8000  | 3.7424          |
| 3.7223        | 2.2071 | 8400  | 3.7369          |
| 3.7445        | 2.3122 | 8800  | 3.7362          |
| 3.7023        | 2.4173 | 9200  | 3.7291          |
| 3.6926        | 2.5224 | 9600  | 3.7284          |
| 3.7196        | 2.6275 | 10000 | 3.7212          |
| 3.7469        | 2.7326 | 10400 | 3.7183          |
| 3.7082        | 2.8377 | 10800 | 3.7159          |
| 3.7284        | 2.9428 | 11200 | 3.7147          |

### Framework versions

- Transformers 4.44.1
- Pytorch 2.2.2
- Datasets 2.18.0
- Tokenizers 0.19.1