smiled0g committed
Commit fb41304 · verified · 1 Parent(s): 947a88b

End of training

Files changed (4)
  1. README.md +4 -9
  2. config.json +1 -1
  3. model.safetensors +1 -1
  4. training_args.bin +2 -2
README.md CHANGED
@@ -10,7 +10,6 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/smiled0g/preflop/runs/aue4q75g)
 # preflop
 
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
@@ -33,21 +32,17 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 1e-05
-- train_batch_size: 128
+- train_batch_size: 100
 - eval_batch_size: 8
 - seed: 42
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- num_epochs: 3
+- num_epochs: 2
 - mixed_precision_training: Native AMP
 
-### Training results
-
-
-
 ### Framework versions
 
-- Transformers 4.45.2
+- Transformers 4.46.0
 - Pytorch 2.1.1+cu121
 - Datasets 3.0.2
 - Tokenizers 0.20.1
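The hyperparameter changes above (batch size 128 → 100, epochs 3 → 2) change the total number of optimizer steps the run performs. A minimal sketch of that arithmetic, assuming a hypothetical dataset size since the actual dataset is listed as unknown:

```python
import math

# Hyperparameters before and after this commit, taken from the README diff.
old = {"train_batch_size": 128, "num_epochs": 3}
new = {"train_batch_size": 100, "num_epochs": 2}

def optimizer_steps(num_examples: int, batch_size: int, epochs: int) -> int:
    # Steps per epoch round up, since the last partial batch still counts.
    return math.ceil(num_examples / batch_size) * epochs

# Hypothetical dataset size -- purely illustrative.
n = 10_000
print(optimizer_steps(n, old["train_batch_size"], old["num_epochs"]))  # 237
print(optimizer_steps(n, new["train_batch_size"], new["num_epochs"]))  # 200
```

With the new settings the run takes fewer, slightly larger-fraction steps per epoch but fewer epochs overall, so total steps drop.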
config.json CHANGED
@@ -38,7 +38,7 @@
   "rope_theta": 10000.0,
   "tie_word_embeddings": false,
   "torch_dtype": "float32",
-  "transformers_version": "4.45.2",
+  "transformers_version": "4.46.0",
   "use_cache": true,
   "vocab_size": 53
 }
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4251b110111ca00d62c5078597269e737702ad49f6e19f7613700852ab4048cb
+oid sha256:6a2a6beab37ccf7a39cc3c031242e6477acc1a4b9974cbba1dcad0d02cd9d156
 size 1074131104
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4a3cf1c4501fb617a1ea6a6e7871b3840eb36338d12ad842dbbcf53c292df90d
-size 5176
+oid sha256:a3e689b8aa4479620caaf93ab56a44aa0acb7f02b7879d1f2aa150ddbd93a61c
+size 5240
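Both model.safetensors and training_args.bin are versioned as Git LFS pointer files of the shape diffed above (a `version` line, an `oid sha256:…` line, and a `size` line). A minimal sketch of parsing such a pointer, e.g. to read the expected digest and byte count before fetching the actual blob:

```python
# Parse a git-lfs pointer file (key/value pairs, one per line) into a dict.
def parse_lfs_pointer(text: str) -> dict:
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The new training_args.bin pointer from this commit.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:a3e689b8aa4479620caaf93ab56a44aa0acb7f02b7879d1f2aa150ddbd93a61c
size 5240"""

info = parse_lfs_pointer(pointer)
print(info["oid"])   # sha256:a3e689b8...
print(info["size"])  # 5240
```

The `oid` is the SHA-256 of the real file contents, so a downloaded blob can be verified by hashing it and comparing against this value.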