Harshil13/botGPT2_Context_v1

@@ -14,13 +14,13 @@ probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Train Loss: nan
-- Train Accuracy: 0.0
-- Train Perplexity: 58011.8711
-- Validation Loss: 0.2925
 - Validation Accuracy: 0.0
-- Validation Perplexity: 57475.0
-- Epoch: 5
 ## Model description
@@ -39,19 +39,22 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 1e-05, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 1e-05, 'decay_steps': 18724, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, '__passive_serialization__': True}, 'warmup_steps': 5000, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
 - training_precision: mixed_float16
 ### Training results
 | Train Loss | Train Accuracy | Train Perplexity | Validation Loss | Validation Accuracy | Validation Perplexity | Epoch |
 |:----------:|:--------------:|:----------------:|:---------------:|:-------------------:|:---------------------:|:-----:|
-| nan        | 0.0044         | 133036.0312      | 0.2925          | 0.0                 | 57475.0               | 0     |
-| nan        | 0.0000         | 57758.4336       | 0.2925          | 0.0                 | 57475.0               | 1     |
-| nan        | 0.0000         | 58262.7109       | 0.2925          | 0.0                 | 57475.0               | 2     |
-| nan        | 0.0000         | 57296.8555       | 0.2925          | 0.0                 | 57475.0               | 3     |
-| nan        | 0.0000         | 62598.2734       | 0.2925          | 0.0                 | 57475.0               | 4     |
-| nan        | 0.0            | 58011.8711       | 0.2925          | 0.0                 | 57475.0               | 5     |
 ### Framework versions

 This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Train Loss: 0.3524
+- Train Accuracy: 0.0000
+- Train Perplexity: 18824.3340
+- Validation Loss: 0.3106
 - Validation Accuracy: 0.0
+- Validation Perplexity: 39785.5430
+- Epoch: 8
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 1e-05, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 1e-05, 'decay_steps': 16381, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, '__passive_serialization__': True}, 'warmup_steps': 1000, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
 - training_precision: mixed_float16
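
The updated optimizer entry describes a linear warmup followed by a linear (power 1.0) polynomial decay. The sketch below approximates that schedule in plain Python, using the values from the new revision (`warmup_steps` 1000, `decay_steps` 16381). It assumes the decay schedule is evaluated at `step - warmup_steps` once warmup ends, which is how the `WarmUp` wrapper in Transformers is understood to behave; the function name `learning_rate` is hypothetical and is not part of any library.

```python
def learning_rate(step, initial_lr=1e-05, warmup_steps=1000,
                  decay_steps=16381, end_lr=0.0, power=1.0):
    """Approximate the WarmUp + PolynomialDecay schedule from the config above.

    Assumption: after warmup, the decay schedule sees (step - warmup_steps).
    Both the warmup power and the decay power are 1.0 in the config,
    so a single `power` argument is used here.
    """
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to initial_lr.
        return initial_lr * (step / warmup_steps) ** power
    # Decay phase: cycle=False, so the step is clamped at decay_steps.
    decay_step = min(step - warmup_steps, decay_steps)
    return (initial_lr - end_lr) * (1 - decay_step / decay_steps) ** power + end_lr


# Sample the schedule at a few points: start, mid-warmup, peak, and end of decay.
samples = {s: learning_rate(s) for s in (0, 500, 1000, 1000 + 16381)}
```

Under these assumptions the rate peaks at `1e-05` at step 1000 and reaches `0.0` once 16381 decay steps have elapsed. The previous revision used the same shape with `warmup_steps` 5000 and `decay_steps` 18724.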
 ### Training results
 | Train Loss | Train Accuracy | Train Perplexity | Validation Loss | Validation Accuracy | Validation Perplexity | Epoch |
 |:----------:|:--------------:|:----------------:|:---------------:|:-------------------:|:---------------------:|:-----:|
+| 0.6295     | 0.0032         | 100042.4062      | 0.3106          | 0.0                 | 39785.5273            | 0     |
+| 0.3528     | 0.0000         | 18560.1328       | 0.3106          | 0.0                 | 39785.5391            | 1     |
+| 0.3525     | 0.0000         | 18773.9668       | 0.3106          | 0.0                 | 39785.5156            | 2     |
+| 0.3525     | 0.0            | 18342.8223       | 0.3106          | 0.0                 | 39785.5078            | 3     |
+| 0.3525     | 0.0000         | 19026.9180       | 0.3106          | 0.0                 | 39785.5508            | 4     |
+| 0.3526     | 0.0            | 19108.625        | 0.3106          | 0.0                 | 39785.5195            | 5     |
+| 0.3526     | 0.0000         | 19143.7520       | 0.3106          | 0.0                 | 39785.5312            | 6     |
+| 0.3525     | 0.0000         | 18503.0938       | 0.3106          | 0.0                 | 39785.5195            | 7     |
+| 0.3524     | 0.0000         | 18824.3340       | 0.3106          | 0.0                 | 39785.5430            | 8     |
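
As a sanity check on the table above: perplexity is commonly defined as the exponential of the mean per-token cross-entropy loss. The sketch below applies that definition to the final-epoch validation numbers. It is only a consistency check under that assumption; the card does not say how either column was computed.

```python
import math

# Final-epoch values copied from the table above.
final_val_loss = 0.3106
reported_val_ppl = 39785.5430

# Perplexity implied by the loss, if perplexity = exp(mean cross-entropy).
implied_ppl = math.exp(final_val_loss)

# Loss implied by the reported perplexity, under the same assumption.
implied_loss = math.log(reported_val_ppl)

print(f"perplexity implied by loss: {implied_ppl:.3f}")
print(f"loss implied by perplexity: {implied_loss:.3f}")
```

The two columns disagree by several orders of magnitude under this definition, which suggests the loss and perplexity were reduced or masked differently (for example, averaged over different token sets). Together with the validation metrics staying flat across epochs and the accuracy of 0.0, the numbers are best treated with caution.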
 ### Framework versions

config.json CHANGED

@@ -34,5 +34,5 @@
   },
   "transformers_version": "4.26.0",
   "use_cache": true,
-  "vocab_size": 1787
 }

   },
   "transformers_version": "4.26.0",
   "use_cache": true,
+  "vocab_size": 1814
 }

tf_model.h5 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:23aff9d575f0e43fc84cf59a7eface16326e0e830187fafe1f919bb8768ef763
-size 349035600

 version https://git-lfs.github.com/spec/v1
+oid sha256:bc4b029f38898205c326f6444d735540115f93d540fc5018990920f170cb2416
+size 349118544
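
The `vocab_size` change in config.json and the size change in the LFS pointer can be cross-checked with a little arithmetic. The sketch below assumes the base `gpt2` hidden size of 768 (`n_embd` is not shown in this diff) and float32 weights at 4 bytes per parameter; both are assumptions, not values taken from this commit.

```python
# Values taken from the diff above.
old_vocab, new_vocab = 1787, 1814
old_size, new_size = 349035600, 349118544

# Assumptions (not shown in this diff).
hidden_size = 768      # n_embd of the base gpt2 checkpoint
bytes_per_param = 4    # float32 storage

# Each added token adds one row of hidden_size parameters to the embedding matrix.
added_tokens = new_vocab - old_vocab
expected_growth = added_tokens * hidden_size * bytes_per_param
actual_growth = new_size - old_size

print(added_tokens, expected_growth, actual_growth)
```

Under these assumptions, the 27 added tokens account for 82,944 bytes, which matches the growth of `tf_model.h5` exactly. This is consistent with the token embedding matrix having been resized to the new vocabulary, with the output projection sharing those weights.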