End of training

Files changed (7)

README.md +33 -13
config.json +1 -1
logs/events.out.tfevents.1714285176.ip-10-25-205-144.266412.1 +3 -0
logs/events.out.tfevents.1714287609.ip-10-25-205-144.266412.2 +3 -0
logs/events.out.tfevents.1714289724.ip-10-25-205-144.266412.3 +3 -0
model.safetensors +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -1,25 +1,19 @@
 ---
-base_model: google/byt5-small
 tags:
 - generated_from_trainer
 model-index:
-- name: byt5_add_1k
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# byt5_add_1k
-This model is a fine-tuned version of [AlexWang99/byt5_add_1k/checkpoint-86](https://huggingface.co/AlexWang99/byt5_add_1k/checkpoint-86) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- eval_loss: 0.4665
-- eval_runtime: 10.7864
-- eval_samples_per_second: 927.092
-- eval_steps_per_second: 1.205
-- epoch: 32.0
-- step: 64
 ## Model description
@@ -38,13 +32,39 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 5e-05
-- train_batch_size: 800
 - eval_batch_size: 800
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 80
 ### Framework versions

 ---
 tags:
 - generated_from_trainer
 model-index:
+- name: byt5_1k
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+# byt5_1k
+This model was trained from scratch on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.0868
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 0.0001
+- train_batch_size: 400
 - eval_batch_size: 800
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 20
+### Training results
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| No log        | 1.0   | 3    | 0.1081          |
+| No log        | 2.0   | 6    | 0.0983          |
+| No log        | 3.0   | 9    | 0.1285          |
+| 0.1432        | 4.0   | 12   | 0.0961          |
+| 0.1432        | 5.0   | 15   | 0.1040          |
+| 0.1432        | 6.0   | 18   | 0.1032          |
+| 0.1488        | 7.0   | 21   | 0.0938          |
+| 0.1488        | 8.0   | 24   | 0.0979          |
+| 0.1488        | 9.0   | 27   | 0.0976          |
+| 0.1375        | 10.0  | 30   | 0.0885          |
+| 0.1375        | 11.0  | 33   | 0.0907          |
+| 0.1375        | 12.0  | 36   | 0.0863          |
+| 0.1375        | 13.0  | 39   | 0.0843          |
+| 0.1297        | 14.0  | 42   | 0.0833          |
+| 0.1297        | 15.0  | 45   | 0.0840          |
+| 0.1297        | 16.0  | 48   | 0.0861          |
+| 0.1241        | 17.0  | 51   | 0.0903          |
+| 0.1241        | 18.0  | 54   | 0.0891          |
+| 0.1241        | 19.0  | 57   | 0.0876          |
+| 0.1185        | 20.0  | 60   | 0.0868          |
 ### Framework versions
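
The new "Training results" table reports one validation loss per epoch, and the lowest value is not in the last row. As a minimal sketch, the snippet below transcribes the (epoch, validation loss) pairs from that table and picks the epoch with the lowest loss. The names `VAL_LOSS` and `best_epoch` are illustrative and do not come from the commit.

```python
# (epoch, validation loss) pairs transcribed from the "Training results" table.
VAL_LOSS = [
    (1, 0.1081), (2, 0.0983), (3, 0.1285), (4, 0.0961), (5, 0.1040),
    (6, 0.1032), (7, 0.0938), (8, 0.0979), (9, 0.0976), (10, 0.0885),
    (11, 0.0907), (12, 0.0863), (13, 0.0843), (14, 0.0833), (15, 0.0840),
    (16, 0.0861), (17, 0.0903), (18, 0.0891), (19, 0.0876), (20, 0.0868),
]


def best_epoch(history):
    """Return the (epoch, loss) pair with the lowest validation loss."""
    return min(history, key=lambda pair: pair[1])


best = best_epoch(VAL_LOSS)
final = VAL_LOSS[-1]
gap = final[1] - best[1]

print(f"best epoch: {best[0]} (loss {best[1]:.4f})")
print(f"final epoch: {final[0]} (loss {final[1]:.4f})")
print(f"final minus best: {gap:.4f}")
```

The `Loss: 0.0868` line in the model card matches the final-epoch row rather than the minimum. The diff does not say whether an earlier checkpoint was kept, so this is only an observation about the table.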

config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "google/byt5-small",
   "architectures": [
     "T5ForConditionalGeneration"
   ],

 {
+  "_name_or_path": "AlexWang99/byt5_1k",
   "architectures": [
     "T5ForConditionalGeneration"
   ],

logs/events.out.tfevents.1714285176.ip-10-25-205-144.266412.1 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b21af001fe5158c276dc24fdcff6d1e0b1dc82469343c4470940e188733512fb
+size 25835

logs/events.out.tfevents.1714287609.ip-10-25-205-144.266412.2 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1b94955ee84f5953be16af7990569559175738ccf0ab05b30e1681501329ca62
+size 20569

logs/events.out.tfevents.1714289724.ip-10-25-205-144.266412.3 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1ef606384881afd7ee6220395ef6dd0dab0b554a58bbf2b2d68dcd37ab52ad1c
+size 11148

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b1acc4b999f06bdef9036c06dcd7b3f67519c24515a6ec23dcfcfc5abe6feb43
 size 1198571496

 version https://git-lfs.github.com/spec/v1
+oid sha256:b1d38312142676df106c129e100839b6de37257ace7652c5df8d40b1aa17cbdb
 size 1198571496

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:160d58158a6ec3f8a009c04569665c565cfda06315a5577cb2100ca9803c9bab
 size 4792

 version https://git-lfs.github.com/spec/v1
+oid sha256:cab6b39dc088aa20d7417697114e95ffad3ae5054465ee82549b2fc0dc90a37d
 size 4792
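
The binary files in this commit appear only as Git LFS pointer files: three `key value` lines giving the spec version, a `sha256` object id, and the size in bytes. As a rough sketch, a pointer can be parsed as below. The helper name `parse_lfs_pointer` is hypothetical, and the sample text is the new `model.safetensors` pointer shown above.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, oid, and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid value has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "oid_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


# New pointer for model.safetensors, copied from the diff.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:b1d38312142676df106c129e100839b6de37257ace7652c5df8d40b1aa17cbdb
size 1198571496
"""

info = parse_lfs_pointer(POINTER)
print(info["oid_algo"], info["oid"][:12], info["size"])
```

For both `model.safetensors` and `training_args.bin` the `oid` changes while `size` stays the same. That would fit new contents stored in the same layout, although the pointers alone cannot confirm this.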