Model save

Files changed (3)

README.md +4 -128
model.safetensors +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -2,36 +2,18 @@
 library_name: transformers
 tags:
 - generated_from_trainer
-datasets:
-- HuggingFaceFW/fineweb
-metrics:
-- accuracy
 model-index:
 - name: T5LA
-  results:
-  - task:
-      name: Causal Language Modeling
-      type: text-generation
-    dataset:
-      name: HuggingFaceFW/fineweb sample-10BT
-      type: HuggingFaceFW/fineweb
-      args: sample-10BT
-    metrics:
-    - name: Accuracy
-      type: accuracy
-      value: 0.03222989830774154
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/uoy/llm_training/runs/elf928gg)
 # T5LA
-This model is a fine-tuned version of [](https://huggingface.co/) on the HuggingFaceFW/fineweb sample-10BT dataset.
-It achieves the following results on the evaluation set:
-- Loss: 5.5470
-- Accuracy: 0.0322
 ## Model description
@@ -60,115 +42,9 @@ The following hyperparameters were used during training:
 - total_eval_batch_size: 16
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- training_steps: 100000
 - mixed_precision_training: Native AMP
-### Training results
-| Training Loss | Epoch  | Step   | Accuracy | Validation Loss |
-|:-------------:|:------:|:------:|:--------:|:---------------:|
-| 9.4056        | 0.01   | 1000   | 0.0435   | 9.1215          |
-| 8.4062        | 0.02   | 2000   | 0.0443   | 8.1939          |
-| 7.7307        | 0.03   | 3000   | 0.0444   | 7.6024          |
-| 7.39          | 0.04   | 4000   | 0.0444   | 7.3338          |
-| 7.2546        | 0.05   | 5000   | 0.0441   | 7.2452          |
-| 7.1985        | 0.06   | 6000   | 0.0369   | 7.1682          |
-| 7.1009        | 0.07   | 7000   | 0.0346   | 7.0718          |
-| 7.004         | 0.08   | 8000   | 0.0332   | 6.9778          |
-| 6.9159        | 0.09   | 9000   | 0.0325   | 6.8964          |
-| 6.8548        | 0.1    | 10000  | 0.0325   | 6.8307          |
-| 6.7833        | 0.11   | 11000  | 0.0326   | 6.7702          |
-| 6.7376        | 0.12   | 12000  | 0.0337   | 6.7163          |
-| 6.6821        | 0.13   | 13000  | 0.0346   | 6.6615          |
-| 6.6373        | 0.14   | 14000  | 0.0349   | 6.6086          |
-| 6.5895        | 0.15   | 15000  | 0.0344   | 6.5569          |
-| 6.5421        | 0.16   | 16000  | 0.0354   | 6.5119          |
-| 6.5051        | 0.17   | 17000  | 0.0355   | 6.4678          |
-| 6.4391        | 0.18   | 18000  | 0.0360   | 6.4324          |
-| 6.4242        | 0.19   | 19000  | 0.0355   | 6.4015          |
-| 6.3889        | 0.2    | 20000  | 0.0373   | 6.3553          |
-| 6.3631        | 0.21   | 21000  | 0.0367   | 6.3285          |
-| 6.3296        | 0.22   | 22000  | 0.0369   | 6.3015          |
-| 6.3081        | 0.23   | 23000  | 0.0364   | 6.2699          |
-| 6.2784        | 0.24   | 24000  | 0.0370   | 6.2454          |
-| 6.2589        | 0.25   | 25000  | 0.0374   | 6.2167          |
-| 6.2371        | 0.26   | 26000  | 0.0370   | 6.1890          |
-| 6.1978        | 0.27   | 27000  | 0.0376   | 6.1660          |
-| 6.1895        | 0.28   | 28000  | 0.0375   | 6.1378          |
-| 6.1636        | 0.29   | 29000  | 0.0366   | 6.1213          |
-| 6.1262        | 0.3    | 30000  | 0.0370   | 6.0967          |
-| 6.1345        | 0.31   | 31000  | 0.0361   | 6.0745          |
-| 6.1096        | 0.32   | 32000  | 0.0360   | 6.0556          |
-| 6.0794        | 0.33   | 33000  | 0.0357   | 6.0413          |
-| 6.0643        | 0.34   | 34000  | 0.0363   | 6.0136          |
-| 6.057         | 0.35   | 35000  | 0.0362   | 5.9965          |
-| 6.0337        | 0.36   | 36000  | 0.0354   | 5.9806          |
-| 6.0217        | 0.37   | 37000  | 0.0363   | 5.9584          |
-| 6.0045        | 0.38   | 38000  | 0.0359   | 5.9526          |
-| 5.9896        | 0.39   | 39000  | 0.0355   | 5.9288          |
-| 5.9711        | 0.4    | 40000  | 0.0352   | 5.9152          |
-| 5.9629        | 0.41   | 41000  | 0.0349   | 5.8962          |
-| 5.9465        | 0.42   | 42000  | 0.0359   | 5.8821          |
-| 5.9463        | 0.43   | 43000  | 0.0345   | 5.8692          |
-| 5.9317        | 0.44   | 44000  | 0.0343   | 5.8699          |
-| 5.9097        | 1.0034 | 45000  | 0.0346   | 5.8483          |
-| 5.9107        | 1.0134 | 46000  | 0.0348   | 5.8352          |
-| 5.8838        | 1.0234 | 47000  | 0.0343   | 5.8188          |
-| 5.887         | 1.0334 | 48000  | 0.0340   | 5.8086          |
-| 5.8563        | 1.0434 | 49000  | 0.0338   | 5.7971          |
-| 5.8576        | 1.0534 | 50000  | 0.0339   | 5.7968          |
-| 5.8567        | 1.0635 | 51000  | 0.0343   | 5.7797          |
-| 5.841         | 1.0735 | 52000  | 0.0337   | 5.7677          |
-| 5.8192        | 1.0835 | 53000  | 0.0332   | 5.7613          |
-| 5.8214        | 1.0935 | 54000  | 0.0338   | 5.7486          |
-| 5.8166        | 1.1035 | 55000  | 0.0338   | 5.7409          |
-| 5.806         | 1.1135 | 56000  | 0.0333   | 5.7342          |
-| 5.7961        | 1.1235 | 57000  | 0.0335   | 5.7236          |
-| 5.7847        | 1.1335 | 58000  | 0.0333   | 5.7164          |
-| 5.787         | 1.1435 | 59000  | 0.0330   | 5.7096          |
-| 5.7711        | 1.1535 | 60000  | 0.0328   | 5.7035          |
-| 5.7699        | 1.1635 | 61000  | 0.0331   | 5.6888          |
-| 5.763         | 1.1734 | 62000  | 0.0334   | 5.6875          |
-| 5.7434        | 1.1835 | 63000  | 0.0330   | 5.6809          |
-| 5.7477        | 1.1934 | 64000  | 0.0329   | 5.6686          |
-| 5.7409        | 1.2034 | 65000  | 0.0330   | 5.6624          |
-| 5.737         | 1.2134 | 66000  | 0.0339   | 5.6758          |
-| 5.729         | 1.2234 | 67000  | 0.0326   | 5.6546          |
-| 5.7232        | 1.2334 | 68000  | 0.0329   | 5.6467          |
-| 5.7127        | 1.2434 | 69000  | 0.0329   | 5.6449          |
-| 5.7187        | 1.2534 | 70000  | 0.0329   | 5.6352          |
-| 5.717         | 1.2634 | 71000  | 0.0326   | 5.6264          |
-| 5.714         | 1.2734 | 72000  | 0.0330   | 5.6219          |
-| 5.7079        | 1.2834 | 73000  | 0.0330   | 5.6169          |
-| 5.7034        | 1.2934 | 74000  | 0.0326   | 5.6131          |
-| 5.6768        | 1.3034 | 75000  | 0.0325   | 5.6125          |
-| 5.6955        | 1.3135 | 76000  | 0.0328   | 5.6075          |
-| 5.6947        | 1.3235 | 77000  | 0.0325   | 5.6017          |
-| 5.7056        | 1.3335 | 78000  | 0.0323   | 5.5956          |
-| 5.6636        | 1.3435 | 79000  | 0.0326   | 5.5921          |
-| 5.6723        | 1.3535 | 80000  | 0.0326   | 5.5881          |
-| 5.659         | 1.3635 | 81000  | 0.0324   | 5.5823          |
-| 5.6729        | 1.3735 | 82000  | 0.0326   | 5.5795          |
-| 5.6595        | 1.3835 | 83000  | 0.0322   | 5.5794          |
-| 5.6565        | 1.3935 | 84000  | 0.0328   | 5.5758          |
-| 5.6649        | 1.4034 | 85000  | 0.0325   | 5.5716          |
-| 5.6561        | 1.4135 | 86000  | 0.0321   | 5.5695          |
-| 5.6405        | 1.4234 | 87000  | 0.0323   | 5.5654          |
-| 5.6482        | 1.4335 | 88000  | 0.0321   | 5.5628          |
-| 5.6425        | 1.4434 | 89000  | 0.0323   | 5.5622          |
-| 5.6379        | 2.0069 | 90000  | 0.0323   | 5.5582          |
-| 5.6357        | 2.0169 | 91000  | 0.0322   | 5.5573          |
-| 5.6381        | 2.0269 | 92000  | 0.0320   | 5.5568          |
-| 5.6427        | 2.0369 | 93000  | 0.0324   | 5.5526          |
-| 5.6364        | 2.0469 | 94000  | 0.0323   | 5.5526          |
-| 5.626         | 2.0569 | 95000  | 0.0321   | 5.5501          |
-| 5.636         | 2.0669 | 96000  | 0.0324   | 5.5492          |
-| 5.632         | 2.0769 | 97000  | 0.0323   | 5.5489          |
-| 5.6133        | 2.0869 | 98000  | 0.0323   | 5.5479          |
-| 5.6291        | 2.0969 | 99000  | 0.0323   | 5.5477          |
-| 5.6271        | 2.1069 | 100000 | 0.0322   | 5.5470          |
 ### Framework versions
 - Transformers 4.49.0.dev0

 library_name: transformers
 tags:
 - generated_from_trainer
 model-index:
 - name: T5LA
+  results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/uoy/llm_training/runs/pzcq293g)
 # T5LA
+This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 ## Model description
 - total_eval_batch_size: 16
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
+- training_steps: 200000
 - mixed_precision_training: Native AMP
 ### Framework versions
 - Transformers 4.49.0.dev0
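
Note on the `training_steps` change: with `lr_scheduler_type: linear`, raising the step budget from 100000 to 200000 also changes the learning rate at every intermediate step, because the decay is stretched over the new total. The sketch below illustrates this with the usual linear-decay-with-warmup shape. The function name `linear_lr_factor` is hypothetical, and `warmup_steps=0` is an assumption, since the warmup setting and base learning rate are not visible in this diff.

```python
def linear_lr_factor(step: int, total_steps: int, warmup_steps: int = 0) -> float:
    """Multiplier applied to the base learning rate at a given step.

    Assumption: linear warmup from 0 to 1, then linear decay from 1 to 0
    at total_steps. warmup_steps=0 is a guess, not taken from the diff.
    """
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    remaining = total_steps - step
    return max(0.0, remaining / max(1, total_steps - warmup_steps))


# Compare the multiplier at step 100000 under the old and new step budgets.
for total in (100_000, 200_000):
    print(total, linear_lr_factor(100_000, total))
# prints:
# 100000 0.0
# 200000 0.5
```

Under these assumptions, the old run had fully decayed its learning rate by step 100000, while the new run is still at half the base rate at that point.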

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:880f5daf0555630aaa1f5971bcbc30735b81263f4959a256d6baedeaab077586
 size 439436624

 version https://git-lfs.github.com/spec/v1
+oid sha256:558b40774ebb3c775f86849ec49e96e68298f3fc467a56f70324925d9ce915bd
 size 439436624

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:bbeb34171b3a4acd5b65788b8a72d6a9a1089a87551e1ba44bc0d9baa0db3f6f
 size 5432

 version https://git-lfs.github.com/spec/v1
+oid sha256:250fa1af03ffaa8bda0e2278749102dd8574803e5d42d069bf5be28611ad9412
 size 5432
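
Both binary files are stored as Git LFS pointers, so the diff only shows a changed `oid` while `size` stays the same. A minimal sketch of how a downloaded file could be checked against such a pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the pointer text is copied from the new side of the `model.safetensors` diff above.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict) -> bool:
    """Return True if the file's size and hash agree with the pointer."""
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return path.stat().st_size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# Pointer for the new model.safetensors, as shown in the diff.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:558b40774ebb3c775f86849ec49e96e68298f3fc467a56f70324925d9ce915bd
size 439436624
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer(Path("model.safetensors"), pointer)` on a local copy (the path here is an assumption) would then confirm that the download corresponds to this commit rather than the previous one.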