End of training

Files changed (6)

README.md CHANGED

@@ -4,12 +4,25 @@ base_model: hrezaei/flan-t5la-large
 tags:
 - generated_from_trainer
 datasets:
-- generator
+- HuggingFaceFW/fineweb
 metrics:
 - accuracy
 model-index:
 - name: flan-t5la-large
-  results: []
+  results:
+  - task:
+      name: Causal Language Modeling
+      type: text-generation
+    dataset:
+      name: HuggingFaceFW/fineweb sample-350BT
+      type: HuggingFaceFW/fineweb
+      config: default
+      split: train
+      args: sample-350BT
+    metrics:
+    - name: Accuracy
+      type: accuracy
+      value: 0.00254853228962818
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -17,13 +30,13 @@ should probably proofread and complete it, then remove this comment. -->
 # flan-t5la-large
-This model is a fine-tuned version of [hrezaei/flan-t5la-large](https://huggingface.co/hrezaei/flan-t5la-large) on the generator dataset.
+This model is a fine-tuned version of [hrezaei/flan-t5la-large](https://huggingface.co/hrezaei/flan-t5la-large) on the HuggingFaceFW/fineweb sample-350BT dataset.
 It achieves the following results on the evaluation set:
 - Perplexity: 5.0594
 - Loss: 1.6212
 - Accuracy: 0.0025
-- Lookahead Perplexity: 22.5301
-- Lookahead Loss: 3.1149
+- Lookahead Perplexity: 22.5300
+- Lookahead Loss: 3.1148
 - Base Perplexity: 1.1402
 - Base Loss: 0.1312

all_results.json ADDED

+{
+    "eval_accuracy": 0.00254853228962818,
+    "eval_base_loss": 0.13120111280355973,
+    "eval_lookahead_loss": 3.1148472632082127,
+    "eval_loss": 1.6212472915649414,
+    "eval_perplexity": 5.059396925765216,
+    "eval_runtime": 271.8591,
+    "eval_samples": 10000,
+    "eval_samples_per_second": 18.392,
+    "eval_steps_per_second": 0.578,
+    "total_flos": 3.285699411601208e+19,
+    "train_loss": 0.14963702380191535,
+    "train_runtime": 35362.9264,
+    "train_samples": 2000000,
+    "train_samples_per_second": 474.43,
+    "train_steps_per_second": 14.826
+}

eval_results.json ADDED

+{
+    "eval_accuracy": 0.00254853228962818,
+    "eval_base_loss": 0.13120111280355973,
+    "eval_lookahead_loss": 3.1148472632082127,
+    "eval_loss": 1.6212472915649414,
+    "eval_perplexity": 5.059396925765216,
+    "eval_runtime": 271.8591,
+    "eval_samples": 10000,
+    "eval_samples_per_second": 18.392,
+    "eval_steps_per_second": 0.578
+}
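
The perplexity figures in the model card appear to be the exponential of the corresponding losses. The sketch below checks that relationship using the values in eval_results.json. It assumes the losses are mean cross-entropy in nats; the `perplexity` helper and the derived key names are illustrative and not part of this commit.

```python
import math

# Loss values copied from eval_results.json in this commit.
eval_results = {
    "eval_loss": 1.6212472915649414,
    "eval_base_loss": 0.13120111280355973,
    "eval_lookahead_loss": 3.1148472632082127,
}


def perplexity(loss: float) -> float:
    """Return exp(loss); assumes a mean cross-entropy loss in nats."""
    return math.exp(loss)


# Derive one perplexity per loss; the key names here are hypothetical.
perplexities = {
    name.replace("_loss", "_perplexity"): perplexity(value)
    for name, value in eval_results.items()
}

for name, value in perplexities.items():
    print(f"{name}: {value:.4f}")
```

Under that assumption, the derived values agree with the Perplexity, Base Perplexity, and Lookahead Perplexity entries in the updated README to about four decimal places.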

runs/Nov11_13-16-32_gpu22.viking2.yor.alces.network/events.out.tfevents.1762903058.gpu22.viking2.yor.alces.network.157277.1 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:dfb869da25245d1130beadb9eeb80b22dad63b1cb5ed2bb1e16ad9f516427dad
+size 710

train_results.json ADDED

+{
+    "total_flos": 3.285699411601208e+19,
+    "train_loss": 0.14963702380191535,
+    "train_runtime": 35362.9264,
+    "train_samples": 2000000,
+    "train_samples_per_second": 474.43,
+    "train_steps_per_second": 14.826
+}

trainer_state.json ADDED

The diff for this file is too large to render.