End of training
- README.md +9 -0
- all_results.json +23 -0
- eval_results.json +10 -0
- generated_predictions.txt +10 -0
- generation_config.json +0 -1
- predict_results.json +9 -0
- runs/Jun09_01-19-27_serv-9213/events.out.tfevents.1749424963.serv-9213.3250692.1 +3 -0
- train_results.json +9 -0
- trainer_state.json +43 -0
README.md CHANGED

@@ -1,9 +1,14 @@
 ---
 library_name: transformers
+language:
+- pt
+- vmw
 license: apache-2.0
 base_model: bigscience/mt0-base
 tags:
 - generated_from_trainer
+metrics:
+- bleu
 model-index:
 - name: mt0-base-pt-vmw
   results: []
@@ -15,6 +20,10 @@ should probably proofread and complete it, then remove this comment. -->
 # mt0-base-pt-vmw
 
 This model is a fine-tuned version of [bigscience/mt0-base](https://huggingface.co/bigscience/mt0-base) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 6.4560
+- Bleu: 0.8717
+- Gen Len: 127.0
 
 ## Model description
 
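For reference, a minimal inference sketch for this checkpoint. The repo id below is assumed from the model-index name (`mt0-base-pt-vmw`); the namespaced Hub path and the prompt template used during fine-tuning are not recorded in this commit.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical repo id taken from the model-index name; substitute the
# real "<owner>/mt0-base-pt-vmw" path on the Hub.
model_id = "mt0-base-pt-vmw"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# mt0 checkpoints are normally queried with a plain natural-language
# instruction; the exact wording here is an assumption.
inputs = tokenizer("Translate Portuguese to Emakhuwa: Bom dia.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```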
all_results.json ADDED

@@ -0,0 +1,23 @@
+{
+    "epoch": 0.001219958521410272,
+    "eval_bleu": 0.8717,
+    "eval_gen_len": 127.0,
+    "eval_loss": 6.4559645652771,
+    "eval_runtime": 70.5393,
+    "eval_samples": 195,
+    "eval_samples_per_second": 2.764,
+    "eval_steps_per_second": 0.354,
+    "predict_bleu": 0.7558,
+    "predict_gen_len": 127.0,
+    "predict_loss": 6.439186096191406,
+    "predict_runtime": 5.146,
+    "predict_samples": 10,
+    "predict_samples_per_second": 1.943,
+    "predict_steps_per_second": 0.389,
+    "total_flos": 13713766612992.0,
+    "train_loss": 6.803397369384766,
+    "train_runtime": 14.6438,
+    "train_samples": 65573,
+    "train_samples_per_second": 5.463,
+    "train_steps_per_second": 0.683
+}
eval_results.json ADDED

@@ -0,0 +1,10 @@
+{
+    "epoch": 0.001219958521410272,
+    "eval_bleu": 0.8717,
+    "eval_gen_len": 127.0,
+    "eval_loss": 6.4559645652771,
+    "eval_runtime": 70.5393,
+    "eval_samples": 195,
+    "eval_samples_per_second": 2.764,
+    "eval_steps_per_second": 0.354
+}
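Note that `eval_bleu` is on the 0–100 sacreBLEU scale, so 0.8717 is close to zero. A sketch of how transformers' stock translation recipe computes this metric, assuming this run used the usual `evaluate`/`sacrebleu` setup (the placeholders stand in for real decoded outputs and references):

```python
import evaluate

# sacreBLEU over detokenized strings; one list of references per prediction.
metric = evaluate.load("sacrebleu")
predictions = ["..."]   # decoded model outputs for the 195 eval samples
references = [["..."]]  # reference translations, nested one list per prediction
result = metric.compute(predictions=predictions, references=references)
print(round(result["score"], 4))  # reported as eval_bleu
```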
generated_predictions.txt ADDED

@@ -0,0 +1,10 @@
+Vatican City uses Italian in its laws and communications offices a ggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweg
+Emakhuwa: Yahoo! n'a Microsoft sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss
+Emakhuwa: Omwana wa 30, uyo alikuwa Buffalo, alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa a
+Emakhuwa ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya
+Emakhuwa a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n
+Ministry of Foreign Affairs revealed Wednesday, 11 that a ndege ya ndege ya Moçambique (LAM) ilikuwa n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amag
+Why do women suffer all these discriminations? - Why do women suffer all these discriminations? - Why do women suffer all these discriminations? - Why do women suffer all these discriminations? - Why do women suffer all these discriminations? - Why do women suffer all these discriminations? - Why do women suffer all these discriminations? - Why do women suffer all these discriminations? - Why do women suffer all these discriminations? -
+Emakhuwa a Blake a n'agamba nti "a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a
+Emakhuwa nĩ ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa
+Emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emak
generation_config.json CHANGED

@@ -1,5 +1,4 @@
 {
-  "_from_model_config": true,
   "decoder_start_token_id": 0,
   "eos_token_id": 1,
   "pad_token_id": 0,
predict_results.json ADDED

@@ -0,0 +1,9 @@
+{
+    "predict_bleu": 0.7558,
+    "predict_gen_len": 127.0,
+    "predict_loss": 6.439186096191406,
+    "predict_runtime": 5.146,
+    "predict_samples": 10,
+    "predict_samples_per_second": 1.943,
+    "predict_steps_per_second": 0.389
+}
runs/Jun09_01-19-27_serv-9213/events.out.tfevents.1749424963.serv-9213.3250692.1 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:20f80fc9550ba8041b91c649eea1766fb89ea1460b13d814e3debafa3d7eb9e6
+size 403
train_results.json ADDED

@@ -0,0 +1,9 @@
+{
+    "epoch": 0.001219958521410272,
+    "total_flos": 13713766612992.0,
+    "train_loss": 6.803397369384766,
+    "train_runtime": 14.6438,
+    "train_samples": 65573,
+    "train_samples_per_second": 5.463,
+    "train_steps_per_second": 0.683
+}
trainer_state.json ADDED

@@ -0,0 +1,43 @@
+{
+  "best_global_step": null,
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 0.001219958521410272,
+  "eval_steps": 500,
+  "global_step": 10,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.001219958521410272,
+      "step": 10,
+      "total_flos": 13713766612992.0,
+      "train_loss": 6.803397369384766,
+      "train_runtime": 14.6438,
+      "train_samples_per_second": 5.463,
+      "train_steps_per_second": 0.683
+    }
+  ],
+  "logging_steps": 500,
+  "max_steps": 10,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 1,
+  "save_steps": 1000,
+  "stateful_callbacks": {
+    "TrainerControl": {
+      "args": {
+        "should_epoch_stop": false,
+        "should_evaluate": false,
+        "should_log": false,
+        "should_save": true,
+        "should_training_stop": true
+      },
+      "attributes": {}
+    }
+  },
+  "total_flos": 13713766612992.0,
+  "train_batch_size": 8,
+  "trial_name": null,
+  "trial_params": null
+}
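Taken together, the state pins down the run shape: 10 optimizer steps at batch size 8 is roughly 0.0012 of an epoch over the 65,573 training samples (10 / (65573 / 8) ≈ 0.00122, matching the `epoch` field), which is consistent with the degenerate repetition in generated_predictions.txt. A sketch of `Seq2SeqTrainingArguments` consistent with these values; everything not shown is assumed to be a default, since the actual launch command is not part of this commit:

```python
from transformers import Seq2SeqTrainingArguments

# Values mirrored from trainer_state.json above; all other settings assumed.
args = Seq2SeqTrainingArguments(
    output_dir="mt0-base-pt-vmw",
    per_device_train_batch_size=8,  # "train_batch_size": 8
    max_steps=10,                   # "max_steps": 10 -> stops at global_step 10
    logging_steps=500,              # "logging_steps": 500 (never reached here)
    eval_steps=500,                 # "eval_steps": 500
    save_steps=1000,                # "save_steps": 1000
    predict_with_generate=True,     # needed for BLEU and generated_predictions.txt
)
```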