End of training
- README.md +9 -0
- all_results.json +23 -0
- eval_results.json +10 -0
- generated_predictions.txt +10 -0
- generation_config.json +0 -1
- predict_results.json +9 -0
- runs/Jun09_01-19-27_serv-9213/events.out.tfevents.1749424963.serv-9213.3250692.1 +3 -0
- train_results.json +9 -0
- trainer_state.json +43 -0
README.md CHANGED

@@ -1,9 +1,14 @@
 ---
 library_name: transformers
+language:
+- pt
+- vmw
 license: apache-2.0
 base_model: bigscience/mt0-base
 tags:
 - generated_from_trainer
+metrics:
+- bleu
 model-index:
 - name: mt0-base-pt-vmw
   results: []
@@ -15,6 +20,10 @@ should probably proofread and complete it, then remove this comment. -->
 # mt0-base-pt-vmw
 
 This model is a fine-tuned version of [bigscience/mt0-base](https://huggingface.co/bigscience/mt0-base) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 6.4560
+- Bleu: 0.8717
+- Gen Len: 127.0
 
 ## Model description
 
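For reference, a minimal inference sketch for this checkpoint. The repo id below is assumed from the model-index name (`mt0-base-pt-vmw`); the namespaced Hub path and the prompt template used during fine-tuning are not recorded in this commit.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical repo id taken from the model-index name; substitute the
# real "<owner>/mt0-base-pt-vmw" path on the Hub.
model_id = "mt0-base-pt-vmw"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# mt0 checkpoints are normally queried with a plain natural-language
# instruction; the exact wording here is an assumption.
inputs = tokenizer("Translate Portuguese to Emakhuwa: Bom dia.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```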
all_results.json ADDED

@@ -0,0 +1,23 @@
+{
+    "epoch": 0.001219958521410272,
+    "eval_bleu": 0.8717,
+    "eval_gen_len": 127.0,
+    "eval_loss": 6.4559645652771,
+    "eval_runtime": 70.5393,
+    "eval_samples": 195,
+    "eval_samples_per_second": 2.764,
+    "eval_steps_per_second": 0.354,
+    "predict_bleu": 0.7558,
+    "predict_gen_len": 127.0,
+    "predict_loss": 6.439186096191406,
+    "predict_runtime": 5.146,
+    "predict_samples": 10,
+    "predict_samples_per_second": 1.943,
+    "predict_steps_per_second": 0.389,
+    "total_flos": 13713766612992.0,
+    "train_loss": 6.803397369384766,
+    "train_runtime": 14.6438,
+    "train_samples": 65573,
+    "train_samples_per_second": 5.463,
+    "train_steps_per_second": 0.683
+}
eval_results.json ADDED

@@ -0,0 +1,10 @@
+{
+    "epoch": 0.001219958521410272,
+    "eval_bleu": 0.8717,
+    "eval_gen_len": 127.0,
+    "eval_loss": 6.4559645652771,
+    "eval_runtime": 70.5393,
+    "eval_samples": 195,
+    "eval_samples_per_second": 2.764,
+    "eval_steps_per_second": 0.354
+}
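Note that `eval_bleu` is on the 0–100 sacreBLEU scale, so 0.8717 is close to zero. A sketch of how transformers' stock translation recipe computes this metric, assuming this run used the usual `evaluate`/`sacrebleu` setup (the placeholders stand in for real decoded outputs and references):

```python
import evaluate

# sacreBLEU over detokenized strings; one list of references per prediction.
metric = evaluate.load("sacrebleu")
predictions = ["..."]   # decoded model outputs for the 195 eval samples
references = [["..."]]  # reference translations, nested one list per prediction
result = metric.compute(predictions=predictions, references=references)
print(round(result["score"], 4))  # reported as eval_bleu
```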
generated_predictions.txt ADDED

@@ -0,0 +1,10 @@
+Vatican City uses Italian in its laws and communications offices a ggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweggweg
+Emakhuwa: Yahoo! n'a Microsoft sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss
+Emakhuwa: Omwana wa 30, uyo alikuwa Buffalo, alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa alikuwa a
+Emakhuwa ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya ntiya
+Emakhuwa a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n
+Ministry of Foreign Affairs revealed Wednesday, 11 that a ndege ya ndege ya Moçambique (LAM) ilikuwa n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amagwilira n'amag
+Why do women suffer all these discriminations? - Why do women suffer all these discriminations? - Why do women suffer all these discriminations? - Why do women suffer all these discriminations? - Why do women suffer all these discriminations? - Why do women suffer all these discriminations? - Why do women suffer all these discriminations? - Why do women suffer all these discriminations? - Why do women suffer all these discriminations? -
+Emakhuwa a Blake a n'agamba nti "a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a n'a
+Emakhuwa nĩ ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa ũrĩa
+Emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emakhuwa, emak
generation_config.json CHANGED

@@ -1,5 +1,4 @@
 {
-  "_from_model_config": true,
   "decoder_start_token_id": 0,
   "eos_token_id": 1,
   "pad_token_id": 0,
predict_results.json ADDED

@@ -0,0 +1,9 @@
+{
+    "predict_bleu": 0.7558,
+    "predict_gen_len": 127.0,
+    "predict_loss": 6.439186096191406,
+    "predict_runtime": 5.146,
+    "predict_samples": 10,
+    "predict_samples_per_second": 1.943,
+    "predict_steps_per_second": 0.389
+}
runs/Jun09_01-19-27_serv-9213/events.out.tfevents.1749424963.serv-9213.3250692.1 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:20f80fc9550ba8041b91c649eea1766fb89ea1460b13d814e3debafa3d7eb9e6
+size 403
train_results.json ADDED

@@ -0,0 +1,9 @@
+{
+    "epoch": 0.001219958521410272,
+    "total_flos": 13713766612992.0,
+    "train_loss": 6.803397369384766,
+    "train_runtime": 14.6438,
+    "train_samples": 65573,
+    "train_samples_per_second": 5.463,
+    "train_steps_per_second": 0.683
+}
trainer_state.json ADDED

@@ -0,0 +1,43 @@
+{
+  "best_global_step": null,
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 0.001219958521410272,
+  "eval_steps": 500,
+  "global_step": 10,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.001219958521410272,
+      "step": 10,
+      "total_flos": 13713766612992.0,
+      "train_loss": 6.803397369384766,
+      "train_runtime": 14.6438,
+      "train_samples_per_second": 5.463,
+      "train_steps_per_second": 0.683
+    }
+  ],
+  "logging_steps": 500,
+  "max_steps": 10,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 1,
+  "save_steps": 1000,
+  "stateful_callbacks": {
+    "TrainerControl": {
+      "args": {
+        "should_epoch_stop": false,
+        "should_evaluate": false,
+        "should_log": false,
+        "should_save": true,
+        "should_training_stop": true
+      },
+      "attributes": {}
+    }
+  },
+  "total_flos": 13713766612992.0,
+  "train_batch_size": 8,
+  "trial_name": null,
+  "trial_params": null
+}
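Taken together, the state pins down the run shape: 10 optimizer steps at batch size 8 is roughly 0.0012 of an epoch over the 65,573 training samples (10 / (65573 / 8) ≈ 0.00122, matching the `epoch` field), which is consistent with the degenerate repetition in generated_predictions.txt. A sketch of `Seq2SeqTrainingArguments` consistent with these values; everything not shown is assumed to be a default, since the actual launch command is not part of this commit:

```python
from transformers import Seq2SeqTrainingArguments

# Values mirrored from trainer_state.json above; all other settings assumed.
args = Seq2SeqTrainingArguments(
    output_dir="mt0-base-pt-vmw",
    per_device_train_batch_size=8,  # "train_batch_size": 8
    max_steps=10,                   # "max_steps": 10 -> stops at global_step 10
    logging_steps=500,              # "logging_steps": 500 (never reached here)
    eval_steps=500,                 # "eval_steps": 500
    save_steps=1000,                # "save_steps": 1000
    predict_with_generate=True,     # needed for BLEU and generated_predictions.txt
)
```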