How to use madoss/nllb-dry-run with Transformers:

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained("madoss/nllb-dry-run")
    model = AutoModelForSeq2SeqLM.from_pretrained("madoss/nllb-dry-run")
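The loading snippet above stops before any text is translated. As a minimal sketch of the remaining steps, the hypothetical helper `translate` below tokenizes one sentence, forces the target-language token, and decodes the result. The language codes `eng_Latn` and `fra_Latn` are assumptions for illustration only: the model card says the fine-tuning dataset is unknown, so the language pair this checkpoint was actually tuned on is not stated.

```python
def translate(text, src_lang="eng_Latn", tgt_lang="fra_Latn",
              model_id="madoss/nllb-dry-run", max_new_tokens=64):
    """Translate one sentence with the fine-tuned NLLB checkpoint.

    src_lang and tgt_lang are illustrative FLORES-200 codes (an assumption);
    replace them with the pair the model was actually fine-tuned on.
    """
    # Imported lazily so that defining the helper needs no heavy dependencies.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang=src_lang)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(text, return_tensors="pt")
    # NLLB selects the output language through the first generated token.
    output_ids = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

Calling `translate("Hello, world!")` downloads the roughly 2.5 GB checkpoint on first use. Given the low BLEU scores reported in this commit, the output is likely to be rough.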
End of training
- README.md +13 -12
- all_results.json +7 -7
- config.json +1 -1
- generation_config.json +1 -1
- model.safetensors +1 -1
- test_results.json +7 -7
- tokenizer.json +2 -2
- tokenizer_config.json +0 -1
- training_args.bin +2 -2
README.md CHANGED

    @@ -3,6 +3,8 @@ library_name: transformers
     license: cc-by-nc-4.0
     base_model: facebook/nllb-200-distilled-600M
     tags:
    +- trackio
    +- trackio:https://huggingface.co/spaces/madoss/trackio
     - generated_from_trainer
     metrics:
     - bleu
    @@ -14,13 +16,14 @@ model-index:
     <!-- This model card has been generated automatically according to the information the Trainer had access to. You
     should probably proofread and complete it, then remove this comment. -->

    +<a href="https://huggingface.co/spaces/madoss/trackio" target="_blank"><img src="https://raw.githubusercontent.com/gradio-app/trackio/refs/heads/main/trackio/assets/badge.png" alt="Visualize in Trackio" title="Visualize in Trackio" style="height: 40px;"/></a>
     # nllb-dry-run

     This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on an unknown dataset.
     It achieves the following results on the evaluation set:
    -- Loss: 3.
    -- Bleu:
    -- Chrf:
    +- Loss: 3.5119
    +- Bleu: 3.4640
    +- Chrf: 18.8077

     ## Model description

    @@ -51,18 +54,16 @@ The following hyperparameters were used during training:

     | Training Loss | Epoch  | Step | Validation Loss | Bleu   | Chrf    |
     |:-------------:|:------:|:----:|:---------------:|:------:|:-------:|
    -| No log | 0.3125 | 10 | 3.
    -| No log | 0.625 | 20 | 3.
    -| No log | 0.9375 | 30 | 3.
    -| No log | 1.25 | 40 | 3.
    -| No log | 1.5625 | 50 | 3.
    -| No log | 1.875 | 60 | 3.3710 | 3.9858 | 21.0020 |
    -| No log | 2.1875 | 70 | 3.6589 | 2.6974 | 19.3271 |
    +| No log | 0.3125 | 10 | 3.4697 | 2.3857 | 15.3837 |
    +| No log | 0.625 | 20 | 3.3382 | 3.7821 | 19.2101 |
    +| No log | 0.9375 | 30 | 3.3116 | 5.8430 | 18.9142 |
    +| No log | 1.25 | 40 | 3.2531 | 3.6516 | 19.1025 |
    +| No log | 1.5625 | 50 | 3.5119 | 3.4640 | 18.8077 |


     ### Framework versions

    -- Transformers 5.
    +- Transformers 5.5.4
     - Pytorch 2.8.0+cu128
    -- Datasets 4.8.
    +- Datasets 4.8.4
     - Tokenizers 0.22.2
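The updated training table supports a little arithmetic. The sketch below copies its five evaluation rows, derives the number of optimizer steps per epoch from the first row, and finds which step scored best on each metric. The names `ROWS`, `steps_per_epoch`, and `best_row` are illustrative and not part of the repository.

```python
# Evaluation rows from the updated README table:
# (epoch, step, validation_loss, bleu, chrf)
ROWS = [
    (0.3125, 10, 3.4697, 2.3857, 15.3837),
    (0.625, 20, 3.3382, 3.7821, 19.2101),
    (0.9375, 30, 3.3116, 5.8430, 18.9142),
    (1.25, 40, 3.2531, 3.6516, 19.1025),
    (1.5625, 50, 3.5119, 3.4640, 18.8077),
]


def steps_per_epoch(rows):
    """Infer steps per epoch from the first row: step / epoch fraction."""
    epoch, step = rows[0][0], rows[0][1]
    return round(step / epoch)


def best_row(rows, column, maximize):
    """Return the row with the best value in the given column index."""
    pick = max if maximize else min
    return pick(rows, key=lambda row: row[column])


print(steps_per_epoch(ROWS))                 # 32
print(best_row(ROWS, 2, maximize=False)[1])  # 40 (lowest validation loss)
print(best_row(ROWS, 3, maximize=True)[1])   # 30 (highest BLEU)
print(best_row(ROWS, 4, maximize=True)[1])   # 20 (highest chrF)
```

The headline figures in the model card (loss 3.5119, BLEU 3.4640, chrF 18.8077) match the last row at step 50, not the best row on any single metric.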
all_results.json CHANGED

    @@ -1,9 +1,9 @@
     {
    -"epoch":
    -"test_bleu":
    -"test_chrf":
    -"test_loss": 4.
    -"test_runtime":
    -"test_samples_per_second":
    -"test_steps_per_second": 0.
    +"epoch": 1.5625,
    +"test_bleu": 2.3686059277426317,
    +"test_chrf": 19.996465593863476,
    +"test_loss": 4.228389739990234,
    +"test_runtime": 6.6284,
    +"test_samples_per_second": 1.509,
    +"test_steps_per_second": 0.453
     }
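The throughput fields in these metrics imply the size of the test run. Multiplying the runtime by each rate recovers approximate sample and step counts. The sketch below does this with the new values. The helper name `implied_counts` is illustrative, and the results are estimates because the logged rates are rounded to three decimals.

```python
import json

# New-side values from all_results.json in this commit.
ALL_RESULTS = json.loads("""
{
    "epoch": 1.5625,
    "test_bleu": 2.3686059277426317,
    "test_chrf": 19.996465593863476,
    "test_loss": 4.228389739990234,
    "test_runtime": 6.6284,
    "test_samples_per_second": 1.509,
    "test_steps_per_second": 0.453
}
""")


def implied_counts(metrics):
    """Estimate (samples, steps) as runtime multiplied by each rate."""
    runtime = metrics["test_runtime"]
    samples = runtime * metrics["test_samples_per_second"]
    steps = runtime * metrics["test_steps_per_second"]
    return round(samples), round(steps)


print(implied_counts(ALL_RESULTS))  # (10, 3)
```

About 10 test samples in 3 steps would be consistent with an evaluation batch size of 4 on a single device, though the batch size itself is not shown in this diff. A test set this small also means the BLEU and chrF values carry very little statistical weight.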
config.json CHANGED

    @@ -27,7 +27,7 @@
     "scale_embedding": true,
     "tie_word_embeddings": true,
     "tokenizer_class": "NllbTokenizer",
    -"transformers_version": "5.
    +"transformers_version": "5.5.4",
     "use_cache": false,
     "vocab_size": 256206
     }
generation_config.json CHANGED

    @@ -32,7 +32,7 @@
     "temperature": 1.0,
     "top_k": 50,
     "top_p": 1.0,
    -"transformers_version": "5.
    +"transformers_version": "5.5.4",
     "typical_p": 1.0,
     "use_cache": true
     }
model.safetensors CHANGED

    @@ -1,3 +1,3 @@
     version https://git-lfs.github.com/spec/v1
    -oid sha256:
    +oid sha256:6f7f5620207bfc05b93aefecc44aa76e1fafb516a279ce755d0acd91bbe0ca91
     size 2460354912
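The three lines shown for `model.safetensors` are a Git LFS pointer, not the weights themselves: a spec version, a SHA-256 object ID, and the size in bytes. As a minimal sketch, the hypothetical helper `parse_lfs_pointer` below reads those fields and converts the size into more familiar units. The parameter estimate assumes 4-byte (float32) weights and ignores the small safetensors header, so treat it as a rough figure.

```python
# New-side pointer contents for model.safetensors in this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:6f7f5620207bfc05b93aefecc44aa76e1fafb516a279ce755d0acd91bbe0ca91
size 2460354912
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


info = parse_lfs_pointer(POINTER)
size_gib = info["size"] / 2**30
# Assumption: float32 storage, 4 bytes per parameter, header ignored.
approx_params = info["size"] // 4

print(info["algorithm"], len(info["digest"]))  # sha256 64
print(f"{size_gib:.2f} GiB")                   # 2.29 GiB
print(approx_params)                           # 615088728
```

Roughly 615 million parameters is in line with the `facebook/nllb-200-distilled-600M` base model. The size is identical on both sides of the diff and only the hash changed, so the commit replaced the weight values without changing the tensor layout.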
test_results.json CHANGED

    @@ -1,9 +1,9 @@
     {
    -"epoch":
    -"test_bleu":
    -"test_chrf":
    -"test_loss": 4.
    -"test_runtime":
    -"test_samples_per_second":
    -"test_steps_per_second": 0.
    +"epoch": 1.5625,
    +"test_bleu": 2.3686059277426317,
    +"test_chrf": 19.996465593863476,
    +"test_loss": 4.228389739990234,
    +"test_runtime": 6.6284,
    +"test_samples_per_second": 1.509,
    +"test_steps_per_second": 0.453
     }
tokenizer.json CHANGED

    @@ -1,3 +1,3 @@
     version https://git-lfs.github.com/spec/v1
    -oid sha256:
    -size
    +oid sha256:a033dad7cd9f5a295fff4a20fe14f7b2c6dc7152c670c79d07e502891d460fe7
    +size 32240233
tokenizer_config.json CHANGED

    @@ -209,7 +209,6 @@
     ],
     "is_local": false,
     "legacy_behaviour": false,
    -"local_files_only": false,
     "mask_token": "<mask>",
     "model_max_length": 1024,
     "pad_token": "<pad>",
training_args.bin CHANGED

    @@ -1,3 +1,3 @@
     version https://git-lfs.github.com/spec/v1
    -oid sha256:
    -size
    +oid sha256:270f2e120330b110cc57777d90e42017aa5dcd4d8ae3ac8792a08da0702b4137
    +size 5393