NetherQuartz/tatoeba-en-tok

@@ -1,82 +1,77 @@
----
-library_name: transformers
-license: apache-2.0
-base_model: Helsinki-NLP/opus-mt-en-ru
-tags:
-- translation
-- generated_from_trainer
-metrics:
-- bleu
-model-index:
-- name: tatoeba-en-tok
-  results: []
-language:
-- en
-- tok
-datasets:
-- NetherQuartz/tatoeba-tokipona
----
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# tatoeba-en-tok
-This model is a fine-tuned version of [Helsinki-NLP/opus-mt-en-ru](https://huggingface.co/Helsinki-NLP/opus-mt-en-ru) on an unknown dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.3998
-- Bleu: 53.4755
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 2e-05
-- train_batch_size: 64
-- eval_batch_size: 64
-- seed: 42
-- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type: linear
-- num_epochs: 15
-- mixed_precision_training: Native AMP
-### Training results
-| Training Loss | Epoch | Step  | Validation Loss | Bleu    |
-|:-------------:|:-----:|:-----:|:---------------:|:-------:|
-| 0.8007        | 1.0   | 1191  | 0.6103          | 42.1290 |
-| 0.601         | 2.0   | 2382  | 0.5092          | 46.7946 |
-| 0.5048        | 3.0   | 3573  | 0.4695          | 48.9150 |
-| 0.4607        | 4.0   | 4764  | 0.4463          | 50.2526 |
-| 0.4291        | 5.0   | 5955  | 0.4321          | 50.9477 |
-| 0.4013        | 6.0   | 7146  | 0.4202          | 51.7015 |
-| 0.3839        | 7.0   | 8337  | 0.4154          | 52.0396 |
-| 0.3674        | 8.0   | 9528  | 0.4090          | 52.7005 |
-| 0.3518        | 9.0   | 10719 | 0.4067          | 52.6233 |
-| 0.3423        | 10.0  | 11910 | 0.4053          | 52.7858 |
-| 0.3332        | 11.0  | 13101 | 0.4016          | 53.1131 |
-| 0.3251        | 12.0  | 14292 | 0.4008          | 53.2837 |
-| 0.3202        | 13.0  | 15483 | 0.3999          | 53.4268 |
-| 0.3131        | 14.0  | 16674 | 0.4001          | 53.4309 |
-| 0.3128        | 15.0  | 17865 | 0.3998          | 53.4755 |
-### Framework versions
-- Transformers 4.52.4
-- Pytorch 2.7.1+cu128
-- Datasets 3.6.0
-- Tokenizers 0.21.1

+---
+library_name: transformers
+license: apache-2.0
+base_model: Helsinki-NLP/opus-mt-en-ru
+tags:
+- translation
+- generated_from_trainer
+metrics:
+- bleu
+model-index:
+- name: tatoeba-en-tok
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# tatoeba-en-tok
+This model is a fine-tuned version of [Helsinki-NLP/opus-mt-en-ru](https://huggingface.co/Helsinki-NLP/opus-mt-en-ru) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.4513
+- Bleu: 49.2199
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 64
+- eval_batch_size: 64
+- seed: 42
+- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- lr_scheduler_type: linear
+- num_epochs: 15
+- mixed_precision_training: Native AMP
+### Training results
+| Training Loss | Epoch | Step  | Validation Loss | Bleu    |
+|:-------------:|:-----:|:-----:|:---------------:|:-------:|
+| 0.8381        | 1.0   | 1167  | 0.6677          | 38.6270 |
+| 0.6401        | 2.0   | 2334  | 0.5611          | 42.8112 |
+| 0.5453        | 3.0   | 3501  | 0.5228          | 44.9041 |
+| 0.5046        | 4.0   | 4668  | 0.4977          | 46.2278 |
+| 0.474         | 5.0   | 5835  | 0.4806          | 47.2086 |
+| 0.4466        | 6.0   | 7002  | 0.4723          | 47.5220 |
+| 0.4274        | 7.0   | 8169  | 0.4662          | 48.3719 |
+| 0.4134        | 8.0   | 9336  | 0.4587          | 48.4629 |
+| 0.3949        | 9.0   | 10503 | 0.4593          | 48.8579 |
+| 0.3864        | 10.0  | 11670 | 0.4537          | 48.6287 |
+| 0.375         | 11.0  | 12837 | 0.4546          | 48.8812 |
+| 0.3692        | 12.0  | 14004 | 0.4522          | 49.1093 |
+| 0.3608        | 13.0  | 15171 | 0.4524          | 49.1794 |
+| 0.3553        | 14.0  | 16338 | 0.4513          | 49.2199 |
+| 0.3533        | 15.0  | 17505 | 0.4518          | 49.2096 |
+### Framework versions
+- Transformers 4.52.4
+- Pytorch 2.7.1+cu128
+- Datasets 3.6.0
+- Tokenizers 0.21.1
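
The two results tables in this diff allow a couple of quick sanity checks, sketched below in plain Python with numbers copied from the diff. The first check rests on assumptions the card does not state: a single device, no gradient accumulation, and the last partial batch being kept. Under those assumptions one optimizer step consumes `train_batch_size` = 64 examples, so the steps-per-epoch figure bounds the training-set size. The second check finds the epoch with the lowest validation loss in the new run. The names `size_bounds` and `NEW_RUN` are illustrative only.

```python
BATCH_SIZE = 64  # train_batch_size from the model card

def size_bounds(steps_per_epoch, batch_size=BATCH_SIZE):
    """Smallest and largest dataset size that yields this many steps per epoch.

    Assumes one device, no gradient accumulation, and that the final
    partial batch is kept (none of this is stated in the card).
    """
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest

# (epoch, validation loss, BLEU) rows of the new run, copied from the diff.
NEW_RUN = [
    (1, 0.6677, 38.6270), (2, 0.5611, 42.8112), (3, 0.5228, 44.9041),
    (4, 0.4977, 46.2278), (5, 0.4806, 47.2086), (6, 0.4723, 47.5220),
    (7, 0.4662, 48.3719), (8, 0.4587, 48.4629), (9, 0.4593, 48.8579),
    (10, 0.4537, 48.6287), (11, 0.4546, 48.8812), (12, 0.4522, 49.1093),
    (13, 0.4524, 49.1794), (14, 0.4513, 49.2199), (15, 0.4518, 49.2096),
]

old_bounds = size_bounds(1191)  # old run: 1191 steps per epoch
new_bounds = size_bounds(1167)  # new run: 1167 steps per epoch
best_epoch, best_loss, best_bleu = min(NEW_RUN, key=lambda row: row[1])

print("old run training-set size:", old_bounds)
print("new run training-set size:", new_bounds)
print("best epoch by validation loss:", best_epoch, best_loss, best_bleu)
```

If the assumptions hold, the new run trained on roughly 1,500 fewer examples than the old one. The headline Loss and Bleu in the new card match the epoch 14 row rather than the final epoch, which would be consistent with the best checkpoint being kept, though the card does not say so.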

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9c8164f061532b7011f9bffacf25c89ac37c4eda127a95c929f46f17dbb4ce4d
+oid sha256:4aee73d0cb2664d2a0625eee668c79923e8e24aafc2fd58bdaf985709c98af7a
 size 304869976

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:10d104cda681b54b11291b7a92a1e270d34be12da3522bd3cc321787c8a96cea
+oid sha256:7569370907b5c5a83abbe85905c238ac92f23fc1d77bf7433ba66191318d935c
 size 5841
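
The two binary files are stored as Git LFS pointers: a `version` line, an `oid sha256:` line holding the SHA-256 digest of the real file, and its `size` in bytes. Only the digest changed in this commit; both sizes are identical before and after. Below is a minimal sketch of checking a downloaded file against such a pointer, using only the Python standard library. The helper names `parse_pointer` and `matches_pointer` are illustrative and not part of any Git LFS tooling, and the local file path in the usage comment is an assumption.

```python
import hashlib
import os

def parse_pointer(text):
    """Parse Git LFS pointer text into a dict mapping each key to its value."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields

def matches_pointer(path, pointer_text, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and SHA-256 the pointer names."""
    fields = parse_pointer(pointer_text)
    algorithm, _, expected_digest = fields["oid"].partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algorithm!r}")
    # Cheap check first: a size mismatch rules the file out without hashing.
    if os.path.getsize(path) != int(fields["size"]):
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so a ~300 MB weights file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_digest

# Pointer for the new model.safetensors, copied from the diff above.
NEW_MODEL_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:4aee73d0cb2664d2a0625eee668c79923e8e24aafc2fd58bdaf985709c98af7a
size 304869976
"""

# Usage, assuming the weights were downloaded to the working directory:
# matches_pointer("model.safetensors", NEW_MODEL_POINTER)
```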