FiveC committed on
Commit 8aef529 · verified · 1 Parent(s): 3625fc7

End of training

Files changed (4)
  1. .gitattributes +1 -0
  2. README.md +63 -0
  3. tokenizer.json +3 -0
  4. tokenizer_config.json +20 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
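
The added rule routes `tokenizer.json` through Git LFS, which is why that file shows up later in this commit as a three-line pointer while `tokenizer_config.json` is stored inline. The sketch below is only an approximation: real `.gitattributes` matching follows gitignore-style rules, and `fnmatchcase` is a stand-in that happens to agree for the simple basename patterns visible in this hunk. The helper name `tracked_by_lfs` is hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns visible in this hunk of .gitattributes (the full file has more rules).
LFS_PATTERNS = ["*.zip", "*.zst", "*tfevents*", "tokenizer.json"]


def tracked_by_lfs(filename: str) -> bool:
    """Approximate check: does the basename match any LFS pattern listed above?"""
    return any(fnmatchcase(filename, pattern) for pattern in LFS_PATTERNS)


# The four files touched by this commit.
for name in [".gitattributes", "README.md", "tokenizer.json", "tokenizer_config.json"]:
    print(f"{name}: {'LFS' if tracked_by_lfs(name) else 'regular git object'}")
```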
README.md ADDED
@@ -0,0 +1,63 @@
+ ---
+ library_name: transformers
+ base_model: facebook/mbart-large-50-many-to-many-mmt
+ tags:
+ - generated_from_trainer
+ metrics:
+ - sacrebleu
+ model-index:
+ - name: za_zh_sc
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # za_zh_sc
+
+ This model is a fine-tuned version of [facebook/mbart-large-50-many-to-many-mmt](https://huggingface.co/facebook/mbart-large-50-many-to-many-mmt) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 2.6823
+ - Sacrebleu: 6.0325
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 16
+ - eval_batch_size: 16
+ - seed: 42
+ - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+ - lr_scheduler_type: linear
+ - num_epochs: 3
+ - mixed_precision_training: Native AMP
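
To make `lr_scheduler_type: linear` concrete, here is a minimal pure-Python sketch of a linear decay from the listed peak rate to zero. It assumes zero warmup steps, since the model card lists none, and takes the total of 927 steps from the training results table in this README. The function name `linear_lr` is illustrative and this is not the Trainer's own code.

```python
PEAK_LR = 2e-05      # learning_rate from the hyperparameter list
TOTAL_STEPS = 927    # 3 epochs x 309 steps, per the training results table
WARMUP_STEPS = 0     # assumption: the model card lists no warmup


def linear_lr(step: int,
              peak_lr: float = PEAK_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start and at each epoch boundary.
for step in (0, 309, 618, 927):
    print(f"step {step:>3}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate falls by one third of the peak per epoch and reaches zero at the final step.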
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Sacrebleu |
+ |:-------------:|:-----:|:----:|:---------------:|:---------:|
+ | 2.4359 | 1.0 | 309 | 2.9542 | 3.4303 |
+ | 1.5757 | 2.0 | 618 | 2.7130 | 4.8701 |
+ | 1.2767 | 3.0 | 927 | 2.6823 | 6.0325 |
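
The table supports a little arithmetic, sketched below. The rows are copied from the table. The training-set size is only an upper bound and assumes a single device with no gradient accumulation, neither of which the model card states. The helper names are hypothetical.

```python
# Rows copied from the table: (epoch, step, training_loss, validation_loss, sacrebleu)
ROWS = [
    (1, 309, 2.4359, 2.9542, 3.4303),
    (2, 618, 1.5757, 2.7130, 4.8701),
    (3, 927, 1.2767, 2.6823, 6.0325),
]
TRAIN_BATCH_SIZE = 16  # from the hyperparameter list


def max_training_examples(rows, batch_size):
    """Upper bound on training-set size: steps in the first epoch x batch size.

    Assumes one device and no gradient accumulation (not stated in the card).
    """
    steps_per_epoch = rows[0][1]
    return steps_per_epoch * batch_size


def loss_gaps(rows):
    """Validation loss minus training loss, per epoch."""
    return [round(val - train, 4) for _, _, train, val, _ in rows]


def bleu_gains(rows):
    """Epoch-over-epoch change in the SacreBLEU score."""
    scores = [row[4] for row in rows]
    return [round(b - a, 4) for a, b in zip(scores, scores[1:])]


print("at most", max_training_examples(ROWS, TRAIN_BATCH_SIZE), "training examples")
print("validation minus training loss:", loss_gaps(ROWS))
print("SacreBLEU gain per epoch:", bleu_gains(ROWS))
```

The training loss falls faster than the validation loss, so the gap between them widens each epoch while SacreBLEU is still improving. Whether further epochs would help cannot be determined from these three rows alone.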
+
+
+ ### Framework versions
+
+ - Transformers 5.0.0
+ - Pytorch 2.10.0+cu128
+ - Datasets 4.0.0
+ - Tokenizers 0.22.2
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1eeb5a28135f15f64314077af5cae74547d898c979266144ecd95f709e76b008
+ size 16793473
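
These three lines are a Git LFS pointer, not the tokenizer itself: the repository stores only the spec version, the SHA-256 of the real file, and its size in bytes. As a rough sketch, the pointer can be parsed and used to check a downloaded copy. The function names `parse_lfs_pointer` and `matches_pointer` are illustrative and not part of any Git LFS tooling.

```python
import hashlib

# Pointer text as it appears in this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:1eeb5a28135f15f64314077af5cae74547d898c979266144ecd95f709e76b008
size 16793473
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line and break the oid into algorithm and digest."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """True if the bytes have the size and SHA-256 digest recorded in the pointer."""
    return (
        len(data) == pointer["size_bytes"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


info = parse_lfs_pointer(POINTER)
print(info["hash_algorithm"], info["digest"][:12], f"{info['size_bytes'] / 2**20:.2f} MiB")
```

Calling `matches_pointer(open("tokenizer.json", "rb").read(), info)` on a fully downloaded copy would then confirm that the file is the one this commit refers to.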
tokenizer_config.json ADDED
@@ -0,0 +1,20 @@
+ {
+   "backend": "tokenizers",
+   "bos_token": "<s>",
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "extra_special_tokens": [
+     "<SEP>"
+   ],
+   "is_local": false,
+   "language_codes": "ML50",
+   "mask_token": "<mask>",
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "src_lang": "zh_CN",
+   "tgt_lang": null,
+   "tokenizer_class": "MBart50Tokenizer",
+   "unk_id": 0,
+   "unk_token": "<unk>"
+ }
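
Two values in this config are easy to misread. `model_max_length` equals `int(1e30)`, which to my understanding is the placeholder Transformers writes when no maximum length was configured, so it should not be treated as a real limit. `tgt_lang` is `null`, so the target language presumably has to be supplied at inference time. The sketch below parses a subset of the fields. The fallback of 1024 tokens is an assumption based on the position-embedding size of the mBART-50 base model, and `effective_max_length` is a hypothetical helper.

```python
import json

# Subset of the fields from tokenizer_config.json in this commit.
CONFIG_JSON = """
{
  "model_max_length": 1000000000000000019884624838656,
  "src_lang": "zh_CN",
  "tgt_lang": null,
  "tokenizer_class": "MBart50Tokenizer"
}
"""

UNSET_MAX_LENGTH = int(1e30)  # placeholder value meaning "no limit configured"
ASSUMED_MODEL_LIMIT = 1024    # assumption: position-embedding size of mBART-50


def effective_max_length(cfg: dict, fallback: int = ASSUMED_MODEL_LIMIT) -> int:
    """Return a usable truncation length, ignoring the 'unset' placeholder."""
    configured = cfg.get("model_max_length", UNSET_MAX_LENGTH)
    return fallback if configured >= UNSET_MAX_LENGTH else configured


config = json.loads(CONFIG_JSON)
print("max length is a placeholder:", config["model_max_length"] == UNSET_MAX_LENGTH)
print("truncate inputs to:", effective_max_length(config))
print("target language set:", config["tgt_lang"] is not None)
```

In practice this suggests passing an explicit `max_length` together with `truncation=True` when tokenizing, rather than relying on the stored value.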