Commit ae75f38 by jhpassion0621 (verified)
1 Parent(s): 770abc0

End of training

README.md ADDED
@@ -0,0 +1,70 @@
+ ---
+ library_name: transformers
+ license: apache-2.0
+ base_model: google/mt5-base
+ tags:
+ - generated_from_trainer
+ metrics:
+ - bleu
+ model-index:
+ - name: kp-mt5-base
+ results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # kp-mt5-base
+
+ This model is a fine-tuned version of [google/mt5-base](https://huggingface.co/google/mt5-base) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.9680
+ - Bleu: 29.4263
+ - Gen Len: 44.3157
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2.59e-05
+ - train_batch_size: 24
+ - eval_batch_size: 32
+ - seed: 42
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 96
+ - optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
+ - lr_scheduler_type: linear
+ - num_epochs: 3
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Bleu | Gen Len | Validation Loss |
+ |:-------------:|:------:|:------:|:-------:|:-------:|:---------------:|
+ | 1.8289 | 0.4026 | 20001 | 17.9948 | 45.585 | 1.3952 |
+ | 1.569 | 0.8052 | 40002 | 22.2856 | 45.2362 | 1.2204 |
+ | 1.463 | 1.2078 | 60003 | 24.6884 | 44.4264 | 1.1352 |
+ | 1.3552 | 1.6104 | 80004 | 26.0272 | 44.3574 | 1.0864 |
+ | 1.3187 | 2.0130 | 100005 | 27.0619 | 45.005 | 1.0543 |
+ | 1.3011 | 2.4156 | 120006 | 27.63 | 44.1082 | 1.0286 |
+ | 1.2552 | 2.8182 | 140007 | 29.0405 | 44.4054 | 0.9847 |
+
+
+ ### Framework versions
+
+ - Transformers 4.56.1
+ - Pytorch 2.8.0+cu126
+ - Datasets 4.0.0
+ - Tokenizers 0.22.0
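
The batch-size and epoch figures in the README above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the helper names are hypothetical, the single-device conclusion is an inference from `24 * 4 = 96` rather than something the card states, and the steps-per-epoch figure is an estimate limited by the four-decimal rounding of the logged epoch values.

```python
# Values copied from the README hyperparameters and the results table.
TRAIN_BATCH_SIZE = 24              # train_batch_size (per device)
GRADIENT_ACCUMULATION_STEPS = 4
TOTAL_TRAIN_BATCH_SIZE = 96
NUM_EPOCHS = 3


def implied_num_devices(per_device, accumulation, total):
    """Number of devices implied by the reported total batch size."""
    return total // (per_device * accumulation)


def estimate_steps_per_epoch(step, epoch):
    """Estimate optimizer steps per epoch from one logged (step, epoch) pair."""
    return round(step / epoch)


# Assumption: the card does not state the device count; it is inferred here.
num_devices = implied_num_devices(
    TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, TOTAL_TRAIN_BATCH_SIZE
)

# Last logged evaluation row: step 140007 at epoch 2.8182.
steps_per_epoch = estimate_steps_per_epoch(140007, 2.8182)
total_steps = steps_per_epoch * NUM_EPOCHS
approx_examples_per_epoch = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print("implied devices:", num_devices)
    print("estimated steps per epoch:", steps_per_epoch)
    print("estimated total steps:", total_steps)
    print("approx. training examples per epoch:", approx_examples_per_epoch)
```

Under these assumptions the training set holds roughly 4.8 million examples, which would be worth recording in the "Training and evaluation data" section once the dataset is identified.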
generation_config.json ADDED
@@ -0,0 +1,8 @@
+ {
+ "decoder_start_token_id": 0,
+ "eos_token_id": [
+ 1
+ ],
+ "pad_token_id": 0,
+ "transformers_version": "4.56.1"
+ }
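
The generation config added above is small enough to sanity-check by hand, but a short script makes the two relevant properties explicit: the decoder start token equals the pad token (the usual T5-family convention), and `eos_token_id` is stored as a list. This is a minimal sketch using only the standard library; `normalize_eos` and `summarize` are hypothetical helper names, not part of `transformers`.

```python
import json

# Contents of generation_config.json as added in this commit.
GENERATION_CONFIG = """
{
  "decoder_start_token_id": 0,
  "eos_token_id": [1],
  "pad_token_id": 0,
  "transformers_version": "4.56.1"
}
"""


def normalize_eos(value):
    """Return eos_token_id as a list; configs may store an int or a list."""
    if value is None:
        return []
    if isinstance(value, int):
        return [value]
    return list(value)


def summarize(config_text):
    """Parse the config and report the properties a reviewer would check."""
    cfg = json.loads(config_text)
    return {
        "eos_ids": normalize_eos(cfg.get("eos_token_id")),
        "decoder_starts_with_pad": (
            cfg.get("decoder_start_token_id") == cfg.get("pad_token_id")
        ),
        "transformers_version": cfg.get("transformers_version"),
    }


if __name__ == "__main__":
    print(summarize(GENERATION_CONFIG))
```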
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:279140d23bc95173689d32c0bdcdf97ab7a04ea46b9017fa90e38fd0ff88d4c0
+ oid sha256:4ae3c92d5b85a8f07090c777dba813742a9b96e021c2cf745d9253bcb341f1c0
  size 2329638768
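
Only the `oid` changes in the `model.safetensors` pointer above; the size is unchanged, as expected when weights are updated without altering the architecture. A downloaded file can be checked against such a pointer by comparing its byte size and SHA-256 digest. The sketch below assumes the three-line pointer layout shown in this diff; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, not Git LFS commands.

```python
import hashlib

# New pointer for model.safetensors, copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4ae3c92d5b85a8f07090c777dba813742a9b96e021c2cf745d9253bcb341f1c0
size 2329638768
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and SHA-256 digest."""
    if pointer["algo"] != "sha256":
        raise ValueError("unsupported hash algorithm: " + pointer["algo"])
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so multi-gigabyte weight files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    print(parse_lfs_pointer(POINTER_TEXT))
```

Calling `matches_pointer("model.safetensors", parse_lfs_pointer(POINTER_TEXT))` on a local copy would confirm that the download corresponds to this commit.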
runs/Sep20_09-18-10_cfa1d4a90e71/events.out.tfevents.1758359926.cfa1d4a90e71.7055.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:19357cb79618c7394aa2f8c3d16f0201651a4c8139835c54fd971bf76f35d45b
- size 27228
+ oid sha256:c59047e61e63d9c6689f74fe16820f670494bd6155cbe1822e8dc634df0df7df
+ size 37640
runs/Sep20_09-18-10_cfa1d4a90e71/events.out.tfevents.1758395441.cfa1d4a90e71.7055.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d21af173d1c39420a1bb5bfa4d4be4785faf3274e79730f6102e262009bf536c
+ size 465