Commit dbbea80 (verified), committed by mschonhardt
1 Parent(s): 36a2dc2

Model save

README.md CHANGED
@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [google/byt5-base](https://huggingface.co/google/byt5-base) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 7.9562
+ - Loss: 0.1427

  ## Model description

@@ -37,7 +37,7 @@ More information needed
  ### Training hyperparameters

  The following hyperparameters were used during training:
- - learning_rate: 0.0001
+ - learning_rate: 5e-05
  - train_batch_size: 8
  - eval_batch_size: 8
  - seed: 42
@@ -46,15 +46,14 @@ The following hyperparameters were used during training:
  - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
  - num_epochs: 3
- - mixed_precision_training: Native AMP

  ### Training results

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | 10.0892 | 1.0 | 971 | 7.9562 |
- | 10.0874 | 2.0 | 1942 | 7.9562 |
- | 10.0738 | 3.0 | 2913 | 7.9562 |
+ | 0.3895 | 1.0 | 971 | 0.2032 |
+ | 0.2885 | 2.0 | 1942 | 0.1498 |
+ | 0.272 | 3.0 | 2913 | 0.1427 |


  ### Framework versions
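
The updated README lists `learning_rate: 5e-05` with `lr_scheduler_type: linear`, and the results table ends at step 2913. As a rough illustration of what a linear schedule implies at each logged step, here is a minimal sketch. It assumes zero warmup steps and a decay to zero at the final step; neither is stated in the diff (README lines 44-45 are not shown). `linear_lr` is a hypothetical helper written for this note, not the Trainer's own code.

```python
# Sketch only: linear decay from the base learning rate to zero.
# Assumption: no warmup steps (the diff does not show a warmup setting).
BASE_LR = 5e-05      # learning_rate from the updated README
TOTAL_STEPS = 2913   # last step in the training results table


def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate after `step` optimizer steps under a linear decay to zero."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


if __name__ == "__main__":
    # Steps at which the README table logs an evaluation.
    for step in (0, 971, 1942, 2913):
        print(f"step {step:>4}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate has dropped to two thirds of its base value by the end of epoch 1 and to one third by the end of epoch 2.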
adapter_config.json CHANGED
@@ -25,8 +25,8 @@
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
- "v",
- "q"
+ "q",
+ "v"
  ],
  "task_type": "SEQ_2_SEQ_LM",
  "trainable_token_indices": null,
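
The only change in this hunk is the order of the two `target_modules` entries. A minimal sketch for checking that two adapter configs name the same modules regardless of order follows. It assumes the order of `target_modules` carries no meaning, and it uses trimmed-down JSON snippets rather than the full `adapter_config.json`; `same_targets` is a hypothetical helper.

```python
import json

# Trimmed stand-ins for the old and new adapter_config.json (assumption:
# only the fields visible in the diff hunk are reproduced here).
OLD = '{"target_modules": ["v", "q"], "task_type": "SEQ_2_SEQ_LM"}'
NEW = '{"target_modules": ["q", "v"], "task_type": "SEQ_2_SEQ_LM"}'


def same_targets(old_json: str, new_json: str) -> bool:
    """Return True when both configs list the same target modules, ignoring order."""
    old = json.loads(old_json)
    new = json.loads(new_json)
    return set(old["target_modules"]) == set(new["target_modules"])


if __name__ == "__main__":
    print(same_targets(OLD, NEW))
```

If the check passes, the hunk is a reordering only and the adapter targets the same `q` and `v` modules before and after the commit.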
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a2431ec5ecaca0cc349717e64130553b7930b77f9cb55889e7e62e456b735942
+ oid sha256:ab9e49e7c8a08ad4ed2c0bb608d46650540d1c9b5df3c1fdb1e29cc05d38a883
  size 4440360
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e9dbdf6b0dc80f506334ba7c3df2f5db5d046e7157aa0be9f34b502003205cd5
+ oid sha256:9ac9f8bf7a5d5edc6ed6a9840f1c973f0af94e39c504b21e181dab7569258e72
  size 5969