End of training

Files changed (4)

README.md +15 -19
adapter_config.json +4 -2
adapter_model.safetensors +2 -2
training_args.bin +1 -1

README.md CHANGED

@@ -16,14 +16,19 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: nan
-- Score: 0.0
-- Counts: [0, 0, 0, 0]
-- Totals: [7173, 6173, 5173, 4173]
-- Precisions: [0.0, 0.0, 0.0, 0.0]
-- Bp: 0.1414
-- Sys Len: 7173
-- Ref Len: 21207
 ## Model description
@@ -43,8 +48,8 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
-- train_batch_size: 4
-- eval_batch_size: 4
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
@@ -52,15 +57,6 @@ The following hyperparameters were used during training:
 - mixed_precision_training: Native AMP
 - label_smoothing_factor: 0.1
-### Training results
-| Training Loss | Epoch | Step | Validation Loss | Score | Counts       | Totals                   | Precisions           | Bp     | Sys Len | Ref Len |
-|:-------------:|:-----:|:----:|:---------------:|:-----:|:------------:|:------------------------:|:--------------------:|:------:|:-------:|:-------:|
-| 0.0           | 1.0   | 2250 | nan             | 0.0   | [0, 0, 0, 0] | [7173, 6173, 5173, 4173] | [0.0, 0.0, 0.0, 0.0] | 0.1414 | 7173    | 21207   |
-| 0.0           | 2.0   | 4500 | nan             | 0.0   | [0, 0, 0, 0] | [7173, 6173, 5173, 4173] | [0.0, 0.0, 0.0, 0.0] | 0.1414 | 7173    | 21207   |
-| 0.0           | 3.0   | 6750 | nan             | 0.0   | [0, 0, 0, 0] | [7173, 6173, 5173, 4173] | [0.0, 0.0, 0.0, 0.0] | 0.1414 | 7173    | 21207   |
 ### Framework versions
 - PEFT 0.15.2

 This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- eval_loss: nan
+- eval_score: 0.0
+- eval_counts: [0, 0, 0, 0]
+- eval_totals: [1003, 3, 0, 0]
+- eval_precisions: [0.0, 0.0, 0.0, 0.0]
+- eval_bp: 0.0000
+- eval_sys_len: 1003
+- eval_ref_len: 20237
+- eval_runtime: 35.2587
+- eval_samples_per_second: 28.362
+- eval_steps_per_second: 3.545
+- epoch: 2.0
+- step: 2250
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - mixed_precision_training: Native AMP
 - label_smoothing_factor: 0.1
 ### Framework versions
 - PEFT 0.15.2
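
Note on the `bp` values: the metric fields (`counts`, `totals`, `precisions`, `bp`, `sys_len`, `ref_len`) look like sacreBLEU-style BLEU output, though the diff does not name the metric. Assuming the standard BLEU brevity penalty, exp(1 - ref_len / sys_len) when the system output is shorter than the reference, the reported values can be reproduced from the lengths in both versions of the README. The function name `brevity_penalty` is illustrative and does not come from the repository.

```python
import math


def brevity_penalty(sys_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty (assumed to be the metric used here)."""
    if sys_len == 0:
        return 0.0
    if sys_len >= ref_len:
        return 1.0
    # Output shorter than the reference: penalize exponentially.
    return math.exp(1.0 - ref_len / sys_len)


# Lengths copied from the removed and added README lines.
old_bp = brevity_penalty(sys_len=7173, ref_len=21207)
new_bp = brevity_penalty(sys_len=1003, ref_len=20237)

print(f"old bp: {old_bp:.4f}")  # old bp: 0.1414
print(f"new bp: {new_bp:.4f}")  # new bp: 0.0000
```

Both values match the diff. Under this assumption, the drop from 0.1414 to 0.0000 follows from the much shorter system output (1003 tokens against a 20237-token reference). With `eval_loss` at `nan` and all n-gram counts at zero in both versions, the score stays at 0.0 either way.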

adapter_config.json CHANGED

@@ -24,8 +24,10 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "v",
-    "q"
   ],
   "task_type": "SEQ_2_SEQ_LM",
   "trainable_token_indices": null,

   "rank_pattern": {},
   "revision": null,
   "target_modules": [
+    "q",
+    "o",
+    "k",
+    "v"
   ],
   "task_type": "SEQ_2_SEQ_LM",
   "trainable_token_indices": null,

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a852c350090f71eae13d9d747c614c6f466d83ed563a0dc62a6dd8b6bff0ca8c
-size 1389456

 version https://git-lfs.github.com/spec/v1
+oid sha256:27714901585c613bbac04b31f0c21d44ca7127e01db52e0d21f23fdce8079175
+size 2779024
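
The adapter file roughly doubles in size (1389456 to 2779024 bytes), which is consistent with `target_modules` growing from two attention projections to four. A rough sketch of the arithmetic follows. Everything except the two file sizes is an assumption not stated in this diff: the mt5-small dimensions (d_model 512, 6 heads with d_kv 64, 8 encoder and 8 decoder layers), a LoRA rank of 8, and float32 storage.

```python
# Assumed mt5-small dimensions (not stated in the diff).
D_MODEL = 512
INNER_DIM = 6 * 64            # num_heads * d_kv
ATTN_BLOCKS = 8 + 2 * 8       # encoder self-attn + decoder self- and cross-attn
RANK = 8                      # assumed LoRA rank
BYTES_PER_PARAM = 4           # assumed float32 weights


def lora_weight_bytes(n_target_modules: int) -> int:
    """Estimated bytes of LoRA A/B matrices for the targeted projections."""
    # Each adapted projection adds A (rank x in) and B (out x rank);
    # q, k, v map 512 -> 384 and o maps 384 -> 512, so in + out is the same.
    params_per_module = RANK * (D_MODEL + INNER_DIM)
    return ATTN_BLOCKS * n_target_modules * params_per_module * BYTES_PER_PARAM


old_est = lora_weight_bytes(2)   # q, v
new_est = lora_weight_bytes(4)   # q, k, v, o

# File sizes copied from the LFS pointers in the diff.
print(old_est, 1389456 - old_est)  # 1376256 13200
print(new_est, 2779024 - new_est)  # 2752512 26512
```

Under these assumptions the estimated weight bytes account for all but about 13 KB and 26 KB of the two files. That remainder would be consistent with a safetensors header that lists twice as many tensors in the new file, though the diff itself does not confirm the rank or dtype.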

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9140ac7f6d23fee1e3e95bd60a767520bd27f4d6aa0a344b307be95da4abdf74
 size 5368

 version https://git-lfs.github.com/spec/v1
+oid sha256:bf6360675e7abf87fa8f9a61ad128b4ea1d68558828af18deb49de9c0b07969b
 size 5368