Push classification fine-tuned model
- README.md +56 -56
- config.json +22 -24
- model.safetensors +2 -2
- training_args.bin +1 -1
README.md
CHANGED
|
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
|
|
| 16 |
|
| 17 |
This model is a fine-tuned version of [distilbert/distilgpt2](https://huggingface.co/distilbert/distilgpt2) on an unknown dataset.
|
| 18 |
It achieves the following results on the evaluation set:
|
| 19 |
-
- Loss: 0.
|
| 20 |
|
| 21 |
## Model description
|
| 22 |
|
|
@@ -35,9 +35,9 @@ More information needed
|
|
| 35 |
### Training hyperparameters
|
| 36 |
|
| 37 |
The following hyperparameters were used during training:
|
| 38 |
-
- learning_rate: 0.
|
| 39 |
-
- train_batch_size:
|
| 40 |
-
- eval_batch_size:
|
| 41 |
- seed: 42
|
| 42 |
- distributed_type: multi-GPU
|
| 43 |
- optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
|
|
@@ -48,58 +48,58 @@ The following hyperparameters were used during training:
|
|
| 48 |
|
| 49 |
### Training results
|
| 50 |
|
| 51 |
-| Training Loss | Epoch | Step
(51 further removed table lines, old README.md lines 52 to 102; only fragments such as "| 1." and "| 0." survive in this view)
|
| 103 |
|
| 104 |
|
| 105 |
### Framework versions
|
|
|
|
| 16 |
|
| 17 |
This model is a fine-tuned version of [distilbert/distilgpt2](https://huggingface.co/distilbert/distilgpt2) on an unknown dataset.
|
| 18 |
It achieves the following results on the evaluation set:
|
| 19 |
+
- Loss: 0.5728
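As a rough aid for reading the reported loss, the sketch below converts it into two more intuitive quantities. It assumes the loss is a mean single-label cross-entropy in nats, which the diff does not state, and it assumes 337 labels, inferred from the highest id (336) in the updated config.json and from ids being contiguous from 0. Both names `eval_loss` and `num_labels` are illustrative.

```python
import math

eval_loss = 0.5728   # final evaluation loss from the model card
num_labels = 337     # assumption: label ids 0..336 are contiguous in config.json

# Under mean cross-entropy in nats, exp(-loss) is the geometric-mean
# probability the model assigns to the correct label.
mean_correct_prob = math.exp(-eval_loss)

# Loss of a classifier that guesses uniformly over all labels.
uniform_loss = math.log(num_labels)

print(f"geometric-mean p(correct): {mean_correct_prob:.3f}")
print(f"uniform-guess loss:        {uniform_loss:.3f}")
```

If the model is actually trained with a multi-label objective, this reading does not apply.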
|
| 20 |
|
| 21 |
## Model description
|
| 22 |
|
|
|
|
| 35 |
### Training hyperparameters
|
| 36 |
|
| 37 |
The following hyperparameters were used during training:
|
| 38 |
+
- learning_rate: 0.005
|
| 39 |
+
- train_batch_size: 3072
|
| 40 |
+
- eval_batch_size: 1024
|
| 41 |
- seed: 42
|
| 42 |
- distributed_type: multi-GPU
|
| 43 |
- optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
|
|
|
|
| 48 |
|
| 49 |
### Training results
|
| 50 |
|
| 51 |
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| 8.921 | 1.0 | 9 | 7.9531 |
+| 6.1949 | 2.0 | 18 | 5.2070 |
+| 4.6541 | 3.0 | 27 | 4.0664 |
+| 3.6725 | 4.0 | 36 | 3.2637 |
+| 2.9716 | 5.0 | 45 | 2.6895 |
+| 2.4881 | 6.0 | 54 | 2.3086 |
+| 2.1437 | 7.0 | 63 | 2.0430 |
+| 1.8888 | 8.0 | 72 | 1.8438 |
+| 1.7087 | 9.0 | 81 | 1.6875 |
+| 1.5463 | 10.0 | 90 | 1.5508 |
+| 1.4124 | 11.0 | 99 | 1.4395 |
+| 1.3009 | 12.0 | 108 | 1.3467 |
+| 1.2058 | 13.0 | 117 | 1.2627 |
+| 1.1296 | 14.0 | 126 | 1.2031 |
+| 1.0643 | 15.0 | 135 | 1.1465 |
+| 0.9962 | 16.0 | 144 | 1.0898 |
+| 0.9387 | 17.0 | 153 | 1.0449 |
+| 0.8919 | 18.0 | 162 | 1.0049 |
+| 0.8522 | 19.0 | 171 | 0.9648 |
+| 0.8161 | 20.0 | 180 | 0.9346 |
+| 0.7829 | 21.0 | 189 | 0.8999 |
+| 0.7489 | 22.0 | 198 | 0.8784 |
+| 0.7249 | 23.0 | 207 | 0.8516 |
+| 0.6945 | 24.0 | 216 | 0.8311 |
+| 0.6763 | 25.0 | 225 | 0.8076 |
+| 0.6529 | 26.0 | 234 | 0.7915 |
+| 0.6309 | 27.0 | 243 | 0.7793 |
+| 0.6121 | 28.0 | 252 | 0.7617 |
+| 0.6009 | 29.0 | 261 | 0.7485 |
+| 0.5841 | 30.0 | 270 | 0.7314 |
+| 0.5598 | 31.0 | 279 | 0.7197 |
+| 0.5529 | 32.0 | 288 | 0.7085 |
+| 0.5378 | 33.0 | 297 | 0.6997 |
+| 0.522 | 34.0 | 306 | 0.6846 |
+| 0.5097 | 35.0 | 315 | 0.6650 |
+| 0.5017 | 36.0 | 324 | 0.6602 |
+| 0.4889 | 37.0 | 333 | 0.6567 |
+| 0.4795 | 38.0 | 342 | 0.6426 |
+| 0.4682 | 39.0 | 351 | 0.6396 |
+| 0.4646 | 40.0 | 360 | 0.6323 |
+| 0.4526 | 41.0 | 369 | 0.6226 |
+| 0.4474 | 42.0 | 378 | 0.6133 |
+| 0.4387 | 43.0 | 387 | 0.6040 |
+| 0.432 | 44.0 | 396 | 0.6064 |
+| 0.4258 | 45.0 | 405 | 0.6011 |
+| 0.4194 | 46.0 | 414 | 0.5938 |
+| 0.4113 | 47.0 | 423 | 0.5854 |
+| 0.4076 | 48.0 | 432 | 0.5850 |
+| 0.402 | 49.0 | 441 | 0.5703 |
+| 0.3934 | 50.0 | 450 | 0.5728 |
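The table can be checked mechanically. The sketch below, which copies only the last eight rows of the added table, finds the epoch with the lowest validation loss, lists the epochs where validation loss rose relative to the previous epoch, and derives the number of optimizer steps per epoch. The helper names `best_epoch` and `regressions` are hypothetical and not part of any library.

```python
# Last eight rows of the added table:
# (training_loss, epoch, step, validation_loss)
rows = [
    (0.4387, 43, 387, 0.6040),
    (0.4320, 44, 396, 0.6064),
    (0.4258, 45, 405, 0.6011),
    (0.4194, 46, 414, 0.5938),
    (0.4113, 47, 423, 0.5854),
    (0.4076, 48, 432, 0.5850),
    (0.4020, 49, 441, 0.5703),
    (0.3934, 50, 450, 0.5728),
]


def best_epoch(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


def regressions(rows):
    """Return epochs whose validation loss is higher than the previous epoch's."""
    return [
        current[1]
        for previous, current in zip(rows, rows[1:])
        if current[3] > previous[3]
    ]


# Steps divided by epochs should be constant if every epoch has the same length.
steps_per_epoch = {step // epoch for _, epoch, step, _ in rows}

print("best:", best_epoch(rows))
print("validation loss rose at epochs:", regressions(rows))
print("steps per epoch:", steps_per_epoch)
```

On these rows the final checkpoint (epoch 50, validation loss 0.5728, the value reported at the top of the card) is slightly worse than epoch 49 (0.5703), so whether the pushed weights are the last or the best checkpoint is worth confirming against the training configuration.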
|
| 103 |
|
| 104 |
|
| 105 |
### Framework versions
|
config.json
CHANGED
|
@@ -335,18 +335,17 @@
|
|
| 335 |
"323": "tumor_stage_n",
|
| 336 |
"324": "tumor_stage_t",
|
| 337 |
"325": "ultrasound_doppler_grade",
|
| 338 |
-"326": "
-"327": "
-"328": "
-"329": "
-"330": "
-"331": "
-"332": "
-"333": "
-"334": "
-"335": "
-"336": "
-"337": "worsening_heart_failure_start_date"
|
| 350 |
},
|
| 351 |
"initializer_range": 0.02,
|
| 352 |
"label2id": {
|
|
@@ -676,18 +675,17 @@
|
|
| 676 |
"tumor_stage_n": 323,
|
| 677 |
"tumor_stage_t": 324,
|
| 678 |
"ultrasound_doppler_grade": 325,
|
| 679 |
-"
-"
-"
-"
-"
-"
-"
-"
-"
-"
-"
-"worsening_heart_failure_start_date": 337
|
| 691 |
},
|
| 692 |
"layer_norm_epsilon": 1e-05,
|
| 693 |
"model_type": "gpt2",
|
|
|
|
| 335 |
"323": "tumor_stage_n",
|
| 336 |
"324": "tumor_stage_t",
|
| 337 |
"325": "ultrasound_doppler_grade",
|
| 338 |
+"326": "urine_albumin_creatinine_ratio",
+"327": "visit_end_date",
+"328": "visit_id",
+"329": "visit_start_date",
+"330": "visit_type",
+"331": "wbc_count",
+"332": "weight",
+"333": "worsening_heart_failure_episode_order",
+"334": "worsening_heart_failure_event_type",
+"335": "worsening_heart_failure_flag",
+"336": "worsening_heart_failure_start_date"
|
|
|
|
| 349 |
},
|
| 350 |
"initializer_range": 0.02,
|
| 351 |
"label2id": {
|
|
|
|
| 675 |
"tumor_stage_n": 323,
|
| 676 |
"tumor_stage_t": 324,
|
| 677 |
"ultrasound_doppler_grade": 325,
|
| 678 |
+"urine_albumin_creatinine_ratio": 326,
+"visit_end_date": 327,
+"visit_id": 328,
+"visit_start_date": 329,
+"visit_type": 330,
+"wbc_count": 331,
+"weight": 332,
+"worsening_heart_failure_episode_order": 333,
+"worsening_heart_failure_event_type": 334,
+"worsening_heart_failure_flag": 335,
+"worsening_heart_failure_start_date": 336
|
|
|
|
| 689 |
},
|
| 690 |
"layer_norm_epsilon": 1e-05,
|
| 691 |
"model_type": "gpt2",
|
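Because this commit shifts label ids (the last label moves from 337 to 336, and each hunk replaces 12 lines with 11), it is worth confirming that `id2label` and `label2id` still mirror each other. The following is a minimal sketch over an excerpt of the two maps as they appear in the updated config.json; the function name `maps_are_consistent` is hypothetical.

```python
# Excerpt of the updated maps (ids 323..336). JSON object keys are strings,
# so id2label keys are strings here, as they are in config.json.
id2label = {
    "323": "tumor_stage_n",
    "324": "tumor_stage_t",
    "325": "ultrasound_doppler_grade",
    "326": "urine_albumin_creatinine_ratio",
    "327": "visit_end_date",
    "328": "visit_id",
    "329": "visit_start_date",
    "330": "visit_type",
    "331": "wbc_count",
    "332": "weight",
    "333": "worsening_heart_failure_episode_order",
    "334": "worsening_heart_failure_event_type",
    "335": "worsening_heart_failure_flag",
    "336": "worsening_heart_failure_start_date",
}
label2id = {label: int(idx) for idx, label in id2label.items()}


def maps_are_consistent(id2label, label2id):
    """True when label2id is exactly the inverse of id2label."""
    inverse = {label: int(idx) for idx, label in id2label.items()}
    # Duplicate labels would collapse in the inverse and shrink it.
    return len(inverse) == len(id2label) and inverse == label2id


print(maps_are_consistent(id2label, label2id))

# A stale entry left over from the old numbering is detected.
stale = dict(label2id, worsening_heart_failure_start_date=337)
print(maps_are_consistent(id2label, stale))
```

In practice the same check would run over the full maps loaded from config.json rather than over this excerpt.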
model.safetensors
CHANGED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
-
size
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:3da2a50385230c5fbc8a06c831e6d9923d39a4e7d1a1622f21c93d585cfb33d1
|
| 3 |
+
size 164351968
|
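The model.safetensors entry in this diff is a Git LFS pointer, not the weights themselves: a small text file whose lines are `key value` pairs. As an illustration, the sketch below parses the updated pointer and reports the object size; `parse_lfs_pointer` is a hypothetical helper written for this example, not a Git LFS or Hugging Face API.

```python
import re

# Updated pointer contents for model.safetensors, as shown in the diff.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:3da2a50385230c5fbc8a06c831e6d9923d39a4e7d1a1622f21c93d585cfb33d1
size 164351968
"""


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer into its version, sha256 digest and byte size."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algorithm, digest = fields["oid"].split(":", 1)
    if algorithm != "sha256" or not re.fullmatch(r"[0-9a-f]{64}", digest):
        raise ValueError("unexpected oid format")
    return {
        "version": fields["version"],
        "oid": digest,
        "size": int(fields["size"]),
    }


info = parse_lfs_pointer(POINTER)
print(f"{info['size']} bytes, about {info['size'] / 1e6:.1f} MB")
```

The digest can be compared against a SHA-256 of the downloaded file to confirm that the weights match this commit.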
training_args.bin
CHANGED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
size 7416
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:20f6401f4171ba69319ed6ea406c608618cc9b7cff9c69647950762dfe6d6707
|
| 3 |
size 7416
|