Luca-Engel committed on
Commit 5c86529 · verified · 1 Parent(s): 548f4d9

do test run on scitas with ref_model

Files changed (2)
  1. README.md +21 -20
  2. model.safetensors +1 -1
README.md CHANGED
@@ -1,4 +1,5 @@
 ---
+license: mit
 base_model: mNLP-project/gpt2-finetuned-mcqa
 tags:
 - trl
@@ -16,15 +17,15 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [mNLP-project/gpt2-finetuned-mcqa](https://huggingface.co/mNLP-project/gpt2-finetuned-mcqa) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6371
-- Rewards/chosen: 1.8147
-- Rewards/rejected: 1.4746
-- Rewards/accuracies: 0.6429
-- Rewards/margins: 0.3401
-- Logps/rejected: -595.2877
-- Logps/chosen: -712.7159
-- Logits/rejected: 3.3478
-- Logits/chosen: 2.3916
+- Loss: 0.6310
+- Rewards/chosen: 1.4580
+- Rewards/rejected: 1.1845
+- Rewards/accuracies: 0.6414
+- Rewards/margins: 0.2735
+- Logps/rejected: -659.0944
+- Logps/chosen: -787.4795
+- Logits/rejected: -14.9328
+- Logits/chosen: -11.6364
 
 ## Model description
 
@@ -43,7 +44,7 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 1e-06
+- learning_rate: 1e-07
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
@@ -58,16 +59,16 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
-| 0.6185 | 0.9993 | 668 | 0.6383 | 1.3102 | 1.0521 | 0.6396 | 0.2580 | -599.5124 | -717.7612 | 3.2114 | 2.3318 |
-| 0.6482 | 2.0 | 1337 | 0.6605 | 1.4570 | 1.2176 | 0.6194 | 0.2394 | -597.8582 | -716.2932 | 3.5209 | 2.5720 |
-| 0.5926 | 2.9993 | 2005 | 0.6371 | 1.8147 | 1.4746 | 0.6429 | 0.3401 | -595.2877 | -712.7159 | 3.3478 | 2.3916 |
-| 0.5284 | 4.0 | 2674 | 0.6425 | 1.8648 | 1.5295 | 0.6276 | 0.3354 | -594.7390 | -712.2144 | 3.4174 | 2.4301 |
-| 0.4941 | 4.9993 | 3342 | 0.6490 | 2.1245 | 1.7548 | 0.6313 | 0.3697 | -592.4860 | -709.6179 | 3.7487 | 2.7230 |
-| 0.4608 | 6.0 | 4011 | 0.6507 | 2.0729 | 1.7055 | 0.6284 | 0.3675 | -592.9789 | -710.1334 | 3.8444 | 2.7879 |
-| 0.4424 | 6.9993 | 4679 | 0.6553 | 2.0245 | 1.6718 | 0.6295 | 0.3527 | -593.3158 | -710.6180 | 3.9476 | 2.8726 |
-| 0.4302 | 8.0 | 5348 | 0.6553 | 2.1030 | 1.7333 | 0.6306 | 0.3698 | -592.7012 | -709.8326 | 4.0016 | 2.9177 |
-| 0.4161 | 8.9993 | 6016 | 0.6564 | 2.1260 | 1.7538 | 0.6328 | 0.3722 | -592.4957 | -709.6025 | 4.0053 | 2.9198 |
-| 0.4051 | 9.9925 | 6680 | 0.6566 | 2.1259 | 1.7535 | 0.6321 | 0.3724 | -592.4987 | -709.6038 | 4.0114 | 2.9244 |
+| 0.6407 | 0.9993 | 668 | 0.6460 | 0.7721 | 0.6216 | 0.6295 | 0.1505 | -664.7236 | -794.3383 | -15.1273 | -11.7899 |
+| 0.6498 | 2.0 | 1337 | 0.6374 | 1.2927 | 1.0475 | 0.6325 | 0.2453 | -660.4651 | -789.1318 | -14.9517 | -11.6401 |
+| 0.6468 | 2.9993 | 2005 | 0.6342 | 1.3734 | 1.1102 | 0.6388 | 0.2632 | -659.8373 | -788.3249 | -14.9535 | -11.6481 |
+| 0.6113 | 4.0 | 2674 | 0.6332 | 1.3317 | 1.0769 | 0.6444 | 0.2548 | -660.1705 | -788.7426 | -14.9930 | -11.6897 |
+| 0.5826 | 4.9993 | 3342 | 0.6310 | 1.4580 | 1.1845 | 0.6414 | 0.2735 | -659.0944 | -787.4795 | -14.9328 | -11.6364 |
+| 0.5613 | 6.0 | 4011 | 0.6317 | 1.4979 | 1.2181 | 0.6407 | 0.2798 | -658.7584 | -787.0804 | -14.9234 | -11.6271 |
+| 0.581 | 6.9993 | 4679 | 0.6316 | 1.5084 | 1.2260 | 0.6437 | 0.2825 | -658.6798 | -786.9750 | -14.9319 | -11.6377 |
+| 0.571 | 8.0 | 5348 | 0.6320 | 1.4992 | 1.2184 | 0.6425 | 0.2808 | -658.7557 | -787.0676 | -14.9334 | -11.6373 |
+| 0.5943 | 8.9993 | 6016 | 0.6317 | 1.5126 | 1.2294 | 0.6437 | 0.2832 | -658.6454 | -786.9331 | -14.9226 | -11.6269 |
+| 0.5635 | 9.9925 | 6680 | 0.6317 | 1.5142 | 1.2308 | 0.6433 | 0.2835 | -658.6317 | -786.9168 | -14.9211 | -11.6256 |
 
 
 ### Framework versions
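
Two regularities in the README numbers can be checked directly. In both the old and the new version, the summary block ("Loss", "Rewards/chosen", ...) matches the table row with the lowest validation loss (step 2005 before, step 3342 after), and each "Rewards/margins" value equals "Rewards/chosen" minus "Rewards/rejected" up to rounding. The sketch below verifies this for the added rows only. The tuple layout, the helper names `best_row` and `margins_consistent`, and the tolerance are illustrative assumptions, not part of the commit; that the summary comes from best-checkpoint selection is an inference from the numbers, not something the commit states.

```python
# Rows copied from the added ("+") lines of the README table:
# (step, validation_loss, rewards_chosen, rewards_rejected, rewards_margins)
ROWS = [
    (668, 0.6460, 0.7721, 0.6216, 0.1505),
    (1337, 0.6374, 1.2927, 1.0475, 0.2453),
    (2005, 0.6342, 1.3734, 1.1102, 0.2632),
    (2674, 0.6332, 1.3317, 1.0769, 0.2548),
    (3342, 0.6310, 1.4580, 1.1845, 0.2735),
    (4011, 0.6317, 1.4979, 1.2181, 0.2798),
    (4679, 0.6316, 1.5084, 1.2260, 0.2825),
    (5348, 0.6320, 1.4992, 1.2184, 0.2808),
    (6016, 0.6317, 1.5126, 1.2294, 0.2832),
    (6680, 0.6317, 1.5142, 1.2308, 0.2835),
]


def best_row(rows):
    """Return the row with the lowest validation loss (hypothetical helper)."""
    return min(rows, key=lambda row: row[1])


def margins_consistent(rows, tol=2e-4):
    """Check margin == chosen - rejected for every row.

    The tolerance is an assumption: the table is rounded to four decimals,
    so differences of about 1e-4 are expected.
    """
    return all(
        abs((chosen - rejected) - margin) <= tol
        for _, _, chosen, rejected, margin in rows
    )


if __name__ == "__main__":
    step, loss, chosen, rejected, margin = best_row(ROWS)
    print(f"lowest validation loss {loss} at step {step}")
    print("margins consistent:", margins_consistent(ROWS))
```

The same check can be repeated for the removed rows by swapping in their values.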
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:29b5c39873c02a0a785a4ecffe743df33d396127cf87eb35a11073fb87187cb8
+oid sha256:709c372fa7291e9a81d46d9c732baa6bbf559619bd4cc5421efd2764dbfe2284
 size 497774208
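
The `model.safetensors` entry is a Git LFS pointer: `oid sha256:` is the SHA-256 digest of the real file's contents and `size` is its length in bytes, so here the weights changed while the file size stayed the same. A minimal sketch of checking a downloaded file against such a pointer follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the parser assumes the simple three-line `key value` layout shown in this diff.

```python
import hashlib
from pathlib import Path

# New pointer contents, copied from the "+" side of the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:709c372fa7291e9a81d46d9c732baa6bbf559619bd4cc5421efd2764dbfe2284
size 497774208
"""


def parse_lfs_pointer(text):
    """Parse 'key value' lines of an LFS pointer into a dict (hypothetical helper)."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Return True if the file at `path` has the size and digest the pointer records."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

Assuming the weights were downloaded to a local path, `matches_pointer("model.safetensors", parse_lfs_pointer(NEW_POINTER))` would report whether the file matches this commit rather than its parent.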