mrarish320 committed on
Commit e096aaa · verified · 1 Parent(s): 29d4706

End of training
README.md CHANGED
@@ -1,7 +1,7 @@
 ---
 library_name: transformers
-license: apache-2.0
-base_model: bert-base-uncased
+license: mit
+base_model: microsoft/phi-1_5
 tags:
 - generated_from_trainer
 model-index:
@@ -14,9 +14,9 @@ should probably proofread and complete it, then remove this comment. -->

 # results

-This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the None dataset.
+This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-1_5) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.8851
+- Loss: 0.8877

 ## Model description

@@ -35,25 +35,27 @@ More information needed
 ### Training hyperparameters

 The following hyperparameters were used during training:
-- learning_rate: 5e-05
-- train_batch_size: 8
-- eval_batch_size: 8
+- learning_rate: 1e-05
+- train_batch_size: 2
+- eval_batch_size: 2
 - seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 8
-- optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 500
-- num_epochs: 1
+- num_epochs: 2
+- mixed_precision_training: Native AMP

 ### Training results

 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.9405        | 1.0   | 50   | 0.8851          |
+| 6.0811        | 1.0   | 7    | 1.0800          |
+| 6.0811        | 1.8   | 12   | 0.8877          |


 ### Framework versions

-- Transformers 4.46.2
+- Transformers 4.47.1
 - Pytorch 2.5.1+cu121
-- Datasets 3.1.0
-- Tokenizers 0.20.3
+- Tokenizers 0.21.0

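The updated card lists both a per-device batch size and a `total_train_batch_size`; the latter is derived from the former rather than set directly. A minimal sketch of that arithmetic, assuming single-device training (the card does not state a device count):

```python
# Values taken from the updated model card diff above.
train_batch_size = 2             # per-device train batch size
gradient_accumulation_steps = 4

# Assumption: one device; nothing in the card indicates multi-GPU training.
num_devices = 1

# Effective batch size seen by the optimizer per update step,
# reported in the card as "total_train_batch_size: 8".
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 8
```

With only 8 effective samples per step, the 12 optimizer steps over 2 epochs shown in the training-results table imply a very small dataset, which is consistent with the "unknown dataset" wording.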
generation_config.json ADDED
@@ -0,0 +1,4 @@
+{
+  "_from_model_config": true,
+  "transformers_version": "4.47.1"
+}
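The added file is the minimal generation config exported alongside the model; `_from_model_config: true` indicates the generation settings were derived from the model config rather than set explicitly. Parsed with the standard library, it contains only two keys:

```python
import json

# Contents of the generation_config.json added in this commit (verbatim).
config_text = """{
  "_from_model_config": true,
  "transformers_version": "4.47.1"
}"""

config = json.loads(config_text)
print(sorted(config))                # ['_from_model_config', 'transformers_version']
print(config["_from_model_config"])  # True
```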
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:77766be96c509ea44c8da1c15402ef6a1f002679599e2e2c2975877fffd539b1
+oid sha256:a12095a1bf17aa4b24781a844d4473967a914ef1227b8bf0426809f47ad3900f
 size 4984916152
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0c91ef643478afe943b3df3720627d7795f285779f445567d2225463728d5c9d
+oid sha256:4c0b2480555bad2a5939b7f9f33d8905bfbff11a215682bf836f63e0ee479465
 size 688204064
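The two `.safetensors` diffs change Git LFS pointer files, not the weight blobs themselves: each pointer records a SHA-256 of the real file and its size in bytes, and only the hashes changed here (the sizes are identical, as expected for retrained weights of the same architecture). A small sketch that parses the updated pointer for the second shard — the parsing code is ours, not part of git-lfs:

```python
import re

# Updated LFS pointer for model-00002-of-00002.safetensors (verbatim from the diff).
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:4c0b2480555bad2a5939b7f9f33d8905bfbff11a215682bf836f63e0ee479465
size 688204064
"""

# A pointer file stores a 64-hex-digit SHA-256 of the actual blob plus its byte size.
match = re.search(r"^oid sha256:([0-9a-f]{64})$\n^size (\d+)$", pointer, re.MULTILINE)
oid, size_bytes = match.group(1), int(match.group(2))
print(len(oid))          # 64
print(size_bytes / 1e6)  # ~688.2 (MB)
```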