ChiJuiChen committed on
Commit e6cc979 · verified · 1 Parent(s): c7406cf

End of training

README.md CHANGED
@@ -1,56 +1,62 @@
- ---
- license: apache-2.0
- base_model: distilgpt2
- tags:
- - generated_from_trainer
- model-index:
- - name: lab7_model
-   results: []
- ---
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # lab7_model
-
- This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on an unknown dataset.
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 0.0005
- - train_batch_size: 4
- - eval_batch_size: 1
- - seed: 42
- - gradient_accumulation_steps: 8
- - total_train_batch_size: 32
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 1000
- - num_epochs: 1
- - mixed_precision_training: Native AMP
-
- ### Training results
-
-
-
- ### Framework versions
-
- - Transformers 4.40.1
- - Pytorch 2.0.0+cu118
- - Datasets 2.18.0
- - Tokenizers 0.19.1
+ ---
+ license: apache-2.0
+ base_model: distilgpt2
+ tags:
+ - generated_from_trainer
+ model-index:
+ - name: lab7_model
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # lab7_model
+
+ This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.7172
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0005
+ - train_batch_size: 16
+ - eval_batch_size: 1
+ - seed: 42
+ - gradient_accumulation_steps: 8
+ - total_train_batch_size: 128
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_steps: 1000
+ - num_epochs: 1
+ - mixed_precision_training: Native AMP
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:------:|:-----:|:---------------:|
+ | 2.7274 | 0.4614 | 5000 | 2.0313 |
+ | 1.887 | 0.9228 | 10000 | 1.7172 |
+
+
+ ### Framework versions
+
+ - Transformers 4.40.1
+ - Pytorch 2.2.0+cu121
+ - Datasets 2.19.0
+ - Tokenizers 0.19.1
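
The numbers in the updated model card can be cross-checked with a little arithmetic. The sketch below is not part of the commit. It assumes a single training device (so the total batch size is just `train_batch_size` times `gradient_accumulation_steps`), assumes the reported loss is a mean per-token cross-entropy in nats, and estimates the total step count from the logged step/epoch pair. The name `cosine_lr` and the variable names are hypothetical, and the schedule is a plain linear-warmup, half-cosine-decay formula rather than the Trainer's own code.

```python
import math

# Values copied from the updated model card.
train_batch_size = 16
gradient_accumulation_steps = 8
eval_loss = 1.7172
learning_rate = 5e-4
warmup_steps = 1000

# Assumption: one device, so there is no extra factor for device count.
effective_batch_size = train_batch_size * gradient_accumulation_steps

# Assumption: the loss is mean per-token cross-entropy in nats.
perplexity = math.exp(eval_loss)

# Step 10000 is logged at epoch 0.9228, which gives a rough total step count.
estimated_total_steps = round(10000 / 0.9228)


def cosine_lr(step: int, total_steps: int) -> float:
    """Linear warmup to the peak rate, then half-cosine decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch_size)
print(round(perplexity, 2))
print(cosine_lr(5000, estimated_total_steps))
```

Under these assumptions the effective batch size comes out to 128, matching `total_train_batch_size` in the card, and the final validation loss corresponds to a perplexity of roughly 5.57.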
generation_config.json CHANGED
@@ -1,6 +1,6 @@
- {
-   "_from_model_config": true,
-   "bos_token_id": 0,
-   "eos_token_id": 0,
-   "transformers_version": "4.40.1"
- }
+ {
+   "_from_model_config": true,
+   "bos_token_id": 0,
+   "eos_token_id": 0,
+   "transformers_version": "4.40.1"
+ }
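
The removed and added sides of this hunk show the same text, so the change is presumably in whitespace or line endings rather than in the settings themselves. As a hedged illustration (not part of the commit), the following sketch parses two byte-different renderings of the file as JSON and compares them. The variable names are hypothetical, and the CRLF conversion is only a stand-in for whatever byte-level change actually occurred.

```python
import json

# Text copied from the hunk.
old_side = """{
  "_from_model_config": true,
  "bos_token_id": 0,
  "eos_token_id": 0,
  "transformers_version": "4.40.1"
}"""

# Assumption: CRLF line endings stand in for the real byte-level difference.
new_side = old_side.replace("\n", "\r\n")

old_config = json.loads(old_side)
new_config = json.loads(new_side)

same_settings = old_config == new_config  # parsed settings are compared
same_bytes = old_side == new_side         # raw text is compared
print(same_settings, same_bytes)
```

A comparison like this separates a formatting-only change from a real change to the generation settings.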
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:2fabd17e47367694aa9785d293560817b8a9715fa3817c0afc74e62d102c8134
+ oid sha256:b8e633ff5ff843a0c5848c1494f7b4ff3e0332ac1c14b18243642dd46bcc0a17
  size 326868424
runs/May08_19-39-54_KugelblitzPC/events.out.tfevents.1715168395.KugelblitzPC.6980.2 CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:c9abe7706976c6c3acb4684a74cf2ea7c81ec6d30e97f69f38796cc278d59aa6
3
- size 5984
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:04336da8c45d12c14df2e5d45b1210a7ff08a57259097a4c99babc0add5ac9c6
3
+ size 6338
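
The last two files are stored as Git LFS pointers: a `version` line, an `oid` with the hash algorithm and digest, and a `size` in bytes. As a minimal sketch (not part of the commit), the following parses such a pointer and checks a downloaded file against it. The function names `parse_lfs_pointer` and `file_matches_pointer` are hypothetical helpers, and the pointer text is copied from the hunk above.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = dict(line.strip().split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def file_matches_pointer(path: Path, pointer: dict) -> bool:
    """Check a local file's size and hash against a parsed pointer."""
    hasher = hashlib.new(pointer["algorithm"])
    size = 0
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


new_events_pointer = parse_lfs_pointer(
    """version https://git-lfs.github.com/spec/v1
oid sha256:04336da8c45d12c14df2e5d45b1210a7ff08a57259097a4c99babc0add5ac9c6
size 6338"""
)
print(new_events_pointer["size"])
```

Calling `file_matches_pointer(Path("model.safetensors"), pointer)` on a downloaded copy would confirm that the local file is the one this commit points to, assuming the file sits at that path.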