add 4 epochs tuning

Files changed (6)

README.md +18 -55
config.json +1 -1
pytorch_model.bin +1 -1
tokenizer_config.json +1 -1
trainer_state.json +0 -0
training_args.bin +2 -2

README.md CHANGED

@@ -2,55 +2,19 @@
 license: apache-2.0
 tags:
 - generated_from_trainer
-- distilgpt2
-- email generation
-- email
-datasets:
-- aeslc
-- postbot/multi-emails-100k
-widget:
-- text: "Good Morning Professor Beans,
-Hope you are doing well. I just wanted to reach out and ask if differential calculus will be on the exam"
-  example_title: "email to prof"
-- text: "Hey <NAME>,\n\nThank you for signing up for my weekly newsletter. Before we get started, you'll have to confirm your email address."
-  example_title: "newsletter"
-- text: "Hi <NAME>,\n\nI hope this email finds you well. I wanted to reach out and ask about office hours"
-  example_title: "office hours"
-- text: "Greetings <NAME>,\n\nI hope you had a splendid evening at the Company sausage eating festival. I am reaching out because"
-  example_title: "festival"
-- text: "Good Morning Harold,\n\nI was wondering when the next"
-  example_title: "event"
-- text: "URGENT - I need the TPS reports"
-  example_title: "URGENT"
-- text: "Hi Archibald,\n\nI hope this email finds you extremely well."
-  example_title: "emails that find you"
-- text: "Hello there.\n\nI just wanted to reach out and check in to"
-  example_title: "checking in"
-- text: "Hello <NAME>,\n\nI hope this email finds you well. I wanted to reach out and see if you've enjoyed your time with us"
-  example_title: "work well"
-- text: "Hi <NAME>,\n\nI hope this email finds you well. I wanted to reach out and see if we could catch up"
-  example_title: "catch up"
-- text: "I'm <NAME> and I just moved into the area and wanted to reach out and get some details on where I could get groceries and"
-  example_title: "grocery"
-parameters:
-  min_length: 4
-  max_length: 128
-  length_penalty: 0.8
-  no_repeat_ngram_size: 2
-  do_sample: False
-  num_beams: 12
-  early_stopping: True
-  repetition_penalty: 2.5
 ---
-# distilgpt2-emailgen-v2
-This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on the postbot/multi-emails-100k dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.0401
 ## Model description
@@ -69,27 +33,26 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.001
-- train_batch_size: 8
-- eval_batch_size: 8
 - seed: 42
 - distributed_type: multi-GPU
-- gradient_accumulation_steps: 16
 - total_train_batch_size: 128
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
-- lr_scheduler_warmup_ratio: 0.02
-- num_epochs: 5
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 2.4393        | 1.0   | 789  | 2.3821          |
-| 2.1549        | 2.0   | 1578 | 2.1982          |
-| 2.1424        | 3.0   | 2367 | 2.1065          |
-| 1.9885        | 4.0   | 3156 | 2.0514          |
-| 1.806         | 5.0   | 3945 | 2.0401          |
 ### Framework versions

 license: apache-2.0
 tags:
 - generated_from_trainer
+model-index:
+- name: distilgpt2-emailgen-V2-emailgen_DS-multi-clean-100k_Ep-4_Bs-16
+  results: []
 ---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# distilgpt2-emailgen-V2-emailgen_DS-multi-clean-100k_Ep-4_Bs-16
+This model is a fine-tuned version of [postbot/distilgpt2-emailgen-V2](https://huggingface.co/postbot/distilgpt2-emailgen-V2) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.9126
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 0.0006
+- train_batch_size: 16
+- eval_batch_size: 16
 - seed: 42
 - distributed_type: multi-GPU
+- gradient_accumulation_steps: 8
 - total_train_batch_size: 128
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.01
+- num_epochs: 4
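
Note on the hyperparameter change: the per-device batch size doubles (8 to 16) while gradient accumulation halves (16 to 8), so the reported `total_train_batch_size` stays at 128. The sketch below checks that arithmetic. The helper name `effective_batch_size` is hypothetical, and the device count of 1 is an assumption inferred only from the fact that the two listed factors already multiply to 128; the diff does not state it.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Examples consumed per optimizer step.

    num_devices=1 is an assumption; the diff lists no device count.
    """
    return per_device_batch * grad_accum_steps * num_devices


# Values taken from the old and new hyperparameter lists in the diff.
old_total = effective_batch_size(per_device_batch=8, grad_accum_steps=16)
new_total = effective_batch_size(per_device_batch=16, grad_accum_steps=8)

# The results table logs 789 optimizer steps per epoch, which gives a rough
# upper bound on examples seen per epoch (the last step may be partial).
steps_per_epoch = 789
approx_examples_per_epoch = steps_per_epoch * new_total

print(old_total, new_total, approx_examples_per_epoch)
```

Under these assumptions, the per-epoch example count lands close to the "100k" in the dataset name, which suggests the same dataset size as the earlier run even though the new card lists the dataset as `None`.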
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 1.9045        | 1.0   | 789  | 2.0006          |
+| 1.8115        | 2.0   | 1578 | 1.9557          |
+| 1.8501        | 3.0   | 2367 | 1.9110          |
+| 1.7376        | 4.0   | 3156 | 1.9126          |
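
Note on the results table: validation loss bottoms out at epoch 3 (1.9110) and is marginally higher at epoch 4 (1.9126), which is the value the card reports. To compare the old and new checkpoints on a more intuitive scale, the loss can be converted to perplexity. This is a minimal sketch that assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in the diff. The function name `perplexity` is illustrative.

```python
import math


def perplexity(mean_cross_entropy_nats):
    """Perplexity for a mean per-token cross-entropy loss given in nats."""
    return math.exp(mean_cross_entropy_nats)


# Final evaluation losses from the old and new model cards in this diff.
old_ppl = perplexity(2.0401)
new_ppl = perplexity(1.9126)

# Fractional reduction in perplexity from the old checkpoint to the new one.
relative_drop = 1 - new_ppl / old_ppl

print(f"old: {old_ppl:.2f}  new: {new_ppl:.2f}  drop: {relative_drop:.1%}")
```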
 ### Framework versions

config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "distilgpt2",
   "_num_labels": 1,
   "activation_function": "gelu_new",
   "architectures": [

 {
+  "_name_or_path": "postbot/distilgpt2-emailgen-V2",
   "_num_labels": 1,
   "activation_function": "gelu_new",
   "architectures": [

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:264051a4107113d7cccbff8e4c2b9afcaa6208ac97db37edb57fa161cdd1d5dc
 size 333969117

 version https://git-lfs.github.com/spec/v1
+oid sha256:5f564a5fc55c8e20acf4d8195073ee3c5a0ce012ab28beb9c5bdf357fa1b26b7
 size 333969117
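
Note on the pointer change: `pytorch_model.bin` is stored through Git LFS, so the diff shows only the pointer file. The `oid` changes while `size` stays at 333969117 bytes, which is consistent with new weights for an unchanged architecture. The sketch below parses a pointer of this shape. `parse_lfs_pointer` is a hypothetical helper written for illustration, not part of any Git LFS tooling, and it handles only the three keys that appear in this diff.

```python
def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", split on the first space only.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the added side of the diff.
new_pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:5f564a5fc55c8e20acf4d8195073ee3c5a0ce012ab28beb9c5bdf357fa1b26b7
size 333969117
"""

parsed = parse_lfs_pointer(new_pointer)
print(parsed["hash_algo"], parsed["size"])
```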

tokenizer_config.json CHANGED

@@ -19,7 +19,7 @@
   },
   "errors": "replace",
   "model_max_length": 1024,
-  "name_or_path": "distilgpt2",
   "pad_token": null,
   "special_tokens_map_file": null,
   "tokenizer_class": "GPT2Tokenizer",

   },
   "errors": "replace",
   "model_max_length": 1024,
+  "name_or_path": "postbot/distilgpt2-emailgen-V2",
   "pad_token": null,
   "special_tokens_map_file": null,
   "tokenizer_class": "GPT2Tokenizer",

trainer_state.json CHANGED

The diff for this file is too large to render.

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:dde570ff07cb025ff9030c6c32fd2d0b180fd30e7ee1096d7ce8a33b3d149c7f
-size 3567

 version https://git-lfs.github.com/spec/v1
+oid sha256:d55d1b578701be043c5e54f68e065520d66dee6b2f36f998afd658361f854c85
+size 3631