ethzanalytics/distilgpt2-tiny-conversational

Text Generation

text-generation-inference

pszemraj committed on Feb 1, 2022

Commit ac37ba7 · 1 Parent(s): 575c037

Update README.md

Files changed (1)

README.md +7 -8

@@ -32,16 +32,14 @@ inference:
     min_length: 2
     max_length: 64
     length_penalty: 0.7
-    no_repeat_ngram_size: 3
+    no_repeat_ngram_size: 2
     do_sample: True
-    top_p: 0.90
-    top_k: 15
+    top_p: 0.95
+    top_k: 30
     repetition_penalty: 2.1
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 # distilgpt2-tiny-conversational
@@ -55,15 +53,16 @@ It achieves the following results on the evaluation set:
 ## Intended uses & limitations
-More information needed
+- [ai-msgbot](https://github.com/pszemraj/ai-msgbot)
 ## Training and evaluation data
-More information needed
+- [wizard of Wikipedia](https://parl.ai/projects/wizard_of_wikipedia/) parsed, from parlAI
 ## Training procedure
-- deepspeed
+- deepspeed + huggingface trainer, an example notebook is in [ai-msgbot](https://github.com/pszemraj/ai-msgbot)
 ### Training hyperparameters
 The following hyperparameters were used during training:
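
Note on the updated `inference` parameters: the values this commit sets for the hosted widget can also be passed to a local `transformers` text-generation pipeline. The sketch below is a minimal illustration, not the author's own script. The `GENERATION_KWARGS` dict and the `generate_reply` helper are hypothetical names introduced here, and the sketch assumes `transformers` and a backend such as PyTorch are installed and that the model is reachable on the Hub under the repo id shown on this page.

```python
# Generation settings copied from the post-commit `inference.parameters` block.
# The dict name is an illustrative choice, not part of the model card.
GENERATION_KWARGS = {
    "min_length": 2,
    "max_length": 64,
    # length_penalty only influences beam-based decoding; it is kept here
    # solely to mirror the model card values.
    "length_penalty": 0.7,
    "no_repeat_ngram_size": 2,   # was 3 before this commit
    "do_sample": True,
    "top_p": 0.95,               # was 0.90 before this commit
    "top_k": 30,                 # was 15 before this commit
    "repetition_penalty": 2.1,
}


def generate_reply(
    prompt: str,
    model_id: str = "ethzanalytics/distilgpt2-tiny-conversational",
) -> str:
    """Hypothetical helper: sample one continuation of `prompt`."""
    # Imported lazily so the settings above can be inspected without
    # transformers installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    outputs = generator(prompt, **GENERATION_KWARGS)
    # The text-generation pipeline returns a list of dicts with a
    # "generated_text" key (prompt plus continuation by default).
    return outputs[0]["generated_text"]
```

Calling `generate_reply("Hello, how are you?")` downloads the model on first use. Because `do_sample` is `True`, the output differs between runs unless a seed is fixed.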