olumideola committed on
Commit cb2c02b · verified · 1 Parent(s): 1a62942

Update README.md

Files changed (1)
  1. README.md +29 -0
README.md CHANGED
@@ -88,6 +88,35 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ---

+ ## Recommended Generation Settings
+
+ These settings were verified through testing. Without `repetition_penalty`
+ and `min_p`, the model tends to ramble and does not stop cleanly.
+
+ ```python
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=1024,
+     do_sample=True,
+     temperature=0.7,
+     top_p=0.95,
+     min_p=0.05,
+     repetition_penalty=1.5,
+     eos_token_id=[128040, 128009, 128001],
+     pad_token_id=128001,
+ )
+ ```
+
+ ### Stop Tokens
+ The ChatML stop token (`<|im_end|>`) inherited from this model's Hermes/Nemotron
+ parents survived the DARE+TIES merge alongside the native Llama 3.1 stop tokens.
+ Pass all three as `eos_token_id`:
+
+ | Token | ID | Source |
+ |---|---|---|
+ | `<\|im_end\|>` | 128040 | Hermes/Nemotron parents |
+ | `<\|eot_id\|>` | 128009 | Llama 3.1 native |
+ | `<\|end_of_text\|>` | 128001 | Llama 3.1 native |
+
  ## License

  [Llama 3.1 Community License](https://llama.meta.com/llama3/license/)
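
Note on the hard-coded IDs: the commit passes the three stop-token IDs as literals. A possible safeguard, sketched below, is to look them up in the tokenizer's vocabulary and fail loudly if they differ from the table in the diff. This is only an illustration, not part of the commit: `resolve_stop_ids` and `generation_kwargs` are hypothetical helper names, and the sketch assumes the vocabulary is available as a plain token-to-ID mapping (for example from the tokenizer's `get_vocab()`).

```python
from typing import Any, Dict, List, Mapping

# Stop tokens and their IDs exactly as listed in the README table.
EXPECTED_STOP_TOKENS: Dict[str, int] = {
    "<|im_end|>": 128040,       # ChatML, from the Hermes/Nemotron parents
    "<|eot_id|>": 128009,       # Llama 3.1 native
    "<|end_of_text|>": 128001,  # Llama 3.1 native
}


def resolve_stop_ids(vocab: Mapping[str, int]) -> List[int]:
    """Return the stop-token IDs found in `vocab`, in README table order.

    Raises ValueError if a token is missing or maps to an ID other than
    the one the README documents.
    """
    ids: List[int] = []
    for token, expected in EXPECTED_STOP_TOKENS.items():
        actual = vocab.get(token)
        if actual is None:
            raise ValueError(f"stop token {token!r} is not in the vocabulary")
        if actual != expected:
            raise ValueError(
                f"{token!r} has ID {actual}, but the README lists {expected}"
            )
        ids.append(actual)
    return ids


def generation_kwargs(vocab: Mapping[str, int]) -> Dict[str, Any]:
    """Build the README's recommended settings with verified stop IDs."""
    stop_ids = resolve_stop_ids(vocab)
    return {
        "max_new_tokens": 1024,
        "do_sample": True,
        "temperature": 0.7,
        "top_p": 0.95,
        "min_p": 0.05,
        "repetition_penalty": 1.5,
        "eos_token_id": stop_ids,
        "pad_token_id": vocab["<|end_of_text|>"],
    }
```

Assuming `model`, `tokenizer`, and `inputs` are set up as in the README's earlier snippet, the call would then read `model.generate(**inputs, **generation_kwargs(tokenizer.get_vocab()))`.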