Update README.md
README.md (changed)
@@ -660,7 +660,7 @@ messages = [
 
 output = generate_sample(
     messages=messages,
-    max_new_tokens=256, temperature=0.
+    max_new_tokens=256, temperature=0.4, top_k=50, top_p=1,
 )
 ```
 
@@ -739,7 +739,7 @@ messages = [
 
 output = generate_sample(
     messages=messages,
-    max_new_tokens=256, temperature=0.
+    max_new_tokens=256, temperature=0.4, top_k=50, top_p=1,
 )
 ```
 
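Note: the two hunks above make the same change to a `generate_sample` call in two places, adding explicit sampling arguments. `generate_sample` itself is defined earlier in the README and does not appear in this diff; below is a minimal sketch of what such a helper might look like, assuming a standard `transformers` chat model. The checkpoint name and defaults are placeholders, not taken from the diff.

```python
# Hypothetical sketch only: the README's real generate_sample is not shown in
# this diff. The checkpoint name and defaults are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("your-org/your-model")  # placeholder
model = AutoModelForCausalLM.from_pretrained("your-org/your-model").cuda()  # placeholder

def generate_sample(messages, max_new_tokens=256, temperature=0.4, top_k=50, top_p=1.0):
    # Render the chat history with the tokenizer's chat template.
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,        # enables temperature/top_k/top_p sampling
            temperature=temperature,
            top_k=top_k,
            top_p=top_p,
        )
    return tokenizer.batch_decode(outputs)[0]
```

With `top_p=1`, nucleus filtering is effectively disabled, so `temperature` and `top_k` do the work in these call sites.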
@@ -804,7 +804,7 @@ In addition, there are some things you need to know before using as follows:
 
 #### Generation configuration
 
-The **temperature** affects the truth of the answer. Setting a **temperature** value greater than 0.2 will result in a more creative answer but may affect the accuracy of the answer, please consider this based on your task.
+The **temperature** affects the factual accuracy of the answer. Setting a **temperature** above the 0.2 - 0.4 range will produce a more creative answer but may reduce its accuracy; weigh this trade-off based on your task.
 
 Hint: you can write a prompt to receive input and ask the model to choose the appropriate temperature based on the question, useful in the case of virtual assistant development.
 
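Note: the hint above can be made concrete as a two-stage call: first ask the model, at low temperature, to pick a temperature for the incoming question, then answer with it. A sketch, assuming the `generate_sample` helper from the earlier hunks; the router prompt and the output parsing are illustrative assumptions, not part of the README.

```python
# Illustrative sketch of the hint above; the router prompt and the parsing
# are assumptions, not part of the README.
ROUTER_PROMPT = (
    "Reply with a single number only: 0.2 if the question below needs a "
    "factual answer, 0.4 if it is open-ended or creative.\n\nQuestion: {q}"
)

def answer(question):
    routed = generate_sample(
        messages=[{"role": "user", "content": ROUTER_PROMPT.format(q=question)}],
        max_new_tokens=8, temperature=0.2,
    )
    try:
        # Naive parsing: take the last number-like token of the decoded text
        # and clamp it to a sane range.
        temperature = min(max(float(routed.strip().split()[-1].strip(".,")), 0.1), 1.0)
    except (ValueError, IndexError):
        temperature = 0.2  # safe default favouring accuracy
    return generate_sample(
        messages=[{"role": "user", "content": question}],
        max_new_tokens=256, temperature=temperature, top_k=50, top_p=1,
    )
```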
@@ -856,7 +856,7 @@ For direct use with `transformers`, you can easily get started with the following steps:
 inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
 for k,v in inputs.items():
     inputs[k] = v.cuda()
-outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_k=50, top_p=0.95, temperature=0.
+outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_k=50, top_p=0.95, temperature=0.4)
 results = tokenizer.batch_decode(outputs)[0]
 print(results)
 ```
@@ -894,7 +894,7 @@ For direct use with `transformers`, you can easily get started with the following steps:
 inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
 for k,v in inputs.items():
     inputs[k] = v.cuda()
-outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_k=50, top_p=0.95, temperature=0.
+outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_k=50, top_p=0.95, temperature=0.4)
 results = tokenizer.batch_decode(outputs)[0]
 print(results)
 
@@ -936,7 +936,7 @@ For direct use with `unsloth`, you can easily get started with the following steps:
 inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
 for k,v in inputs.items():
     inputs[k] = v.cuda()
-outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_k=50, top_p=0.95, temperature=0.
+outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_k=50, top_p=0.95, temperature=0.4)
 results = tokenizer.batch_decode(outputs)[0]
 print(results)
 ```
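Note: the three hunks above (README lines 859, 897, and 939) apply the same change to identical snippets, setting an explicit `temperature=0.4` in the `model.generate` call. For context, here is a self-contained sketch of the full snippet; the checkpoint name and prompt are placeholders, since the surrounding README lines are outside these hunks.

```python
# Self-contained sketch of the snippet these hunks modify. The checkpoint name
# and prompt are placeholders; only the generate/decode lines come from the diff.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-model"  # placeholder, not from the diff
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).cuda()

prompt = "..."  # built by the README's earlier prompt-formatting step (not shown)
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
for k, v in inputs.items():
    inputs[k] = v.cuda()

outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True,
                         top_k=50, top_p=0.95, temperature=0.4)
results = tokenizer.batch_decode(outputs)[0]
print(results)
```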