aashish1904 commited on
Commit
7948778
·
verified ·
1 Parent(s): b0d1165

Upload README.md with huggingface_hub

Browse files
Files changed (1) hide show
  1. README.md +46 -0
README.md ADDED
@@ -0,0 +1,46 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+
2
+ ---
3
+
4
+ base_model: unsloth/Meta-Llama-3.1-8B-bnb-4bit
5
+ language:
6
+ - en
7
+ license: apache-2.0
8
+ tags:
9
+ - text-generation-inference
10
+ - transformers
11
+ - unsloth
12
+ - llama
13
+ - trl
14
+
15
+ ---
16
+
17
+ ![](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)
18
+
19
+ # QuantFactory/Reflection-Llama-3.1-8B-GGUF
20
+ This is quantized version of [terrycraddock/Reflection-Llama-3.1-8B](https://huggingface.co/terrycraddock/Reflection-Llama-3.1-8B) created using llama.cpp
21
+
22
+ # Original Model Card
23
+
24
+
25
+ # Uploaded model
26
+
27
+ - **Developed by:** terrycraddock
28
+ - **License:** apache-2.0
29
+ - **Finetuned from model :** unsloth/Meta-Llama-3.1-8B-bnb-4bit
30
+
31
+ *** Currently Re-Training Model on multiple epochs of the data set to get a better loss rate. I will remove this notice when I upload the new version ***
32
+
33
+ Trained from unsloth/Meta-Llama-3.1-8B-bnb-4bit, you can sample from Reflection Llama-3.1 8B using the same code, pipelines, etc. as any other Llama model. It even uses the stock Llama 3.1 chat template format (though, we've trained in a few new special tokens to aid in reasoning and reflection).
34
+
35
+ During sampling, the model will start by outputting reasoning inside <thinking> and </thinking> tags, and then once it is satisfied with its reasoning, it will output the final answer inside <output> and </output> tags. Each of these tags are special tokens, trained into the model.
36
+
37
+ This enables the model to separate its internal thoughts and reasoning from its final answer, improving the experience for the user.
38
+
39
+ Inside the <thinking> section, the model may output one or more <reflection> tags, which signals the model has caught an error in its reasoning and will attempt to correct it before providing a final answer.
40
+
41
+ This model was finetuned on one epoch of https://huggingface.co/datasets/mahiatlinux/Reflection-Dataset-v2 .
42
+
43
+ This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
44
+
45
+ [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
46
+