QuantFactory
/

Reflection-Llama-3.1-8B-GGUF

+---
+base_model: unsloth/Meta-Llama-3.1-8B-bnb-4bit
+language:
+- en
+license: apache-2.0
+tags:
+- text-generation-inference
+- transformers
+- unsloth
+- llama
+- trl
+---
+![](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)
+# QuantFactory/Reflection-Llama-3.1-8B-GGUF
+This is quantized version of [terrycraddock/Reflection-Llama-3.1-8B](https://huggingface.co/terrycraddock/Reflection-Llama-3.1-8B) created using llama.cpp
+# Original Model Card
+# Uploaded  model
+- **Developed by:** terrycraddock
+- **License:** apache-2.0
+- **Finetuned from model :** unsloth/Meta-Llama-3.1-8B-bnb-4bit
+*** Currently Re-Training Model on multiple epochs of the data set to get a better loss rate. I will remove this notice when I upload the new version ***
+Trained from unsloth/Meta-Llama-3.1-8B-bnb-4bit, you can sample from Reflection Llama-3.1 8B using the same code, pipelines, etc. as any other Llama model. It even uses the stock Llama 3.1 chat template format (though, we've trained in a few new special tokens to aid in reasoning and reflection).
+During sampling, the model will start by outputting reasoning inside <thinking> and </thinking> tags, and then once it is satisfied with its reasoning, it will output the final answer inside <output> and </output> tags. Each of these tags are special tokens, trained into the model.
+This enables the model to separate its internal thoughts and reasoning from its final answer, improving the experience for the user.
+Inside the <thinking> section, the model may output one or more <reflection> tags, which signals the model has caught an error in its reasoning and will attempt to correct it before providing a final answer.
+This model was finetuned on one epoch of https://huggingface.co/datasets/mahiatlinux/Reflection-Dataset-v2 .
+This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
+[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)