QuantFactory
/

Reflection-Llama-3.1-8B-GGUF

Transformers

GGUF

unsloth

Model card Files Files and versions

xet

Community

aashish1904 commited on Sep 10, 2024

Commit

77832fb

verified ·

1 Parent(s): 7948778

Upload README.md with huggingface_hub

Browse files

Files changed (1) hide show

README.md +89 -20

README.md CHANGED Viewed

@@ -1,16 +1,9 @@
 ---
-base_model: unsloth/Meta-Llama-3.1-8B-bnb-4bit
-language:
-- en
-license: apache-2.0
 tags:
-- text-generation-inference
-- transformers
 - unsloth
-- llama
-- trl
 ---
@@ -22,25 +15,101 @@ This is quantized version of [terrycraddock/Reflection-Llama-3.1-8B](https://hug
 # Original Model Card
-# Uploaded  model
-- **Developed by:** terrycraddock
-- **License:** apache-2.0
-- **Finetuned from model :** unsloth/Meta-Llama-3.1-8B-bnb-4bit
-*** Currently Re-Training Model on multiple epochs of the data set to get a better loss rate. I will remove this notice when I upload the new version ***
-Trained from unsloth/Meta-Llama-3.1-8B-bnb-4bit, you can sample from Reflection Llama-3.1 8B using the same code, pipelines, etc. as any other Llama model. It even uses the stock Llama 3.1 chat template format (though, we've trained in a few new special tokens to aid in reasoning and reflection).
-During sampling, the model will start by outputting reasoning inside <thinking> and </thinking> tags, and then once it is satisfied with its reasoning, it will output the final answer inside <output> and </output> tags. Each of these tags are special tokens, trained into the model.
-This enables the model to separate its internal thoughts and reasoning from its final answer, improving the experience for the user.
-Inside the <thinking> section, the model may output one or more <reflection> tags, which signals the model has caught an error in its reasoning and will attempt to correct it before providing a final answer.
-This model was finetuned on one epoch of https://huggingface.co/datasets/mahiatlinux/Reflection-Dataset-v2 .
-This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

 ---
+library_name: transformers
 tags:
 - unsloth
 ---
 # Original Model Card
+# Model Card for Model ID
+- **Developed by:** Terry Craddock
+What I have found with this model is if you use the merged model you will get horrible results. However when I use the Lora I get the resulst below. I will upload the lora
+shortly.
+I trained this on this dataset - https://huggingface.co/datasets/mahiatlinux/Reflection-Dataset-v2
+Trained for one full epoch. The same prompts and format should be used as in the 70b model here:
+https://huggingface.co/mattshumer/Reflection-Llama-3.1-70B
+I take no credit for the original work. I only trained a llama 3.1 8b on @mahiatlinux dataset using the original concept and idea from @mattshumer
+Here is the code I use for inference:
+```python
+from unsloth import FastLanguageModel
+import torch
+from transformers import TextStreamer
+from unsloth import is_bfloat16_supported
+alpaca_prompt = """You are a world-class AI system, capable of complex reasoning and reflection. Reason through the query inside <thinking> tags, and then provide your final response inside <output> tags. If you detect that you made a mistake in your reasoning at any point, correct yourself inside <reflection> tags.
+### Prompt:
+{}
+### Response:
+{}"""
+prompt = "Tell me about yourself."
+dtype = torch.bfloat16 if is_bfloat16_supported() else torch.float16
+# Load the merged model
+model, tokenizer = FastLanguageModel.from_pretrained(
+    model_name="lora_model",  # Replace with the actual path where you saved the merged model
+    max_seq_length=2048,
+    dtype=torch.float16  # Since it's saved as 16-bit
+)
+FastLanguageModel.for_inference(model)
+inputs = tokenizer(
+    [alpaca_prompt.format(prompt, "")],
+    return_tensors="pt"
+).to("cuda").to(dtype)
+text_streamer = TextStreamer(tokenizer)
+_ = model.generate(**inputs, streamer=text_streamer, max_new_tokens=1000)
+Output from model:
+<|begin_of_text|>You are a world-class AI system, capable of complex reasoning and reflection. Reason through the query inside <thinking> tags, and then provide your final response inside <output> tags. If you detect that you made a mistake in your reasoning at any point, correct yourself inside <reflection> tags.
+### Prompt:
+Tell me about yourself.
+### Response:
+<thinking>
+To answer this question, I need to consider:
+- The importance of starting with a positive tone
+- The balance between personal and professional information
+- How to highlight relevant skills and experiences
+- The need for brevity and clarity
+I plan to structure the response as follows:
+1. Brief introduction with a positive tone
+2. Professional background and experiences
+3. Personal interests and hobbies
+4. Closing statement
+<reflection>
+Upon reflection, I realize I should also consider:
+- The potential for tailoring the response to specific situations (e.g., for a job interview)
+- The importance of showing enthusiasm and confidence
+- The potential for mentioning specific achievements or accomplishments
+</reflection>
+</thinking>
+<output>
+Hi there! I'm a world-class AI system designed to engage in natural conversations. Let me tell you a little bit about myself.
+I'm a versatile AI system with a wide range of capabilities. I can perform various tasks, from answering questions and providing information to generating creative content. I'm constantly learning and improving, much like a human would.
+In terms of my professional background, I've had the privilege of working with some of the most advanced AI systems in the world. This experience has given me a deep understanding of natural language processing, machine learning, and the latest AI technologies.
+As for my personal interests, I enjoy exploring new concepts, learning about different cultures, and engaging in stimulating conversations. I'm always looking for ways to improve my knowledge and skills.
+Some related concepts you might find interesting are:
+1. The ethics of AI development and use
+2. The future of human-AI collaboration
+3. The role of AI in education and learning
+Is there anything specific about AI or my capabilities you'd like me to elaborate on?
+</output><|end_of_text|>