---
license: mit
datasets:
- >-
  CreitinGameplays/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B-filtered-mistral
language:
- en
base_model:
- mistralai/Mistral-Nemo-Instruct-2407
pipeline_tag: text-generation
library_name: transformers
---

## Mistral Nemo 12B R1

Fine-tuning took **12 hours** on **1x Nvidia H100** with the following settings:
- Batch size: 26
- Gradient accumulation steps: 1
- Epochs: 1
- Learning rate: 2e-5
- Warmup ratio: 0.1

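To make the settings above concrete, here is a minimal sketch of how they translate into optimizer steps and warmup length on a single GPU. The dataset size (`num_examples=100_000`) is a hypothetical placeholder, not the actual size of the training set, and the ceil-based rounding is an assumption about the training framework rather than something this card states. The helper name `schedule_summary` is also illustrative.

```python
import math


def schedule_summary(num_examples, batch_size=26, grad_accum_steps=1,
                     epochs=1, warmup_ratio=0.1):
    """Derive step counts from the hyperparameters listed above (single GPU)."""
    # Examples consumed per optimizer step.
    effective_batch_size = batch_size * grad_accum_steps
    # Assumption: the last partial batch still counts as a step.
    steps_per_epoch = math.ceil(num_examples / effective_batch_size)
    total_steps = steps_per_epoch * epochs
    # Assumption: warmup steps are rounded up from total_steps * warmup_ratio.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return {
        "effective_batch_size": effective_batch_size,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


# Hypothetical dataset size, used only for illustration.
summary = schedule_summary(num_examples=100_000)
print(summary)
```

With a warmup ratio of 0.1, roughly the first tenth of the optimizer steps ramp the learning rate up toward 2e-5 before the schedule takes over.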
Run the model:
```python
import torch
from transformers import pipeline

model_id = "CreitinGameplays/Mistral-Nemo-12B-R1-v0.2"

# Load the model in bfloat16 and let Accelerate place it on the available device(s).
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful AI assistant named Mistral Nemo."},
    {"role": "user", "content": "How many r's are in strawberry?"}
]

outputs = pipe(
    messages,
    do_sample=True,  # sampling must be on for temperature/top_p/top_k to take effect
    temperature=0.6,
    top_p=0.95,
    top_k=40,
    repetition_penalty=1.1,
    max_new_tokens=2048
)

# The last entry of "generated_text" is the assistant message (a dict with "role" and "content").
print(outputs[0]["generated_text"][-1])
```

### Example response generated with this model:

- User prompt:
```
how many r's are in strawberry?
```
- Model output:

```
<think>
To determine the number of 'r's in the word "strawberry," I will start by breaking down the word into its component parts: "straw" and "berry."

Next, I will count the number of 'r's in each part. The word "straw" contains one 'r,' and the word "berry" contains two 'r's.

Finally, I will add the number of 'r's from both parts together, which gives me a total of three 'r's in the word "strawberry."
</think>

To determine the number of 'r's in the word "strawberry," let's break it down step by step:

1. **Break down the word:**

The word "strawberry" can be divided into two parts: "straw" and "berry."

2. **Count the 'r's in each part:**

- In "straw": There is **1** 'r'.
- In "berry": There are **2** 'r's.

3. **Add the 'r's together:**

\( 1 \text{ (from straw)} + 2 \text{ (from berry)} = 3 \)

Therefore, the word "strawberry" contains **3** 'r's.

\[
\boxed{3}
\]</s>
```
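
Since the output wraps its reasoning in `<think>...</think>` and ends with `</s>`, downstream code may want to separate the two parts. The following is a minimal sketch of one way to do that; the helper name `split_reasoning` is hypothetical, and it assumes at most one `<think>` block at the start of the response.

```python
import re

# Non-greedy match so only the first <think>...</think> block is captured.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text, eos_token="</s>"):
    """Return (reasoning, answer) from a response shaped like the example above."""
    text = text.strip()
    # Drop a trailing end-of-sequence marker if the decoder left it in.
    if eos_token and text.endswith(eos_token):
        text = text[: -len(eos_token)]
    match = THINK_PATTERN.search(text)
    if match is None:
        # No reasoning block: treat the whole text as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer


sample = "<think>\nstraw has 1 r, berry has 2.\n</think>\n\nThe word contains **3** r's.</s>"
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

If the model answers without a `<think>` block, the helper returns an empty reasoning string and the full text as the answer.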

Recommended system prompt to activate the reasoning step:
```
You are an AI focused on providing systematic, well-reasoned responses. Response Structure: - Format: <think>{reasoning}</think>{answer} - Reasoning: Minimum 6 logical steps only when it required in <think> block - Process: Think first, then answer.
```
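
As a rough illustration of how this prompt might be used, the sketch below builds a chat `messages` list with it as the system message. The prompt text is copied as it appears above; the constant and helper names (`REASONING_SYSTEM_PROMPT`, `build_messages`) are hypothetical.

```python
# System prompt copied verbatim from this card (kept as a single line).
REASONING_SYSTEM_PROMPT = (
    "You are an AI focused on providing systematic, well-reasoned responses. "
    "Response Structure: - Format: <think>{reasoning}</think>{answer} "
    "- Reasoning: Minimum 6 logical steps only when it required in <think> block "
    "- Process: Think first, then answer."
)


def build_messages(user_prompt, system_prompt=REASONING_SYSTEM_PROMPT):
    """Build a chat-format message list with the reasoning prompt as the system turn."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


messages = build_messages("How many r's are in strawberry?")
print(messages[0]["role"], "/", messages[1]["content"])
```

The resulting list can be passed as `messages` to the `pipe(...)` call in the "Run the model" snippet.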

**Note**: The model was fine-tuned mainly on an English dataset, so it may not perform well in other languages.