---
license: mit
datasets:
- CreitinGameplays/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70Bmistral
language:
- en
base_model:
- mistralai/Mistral-Nemo-Instruct-2407
pipeline_tag: text-generation
library_name: transformers
---

## Mistral Nemo 12B R1
Fine-tuning took **96 hours** on **2x NVIDIA RTX A6000** GPUs with the following settings (a sketch of the corresponding training setup follows the list):
- Batch size: 3
- Gradient accumulation steps: 1
- Epochs: 1
- Learning rate: 1e-4
- Warmup ratio: 0.1
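The hyperparameters above could be reproduced with a script along these lines. This is a minimal, hypothetical sketch assuming TRL's `SFTTrainer` and a dataset format it can consume directly (e.g. a conversational `messages` column); the dataset and base-model IDs come from the card metadata, while the output path, the `bf16` flag, and the choice of TRL itself are assumptions, since the actual training script is not published.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Dataset listed in the card metadata.
dataset = load_dataset(
    "CreitinGameplays/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70Bmistral",
    split="train",
)

# Hyperparameters as listed above; everything else is assumed.
config = SFTConfig(
    output_dir="mistral-nemo-12b-r1",  # hypothetical output path
    per_device_train_batch_size=3,     # batch size: 3
    gradient_accumulation_steps=1,
    num_train_epochs=1,
    learning_rate=1e-4,
    warmup_ratio=0.1,
    bf16=True,                         # assumed, given Ampere-class GPUs
)

trainer = SFTTrainer(
    model="mistralai/Mistral-Nemo-Instruct-2407",  # base model from the card
    train_dataset=dataset,
    args=config,
)
trainer.train()
```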
Run the model:

```python
import torch
from transformers import pipeline

model_id = "CreitinGameplays/Mistral-Nemo-12B-R1-v0.1"

# Load the weights in bfloat16 and let Accelerate place them on available devices.
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Chat-formatted input; the pipeline applies the model's chat template.
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "How many r's are in strawberry?"}
]

outputs = pipe(
    messages,
    do_sample=True,  # sampling must be enabled for temperature to take effect
    temperature=0.4,
    repetition_penalty=1.1,
    max_new_tokens=2048
)

# generated_text holds the whole conversation; the last entry is the model's reply.
print(outputs[0]["generated_text"][-1])
```