This model is a fine-tuned version of Llama-3.2-3B-Instruct, specialized in solving mathematical problems using a reflective, iterative reasoning process. It was trained using Unsloth to mimic human-like stream-of-consciousness thinking, similar to the DeepSeek-R1 approach.
It was fine-tuned on the OpenR1-Math-220k dataset to handle algebraic and arithmetic logic.

The model was trained for 60 steps on a single NVIDIA T4 GPU (Google Colab).
| Metric | Value |
|---|---|
| Training Loss (Final) | ~0.85 (approximate) |
| Learning Rate | 2e-4 |
| Optimizer | AdamW 8-bit |
| Batch Size | 1 (with 8 gradient accumulation steps) |
| Precision | 4-bit Quantization |
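The batch size and step count in the table imply how much data the run actually touched. The sketch below works through that arithmetic. The variable names are illustrative and not taken from the training script, and it assumes a single GPU with one example per batch slot (no sequence packing).

```python
# Values from the table above; variable names are illustrative.
per_device_batch_size = 1
gradient_accumulation_steps = 8
max_steps = 60

# Gradients from 8 micro-batches are summed before each optimizer update,
# so each update effectively sees 8 examples (single-GPU assumption).
effective_batch_size = per_device_batch_size * gradient_accumulation_steps

# Upper bound on examples consumed over the whole run (assumes no packing).
examples_seen = effective_batch_size * max_steps

print(effective_batch_size, examples_seen)
```

Under these assumptions the run covers only a few hundred examples, a small slice of the dataset.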
Since this is a LoRA adapter, it is loaded through Unsloth rather than as a standalone checkpoint. Running it requires a GPU environment (such as Google Colab or a local GPU). Install the necessary dependencies with the following commands, then load the model with the code after them:
!pip install -q huggingface_hub
!pip install -q unsloth
!pip install -q --force-reinstall --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git
from unsloth import FastLanguageModel
import torch
# Load the model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "rimon-dutta/Llama-3.2-3B-Math-LoRA",
    max_seq_length = 2048,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)
# Reasoning System Prompt
r1_prompt = """You are a reflective assistant engaging in thorough, iterative reasoning.
<problem>
{}
</problem>
"""
# Test Question
problem_text = "Find all real values of x that satisfy the equation: 2^(x+3) + 2^x = 72."
messages = [{"role": "user", "content": r1_prompt.format(problem_text)}]
inputs = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to("cuda")
outputs = model.generate(input_ids=inputs, max_new_tokens=1024, use_cache=False)
print(tokenizer.batch_decode(outputs)[0])
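To judge the model's answer to the test question, it helps to know the expected result. Since 2^(x+3) = 8 · 2^x, the left-hand side equals 9 · 2^x, so 2^x = 8 and x = 3. The left-hand side is strictly increasing in x, so this is the only real solution. The sketch below checks this numerically. The helper names `lhs` and `solve` are illustrative and not part of the model card.

```python
import math

def lhs(x: float) -> float:
    """Left-hand side of the test equation: 2^(x+3) + 2^x."""
    return 2 ** (x + 3) + 2 ** x

def solve(target: float = 72.0) -> float:
    # 2^(x+3) + 2^x = 8 * 2^x + 2^x = 9 * 2^x, so x = log2(target / 9).
    return math.log2(target / 9)

x = solve()
# Substitute the solution back into the equation as a check.
print(x, lhs(x))
```

A generated answer can then be compared against x = 3 by hand or with a simple string check on the decoded output.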
Base model: meta-llama/Llama-3.2-3B-Instruct