# Llama-3.2-3B-Math-LoRA (Reasoning Model)
This model is a fine-tuned version of Llama-3.2-3B-Instruct, specialized in solving mathematical problems using a reflective, iterative reasoning process. It was trained using Unsloth to mimic human-like stream-of-consciousness thinking, similar to the DeepSeek-R1 approach.
## Key Features
- Reflective Thinking: The model explores the problem, expresses self-doubt, and refines its logic before providing a final answer.
- Efficient Fine-tuning: Trained using LoRA (Low-Rank Adaptation) in 4-bit quantization.
- Math Specialist: Optimized on the `OpenR1-Math-220k` dataset to handle algebraic and arithmetic reasoning.
## Training Metrics
The model was trained for 60 steps on a single NVIDIA T4 GPU (Google Colab).
| Metric | Value |
|---|---|
| Training Loss (Final) | ~0.85 |
| Learning Rate | 2e-4 |
| Optimizer | AdamW 8-bit |
| Batch Size | 1 (with 8 gradient accumulation steps) |
| Precision | 4-bit Quantization |
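The hyperparameters in the table map directly onto a standard fine-tuning configuration. As a sketch (the argument names follow the TRL/`transformers` `TrainingArguments` convention; treat them as an assumption, not the exact script used):

```python
# Hypothetical reconstruction of the training hyperparameters from the
# table above, using TRL/transformers-style argument names.
training_args = dict(
    per_device_train_batch_size = 1,   # Batch Size
    gradient_accumulation_steps = 8,   # gradient accumulation steps
    max_steps = 60,                    # total optimizer steps
    learning_rate = 2e-4,              # Learning Rate
    optim = "adamw_8bit",              # AdamW 8-bit optimizer
)

# Effective batch size seen by each optimizer step:
effective_batch = (training_args["per_device_train_batch_size"]
                   * training_args["gradient_accumulation_steps"])
print(effective_batch)
```

With a per-device batch of 1 and 8 accumulation steps, each optimizer update effectively sees 8 sequences, which is how a 3B model fits on a single T4.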
## How to Use (Inference Code)
Since this is a LoRA adapter, you can load it using the following code. Make sure you have unsloth installed.
### Installation & Dependencies

To run this model, you need a GPU environment (e.g. Google Colab or a local CUDA GPU). Install the dependencies with the following commands (the leading `!` is for notebook cells; drop it in a regular shell):

```bash
!pip install -q huggingface_hub
!pip install -q unsloth
!pip install -q --force-reinstall --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git
```
```python
from unsloth import FastLanguageModel
import torch

# Load the base model with the LoRA adapter in 4-bit precision
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "rimon-dutta/Llama-3.2-3B-Math-LoRA",
    max_seq_length = 2048,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)  # switch to optimized inference mode

# Reasoning system prompt: the model expects the problem wrapped in <problem> tags
r1_prompt = """You are a reflective assistant engaging in thorough, iterative reasoning.
<problem>
{}
</problem>
"""

# Test question
problem_text = "Find all real values of x that satisfy the equation: 2^(x+3) + 2^x = 72."
messages = [{"role": "user", "content": r1_prompt.format(problem_text)}]

# Apply the chat template and move the token IDs to the GPU
inputs = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to("cuda")

# Generate and decode the model's reasoning
outputs = model.generate(input_ids=inputs, max_new_tokens=1024, use_cache=False)
print(tokenizer.batch_decode(outputs)[0])
```
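For reference, the test equation has a short closed-form solution: 2^(x+3) + 2^x = 8·2^x + 2^x = 9·2^x = 72, so 2^x = 8 and x = 3. A quick sanity check in plain Python (this verifies the math, not the model's output):

```python
# 2^(x+3) + 2^x = 8 * 2^x + 2^x = 9 * 2^x = 72  =>  2^x = 8  =>  x = 3
x = 3
assert 2 ** (x + 3) + 2 ** x == 72
print("x =", x)  # x = 3
```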
## Model tree for rimon-dutta/Llama-3.2-3B-Math-LoRA

- Base model: meta-llama/Llama-3.2-3B-Instruct
- Finetuned from: unsloth/Llama-3.2-3B-Instruct