qwen2-7b-orca-math-lora

A LoRA adapter for Qwen2-7B-Instruct, trained with supervised fine-tuning (SFT) on a mixture of mathematical reasoning and general instruction-following data. Training used Unsloth for memory-efficient adaptation on a single GPU.

Model Details

Property	Value
Base model	Qwen/Qwen2-7B-Instruct
Model family	Qwen2
Parameter count	7B
Fine-tuning method	LoRA (PEFT)
Quantization (training)	4-bit NormalFloat (bitsandbytes)
Chat template	ChatML
Context length (training)	2048 tokens
Language	English
License	Apache 2.0

Training Details

LoRA Configuration

Hyperparameter	Value
Rank (r)	8
Alpha	8
Dropout	0
Bias	none
RSLoRA	True
Target modules	q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Gradient checkpointing	Unsloth (memory-optimized)
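
As a rough illustration of what the configuration above implies, the sketch below computes the adapter scaling factor with and without RSLoRA and estimates the number of trainable adapter parameters. The helper names are illustrative and not part of any library, and the layer dimensions are assumptions taken from the public Qwen2-7B configuration rather than from this card.

```python
import math

def lora_scaling(alpha: float, r: int, use_rslora: bool) -> float:
    """Multiplier applied to the low-rank update B @ A."""
    # Rank-stabilized LoRA divides by sqrt(r); standard LoRA divides by r.
    return alpha / math.sqrt(r) if use_rslora else alpha / r

# Assumed Qwen2-7B dimensions (not stated in this card).
HIDDEN = 3584          # hidden size
INTERMEDIATE = 18944   # MLP intermediate size
KV_DIM = 512           # 4 key/value heads x head dimension 128
NUM_LAYERS = 28

# (in_features, out_features) of each targeted projection.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

def lora_param_count(r: int) -> int:
    """Each adapted layer adds A (r x in) and B (out x r), i.e. r * (in + out) weights."""
    per_layer = sum(r * (n_in + n_out) for n_in, n_out in TARGET_SHAPES.values())
    return per_layer * NUM_LAYERS

standard = lora_scaling(alpha=8, r=8, use_rslora=False)
rank_stabilized = lora_scaling(alpha=8, r=8, use_rslora=True)
trainable = lora_param_count(r=8)

print(f"standard scaling: {standard:.3f}")
print(f"RSLoRA scaling:   {rank_stabilized:.3f}")
print(f"adapter parameters: {trainable:,}")
```

Under these assumptions the RSLoRA scaling is about 2.83 instead of 1.0, and the adapter holds roughly 20.2M parameters, well under 1% of the base model.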

Training Hyperparameters

Hyperparameter	Value
Trainer	SFTTrainer (TRL)
Max steps	300
Per-device batch size	1
Gradient accumulation steps	8
Effective batch size	8
Learning rate	2e-4
LR scheduler	Linear
Optimizer	AdamW (8-bit)
Weight decay	0.01
Precision	bf16 (fp16 fallback if bf16 unavailable)
Packing	False
Training objective	Responses only (instruction tokens masked)
Seed	3407
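
The linear scheduler listed above can be sketched as a plain function of the step index. This is an illustrative re-implementation, not the trainer's code; the card does not state a warmup length, so `warmup_steps` defaults to 0 here as an assumption.

```python
def linear_lr(step: int, peak_lr: float = 2e-4, total_steps: int = 300,
              warmup_steps: int = 0) -> float:
    """Linear warmup to peak_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        # Warmup phase: ramp up from 0 to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: fraction of the post-warmup steps still remaining.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)

for step in (0, 150, 300):
    print(step, linear_lr(step))
```

With no warmup, the learning rate starts at 2e-4, is halved by step 150, and reaches zero at step 300.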

Training Data

The three datasets below were concatenated and shuffled with seed 3407:

Dataset	Split	Samples
openai/gsm8k	train (full)	7,473
microsoft/orca-math-word-problems-200k	train	4,000
HuggingFaceH4/ultrachat_200k	train_sft	2,000
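
Combining the sample counts above with the training hyperparameters gives a quick estimate of how much of the mixture the run actually covered. The sketch assumes a single GPU, no packing, and that no examples were filtered out before training.

```python
# Sample counts from the table above.
dataset_sizes = {
    "openai/gsm8k": 7473,
    "microsoft/orca-math-word-problems-200k": 4000,
    "HuggingFaceH4/ultrachat_200k": 2000,
}

per_device_batch = 1
grad_accum_steps = 8
max_steps = 300

total_examples = sum(dataset_sizes.values())
effective_batch = per_device_batch * grad_accum_steps  # one GPU assumed
sequences_seen = max_steps * effective_batch
epoch_fraction = sequences_seen / total_examples

for name, size in dataset_sizes.items():
    print(f"{name}: {size / total_examples:.1%} of the mixture")
print(f"examples seen: {sequences_seen} of {total_examples} "
      f"({epoch_fraction:.1%} of one epoch)")
```

Under these assumptions, 300 steps at an effective batch size of 8 process 2,400 examples, which is a little under 18% of one pass over the 13,473-example mixture.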

All examples were formatted with the ChatML conversation template before training. The loss was computed on assistant responses only; system prompts and user turns were masked out and did not contribute to the gradient.
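
Response-only training is commonly implemented by setting the labels of non-assistant tokens to -100, the value PyTorch's cross-entropy loss ignores by default. The sketch below illustrates the idea on pre-tokenized segments; the function name and the toy token IDs are hypothetical, and the card does not state which utility performed the masking.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default

def mask_non_assistant(segments):
    """Build (input_ids, labels) from a list of (role, token_ids) segments.

    Tokens from assistant segments keep their IDs as labels; all other
    tokens are labelled IGNORE_INDEX so they do not contribute to the loss.
    """
    input_ids, labels = [], []
    for role, token_ids in segments:
        input_ids.extend(token_ids)
        if role == "assistant":
            labels.extend(token_ids)
        else:
            labels.extend([IGNORE_INDEX] * len(token_ids))
    return input_ids, labels

# Toy conversation with made-up token IDs.
conversation = [
    ("system", [1, 2]),
    ("user", [3, 4, 5]),
    ("assistant", [6, 7]),
]
input_ids, labels = mask_non_assistant(conversation)
print(input_ids)  # [1, 2, 3, 4, 5, 6, 7]
print(labels)     # [-100, -100, -100, -100, -100, 6, 7]
```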

Intended Use

This model is suited for tasks involving:

Grade-school and competition-level math word problems
Step-by-step arithmetic and algebraic reasoning
General instruction following and question answering in English

It is not intended for safety-critical applications, factual knowledge retrieval, or domains outside its training distribution.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "x0root/qwen2-7b-orca-math-lora"

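tokenizer = AutoTokenizer.from_pretrained(model_id)
# If the repository holds only adapter weights, this call needs the `peft` package installed;
# transformers then loads the base model named in the adapter config and attaches the adapter.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")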

messages = [
    {"role": "user", "content": "A train travels 300 km in 4 hours. What is its average speed?"}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))

For faster inference with the base model quantized to 4 bits, as it was during training, load the adapter via Unsloth:

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="x0root/qwen2-7b-orca-math-lora",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)
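
As an alternative, the adapter can be attached to the base model explicitly with PEFT and then merged into the base weights for deployment without a PEFT dependency. This is a minimal sketch, assuming the repository contains adapter weights only and that `transformers` and `peft` are installed; the function name `load_merged_model` is hypothetical.

```python
def load_merged_model(
    base_id: str = "Qwen/Qwen2-7B-Instruct",
    adapter_id: str = "x0root/qwen2-7b-orca-math-lora",
):
    """Load the base model, attach the LoRA adapter, and merge it into the weights."""
    # Imported lazily so that defining this function does not require the libraries.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

    # Wrap the base model with the adapter weights from the Hub.
    model = PeftModel.from_pretrained(base, adapter_id)

    # Fold the low-rank updates into the base weights and drop the PEFT wrappers.
    merged = model.merge_and_unload()
    return merged, tokenizer
```

Calling `load_merged_model()` downloads the full-precision base weights, so it needs considerably more memory than the 4-bit Unsloth path above.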

Training Framework

Component	Version / Note
Unsloth	Latest at training time
TRL	<= 0.24.0
Transformers	<= 5.5.0
Datasets	< 4.4.0
Accelerate	Latest at training time
PEFT	Latest at training time
bitsandbytes	Latest at training time
Hardware	Single CUDA GPU
