
qwen2-7b-orca-math-lora

A LoRA fine-tune of Qwen2-7B-Instruct trained with supervised fine-tuning on a curated blend of mathematical reasoning and general instruction-following data. Training was performed using Unsloth for memory-efficient adaptation on a single GPU.


Model Details

| Property | Value |
|---|---|
| Base model | Qwen/Qwen2-7B-Instruct |
| Model family | Qwen2 |
| Parameter count | 7B |
| Fine-tuning method | LoRA (PEFT) |
| Quantization (training) | 4-bit NormalFloat (bitsandbytes) |
| Chat template | ChatML |
| Context length | 2048 tokens |
| Language | English |
| License | Apache 2.0 |

Training Details

LoRA Configuration

| Hyperparameter | Value |
|---|---|
| Rank (r) | 8 |
| Alpha | 8 |
| Dropout | 0 |
| Bias | none |
| RSLoRA | True |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Gradient checkpointing | Unsloth (memory-optimized) |
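
To make the configuration above concrete, the following sketch computes the adapter scaling factor with and without rank-stabilized LoRA, and estimates the number of trainable LoRA weights. The rank, alpha, and target modules come from the table. The layer shapes are an assumption taken from the published Qwen2-7B configuration, not from this card, so treat the parameter count as an estimate.

```python
import math

# Values copied from the LoRA configuration table.
r = 8
alpha = 8

# Standard LoRA scales the low-rank update by alpha / r;
# rank-stabilized LoRA (rsLoRA) uses alpha / sqrt(r) instead.
standard_scale = alpha / r
rslora_scale = alpha / math.sqrt(r)

# ASSUMPTION: shapes from the published Qwen2-7B config, not from this card.
hidden = 3584          # hidden_size
intermediate = 18944   # intermediate_size
kv_dim = 512           # num_key_value_heads (4) * head_dim (128)
num_layers = 28        # num_hidden_layers

# (in_features, out_features) of each targeted projection.
module_shapes = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_dim),
    "v_proj": (hidden, kv_dim),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, intermediate),
    "up_proj": (hidden, intermediate),
    "down_proj": (intermediate, hidden),
}

# Each adapted module adds A (r x in) and B (out x r): r * (in + out) weights.
per_layer = sum(r * (n_in + n_out) for n_in, n_out in module_shapes.values())
trainable_params = per_layer * num_layers

print(f"standard LoRA scale: {standard_scale:.3f}")
print(f"rsLoRA scale:        {rslora_scale:.3f}")
print(f"trainable LoRA parameters: {trainable_params:,}")
```

With r = 8 and alpha = 8, enabling RSLoRA raises the effective scale from 1.0 to about 2.83, so the adapter update is weighted more heavily than the equal rank and alpha values might suggest.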

Training Hyperparameters

| Hyperparameter | Value |
|---|---|
| Trainer | SFTTrainer (TRL) |
| Max steps | 300 |
| Per-device batch size | 1 |
| Gradient accumulation steps | 8 |
| Effective batch size | 8 |
| Learning rate | 2e-4 |
| LR scheduler | Linear |
| Optimizer | AdamW (8-bit) |
| Weight decay | 0.01 |
| Precision | bf16 (fp16 fallback if bf16 unavailable) |
| Packing | False |
| Training objective | Responses only (instruction tokens masked) |
| Seed | 3407 |
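
As a rough illustration of how these values interact, the sketch below derives the effective batch size and evaluates a linear learning-rate schedule over the 300 steps. The number of warmup steps is not stated in this card, so `warmup_steps=0` is an assumption, and `linear_lr` is a hypothetical helper rather than the trainer's own scheduler.

```python
# Values copied from the training hyperparameter table.
per_device_batch_size = 1
gradient_accumulation_steps = 8
num_gpus = 1        # the card reports a single CUDA GPU
max_steps = 300
base_lr = 2e-4

# One optimizer step consumes this many sequences.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus


def linear_lr(step, base_lr=base_lr, max_steps=max_steps, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    ASSUMPTION: warmup_steps is not given in the card; 0 is a placeholder.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, max_steps - step)
    return base_lr * remaining / max(1, max_steps - warmup_steps)


print(f"effective batch size: {effective_batch_size}")
for step in (0, 75, 150, 225, 300):
    print(f"step {step:3d}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate starts at 2e-4, passes 1e-4 at the midpoint, and reaches zero at step 300.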

Training Data

The model was trained on a mixture of three datasets, concatenated and then shuffled with seed 3407:

| Dataset | Split | Samples |
|---|---|---|
| openai/gsm8k | train (full) | 7,473 |
| microsoft/orca-math-word-problems-200k | train | 4,000 |
| HuggingFaceH4/ultrachat_200k | train_sft | 2,000 |
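
The arithmetic below is a small sketch of what this mixture implies, using only the sample counts in the table and the step and batch settings from the training hyperparameters. It assumes every step consumes a full effective batch of 8 sequences and that no examples were filtered out after formatting.

```python
# Sample counts copied from the dataset table.
mixture = {
    "openai/gsm8k": 7473,
    "microsoft/orca-math-word-problems-200k": 4000,
    "HuggingFaceH4/ultrachat_200k": 2000,
}

total = sum(mixture.values())
shares = {name: count / total for name, count in mixture.items()}

# Share of the mixture that is math data (the first two datasets).
math_share = shares["openai/gsm8k"] + shares["microsoft/orca-math-word-problems-200k"]

# 300 optimizer steps at an effective batch size of 8 (see the hyperparameters).
sequences_seen = 300 * 8
epoch_fraction = sequences_seen / total

print(f"total examples: {total:,}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"math share: {math_share:.1%}")
print(f"fraction of one epoch covered: {epoch_fraction:.1%}")
```

By this estimate roughly 85% of the mixture is math data, and the 300-step run covers well under one full pass over the 13,473 examples.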

All examples were formatted using the ChatML conversation template before training. The loss was computed on assistant responses only; user turns and system prompts were excluded from the gradient.
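
The response-only objective can be sketched as follows. This is a simplified illustration, not the training code: `render_chatml`, `toy_tokenize`, and `build_labels` are hypothetical helpers, the character-level tokenizer stands in for the real Qwen2 tokenizer, and the exact boundary of the trained span (for example whether the trailing newline is included) is an assumption. The assistant reply in the example is also made up for illustration.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def render_chatml(messages):
    """Split a conversation into (text, is_response) segments in ChatML layout."""
    segments = []
    for msg in messages:
        header = f"<|im_start|>{msg['role']}\n"
        body = f"{msg['content']}<|im_end|>"
        if msg["role"] == "assistant":
            segments.append((header, False))  # role header: masked
            segments.append((body, True))     # response text and end marker: trained
            segments.append(("\n", False))    # separator: masked (assumption)
        else:
            segments.append((header + body + "\n", False))  # system/user: masked
    return segments


def toy_tokenize(text):
    # ASSUMPTION: one token per character, purely for illustration.
    return [ord(ch) for ch in text]


def build_labels(messages):
    """Return (input_ids, labels) with non-response positions set to IGNORE_INDEX."""
    input_ids, labels = [], []
    for text, is_response in render_chatml(messages):
        ids = toy_tokenize(text)
        input_ids.extend(ids)
        labels.extend(ids if is_response else [IGNORE_INDEX] * len(ids))
    return input_ids, labels


messages = [
    {"role": "user", "content": "A train travels 300 km in 4 hours. What is its average speed?"},
    {"role": "assistant", "content": "300 / 4 = 75 km/h"},  # hypothetical reply
]
input_ids, labels = build_labels(messages)
trained = sum(label != IGNORE_INDEX for label in labels)
print(f"{trained} of {len(labels)} positions contribute to the loss")
```

Positions labeled `IGNORE_INDEX` still appear in the input, so the model conditions on the user turn, but they add nothing to the loss or its gradient.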


Intended Use

This model is suited for tasks involving:

  • Grade-school and competition-level math word problems
  • Step-by-step arithmetic and algebraic reasoning
  • General instruction following and question answering in English

It is not intended for safety-critical applications, factual knowledge retrieval, or domains outside its training distribution.


Usage

# Loading an adapter repository through AutoModelForCausalLM requires the
# `peft` package to be installed alongside `transformers`.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "x0root/qwen2-7b-orca-math-lora"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "user", "content": "A train travels 300 km in 4 hours. What is its average speed?"}
]

# Render the conversation with the chat template and append the assistant header.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Sample up to 512 new tokens, then decode only the newly generated portion.
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))

For faster inference with the same 4-bit quantization used during training, load the adapter via Unsloth:

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="x0root/qwen2-7b-orca-math-lora",
    max_seq_length=2048,
    load_in_4bit=True,
)
# Switch the model to Unsloth's inference mode before calling generate().
FastLanguageModel.for_inference(model)

Training Framework

| Component | Version / Note |
|---|---|
| Unsloth | Latest at training time |
| TRL | <= 0.24.0 |
| Transformers | <= 5.5.0 |
| Datasets | < 4.4.0 |
| Accelerate | Latest at training time |
| PEFT | Latest at training time |
| bitsandbytes | Latest at training time |
| Hardware | Single CUDA GPU |