LIMO-Qwen3-8B-Math

This model is a LoRA fine-tune of Qwen3-8B on the GAIR/LIMO dataset for mathematical reasoning tasks.

Model Details

Base Model: Qwen3-8B (4-bit quantized)
Training Method: LoRA fine-tuning with Unsloth
Dataset: GAIR/LIMO (817 high-quality samples)
Training Framework: Unsloth + TRL SFTTrainer
Sequence Length: 4096 tokens

Training Configuration

Batch Size: 8
Gradient Accumulation: 1
Learning Rate: 2e-5
Epochs: 3
LoRA Rank: 16
LoRA Alpha: 32
LoRA Dropout: 0.1
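
The settings above imply a fairly short training run. The sketch below works out the arithmetic. It assumes a single device, that all 817 samples are used (none dropped by length filtering), that the last partial batch is kept, and that standard LoRA scaling (alpha / rank) applies rather than a rank-stabilized variant. None of these assumptions are stated in this card, and the variable names are illustrative only.

```python
import math

# Values taken from this model card.
num_samples = 817
per_device_batch_size = 8
gradient_accumulation_steps = 1
num_epochs = 3
lora_rank = 16
lora_alpha = 32

# Assumption (not stated in the card): training ran on one device.
num_devices = 1

# Number of samples contributing to each optimizer step.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_devices

# Assumption: the final partial batch is kept, so round up.
steps_per_epoch = math.ceil(num_samples / effective_batch_size)
total_steps = steps_per_epoch * num_epochs

# Standard LoRA multiplies the low-rank update by alpha / rank.
lora_scaling = lora_alpha / lora_rank

print(f"effective batch size: {effective_batch_size}")  # 8
print(f"steps per epoch:      {steps_per_epoch}")       # 103
print(f"total optimizer steps: {total_steps}")          # 309
print(f"LoRA scaling factor:  {lora_scaling}")          # 2.0
```

Under these assumptions the run takes roughly 309 optimizer steps, which is consistent with the small-data premise of LIMO.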

Performance

This model follows the LIMO (Less is More) approach, which aims to elicit strong mathematical reasoning from a small, carefully curated training set rather than from large-scale data.

Usage

from unsloth import FastLanguageModel
import torch

# Load model
model, tokenizer = FastLanguageModel.from_pretrained(
    "Cbgcbg/limo-qwen3-8b-math",
    max_seq_length=4096,
    dtype=torch.bfloat16,
    load_in_4bit=True,
)

# Enable inference mode
FastLanguageModel.for_inference(model)

# Format input
messages = [
    {"role": "system", "content": "Please reason step by step, and put your final answer within \\boxed{}."},
    {"role": "user", "content": "What is the sum of the first 10 positive integers?"}
]

formatted_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)

# Generate a sampled completion (raise max_new_tokens if long step-by-step solutions are cut off)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)

System Prompt

The model was trained with the following system prompt:

Please reason step by step, and put your final answer within \boxed{}.
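
Because the system prompt asks for the final answer inside \boxed{}, the answer can usually be pulled out of the generated text programmatically. The helper below is a minimal sketch of one way to do that. `extract_boxed` is a hypothetical name, not part of Unsloth or Transformers, and the sketch assumes the last \boxed{...} in the response holds the final answer.

```python
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    # Hypothetical helper: return the contents of the last \boxed{...}
    # in `text`, or None if there is none or its braces are unbalanced.
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None

    depth = 1  # we are already inside the opening brace of \boxed{
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace found: everything collected is the answer.
                return "".join(chars)
        chars.append(ch)
    return None  # ran out of text before the braces balanced


# Example strings written by hand for illustration (not model output).
print(extract_boxed("1 + 2 + ... + 10 = 55, so the answer is \\boxed{55}."))  # 55
print(extract_boxed("The probability is \\boxed{\\frac{1}{2}}."))  # \frac{1}{2}
```

With the Usage snippet, calling `extract_boxed(response)` would return the boxed answer as a string, or None if generation stopped before the model produced one.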

Citation

If you use this model, please cite the original LIMO paper:

@misc{ye2025limoreasoning,
    title={LIMO: Less is More for Reasoning},
    author={Yixin Ye and Zhen Huang and Yang Xiao and Ethan Chern and Shijie Xia and Pengfei Liu},
    year={2025},
    eprint={2502.03387},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2502.03387},
}

Training Details

This model was trained using the LIMO methodology, which holds that high-quality mathematical reasoning can be elicited with a small but carefully curated training set (here, the 817 samples of GAIR/LIMO).