qwen2-7b-orca-math-lora
A LoRA adapter for Qwen2-7B-Instruct, trained with supervised fine-tuning (SFT) on a blend of mathematical reasoning and general instruction-following data. Training used Unsloth for memory-efficient adaptation on a single GPU.
Model Details
| Property | Value |
|---|---|
| Base model | Qwen/Qwen2-7B-Instruct |
| Model family | Qwen2 |
| Parameter count | 7B |
| Fine-tuning method | LoRA (PEFT) |
| Quantization (training) | 4-bit NormalFloat (bitsandbytes) |
| Chat template | ChatML |
| Context length | 2048 tokens |
| Language | English |
| License | Apache 2.0 |
Training Details
LoRA Configuration
| Hyperparameter | Value |
|---|---|
| Rank (r) | 8 |
| Alpha | 8 |
| Dropout | 0 |
| Bias | none |
| RSLoRA | True |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Gradient checkpointing | Unsloth (memory-optimized) |
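To make the rank, alpha, and RSLoRA rows concrete, here is a minimal sketch of the adapter scaling factor and an estimate of the adapter size. The scaling formulas (`alpha / r` for standard LoRA, `alpha / sqrt(r)` for rank-stabilized LoRA) follow the usual definitions. The layer dimensions are assumptions taken from the published Qwen2-7B configuration, not from this card, so treat the parameter count as an estimate.

```python
import math

r, alpha = 8, 8  # rank and alpha from the LoRA configuration table

# Scaling applied to the low-rank update B @ A.
standard_scale = alpha / r            # classic LoRA
rslora_scale = alpha / math.sqrt(r)   # rank-stabilized LoRA (RSLoRA = True)

# ASSUMPTION: Qwen2-7B dimensions (not stated in this card).
hidden, intermediate, layers = 3584, 18944, 28
kv_dim = 4 * 128  # 4 key/value heads with head dimension 128

# (input features, output features) for each targeted projection.
module_shapes = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_dim),
    "v_proj": (hidden, kv_dim),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, intermediate),
    "up_proj": (hidden, intermediate),
    "down_proj": (intermediate, hidden),
}

# Each adapted linear layer adds A (r x d_in) and B (d_out x r).
params_per_layer = sum(r * (d_in + d_out) for d_in, d_out in module_shapes.values())
total_lora_params = params_per_layer * layers

print(f"standard LoRA scale: {standard_scale:.3f}")
print(f"RSLoRA scale:        {rslora_scale:.3f}")
print(f"estimated trainable LoRA parameters: {total_lora_params:,}")
```

With rank equal to alpha, standard LoRA would scale the update by 1.0, while RSLoRA scales it by about 2.83. Under the assumed dimensions the adapter has roughly 20.2 million trainable parameters, a small fraction of the 7B base model.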
Training Hyperparameters
| Hyperparameter | Value |
|---|---|
| Trainer | SFTTrainer (TRL) |
| Max steps | 300 |
| Per-device batch size | 1 |
| Gradient accumulation steps | 8 |
| Effective batch size | 8 |
| Learning rate | 2e-4 |
| LR scheduler | Linear |
| Optimizer | AdamW (8-bit) |
| Weight decay | 0.01 |
| Precision | bf16 (fp16 fallback if bf16 unavailable) |
| Packing | False |
| Training objective | Responses only (instruction tokens masked) |
| Seed | 3407 |
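The table above can be read as a set of trainer arguments. The sketch below collects them in a plain dictionary and derives the effective batch size. The key names follow the Hugging Face `TrainingArguments` / TRL `SFTConfig` field names, but the dictionary itself is hypothetical: the card does not include the actual training script, and the responses-only objective is applied separately rather than through these arguments.

```python
# Hypothetical keyword arguments mirroring the hyperparameter table.
# They are NOT copied from the author's training script.
sft_kwargs = {
    "max_steps": 300,
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 8,
    "learning_rate": 2e-4,
    "lr_scheduler_type": "linear",
    "optim": "adamw_8bit",
    "weight_decay": 0.01,
    "packing": False,
    "seed": 3407,
}

num_gpus = 1  # the card reports a single CUDA GPU

# Sequences contributing to each optimizer step.
effective_batch_size = (
    sft_kwargs["per_device_train_batch_size"]
    * sft_kwargs["gradient_accumulation_steps"]
    * num_gpus
)

print(f"effective batch size: {effective_batch_size}")
```

The result matches the "Effective batch size" row: one sequence per forward pass, accumulated over eight passes before each optimizer update.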
Training Data
The model was trained on a concatenated and shuffled mixture of three datasets (seed 3407):
| Dataset | Split | Samples |
|---|---|---|
| openai/gsm8k | train (full) | 7,473 |
| microsoft/orca-math-word-problems-200k | train | 4,000 |
| HuggingFaceH4/ultrachat_200k | train_sft | 2,000 |
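As a quick check on how much of this mixture the run covers, the following sketch combines the sample counts above with the step count and effective batch size from the hyperparameter table. It assumes one training sequence per example, which is consistent with packing being disabled.

```python
# Sample counts from the dataset table.
dataset_sizes = {
    "openai/gsm8k": 7473,
    "microsoft/orca-math-word-problems-200k": 4000,
    "HuggingFaceH4/ultrachat_200k": 2000,
}

max_steps = 300            # from the training hyperparameters
effective_batch_size = 8   # 1 per device x 8 accumulation steps x 1 GPU

total_examples = sum(dataset_sizes.values())
sequences_seen = max_steps * effective_batch_size
epoch_fraction = sequences_seen / total_examples

# Share of each dataset in the shuffled mixture.
for name, size in dataset_sizes.items():
    print(f"{name}: {size / total_examples:.1%} of the mixture")

print(f"total examples: {total_examples:,}")
print(f"sequences seen in training: {sequences_seen:,}")
print(f"fraction of one epoch: {epoch_fraction:.1%}")
```

Under these assumptions the 300 steps cover 2,400 of the 13,473 examples, or roughly 18% of a single pass over the mixture, with math data making up about 85% of it.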
All examples were formatted using the ChatML conversation template before training. The loss was computed on assistant responses only; user turns and system prompts were excluded from the gradient.
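The responses-only objective can be illustrated with a small, self-contained sketch. It renders a conversation in ChatML and builds labels in which every non-assistant token is set to `-100`, the index that PyTorch's cross-entropy loss ignores by default. The character-level `toy_tokenize` function is a hypothetical stand-in for the real tokenizer, and the helper names are illustrative only; the actual run would have used the library's own masking utilities.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def render_chatml(messages):
    """Split a conversation into ChatML segments, flagging text that is trained on."""
    segments = []
    for message in messages:
        header = f"<|im_start|>{message['role']}\n"
        body = f"{message['content']}<|im_end|>\n"
        segments.append((header, False))  # role headers never count toward the loss
        segments.append((body, message["role"] == "assistant"))
    return segments


def toy_tokenize(text):
    # Hypothetical stand-in for a real tokenizer: one "token" per character.
    return [ord(ch) for ch in text]


def build_labels(segments):
    """Return (input_ids, labels) with non-assistant positions masked out."""
    input_ids, labels = [], []
    for text, trainable in segments:
        ids = toy_tokenize(text)
        input_ids.extend(ids)
        labels.extend(ids if trainable else [IGNORE_INDEX] * len(ids))
    return input_ids, labels


messages = [
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "4"},
]
input_ids, labels = build_labels(render_chatml(messages))
trained_positions = sum(label != IGNORE_INDEX for label in labels)
print(f"{trained_positions} of {len(labels)} positions contribute to the loss")
```

Only the assistant's answer and its end-of-turn marker keep their labels, so gradients come from the response and not from the prompt.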
Intended Use
This model is suited for tasks involving:
- Grade-school and competition-level math word problems
- Step-by-step arithmetic and algebraic reasoning
- General instruction following and question answering in English
It is not intended for safety-critical applications, factual knowledge retrieval, or domains outside its training distribution.
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "x0root/qwen2-7b-orca-math-lora"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
messages = [
    {"role": "user", "content": "A train travels 300 km in 4 hours. What is its average speed?"}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
For faster inference with the original 4-bit quantized weights, load via Unsloth:
from unsloth import FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="x0root/qwen2-7b-orca-math-lora",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)
Training Framework
| Component | Version / Note |
|---|---|
| Unsloth | Latest at training time |
| TRL | <= 0.24.0 |
| Transformers | <= 5.5.0 |
| Datasets | < 4.4.0 |
| Accelerate | Latest at training time |
| PEFT | Latest at training time |
| bitsandbytes | Latest at training time |
| Hardware | Single CUDA GPU |