# Shadow-V2

Fine-tuned Qwen3-0.6B for mathematical reasoning.

## Model Details

| Property | Value |
|---|---|
| Base Model | Qwen3-0.6B |
| Parameters | 636M total, 40M trainable (6.34%) |
| Precision | BF16 |
| Training Method | LoRA via Unsloth |
| Context Length | 2048 tokens |

## Training

| Config | Value |
|---|---|
| Dataset | 25,000 examples |
| Epochs | 1 |
| Batch Size | 16 effective (2 per device × 8 gradient accumulation) |
| Steps | 1,200 |
| Hardware | Tesla T4 (16 GB) |
| Time | 1.35 hours |
| Final Loss | 0.43 |
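The small trainable fraction comes from how LoRA works: the base weights stay frozen, and each adapted weight matrix gains two trainable low-rank factors. A minimal sketch of the parameter accounting (the rank and matrix shapes below are illustrative assumptions, not this model's actual adapter config):

```python
def lora_param_count(d_out, d_in, r):
    # LoRA adds two trainable factors next to a frozen W (d_out x d_in):
    # B with shape (d_out, r) and A with shape (r, d_in).
    return r * (d_out + d_in)

# Hypothetical rank and projection shape, chosen only to illustrate the
# scaling; they are not read from this checkpoint.
r = 16
per_matrix = lora_param_count(1024, 1024, r)  # one square 1024x1024 projection
print(per_matrix)  # 32768 adapter params vs ~1M frozen params in W
```

For a rank-r adapter on a d_out × d_in matrix, only r·(d_out + d_in) parameters train instead of d_out·d_in, which is why only a few percent of the 636M parameters end up trainable.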

## Benchmarks

| Benchmark | Shadow-V2 | Qwen3-0.6B (base) |
|---|---|---|
| GSM8K (5-shot) | TBD | 42.3 |
| MATH (4-shot) | TBD | 18.2 |
| HumanEval (0-shot) | TBD | 28.0 |
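"5-shot" here means five worked question/answer pairs are prepended before the target question. A minimal sketch of how such a prompt is typically assembled (the worked examples below are illustrative stand-ins, not the actual GSM8K few-shot set):

```python
# Illustrative few-shot examples; a real 5-shot eval would use five
# fixed exemplars drawn from the benchmark's train split.
few_shot = [
    ("Q: Tom has 3 apples and buys 2 more. How many apples does he have?",
     "A: 3 + 2 = 5. The answer is 5."),
    ("Q: A book costs $4. How much do 3 books cost?",
     "A: 3 * 4 = 12. The answer is 12."),
]

def build_prompt(examples, question):
    # Each shot is "Q: ...\nA: ..."; the target question ends with a
    # bare "A:" so the model continues with the solution.
    parts = [f"{q}\n{a}" for q, a in examples]
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

prompt = build_prompt(few_shot, "If 3x + 7 = 22, find x.")
```

The model's continuation is then parsed (usually the last number after "The answer is") and compared against the reference answer.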

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the checkpoint; torch_dtype="auto" keeps the stored BF16 weights
model = AutoModelForCausalLM.from_pretrained("Redhanuman/Shadow-V2", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("Redhanuman/Shadow-V2")

prompt = "Solve: If 3x + 7 = 22, find x.\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
