Math Model — Olympiad-Focused SFT Checkpoint

This model is a fine-tuned version of Qwen/Qwen3-1.7B for mathematical reasoning, developed for the CS-552 standard project math track.

The model was trained to solve competition-style mathematics problems and produce final answers in boxed LaTeX format.
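Because final answers are emitted in boxed LaTeX form, downstream scoring usually needs to pull the answer out of the generated text. The following is a minimal sketch of such an extractor; the function name `extract_boxed` is hypothetical and is not part of the submitted code. It assumes the last `\boxed{...}` in the output holds the final answer and uses brace counting so that nested LaTeX such as `\frac{1}{2}` survives intact.

```python
def extract_boxed(text):
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)  # assumption: the last boxed expression is the final answer
    if start == -1:
        return None

    begin = start + len(marker)
    depth = 1  # we are already inside the opening brace of \boxed{
    for i in range(begin, len(text)):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return text[begin:i]
    return None  # braces never balanced, e.g. a truncated generation


if __name__ == "__main__":
    sample = "Adding the cases gives \\boxed{\\frac{1}{2}}."
    print(extract_boxed(sample))
```

Returning `None` for unbalanced braces makes truncated generations easy to count separately from wrong answers.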

Base Model

  • Base model: Qwen/Qwen3-1.7B
  • Fine-tuning method: LoRA supervised fine-tuning
  • Target task: mathematical reasoning and short-answer competition problems

Training Data

The final submitted checkpoint was trained on 25,165 examples drawn from three hard mathematical reasoning datasets:

| Dataset / Source | Examples |
|---|---|
| Hendrycks MATH | 4,759 |
| OpenR1-Math-220k | 7,999 |
| NuminaMath-CoT | 12,407 |
| Total | 25,165 |

The NuminaMath subset was filtered to focus on harder mathematical sources:

  • Olympiads
  • AoPS Forum
  • AMC/AIME
  • MATH

The OpenR1 subset was filtered to competition-relevant categories:

  • Algebra
  • Geometry
  • Number Theory
  • Combinatorics
  • Inequalities
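The two filters described above can be sketched as simple predicates over dataset records. This is an illustration only: the field names `source` and `problem_type` and the exact label strings are assumptions about the dataset schemas, not details confirmed by this card, and the helper names are hypothetical.

```python
# Assumed label strings; check them against the actual dataset columns before use.
NUMINA_SOURCES = {"olympiads", "aops_forum", "amc_aime", "math"}
OPENR1_CATEGORIES = {
    "Algebra",
    "Geometry",
    "Number Theory",
    "Combinatorics",
    "Inequalities",
}


def keep_numina(example):
    """Keep NuminaMath records whose (assumed) `source` field is a harder source."""
    return example.get("source") in NUMINA_SOURCES


def keep_openr1(example):
    """Keep OpenR1 records whose (assumed) `problem_type` field is competition-relevant."""
    return example.get("problem_type") in OPENR1_CATEGORIES


if __name__ == "__main__":
    records = [
        {"source": "olympiads", "problem": "..."},
        {"source": "gsm8k", "problem": "..."},
    ]
    print([r["source"] for r in records if keep_numina(r)])
```

Predicates of this shape can be passed to the `filter` method of a Hugging Face `datasets.Dataset` if the data is loaded that way.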

Training Details

| Setting | Value |
|---|---|
| Base model | Qwen/Qwen3-1.7B |
| Fine-tuning method | LoRA |
| Epochs | 1 |
| LoRA rank | 32 |
| LoRA alpha | 64 |
| LoRA dropout | 0.05 |
| Learning rate | 1e-4 |
| Batch size | 1 |
| Gradient accumulation steps | 8 |
| Precision | bfloat16 |
| Hardware | 1 × NVIDIA A100 40GB |
| Training steps | 3,146 |
| Runtime | 6,761 seconds |
| Tokens processed | 14.7M |
| Final training loss | 0.6114 |
| Mean token accuracy | 0.8302 |
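The reported settings are mutually consistent, which the short calculation below checks. It is a sketch under stated assumptions: one optimizer step per 8 accumulated examples on a single GPU, a final partial batch that still counts as a step, and the standard LoRA scaling factor of alpha divided by rank. The variable names are illustrative.

```python
import math

# Values copied from the tables in this card.
num_examples = 25_165
per_device_batch_size = 1
grad_accum_steps = 8
runtime_seconds = 6_761
tokens_processed = 14.7e6  # reported as 14.7M, so derived figures are approximate
lora_rank, lora_alpha = 32, 64

# Effective batch size per optimizer step (assumes a single GPU).
effective_batch = per_device_batch_size * grad_accum_steps

# One epoch; assumes the last partial batch still triggers an optimizer step.
optimizer_steps = math.ceil(num_examples / effective_batch)

# Approximate throughput and average sequence length.
tokens_per_second = tokens_processed / runtime_seconds
tokens_per_example = tokens_processed / num_examples

# Standard LoRA scaling applied to the low-rank update.
lora_scaling = lora_alpha / lora_rank

hours, remainder = divmod(runtime_seconds, 3600)
minutes, seconds = divmod(remainder, 60)

print(f"effective batch size: {effective_batch}")
print(f"optimizer steps:      {optimizer_steps}")
print(f"runtime:              {hours}h {minutes}m {seconds}s")
print(f"tokens per second:    {tokens_per_second:.0f}")
print(f"tokens per example:   {tokens_per_example:.0f}")
print(f"LoRA scaling:         {lora_scaling}")
```

Under these assumptions the computed step count matches the 3,146 training steps reported above, and the runtime works out to a little under two hours.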

Generation Configuration

The submitted generation configuration uses sampling:

{
  "do_sample": true,
  "temperature": 0.25,
  "top_p": 0.85,
  "top_k": 30
}
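To give a feel for what these values do, the sketch below applies temperature scaling, top-k truncation, and top-p (nucleus) truncation to a toy list of logits. It illustrates the general technique in plain Python and is not the code path of any particular inference library; the function name `filtered_distribution` is hypothetical, and the defaults mirror the configuration above.

```python
import math


def filtered_distribution(logits, temperature=0.25, top_k=30, top_p=0.85):
    """Return {token_index: probability} after temperature, top-k, and top-p filtering."""
    # Temperature scaling: values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]  # subtract the max for stability
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-k: keep only the k most probable tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    kept_mass = sum(probs[i] for i in ranked)

    # Top-p: keep the smallest prefix whose renormalized mass reaches top_p.
    nucleus, cumulative = [], 0.0
    for i in ranked:
        nucleus.append(i)
        cumulative += probs[i] / kept_mass
        if cumulative >= top_p:
            break

    # Renormalize over the surviving tokens so the result sums to 1.
    nucleus_mass = sum(probs[i] for i in nucleus)
    return {i: probs[i] / nucleus_mass for i in nucleus}


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.0, -1.0]
    print(filtered_distribution(toy_logits))                   # submitted settings
    print(filtered_distribution(toy_logits, temperature=1.0))  # for comparison
```

With a temperature as low as 0.25, most of the probability mass tends to concentrate on the top candidate, so sampling behaves close to greedy decoding while still allowing occasional variation.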