SFT-finetuned on the GSM8K dataset.

Training targets used the format:

<think>{thinking tokens}</think> <answer>{final answer}</answer>
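The card doesn't show how GSM8K examples were mapped into this template. One plausible sketch (the function name is illustrative) splits the dataset's `answer` field on its `####` marker, which GSM8K uses to separate the reasoning from the final answer:

```python
def to_sft_target(gsm8k_answer: str) -> str:
    """Map a GSM8K 'answer' field (reasoning + '#### final') into the
    <think>...</think> <answer>...</answer> template used for SFT."""
    reasoning, final = gsm8k_answer.rsplit("####", 1)
    return f"<think>{reasoning.strip()}</think> <answer>{final.strip()}</answer>"

print(to_sft_target("2 + 2 = 4\n#### 4"))
# <think>2 + 2 = 4</think> <answer>4</answer>
```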

Stats on the GSM8K test set after 2 epochs:

- correct format: 1260/1319 (95.5%)
- correct reward (final answer matches): 515/1319 (39.0%)
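The card doesn't include the evaluation script, so the following is only a sketch of how the two metrics above could be computed: a regex check that the completion matches the template (correct format), and an exact-match comparison of the extracted answer against GSM8K's gold answer after `####` (correct reward). Regex and helper names are assumptions.

```python
import re

# Completion must be exactly: <think>...</think> <answer>...</answer>
THINK_ANSWER_RE = re.compile(
    r"^<think>(?P<think>.*?)</think>\s*<answer>(?P<answer>.*?)</answer>\s*$",
    re.DOTALL,
)

def check_format(completion: str) -> bool:
    """True if the completion matches the <think>/<answer> template."""
    return THINK_ANSWER_RE.match(completion.strip()) is not None

def extract_answer(completion: str):
    """Return the text inside <answer>...</answer>, or None on a format miss."""
    m = THINK_ANSWER_RE.match(completion.strip())
    return m.group("answer").strip() if m else None

def gsm8k_gold(answer_field: str) -> str:
    """GSM8K stores the gold final answer after '####' in its answer field."""
    return answer_field.split("####")[-1].strip().replace(",", "")

completion = "<think>2 + 2 = 4</think> <answer>4</answer>"
gold = "Adding gives 4.\n#### 4"
print(check_format(completion))                        # True
print(extract_answer(completion) == gsm8k_gold(gold))  # True
```

Counting `check_format` hits over the 1319 test problems would give the format stat; counting exact answer matches would give the reward stat.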
Model size: 2B params (Safetensors, BF16)
Base model: Qwen/Qwen2.5-1.5B

Dataset: gsm8k