Qwen2.5-Math-1.5B fine-tuned with SFT on the GSM8K dataset.

Responses use the format:

<think>{thinking tokens}</think> <answer>{final answer}</answer>
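
The format above can be checked mechanically. The following is a minimal sketch, not the check used to produce the numbers on this card: the regex, the single-occurrence rule for each tag, and the name `parse_response` are all assumptions made for illustration.

```python
import re

# Assumed layout: a <think> block, optional whitespace, then an <answer> block,
# with nothing but whitespace before or after.
RESPONSE_RE = re.compile(
    r"\s*<think>(?P<think>.*?)</think>\s*<answer>(?P<answer>.*?)</answer>\s*",
    re.DOTALL,
)
TAGS = ("<think>", "</think>", "<answer>", "</answer>")


def parse_response(text):
    """Return (thinking, answer) if text follows the format, else None."""
    # Require each tag to appear exactly once (an assumed strictness rule).
    if any(text.count(tag) != 1 for tag in TAGS):
        return None
    match = RESPONSE_RE.fullmatch(text)
    if match is None:
        return None
    return match.group("think").strip(), match.group("answer").strip()
```

For example, `parse_response("<think>3 + 4 = 7</think> <answer>7</answer>")` returns the thinking text and the answer as a pair, while a response without the tags returns `None`.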

Results on the GSM8K test set (1319 problems) after 2 epochs:

correct format: 1260/1319 (95.5%)
correct reward: 515/1319 (39.0%)
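
How the reward was computed is not stated here. As a rough sketch of one way an extracted answer could be scored, assuming the reward means the final answer matches the GSM8K reference (whose solutions end with `#### <answer>`): the helpers `gold_answer`, `normalize`, and `is_correct` are hypothetical names, and the normalization rules are assumptions.

```python
def gold_answer(solution):
    """Extract the final answer from a GSM8K reference solution ('... #### 18')."""
    return solution.split("####")[-1].strip()


def normalize(answer):
    """Assumed normalization: drop thousands separators, dollar signs, whitespace."""
    return answer.replace(",", "").replace("$", "").strip()


def is_correct(predicted, solution):
    """Compare a predicted answer with the reference, numerically when possible."""
    pred = normalize(predicted)
    gold = normalize(gold_answer(solution))
    try:
        return float(pred) == float(gold)
    except ValueError:
        # Fall back to exact string comparison for non-numeric answers.
        return pred == gold
```

A stricter or looser comparison (for example exact string match only) would change the count, so the 515/1319 figure should not be read as coming from this exact function.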
Safetensors, model size 2B params, tensor type BF16