This model was produced by Expert Iteration (EI) on top of michaelbzhu/Qwen2.5-Math-1.5B-GSM8K-SFT.

Training used the following hyperparameters:

n_ei_steps = 60
d_batch_size = 250
G = 30
micro_batch_size = 16
gradient_accum_steps = 2
lr = 1e-5
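The scale implied by these settings can be sketched with a quick back-of-the-envelope computation. This is a rough sketch, not the training script: interpreting `G` as the number of completions sampled per prompt (GRPO-style group size) and `d_batch_size` as prompts per EI round are assumptions.

```python
# Hypothetical schedule implied by the hyperparameters above; the actual
# training code is not part of this card.
n_ei_steps = 60          # outer EI rounds
d_batch_size = 250       # prompts sampled per round (assumption)
G = 30                   # completions per prompt (assumption)
micro_batch_size = 16
gradient_accum_steps = 2
lr = 1e-5

# Each round samples G completions per prompt, keeps the correct ones,
# and runs SFT on the kept traces.
samples_per_step = d_batch_size * G                            # 7,500
total_samples = n_ei_steps * samples_per_step                  # 450,000
effective_sft_batch = micro_batch_size * gradient_accum_steps  # 32

print(samples_per_step, total_samples, effective_sft_batch)
```

So each EI round generates 7,500 completions (450,000 over the full run), and the SFT phase trains with an effective batch size of 32.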

Results on the GSM8K test set:

correct format: 1312/1319 (99.5%)
correct reward: 878/1319 (66.6%)

Model tree for michaelbzhu/Qwen2.5-Math-1.5B-GSM8K-EI: Qwen/Qwen2.5-1.5B (base) → michaelbzhu/Qwen2.5-Math-1.5B-GSM8K-SFT → this model. Trained on the GSM8K dataset.