Experiment setup for reproduction

#2
by charosen - opened

Hi team! Great blog post, clear and easy to follow. To get started with on-policy distillation, I would like to reproduce the experiments from your blog post.

However, I ran into some trouble.

For the dataset, the blog post says there are 15.2k prompts for Qwen/Qwen2.5-7B-Instruct, but the HuggingFaceTB/Countdown-Task-GOLD dataset shows 30.4k rows, double the 15.2k figure. This confuses me a little.


Hugging Face H4 org
edited 13 days ago

Thank you for reading the blog post! I took a look, and the numbers were indeed inconsistent between the blog post and the dataset. There was an error in the paragraph describing the dataset, so I modified the text in this PR to make it clearer.

cmpatino changed discussion status to closed
