Experiments setup for reproduction
#2 by charosen - opened
Hi team! Great blog, clear and easy to understand. To get started with on-policy distillation, I would like to reproduce the experiments in your blog, but I ran into some trouble.
For the dataset part, the blog mentions 15.2k prompts for Qwen/Qwen2.5-7B-Instruct, but the HuggingFaceTB/Countdown-Task-GOLD dataset shows 30.4k rows, double the 15.2k. This is a little confusing to me.
Thank you for reading the blogpost! I took a look, and indeed the numbers weren't consistent between the blogpost and the dataset. There was an error in the paragraph describing the dataset, so I modified the text in this PR to make it clearer.
cmpatino changed discussion status to closed
