Rephrase dataset description
#5
by cmpatino (HF Staff) - opened

app/src/content/article.mdx
CHANGED
@@ -224,7 +224,7 @@ Below is an example of the system and user prompts we pass to the model for the
 
 ### Dataset
 
-We sourced all the prompts from the [Jiayi-Pan/Countdown-Tasks-3to4](https://huggingface.co/datasets/Jiayi-Pan/Countdown-Tasks-3to4) dataset. Our dataset contains 80k training prompts and 10k testing prompts selected randomly. We then generated responses from the `Qwen/Qwen2.5-7B-Instruct` and `Qwen/Qwen3-4B-Instruct-2507` teacher models, including only the prompts that had the correct answers from the teachers.
+We sourced all the prompts from the [Jiayi-Pan/Countdown-Tasks-3to4](https://huggingface.co/datasets/Jiayi-Pan/Countdown-Tasks-3to4) dataset. Our full dataset contains 80k training prompts and 10k testing prompts, selected randomly. We then generated responses from the `Qwen/Qwen2.5-7B-Instruct` and `Qwen/Qwen3-4B-Instruct-2507` teacher models, keeping only the prompts for which the teacher produced a correct answer. The published dataset contains 30.4k prompts with `Qwen/Qwen2.5-7B-Instruct` generations and 27.7k with `Qwen/Qwen3-4B-Instruct-2507` generations. We use the 30.4k-prompt training split for all the on-policy experiments, since those use the student's generations instead of the teacher's completions.
 
 <iframe
   src="https://huggingface.co/datasets/HuggingFaceTB/Countdown-Task-GOLD/embed/viewer/verified_Qwen3-4B-Instruct-2507/train"
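
The filtering described in the paragraph keeps only prompts whose teacher response is verifiably correct. A minimal sketch of what such a Countdown verifier might look like, assuming the answer is a plain arithmetic expression; the function name and answer format are illustrative assumptions, not the article's actual code:

```python
import re

def verify_countdown_answer(answer: str, numbers: list[int], target: int) -> bool:
    """Return True if `answer` uses exactly `numbers` and evaluates to `target`."""
    # Reject anything that is not plain integer arithmetic.
    if not re.fullmatch(r"[\d+\-*/() ]+", answer):
        return False
    # The multiset of literals in the expression must match the given numbers.
    used = [int(tok) for tok in re.findall(r"\d+", answer)]
    if sorted(used) != sorted(numbers):
        return False
    try:
        # Safe here because the character whitelist above excludes names/calls.
        return eval(answer) == target
    except (SyntaxError, ZeroDivisionError):
        return False
```

A response like `(25 - 5) * 4` for numbers `[4, 5, 25]` and target `80` would pass, while an expression that ignores a number or misses the target would be dropped from the dataset.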