Commit 5df08f8
Parent(s): 7c8644c
add overview of the findings in the conclusions
app/src/content/chapters/conclusions.mdx
CHANGED
@@ -1,6 +1,31 @@
 ## Conclusions
 
-
+Here are the key takeaways from our experiments:
+
+- **Q: How do existing datasets compare?**<br/>
+A: DCLM, Nemotron-HQ-Synth, and REWIRE lead. Most synthetic baselines fall behind.
+- **Q: Which individual prompts from the synthetic baselines match DCLM?**<br/>
+A: Only Diverse QA Pairs and REWIRE's Guided Rewrite.
+- **Q: Can new prompts beat DCLM?**<br/>
+A: Yes. Math, Table, FAQ, and Tutorial all outperform DCLM.
+- **Q: Does model size matter?**<br/>
+A: Not much. 1B is sufficient for simple prompts, 4B for complex ones.
+- **Q: Do we need better models for low-quality data?**<br/>
+A: No consistent advantage from larger models on low-quality sources.
+- **Q: Does the model family matter?**<br/>
+A: Yes. SmolLM2 dominates across all prompts.
+- **Q: Does the model generation matter?**<br/>
+A: Slightly. Newer Qwen versions trend better.
+- **Q: Is synthetic data enough?**<br/>
+A: No. Always mix synthetic with original data.
+- **Q: Does the mix-in dataset matter?**<br/>
+A: Yes, a major performance driver, sometimes more important than the synthetic data.
+- **Q: Does the source dataset matter?**<br/>
+A: Not with a strong mix-in. Even low-quality sources produce competitive results.
+- **Q: Does increased diversity help?**<br/>
+A: No, performance averages rather than compounds.
+- **Q: Do typos in the prompt hurt?**<br/>
+A: No. Typos have no negative effect on downstream performance.
 
 ### Next Steps
 