Oren Data Distillation Experiment
Collection
Two identical d10 models (100M parameters) trained to test the hypothesis
that quality-filtered data enables more efficient training.
2 items