PrincetonPLI committed on
Commit 08c3236 · verified · 1 Parent(s): eff8ebc

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -37,10 +37,10 @@ Each folder contains an OpenLM PyTorch checkpoint (`epoch_11.pt`, final epoch) p
 
  | Folder | Method | Training mixture | DCLM CORE v2 avg. |
  |--------|--------|------------------|-------------------|
- | `random_selection` | Random baseline | Uniform sampling from Corpus-200B pool | 40.5% |
- | `dclm_fasttext_only` | Quality (DCLM-fasttext) | Documents above DCLM-fasttext quality threshold | 43.1% |
+ | `random_selection` | Random baseline | Uniform sampling from Corpus-200B pool | 39.8% |
+ | `dclm_fasttext_only` | Quality (DCLM-fasttext) | Documents above DCLM-fasttext quality threshold | 42.3% |
  | `betweenness_alpha0.5` | **WebGraphMix** | 50/50 mix of top/bottom betweenness-centrality hosts | 41.4% |
- | `betweenness_alpha0.5_mult_div_dclm_fasttext` | **WebGraphMix+** | Betweenness 50/50 mix × DCLM-fasttext quality filter | 43.4% |
+ | `betweenness_alpha0.5_mult_div_dclm_fasttext` | **WebGraphMix+** | Betweenness 50/50 mix × DCLM-fasttext quality filter | 43.8% |
 
  > Scores are `aggregated_results` from the `mmlu_and_lowvar` eval suite (23 low-variance ICL tasks). See the [WebGraphMix repo](https://github.com/princeton-pli/WebGraphMix) to reproduce evaluation.
 