HippolyteP committed on
Commit
3ffcb09
·
verified ·
1 Parent(s): 9f5c9c0

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -115,7 +115,7 @@ Helium 6B checkpoints were trained on data from Common Crawl, which was preproce
 
  #### Testing Data
 
- While our models are primarily designed to facilitate research on LLM temporality and base model dynamics—which may result in lower general performance compared to state-of-the-art models—we nonetheless evaluated them using the OLMES benchmark. This evaluation covers MMLU, ARC (Easy & Challenge), OpenBookQA, CommonSenseQA, PIQA, SIQA, HellaSwag, WinoGrande, and BoolQA."
+ While our models are primarily designed to facilitate research on LLM temporality and base model dynamics—which may result in lower general performance compared to state-of-the-art models—we nonetheless evaluated them using the OLMES benchmark. This evaluation covers MMLU, ARC (Easy & Challenge), OpenBookQA, CommonSenseQA, PIQA, SIQA, HellaSwag, WinoGrande, and BoolQA.
 
 
 