jpacifico committed · Commit ba2dd3a · verified · 1 Parent(s): 613013c

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -17,6 +17,8 @@ This is a merge of pre-trained language models created using [mergekit](https://
  # First benchmarks
 
  **Interpretation:** Significant gains on language understanding & pragmatic reasoning (ARC-C/E, Wino, BoolQ, HellaSwag, TriviaQA) with stability on other skills. Math/code are not the optimization target; GSM8K stays essentially stable relative to the BitNet 1.58-bit baseline.
+ All scores are reported in comparison with the original Microsoft BitNet b1.58 BF16 model.
+ Evaluations were performed using LM Eval Harness, all results are fully reproducible.
 
  **ARC-Challenge:** 51.62 (First-ever ≥50 score for a model in the 2B category, i.e., >1.5B and <2.5B params)
 