Update README.md
README.md CHANGED
@@ -47,7 +47,7 @@ Iterative DPO + Model merging :
 
 **Interpretation:** Significant gains on language understanding & pragmatic reasoning (ARC-C/E, Wino, BoolQ, HellaSwag, TriviaQA) with stability on other skills. Math/code are not the optimization target; GSM8K stays essentially stable relative to the BitNet 1.58-bit quantized baseline (58.38).
 All scores are reported in comparison with the original [microsoft/bitnet-b1.58-2B-4T-bf16](https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-bf16) model.
-Evaluations were performed using LM Eval Harness, all results are fully reproducible.
+Evaluations were performed using [LM Eval Harness](https://github.com/EleutherAI/lm-evaluation-harness); all results are fully reproducible.
 
 | Benchmark (metric) | microsoft/bitnet-b1.58-2B-4T-bf16 | bitnet-dpo-merged-modelstock7 |
 |------------------------------------|-----------------------------------|--------------------------------|
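A run like the one described above can typically be reproduced with the harness CLI. This is a hedged sketch: the exact task list, few-shot settings, and output path are assumptions inferred from the benchmarks named in the README, not the author's confirmed command.

```shell
# Sketch of a reproduction run with LM Eval Harness.
# Task names and output path are assumptions, not the author's exact setup.
pip install lm-eval

lm_eval \
  --model hf \
  --model_args pretrained=microsoft/bitnet-b1.58-2B-4T-bf16 \
  --tasks arc_challenge,arc_easy,winogrande,boolq,hellaswag,triviaqa,gsm8k \
  --batch_size auto \
  --output_path results/baseline
```

Swapping `pretrained=` to the merged model repo would produce the second column of the table for a side-by-side comparison.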