sixf0ur
/

tiny-lm-8M

Model card Files Files and versions

sixf0ur commited on Feb 9

Commit

53350fd

·

verified ·

1 Parent(s): eca2f16

Update README.md

Files changed (1) hide show

README.md +2 -2

README.md CHANGED Viewed

@@ -30,8 +30,8 @@ This model was evaluated using the `lm-evaluation-harness` against OpenAI's GPT-
 | **HellaSwag** (acc_norm) | **27.00%** | 31.14% | **86.7%** |
 > **Key Takeaway:** With only **12% of the parameters**, this model achieves over **80% of the reasoning performance** of GPT-2, proving that modern architectures combined with curated data can drastically reduce model size.
->
-> ## Model Architecture
 The model is based on the **Llama-2 architecture** with several modern optimizations:

 | **HellaSwag** (acc_norm) | **27.00%** | 31.14% | **86.7%** |
 > **Key Takeaway:** With only **12% of the parameters**, this model achieves over **80% of the reasoning performance** of GPT-2, proving that modern architectures combined with curated data can drastically reduce model size.
+## Model Architecture
 The model is based on the **Llama-2 architecture** with several modern optimizations: