Update README.md
README.md
CHANGED
@@ -40,6 +40,14 @@ datasets:

 ---

+## 🔬 Key Contributions
+
+- Demonstrates that domain-balanced continued pretraining on curated multi-domain data (education, math, code, science) yields consistent improvements across commonsense reasoning benchmarks in 1B-scale models
+- Suggests that multi-step mathematical reasoning remains a fundamental bottleneck for 1B-scale models, even when combining math-focused pretraining (OpenWebMath) with instruction tuning (MetaMathQA)
+- Provides a fully reproducible, compute-efficient training recipe (CPT → LoRA SFT) built and executed **by a single undergraduate student in under one week**, demonstrating that meaningful LLM research is achievable without institutional resources or large teams
+
+---
+
 ## 📊 Benchmark Results

 All scores measured with [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) under **identical conditions** (same prompts, same few-shot settings, same hardware).