Update README.md
README.md
CHANGED
@@ -8,7 +8,8 @@ language:
 - en
 ---
 
-Imu1-Midtrain
+# Imu1-Midtrain
+
 First small language model trained on consumer GPUs with competitive performance.
 
 Trained on 2B tokens of publicly available post-training datasets using advanced NorMuon optimizer with Cautious Weight Decay and Polar Express Newton-Shulz coefficients and WSD scheduler.