Update README.md
README.md
CHANGED
@@ -53,7 +53,7 @@ Layers 11-16: Full Attention Blocks
 | **Hidden Dimension** | 512 | 512 |
 | **Vocabulary Size** | 4,466 | 35,560 |
 | **Training Dataset** | TinyChat only | TinyStories + TinyChat + HQ Sentences |
-| **Total Tokens** | ~1M conversations |
+| **Total Tokens** | ~1M conversations | 3M+ tokens |
 | **Final Loss** | ~2.0 | ~2.0 |
 | **Final Perplexity** | 7.29-9.70 | 7.29-10.0 |
 | **Training Time** | ~17 hours | ~2-4 hours |