i3-lab/i3-80m

Text Generation

i3-architecture

FlameF0X committed on Nov 12, 2025

Commit b72e7dc · verified · 1 Parent(s): 725812c

Update README.md

Files changed (1)

README.md +1 -1

@@ -53,7 +53,7 @@ Layers 11-16: Full Attention Blocks
 | **Hidden Dimension** | 512 | 512 |
 | **Vocabulary Size** | 4,466 | 35,560 |
 | **Training Dataset** | TinyChat only | TinyStories + TinyChat + HQ Sentences |
-| **Total Tokens** | ~1M conversations | 3,000,000+ tokens |
+| **Total Tokens** | ~1M conversations | 3M+ tokens |
 | **Final Loss** | ~2.0 | ~2.0 |
 | **Final Perplexity** | 7.29-9.70 | 7.29-10.0 |
 | **Training Time** | ~17 hours | ~2-4 hours |