HuggingFaceFW/fineweb-edu
Once upon a time, there was a model that could only say
couldcouldoldbloodbloodbodybody. This is its ancestor.
Glint-0.1 is where the Glint line started. 1M parameters. Big dreams. Almost no ability to realize them. We look back on this one fondly, like a blurry photo of a puppy that chewed your shoes.
| File | What it is |
|---|---|
| tokenizer.json | Hybrid word/char tokenizer (~2,111 tokens) |
| pretrain.pt | Base pretrained checkpoint |
| model.pt | Instruction-tuned checkpoint (SFT) |
| Thing | Value |
|---|---|
| Architecture | Transformer Decoder |
| Parameters | ~1 Million |
| Context | 2,048 tokens |
| d_model | 160 |
| Layers | 6 |
| Heads | 4 |
| FFN | 256 |
| Vocab | ~2,111 tokens (Hybrid Char + Word) |
| Norm | RMSNorm + QK-Norm |
| Position | RoPE |
| Activation | SwiGLU |
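
As a rough sanity check on the numbers in the table, here is a back-of-the-envelope parameter count. This is a sketch, not the actual Glint code: the function name `estimate_params` is hypothetical, and it assumes bias-free linear layers, a three-matrix SwiGLU block (gate, up, down), QK-Norm gains shared across heads, and input/output embeddings tied. None of those details are confirmed by this card.

```python
def estimate_params(vocab=2111, d_model=160, n_layers=6, n_heads=4,
                    d_ff=256, tie_embeddings=True):
    """Estimate the parameter count of a small decoder-only transformer.

    Assumptions (not confirmed by the model card): no biases,
    SwiGLU with three weight matrices, one QK-Norm gain vector
    each for Q and K per layer, and RoPE (which has no parameters).
    """
    head_dim = d_model // n_heads

    embed = vocab * d_model                 # token embedding table
    attn = 4 * d_model * d_model            # Q, K, V and output projections
    ffn = 3 * d_model * d_ff                # SwiGLU: gate, up, down
    norms = 2 * d_model + 2 * head_dim      # two RMSNorms + QK-Norm gains
    per_layer = attn + ffn + norms

    total = embed + n_layers * per_layer + d_model  # + final RMSNorm
    if not tie_embeddings:
        total += vocab * d_model            # separate output projection
    return total


total = estimate_params()
print(f"{total:,} parameters (estimate)")
```

Under these assumptions the estimate comes out at roughly 1.7M, the same order of magnitude as the ~1 Million in the table. The gap may come from details the card doesn't specify, so treat the sketch as a way to see where the parameters go rather than as an exact count.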
The training loss went down. Slowly. Painfully.
Built by CompactAI. We started somewhere.