Glint-0.1

Once upon a time, there was a model that could only say couldcouldoldbloodbloodbodybody. This is its ancestor.

Glint-0.1 is where the Glint line started. 1M parameters. Big dreams. Almost no ability to realize them. We look back on this one fondly, like a blurry photo of a puppy that chewed your shoes.

What you get

| File | What it is |
| --- | --- |
| `tokenizer.json` | Hybrid word/char tokenizer (~2,111 tokens) |
| `pretrain.pt` | Base pretrained checkpoint |
| `model.pt` | Instruction-tuned checkpoint (SFT) |

Specs

| Thing | Value |
| --- | --- |
| Architecture | Transformer decoder |
| Parameters | ~1 million |
| Context | 2,048 tokens |
| d_model | 160 |
| Layers | 6 |
| Heads | 4 |
| FFN | 256 |
| Vocab | ~2,111 tokens (hybrid char + word) |
| Norm | RMSNorm + QK-Norm |
| Position | RoPE |
| Activation | SwiGLU |
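
For anyone who wants the specs in code form, here is a small sketch. The class name `GlintConfig` and every field name are hypothetical (they are not taken from the Glint code), and the vocab size is the approximate figure from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlintConfig:
    """Hypothetical config mirroring the spec table (all names are assumptions)."""
    d_model: int = 160
    n_layers: int = 6
    n_heads: int = 4
    d_ffn: int = 256
    vocab_size: int = 2111   # approximate, per the table
    context_len: int = 2048

    @property
    def head_dim(self) -> int:
        # Each attention head gets an equal slice of d_model.
        if self.d_model % self.n_heads != 0:
            raise ValueError("d_model must be divisible by n_heads")
        return self.d_model // self.n_heads


cfg = GlintConfig()
print(cfg.head_dim)  # 40
```

With 160 dimensions over 4 heads, each head is 40 wide. That is an even number, which RoPE needs because it rotates dimensions in pairs.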

What made this one special

Hybrid tokenizer -- word-level where it helps, character-level where it gets confused
QK-Norm -- RMSNorm on queries and keys so training doesn't blow up
Loss boosting -- yelled at the model extra hard when it ignored multi-character words
Response-start weighting -- made it actually pay attention to the first tokens of its answers
Pretrain replay -- kept mixing in pretrain data during SFT so it wouldn't forget how to speak English
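
To make the two loss tricks above a bit more concrete, here is a rough sketch of how loss boosting and response-start weighting could combine into one weighted average. This is a guess at the general shape, not the actual Glint training code: the function name, the boost factors, the number of "start" tokens, and the choice to skip prompt tokens entirely are all assumptions made for illustration.

```python
def weighted_loss(token_losses, token_texts, response_start,
                  word_boost=2.0, start_boost=3.0, start_len=2):
    """Weighted mean of per-token losses (all defaults are assumed values).

    token_losses   -- per-token cross-entropy values
    token_texts    -- surface string of each target token
    response_start -- index of the first response token; earlier ones are prompt
    """
    total, weight_sum = 0.0, 0.0
    for i, (loss, text) in enumerate(zip(token_losses, token_texts)):
        if i < response_start:
            continue  # assumption: prompt tokens contribute nothing
        w = 1.0
        if len(text) > 1:
            w *= word_boost   # loss boosting for multi-character (word) tokens
        if i < response_start + start_len:
            w *= start_boost  # response-start weighting
        total += w * loss
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0


# Toy example: two prompt tokens, then a three-token response.
losses = [1.0, 1.0, 2.0, 1.0, 4.0]
texts = ["hi", "?", "yes", "!", "ok"]
result = weighted_loss(losses, texts, response_start=2)
```

In the toy example the response tokens get weights 6, 3, and 2, so a mistake on the first word of the answer costs three times as much as the same mistake on a later word.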

Training curve

It went down. Slowly. Painfully.

Limitations

Repeats itself. A lot.
Knows almost nothing about the world.
Not useful for anything real. Research only.
Will embarrass itself if asked a direct question.

Built by CompactAI. We started somewhere.
