Let me be honest with you. For months, I was that person. You know the one. Leaving my computer on overnight, running distillation scripts, trying to squeeze Claude Opus 4.6 into something that could run on a potato.

And you know what? It was boring.

Who actually enjoys watching loss curves descend at 3 AM? Who gets excited about shaving off 2% of parameters while the model forgets how to count?

"I was cloning someone else's work and compressing it. The process lacked real creation. Just digital photocopying with extra steps."

So I stopped. And I started asking a different question:

What if a model could be small by design, avoiding compression entirely?

TinyMemoryLM is a character-level transformer that learns to remember things. Not because it's smart, but because we gave it external memory. And a codebook. And multi-token prediction (MTP). It still forgets where it put its keys, though.
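To make those three ingredients concrete, here is a minimal sketch of the idea: character embeddings, a codebook read as "external memory" (nearest-neighbor lookup, vector-quantization style), and an MTP head that predicts several future characters at once. Every name and shape here (`CODES`, `HORIZON`, the residual mix) is illustrative and assumed; it is not the real TinyMemoryLM implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = 128    # character-level: raw ASCII byte IDs
D = 32         # embedding width (illustrative)
CODES = 16     # codebook entries -- the "external memory" slots (illustrative)
HORIZON = 3    # MTP: predict this many future characters per position (illustrative)

# Hypothetical random parameters; a real model would learn these.
embed = rng.normal(0, 0.1, (VOCAB, D))
codebook = rng.normal(0, 0.1, (CODES, D))
mtp_head = rng.normal(0, 0.1, (D, HORIZON * VOCAB))

def memory_lookup(h):
    """Read from external memory: snap each hidden state to its
    nearest codebook vector (vector-quantization style)."""
    dists = ((h[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)  # (T, CODES)
    idx = dists.argmin(-1)                                         # (T,)
    return codebook[idx], idx

def forward(char_ids):
    h = embed[char_ids]             # (T, D) character embeddings
    mem, idx = memory_lookup(h)     # read from the codebook
    h = h + mem                     # residual mix of token state and memory
    logits = h @ mtp_head           # (T, HORIZON * VOCAB)
    return logits.reshape(len(char_ids), HORIZON, VOCAB), idx

ids = np.frombuffer(b"tiny", dtype=np.uint8).astype(int)
logits, codes = forward(ids)
print(logits.shape)  # (4, 3, 128): 3 future chars predicted at each of 4 positions
```

The point of the sketch is the division of labor: the codebook carries reusable knowledge outside the per-token computation, so the transformer itself can stay small by design rather than being compressed after the fact.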