diff --git "a/downloads/index.html" "b/downloads/index.html"
--- "a/downloads/index.html"
+++ "b/downloads/index.html"
@@ -1,32 +1,29 @@
-TinyMemoryLM | AILAY
+Download | CompactAI Studio
-
-
-
- - Training on RTX 5090 -
-

A ~1M Parameter Model
with 2K Context

-

TinyMemoryLM is a hybrid word-character transformer trained on an RTX 5090. It features recurrent memory, a precision codebook output head, and DeepSeek-V3-style MTP. It learns to remember things — not because it's smart, but because we gave it external memory. And a codebook. And multi-token prediction. It still forgets where it put its keys though.

- -
-
- - -
-
-

TRAINING THREE MODEL TIERS

-
- - Haiku ~1M params — Live - - - Sonnet ~300M params — In Training - - - Opus ~600M params — In Training - -
-
-
- -
-
-

Download CompactAI Studio

-

Run our AI models locally on your machine. Chat with them, browse the model catalog, and download models for offline use.

-
- - - Download App - - - Read the Blog - -
-

Built with Electron.

-
-
- -
-
-
-
-
~1M
-
Parameters
-
-
-
2K
-
Context Length
-
-
-
6
-
Layers
-
-
-
4
-
Attention Heads
-
-
-
160
-
Model Dimension
-
-
-
229
-
FFN Dimension
-
-
-
-
- -
-
-
-

Architecture Features

-

Not your grandmother's transformer. Actually, probably not even your mother's.

-
-
-
-
M
-

Recurrent Memory (Chunk-GRU)

-

A recurrent memory module with chunk-level GRU processing is integrated into the architecture. It processes sequential chunks to maintain state across the context window, giving the model external memory beyond what attention alone provides.

-
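As a rough sketch of the idea — illustrative pure Python, not the actual training code; the mean-pool summary and the scalar gate weight are assumptions — a chunk-level recurrent memory carries one gated state vector across chunks:

```python
# Toy sketch of chunk-level recurrent memory (illustrative, not the
# project's code). The sequence is split into fixed-size chunks; a memory
# vector is updated once per chunk with a GRU-style gated blend of the old
# memory and a summary of the new chunk.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def chunk_memory(embeddings, chunk_size, w_gate=1.0):
    """embeddings: list of per-token feature vectors (lists of floats)."""
    dim = len(embeddings[0])
    memory = [0.0] * dim
    states = []
    for start in range(0, len(embeddings), chunk_size):
        chunk = embeddings[start:start + chunk_size]
        # Mean-pool the chunk into a single summary vector (an assumption;
        # the real module may summarize chunks differently).
        summary = [sum(tok[d] for tok in chunk) / len(chunk) for d in range(dim)]
        # GRU-style update gate: how much the new summary overwrites memory.
        z = [sigmoid(w_gate * (summary[d] - memory[d])) for d in range(dim)]
        memory = [(1 - z[d]) * memory[d] + z[d] * summary[d] for d in range(dim)]
        states.append(list(memory))
    return states  # one memory state per chunk, carried across the context
```

The point of the gate is that memory persists between chunks instead of being recomputed per attention window — repeated evidence accumulates rather than being forgotten at chunk boundaries.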
-
-
C
-

Precision Codebook Output Head

-

Tied weight embeddings with a learnable per-token output bias. Instead of a separate codebook projection, the model ties input embeddings to output weights and learns a 2111-parameter bias vector to compensate for word-token suppression. Simple, parameter-efficient, and surprisingly effective.

-
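In miniature, the tied head works like this — a hypothetical pure-Python illustration (`tied_logits` and the toy values are made up; only the structure follows the description):

```python
# Tied output head with a per-token bias, illustrated. The input embedding
# matrix E is reused as the output projection, so the only extra learned
# parameters are the bias entries (one per vocab token, ~2111 in the real
# model).

def tied_logits(hidden, embedding_matrix, bias):
    """hidden: d_model floats; embedding_matrix: vocab x d_model rows; bias: vocab floats."""
    return [
        sum(e_d * h_d for e_d, h_d in zip(row, hidden)) + b
        for row, b in zip(embedding_matrix, bias)
    ]

# Toy vocab of 3 tokens, d_model = 2. The bias lets the model boost tokens
# (e.g. word tokens) that tied weights alone would suppress.
E = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.0, 0.5]
logits = tied_logits([2.0, 3.0], E, b)  # [2.0, 3.0, 5.5]
```

Weight tying keeps the parameter count down (no separate vocab × d_model output matrix), which matters a lot at ~1M total parameters.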
-
-
T
-

Makeshift MTP

-

DeepSeek-V3 style Multi-Token Prediction with horizons (2, 3, 4). MTP adapters learn to predict multiple future tokens simultaneously, improving sample quality through branch selection during generation. Pretrain weight: 0.3, SFT weight: 0.3.

-
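Concretely, the horizons determine which future tokens each adapter is trained against — a small sketch of target construction (function name and shapes are illustrative, not the project's code):

```python
# Multi-token-prediction training targets for horizons (2, 3, 4):
# the horizon-h adapter learns to predict the token h steps ahead.

def mtp_targets(tokens, horizons=(2, 3, 4)):
    """Map each horizon h to (position, token h steps ahead) pairs."""
    return {h: [(i, tokens[i + h]) for i in range(len(tokens) - h)]
            for h in horizons}

seq = [10, 11, 12, 13, 14, 15]
targets = mtp_targets(seq)
# At position 0, the horizon-2 adapter targets token 12, horizon-3
# targets 13, horizon-4 targets 14. Per the text, these auxiliary losses
# are scaled by 0.3 (both pretraining and SFT) before being added to the
# main next-token loss.
```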
-
-
R
-

RTX 5090 Optimized

-

Tuned for the RTX 5090 with flash attention, bf16 mixed precision, and a batch size of 64. Uses PyTorch Inductor with coordinate_descent_tuning enabled. Gradient checkpointing and torch.compile are available but disabled for the Haiku tier — stability over speed.

-
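A hedged sketch of how those knobs are typically wired up (the actual training script isn't shown here, and `build_model` is a placeholder; `coordinate_descent_tuning` and `torch.compile` are real PyTorch knobs):

```python
# Training-side configuration sketch, assuming standard PyTorch usage.
import torch

# TorchInductor autotuning flag named in the description.
torch._inductor.config.coordinate_descent_tuning = True

USE_COMPILE = False            # available, but off for the Haiku tier
model = build_model()          # placeholder for the ~1M-param transformer
if USE_COMPILE:
    model = torch.compile(model, backend="inductor")

# bf16 mixed precision around the forward pass, batch size 64:
# with torch.autocast("cuda", dtype=torch.bfloat16):
#     loss = model(batch)      # batch of 64 sequences, 2048 tokens each
```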
-
-
H
-

Hybrid Word-Character Tokenizer

-

Hybrid tokenizer with ~2111 tokens. It scans the training data for the top 2,000 most frequent words, falling back to characters for everything else, achieving 3-4x compression versus pure character-level tokenization. Supports special format tokens for instruction tuning: <|user|>, <|assistant|>, <|system|>, <|begin_of_thought|>, <|end_of_thought|>.

-
-
-
-
- -
-
-
-

The Architecture

-

Simple on paper, complicated in practice. Just like your relationship with your co-workers.

-
-
-
-
- Input - Character Embedding -
-
-
-
-
- Transformer Block ×6 - RMSNorm, QK-Norm, SwiGLU FFN -
-
-
-
-
- MTP Adapters ×3 - Horizons 2, 3, 4 -
-
-
-
-
- Tied Output Head - Learnable Bias (2111 params) -
-
-
-
-
- Output - ~2.1K Hybrid Vocab -
-
-
-
- d_model - 160 -
-
- heads - 4 -
-
- ffn_dim - 229 -
-
- mtp_horizons - [2, 3, 4] -
-
- vocab_size - ~2111 +
+

Download CompactAI Studio

+

Run AI models locally on your machine

+ +
+
+
+
+ + + +
-
- seq_len - 2048 +
+

Windows v1.0.0

+

Windows 10/11 (x64) - Installer (.exe)

+ + + + + + Download +
-
-
-
-
-
-

Model Series

-

Three tiers following Chinchilla scaling. Yes, we borrowed the naming scheme. No, we're not sorry.

-
-
-
-
- Haiku - ~1M params -
-

Lightweight and experimental. Updated frequently. The scrappy underdog.

-
- dim 160 - layers 6 - heads 4 - ffn_dim 229 - context 2,048 - lr 8e-4 -
-
-
-
- Sonnet - ~300M params -
-

Balanced and stable. Updated less often. The responsible middle child.

-
- dim 768 - layers 36 - heads 12 - ffn_dim 2,538 - context 2,048 - lr 2e-4 +
+
+
+ + + +
-
-
-
- Opus - ~600M params -
-

Maximum quality. Heavy and most stable. The overachiever who never sleeps.

-
- dim 1,024 - layers 39 - heads 16 - ffn_dim 3,557 - context 2,048 - lr 1.6e-4 +
+

Linux v1.0.0

+

Ubuntu/Debian/Fedora - AppImage

+ + + + + + Download +
-
-
-
-
-
-

Sample Output

-

What happens when you train on English text and hope for the best.

-
-
-
-
- - - - tinyMemoryLM --sample -
-
-
> Write a haiku about neural networks
-
< weights dance in dark
gradient descends like rain
loss slowly fades
-
> What's the meaning of life?
-
< 42, obviously. Though I suspect the question was rhetorical. Unless you count the time I spent learning that "the" is the most common token. That's been 38% of my existence. It's a living.
-
+
+

Don't feel like downloading an app? No worries!

+

Run the browser version, pull checkpoints from huggingface.co/CompactAI, and chat locally without installing the desktop app. Dependencies install automatically the first time you launch it.

+
+
-
-
-
+
python3 interactive.py
+ - -
-
-
-

AIFinder

-

A tool that snitches on AI models. Every AI has a writing accent — AIFinder detects it.

-
-
-
-
🔍
-

Which AI Wrote This?

-

Paste any AI-generated text and AIFinder will guess which lab made it. Google, Anthropic, OpenAI, DeepSeek, xAI, and more. It learns from corrections. The more you use it, the smarter it gets.

- -
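One way "it learns from corrections" could work — purely hypothetical, since AIFinder's actual method isn't described here; the class, its methods, and the training snippets are all invented for illustration:

```python
# Hypothetical correction-driven "writing accent" guesser: keep per-lab
# word counts, update them on user corrections, score new text against
# each lab's normalized frequencies.
from collections import Counter, defaultdict

class AccentGuesser:
    def __init__(self):
        self.counts = defaultdict(Counter)   # lab -> word counts

    def correct(self, text, lab):
        """User feedback: this text actually came from `lab`."""
        self.counts[lab].update(text.lower().split())

    def guess(self, text):
        words = text.lower().split()
        def score(lab):
            total = sum(self.counts[lab].values()) or 1
            return sum(self.counts[lab][w] / total for w in words)
        return max(self.counts, key=score)

g = AccentGuesser()
g.correct("delve into the tapestry", "OpenAI")       # invented examples
g.correct("I should think carefully here", "Anthropic")
```

The more corrections it collects, the sharper each lab's frequency profile gets — which matches the "the more you use it, the smarter it gets" claim, even if the real model is fancier than counting words.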
- Anthropic - DeepSeek - Google - OpenAI - xAI - Mistral - MiniMax - +4 more -
- - - -

- Free API available · 60 requests/min · No API key required -

- - -
-

YES WE KNOW IT SUCKS

-

- The tool guesses wrong sometimes. It confuses Anthropic with OpenAI. - It confidently identifies Google as DeepSeek. It's basically a parrot with an opinion. -

-

- Pro tip: Ask it math and reasoning questions. That's what we trained it on — - huge amounts of TeichAI datasets (check them out at huggingface.co/TeichAI). - It is noticeably better at detecting which math-happy lab produced the output. -

-
-

- That said, I have an AI working on fixing it. I couldn't be bothered to do it manually. -

-

- 7+ hours -

-

- The AI is trying its best. Poor thing. -

-
-
-
+
+ Note: The web launcher downloads model files from Hugging Face and caches them + locally. The desktop app still ships for offline use if you prefer a packaged installer.
-
-
- - + + +