wop
posted an update 2 days ago
🚀 __Monostep v1__ is up → [Monostep-v1 Demo](wop/Cosmos-T2-Chat)

A tiny (~16.6M) experimental model that predicts 4 tokens per forward pass instead of one. A Transformer trunk pools the prompt into a single vector, then 4 sequential "slot" heads emit a block of tokens left-to-right: a lightweight take on multi-token prediction.
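For anyone who wants the shape of the idea in code, here is a rough, stdlib-only sketch of block decoding. It is not the Monostep implementation: `ToyBlockDecoder`, the toy vocabulary size, the mean-pooling stand-in for the Transformer trunk, and the way each slot is conditioned on the previous slot's token are all assumptions made for illustration, and the weights are random.

```python
import random

VOCAB = 50   # toy vocabulary size (hypothetical; the real model uses the GPT-2 tokenizer)
DIM = 8      # toy hidden size (hypothetical)
BLOCK = 4    # tokens emitted per forward pass, as described in the post


def make_matrix(rows, cols, rng):
    """Random weight matrix stored as a list of rows."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToyBlockDecoder:
    """Illustrative only: random weights, no training, no attention."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.embed = make_matrix(VOCAB, DIM, rng)
        # One head per slot. Each head sees [pooled prompt ; previous slot's
        # token embedding]. This conditioning scheme is an assumption.
        self.heads = [make_matrix(VOCAB, 2 * DIM, rng) for _ in range(BLOCK)]

    def pool(self, token_ids):
        # Stand-in for the Transformer trunk: mean-pool the token embeddings
        # into a single vector.
        n = len(token_ids)
        return [sum(self.embed[t][d] for t in token_ids) / n for d in range(DIM)]

    def predict_block(self, token_ids):
        pooled = self.pool(token_ids)          # one "forward pass" of the trunk
        prev = [0.0] * DIM                     # no previous token for slot 0
        block = []
        for head in self.heads:                # slots run left-to-right
            logits = matvec(head, pooled + prev)
            token = max(range(VOCAB), key=logits.__getitem__)  # greedy pick
            block.append(token)
            prev = self.embed[token]
        return block


def generate(model, prompt_ids, num_blocks):
    """Append num_blocks blocks of BLOCK tokens each to the prompt."""
    ids = list(prompt_ids)
    for _ in range(num_blocks):
        ids.extend(model.predict_block(ids))
    return ids
```

Calling `generate(ToyBlockDecoder(), [1, 2, 3], num_blocks=2)` returns the 3 prompt ids followed by 8 new ids, using 2 trunk passes where one-token-at-a-time decoding would need 8.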

Trained on GSM8K (GPT-2 tokenizer, 10 epochs). It's small and rough (answers are often wrong), but it's a fun little testbed for block decoding. Weights, config, training curves, and a self-contained inference snippet are all in the repo.

Also wired into the Cosmos T2-Accelerate chat demo, where it streams those 4-token blocks live. 🧪

#multitokenprediction #gsm8k #smallmodels