What happens when you scale a hybrid SSM-Mamba-transformer: tmax-2B → 4B → 9B → 27B • juiceb0xc0de • 6 days ago
Signal to Noise in Language Models: The Single Voice Upgrade ML Needs • juiceb0xc0de • Mar 15