nvidia/Nemotron-Labs-TwoTower-30B-A3B-Base-BF16 Text Generation • 63B • Updated 5 days ago • 10.7k • 122
lordx64/Qwen3.6-35B-A3B-Claude-4.7-Opus-Reasoning-Distilled Text Generation • 36B • Updated Apr 23 • 29k • 192
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled Image-Text-to-Text • 28B • Updated Apr 6 • 154k • 2.89k
bartowski/moonshotai_Kimi-Linear-48B-A3B-Instruct-GGUF Text Generation • 49B • Updated Feb 9 • 3.9k • 19
Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning Paper • 2601.09708 • Published Jan 14 • 56
Slamming: Training a Speech Language Model on One GPU in a Day Paper • 2502.15814 • Published Feb 19, 2025 • 69
Craw4LLM: Efficient Web Crawling for LLM Pretraining Paper • 2502.13347 • Published Feb 19, 2025 • 30
Soundwave: Less is More for Speech-Text Alignment in LLMs Paper • 2502.12900 • Published Feb 18, 2025 • 85