ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration Paper • 2511.21689 • Published Nov 26, 2025 • 122
Self-Hinting Language Models Enhance Reinforcement Learning Paper • 2602.03143 • Published 3 days ago • 20
Rethinking the Trust Region in LLM Reinforcement Learning Paper • 2602.04879 • Published 1 day ago • 25
Music Flamingo 🎵: Upload music or YouTube videos and ask detailed questions about them • Running on A100 • 120
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text Paper • 2601.22975 • Published 7 days ago • 80
THINKSAFE: Self-Generated Safety Alignment for Reasoning Models Paper • 2601.23143 • Published 7 days ago • 38
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation Paper • 2601.00664 • Published Jan 2 • 56