Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs Paper • 2601.08763 • Published 14 days ago • 141
Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning Paper • 2601.09667 • Published 13 days ago • 82
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge Paper • 2601.08808 • Published 14 days ago • 38
view article Article Introducing OptiMind, a research model designed for optimization 12 days ago • 31
view article Article Building the Open Agent Ecosystem Together: Introducing OpenEnv +8 Oct 23, 2025 • 145
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated May 5, 2025 • 301
mradermacher/Llama3.3-8B-Instruct-Thinking-Heretic-Uncensored-Claude-4.5-Opus-High-Reasoning-i1-GGUF 8B • Updated 25 days ago • 18.9k • 7
DavidAU/Llama3.3-8B-Instruct-Thinking-Heretic-Uncensored-Claude-4.5-Opus-High-Reasoning Text Generation • 8B • Updated 21 days ago • 1.21k • 22