RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards Paper • 2605.10899 • Published 5 days ago • 72
hesamation/Qwen3.6-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled Image-Text-to-Text • 36B • Updated 27 days ago • 11.8k • 78
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled Image-Text-to-Text • 28B • Updated Apr 6 • 238k • 2.84k
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 18 items • Updated 7 days ago • 293