FAIR Chemistry Leaderboard 🥇 • Running on CPU Upgrade • 25 • Submit and view model evaluation results on chemical benchmarks
Omnilingual MT: Machine Translation for 1,600 Languages Paper • 2603.16309 • Published 3 days ago • 10
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections Paper • 2603.12180 • Published 7 days ago • 62
Multi-Task Reinforcement Learning for Enhanced Multimodal LLM-as-a-Judge Paper • 2603.11665 • Published 8 days ago • 4
Examining Reasoning LLMs-as-Judges in Non-Verifiable LLM Post-Training Paper • 2603.12246 • Published 7 days ago • 4