Self-planning Code Generation with Large Language Models Paper • 2303.06689 • Published Mar 12, 2023 • 1
Article Welcome Gemma 4: Frontier multimodal intelligence on device • merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 909
Article Introducing OptiMind, a research model designed for optimization • microsoft • Jan 15 • 35
Article Qwen-Image-i2L: Training Strategies for Image-to-LoRA Generation • kelseye • Dec 16, 2025 • 59
Article Introducing AnyLanguageModel: One API for Local and Remote LLMs on Apple Platforms • mattt • Nov 20, 2025 • 42
Article Illustrating Reinforcement Learning from Human Feedback (RLHF) • natolambert, LouisCastricato, lvwerra, Dahoas • Dec 9, 2022 • 417
Article Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment • NormalUhr • Feb 11, 2025 • 126
Article Building the Open Agent Ecosystem Together: Introducing OpenEnv • spisakjo, darktex, zkwentz, mortimerp9, Sanyam, Hamid-Nazeri, Pankit01, emre0, lewtun, reach-vb • Oct 23, 2025 • 164
Article Vision Language Models (Better, faster, stronger) • merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 614
Article Red-Teaming Large Language Models • nazneen, natolambert, lewtun • Feb 24, 2023 • 39
Article Small Language Models (SLM): A Comprehensive Overview • jjokah • Feb 22, 2025 • 164
Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models Paper • 2508.00819 • Published Aug 1, 2025 • 64
Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training Paper • 2508.00414 • Published Aug 1, 2025 • 96
SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution Paper • 2507.23348 • Published Jul 31, 2025 • 12
SWE-Exp: Experience-Driven Software Issue Resolution Paper • 2507.23361 • Published Jul 31, 2025 • 14
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination Paper • 2507.10532 • Published Jul 14, 2025 • 90
Article Mixture of Experts Explained • osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.15k