Reinforcement Learning for Reasoning in Large Language Models with One Training Example Paper • 2504.20571 • Published Apr 29, 2025 • 98
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • Updated Dec 10, 2025 • 324k • 1.58k
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22, 2025 • 440
Article MedEmbed: Fine-Tuned Embedding Models for Medical / Clinical IR Oct 20, 2024 • 53