Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning Paper • 2505.01441 • Published Apr 28, 2025 • 39
NEMOTRON-CROSSTHINK: Scaling Self-Learning beyond Math Reasoning Paper • 2504.13941 • Published Apr 15, 2025 • 12
Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen! Paper • 2505.15656 • Published May 21, 2025 • 15
DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence Paper • 2606.19348 • Published Apr 26 • 9
Grouped Query Experts: Mixture-of-Experts on GQA Self-Attention Paper • 2606.20945 • Published 16 days ago • 76