On the SDEs and Scaling Rules for Adaptive Gradient Algorithms Paper • 2205.10287 • Published May 20, 2022
Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs? Paper • 2501.02669 • Published Jan 5, 2025
AdaptMI: Adaptive Skill-based In-context Math Instruction for Small Language Models Paper • 2505.00147 • Published Apr 30, 2025
Task-Specific Skill Localization in Fine-tuned Language Models Paper • 2302.06600 • Published Feb 13, 2023