SRFT

SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning
Paper • 2506.19767 • Published Jun 24 • 15

Yuqian-Fu/SRFT-Qwen2.5-Math-7B
Text Generation • 8B • Updated Jul 24 • 31 • 3

Yuqian-Fu/SRFT-Qwen2.5-7B-Instruct
8B • Updated Jul 24 • 6

Yuqian-Fu/SRFT-Qwen2.5-Math-1.5B
2B • Updated Jul 24 • 4