Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning Paper • 2602.01058 • Published Feb 1 • 44
is-sft-271828/synlogic_grpo_qwen3-4b-base-qwen3-8b-3e-5-seq-seqlen_8192_chunksize_8 4B • Updated Jan 27 • 4
is-sft-271828/synlogic_grpo_qwen3-4b-base-qwen3-8b-3e-5-seq-seqlen_8192_chunksize_4 4B • Updated Jan 27 • 1
is-sft-271828/math_grpo_qwen3-4b-base-qwen3-8b-3e-5-seq-seqlen_8192_chunksize_8 4B • Updated Jan 27 • 1
is-sft-271828/math_grpo_qwen3-4b-base-qwen3-8b-3e-5-seq-seqlen_8192_chunksize_4 4B • Updated Jan 27 • 1