On Data Engineering for Scaling LLM Terminal Capabilities Paper • 2602.21193 • Published Feb 24 • 102
ValueFX9507/Tifa-Deepsex-14b-CoT-GGUF-Q4 Reinforcement Learning • 15B • Updated Feb 13, 2025 • 1.6k • 830
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28, 2025 • 125