Special-R1 - an OpenLearnLM Collection

Special-R1

updated May 5

OpenLearnLM/special-r1-deepseek-qwen3-8b-sped-adaptive-think-noreward

Text Generation • 8B • Updated Apr 7 • 3

OpenLearnLM/special-r1-deepseek-qwen3-8b-sped-adaptive-think-reward

Text Generation • 8B • Updated Apr 17 • 7

OpenLearnLM/special-r1-deepseek-qwen3-8b-merged-dare-v2

Text Generation • 8B • Updated May 4 • 5