Article Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models Jul 10, 2025 • 54
MLGym: A New Framework and Benchmark for Advancing AI Research Agents Paper • 2502.14499 • Published Feb 20, 2025 • 194
xukp20/Llama-3-8B-Instruct-SPPO-score-Iter3_gp_8b-table-0.002 Text Generation • 8B • Updated Sep 29, 2024 • 1
xukp20/Llama-3-8B-Instruct-SPPO-score-Iter3_bt_8b-table-0.002 Text Generation • 8B • Updated Sep 28, 2024 • 15
xukp20/Llama-3-8B-Instruct-SPPO-score-Iter3_bt_2b-table-0.001 Text Generation • 8B • Updated Sep 28, 2024 • 4
xukp20/Llama-3-8B-Instruct-SPPO-score-Iter3_gp_2b-table-0.001 Text Generation • 8B • Updated Sep 28, 2024 • 5