shubhamprshr/Qwen2.5-1.5B-Instruct_gsm8k_grpo_gaussian_0.5_0.5_SEC0.3DRO1.0G0.0_minpTrue_1600 Text Generation • 2B • Updated Nov 20 • 9
shubhamprshr/Qwen2.5-1.5B-Instruct_gsm8k_grpo_gaussian_0.25_0.75_SEC0.3DRO1.0G0.0_minpTrue_1600 Text Generation • 2B • Updated Nov 20 • 6
shubhamprshr/Qwen2.5-1.5B-Instruct_countdown2345_grpo_cosine_0.5_0.5_SEC0.3DRO1.0G0.0_minpTrue_1600 Text Generation • 2B • Updated Nov 18 • 6
shubhamprshr/Qwen2.5-1.5B-Instruct_countdown2345_grpo_gaussian_0.5_0.5_SEC0.3DRO1.0G0.0_minpTrue_1600 Text Generation • 2B • Updated Nov 18 • 6
shubhamprshr/Qwen2.5-1.5B-Instruct_math_grpo_gaussian_0.5_0.5_SEC0.3DRO1.0G0.0_minpTrue_1600 Text Generation • 2B • Updated Nov 18 • 10
shubhamprshr/Qwen2.5-1.5B-Instruct_math_grpo_cosine_0.5_0.5_SEC0.3DRO1.0G0.0_minpTrue_1600 Text Generation • 2B • Updated Nov 18 • 11
shubhamprshr/Qwen2.5-1.5B-Instruct_math_grpo_balanced_0.5_0.5_SEC0.3DRO1.0G0.0_minpTrue_1600 Text Generation • 2B • Updated Nov 18 • 8
shubhamprshr/Qwen2.5-1.5B-Instruct_countdown2345_grpo_balanced_0.5_0.5_SEC0.3DRO1.0G0.0_minpTrue_1600 Text Generation • 2B • Updated Nov 18 • 6
shubhamprshr/Qwen2.5-1.5B-Instruct_blocksworld1246_grpo_cosine_0.5_0.5_SEC0.3DRO1.0G0.0_minpTrue_1200 Text Generation • 2B • Updated Nov 17 • 9
shubhamprshr/Qwen2.5-1.5B-Instruct_blocksworld1246_grpo_gaussian_0.25_0.75_SEC0.3DRO1.0G0.0_minpTrue_1200 Text Generation • 2B • Updated Nov 17 • 10
shubhamprshr/Qwen2.5-1.5B-Instruct_blocksworld1246_grpo_balanced_0.5_0.5_SEC0.3DRO1.0G0.0_minpTrue_1200 Text Generation • 2B • Updated Nov 17 • 14