shubhamrgandhi/qwen3-8b-full-sft-prm-r2egym-swebench-k5 • Text Generation • 1B • Updated 17 days ago • 15
shubhamrgandhi/qwen3-8b-full-sft-prm-r2egym-swebench-instructions-k5-cwm-plus-qwen • Text Generation • 1B • Updated 17 days ago • 8
shubhamrgandhi/qwen3-8b-full-sft-prm-r2egym-swebench-instructions-k5 • Text Generation • 1B • Updated 17 days ago • 16
shubhamrgandhi/qwen3-8b-full-sft-prm-r2egym-swebench-k5-opus-distill-32k-lr5e6-multiturn • Text Generation • 1B • Updated May 16 • 67
shubhamrgandhi/qwen3-8b-full-sft-prm-r2egym-k5-opus-distill-32k-lr5e6-multiturn • Text Generation • 1B • Updated May 5 • 5
shubhamrgandhi/qwen3-8b-full-sft-prm-r2egym-instructions-k10-opus-distill-32k-lr5e6-multiturn • Updated May 2
shubhamrgandhi/qwen3-8b-full-sft-prm-r2egym-instructions-k10-opus-distill-32k-lr5e6-flattened • Text Generation • 1B • Updated May 1 • 19
shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6-multiturn • Text Generation • 1B • Updated Apr 29 • 63
shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6-flattened • Text Generation • 1B • Updated Apr 28 • 64
shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6_clean_think • Text Generation • 1B • Updated Mar 28 • 3
shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6_clean • Text Generation • 1B • Updated Mar 27 • 4
shubhamrgandhi/qwen3-8b-full-sft-prm-opus-distill-32k-lr5e6_rejection-sample_think • Text Generation • 1B • Updated Mar 27 • 4