lllqaq/R2EGym-7B-Agent-Coder-Instruct-num02_posttrain_r2egym_32768_8gpu Text Generation • 333k • Updated 1 day ago • 21
lllqaq/R2EGym-7B-Agent-Coder-Instruct-num01_posttrain_r2egym_32768_8gpu Text Generation • 333k • Updated 1 day ago • 16
lllqaq/R2EGym-32B-Agent-Coder-Instruct_merged_bucketab_4sources_20260228_101548_32768_8gpu Text Generation • 1.12M • Updated 9 days ago • 14
lllqaq/R2EGym-7B-Agent-Coder-Instruct-merged_bucketab_4sources_20260228_101548_32768_3gpu_oomfix Text Generation • 333k • Updated 9 days ago • 38
lllqaq/R2EGym-14B-Agent-Coder-Instruct1-traj_reward1_loose_4sources_shuf42_ckpt2400 841k • Updated 12 days ago • 11
lllqaq/R2EGym-14B-Agent-Coder-Instruct1-merged_bucketab_4sources_20260228_101548_32768_4gpu_oomfix Text Generation • 841k • Updated 13 days ago • 14
lllqaq/R2EGym-14B-Agent-Coder-Instruct-traj_bucketAB_multi_3sources_bucketAB_sft_shuf42 Text Generation • 841k • Updated 14 days ago • 12