DCAgent2/swebench_verified_random_100_folders_GLM_4_6_stackexchange_overflow_sandboxes_3c2e552a9 Updated 43 minutes ago
DCAgent2/swebench_verified_random_100_folders_r2egymGPT5CodexPassed_nl2bash_bugsseq_Qwena0e0c3f6 Updated about 2 hours ago
DCAgent2/swebench_verified_random_100_folders_r2egymGPT5CodexPassed_nl2bash_bugsseq_Qwenb1c78a15 Updated about 2 hours ago
DCAgent2/swebench_verified_random_100_folders_r2egymGPT5CodexPassed_nl2bash_bugsseq_Qwen6a3f6328 Updated about 2 hours ago
DCAgent2/swebench_verified_random_100_folders_rl_r2egym_nl2bash_stack_bugsseq_fixthink_a7e78a3c7 Viewer • Updated about 6 hours ago • 300
DCAgent2/terminal_bench_2_exp_psu_stackoverflow_10K_glm_4_7_traces_20260311_170344 Viewer • Updated about 7 hours ago • 267
DCAgent2/swebench_verified_random_100_folders_Kimi_K2T_neulab_agenttuning_webshop_sandbod07c3d59 Viewer • Updated about 8 hours ago • 300
DCAgent2/swebench_verified_random_100_folders_Kimi_K2T_neulab_agenttuning_kg_sandboxes_me5f27cd1 Viewer • Updated about 8 hours ago • 300
DCAgent2/terminal_bench_2_exp_psu_stackoverflow_316_glm_4_7_traces_20260311_170339 Viewer • Updated about 8 hours ago • 267
DCAgent2/medagentbench_laion_r2egym-nl2bash-stack-bugsseq Viewer • Updated about 10 hours ago • 900 • 9