DCAgent2/dev_set_71_tasks_exp_gfi_staqc_askllm_filtered_10K_glm_4_7_traces_jupiter_202606cd59d3e Viewer • Updated Feb 24 • 198 • 12
DCAgent2/dev_set_71_tasks_exp_gfi_swesmith_random_filtered_10K_glm_4_7_traces_jupiter_20b3bfb505 Viewer • Updated Feb 24 • 203 • 15
DCAgent2/dev_set_71_tasks_exp_syh_r2egym_askllm_constrained_glm_4_7_traces_jupiter_202608fcab736 Viewer • Updated Feb 24 • 202 • 15
DCAgent2/swebench_verified_random_100_folders_exp_swd_r2egym_wo_docker_glm_4_7_traces_20e35cb356 Viewer • Updated Feb 24 • 300 • 15
DCAgent2/dev_set_71_tasks_bs64_rloo_n_noct_stri_micr_auto_conv_pref_model_r2e_120_202602c7fec705 Viewer • Updated Feb 24 • 210 • 16
DCAgent2/dev_set_v2_GLM_4_6_stackexchange_overflow_sandboxes_32eps_65k_reasoning_adam_bed5a504e8 Viewer • Updated Feb 24 • 293 • 14
DCAgent2/swebench_verified_random_100_folders_Kimi_K2T_neulab_agenttuning_webshop_sandboeeb3cade Viewer • Updated Feb 24 • 300 • 17
DCAgent2/dev_set_71_tasks_exp_syh_r2egym_swesmith_mixed_glm_4_7_traces_jupiter_20260223_204026 Viewer • Updated Feb 24 • 208 • 15
DCAgent2/dev_set_71_tasks_GLM_4_6_inferredbugs_32eps_65k_fixeps_20260224_004148 Viewer • Updated Feb 24 • 210 • 15
DCAgent2/dev_set_71_tasks_rl_base_exp_rpt_stack_bash_with_gpt5_90_20260224_044315 Viewer • Updated Feb 24 • 210 • 20
DCAgent2/terminal_bench_2_rl_base_code_contests_900s_reg_lr1e_5_140_20260223_182655 Viewer • Updated Feb 24 • 267 • 16
DCAgent2/dev_set_71_tasks_glm46_swesmith_maxeps_131k_fixthink_20260224_044300 Viewer • Updated Feb 24 • 210 • 13
DCAgent2/dev_set_71_tasks_rl_base_code_contests_900s_160_20260224_044306 Viewer • Updated Feb 24 • 210 • 13
DCAgent2/terminal_bench_2_GLM_4_6_stackexchange_overflow_sandboxes_32eps_65k_reasoning_ncb4958b8 Viewer • Updated Feb 24 • 267 • 17
DCAgent2/terminal_bench_2_rl_base_exp_rpt_stack_bash_with_gpt5_90_20260223_182659 Viewer • Updated Feb 24 • 267 • 17
DCAgent2/terminal_bench_2_rl_base_exp_rpt_stack_bash_90_20260223_182701 Viewer • Updated Feb 24 • 267 • 12
DCAgent2/terminal_bench_2_rl_base_code_contests_900s_160_20260223_182651 Viewer • Updated Feb 24 • 267 • 11
DCAgent2/terminal_bench_2_qwen3base_GLM_4_7_swesmith_sandboxes_with_tests_oracle_verifie2fcf8400 Viewer • Updated Feb 24 • 267 • 14
DCAgent2/dev_set_71_tasks_rl_base_code_contests_900s_reg_140_20260224_044308 Viewer • Updated Feb 24 • 210 • 12
DCAgent2/terminal_bench_2_GLM_4_6_stackexchange_overflow_sandboxes_32eps_65k_reasoning_n7412a17e Viewer • Updated Feb 24 • 267 • 9
DCAgent2/swebench_verified_random_100_folders_glm46_Toolscale_tasks_traces_20260223_132956 Viewer • Updated Feb 24 • 300 • 11
DCAgent2/swebench_verified_random_100_folders_Kimi_K2T_ling_coder_sft_sandboxes_1_maxeps71f356b7 Viewer • Updated Feb 24 • 300 • 13
DCAgent2/terminal_bench_2_rl_base_code_contests_900s_reg_140_20260223_182653 Viewer • Updated Feb 24 • 267 • 14
DCAgent2/terminal_bench_2_GLM_4_6_taskmaster2_32eps_32k_fixeps_20260223_182629 Viewer • Updated Feb 24 • 266 • 11
DCAgent2/terminal_bench_2_glm46_swesmith_maxeps_131k_fixthink_20260223_182642 Viewer • Updated Feb 24 • 267 • 13
DCAgent2/dev_set_71_tasks_exp_syh_tezos_askllm_hardened_glm_4_7_traces_jupiter_20260223_175743 Viewer • Updated Feb 24 • 203 • 12
DCAgent2/dev_set_71_tasks_perturbed_docker_exp_freelancer_tasks_glm_4_7_traces_20260223_175750 Viewer • Updated Feb 24 • 206 • 10
DCAgent2/dev_set_71_tasks_r2egym_nl2bash_stack_bugsseq_fixthink_20260223_175802 Viewer • Updated Feb 24 • 208 • 11
DCAgent2/dev_set_71_tasks_r2egym_nl2bash_stack_bugsseq_crosscodeeval_python_v2_20260223_183815 Viewer • Updated Feb 24 • 209 • 12
DCAgent2/terminal_bench_2_bs64_rloo_n_noct_stri_micr_auto_conv_pref_model_r2e_120_202602a8301381 Viewer • Updated Feb 24 • 267 • 16