DCAgent2/terminal_bench_2_r2egym_nl2bash_stack_bugsseq_fixthink_20260223_182647 Viewer • Updated Feb 25 • 267 • 13
DCAgent2/dev_set_71_tasks_Kimi_K2T_neulab_agenttuning_webshop_sandboxes_maxeps_32k_20260aea47664 Viewer • Updated Feb 25 • 210 • 10
DCAgent2/dev_set_71_tasks_GLM_4_6_stackexchange_overflow_sandboxes_32eps_65k_reasoning_a805bcbf0 Viewer • Updated Feb 25 • 198 • 13
DCAgent2/dev_set_71_tasks_exp_tas_timeout_multiplier_0_25_traces_20260224_164649 Viewer • Updated Feb 25 • 210 • 16
DCAgent2/terminal_bench_2_GLM_4_7_inferredbugs_sandboxes_maxeps_131k_20260223_182736 Viewer • Updated Feb 25 • 264 • 14
DCAgent2/dev_set_71_tasks_exp_swd_r2egym_wo_docker_glm_4_7_traces_20260224_204728 Viewer • Updated Feb 25 • 208 • 14
DCAgent2/dev_set_71_tasks_Kimi_K2T_ling_coder_sft_sandboxes_1_maxeps_32k_20260224_204724 Viewer • Updated Feb 25 • 210 • 10
DCAgent2/dev_set_71_tasks_Kimi_K2T_neulab_agenttuning_kg_sandboxes_maxeps_32k_20260224_204722 Viewer • Updated Feb 25 • 210 • 11
DCAgent2/dev_set_71_tasks_Kimi_K2T_neulab_agenttuning_mind2web_sandboxes_maxeps_32k_20264c003700 Viewer • Updated Feb 25 • 210 • 16
DCAgent2/dev_set_71_tasks_rl_rl_conf_qwen_8b_ll_lr1e_5_bs64_yaml_mode_path_r2eg_nl2b_sta6505eea5 Viewer • Updated Feb 25 • 210 • 12
DCAgent2/terminal_bench_2_Qwen3_8B_exp_tas_summarize_threshold_4096_traces_save_strategydebc1646 Viewer • Updated Feb 25 • 267 • 12
DCAgent2/terminal_bench_2_exp_psu_stackoverflow_10K_glm_4_7_traces_20260224_062811 Viewer • Updated Feb 25 • 265 • 13
DCAgent2/dev_set_71_tasks_glm46_Toolscale_tasks_traces_20260224_204730 Viewer • Updated Feb 25 • 210 • 19
DCAgent2/terminal_bench_2_rl_think_npfg_code_contests_900s_45_20260223_182718 Viewer • Updated Feb 25 • 266 • 15
DCAgent2/terminal_bench_2_GLM_4_7_r2egym_sandboxes_maxeps_131k_20260224_062807 Viewer • Updated Feb 25 • 263 • 15
DCAgent2/dev_set_71_tasks_exp_tas_timeout_multiplier_4_0_traces_20260224_124611 Viewer • Updated Feb 25 • 210 • 15
DCAgent2/dev_set_71_tasks_GLM_4_7_swesmith_sandboxes_with_tests_oracle_verified_120s_max9adc42ba Viewer • Updated Feb 25 • 205 • 16
DCAgent2/terminal_bench_2_GLM_4_7_stackexchange_tezos_sandboxes_maxeps_131k_20260224_062809 Viewer • Updated Feb 25 • 265 • 15
DCAgent2/terminal_bench_2_GLM_4_7_swesmith_sandboxes_with_tests_oracle_verified_120s_maxc49ae07b Viewer • Updated Feb 25 • 266 • 13
DCAgent2/dev_set_71_tasks_exp_tas_timeout_multiplier_1_0_traces_20260224_164646 Viewer • Updated Feb 25 • 210 • 13
DCAgent2/dev_set_71_tasks_GLM_4_7_r2egym_sandboxes_maxeps_131k_20260224_124551 Viewer • Updated Feb 25 • 198 • 13
DCAgent2/terminal_bench_2_exp_psu_stackoverflow_316_glm_4_7_traces_20260224_164226 Viewer • Updated Feb 25 • 260 • 12
DCAgent2/terminal_bench_2_bs64_rloo_n_noct_stri_micr_model_noconv_r2eg_nl2_140_20260223_182721 Viewer • Updated Feb 25 • 267 • 15
DCAgent2/dev_set_71_tasks_GLM_4_7_inferredbugs_sandboxes_maxeps_131k_20260224_124548 Viewer • Updated Feb 25 • 205 • 15
DCAgent2/dev_set_71_tasks_exp_uns_r2egym_2_1x_glm_4_7_traces_locetash_20260224_124545 Viewer • Updated Feb 25 • 210 • 13
DCAgent2/dev_set_71_tasks_swesmith_sandboxes_with_tests_gpt_5_mini_passed_glm_4_7_traces64da835b Viewer • Updated Feb 25 • 210 • 17
DCAgent2/terminal_bench_2_dev_set_part1_10k_glm_4_7_traces_locetash_20260223_182731 Viewer • Updated Feb 25 • 264 • 12
DCAgent2/dev_set_71_tasks_exp_tas_timeout_multiplier_8_0_traces_20260224_164644 Viewer • Updated Feb 25 • 210 • 14
DCAgent2/dev_set_71_tasks_exp_psu_stackoverflow_3K_glm_4_7_traces_20260224_124556 Viewer • Updated Feb 25 • 210 • 14
DCAgent2/dev_set_71_tasks_exp_gfi_staqc_short_response_filtered_10K_glm_4_7_traces_locet6fb4c592 Viewer • Updated Feb 25 • 206 • 9