DCAgent2/medagentbench_laion_GLM-4_6-stackexchange-overflow-sandboxes-32eps-65k-reasoning • Updated 13 days ago • 900 • 11
DCAgent2/medagentbench_laion_exp_tas_timeout_multiplier_8_0_traces • Updated 13 days ago • 886 • 10
DCAgent2/medagentbench_laion_exp-syh-tezos-askllm-hardened_glm_4_7_traces_jupiter • Updated 13 days ago • 880 • 8
DCAgent2/medagentbench_laion_exp-gfi-swesmith-random-filtered-10K_glm_4_7_traces_jupiter • Updated 13 days ago • 884 • 12
DCAgent2/medagentbench_laion_exp-gfi-staqc-askllm-filtered-10K_glm_4_7_traces_jupiter • Updated 13 days ago • 880 • 12
DCAgent2/medagentbench_laion_exp-syh-tezos-askllm-constrained_glm_4_7_traces_jupiter • Updated 13 days ago • 885 • 11
DCAgent2/medagentbench_laion_exp-uns-tezos-10x_glm_4_7_traces_jupiter • Updated 13 days ago • 887 • 12
DCAgent2/terminal_bench_2_r2egym_nl2bash_stack_bugsseq_fixthink_stack_pytest_large_2026068e508aa • Updated 13 days ago • 267 • 8
DCAgent2/medagentbench_laion_syh-r2eg-askl-glm_4-7_trac_jupi_-gfi-swes-rand-filt-10K_glm0f247f11 • Updated 13 days ago • 884 • 12
DCAgent2/medagentbench_laion_r2egym-nl2bash-stack-bugsseq-fixthink-again • Updated 13 days ago • 886 • 11
DCAgent2/medagentbench_laion_GLM-4.6-stackexchange-overflow-sandboxes-32eps-65k-reasonin8d301331 • Updated 13 days ago • 875 • 12
DCAgent2/swebench_verified_random_100_folders_rl_tp4s64_8x_proportional_20260307_081057 • Updated 13 days ago • 300 • 9
DCAgent2/terminal_bench_2_exp_uns_r2egym_33_6x_glm_4_7_traces_jupiter_20260306_154530 • Updated 13 days ago • 267 • 9
DCAgent2/terminal_bench_2_GLM_4_6_stackexchange_overflow_sandboxes_32eps_65k_reasoning_n8681c473 • Updated 13 days ago • 267 • 8
DCAgent2/terminal_bench_2_GLM_4_6_stackexchange_overflow_sandboxes_32eps_65k_reasoning_n728da60f • Updated 13 days ago • 267 • 7
DCAgent2/terminal_bench_2_dev_set_part1_10k_glm_4_7_traces_locetash_20260306_154532 • Updated 13 days ago • 267 • 10
DCAgent2/terminal_bench_2_exp_uns_r2egym_8_4x_glm_4_7_traces_jupiter_20260306_154539 • Updated 13 days ago • 267 • 10
DCAgent2/DCAgent_dev_set_v2_laion_dev_set_part1_10k_glm_4_7_traces_jupiter_tm4x_20260306_173753 • Updated 13 days ago • 171 • 8
DCAgent2/terminal_bench_2_exp_swd_r2egym_wo_docker_glm_4_7_traces_20260306_154536 • Updated 13 days ago • 267 • 9
DCAgent2/DCAgent2_terminal_bench_2_laion_GLM-4_7-r2egym_sandboxes-maxeps-131k-lc_20260307_035544 • Updated 13 days ago • 267 • 8
DCAgent2/terminal_bench_2_Kimi_K2T_neulab_agenttuning_webshop_sandboxes_maxeps_32k_20260db0a1230 • Updated 13 days ago • 267 • 8
DCAgent2/swebench_verified_random_100_folders_OpenThinker_Agent_v1_SFT_20260306_200516 • Updated 13 days ago • 300 • 7
DCAgent2/terminal_bench_2_GLM_4_6_stackexchange_overflow_sandboxes_32eps_65k_reasoning_n85b835d0 • Updated 13 days ago • 267 • 10
DCAgent2/terminal_bench_2_GLM_4_6_stackexchange_overflow_sandboxes_32eps_65k_reasoning_n9e326f89 • Updated 13 days ago • 267 • 9
DCAgent2/DCAgent_dev_set_v2_laion_sft_GLM-4-7-swesmith-sandboxes-with_tests-oracle_veriffb8331c1 • Updated 13 days ago • 168 • 7
DCAgent2/terminal_bench_2_GLM_4_7_swesmith_sandboxes_with_tests_oracle_verified_120s_max60985f4c • Updated 13 days ago • 266 • 6
DCAgent2/terminal_bench_2_GLM_4_6_stackexchange_overflow_sandboxes_32eps_65k_reasoning_n7051d1fa • Updated 14 days ago • 267 • 8
DCAgent2/terminal_bench_2_GLM_4_6_stackexchange_overflow_sandboxes_32eps_65k_reasoning_n893ed936 • Updated 14 days ago • 267 • 8