test-gen/num1_code_livecodebench_qwen2.5-3b_t0.1_n8_tests_livecodebench_qwen-7b-easy_t0.0_n1
• 182 • 5
test-gen/num10_code_humaneval_qwen2.5-3b_t0.1_n8_tests_humaneval_qwen3-4b_t0.6_n1_think
• 164 • 6
test-gen/num10_code_humaneval_qwen2.5-3b_t0.1_n8_tests_humaneval_o3_t0_n1
• 164 • 6
test-gen/num5_code_humaneval_qwen2.5-3b_t0.1_n8_tests_humaneval_qwen3-4b_t0.6_n1_think
• 164 • 6
test-gen/num5_code_humaneval_qwen2.5-3b_t0.1_n8_tests_humaneval_o3_t0_n1
• 164 • 6
test-gen/num1_code_humaneval_qwen2.5-3b_t0.1_n8_tests_humaneval_qwen3-4b_t0.6_n1_think
• 164 • 4
test-gen/num1_code_humaneval_qwen2.5-3b_t0.1_n8_tests_humaneval_o3_t0_n1
• 164 • 6
test-gen/num10_code_mbpp_qwen2.5-3b_t0.1_n8_tests_mbpp_qwen3-4b_t0.6_n1_think
• 500 • 6
test-gen/num10_code_mbpp_qwen2.5-3b_t0.1_n8_tests_mbpp_o3_t0_n1
• 500 • 6
test-gen/num5_code_mbpp_qwen2.5-3b_t0.1_n8_tests_mbpp_qwen3-4b_t0.6_n1_think
• 500 • 5
test-gen/num5_code_mbpp_qwen2.5-3b_t0.1_n8_tests_mbpp_o3_t0_n1
• 500 • 6
test-gen/num1_code_mbpp_qwen2.5-3b_t0.1_n8_tests_mbpp_qwen3-4b_t0.6_n1_think
• 500 • 6
test-gen/num1_code_mbpp_qwen2.5-3b_t0.1_n8_tests_mbpp_o3_t0_n1
• 500 • 6
test-gen/num10_code_livecodebench_qwen2.5-3b_t0.1_n8_tests_livecodebench_qwen3-4b_t0.6_n1_think
• 182 • 6
test-gen/num10_code_livecodebench_qwen2.5-3b_t0.1_n8_tests_livecodebench_o3_t0_n1
• 182 • 6
test-gen/num5_code_livecodebench_qwen2.5-3b_t0.1_n8_tests_livecodebench_qwen3-4b_t0.6_n1_think
• 182 • 6
test-gen/num5_code_livecodebench_qwen2.5-3b_t0.1_n8_tests_livecodebench_o3_t0_n1
• 182 • 7
test-gen/num1_code_livecodebench_qwen2.5-3b_t0.1_n8_tests_livecodebench_qwen3-4b_t0.6_n1_think
• 182 • 6
test-gen/num1_code_livecodebench_qwen2.5-3b_t0.1_n8_tests_livecodebench_o3_t0_n1
• 182 • 6
test-gen/code_livecodebench_qwen2.5-3b_t0.1_n8_tests_livecodebench_o3_t0_n1
• 182 • 6
test-gen/num10_code_humaneval_qwen2.5-3b_t0.1_n8_tests_humaneval_o4-mini_t0_n1
• 164 • 6
test-gen/num5_code_humaneval_qwen2.5-3b_t0.1_n8_tests_humaneval_o4-mini_t0_n1
• 164 • 6
test-gen/num1_code_humaneval_qwen2.5-3b_t0.1_n8_tests_humaneval_o4-mini_t0_n1
• 164 • 6
test-gen/num10_code_mbpp_qwen2.5-3b_t0.1_n8_tests_mbpp_o4-mini_t0_n1
• 500 • 6
test-gen/num5_code_mbpp_qwen2.5-3b_t0.1_n8_tests_mbpp_o4-mini_t0_n1
• 500 • 6
test-gen/num1_code_mbpp_qwen2.5-3b_t0.1_n8_tests_mbpp_o4-mini_t0_n1
• 500 • 6
test-gen/num10_code_livecodebench_qwen2.5-3b_t0.1_n8_tests_livecodebench_o4-mini_t0_n1
• 182 • 6
test-gen/num5_code_livecodebench_qwen2.5-3b_t0.1_n8_tests_livecodebench_o4-mini_t0_n1
• 182 • 6
test-gen/num1_code_livecodebench_qwen2.5-3b_t0.1_n8_tests_livecodebench_o4-mini_t0_n1
• 182 • 6
test-gen/livecodebench_o3_t0_n1_generated_tests
• 182 • 6