legolasyiu committed · Commit 68499e3 · verified · 1 Parent(s): d242dbb

Update README.md

Files changed (1)
  1. README.md +8 -8
README.md CHANGED
@@ -62,14 +62,14 @@ Debugged vibecoder dataset
 
 ### 📊 Model Evaluation Results
 
-| Tasks | Version | Filter | n-shot | Metric | VibeCoder-20b-0.02-Debugger | gpt-oss-20 | Qwen 3 235B |
-|-----------------------------------|----------|------------------|--------|----------------|-----------------------------|-------------|--------------|
-| gsm8k_cot | 3 | flexible-extract | 3 | exact_match ↑ | 0.8452 (+0.7667) | 0.78 | 0.82 |
-| humaneval | 1 | create_test | 0 | exact_match ↑ | 0.933 (+0.8) | 0.73 | 0.92 |
-| mmlu_college_biology | 1 | create_test | 0 | exact_match ↑ | 1.000 (+ – ) | — | — |
-| mmlu_high_school_computer_science | 1 | create_test | 0 | exact_match ↑ | 1.000 (+0.9) | — | — |
-| computer_security | 1 | none | 2 | acc ↑ | 0.8528 (+0.700) | — | — |
-| college_computer_science | 1 | none | 2 | acc ↑ | 0.8528 (+0.700) | — | — |
+| Tasks | Version | Filter | n-shot | Metric | VibeCoder-20b-0.02-Debugger | gpt-oss-20 | Qwen 3 235B |
+|--------------------------|----------|------------------|--------|----------------|-----------------------------|-------------|--------------|
+| gsm8k_cot | 3 | flexible-extract | 3 | exact_match ↑ | 0.8452 (+0.7667) | 0.78 | 0.82 |
+| humaneval | 1 | create_test | 0 | exact_match ↑ | 0.933 (+0.8) | 0.73 | 0.92 |
+| mmlu_college_biology | 1 | create_test | 0 | exact_match ↑ | 1.000 (+ – ) | — | — |
+| mmlu_HS_computer_science | 1 | create_test | 0 | exact_match ↑ | 1.000 (+0.9) | — | — |
+| computer_security | 1 | none | 2 | acc ↑ | 0.8528 (+0.700) | — | — |
+| college_computer_science | 1 | none | 2 | acc ↑ | 0.8528 (+0.700) | — | — |
 
 ---