TitleOS committed (verified)
Commit a7c7d8e · 1 parent: a026df6

Update README.md

Files changed (1): README.md (+12 −21)
README.md CHANGED
@@ -77,27 +77,18 @@ Benchmarking is on-going, with a number of evaluations runs. So far, the followi
 1. LiveCodeBench (Code Generation Lite - Release v2)
 Pass@1 (Quantization Q8_0): 26.22% (Passed 134 out of 511 problems)
 
-Comparable Model Parameter Size / Tier Approximate Pass@1:
-
-LLama-3-70b-Instruct 70B ~28.3%
-
-GPT-4o-mini (2024-07) Small Proprietary ~27.7%
-
-Claude 3 Sonnet (Original) Large Proprietary ~26.9%
-
-Mixtral-8x22B-Instruct 141B (MoE) ~26.4%
-
-**Eve-4B (Q8_0) 4B (Quantized)** **26.22%**
-
-Mistral-Large Large Proprietary ~26.0%
-
-GPT-3.5-Turbo-0125 Mid Proprietary ~24.6%
-
-Claude 3 Haiku Small Proprietary ~24.5%
-
-Codestral-Latest 22B ~23.8%
-
-Llama-3-8b-Instruct 8B ~15.3%
+| Comparable Model | Parameter Size / Tier | Approximate Pass@1 |
+| :--- | :--- | :--- |
+| LLama-3-70b-Instruct | 70B | ~28.3% |
+| GPT-4o-mini (2024-07) | Small Proprietary | ~27.7% |
+| Claude 3 Sonnet (Original) | Large Proprietary | ~26.9% |
+| Mixtral-8x22B-Instruct | 141B (MoE) | ~26.4% |
+| **Eve-4B (Q8_0)** | 4B (Quantized) | 26.22% |
+| Mistral-Large | Large Proprietary | ~26.0% |
+| GPT-3.5-Turbo-0125 | Mid Proprietary | ~24.6% |
+| Claude 3 Haiku | Small Proprietary | ~24.5% |
+| Codestral-Latest | 22B | ~23.8% |
+| Llama-3-8b-Instruct | 8B | ~15.3% |
 
 ## Limitations & Warning
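As a quick sanity check of the headline number in the diff above: Pass@1 here is simply passed problems divided by total problems. A minimal sketch, assuming only the two counts reported in the README (134 passed out of 511):

```python
# Sanity check of the reported LiveCodeBench score for Eve-4B (Q8_0).
# Counts taken from the README line:
# "Pass@1 (Quantization Q8_0): 26.22% (Passed 134 out of 511 problems)"
passed = 134   # problems solved on the first attempt
total = 511    # problems in the benchmark release

pass_at_1 = passed / total * 100
print(f"Pass@1: {pass_at_1:.2f}%")  # → Pass@1: 26.22%
```

This reproduces the 26.22% figure shown in the comparison table.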