Update README.md
README.md CHANGED
@@ -205,7 +205,7 @@ All models are evaluated in chat mode (e.g. with the respective conversation tem
 
 | Model                       | Size   | HumanEval+ pass@1 |
 |-----------------------------|--------|-------------------|
-| **OpenChat
+| **OpenChat-3.5-0106**       | **7B** | **65.9**          |
 | ChatGPT (December 12, 2023) | -      | 64.6              |
 | WizardCoder-Python-34B-V1.0 | 34B    | 64.6              |
 | OpenChat 3.5 1210           | 7B     | 63.4              |
@@ -215,7 +215,7 @@ All models are evaluated in chat mode (e.g. with the respective conversation tem
 <h3>OpenChat-3.5 vs. Grok</h3>
 </div>
 
-🔥 OpenChat-3.5
+🔥 OpenChat-3.5-0106 (7B) now outperforms Grok-0 (33B) on **all 4 benchmarks** and Grok-1 (???B) on average and **3/4 benchmarks**.
 
 | | License | # Param | Average | MMLU | HumanEval | MATH | GSM8k |
 |-----------------------|-------------|---------|----------|--------|-----------|----------|----------|