TitleOS committed on
Commit 0fae950 · verified · 1 Parent(s): a0cde84

Update README.md

Files changed (1)
  1. README.md +19 -0
README.md CHANGED
@@ -71,6 +71,25 @@ It is ideal for:
  * Users seeking a coding assistant that offers conversation and personality alongside technical utility.
  * Coding without safety or moral refusals.
 
+ ## Benchmarking
+
+ Benchmarking is ongoing across a number of evaluation runs. So far, the following scores are available:
+ 1. LiveCodeBench (Code Generation Lite - Release v2)
+ Pass@1 (Quantization Q8_0): 26.22% (passed 134 out of 511 problems)
+
+ | Comparable Model | Parameter Size / Tier | Approximate Pass@1 |
+ | :--- | :--- | :--- |
+ | Llama-3-70b-Instruct | 70B | ~28.3% |
+ | GPT-4o-mini (2024-07) | Small Proprietary | ~27.7% |
+ | Claude 3 Sonnet (Original) | Large Proprietary | ~26.9% |
+ | Mixtral-8x22B-Instruct | 141B (MoE) | ~26.4% |
+ | **Eve-4B (Q8_0)** | 4B (Quantized) | 26.22% |
+ | Mistral-Large | Large Proprietary | ~26.0% |
+ | GPT-3.5-Turbo-0125 | Mid Proprietary | ~24.6% |
+ | Claude 3 Haiku | Small Proprietary | ~24.5% |
+ | Codestral-Latest | 22B | ~23.8% |
+ | Llama-3-8b-Instruct | 8B | ~15.3% |
+
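The reported Pass@1 figure can be checked from the raw counts. The sketch below is not part of the commit; it assumes one generated solution per problem, so that Pass@1 reduces to the fraction of problems solved. The helper name `pass_at_1` is hypothetical.

```python
def pass_at_1(passed: int, total: int) -> float:
    """Return Pass@1 as a percentage.

    Assumption: exactly one sample is generated per problem, so Pass@1
    is simply the share of problems whose single sample passed.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= passed <= total:
        raise ValueError("passed must be between 0 and total")
    return 100.0 * passed / total


# Counts taken from the README: 134 of 511 problems passed.
score = pass_at_1(134, 511)
print(f"Pass@1: {score:.2f}%")  # Pass@1: 26.22%
```

If several samples per problem were drawn instead, the simple ratio above would no longer apply and an unbiased pass@k estimator would be needed.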
  ## Limitations & Warning
 
  * **No Guardrails:** As a result of the Heretic process, this model has no safety filters. It will generate output for any request. Users are solely responsible for how they utilize the model's output.