deepanshupillm committed on
Commit 65b6be8 · verified · 1 Parent(s): 6644c49

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -86,6 +86,8 @@ This SFT approach enables Alpie-Core to deliver reliable, aligned, and context-a
 
  ![BBH Benchmark](Benchmark_BBH.png)
 
+ ![Humanity's Last Exam](Humanity's_Last_Exam_(Text_Only)_-_Accuracy_Comparison.png)
+
  ![Combined Benchmark](combined_benchmark.png)
 
  | Benchmark | Alpie-Core (32B-4bit) | DeepSeek-V2 (236B) | Qwen2.5 72B | Llama 3.1 405B | Llama 3.1 70B | Gemma-3 27B-PT | Mistral-Small-24B-Base-2501 |
@@ -113,8 +115,6 @@ This SFT approach enables Alpie-Core to deliver reliable, aligned, and context-a
 
  ### Humanity's Last Exam Leaderboard Performance
 
- ![Humanity's Last Exam](Humanity's_Last_Exam_(Text_Only).png)
-
  | Rank | Model | Accuracy (%) | Performance vs Alpie |
  |------|-------|-------------|---------------------|
  | 1 | GPT 4.5 Preview | 5.8 | Above Alpie |