Trouter-Library committed on
Commit 7d85879 · verified · 1 Parent(s): 2bad6cc

Update README.md

Files changed (1)
  1. README.md +0 -27
README.md CHANGED
@@ -91,33 +91,6 @@ The model incorporates several key architectural improvements:
  - Dynamic learning rate scheduling with restarts
  - Careful hyperparameter tuning for stability at scale
 
- ## Performance Benchmarks
-
- ### Reasoning and Knowledge
-
- | Benchmark | Score | Description |
- |-----------|-------|-------------|
- | MMLU | 84.7% | Massive Multitask Language Understanding |
- | ARC Challenge | 83.4% | Advanced reasoning and comprehension |
- | HellaSwag | 88.9% | Common sense inference |
- | WinoGrande | 82.3% | Commonsense reasoning |
- | TruthfulQA | 61.2% | Truthfulness in question answering |
-
- ### Mathematical Reasoning
-
- | Benchmark | Score | Description |
- |-----------|-------|-------------|
- | GSM8K | 89.2% | Grade school mathematics |
- | MATH | 56.7% | Competition-level mathematics |
- | Minerva Math | 53.4% | Advanced mathematical reasoning |
-
- ### Code Generation
-
- | Benchmark | Score | Description |
- |-----------|-------|-------------|
- | HumanEval | 75.6% | Python code generation |
- | MBPP | 72.3% | Basic Python programming |
- | DS-1000 | 64.5% | Data science code completion |
 
  ### Context Understanding
 
 