HippolyteP committed
Commit 54ff14e · verified · 1 parent: 7f37fd2

Update README.md

Files changed (1): README.md +11 -11
@@ -124,18 +124,18 @@ The model was evaluated using [OLMES](https://arxiv.org/abs/2406.08446) a LLM ev
  | Benchmark | Sequential-Helium 6B | Shuffled-Helium 6B |
  |--------------|:------:|:------:|
  | | | |
- | MMLU | 58.8 | 56.4 |
- | ARC E | 87.6 | 86.7 |
- | ARC C | 74.5 | 72.1 |
- | OBQA | 72.8 | 73.2 |
- | CSQA | 73.1 | 74.3 |
- | PIQA | 80.3 | 80.2 |
- | SIQA | 67.0 | 66.2 |
- | HS | 79.1 | 81.3 |
- | WG | 73.0 | 73.1 |
- | BoolQA | 83.9 | 83.9 |
+ | MMLU | 59.2 | 56.9 |
+ | ARC E | 87.7 | 86.6 |
+ | ARC C | 74.6 | 72.3 |
+ | OBQA | 74.0 | 72.8 |
+ | CSQA | 73.6 | 74.2 |
+ | PIQA | 79.9 | 80.3 |
+ | SIQA | 66.9 | 67.6 |
+ | HS | 78.9 | 81.2 |
+ | WG | 73.2 | 73.3 |
+ | BoolQA | 84.0 | 83.7 |
  | | | |
- | OLMES | 75.0 | 74.7 |
+ | OLMES | 77.0 | 77.0 |
 
 
 