Update README.md
README.md
CHANGED
@@ -213,6 +213,48 @@ hf (pretrained=fblgit/LUNA-SOLARkrautLM-Instruct,dtype=float16), gen_kwargs: (),
```
| - social_sciences|N/A |none | 5|acc |0.7501|± |0.0684|
| - stem |N/A |none | 5|acc |0.5569|± |0.1360|
```
+### MT-Bench
+```
+########## Average ##########
+                                   score
+model
+gpt-4                           8.990625
+gpt-3.5-turbo                   7.943750
+claude-instant-v1               7.905660
+claude-v1                       7.900000
+UNA-SOLAR-10.7B-Instruct-v1.0   7.521875
+LUNA-SOLARkrautLM-Instruct      7.462500
+vicuna-33b-v1.3                 7.121875
+wizardlm-30b                    7.009375
+Llama-2-70b-chat                6.856250
+Llama-2-13b-chat                6.650000
+guanaco-33b                     6.528125
+tulu-30b                        6.434375
+guanaco-65b                     6.409375
+oasst-sft-7-llama-30b           6.409375
+palm-2-chat-bison-001           6.400000
+mpt-30b-chat                    6.393750
+vicuna-13b-v1.3                 6.387500
+wizardlm-13b                    6.353125
+Llama-2-7b-chat                 6.268750
+vicuna-7b-v1.3                  5.996875
+baize-v2-13b                    5.750000
+nous-hermes-13b                 5.553459
+mpt-7b-chat                     5.459119
+gpt4all-13b-snoozy              5.452830
+koala-13b                       5.350000
+mpt-30b-instruct                5.218750
+falcon-40b-instruct             5.168750
+h2ogpt-oasst-open-llama-13b     4.625000
+alpaca-13b                      4.531250
+chatglm-6b                      4.500000
+oasst-sft-4-pythia-12b          4.318750
+rwkv-4-raven-14b                3.984375
+dolly-v2-12b                    3.275000
+fastchat-t5-3b                  3.040625
+stablelm-tuned-alpha-7b         2.753125
+llama-13b                       2.606250
+```

## Disclaimer
We must inform users that despite our best efforts in data cleansing, the possibility of uncensored content slipping through cannot be entirely ruled out.