Antreas committed
Commit 1892d8f · verified · 1 Parent(s): 8caadee

Initial upload: ogma-base embedding model

This view is limited to 50 files because the commit contains too many changes.
Files changed (50)
  1. README.md +927 -0
  2. config.json +37 -0
  3. config.py +161 -0
  4. config.yaml +19 -0
  5. embeddings.py +143 -0
  6. model.pt +3 -0
  7. model.safetensors +3 -0
  8. ogma_model.py +203 -0
  9. pooling.py +99 -0
  10. results/AmazonCounterfactualClassification.json +526 -0
  11. results/AmazonPolarityClassification.json +140 -0
  12. results/AmazonReviewsClassification.json +270 -0
  13. results/ArXivHierarchicalClusteringP2P.json +47 -0
  14. results/ArXivHierarchicalClusteringS2S.json +47 -0
  15. results/ArguAna.json +167 -0
  16. results/AskUbuntuDupQuestions.json +167 -0
  17. results/BIOSSES.json +27 -0
  18. results/Banking77Classification.json +140 -0
  19. results/BiorxivClusteringP2P.json +33 -0
  20. results/BiorxivClusteringS2S.json +33 -0
  21. results/CQADupstackAndroidRetrieval.json +167 -0
  22. results/CQADupstackEnglishRetrieval.json +167 -0
  23. results/CQADupstackGamingRetrieval.json +167 -0
  24. results/CQADupstackGisRetrieval.json +167 -0
  25. results/CQADupstackMathematicaRetrieval.json +167 -0
  26. results/CQADupstackPhysicsRetrieval.json +167 -0
  27. results/CQADupstackProgrammersRetrieval.json +167 -0
  28. results/CQADupstackRetrieval.json +20 -0
  29. results/CQADupstackStatsRetrieval.json +167 -0
  30. results/CQADupstackTexRetrieval.json +167 -0
  31. results/CQADupstackUnixRetrieval.json +167 -0
  32. results/CQADupstackWebmastersRetrieval.json +167 -0
  33. results/CQADupstackWordpressRetrieval.json +167 -0
  34. results/ClimateFEVER.json +167 -0
  35. results/DBPedia.json +167 -0
  36. results/EmotionClassification.json +140 -0
  37. results/FEVER.json +167 -0
  38. results/FiQA2018.json +167 -0
  39. results/HotpotQA.json +167 -0
  40. results/ImdbClassification.json +140 -0
  41. results/MSMARCO.json +167 -0
  42. results/MTOPDomainClassification.json +270 -0
  43. results/MTOPIntentClassification.json +270 -0
  44. results/MassiveIntentClassification.json +270 -0
  45. results/MassiveScenarioClassification.json +270 -0
  46. results/MedrxivClusteringP2P.json +33 -0
  47. results/MedrxivClusteringS2S.json +33 -0
  48. results/MindSmallReranking.json +252 -0
  49. results/NFCorpus.json +167 -0
  50. results/NQ.json +167 -0
README.md ADDED
@@ -0,0 +1,927 @@
1
+ ---
2
+ language:
3
+ - en
4
+ license: mit
5
+ tags:
6
+ - mteb
7
+ - sentence-transformers
8
+ - embedding
9
+ - text-embedding
10
+ - ogma
11
+ - axiotic
12
+ - matryoshka
13
+ - small-model
14
+ model-index:
15
+ - name: ogma-base
16
+ results:
17
+ - task:
18
+ type: Classification
19
+ dataset:
20
+ type: mteb/AmazonCounterfactualClassification
21
+ name: MTEB AmazonCounterfactualClassification
22
+ config: default
23
+ split: test
24
+ revision: 1f7e6a9d6fa6e64c53d146e428565640410c0df1
25
+ metrics:
26
+ - type: accuracy
27
+ value: 73.13
28
+ - task:
29
+ type: Classification
30
+ dataset:
31
+ type: mteb/AmazonPolarityClassification
32
+ name: MTEB AmazonPolarityClassification
33
+ config: default
34
+ split: test
35
+ revision: e2d317d38cd51312af73b3d32a06d1a08b442046
36
+ metrics:
37
+ - type: accuracy
38
+ value: 79.85
39
+ - task:
40
+ type: Classification
41
+ dataset:
42
+ type: mteb/AmazonReviewsClassification
43
+ name: MTEB AmazonReviewsClassification
44
+ config: default
45
+ split: test
46
+ revision: 6b5d328eaae8ef408dd7d775040245cf86f92e9d
47
+ metrics:
48
+ - type: accuracy
49
+ value: 39.47
50
+ - task:
51
+ type: Clustering
52
+ dataset:
53
+ type: mteb/ArXivHierarchicalClusteringP2P
54
+ name: MTEB ArXivHierarchicalClusteringP2P
55
+ config: default
56
+ split: test
57
+ revision: 0bbdb47bcbe3a90093699aefeed338a0f28a7ee8
58
+ metrics:
59
+ - type: v_measure
60
+ value: 55.83
61
+ - task:
62
+ type: Clustering
63
+ dataset:
64
+ type: mteb/ArXivHierarchicalClusteringS2S
65
+ name: MTEB ArXivHierarchicalClusteringS2S
66
+ config: default
67
+ split: test
68
+ revision: b73bd54100e5abfa6e3a23dcafb46fe4d2438dc3
69
+ metrics:
70
+ - type: v_measure
71
+ value: 52.73
72
+ - task:
73
+ type: Retrieval
74
+ dataset:
75
+ type: mteb/ArguAna
76
+ name: MTEB ArguAna
77
+ config: default
78
+ split: test
79
+ revision: c22ab2a51041ffd869aaddef7af8d8215647e41a
80
+ metrics:
81
+ - type: ndcg_at_10
82
+ value: 45.0
83
+ - task:
84
+ type: Reranking
85
+ dataset:
86
+ type: mteb/AskUbuntuDupQuestions
87
+ name: MTEB AskUbuntuDupQuestions
88
+ config: default
89
+ split: test
90
+ revision: c5691e3c48741d5f83b5cc8e630653d7a8cfc048
91
+ metrics:
92
+ - type: map
93
+ value: 56.76
94
+ - task:
95
+ type: STS
96
+ dataset:
97
+ type: mteb/BIOSSES
98
+ name: MTEB BIOSSES
99
+ config: default
100
+ split: test
101
+ revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
102
+ metrics:
103
+ - type: cosine_spearman
104
+ value: 84.15
105
+ - task:
106
+ type: Classification
107
+ dataset:
108
+ type: mteb/Banking77Classification
109
+ name: MTEB Banking77Classification
110
+ config: default
111
+ split: test
112
+ revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
113
+ metrics:
114
+ - type: accuracy
115
+ value: 78.56
116
+ - task:
117
+ type: Clustering
118
+ dataset:
119
+ type: mteb/BiorxivClusteringP2P
120
+ name: MTEB BiorxivClusteringP2P
121
+ config: default
122
+ split: test
123
+ revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
124
+ metrics:
125
+ - type: v_measure
126
+ value: 34.11
127
+ - task:
128
+ type: Clustering
129
+ dataset:
130
+ type: mteb/BiorxivClusteringS2S
131
+ name: MTEB BiorxivClusteringS2S
132
+ config: default
133
+ split: test
134
+ revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
135
+ metrics:
136
+ - type: v_measure
137
+ value: 26.34
138
+ - task:
139
+ type: Retrieval
140
+ dataset:
141
+ type: mteb/CQADupstackAndroidRetrieval
142
+ name: MTEB CQADupstackAndroidRetrieval
143
+ config: default
144
+ split: test
145
+ revision: 9be4c0e46342e8e3aff577a89b9a1ec9bc6b4af3
146
+ metrics:
147
+ - type: ndcg_at_10
148
+ value: 37.28
149
+ - task:
150
+ type: Retrieval
151
+ dataset:
152
+ type: mteb/CQADupstackEnglishRetrieval
153
+ name: MTEB CQADupstackEnglishRetrieval
154
+ config: default
155
+ split: test
156
+ revision: ad9991cb51e31e31e430383c75ffb2885547b5f0
157
+ metrics:
158
+ - type: ndcg_at_10
159
+ value: 34.91
160
+ - task:
161
+ type: Retrieval
162
+ dataset:
163
+ type: mteb/CQADupstackGamingRetrieval
164
+ name: MTEB CQADupstackGamingRetrieval
165
+ config: default
166
+ split: test
167
+ revision: 4885aa143210c98657558c04aaf3dc47cfb54340
168
+ metrics:
169
+ - type: ndcg_at_10
170
+ value: 44.91
171
+ - task:
172
+ type: Retrieval
173
+ dataset:
174
+ type: mteb/CQADupstackGisRetrieval
175
+ name: MTEB CQADupstackGisRetrieval
176
+ config: default
177
+ split: test
178
+ revision: 5003b3064772da1887988e05400cf3806fe491f2
179
+ metrics:
180
+ - type: ndcg_at_10
181
+ value: 29.65
182
+ - task:
183
+ type: Retrieval
184
+ dataset:
185
+ type: mteb/CQADupstackMathematicaRetrieval
186
+ name: MTEB CQADupstackMathematicaRetrieval
187
+ config: default
188
+ split: test
189
+ revision: 90fceea13679c63fe563ded68f3b6f06e50061de
190
+ metrics:
191
+ - type: ndcg_at_10
192
+ value: 24.91
193
+ - task:
194
+ type: Retrieval
195
+ dataset:
196
+ type: mteb/CQADupstackPhysicsRetrieval
197
+ name: MTEB CQADupstackPhysicsRetrieval
198
+ config: default
199
+ split: test
200
+ revision: 79531abbd1fb92d06c6d6315a0cbbbf5bb247ea4
201
+ metrics:
202
+ - type: ndcg_at_10
203
+ value: 34.23
204
+ - task:
205
+ type: Retrieval
206
+ dataset:
207
+ type: mteb/CQADupstackProgrammersRetrieval
208
+ name: MTEB CQADupstackProgrammersRetrieval
209
+ config: default
210
+ split: test
211
+ revision: 6184bc1440d2dbc7612be22b50686b8826d22b32
212
+ metrics:
213
+ - type: ndcg_at_10
214
+ value: 33.53
215
+ - task:
216
+ type: Retrieval
217
+ dataset:
218
+ type: mteb/CQADupstackRetrieval
219
+ name: MTEB CQADupstackRetrieval
220
+ config: default
221
+ split: test
222
+ revision: '1'
223
+ metrics:
224
+ - type: ndcg_at_10
225
+ value: 31.05
226
+ - task:
227
+ type: Retrieval
228
+ dataset:
229
+ type: mteb/CQADupstackStatsRetrieval
230
+ name: MTEB CQADupstackStatsRetrieval
231
+ config: default
232
+ split: test
233
+ revision: 65ac3a16b8e91f9cee4c9828cc7c335575432a2a
234
+ metrics:
235
+ - type: ndcg_at_10
236
+ value: 26.66
237
+ - task:
238
+ type: Retrieval
239
+ dataset:
240
+ type: mteb/CQADupstackTexRetrieval
241
+ name: MTEB CQADupstackTexRetrieval
242
+ config: default
243
+ split: test
244
+ revision: 46989137a86843e03a6195de44b09deda022eec7
245
+ metrics:
246
+ - type: ndcg_at_10
247
+ value: 21.77
248
+ - task:
249
+ type: Retrieval
250
+ dataset:
251
+ type: mteb/CQADupstackUnixRetrieval
252
+ name: MTEB CQADupstackUnixRetrieval
253
+ config: default
254
+ split: test
255
+ revision: 6c6430d3a6d36f8d2a829195bc5dc94d7e063e53
256
+ metrics:
257
+ - type: ndcg_at_10
258
+ value: 29.57
259
+ - task:
260
+ type: Retrieval
261
+ dataset:
262
+ type: mteb/CQADupstackWebmastersRetrieval
263
+ name: MTEB CQADupstackWebmastersRetrieval
264
+ config: default
265
+ split: test
266
+ revision: 160c094312a0e1facb97e55eeddb698c0abe3571
267
+ metrics:
268
+ - type: ndcg_at_10
269
+ value: 31.33
270
+ - task:
271
+ type: Retrieval
272
+ dataset:
273
+ type: mteb/CQADupstackWordpressRetrieval
274
+ name: MTEB CQADupstackWordpressRetrieval
275
+ config: default
276
+ split: test
277
+ revision: 4ffe81d471b1924886b33c7567bfb200e9eec5c4
278
+ metrics:
279
+ - type: ndcg_at_10
280
+ value: 23.86
281
+ - task:
282
+ type: Retrieval
283
+ dataset:
284
+ type: mteb/ClimateFEVER
285
+ name: MTEB ClimateFEVER
286
+ config: default
287
+ split: test
288
+ revision: 47f2ac6acb640fc46020b02a5b59fdda04d39380
289
+ metrics:
290
+ - type: ndcg_at_10
291
+ value: 28.51
292
+ - task:
293
+ type: Retrieval
294
+ dataset:
295
+ type: mteb/DBPedia
296
+ name: MTEB DBPedia
297
+ config: default
298
+ split: test
299
+ revision: c0f706b76e590d620bd6618b3ca8efdd34e2d659
300
+ metrics:
301
+ - type: ndcg_at_10
302
+ value: 36.32
303
+ - task:
304
+ type: Classification
305
+ dataset:
306
+ type: mteb/EmotionClassification
307
+ name: MTEB EmotionClassification
308
+ config: default
309
+ split: test
310
+ revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
311
+ metrics:
312
+ - type: accuracy
313
+ value: 47.69
314
+ - task:
315
+ type: Retrieval
316
+ dataset:
317
+ type: mteb/FEVER
318
+ name: MTEB FEVER
319
+ config: default
320
+ split: test
321
+ revision: bea83ef9e8fb933d90a2f1d5515737465d613e12
322
+ metrics:
323
+ - type: ndcg_at_10
324
+ value: 60.27
325
+ - task:
326
+ type: Retrieval
327
+ dataset:
328
+ type: mteb/FiQA2018
329
+ name: MTEB FiQA2018
330
+ config: default
331
+ split: test
332
+ revision: 27a168819829fe9bcd655c2df245fb19452e8e06
333
+ metrics:
334
+ - type: ndcg_at_10
335
+ value: 32.59
336
+ - task:
337
+ type: Retrieval
338
+ dataset:
339
+ type: mteb/HotpotQA
340
+ name: MTEB HotpotQA
341
+ config: default
342
+ split: test
343
+ revision: ab518f4d6fcca38d87c25209f94beba119d02014
344
+ metrics:
345
+ - type: ndcg_at_10
346
+ value: 52.43
347
+ - task:
348
+ type: Classification
349
+ dataset:
350
+ type: mteb/ImdbClassification
351
+ name: MTEB ImdbClassification
352
+ config: default
353
+ split: test
354
+ revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
355
+ metrics:
356
+ - type: accuracy
357
+ value: 73.59
358
+ - task:
359
+ type: Retrieval
360
+ dataset:
361
+ type: mteb/MSMARCO
362
+ name: MTEB MSMARCO
363
+ config: default
364
+ split: test
365
+ revision: c5a29a104738b98a9e76336939199e264163d4a0
366
+ metrics:
367
+ - type: ndcg_at_10
368
+ value: 0
369
+ - task:
370
+ type: Classification
371
+ dataset:
372
+ type: mteb/MTOPDomainClassification
373
+ name: MTEB MTOPDomainClassification
374
+ config: default
375
+ split: test
376
+ revision: a76d16fae880597b9c73047b50159220a441cb54
377
+ metrics:
378
+ - type: accuracy
379
+ value: 90.37
380
+ - task:
381
+ type: Classification
382
+ dataset:
383
+ type: mteb/MTOPIntentClassification
384
+ name: MTEB MTOPIntentClassification
385
+ config: default
386
+ split: test
387
+ revision: 2992d820f31312593c49a4890430aadadb0f0039
388
+ metrics:
389
+ - type: accuracy
390
+ value: 62.51
391
+ - task:
392
+ type: Classification
393
+ dataset:
394
+ type: mteb/MassiveIntentClassification
395
+ name: MTEB MassiveIntentClassification
396
+ config: default
397
+ split: test
398
+ revision: 4672e20407010da34463acc759c162ca9734bca6
399
+ metrics:
400
+ - type: accuracy
401
+ value: 68.19
402
+ - task:
403
+ type: Classification
404
+ dataset:
405
+ type: mteb/MassiveScenarioClassification
406
+ name: MTEB MassiveScenarioClassification
407
+ config: default
408
+ split: test
409
+ revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8
410
+ metrics:
411
+ - type: accuracy
412
+ value: 73.07
413
+ - task:
414
+ type: Clustering
415
+ dataset:
416
+ type: mteb/MedrxivClusteringP2P
417
+ name: MTEB MedrxivClusteringP2P
418
+ config: default
419
+ split: test
420
+ revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
421
+ metrics:
422
+ - type: v_measure
423
+ value: 32.02
424
+ - task:
425
+ type: Clustering
426
+ dataset:
427
+ type: mteb/MedrxivClusteringS2S
428
+ name: MTEB MedrxivClusteringS2S
429
+ config: default
430
+ split: test
431
+ revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
432
+ metrics:
433
+ - type: v_measure
434
+ value: 29.22
435
+ - task:
436
+ type: Reranking
437
+ dataset:
438
+ type: mteb/MindSmallReranking
439
+ name: MTEB MindSmallReranking
440
+ config: default
441
+ split: test
442
+ revision: 227478e3235572039f4f7661840e059f31ef6eb1
443
+ metrics:
444
+ - type: map
445
+ value: 30.62
446
+ - task:
447
+ type: Retrieval
448
+ dataset:
449
+ type: mteb/NFCorpus
450
+ name: MTEB NFCorpus
451
+ config: default
452
+ split: test
453
+ revision: ec0fa4fe99da2ff19ca1214b7966684033a58814
454
+ metrics:
455
+ - type: ndcg_at_10
456
+ value: 30.35
457
+ - task:
458
+ type: Retrieval
459
+ dataset:
460
+ type: mteb/NQ
461
+ name: MTEB NQ
462
+ config: default
463
+ split: test
464
+ revision: b774495ed302d8c44a3a7ea25c90dbce03968f31
465
+ metrics:
466
+ - type: ndcg_at_10
467
+ value: 50.71
468
+ - task:
469
+ type: Retrieval
470
+ dataset:
471
+ type: mteb/QuoraRetrieval
472
+ name: MTEB QuoraRetrieval
473
+ config: default
474
+ split: test
475
+ revision: e4e08e0b7dbe3c8700f0daef558ff32256715259
476
+ metrics:
477
+ - type: ndcg_at_10
478
+ value: 60.88
479
+ - task:
480
+ type: Clustering
481
+ dataset:
482
+ type: mteb/RedditClustering
483
+ name: MTEB RedditClustering
484
+ config: default
485
+ split: test
486
+ revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
487
+ metrics:
488
+ - type: v_measure
489
+ value: 44.67
490
+ - task:
491
+ type: Clustering
492
+ dataset:
493
+ type: mteb/RedditClusteringP2P
494
+ name: MTEB RedditClusteringP2P
495
+ config: default
496
+ split: test
497
+ revision: 385e3cb46b4cfa89021f56c4380204149d0efe33
498
+ metrics:
499
+ - type: v_measure
500
+ value: 53.67
501
+ - task:
502
+ type: Retrieval
503
+ dataset:
504
+ type: mteb/SCIDOCS
505
+ name: MTEB SCIDOCS
506
+ config: default
507
+ split: test
508
+ revision: f8c2fcf00f625baaa80f62ec5bd9e1fff3b8ae88
509
+ metrics:
510
+ - type: ndcg_at_10
511
+ value: 16.37
512
+ - task:
513
+ type: STS
514
+ dataset:
515
+ type: mteb/SICK-R
516
+ name: MTEB SICK-R
517
+ config: default
518
+ split: test
519
+ revision: 20a6d6f312dd54037fe07a32d58e5e168867909d
520
+ metrics:
521
+ - type: cosine_spearman
522
+ value: 79.81
523
+ - task:
524
+ type: STS
525
+ dataset:
526
+ type: mteb/STS12
527
+ name: MTEB STS12
528
+ config: default
529
+ split: test
530
+ revision: a0d554a64d88156834ff5ae9920b964011b16384
531
+ metrics:
532
+ - type: cosine_spearman
533
+ value: 76.03
534
+ - task:
535
+ type: STS
536
+ dataset:
537
+ type: mteb/STS13
538
+ name: MTEB STS13
539
+ config: default
540
+ split: test
541
+ revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
542
+ metrics:
543
+ - type: cosine_spearman
544
+ value: 85.05
545
+ - task:
546
+ type: STS
547
+ dataset:
548
+ type: mteb/STS14
549
+ name: MTEB STS14
550
+ config: default
551
+ split: test
552
+ revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
553
+ metrics:
554
+ - type: cosine_spearman
555
+ value: 80.97
556
+ - task:
557
+ type: STS
558
+ dataset:
559
+ type: mteb/STS15
560
+ name: MTEB STS15
561
+ config: default
562
+ split: test
563
+ revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
564
+ metrics:
565
+ - type: cosine_spearman
566
+ value: 86.88
567
+ - task:
568
+ type: STS
569
+ dataset:
570
+ type: mteb/STS16
571
+ name: MTEB STS16
572
+ config: default
573
+ split: test
574
+ revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
575
+ metrics:
576
+ - type: cosine_spearman
577
+ value: 83.3
578
+ - task:
579
+ type: STS
580
+ dataset:
581
+ type: mteb/STSBenchmark
582
+ name: MTEB STSBenchmark
583
+ config: default
584
+ split: test
585
+ revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
586
+ metrics:
587
+ - type: cosine_spearman
588
+ value: 86.49
589
+ - task:
590
+ type: Reranking
591
+ dataset:
592
+ type: mteb/SciDocsRR
593
+ name: MTEB SciDocsRR
594
+ config: default
595
+ split: test
596
+ revision: 39b8377811871075eed9de3b8a7e21aaa6acb3d8
597
+ metrics:
598
+ - type: map
599
+ value: 74.1
600
+ - task:
601
+ type: Retrieval
602
+ dataset:
603
+ type: mteb/SciFact
604
+ name: MTEB SciFact
605
+ config: default
606
+ split: test
607
+ revision: d56462d0e63a25450459c4f213e49ffdb866f7f9
608
+ metrics:
609
+ - type: ndcg_at_10
610
+ value: 59.42
611
+ - task:
612
+ type: PairClassification
613
+ dataset:
614
+ type: mteb/SprintDuplicateQuestions
615
+ name: MTEB SprintDuplicateQuestions
616
+ config: default
617
+ split: test
618
+ revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
619
+ metrics:
620
+ - type: cosine_ap
621
+ value: 94.91
622
+ - task:
623
+ type: Clustering
624
+ dataset:
625
+ type: mteb/StackExchangeClustering
626
+ name: MTEB StackExchangeClustering
627
+ config: default
628
+ split: test
629
+ revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
630
+ metrics:
631
+ - type: v_measure
632
+ value: 52.04
633
+ - task:
634
+ type: Clustering
635
+ dataset:
636
+ type: mteb/StackExchangeClusteringP2P
637
+ name: MTEB StackExchangeClusteringP2P
638
+ config: default
639
+ split: test
640
+ revision: 815ca46b2622cec33ccafc3735d572c266efdb44
641
+ metrics:
642
+ - type: v_measure
643
+ value: 34.14
644
+ - task:
645
+ type: Reranking
646
+ dataset:
647
+ type: mteb/StackOverflowDupQuestions
648
+ name: MTEB StackOverflowDupQuestions
649
+ config: default
650
+ split: test
651
+ revision: 5debda000fe8e27ebb5c123d38081f92e1847a59
652
+ metrics:
653
+ - type: map
654
+ value: 43.53
655
+ - task:
656
+ type: Summarization
657
+ dataset:
658
+ type: mteb/SummEval
659
+ name: MTEB SummEval
660
+ config: default
661
+ split: test
662
+ revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
663
+ metrics:
664
+ - type: cosine_spearman
665
+ value: 29.73
666
+ - task:
667
+ type: Retrieval
668
+ dataset:
669
+ type: mteb/TRECCOVID
670
+ name: MTEB TRECCOVID
671
+ config: default
672
+ split: test
673
+ revision: bb9466bac8153a0349341eb1b22e06409e78ef4e
674
+ metrics:
675
+ - type: ndcg_at_10
676
+ value: 67.01
677
+ - task:
678
+ type: Retrieval
679
+ dataset:
680
+ type: mteb/Touche2020
681
+ name: MTEB Touche2020
682
+ config: default
683
+ split: test
684
+ revision: a34f9a33db75fa0cbb21bb5cfc3dae8dc8bec93f
685
+ metrics:
686
+ - type: ndcg_at_10
687
+ value: 28.58
688
+ - task:
689
+ type: Classification
690
+ dataset:
691
+ type: mteb/ToxicConversationsClassification
692
+ name: MTEB ToxicConversationsClassification
693
+ config: default
694
+ split: test
695
+ revision: edfaf9da55d3dd50d43143d90c1ac476895ae6de
696
+ metrics:
697
+ - type: accuracy
698
+ value: 66.23
699
+ - task:
700
+ type: Classification
701
+ dataset:
702
+ type: mteb/TweetSentimentExtractionClassification
703
+ name: MTEB TweetSentimentExtractionClassification
704
+ config: default
705
+ split: test
706
+ revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
707
+ metrics:
708
+ - type: accuracy
709
+ value: 62.04
710
+ - task:
711
+ type: Clustering
712
+ dataset:
713
+ type: mteb/TwentyNewsgroupsClustering
714
+ name: MTEB TwentyNewsgroupsClustering
715
+ config: default
716
+ split: test
717
+ revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
718
+ metrics:
719
+ - type: v_measure
720
+ value: 41.63
721
+ - task:
722
+ type: PairClassification
723
+ dataset:
724
+ type: mteb/TwitterSemEval2015
725
+ name: MTEB TwitterSemEval2015
726
+ config: default
727
+ split: test
728
+ revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
729
+ metrics:
730
+ - type: cosine_ap
731
+ value: 70.79
732
+ - task:
733
+ type: PairClassification
734
+ dataset:
735
+ type: mteb/TwitterURLCorpus
736
+ name: MTEB TwitterURLCorpus
737
+ config: default
738
+ split: test
739
+ revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
740
+ metrics:
741
+ - type: cosine_ap
742
+ value: 85.5
743
+ ---
+
+ # ogma-base
+
+ **13.3M-parameter text embedding model** by [Axiotic AI](https://axiotic.ai), achieving a **56.54 average** on MTEB English v1 (54/54 tasks).
+
+ 12-layer transformer with a 256-dimensional hidden state, 128-dimensional token embeddings, and 256-dimensional output embeddings.
+
+ ## Highlights
+
+ - **13.3M parameters**: small enough for CPU inference, edge deployment, and resource-constrained environments
+ - **56.54 MTEB average**: outperforms Potion-32M (51.22) despite being significantly smaller
+ - **Matryoshka embeddings**: truncate to any of the dimensions [32, 64, 128, 256] for flexible storage/compute tradeoffs
+ - **Asymmetric encoding**: dedicated `[QRY]`, `[DOC]`, `[SYM]` task tokens for query-document and symmetric tasks
+ - **1024-token context**: handles longer passages than typical small models (Potion: 512)
+ - **Pure PyTorch**: no external transformer library dependencies
+
+ ## Architecture
+
+ | Component | Details |
+ |-----------|---------|
+ | Parameters | 13.3M |
+ | Layers | 12 |
+ | Hidden dim (d_model) | 256 |
+ | Embedding dim (d_embed) | 128 |
+ | Output dim (d_output) | 256 |
+ | Attention heads | 4 |
+ | Max sequence length | 1024 |
+ | Matryoshka dims | [32, 64, 128, 256] |
+ | Pooling | Mean (mask-aware) |
+ | Position encoding | RoPE |
+ | FFN | SwiGLU |
+ | Normalization | Pre-LayerNorm |
+ | Tokenizer | SentencePiece Unigram (30K vocab) |
+ | Training | Knowledge distillation from teacher model |
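
The 13.3M figure can be roughly reproduced from the architecture table. The sketch below is a back-of-envelope estimate, not the repository's own counting code: `estimate_ogma_params` is a hypothetical helper, and it assumes a 128 -> 256 input projection, four square attention matrices per layer, a three-matrix SwiGLU block whose hidden size is `round(8/3 * d_model)` (the multiplier is the `ffn_mult` value in `config.json`), and no biases, norm weights, or pooling parameters.

```python
def estimate_ogma_params(
    vocab_size: int = 30_000,
    d_embed: int = 128,
    d_model: int = 256,
    n_layers: int = 12,
    ffn_mult: float = 8 / 3,
) -> dict[str, int]:
    """Rough weight-matrix count; biases, norms, and pooling are ignored."""
    ffn_hidden = round(ffn_mult * d_model)   # assumed rounding: 683 for d_model=256
    token_table = vocab_size * d_embed       # embedding lookup table
    input_proj = d_embed * d_model           # assumed 128 -> 256 projection
    attention = 4 * d_model * d_model        # Q, K, V, and output matrices
    swiglu = 3 * d_model * ffn_hidden        # gate, up, and down matrices
    total = token_table + input_proj + n_layers * (attention + swiglu)
    return {
        "ffn_hidden": ffn_hidden,
        "token_table": token_table,
        "per_layer": attention + swiglu,
        "total": total,
    }


counts = estimate_ogma_params()
print(f"{counts['total'] / 1e6:.2f}M")  # 13.31M
```

Under these assumptions the token table alone contributes about 3.8M weights, so the 30K vocabulary is a substantial share of the model at this scale.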
+
+ ## MTEB Results
+
+ ### Category-Level Scores
+
+ | Category | ogma-base | Potion-32M | Potion-8M | vs Potion-32M |
+ |----------|-----------|------------|-----------|---------------|
+ | Classification | **67.73** | 66.01 | 64.46 | +1.72 |
+ | Clustering | **41.49** | 39.24 | 36.88 | +2.25 |
+ | PairClassification | **83.73** | 78.17 | 76.62 | +5.56 |
+ | Reranking | **51.25** | 50.92 | 49.73 | +0.33 |
+ | Retrieval | **42.36** | 32.21 | 30.43 | +10.15 |
+ | STS | **82.83** | 73.86 | 72.93 | +8.97 |
+ | Summarization | 29.73 | **29.77** | 29.26 | -0.04 |
+ | **Overall** | **56.54** | 51.22 | 49.58 | **+5.32** |
+
+ > **Potion scores are locally reproduced** using the same evaluation pipeline and hardware for a fair head-to-head comparison. These are not self-reported numbers from the Potion model card.
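
The "vs Potion-32M" column is the difference of two scores per row, which is easy to re-check. The snippet below is a small sanity check written for this note (the helper `check_deltas` is hypothetical) and uses only numbers copied from the table.

```python
# (ogma-base score, Potion-32M score, reported delta), copied from the table
category_scores = {
    "Classification": (67.73, 66.01, 1.72),
    "Clustering": (41.49, 39.24, 2.25),
    "PairClassification": (83.73, 78.17, 5.56),
    "Reranking": (51.25, 50.92, 0.33),
    "Retrieval": (42.36, 32.21, 10.15),
    "STS": (82.83, 73.86, 8.97),
    "Summarization": (29.73, 29.77, -0.04),
}
overall = (56.54, 51.22, 5.32)


def check_deltas(rows: dict[str, tuple[float, float, float]], tol: float = 0.005) -> list[str]:
    """Return the row names whose reported delta disagrees with the two scores."""
    return [
        name
        for name, (ogma, potion, reported) in rows.items()
        if abs((ogma - potion) - reported) > tol
    ]


mismatches = check_deltas({**category_scores, "Overall": overall})
category_mean = sum(ogma for ogma, _, _ in category_scores.values()) / len(category_scores)

print(mismatches)               # []
print(round(category_mean, 2))  # 57.02
```

All reported deltas agree with the scores. The unweighted mean of the seven category scores is about 57.02 rather than 56.54, so the Overall row is presumably averaged over individual tasks instead of categories; the README does not say which.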
+
+ ## Usage
+
+ ### Quick Start
+
+ ```python
+ import torch
+
+ # These modules ship with this repository
+ from ogma_model import OgmaModel
+ from config import TaskToken
+ from tokenizer import OgmaTokenizer
+
+ # Load from checkpoint directory
+ model = OgmaModel.from_checkpoint("path/to/ogma-base", device="cpu")
+ model.eval()
+
+ # Load the tokenizer. OgmaTokenizer needs the SentencePiece .model file:
+ # extract it from tokenizer.json, or point to an existing one.
+ tokenizer = OgmaTokenizer("path/to/tokenizer.model")
+
+ # Encode text
+ texts = ["This is a query", "This is a document"]
+ encoded = tokenizer.batch_encode(texts, max_length=1024)
+
+ token_ids = torch.tensor(encoded["input_ids"])
+ attention_mask = torch.tensor(encoded["attention_mask"])
+
+ # Use task tokens for asymmetric encoding
+ with torch.no_grad():
+     # For symmetric tasks (STS, clustering, classification)
+     embeddings = model.encode(token_ids, attention_mask, task=TaskToken.SYM)
+
+     # For retrieval, encode queries and documents separately
+     query_embs = model.encode(token_ids[:1], attention_mask[:1], task=TaskToken.QRY)
+     doc_embs = model.encode(token_ids[1:], attention_mask[1:], task=TaskToken.DOC)
+
+ print(f"Embedding shape: {embeddings.shape}")  # (2, 256)
+ ```
+
+ ### Matryoshka Dimensionality Reduction
+
+ ```python
+ # Continues from Quick Start (model, token_ids, attention_mask, TaskToken)
+ # Full embeddings: 256d
+ full_embs = model.encode(token_ids, attention_mask, task=TaskToken.SYM)
+
+ # Reduce to any Matryoshka dimension: [32, 64, 128, 256]
+ dim = 64
+ reduced_embs = torch.nn.functional.normalize(full_embs[:, :dim], p=2, dim=-1)
+ # These reduced embeddings are trained to be effective at lower dims
+ ```
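
The re-normalization step matters because the prefix of a unit-length vector is generally no longer unit length, so cosine similarity on raw prefixes would be scaled inconsistently. The following is a minimal dependency-free sketch of the same truncate-then-normalize idea; the helper names and the toy 8-dimensional vectors are illustrative stand-ins, not part of the Ogma code.

```python
import math


def truncate_and_normalize(embedding: list[float], dim: int) -> list[float]:
    """Keep the first `dim` components and rescale them to unit L2 norm."""
    prefix = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in prefix))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in prefix]


def dot(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity when both inputs have unit norm."""
    return sum(x * y for x, y in zip(a, b))


# Toy 8-d vectors standing in for real 256-d embeddings
query = [0.5, -0.1, 0.3, 0.8, 0.0, 0.2, -0.4, 0.1]
doc = [0.4, -0.2, 0.1, 0.9, 0.1, 0.0, -0.3, 0.2]

q4 = truncate_and_normalize(query, 4)
d4 = truncate_and_normalize(doc, 4)
print(f"cosine at 4 dims: {dot(q4, d4):.3f}")
```

Stored vectors can be truncated once at index time; queries then need to be truncated to the same dimension before comparison.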
+
+ ### Loading with safetensors
+
+ ```python
+ import json
+
+ from safetensors.torch import load_file
+ from ogma_model import OgmaModel
+ from config import OgmaConfig
+
+ # Load config
+ with open("path/to/ogma-base/config.json") as f:
+     config_dict = json.load(f)
+
+ config = OgmaConfig.from_dict(config_dict)
+ model = OgmaModel(config)
+
+ # Load weights from safetensors
+ state_dict = load_file("path/to/ogma-base/model.safetensors")
+ model.load_state_dict(state_dict)
+ model.eval()
+ ```
+
+ ## Task Tokens
+
+ Ogma uses task-specific prefix tokens for asymmetric encoding:
+
+ | Token | ID | Use Case |
+ |-------|-----|----------|
+ | `[QRY]` | 4 | Query encoding for retrieval |
+ | `[DOC]` | 5 | Document/passage encoding for retrieval |
+ | `[SYM]` | 6 | Symmetric tasks (STS, classification, clustering) |
+
+ For retrieval tasks, encode queries with `[QRY]` and documents with `[DOC]`. For all other tasks, use `[SYM]`.
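
To make the selection rule concrete, here is a small illustrative sketch. The IDs come from the table (they match `qry_id`, `doc_id`, and `sym_id` in `config.json`); everything else is an assumption. The README does not say whether the prefix is inserted by the tokenizer or inside `model.encode`, nor where it sits relative to BOS, so `Task`, `choose_task`, and `add_task_prefix` are hypothetical helpers and the input token IDs are made up. With `model.encode(..., task=...)` you would normally not do this by hand.

```python
from enum import Enum


class Task(Enum):
    """Task prefix token IDs as listed in the table."""
    QRY = 4
    DOC = 5
    SYM = 6


def choose_task(is_retrieval: bool, is_query: bool = False) -> Task:
    """Pick the prefix: QRY/DOC for retrieval, SYM for everything else."""
    if not is_retrieval:
        return Task.SYM
    return Task.QRY if is_query else Task.DOC


def add_task_prefix(token_ids: list[int], task: Task) -> list[int]:
    """Return a new sequence with the task token ID placed first (assumed position)."""
    return [task.value, *token_ids]


# Hypothetical token IDs for a short query
query_ids = add_task_prefix([101, 202, 303], choose_task(is_retrieval=True, is_query=True))
print(query_ids)  # [4, 101, 202, 303]
```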
+
+ ## Training
+
+ Ogma is trained via **knowledge distillation** from a larger teacher embedding model. The training pipeline:
+
+ 1. **Tokenizer**: SentencePiece Unigram model trained on the distillation corpus (30K vocab)
+ 2. **Token embeddings**: PCA-reduced embeddings from the teacher model, providing a strong initialization
+ 3. **Distillation**: MSE loss between student and teacher embeddings, with Matryoshka loss at multiple dimensions
+ 4. **Architecture**: Standard transformer encoder with RoPE positional encoding and SwiGLU FFN
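
The distillation step above can be sketched as an MSE term evaluated on nested prefixes of the embedding. This is only one plausible reading of "Matryoshka loss at multiple dimensions": the equal weighting across dimensions, the lack of per-prefix normalization, and the assumption that teacher targets already live in the student's output space are all guesses, and `matryoshka_mse` is a hypothetical name rather than the training code.

```python
def mse(a: list[float], b: list[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def matryoshka_mse(
    student: list[float],
    teacher: list[float],
    dims: tuple[int, ...] = (32, 64, 128, 256),
) -> float:
    """Average the MSE over nested prefixes of the two embeddings."""
    if len(student) != len(teacher):
        raise ValueError("student and teacher embeddings must have the same length")
    usable = [d for d in dims if d <= len(student)]
    if not usable:
        raise ValueError("no Matryoshka dimension fits the embedding length")
    return sum(mse(student[:d], teacher[:d]) for d in usable) / len(usable)


# Toy 4-d example with prefixes of length 2 and 4
loss = matryoshka_mse([1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0], dims=(2, 4))
print(loss)  # 0.375
```

Because every prefix contributes its own term, errors in the leading components are counted several times, which pushes the most useful information toward the front of the vector.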
+
+ ## Files
+
+ | File | Description |
+ |------|-------------|
+ | `model.safetensors` | Model weights (safetensors format) |
+ | `model.pt` | Model weights (PyTorch format) |
+ | `config.json` | Model configuration |
+ | `config.yaml` | Original training config |
+ | `tokenizer.json` | HuggingFace tokenizer |
+ | `tokenizer_config.json` | Tokenizer configuration |
+ | `token_embeds_128d.npy` | Pre-computed token embeddings (30K × 128, float16) |
+ | `ogma_model.py` | OgmaModel class |
+ | `config.py` | OgmaConfig dataclass |
+ | `embeddings.py` | Token embedding + RoPE |
+ | `pooling.py` | Pooling strategies |
+ | `variants/transformer.py` | Transformer encoder variant |
+ | `tokenizer.py` | OgmaTokenizer wrapper |
+ | `results/` | MTEB result JSONs |
+
+ ## Citation
+
+ ```bibtex
+ @misc{ogma2026,
+   title={Ogma: Small High-Performance Text Embeddings},
+   author={Axiotic AI},
+   year={2026},
+   url={https://huggingface.co/axiotic/ogma-base}
+ }
+ ```
+
+ ## License
+
+ MIT
config.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "architectures": [
+     "OgmaModel"
+   ],
+   "model_type": "ogma",
+   "auto_map": {
+     "AutoModel": "ogma_model.OgmaModel"
+   },
+   "variant": "transformer",
+   "d_embed": 128,
+   "d_model": 256,
+   "d_output": 256,
+   "n_layers": 12,
+   "n_heads": 4,
+   "vocab_size": 30000,
+   "max_seq_len": 1024,
+   "matryoshka_dims": [
+     32,
+     64,
+     128,
+     256
+   ],
+   "pooling": "mean",
+   "ffn_mult": 2.6666666666666665,
+   "conv_kernel_size": 7,
+   "spatial_rank": 32,
+   "n_random_features": 128,
+   "dropout": 0.0,
+   "pad_id": 0,
+   "unk_id": 1,
+   "bos_id": 2,
+   "eos_id": 3,
+   "qry_id": 4,
+   "doc_id": 5,
+   "sym_id": 6,
+   "n_special_tokens": 7
+ }
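
A few of these fields constrain each other, and a quick check can catch a mismatched edit before the weights are loaded. The sketch below is illustrative only: `check_config` is a hypothetical helper, the embedded JSON is a subset of the file above, and the rules it enforces (heads divide `d_model`, Matryoshka dims are ascending and capped by `d_output`) are reasonable expectations rather than checks the repository is known to perform.

```python
import json

# Subset of the config.json shown above
CONFIG_JSON = """
{
  "d_embed": 128,
  "d_model": 256,
  "d_output": 256,
  "n_layers": 12,
  "n_heads": 4,
  "max_seq_len": 1024,
  "matryoshka_dims": [32, 64, 128, 256]
}
"""


def check_config(cfg: dict) -> dict:
    """Validate a few cross-field constraints and return derived values."""
    if cfg["d_model"] % cfg["n_heads"] != 0:
        raise ValueError("d_model must be divisible by n_heads")
    dims = cfg["matryoshka_dims"]
    if dims != sorted(dims):
        raise ValueError("matryoshka_dims must be in ascending order")
    if dims[-1] > cfg["d_output"]:
        raise ValueError("largest Matryoshka dim exceeds d_output")
    return {
        "head_dim": cfg["d_model"] // cfg["n_heads"],
        "largest_matryoshka_dim": dims[-1],
    }


derived = check_config(json.loads(CONFIG_JSON))
print(derived)  # {'head_dim': 64, 'largest_matryoshka_dim': 256}
```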
config.py ADDED
@@ -0,0 +1,161 @@
1
+ """Model configuration for Ogma."""
2
+
3
+ from __future__ import annotations
4
+
5
+ from dataclasses import dataclass, field
6
+ from enum import StrEnum
7
+ from typing import Any
8
+
9
+ __all__ = ["OgmaConfig", "VariantType", "PoolingType", "TaskToken"]
10
+
11
+
12
+ class VariantType(StrEnum):
13
+ """Architecture variant identifiers."""
14
+
15
+ TRANSFORMER = "transformer"
16
+ DEEP_NARROW = "deep_narrow"
17
+ CONV = "conv"
18
+ LINEAR_ATTENTION = "linear_attention"
19
+ MLP_MIXER = "mlp_mixer"
20
+ TRANSFORMER_RESA = "transformer_resa"
21
+ GLA = "gla"
22
+
23
+
24
+ class PoolingType(StrEnum):
25
+ """Pooling strategy identifiers."""
26
+
27
+ TASK_TOKEN = "task_token"
28
+ LATENT_ATTENTION = "latent_attention"
29
+ MEAN = "mean"
30
+
31
+
32
+ class TaskToken(StrEnum):
33
+ """Task token identifiers for asymmetric encoding."""
34
+
35
+ QRY = "QRY"
36
+ DOC = "DOC"
37
+ SYM = "SYM"
38
+
39
+
+ @dataclass
+ class OgmaConfig:
+     """Configuration for an Ogma model instance.
+
+     Args:
+         variant: Architecture variant to use.
+         d_embed: Token embedding dimension (from teacher PCA).
+         d_model: Internal model dimension after projection.
+         n_layers: Number of fusion layers/blocks.
+         n_heads: Number of attention heads (attention variants only).
+         vocab_size: Vocabulary size for embedding table.
+         max_seq_len: Maximum sequence length.
+         matryoshka_dims: Nested output dimensions for Matryoshka.
+         pooling: Pooling strategy.
+         d_output: Final output dimension.
+         ffn_mult: SwiGLU FFN hidden dimension multiplier.
+         conv_kernel_size: Kernel size for conv variant.
+         spatial_rank: Rank of spatial mixing in MLP mixer.
+         n_random_features: Random features for linear attention.
+         dropout: Dropout rate (0 for inference).
+     """
+
+     variant: VariantType = VariantType.TRANSFORMER
+     d_embed: int = 128
+     d_model: int = 256
+     n_layers: int = 1
+     n_heads: int = 4
+     vocab_size: int = 30_000
+     max_seq_len: int = 512
+     matryoshka_dims: list[int] = field(
+         default_factory=lambda: [32, 64, 128, 256]
+     )
+     pooling: PoolingType = PoolingType.TASK_TOKEN
+     d_output: int = 256
+     ffn_mult: float = 8 / 3  # SwiGLU: 8/3 * d_model ≈ 682.67; ffn_hidden truncates to 682 for d_model=256
+     conv_kernel_size: int = 7
+     spatial_rank: int = 32
+     n_random_features: int = 128
+     dropout: float = 0.0
+
+     # ReSA scorer settings
+     scorer_type: str = "dot"
+     scorer_alpha_init: float = 0.1
+     scorer_hidden: int = 0  # 0 defaults to d_head
+
+     # GLA (Gated Linear Attention) settings
+     gla_expand_k: float = 0.5  # key dim expansion (key_dim = d_model * expand_k)
+     gla_expand_v: float = 1.0  # value dim expansion (value_dim = d_model * expand_v)
+     gla_gate_low_rank_dim: int = 16  # low-rank dim for gating projection
+     gla_gate_logit_normalizer: int = 16  # normalizer for gate logits
+     gla_use_short_conv: bool = True  # whether to use short conv on Q,K,V
+     gla_conv_size: int = 4  # short conv kernel size
+
+     # Special token IDs
+     pad_id: int = 0
+     unk_id: int = 1
+     bos_id: int = 2
+     eos_id: int = 3
+     qry_id: int = 4
+     doc_id: int = 5
+     sym_id: int = 6
+     n_special_tokens: int = 7
+
+     @property
+     def d_head(self) -> int:
+         """Per-head dimension."""
+         return self.d_model // self.n_heads
+
+     @property
+     def ffn_hidden(self) -> int:
+         """SwiGLU FFN hidden dimension."""
+         return int(self.d_model * self.ffn_mult)
+
+     def task_token_id(self, task: TaskToken) -> int:
+         """Return token ID for a task token."""
+         mapping = {
+             TaskToken.QRY: self.qry_id,
+             TaskToken.DOC: self.doc_id,
+             TaskToken.SYM: self.sym_id,
+         }
+         return mapping[task]
+
+     def to_dict(self) -> dict[str, Any]:
+         """Serialize config to dictionary."""
+         return {
+             "variant": self.variant.value,
+             "d_embed": self.d_embed,
+             "d_model": self.d_model,
+             "n_layers": self.n_layers,
+             "n_heads": self.n_heads,
+             "vocab_size": self.vocab_size,
+             "max_seq_len": self.max_seq_len,
+             "matryoshka_dims": self.matryoshka_dims,
+             "pooling": self.pooling.value,
+             "d_output": self.d_output,
+             "ffn_mult": self.ffn_mult,
+             "conv_kernel_size": self.conv_kernel_size,
+             "spatial_rank": self.spatial_rank,
+             "n_random_features": self.n_random_features,
+             "dropout": self.dropout,
+             "scorer_type": self.scorer_type,
+             "scorer_alpha_init": self.scorer_alpha_init,
+             "scorer_hidden": self.scorer_hidden,
+             "gla_expand_k": self.gla_expand_k,
+             "gla_expand_v": self.gla_expand_v,
+             "gla_gate_low_rank_dim": self.gla_gate_low_rank_dim,
+             "gla_gate_logit_normalizer": self.gla_gate_logit_normalizer,
+             "gla_use_short_conv": self.gla_use_short_conv,
+             "gla_conv_size": self.gla_conv_size,
+         }
+
+     @classmethod
+     def from_dict(cls, data: dict[str, Any]) -> OgmaConfig:
+         """Deserialize config from dictionary."""
+         data = dict(data)
+         if "variant" in data:
+             data["variant"] = VariantType(data["variant"])
+         if "pooling" in data:
+             data["pooling"] = PoolingType(data["pooling"])
+         known = {f.name for f in cls.__dataclass_fields__.values()}
+         filtered = {k: v for k, v in data.items() if k in known}
+         return cls(**filtered)
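The two derived properties and the key filtering in `from_dict` can be checked with a small worked example. This is a sketch only: `TinyConfig` is a hypothetical two-field stand-in for `OgmaConfig`, and it uses `dataclasses.fields` instead of `__dataclass_fields__`. The numeric values are the ones that appear in `config.py` and `config.json`.

```python
from dataclasses import dataclass, fields

D_MODEL = 256
N_HEADS = 4
FFN_MULT = 8 / 3

# Same arithmetic as the d_head and ffn_hidden properties.
d_head = D_MODEL // N_HEADS           # 256 // 4
ffn_hidden = int(D_MODEL * FFN_MULT)  # int() truncates 682.67 toward zero


@dataclass
class TinyConfig:
    """Hypothetical stand-in for OgmaConfig with only two fields."""

    d_model: int = 256
    n_layers: int = 1

    @classmethod
    def from_dict(cls, data):
        # Drop keys that are not dataclass fields, as OgmaConfig.from_dict does.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})


# "model_type" is present in config.json but is not a config field.
cfg = TinyConfig.from_dict({"d_model": 256, "n_layers": 12, "model_type": "ogma"})
```

Filtering unknown keys is what lets a dictionary with extra Hugging Face metadata (`architectures`, `auto_map`, `model_type`) pass through `from_dict` without a `TypeError`.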
config.yaml ADDED
@@ -0,0 +1,19 @@
+ conv_kernel_size: 7
+ d_embed: 128
+ d_model: 256
+ d_output: 256
+ dropout: 0.0
+ ffn_mult: 2.6666666666666665
+ matryoshka_dims:
+ - 32
+ - 64
+ - 128
+ - 256
+ max_seq_len: 1024
+ n_heads: 4
+ n_layers: 12
+ n_random_features: 128
+ pooling: mean
+ spatial_rank: 32
+ variant: transformer
+ vocab_size: 30000
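Comparing `config.yaml` with the dataclass defaults in `config.py` shows which settings the shipped checkpoint overrides. The sketch below copies both sets of values by hand into plain dictionaries (it does not import the repository code), so it is an illustration of the comparison rather than the project's own tooling.

```python
# Defaults transcribed from OgmaConfig in config.py.
DEFAULTS = {
    "conv_kernel_size": 7, "d_embed": 128, "d_model": 256, "d_output": 256,
    "dropout": 0.0, "ffn_mult": 8 / 3, "matryoshka_dims": [32, 64, 128, 256],
    "max_seq_len": 512, "n_heads": 4, "n_layers": 1, "n_random_features": 128,
    "pooling": "task_token", "spatial_rank": 32, "variant": "transformer",
    "vocab_size": 30000,
}

# Values transcribed from config.yaml.
SHIPPED = {
    "conv_kernel_size": 7, "d_embed": 128, "d_model": 256, "d_output": 256,
    "dropout": 0.0, "ffn_mult": 2.6666666666666665,
    "matryoshka_dims": [32, 64, 128, 256], "max_seq_len": 1024, "n_heads": 4,
    "n_layers": 12, "n_random_features": 128, "pooling": "mean",
    "spatial_rank": 32, "variant": "transformer", "vocab_size": 30000,
}

# Map each overridden key to (default, shipped).
overrides = {
    key: (DEFAULTS[key], value)
    for key, value in SHIPPED.items()
    if DEFAULTS[key] != value
}
```

Under this transcription the checkpoint differs from the defaults in depth (`n_layers`), context length (`max_seq_len`), and pooling strategy, so constructing `OgmaConfig()` with no arguments would not describe the shipped weights.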
embeddings.py ADDED
@@ -0,0 +1,143 @@
+ """Token embeddings, task token embeddings, and RoPE for Ogma."""
+
+ from __future__ import annotations
+
+ import torch
+ import torch.nn as nn
+
+ from ogma.model.config import OgmaConfig
+
+ __all__ = ["TokenEmbedding", "RotaryPositionalEncoding", "apply_rope"]
+
+
+ class TokenEmbedding(nn.Module):
+     """Token embedding with optional linear projection.
+
+     Loads a vocab_size x d_embed embedding table and projects to d_model.
+     Includes 3 learnable task token embeddings ([QRY], [DOC], [SYM]).
+     """
+
+     def __init__(self, config: OgmaConfig) -> None:
+         super().__init__()
+         self.config = config
+         self.embed = nn.Embedding(
+             config.vocab_size + config.n_special_tokens,
+             config.d_embed,
+             padding_idx=config.pad_id,
+         )
+         if config.d_embed != config.d_model:
+             self.proj = nn.Linear(config.d_embed, config.d_model)
+         else:
+             self.proj = nn.Identity()  # type: ignore[assignment]
+
+         # Task token embeddings are learned separately at d_model
+         self.task_tokens = nn.Embedding(3, config.d_model)
+
+     def forward(
+         self,
+         token_ids: torch.Tensor,
+         task_token_ids: torch.Tensor,
+     ) -> torch.Tensor:
+         """Embed tokens and prepend task token.
+
+         Args:
+             token_ids: (B, S) token IDs.
+             task_token_ids: (B,) task token IDs (4=QRY, 5=DOC, 6=SYM).
+
+         Returns:
+             (B, S+1, d_model) embeddings with task token prepended.
+         """
+         # Embed and project regular tokens
+         x = self.embed(token_ids)  # (B, S, d_embed)
+         x = self.proj(x)  # (B, S, d_model)
+
+         # Get task token embeddings (map 4,5,6 -> 0,1,2)
+         task_idx = task_token_ids - self.config.qry_id  # (B,)
+         task_emb = self.task_tokens(task_idx)  # (B, d_model)
+         task_emb = task_emb.unsqueeze(1)  # (B, 1, d_model)
+
+         # Prepend task token
+         return torch.cat([task_emb, x], dim=1)  # (B, S+1, d_model)
+
+     def load_pretrained_embeddings(
+         self, embeddings: torch.Tensor
+     ) -> None:
+         """Load pre-computed token embeddings (e.g., from teacher PCA).
+
+         Args:
+             embeddings: (vocab_size, d_embed) tensor.
+         """
+         with torch.no_grad():
+             n = min(embeddings.shape[0], self.config.vocab_size)
+             start = self.config.n_special_tokens
+             self.embed.weight[start : n + start] = embeddings[:n]
+
+
+ class RotaryPositionalEncoding(nn.Module):
+     """Rotary Position Embedding (RoPE). Zero trainable parameters."""
+
+     def __init__(self, dim: int, max_seq_len: int = 512) -> None:
+         super().__init__()
+         inv_freq = 1.0 / (
+             10000.0 ** (torch.arange(0, dim, 2).float() / dim)
+         )
+         self.register_buffer("inv_freq", inv_freq)
+         self._build_cache(max_seq_len)
+
+     def _build_cache(self, seq_len: int) -> None:
+         inv_freq: torch.Tensor = self.inv_freq  # type: ignore[assignment]
+         t = torch.arange(seq_len, device=inv_freq.device, dtype=inv_freq.dtype)
+         freqs = torch.outer(t, inv_freq)
+         cos_cached = freqs.cos()
+         sin_cached = freqs.sin()
+         self.register_buffer("cos_cached", cos_cached, persistent=False)
+         self.register_buffer("sin_cached", sin_cached, persistent=False)
+
+     def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
+         """Return cos and sin for sequence length of x.
+
+         Args:
+             x: (B, S, ...) tensor to determine sequence length.
+
+         Returns:
+             Tuple of (cos, sin) each of shape (S, d_head//2).
+         """
+         seq_len = x.shape[1]
+         cos: torch.Tensor = self.cos_cached  # type: ignore[assignment]
+         sin: torch.Tensor = self.sin_cached  # type: ignore[assignment]
+         if seq_len > cos.shape[0]:
+             self._build_cache(seq_len)
+             cos = self.cos_cached  # type: ignore[assignment]
+             sin = self.sin_cached  # type: ignore[assignment]
+         return cos[:seq_len], sin[:seq_len]
+
+
+ def apply_rope(
+     q: torch.Tensor,
+     k: torch.Tensor,
+     cos: torch.Tensor,
+     sin: torch.Tensor,
+ ) -> tuple[torch.Tensor, torch.Tensor]:
+     """Apply rotary embeddings to query and key tensors.
+
+     Args:
+         q: (B, n_heads, S, d_head) query tensor.
+         k: (B, n_heads, S, d_head) key tensor.
+         cos: (S, d_head//2) cosine cache.
+         sin: (S, d_head//2) sine cache.
+
+     Returns:
+         Rotated (q, k) tensors.
+     """
+
+     def _rotate(x: torch.Tensor) -> torch.Tensor:
+         x1 = x[..., : x.shape[-1] // 2]
+         x2 = x[..., x.shape[-1] // 2 :]
+         cos_exp = cos.unsqueeze(0).unsqueeze(0)  # (1, 1, S, d_head//2)
+         sin_exp = sin.unsqueeze(0).unsqueeze(0)
+         return torch.cat(
+             [x1 * cos_exp - x2 * sin_exp, x2 * cos_exp + x1 * sin_exp],
+             dim=-1,
+         )
+
+     return _rotate(q), _rotate(k)
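The rotate-half formula in `apply_rope` has the usual RoPE property that the query-key dot product depends only on the difference between positions. The following dependency-free sketch re-implements the same arithmetic for a single vector to make that visible. `rope_rotate` and `dot` are hypothetical names and the vectors are arbitrary; it is not the repository's tensor implementation.

```python
import math


def rope_rotate(x, pos, base=10000.0):
    """Rotate-half RoPE for one even-length vector `x` at position `pos`."""
    dim = len(x)
    half = dim // 2
    # Same frequencies as inv_freq: 1 / base ** (2 * i / dim) for i in range(half).
    angles = [pos / (base ** (2 * i / dim)) for i in range(half)]
    x1, x2 = x[:half], x[half:]
    first = [a * math.cos(t) - b * math.sin(t) for a, b, t in zip(x1, x2, angles)]
    second = [b * math.cos(t) + a * math.sin(t) for a, b, t in zip(x1, x2, angles)]
    return first + second


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.75, -0.5, 1.0]

# Both pairs of positions are two steps apart, so the scores should agree.
score_near = dot(rope_rotate(q, 3), rope_rotate(k, 1))
score_shifted = dot(rope_rotate(q, 103), rope_rotate(k, 101))
```

Each coordinate pair `(x1[i], x2[i])` is rotated by a plain 2D rotation, which is also why the vector norm is unchanged.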
model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:010499a3a946b0d272b46eb3dd32e818584a968f27f1134c72f5259401559f35
+ size 53317199
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b3761ec794444f889d18bf9dc3378f3a9c3474a1d7b64fcf6ba3f2041ec64848
+ size 53288344
ogma_model.py ADDED
@@ -0,0 +1,203 @@
+ """OgmaModel: top-level model wrapping any architecture variant."""
+
+ from __future__ import annotations
+
+ import torch
+ import torch.nn as nn
+ import torch.nn.functional as F
+
+ from ogma.model.config import OgmaConfig, TaskToken, VariantType
+ from ogma.model.embeddings import TokenEmbedding
+ from ogma.model.pooling import create_pooling
+ from ogma.model.variants.conv import ConvVariant
+ from ogma.model.variants.deep_narrow import DeepNarrowVariant
+ from ogma.model.variants.linear_attention import LinearAttentionVariant
+ from ogma.model.variants.mlp_mixer import MLPMixerVariant
+ from ogma.model.variants.transformer import TransformerVariant
+ from ogma.model.variants.transformer_resa import TransformerReSAVariant
+ from ogma.model.variants.gla import GLAVariant
+
+ __all__ = ["OgmaModel"]
+
+ MAX_PARAMS = 10_000_000
+
+
+ def _build_variant(config: OgmaConfig) -> nn.Module:
+     """Instantiate the appropriate architecture variant."""
+     if config.variant == VariantType.TRANSFORMER:
+         return TransformerVariant(config)
+     elif config.variant == VariantType.DEEP_NARROW:
+         return DeepNarrowVariant(config)
+     elif config.variant == VariantType.CONV:
+         return ConvVariant(config)
+     elif config.variant == VariantType.LINEAR_ATTENTION:
+         return LinearAttentionVariant(config)
+     elif config.variant == VariantType.MLP_MIXER:
+         return MLPMixerVariant(config)
+     elif config.variant == VariantType.TRANSFORMER_RESA:
+         return TransformerReSAVariant(config)
+     elif config.variant == VariantType.GLA:
+         return GLAVariant(config)
+     raise ValueError(f"Unknown variant: {config.variant}")
+
+
+ class OgmaModel(nn.Module):
+     """Ogma embedding model.
+
+     Wraps any architecture variant with shared embedding, pooling, and
+     normalization. Produces L2-normalized embeddings at d_output dimensions,
+     Matryoshka-compatible at configured sub-dimensions.
+     """
+
+     def __init__(self, config: OgmaConfig) -> None:
+         super().__init__()
+         self.config = config
+         self.embedding = TokenEmbedding(config)
+         self.variant = _build_variant(config)
+         self.pooling = create_pooling(config)
+
+         # Project to d_output unless the variant does so itself:
+         # DeepNarrowVariant already has its own output_proj.
+         variant_projects = config.variant == VariantType.DEEP_NARROW
+         needs_output_proj = (
+             config.d_model != config.d_output and not variant_projects
+         )
+         if needs_output_proj:
+             self.output_proj: nn.Module = nn.Linear(
+                 config.d_model, config.d_output
+             )
+         else:
+             self.output_proj = nn.Identity()
+
+     def forward(
+         self,
+         token_ids: torch.Tensor,
+         attention_mask: torch.Tensor,
+         task_token_ids: torch.Tensor,
+     ) -> torch.Tensor:
+         """Forward pass producing L2-normalized embeddings.
+
+         Args:
+             token_ids: (B, S) token IDs.
+             attention_mask: (B, S) attention mask (1=valid, 0=pad).
+             task_token_ids: (B,) task token IDs (4=QRY, 5=DOC, 6=SYM).
+
+         Returns:
+             (B, d_output) L2-normalized embeddings.
+         """
+         # Embed tokens with task token prepended -> (B, S+1, d_model)
+         x = self.embedding(token_ids, task_token_ids)
+
+         # Extend attention mask for prepended task token
+         task_mask = torch.ones(
+             attention_mask.shape[0], 1,
+             device=attention_mask.device,
+             dtype=attention_mask.dtype,
+         )
+         extended_mask = torch.cat([task_mask, attention_mask], dim=1)
+
+         # Run through variant
+         x = self.variant(x, extended_mask)
+
+         # Pool
+         x = self.pooling(x, extended_mask)
+
+         # Project if needed
+         x = self.output_proj(x)
+
+         # L2 normalize
+         return F.normalize(x, p=2, dim=-1)
+
+     def encode(
+         self,
+         token_ids: torch.Tensor,
+         attention_mask: torch.Tensor,
+         task: TaskToken = TaskToken.SYM,
+     ) -> torch.Tensor:
+         """Encode tokens with a specified task mode.
+
+         Args:
+             token_ids: (B, S) token IDs.
+             attention_mask: (B, S) attention mask.
+             task: Task token to use.
+
+         Returns:
+             (B, d_output) L2-normalized embeddings.
+         """
+         task_ids = torch.full(
+             (token_ids.shape[0],),
+             self.config.task_token_id(task),
+             device=token_ids.device,
+             dtype=torch.long,
+         )
+         return self.forward(token_ids, attention_mask, task_ids)
+
+     def param_count(self) -> int:
+         """Count total trainable parameters."""
+         return sum(p.numel() for p in self.parameters() if p.requires_grad)
+
+     def assert_param_budget(self) -> None:
+         """Assert model is under the 10M parameter budget."""
+         count = self.param_count()
+         assert count < MAX_PARAMS, (
+             f"Model has {count:,} params, exceeds {MAX_PARAMS:,} budget"
+         )
+
+     @classmethod
+     def from_config(cls, config: OgmaConfig) -> OgmaModel:
+         """Factory method to build a model from config."""
+         model = cls(config)
+         model.assert_param_budget()
+         return model
+
+     @classmethod
+     def from_checkpoint(
+         cls,
+         path: str,
+         device: str = "cpu",
+     ) -> OgmaModel:
+         """Load model from a checkpoint directory.
+
+         Args:
+             path: Path to checkpoint directory containing config.yaml
+                 and model.pt.
+             device: Device to load model to.
+
+         Returns:
+             Loaded OgmaModel.
+         """
+         from pathlib import Path
+
+         import yaml
+
+         ckpt_path = Path(path)
+         with open(ckpt_path / "config.yaml") as f:
+             config_dict = yaml.safe_load(f)
+         config = OgmaConfig.from_dict(config_dict)
+
+         model = cls(config)
+         state_dict = torch.load(
+             ckpt_path / "model.pt",
+             map_location=device,
+             weights_only=True,
+         )
+         model.load_state_dict(state_dict)
+         model.to(device)
+         model.eval()
+         return model
+
+     def save_checkpoint(self, path: str) -> None:
+         """Save model checkpoint.
+
+         Args:
+             path: Directory to save config.yaml and model.pt.
+         """
+         from pathlib import Path
+
+         import yaml
+
+         ckpt_path = Path(path)
+         ckpt_path.mkdir(parents=True, exist_ok=True)
+         with open(ckpt_path / "config.yaml", "w") as f:
+             yaml.dump(self.config.to_dict(), f, default_flow_style=False)
+         torch.save(self.state_dict(), ckpt_path / "model.pt")
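Two bookkeeping steps in `forward` are easy to get wrong when calling the model by hand: task token IDs (4, 5, 6) are shifted to rows 0, 1, 2 of the task-token table, and the attention mask grows by one column for the prepended task token. The sketch below mirrors both steps with plain lists. The helper names are hypothetical, and the range check is an addition for illustration that the original code does not perform.

```python
QRY_ID, DOC_ID, SYM_ID = 4, 5, 6  # special-token IDs from config.json


def task_embedding_row(task_token_id):
    """Map a task token ID (4, 5, 6) to a row (0, 1, 2) of the task-token table."""
    row = task_token_id - QRY_ID
    if not 0 <= row <= 2:  # illustrative guard, not present in the source
        raise ValueError(f"not a task token ID: {task_token_id}")
    return row


def extend_mask(attention_mask):
    """Prepend a 1 to every row so the task token is always attended to."""
    return [[1] + list(row) for row in attention_mask]


mask = [[1, 1, 1, 0], [1, 1, 0, 0]]  # (B=2, S=4), 1=valid, 0=pad
extended = extend_mask(mask)         # (B=2, S+1=5)
```

Because the extended mask is also what the pooling layer receives, mean pooling in this model averages over the task token position as well as the valid input tokens.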
pooling.py ADDED
@@ -0,0 +1,99 @@
+ """Pooling strategies for Ogma."""
+
+ from __future__ import annotations
+
+ import torch
+ import torch.nn as nn
+ import torch.nn.functional as F
+
+ from ogma.model.config import OgmaConfig, PoolingType
+
+ __all__ = [
+     "create_pooling",
+     "TaskTokenPooling",
+     "LatentAttentionPooling",
+     "MeanPooling",
+ ]
+
+
+ def create_pooling(config: OgmaConfig) -> nn.Module:
+     """Factory for pooling layers."""
+     if config.pooling == PoolingType.TASK_TOKEN:
+         return TaskTokenPooling()
+     elif config.pooling == PoolingType.LATENT_ATTENTION:
+         return LatentAttentionPooling(config.d_model)
+     elif config.pooling == PoolingType.MEAN:
+         return MeanPooling()
+     raise ValueError(f"Unknown pooling type: {config.pooling}")
+
+
+ class TaskTokenPooling(nn.Module):
+     """Use the output at position 0 (task token) as the sentence embedding."""
+
+     def forward(
+         self,
+         x: torch.Tensor,
+         attention_mask: torch.Tensor | None = None,
+     ) -> torch.Tensor:
+         """Extract task token output.
+
+         Args:
+             x: (B, S, D) sequence outputs.
+             attention_mask: unused, for interface compatibility.
+
+         Returns:
+             (B, D) pooled output.
+         """
+         return x[:, 0, :]
+
+
+ class LatentAttentionPooling(nn.Module):
+     """Learned query vector attends over all token outputs."""
+
+     def __init__(self, d_model: int) -> None:
+         super().__init__()
+         self.query = nn.Parameter(torch.randn(d_model))
+
+     def forward(
+         self,
+         x: torch.Tensor,
+         attention_mask: torch.Tensor | None = None,
+     ) -> torch.Tensor:
+         """Attend over sequence with learned query.
+
+         Args:
+             x: (B, S, D) sequence outputs.
+             attention_mask: (B, S) mask where 1=valid, 0=pad.
+
+         Returns:
+             (B, D) pooled output.
+         """
+         # (B, S)
+         scores = torch.matmul(x, self.query) / (x.shape[-1] ** 0.5)
+         if attention_mask is not None:
+             scores = scores.masked_fill(attention_mask == 0, float("-inf"))
+         weights = F.softmax(scores, dim=-1)  # (B, S)
+         return torch.bmm(weights.unsqueeze(1), x).squeeze(1)  # (B, D)
+
+
+ class MeanPooling(nn.Module):
+     """Average all token outputs (excluding padding)."""
+
+     def forward(
+         self,
+         x: torch.Tensor,
+         attention_mask: torch.Tensor | None = None,
+     ) -> torch.Tensor:
+         """Mean pool over valid tokens.
+
+         Args:
+             x: (B, S, D) sequence outputs.
+             attention_mask: (B, S) mask where 1=valid, 0=pad.
+
+         Returns:
+             (B, D) pooled output.
+         """
+         if attention_mask is None:
+             return x.mean(dim=1)
+         mask = attention_mask.unsqueeze(-1).float()  # (B, S, 1)
+         return (x * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
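`MeanPooling` is the strategy selected by the shipped config (`pooling: mean`). The sketch below reproduces its arithmetic for a single sequence with plain lists, including the `clamp(min=1e-9)` guard against an all-padding row. `masked_mean` is a hypothetical name and the token values are made up.

```python
def masked_mean(x, mask):
    """Mean over valid rows of x.

    x:    S rows of D floats (one sequence of token outputs).
    mask: S ints, 1=valid, 0=pad.
    """
    total = max(sum(mask), 1e-9)  # same role as clamp(min=1e-9)
    dim = len(x[0])
    return [sum(row[d] * m for row, m in zip(x, mask)) / total for d in range(dim)]


tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]  # the third position is padding and must not affect the mean
pooled = masked_mean(tokens, mask)
```

Multiplying by the mask before summing is what keeps padded positions out of the numerator; dividing by the mask sum rather than the sequence length keeps them out of the denominator.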
results/AmazonCounterfactualClassification.json ADDED
@@ -0,0 +1,526 @@
+ {
+   "dataset_revision": "1f7e6a9d6fa6e64c53d146e428565640410c0df1",
+   "task_name": "AmazonCounterfactualClassification",
+   "mteb_version": "2.10.7",
+   "scores": {
+     "validation": [
+       {
+         "scores_per_experiment": [
+           {
+             "accuracy": 0.753754,
+             "f1": 0.623225,
+             "f1_weighted": 0.800371,
+             "precision": 0.619778,
+             "precision_weighted": 0.902615,
+             "recall": 0.78357,
+             "recall_weighted": 0.753754,
+             "ap": 0.23613,
+             "ap_weighted": 0.23613
+           },
+           {
+             "accuracy": 0.704204,
+             "f1": 0.571116,
+             "f1_weighted": 0.761959,
+             "precision": 0.586908,
+             "precision_weighted": 0.884058,
+             "recall": 0.716256,
+             "recall_weighted": 0.704204,
+             "ap": 0.184202,
+             "ap_weighted": 0.184202
+           },
+           {
+             "accuracy": 0.728228,
+             "f1": 0.611557,
+             "f1_weighted": 0.781609,
+             "precision": 0.620103,
+             "precision_weighted": 0.910621,
+             "recall": 0.802519,
+             "recall_weighted": 0.728228,
+             "ap": 0.240132,
+             "ap_weighted": 0.240132
+           },
+           {
+             "accuracy": 0.659159,
+             "f1": 0.545546,
+             "f1_weighted": 0.727054,
+             "precision": 0.582124,
+             "precision_weighted": 0.887965,
+             "recall": 0.717726,
+             "recall_weighted": 0.659159,
+             "ap": 0.178635,
+             "ap_weighted": 0.178635
+           },
+           {
+             "accuracy": 0.71021,
+             "f1": 0.587738,
+             "f1_weighted": 0.767229,
+             "precision": 0.602572,
+             "precision_weighted": 0.897745,
+             "recall": 0.759363,
+             "recall_weighted": 0.71021,
+             "ap": 0.209328,
+             "ap_weighted": 0.209328
+           },
+           {
+             "accuracy": 0.684685,
+             "f1": 0.56812,
+             "f1_weighted": 0.747347,
+             "precision": 0.594168,
+             "precision_weighted": 0.895136,
+             "recall": 0.745172,
+             "recall_weighted": 0.684685,
+             "ap": 0.196474,
+             "ap_weighted": 0.196474
+           },
+           {
+             "accuracy": 0.701201,
+             "f1": 0.578823,
+             "f1_weighted": 0.760175,
+             "precision": 0.597242,
+             "precision_weighted": 0.894588,
+             "recall": 0.747726,
+             "recall_weighted": 0.701201,
+             "ap": 0.200863,
+             "ap_weighted": 0.200863
+           },
+           {
+             "accuracy": 0.698198,
+             "f1": 0.568635,
+             "f1_weighted": 0.757478,
+             "precision": 0.587162,
+             "precision_weighted": 0.885502,
+             "recall": 0.719545,
+             "recall_weighted": 0.698198,
+             "ap": 0.184985,
+             "ap_weighted": 0.184985
+           },
+           {
+             "accuracy": 0.747748,
+             "f1": 0.625988,
+             "f1_weighted": 0.796452,
+             "precision": 0.625755,
+             "precision_weighted": 0.910405,
+             "recall": 0.806743,
+             "recall_weighted": 0.747748,
+             "ap": 0.24925,
+             "ap_weighted": 0.24925
+           },
+           {
+             "accuracy": 0.708709,
+             "f1": 0.568105,
+             "f1_weighted": 0.76495,
+             "precision": 0.581533,
+             "precision_weighted": 0.878276,
+             "recall": 0.698876,
+             "recall_weighted": 0.708709,
+             "ap": 0.175742,
+             "ap_weighted": 0.175742
+           }
+         ],
+         "accuracy": 0.70961,
+         "f1": 0.584885,
+         "f1_weighted": 0.766462,
+         "precision": 0.599734,
+         "precision_weighted": 0.894691,
+         "recall": 0.74975,
+         "recall_weighted": 0.70961,
+         "ap": 0.205574,
+         "ap_weighted": 0.205574,
+         "main_score": 0.70961,
+         "hf_subset": "en-ext",
+         "languages": [
+           "eng-Latn"
+         ]
+       },
+       {
+         "scores_per_experiment": [
+           {
+             "accuracy": 0.665672,
+             "f1": 0.611594,
+             "f1_weighted": 0.706338,
+             "precision": 0.636594,
+             "precision_weighted": 0.842013,
+             "recall": 0.736493,
+             "recall_weighted": 0.665672,
+             "ap": 0.299211,
+             "ap_weighted": 0.299211
+           },
+           {
+             "accuracy": 0.701493,
+             "f1": 0.637226,
+             "f1_weighted": 0.737044,
+             "precision": 0.645156,
+             "precision_weighted": 0.840993,
+             "recall": 0.744523,
+             "recall_weighted": 0.701493,
+             "ap": 0.312881,
+             "ap_weighted": 0.312881
+           },
+           {
+             "accuracy": 0.60597,
+             "f1": 0.546852,
+             "f1_weighted": 0.653851,
+             "precision": 0.583924,
+             "precision_weighted": 0.79384,
+             "recall": 0.645867,
+             "recall_weighted": 0.60597,
+             "ap": 0.236533,
+             "ap_weighted": 0.236533
+           },
+           {
+             "accuracy": 0.677612,
+             "f1": 0.605779,
+             "f1_weighted": 0.715789,
+             "precision": 0.616915,
+             "precision_weighted": 0.815312,
+             "recall": 0.696004,
+             "recall_weighted": 0.677612,
+             "ap": 0.27473,
+             "ap_weighted": 0.27473
+           },
+           {
+             "accuracy": 0.692537,
+             "f1": 0.622851,
+             "f1_weighted": 0.728832,
+             "precision": 0.63076,
+             "precision_weighted": 0.826836,
+             "recall": 0.718661,
+             "recall_weighted": 0.692537,
+             "ap": 0.292763,
+             "ap_weighted": 0.292763
+           },
+           {
+             "accuracy": 0.689552,
+             "f1": 0.596311,
+             "f1_weighted": 0.723142,
+             "precision": 0.598351,
+             "precision_weighted": 0.792265,
+             "recall": 0.655515,
+             "recall_weighted": 0.689552,
+             "ap": 0.250732,
+             "ap_weighted": 0.250732
+           },
+           {
+             "accuracy": 0.695522,
+             "f1": 0.609825,
+             "f1_weighted": 0.729365,
+             "precision": 0.611822,
+             "precision_weighted": 0.804774,
+             "recall": 0.679572,
+             "recall_weighted": 0.695522,
+             "ap": 0.267173,
+             "ap_weighted": 0.267173
+           },
+           {
+             "accuracy": 0.674627,
+             "f1": 0.57847,
+             "f1_weighted": 0.710085,
+             "precision": 0.583686,
+             "precision_weighted": 0.780874,
+             "recall": 0.632858,
+             "recall_weighted": 0.674627,
+             "ap": 0.235104,
+             "ap_weighted": 0.235104
+           },
+           {
+             "accuracy": 0.692537,
+             "f1": 0.622851,
+             "f1_weighted": 0.728832,
+             "precision": 0.63076,
+             "precision_weighted": 0.826836,
+             "recall": 0.718661,
+             "recall_weighted": 0.692537,
+             "ap": 0.292763,
+             "ap_weighted": 0.292763
+           },
+           {
+             "accuracy": 0.647761,
+             "f1": 0.57702,
+             "f1_weighted": 0.690103,
+             "precision": 0.596703,
+             "precision_weighted": 0.800007,
+             "recall": 0.664322,
+             "recall_weighted": 0.647761,
+             "ap": 0.250776,
+             "ap_weighted": 0.250776
+           }
+         ],
+         "accuracy": 0.674328,
+         "f1": 0.600878,
+         "f1_weighted": 0.712338,
+         "precision": 0.613467,
+         "precision_weighted": 0.812375,
+         "recall": 0.689247,
+         "recall_weighted": 0.674328,
+         "ap": 0.271267,
+         "ap_weighted": 0.271267,
+         "main_score": 0.674328,
+         "hf_subset": "en",
+         "languages": [
+           "eng-Latn"
+         ]
+       }
+     ],
+     "test": [
+       {
+         "scores_per_experiment": [
+           {
+             "accuracy": 0.787106,
+             "f1": 0.6618,
+             "f1_weighted": 0.82476,
+             "precision": 0.644837,
+             "precision_weighted": 0.909338,
+             "recall": 0.817597,
+             "recall_weighted": 0.787106,
+             "ap": 0.280992,
+             "ap_weighted": 0.280992
+           },
+           {
+             "accuracy": 0.736882,
+             "f1": 0.615747,
+             "f1_weighted": 0.786532,
+             "precision": 0.618779,
+             "precision_weighted": 0.900943,
+             "recall": 0.783206,
+             "recall_weighted": 0.736882,
+             "ap": 0.237303,
+             "ap_weighted": 0.237303
+           },
+           {
+             "accuracy": 0.742879,
+             "f1": 0.625424,
+             "f1_weighted": 0.791464,
+             "precision": 0.626567,
+             "precision_weighted": 0.906989,
+             "recall": 0.802447,
+             "recall_weighted": 0.742879,
+             "ap": 0.25176,
+             "ap_weighted": 0.25176
+           },
+           {
+             "accuracy": 0.658171,
+             "f1": 0.537212,
+             "f1_weighted": 0.724503,
+             "precision": 0.570547,
+             "precision_weighted": 0.870972,
+             "recall": 0.678878,
+             "recall_weighted": 0.658171,
+             "ap": 0.16542,
+             "ap_weighted": 0.16542
+           },
+           {
+             "accuracy": 0.754873,
+             "f1": 0.629012,
+             "f1_weighted": 0.800066,
+             "precision": 0.624459,
+             "precision_weighted": 0.900998,
+             "recall": 0.786891,
+             "recall_weighted": 0.754873,
+             "ap": 0.245608,
+             "ap_weighted": 0.245608
+           },
+           {
+             "accuracy": 0.715892,
+             "f1": 0.59492,
+             "f1_weighted": 0.770155,
+             "precision": 0.606268,
+             "precision_weighted": 0.894236,
+             "recall": 0.758776,
+             "recall_weighted": 0.715892,
+             "ap": 0.216622,
+             "ap_weighted": 0.216622
+           },
+           {
+             "accuracy": 0.715142,
+             "f1": 0.59242,
+             "f1_weighted": 0.769462,
+             "precision": 0.603797,
+             "precision_weighted": 0.891985,
+             "recall": 0.752,
+             "recall_weighted": 0.715142,
+             "ap": 0.212437,
+             "ap_weighted": 0.212437
+           },
+           {
+             "accuracy": 0.718891,
+             "f1": 0.595394,
+             "f1_weighted": 0.772344,
+             "precision": 0.605197,
+             "precision_weighted": 0.892422,
+             "recall": 0.754092,
+             "recall_weighted": 0.718891,
+             "ap": 0.214527,
+             "ap_weighted": 0.214527
+           },
+           {
+             "accuracy": 0.750375,
+             "f1": 0.624186,
+             "f1_weighted": 0.796573,
+             "precision": 0.621365,
+             "precision_weighted": 0.899394,
+             "recall": 0.781202,
+             "recall_weighted": 0.750375,
+             "ap": 0.240296,
+             "ap_weighted": 0.240296
+           },
+           {
+             "accuracy": 0.732384,
+             "f1": 0.608214,
+             "f1_weighted": 0.782812,
+             "precision": 0.612646,
+             "precision_weighted": 0.896151,
+             "recall": 0.767981,
+             "recall_weighted": 0.732384,
+             "ap": 0.22639,
+             "ap_weighted": 0.22639
+           }
+         ],
+         "accuracy": 0.731259,
+         "f1": 0.608433,
+         "f1_weighted": 0.781867,
+         "precision": 0.613446,
+         "precision_weighted": 0.896343,
+         "recall": 0.768307,
+         "recall_weighted": 0.731259,
+         "ap": 0.229135,
+         "ap_weighted": 0.229135,
+         "main_score": 0.731259,
+         "hf_subset": "en-ext",
+         "languages": [
+           "eng-Latn"
+         ]
+       },
+       {
+         "scores_per_experiment": [
+           {
+             "accuracy": 0.683582,
+             "f1": 0.638763,
+             "f1_weighted": 0.716247,
+             "precision": 0.65592,
+             "precision_weighted": 0.832264,
+             "recall": 0.745557,
+             "recall_weighted": 0.683582,
+             "ap": 0.340258,
+             "ap_weighted": 0.340258
+           },
+           {
+             "accuracy": 0.704478,
+             "f1": 0.648188,
+             "f1_weighted": 0.733883,
+             "precision": 0.651851,
+             "precision_weighted": 0.819598,
+             "recall": 0.732541,
+             "recall_weighted": 0.704478,
+             "ap": 0.336346,
+             "ap_weighted": 0.336346
+           },
+           {
+             "accuracy": 0.653731,
+             "f1": 0.607475,
+             "f1_weighted": 0.68953,
+             "precision": 0.630889,
+             "precision_weighted": 0.810116,
+             "recall": 0.70678,
+             "recall_weighted": 0.653731,
+             "ap": 0.307499,
+             "ap_weighted": 0.307499
+           },
+           {
+             "accuracy": 0.692537,
+             "f1": 0.637034,
+             "f1_weighted": 0.723467,
+             "precision": 0.644143,
+             "precision_weighted": 0.814359,
+             "recall": 0.722231,
+             "recall_weighted": 0.692537,
+             "ap": 0.325896,
+             "ap_weighted": 0.325896
+           },
+           {
+             "accuracy": 0.71194,
+             "f1": 0.657558,
+             "f1_weighted": 0.740659,
+             "precision": 0.660351,
+             "precision_weighted": 0.827401,
+             "recall": 0.745847,
+             "recall_weighted": 0.71194,
+             "ap": 0.348219,
+             "ap_weighted": 0.348219
+           },
+           {
+             "accuracy": 0.702985,
+             "f1": 0.645907,
+             "f1_weighted": 0.732479,
+             "precision": 0.649537,
+             "precision_weighted": 0.817282,
+             "recall": 0.728724,
+             "recall_weighted": 0.702985,
+             "ap": 0.333184,
+             "ap_weighted": 0.333184
+           },
+           {
+             "accuracy": 0.755224,
+             "f1": 0.685197,
+             "f1_weighted": 0.775611,
+             "precision": 0.672736,
+             "precision_weighted": 0.822982,
+             "recall": 0.743857,
+             "recall_weighted": 0.755224,
+             "ap": 0.362669,
+             "ap_weighted": 0.362669
+           },
+           {
+             "accuracy": 0.735821,
+             "f1": 0.675433,
+             "f1_weighted": 0.760687,
+             "precision": 0.669476,
+             "precision_weighted": 0.828703,
+             "recall": 0.752022,
+             "recall_weighted": 0.735821,
+             "ap": 0.360963,
+             "ap_weighted": 0.360963
+           },
+           {
+             "accuracy": 0.650746,
+             "f1": 0.599336,
+             "f1_weighted": 0.686734,
+             "precision": 0.619405,
+             "precision_weighted": 0.797005,
+             "recall": 0.687589,
+             "recall_weighted": 0.650746,
+             "ap": 0.294448,
+             "ap_weighted": 0.294448
+           },
+           {
+             "accuracy": 0.658209,
+             "f1": 0.604451,
+             "f1_weighted": 0.69325,
+             "precision": 0.621122,
+             "precision_weighted": 0.797068,
+             "recall": 0.689338,
+             "recall_weighted": 0.658209,
502
+ "ap": 0.29672,
503
+ "ap_weighted": 0.29672
504
+ }
505
+ ],
506
+ "accuracy": 0.694925,
507
+ "f1": 0.639934,
508
+ "f1_weighted": 0.725255,
509
+ "precision": 0.647543,
510
+ "precision_weighted": 0.816678,
511
+ "recall": 0.725449,
512
+ "recall_weighted": 0.694925,
513
+ "ap": 0.33062,
514
+ "ap_weighted": 0.33062,
515
+ "main_score": 0.694925,
516
+ "hf_subset": "en",
517
+ "languages": [
518
+ "eng-Latn"
519
+ ]
520
+ }
521
+ ]
522
+ },
523
+ "evaluation_time": 46.4682559967041,
524
+ "kg_co2_emissions": null,
525
+ "date": null
526
+ }
results/AmazonPolarityClassification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "e2d317d38cd51312af73b3d32a06d1a08b442046",
3
+ "task_name": "AmazonPolarityClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.79523,
11
+ "f1": 0.795028,
12
+ "f1_weighted": 0.795028,
13
+ "precision": 0.7964,
14
+ "precision_weighted": 0.7964,
15
+ "recall": 0.79523,
16
+ "recall_weighted": 0.79523,
17
+ "ap": 0.740619,
18
+ "ap_weighted": 0.740619
19
+ },
20
+ {
21
+ "accuracy": 0.768142,
22
+ "f1": 0.768142,
23
+ "f1_weighted": 0.768142,
24
+ "precision": 0.768144,
25
+ "precision_weighted": 0.768144,
26
+ "recall": 0.768142,
27
+ "recall_weighted": 0.768142,
28
+ "ap": 0.705815,
29
+ "ap_weighted": 0.705815
30
+ },
31
+ {
32
+ "accuracy": 0.827712,
33
+ "f1": 0.827712,
34
+ "f1_weighted": 0.827712,
35
+ "precision": 0.827716,
36
+ "precision_weighted": 0.827716,
37
+ "recall": 0.827712,
38
+ "recall_weighted": 0.827712,
39
+ "ap": 0.771595,
40
+ "ap_weighted": 0.771595
41
+ },
42
+ {
43
+ "accuracy": 0.813983,
44
+ "f1": 0.812903,
45
+ "f1_weighted": 0.812903,
46
+ "precision": 0.821403,
47
+ "precision_weighted": 0.821403,
48
+ "recall": 0.813983,
49
+ "recall_weighted": 0.813983,
50
+ "ap": 0.77324,
51
+ "ap_weighted": 0.77324
52
+ },
53
+ {
54
+ "accuracy": 0.798872,
55
+ "f1": 0.798778,
56
+ "f1_weighted": 0.798778,
57
+ "precision": 0.799433,
58
+ "precision_weighted": 0.799433,
59
+ "recall": 0.798873,
60
+ "recall_weighted": 0.798872,
61
+ "ap": 0.7428,
62
+ "ap_weighted": 0.7428
63
+ },
64
+ {
65
+ "accuracy": 0.757687,
66
+ "f1": 0.755761,
67
+ "f1_weighted": 0.755761,
68
+ "precision": 0.766082,
69
+ "precision_weighted": 0.766082,
70
+ "recall": 0.757688,
71
+ "recall_weighted": 0.757687,
72
+ "ap": 0.709588,
73
+ "ap_weighted": 0.709588
74
+ },
75
+ {
76
+ "accuracy": 0.796585,
77
+ "f1": 0.796115,
78
+ "f1_weighted": 0.796115,
79
+ "precision": 0.799344,
80
+ "precision_weighted": 0.799344,
81
+ "recall": 0.796585,
82
+ "recall_weighted": 0.796585,
83
+ "ap": 0.72855,
84
+ "ap_weighted": 0.72855
85
+ },
86
+ {
87
+ "accuracy": 0.834205,
88
+ "f1": 0.833444,
89
+ "f1_weighted": 0.833444,
90
+ "precision": 0.840423,
91
+ "precision_weighted": 0.840423,
92
+ "recall": 0.834205,
93
+ "recall_weighted": 0.834205,
94
+ "ap": 0.79625,
95
+ "ap_weighted": 0.79625
96
+ },
97
+ {
98
+ "accuracy": 0.77233,
99
+ "f1": 0.768669,
100
+ "f1_weighted": 0.768669,
101
+ "precision": 0.790736,
102
+ "precision_weighted": 0.790736,
103
+ "recall": 0.77233,
104
+ "recall_weighted": 0.77233,
105
+ "ap": 0.69542,
106
+ "ap_weighted": 0.69542
107
+ },
108
+ {
109
+ "accuracy": 0.820133,
110
+ "f1": 0.819212,
111
+ "f1_weighted": 0.819212,
112
+ "precision": 0.826786,
113
+ "precision_weighted": 0.826786,
114
+ "recall": 0.820133,
115
+ "recall_weighted": 0.820133,
116
+ "ap": 0.749753,
117
+ "ap_weighted": 0.749753
118
+ }
119
+ ],
120
+ "accuracy": 0.798488,
121
+ "f1": 0.797576,
122
+ "f1_weighted": 0.797576,
123
+ "precision": 0.803647,
124
+ "precision_weighted": 0.803647,
125
+ "recall": 0.798488,
126
+ "recall_weighted": 0.798488,
127
+ "ap": 0.741363,
128
+ "ap_weighted": 0.741363,
129
+ "main_score": 0.798488,
130
+ "hf_subset": "default",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 6883.579201698303,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/AmazonReviewsClassification.json ADDED
@@ -0,0 +1,270 @@
1
+ {
2
+ "dataset_revision": "6b5d328eaae8ef408dd7d775040245cf86f92e9d",
3
+ "task_name": "AmazonReviewsClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "validation": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.4034,
11
+ "f1": 0.38865,
12
+ "f1_weighted": 0.38865,
13
+ "precision": 0.389995,
14
+ "precision_weighted": 0.389995,
15
+ "recall": 0.4034,
16
+ "recall_weighted": 0.4034,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.403,
22
+ "f1": 0.391288,
23
+ "f1_weighted": 0.391288,
24
+ "precision": 0.394509,
25
+ "precision_weighted": 0.394509,
26
+ "recall": 0.403,
27
+ "recall_weighted": 0.403,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.4,
33
+ "f1": 0.393261,
34
+ "f1_weighted": 0.393261,
35
+ "precision": 0.390257,
36
+ "precision_weighted": 0.390257,
37
+ "recall": 0.4,
38
+ "recall_weighted": 0.4,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.3842,
44
+ "f1": 0.383545,
45
+ "f1_weighted": 0.383545,
46
+ "precision": 0.383638,
47
+ "precision_weighted": 0.383638,
48
+ "recall": 0.3842,
49
+ "recall_weighted": 0.3842,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.4018,
55
+ "f1": 0.388243,
56
+ "f1_weighted": 0.388243,
57
+ "precision": 0.397556,
58
+ "precision_weighted": 0.397556,
59
+ "recall": 0.4018,
60
+ "recall_weighted": 0.4018,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.372,
66
+ "f1": 0.366752,
67
+ "f1_weighted": 0.366752,
68
+ "precision": 0.377653,
69
+ "precision_weighted": 0.377653,
70
+ "recall": 0.372,
71
+ "recall_weighted": 0.372,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.3696,
77
+ "f1": 0.36381,
78
+ "f1_weighted": 0.36381,
79
+ "precision": 0.361893,
80
+ "precision_weighted": 0.361893,
81
+ "recall": 0.3696,
82
+ "recall_weighted": 0.3696,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.4232,
88
+ "f1": 0.419273,
89
+ "f1_weighted": 0.419273,
90
+ "precision": 0.419072,
91
+ "precision_weighted": 0.419072,
92
+ "recall": 0.4232,
93
+ "recall_weighted": 0.4232,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.3938,
99
+ "f1": 0.382338,
100
+ "f1_weighted": 0.382338,
101
+ "precision": 0.394634,
102
+ "precision_weighted": 0.394634,
103
+ "recall": 0.3938,
104
+ "recall_weighted": 0.3938,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.3886,
110
+ "f1": 0.368353,
111
+ "f1_weighted": 0.368353,
112
+ "precision": 0.375478,
113
+ "precision_weighted": 0.375478,
114
+ "recall": 0.3886,
115
+ "recall_weighted": 0.3886,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.39396,
121
+ "f1": 0.384551,
122
+ "f1_weighted": 0.384551,
123
+ "precision": 0.388469,
124
+ "precision_weighted": 0.388469,
125
+ "recall": 0.39396,
126
+ "recall_weighted": 0.39396,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.39396,
130
+ "hf_subset": "en",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ],
136
+ "test": [
137
+ {
138
+ "scores_per_experiment": [
139
+ {
140
+ "accuracy": 0.4112,
141
+ "f1": 0.39289,
142
+ "f1_weighted": 0.39289,
143
+ "precision": 0.392955,
144
+ "precision_weighted": 0.392955,
145
+ "recall": 0.4112,
146
+ "recall_weighted": 0.4112,
147
+ "ap": null,
148
+ "ap_weighted": null
149
+ },
150
+ {
151
+ "accuracy": 0.4134,
152
+ "f1": 0.4003,
153
+ "f1_weighted": 0.4003,
154
+ "precision": 0.403745,
155
+ "precision_weighted": 0.403745,
156
+ "recall": 0.4134,
157
+ "recall_weighted": 0.4134,
158
+ "ap": null,
159
+ "ap_weighted": null
160
+ },
161
+ {
162
+ "accuracy": 0.3986,
163
+ "f1": 0.390911,
164
+ "f1_weighted": 0.390911,
165
+ "precision": 0.387781,
166
+ "precision_weighted": 0.387781,
167
+ "recall": 0.3986,
168
+ "recall_weighted": 0.3986,
169
+ "ap": null,
170
+ "ap_weighted": null
171
+ },
172
+ {
173
+ "accuracy": 0.391,
174
+ "f1": 0.389972,
175
+ "f1_weighted": 0.389972,
176
+ "precision": 0.389224,
177
+ "precision_weighted": 0.389224,
178
+ "recall": 0.391,
179
+ "recall_weighted": 0.391,
180
+ "ap": null,
181
+ "ap_weighted": null
182
+ },
183
+ {
184
+ "accuracy": 0.4046,
185
+ "f1": 0.391136,
186
+ "f1_weighted": 0.391136,
187
+ "precision": 0.39954,
188
+ "precision_weighted": 0.39954,
189
+ "recall": 0.4046,
190
+ "recall_weighted": 0.4046,
191
+ "ap": null,
192
+ "ap_weighted": null
193
+ },
194
+ {
195
+ "accuracy": 0.3684,
196
+ "f1": 0.363944,
197
+ "f1_weighted": 0.363944,
198
+ "precision": 0.378502,
199
+ "precision_weighted": 0.378502,
200
+ "recall": 0.3684,
201
+ "recall_weighted": 0.3684,
202
+ "ap": null,
203
+ "ap_weighted": null
204
+ },
205
+ {
206
+ "accuracy": 0.368,
207
+ "f1": 0.362377,
208
+ "f1_weighted": 0.362377,
209
+ "precision": 0.360133,
210
+ "precision_weighted": 0.360133,
211
+ "recall": 0.368,
212
+ "recall_weighted": 0.368,
213
+ "ap": null,
214
+ "ap_weighted": null
215
+ },
216
+ {
217
+ "accuracy": 0.4226,
218
+ "f1": 0.417629,
219
+ "f1_weighted": 0.417629,
220
+ "precision": 0.4173,
221
+ "precision_weighted": 0.4173,
222
+ "recall": 0.4226,
223
+ "recall_weighted": 0.4226,
224
+ "ap": null,
225
+ "ap_weighted": null
226
+ },
227
+ {
228
+ "accuracy": 0.3836,
229
+ "f1": 0.369139,
230
+ "f1_weighted": 0.369139,
231
+ "precision": 0.380239,
232
+ "precision_weighted": 0.380239,
233
+ "recall": 0.3836,
234
+ "recall_weighted": 0.3836,
235
+ "ap": null,
236
+ "ap_weighted": null
237
+ },
238
+ {
239
+ "accuracy": 0.386,
240
+ "f1": 0.365965,
241
+ "f1_weighted": 0.365965,
242
+ "precision": 0.370847,
243
+ "precision_weighted": 0.370847,
244
+ "recall": 0.386,
245
+ "recall_weighted": 0.386,
246
+ "ap": null,
247
+ "ap_weighted": null
248
+ }
249
+ ],
250
+ "accuracy": 0.39474,
251
+ "f1": 0.384426,
252
+ "f1_weighted": 0.384426,
253
+ "precision": 0.388027,
254
+ "precision_weighted": 0.388027,
255
+ "recall": 0.39474,
256
+ "recall_weighted": 0.39474,
257
+ "ap": NaN,
258
+ "ap_weighted": NaN,
259
+ "main_score": 0.39474,
260
+ "hf_subset": "en",
261
+ "languages": [
262
+ "eng-Latn"
263
+ ]
264
+ }
265
+ ]
266
+ },
267
+ "evaluation_time": 218.71461749076843,
268
+ "kg_co2_emissions": null,
269
+ "date": null
270
+ }
results/ArXivHierarchicalClusteringP2P.json ADDED
@@ -0,0 +1,47 @@
1
+ {
2
+ "dataset_revision": "0bbdb47bcbe3a90093699aefeed338a0f28a7ee8",
3
+ "task_name": "ArXivHierarchicalClusteringP2P",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "v_measures": {
9
+ "Level 0": [
10
+ 0.524556,
11
+ 0.539165,
12
+ 0.554688,
13
+ 0.531442,
14
+ 0.552495,
15
+ 0.517158,
16
+ 0.517534,
17
+ 0.560027,
18
+ 0.530731,
19
+ 0.514275
20
+ ],
21
+ "Level 1": [
22
+ 0.591854,
23
+ 0.557006,
24
+ 0.556008,
25
+ 0.575964,
26
+ 0.592051,
27
+ 0.573672,
28
+ 0.598428,
29
+ 0.600044,
30
+ 0.589496,
31
+ 0.588521
32
+ ]
33
+ },
34
+ "v_measure": 0.558256,
35
+ "v_measure_std": 0.028625,
36
+ "main_score": 0.558256,
37
+ "hf_subset": "default",
38
+ "languages": [
39
+ "eng-Latn"
40
+ ]
41
+ }
42
+ ]
43
+ },
44
+ "evaluation_time": 2.750295877456665,
45
+ "kg_co2_emissions": null,
46
+ "date": null
47
+ }
results/ArXivHierarchicalClusteringS2S.json ADDED
@@ -0,0 +1,47 @@
1
+ {
2
+ "dataset_revision": "b73bd54100e5abfa6e3a23dcafb46fe4d2438dc3",
3
+ "task_name": "ArXivHierarchicalClusteringS2S",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "v_measures": {
9
+ "Level 0": [
10
+ 0.440306,
11
+ 0.472852,
12
+ 0.47438,
13
+ 0.447286,
14
+ 0.504828,
15
+ 0.490292,
16
+ 0.477203,
17
+ 0.516111,
18
+ 0.528601,
19
+ 0.492548
20
+ ],
21
+ "Level 1": [
22
+ 0.572926,
23
+ 0.562026,
24
+ 0.571448,
25
+ 0.579162,
26
+ 0.571166,
27
+ 0.539986,
28
+ 0.572516,
29
+ 0.59407,
30
+ 0.560901,
31
+ 0.576501
32
+ ]
33
+ },
34
+ "v_measure": 0.527255,
35
+ "v_measure_std": 0.047706,
36
+ "main_score": 0.527255,
37
+ "hf_subset": "default",
38
+ "languages": [
39
+ "eng-Latn"
40
+ ]
41
+ }
42
+ ]
43
+ },
44
+ "evaluation_time": 2.693020820617676,
45
+ "kg_co2_emissions": null,
46
+ "date": null
47
+ }
results/ArguAna.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "c22ab2a51041ffd869aaddef7af8d8215647e41a",
3
+ "task_name": "ArguAna",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.2091,
9
+ "ndcg_at_3": 0.3348,
10
+ "ndcg_at_5": 0.38713,
11
+ "ndcg_at_10": 0.45003,
12
+ "ndcg_at_20": 0.48337,
13
+ "ndcg_at_100": 0.50263,
14
+ "ndcg_at_1000": 0.50506,
15
+ "map_at_1": 0.2091,
16
+ "map_at_3": 0.30334,
17
+ "map_at_5": 0.33233,
18
+ "map_at_10": 0.35861,
19
+ "map_at_20": 0.36817,
20
+ "map_at_100": 0.37103,
21
+ "map_at_1000": 0.37112,
22
+ "recall_at_1": 0.2091,
23
+ "recall_at_3": 0.42603,
24
+ "recall_at_5": 0.55334,
25
+ "recall_at_10": 0.74609,
26
+ "recall_at_20": 0.87482,
27
+ "recall_at_100": 0.97582,
28
+ "recall_at_1000": 0.99502,
29
+ "accuracy": 0.2091,
30
+ "precision_at_1": 0.2091,
31
+ "precision_at_3": 0.14201,
32
+ "precision_at_5": 0.11067,
33
+ "precision_at_10": 0.07461,
34
+ "precision_at_20": 0.04374,
35
+ "precision_at_100": 0.00976,
36
+ "precision_at_1000": 0.001,
37
+ "mrr_at_1": 0.217639,
38
+ "mrr_at_3": 0.305951,
39
+ "mrr_at_5": 0.334898,
40
+ "mrr_at_10": 0.361645,
41
+ "mrr_at_20": 0.371208,
42
+ "mrr_at_100": 0.374067,
43
+ "mrr_at_1000": 0.374154,
44
+ "nauc_ndcg_at_1_max": -0.029418,
45
+ "nauc_ndcg_at_1_std": -0.101331,
46
+ "nauc_ndcg_at_1_diff1": 0.179698,
47
+ "nauc_ndcg_at_3_max": -0.01433,
48
+ "nauc_ndcg_at_3_std": -0.093149,
49
+ "nauc_ndcg_at_3_diff1": 0.108275,
50
+ "nauc_ndcg_at_5_max": 0.008268,
51
+ "nauc_ndcg_at_5_std": -0.075439,
52
+ "nauc_ndcg_at_5_diff1": 0.105102,
53
+ "nauc_ndcg_at_10_max": 0.021083,
54
+ "nauc_ndcg_at_10_std": -0.072937,
55
+ "nauc_ndcg_at_10_diff1": 0.108835,
56
+ "nauc_ndcg_at_20_max": 0.016786,
57
+ "nauc_ndcg_at_20_std": -0.079977,
58
+ "nauc_ndcg_at_20_diff1": 0.1115,
59
+ "nauc_ndcg_at_100_max": 0.011099,
60
+ "nauc_ndcg_at_100_std": -0.069659,
61
+ "nauc_ndcg_at_100_diff1": 0.118202,
62
+ "nauc_ndcg_at_1000_max": 0.006559,
63
+ "nauc_ndcg_at_1000_std": -0.076419,
64
+ "nauc_ndcg_at_1000_diff1": 0.118201,
65
+ "nauc_map_at_1_max": -0.029418,
66
+ "nauc_map_at_1_std": -0.101331,
67
+ "nauc_map_at_1_diff1": 0.179698,
68
+ "nauc_map_at_3_max": -0.016701,
69
+ "nauc_map_at_3_std": -0.093936,
70
+ "nauc_map_at_3_diff1": 0.122906,
71
+ "nauc_map_at_5_max": -0.004149,
72
+ "nauc_map_at_5_std": -0.083949,
73
+ "nauc_map_at_5_diff1": 0.121035,
74
+ "nauc_map_at_10_max": -0.000141,
75
+ "nauc_map_at_10_std": -0.083348,
76
+ "nauc_map_at_10_diff1": 0.122678,
77
+ "nauc_map_at_20_max": -0.001735,
78
+ "nauc_map_at_20_std": -0.085311,
79
+ "nauc_map_at_20_diff1": 0.123796,
80
+ "nauc_map_at_100_max": -0.002429,
81
+ "nauc_map_at_100_std": -0.084061,
82
+ "nauc_map_at_100_diff1": 0.12483,
83
+ "nauc_map_at_1000_max": -0.00257,
84
+ "nauc_map_at_1000_std": -0.084237,
85
+ "nauc_map_at_1000_diff1": 0.124806,
86
+ "nauc_recall_at_1_max": -0.029418,
87
+ "nauc_recall_at_1_std": -0.101331,
88
+ "nauc_recall_at_1_diff1": 0.179698,
89
+ "nauc_recall_at_3_max": -0.008473,
90
+ "nauc_recall_at_3_std": -0.091466,
91
+ "nauc_recall_at_3_diff1": 0.070363,
92
+ "nauc_recall_at_5_max": 0.045889,
93
+ "nauc_recall_at_5_std": -0.049624,
94
+ "nauc_recall_at_5_diff1": 0.061092,
95
+ "nauc_recall_at_10_max": 0.117279,
96
+ "nauc_recall_at_10_std": -0.027709,
97
+ "nauc_recall_at_10_diff1": 0.05909,
98
+ "nauc_recall_at_20_max": 0.162842,
99
+ "nauc_recall_at_20_std": -0.049204,
100
+ "nauc_recall_at_20_diff1": 0.037265,
101
+ "nauc_recall_at_100_max": 0.43144,
102
+ "nauc_recall_at_100_std": 0.616435,
103
+ "nauc_recall_at_100_diff1": 0.081506,
104
+ "nauc_recall_at_1000_max": 0.322347,
105
+ "nauc_recall_at_1000_std": 0.593208,
106
+ "nauc_recall_at_1000_diff1": 0.02695,
107
+ "nauc_precision_at_1_max": -0.029418,
108
+ "nauc_precision_at_1_std": -0.101331,
109
+ "nauc_precision_at_1_diff1": 0.179698,
110
+ "nauc_precision_at_3_max": -0.008473,
111
+ "nauc_precision_at_3_std": -0.091466,
112
+ "nauc_precision_at_3_diff1": 0.070363,
113
+ "nauc_precision_at_5_max": 0.045889,
114
+ "nauc_precision_at_5_std": -0.049624,
115
+ "nauc_precision_at_5_diff1": 0.061092,
116
+ "nauc_precision_at_10_max": 0.117279,
117
+ "nauc_precision_at_10_std": -0.027709,
118
+ "nauc_precision_at_10_diff1": 0.05909,
119
+ "nauc_precision_at_20_max": 0.162842,
120
+ "nauc_precision_at_20_std": -0.049204,
121
+ "nauc_precision_at_20_diff1": 0.037265,
122
+ "nauc_precision_at_100_max": 0.43144,
123
+ "nauc_precision_at_100_std": 0.616435,
124
+ "nauc_precision_at_100_diff1": 0.081506,
125
+ "nauc_precision_at_1000_max": 0.322347,
126
+ "nauc_precision_at_1000_std": 0.593208,
127
+ "nauc_precision_at_1000_diff1": 0.02695,
128
+ "nauc_mrr_at_1_max": -0.020448,
129
+ "nauc_mrr_at_1_std": -0.091764,
130
+ "nauc_mrr_at_1_diff1": 0.148274,
131
+ "nauc_mrr_at_3_max": -0.021561,
132
+ "nauc_mrr_at_3_std": -0.091499,
133
+ "nauc_mrr_at_3_diff1": 0.100988,
134
+ "nauc_mrr_at_5_max": -0.011845,
135
+ "nauc_mrr_at_5_std": -0.081702,
136
+ "nauc_mrr_at_5_diff1": 0.096502,
137
+ "nauc_mrr_at_10_max": -0.004838,
138
+ "nauc_mrr_at_10_std": -0.080537,
139
+ "nauc_mrr_at_10_diff1": 0.100127,
140
+ "nauc_mrr_at_20_max": -0.006619,
141
+ "nauc_mrr_at_20_std": -0.082414,
142
+ "nauc_mrr_at_20_diff1": 0.100562,
143
+ "nauc_mrr_at_100_max": -0.007365,
144
+ "nauc_mrr_at_100_std": -0.081147,
145
+ "nauc_mrr_at_100_diff1": 0.101363,
146
+ "nauc_mrr_at_1000_max": -0.007506,
147
+ "nauc_mrr_at_1000_std": -0.081321,
148
+ "nauc_mrr_at_1000_diff1": 0.101332,
149
+ "hit_rate_at_1": 0.2091,
150
+ "hit_rate_at_3": 0.42603,
151
+ "hit_rate_at_5": 0.55334,
152
+ "hit_rate_at_10": 0.74609,
153
+ "hit_rate_at_20": 0.87482,
154
+ "hit_rate_at_100": 0.97582,
155
+ "hit_rate_at_1000": 0.99502,
156
+ "main_score": 0.45003,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 66.03028631210327,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/AskUbuntuDupQuestions.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "c5691e3c48741d5f83b5cc8e630653d7a8cfc048",
3
+ "task_name": "AskUbuntuDupQuestions",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.54571,
9
+ "ndcg_at_3": 0.54781,
10
+ "ndcg_at_5": 0.55769,
11
+ "ndcg_at_10": 0.61982,
12
+ "ndcg_at_20": 0.73303,
13
+ "ndcg_at_100": 0.73303,
14
+ "ndcg_at_1000": 0.73303,
15
+ "map_at_1": 0.13839,
16
+ "map_at_3": 0.27646,
17
+ "map_at_5": 0.3514,
18
+ "map_at_10": 0.45988,
19
+ "map_at_20": 0.56761,
20
+ "map_at_100": 0.56761,
21
+ "map_at_1000": 0.56761,
22
+ "recall_at_1": 0.13839,
23
+ "recall_at_3": 0.33807,
24
+ "recall_at_5": 0.4726,
25
+ "recall_at_10": 0.71055,
26
+ "recall_at_20": 1.0,
27
+ "recall_at_100": 1.0,
28
+ "recall_at_1000": 1.0,
29
+ "accuracy": 0.13839,
30
+ "precision_at_1": 0.54571,
31
+ "precision_at_3": 0.494,
32
+ "precision_at_5": 0.43712,
33
+ "precision_at_10": 0.35817,
34
+ "precision_at_20": 0.27355,
35
+ "precision_at_100": 0.05471,
36
+ "precision_at_1000": 0.00547,
37
+ "mrr_at_1": 0.545706,
38
+ "mrr_at_3": 0.661127,
39
+ "mrr_at_5": 0.679132,
40
+ "mrr_at_10": 0.688815,
41
+ "mrr_at_20": 0.691918,
42
+ "mrr_at_100": 0.691918,
43
+ "mrr_at_1000": 0.691918,
44
+ "nauc_ndcg_at_1_max": 0.18749,
45
+ "nauc_ndcg_at_1_std": 0.177788,
46
+ "nauc_ndcg_at_1_diff1": 0.106098,
47
+ "nauc_ndcg_at_3_max": 0.167371,
48
+ "nauc_ndcg_at_3_std": 0.170243,
49
+ "nauc_ndcg_at_3_diff1": 0.092766,
50
+ "nauc_ndcg_at_5_max": 0.118467,
51
+ "nauc_ndcg_at_5_std": 0.155563,
52
+ "nauc_ndcg_at_5_diff1": 0.065131,
53
+ "nauc_ndcg_at_10_max": 0.169054,
54
+ "nauc_ndcg_at_10_std": 0.218224,
55
+ "nauc_ndcg_at_10_diff1": 0.089675,
56
+ "nauc_ndcg_at_20_max": 0.217034,
57
+ "nauc_ndcg_at_20_std": 0.179658,
58
+ "nauc_ndcg_at_20_diff1": 0.118978,
59
+ "nauc_ndcg_at_100_max": 0.217034,
60
+ "nauc_ndcg_at_100_std": 0.179658,
61
+ "nauc_ndcg_at_100_diff1": 0.118978,
62
+ "nauc_ndcg_at_1000_max": 0.217034,
63
+ "nauc_ndcg_at_1000_std": 0.179658,
64
+ "nauc_ndcg_at_1000_diff1": 0.118978,
65
+ "nauc_map_at_1_max": -0.013621,
66
+ "nauc_map_at_1_std": 0.026778,
67
+ "nauc_map_at_1_diff1": 0.191883,
68
+ "nauc_map_at_3_max": 0.005575,
69
+ "nauc_map_at_3_std": 0.143969,
70
+ "nauc_map_at_3_diff1": 0.146691,
71
+ "nauc_map_at_5_max": 0.018933,
72
+ "nauc_map_at_5_std": 0.159279,
73
+ "nauc_map_at_5_diff1": 0.108904,
74
+ "nauc_map_at_10_max": 0.120817,
75
+ "nauc_map_at_10_std": 0.2133,
76
+ "nauc_map_at_10_diff1": 0.111823,
77
+ "nauc_map_at_20_max": 0.180269,
78
+ "nauc_map_at_20_std": 0.180391,
79
+ "nauc_map_at_20_diff1": 0.107869,
80
+ "nauc_map_at_100_max": 0.180269,
81
+ "nauc_map_at_100_std": 0.180391,
82
+ "nauc_map_at_100_diff1": 0.107869,
83
+ "nauc_map_at_1000_max": 0.180269,
84
+ "nauc_map_at_1000_std": 0.180391,
85
+ "nauc_map_at_1000_diff1": 0.107869,
86
+ "nauc_recall_at_1_max": -0.013621,
87
+ "nauc_recall_at_1_std": 0.026778,
88
+ "nauc_recall_at_1_diff1": 0.191883,
89
+ "nauc_recall_at_3_max": -0.06167,
90
+ "nauc_recall_at_3_std": 0.09921,
91
+ "nauc_recall_at_3_diff1": 0.113471,
92
+ "nauc_recall_at_5_max": -0.094531,
93
+ "nauc_recall_at_5_std": 0.115518,
94
+ "nauc_recall_at_5_diff1": 0.018277,
95
+ "nauc_recall_at_10_max": 0.006948,
96
+ "nauc_recall_at_10_std": 0.206632,
97
+ "nauc_recall_at_10_diff1": -0.010089,
98
+ "nauc_recall_at_20_max": NaN,
99
+ "nauc_recall_at_20_std": NaN,
100
+ "nauc_recall_at_20_diff1": NaN,
101
+ "nauc_recall_at_100_max": NaN,
102
+ "nauc_recall_at_100_std": NaN,
103
+ "nauc_recall_at_100_diff1": NaN,
104
+ "nauc_recall_at_1000_max": NaN,
105
+ "nauc_recall_at_1000_std": NaN,
106
+ "nauc_recall_at_1000_diff1": NaN,
107
+ "nauc_precision_at_1_max": 0.18749,
108
+ "nauc_precision_at_1_std": 0.177788,
109
+ "nauc_precision_at_1_diff1": 0.106098,
110
+ "nauc_precision_at_3_max": 0.220245,
111
+ "nauc_precision_at_3_std": 0.191283,
112
+ "nauc_precision_at_3_diff1": 0.043229,
113
+ "nauc_precision_at_5_max": 0.208588,
114
+ "nauc_precision_at_5_std": 0.140208,
115
+ "nauc_precision_at_5_diff1": -0.013675,
116
+ "nauc_precision_at_10_max": 0.268601,
117
+ "nauc_precision_at_10_std": 0.121868,
118
+ "nauc_precision_at_10_diff1": -0.012275,
119
+ "nauc_precision_at_20_max": 0.213808,
120
+ "nauc_precision_at_20_std": 0.024965,
121
+ "nauc_precision_at_20_diff1": -0.016831,
122
+ "nauc_precision_at_100_max": 0.213808,
123
+ "nauc_precision_at_100_std": 0.024965,
124
+ "nauc_precision_at_100_diff1": -0.016831,
125
+ "nauc_precision_at_1000_max": 0.213808,
126
+ "nauc_precision_at_1000_std": 0.024965,
127
+ "nauc_precision_at_1000_diff1": -0.016831,
128
+ "nauc_mrr_at_1_max": 0.18749,
129
+ "nauc_mrr_at_1_std": 0.177788,
130
+ "nauc_mrr_at_1_diff1": 0.106098,
131
+ "nauc_mrr_at_3_max": 0.204127,
132
+ "nauc_mrr_at_3_std": 0.158771,
133
+ "nauc_mrr_at_3_diff1": 0.135983,
134
+ "nauc_mrr_at_5_max": 0.194297,
135
+ "nauc_mrr_at_5_std": 0.171215,
136
+ "nauc_mrr_at_5_diff1": 0.119795,
137
+ "nauc_mrr_at_10_max": 0.202435,
138
+ "nauc_mrr_at_10_std": 0.173079,
139
+ "nauc_mrr_at_10_diff1": 0.119667,
140
+ "nauc_mrr_at_20_max": 0.200396,
141
+ "nauc_mrr_at_20_std": 0.17212,
142
+ "nauc_mrr_at_20_diff1": 0.120985,
143
+ "nauc_mrr_at_100_max": 0.200396,
144
+ "nauc_mrr_at_100_std": 0.17212,
145
+ "nauc_mrr_at_100_diff1": 0.120985,
146
+ "nauc_mrr_at_1000_max": 0.200396,
147
+ "nauc_mrr_at_1000_std": 0.17212,
148
+ "nauc_mrr_at_1000_diff1": 0.120985,
149
+ "hit_rate_at_1": 0.54571,
150
+ "hit_rate_at_3": 0.8144,
151
+ "hit_rate_at_5": 0.89197,
152
+ "hit_rate_at_10": 0.95845,
153
+ "hit_rate_at_20": 1.0,
154
+ "hit_rate_at_100": 1.0,
155
+ "hit_rate_at_1000": 1.0,
156
+ "main_score": 0.56761,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 48.22972011566162,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/BIOSSES.json ADDED
@@ -0,0 +1,27 @@
1
+ {
2
+ "dataset_revision": "d3fb88f8f02e40887cd149695127462bbcf29b4a",
3
+ "task_name": "BIOSSES",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "pearson": 0.844543,
9
+ "spearman": 0.841519,
10
+ "cosine_pearson": 0.844543,
11
+ "cosine_spearman": 0.841519,
12
+ "manhattan_pearson": 0.826578,
13
+ "manhattan_spearman": 0.842796,
14
+ "euclidean_pearson": 0.829201,
15
+ "euclidean_spearman": 0.841519,
16
+ "main_score": 0.841519,
17
+ "hf_subset": "default",
18
+ "languages": [
19
+ "eng-Latn"
20
+ ]
21
+ }
22
+ ]
23
+ },
24
+ "evaluation_time": 5.795704126358032,
25
+ "kg_co2_emissions": null,
26
+ "date": null
27
+ }
results/Banking77Classification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "0fd18e25b25c072e09e0d92ab615fda904d66300",
3
+ "task_name": "Banking77Classification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.78539,
11
+ "f1": 0.775916,
12
+ "f1_weighted": 0.775916,
13
+ "precision": 0.801046,
14
+ "precision_weighted": 0.801046,
15
+ "recall": 0.78539,
16
+ "recall_weighted": 0.78539,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.8,
22
+ "f1": 0.792747,
23
+ "f1_weighted": 0.792747,
24
+ "precision": 0.812947,
25
+ "precision_weighted": 0.812947,
26
+ "recall": 0.8,
27
+ "recall_weighted": 0.8,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.788312,
33
+ "f1": 0.779776,
34
+ "f1_weighted": 0.779776,
35
+ "precision": 0.807452,
36
+ "precision_weighted": 0.807452,
37
+ "recall": 0.788312,
38
+ "recall_weighted": 0.788312,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.790909,
44
+ "f1": 0.784672,
45
+ "f1_weighted": 0.784672,
46
+ "precision": 0.804151,
47
+ "precision_weighted": 0.804151,
48
+ "recall": 0.790909,
49
+ "recall_weighted": 0.790909,
+ "ap": null,
+ "ap_weighted": null
+ },
+ {
+ "accuracy": 0.783766,
+ "f1": 0.775977,
+ "f1_weighted": 0.775977,
+ "precision": 0.80836,
+ "precision_weighted": 0.80836,
+ "recall": 0.783766,
+ "recall_weighted": 0.783766,
+ "ap": null,
+ "ap_weighted": null
+ },
+ {
+ "accuracy": 0.77987,
+ "f1": 0.775546,
+ "f1_weighted": 0.775546,
+ "precision": 0.798778,
+ "precision_weighted": 0.798778,
+ "recall": 0.77987,
+ "recall_weighted": 0.77987,
+ "ap": null,
+ "ap_weighted": null
+ },
+ {
+ "accuracy": 0.78539,
+ "f1": 0.778858,
+ "f1_weighted": 0.778858,
+ "precision": 0.802693,
+ "precision_weighted": 0.802693,
+ "recall": 0.78539,
+ "recall_weighted": 0.78539,
+ "ap": null,
+ "ap_weighted": null
+ },
+ {
+ "accuracy": 0.786688,
+ "f1": 0.778584,
+ "f1_weighted": 0.778584,
+ "precision": 0.803679,
+ "precision_weighted": 0.803679,
+ "recall": 0.786688,
+ "recall_weighted": 0.786688,
+ "ap": null,
+ "ap_weighted": null
+ },
+ {
+ "accuracy": 0.771104,
+ "f1": 0.760284,
+ "f1_weighted": 0.760284,
+ "precision": 0.791352,
+ "precision_weighted": 0.791352,
+ "recall": 0.771104,
+ "recall_weighted": 0.771104,
+ "ap": null,
+ "ap_weighted": null
+ },
+ {
+ "accuracy": 0.784091,
+ "f1": 0.77652,
+ "f1_weighted": 0.77652,
+ "precision": 0.800061,
+ "precision_weighted": 0.800061,
+ "recall": 0.784091,
+ "recall_weighted": 0.784091,
+ "ap": null,
+ "ap_weighted": null
+ }
+ ],
+ "accuracy": 0.785552,
+ "f1": 0.777888,
+ "f1_weighted": 0.777888,
+ "precision": 0.803052,
+ "precision_weighted": 0.803052,
+ "recall": 0.785552,
+ "recall_weighted": 0.785552,
+ "ap": NaN,
+ "ap_weighted": NaN,
+ "main_score": 0.785552,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 103.63245558738708,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/BiorxivClusteringP2P.json ADDED
@@ -0,0 +1,33 @@
+ {
+ "dataset_revision": "65b79d1d13f80053f67aca9498d9402c2d9f1f40",
+ "task_name": "BiorxivClusteringP2P",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "v_measure": 0.341127,
+ "v_measure_std": 0.008937,
+ "v_measures": [
+ 0.330978,
+ 0.347386,
+ 0.354358,
+ 0.331702,
+ 0.325195,
+ 0.343906,
+ 0.336406,
+ 0.346198,
+ 0.347687,
+ 0.347459
+ ],
+ "main_score": 0.341127,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 475.0981388092041,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/BiorxivClusteringS2S.json ADDED
@@ -0,0 +1,33 @@
+ {
+ "dataset_revision": "258694dd0231531bc1fd9de6ceb52a0853c6d908",
+ "task_name": "BiorxivClusteringS2S",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "v_measure": 0.263446,
+ "v_measure_std": 0.008938,
+ "v_measures": [
+ 0.257214,
+ 0.26531,
+ 0.271913,
+ 0.252976,
+ 0.247702,
+ 0.268314,
+ 0.257379,
+ 0.263935,
+ 0.273099,
+ 0.276619
+ ],
+ "main_score": 0.263446,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 458.2719898223877,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/CQADupstackAndroidRetrieval.json ADDED
@@ -0,0 +1,167 @@
+ {
+ "dataset_revision": "9be4c0e46342e8e3aff577a89b9a1ec9bc6b4af3",
+ "task_name": "CQADupstackAndroidRetrieval",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "ndcg_at_1": 0.26896,
+ "ndcg_at_3": 0.32766,
+ "ndcg_at_5": 0.34776,
+ "ndcg_at_10": 0.37282,
+ "ndcg_at_20": 0.39748,
+ "ndcg_at_100": 0.43489,
+ "ndcg_at_1000": 0.46244,
+ "map_at_1": 0.22876,
+ "map_at_3": 0.29197,
+ "map_at_5": 0.30657,
+ "map_at_10": 0.31967,
+ "map_at_20": 0.32754,
+ "map_at_100": 0.33381,
+ "map_at_1000": 0.33526,
+ "recall_at_1": 0.22876,
+ "recall_at_3": 0.35602,
+ "recall_at_5": 0.40946,
+ "recall_at_10": 0.48477,
+ "recall_at_20": 0.57292,
+ "recall_at_100": 0.74912,
+ "recall_at_1000": 0.92937,
+ "accuracy": 0.22876,
+ "precision_at_1": 0.26896,
+ "precision_at_3": 0.15737,
+ "precision_at_5": 0.1133,
+ "precision_at_10": 0.0701,
+ "precision_at_20": 0.04356,
+ "precision_at_100": 0.01246,
+ "precision_at_1000": 0.00178,
+ "mrr_at_1": 0.268956,
+ "mrr_at_3": 0.33834,
+ "mrr_at_5": 0.352647,
+ "mrr_at_10": 0.362799,
+ "mrr_at_20": 0.369088,
+ "mrr_at_100": 0.373097,
+ "mrr_at_1000": 0.373713,
+ "nauc_ndcg_at_1_max": 0.21934,
+ "nauc_ndcg_at_1_std": -0.037296,
+ "nauc_ndcg_at_1_diff1": 0.437247,
+ "nauc_ndcg_at_3_max": 0.223749,
+ "nauc_ndcg_at_3_std": -0.009159,
+ "nauc_ndcg_at_3_diff1": 0.392135,
+ "nauc_ndcg_at_5_max": 0.221742,
+ "nauc_ndcg_at_5_std": -0.002898,
+ "nauc_ndcg_at_5_diff1": 0.391682,
+ "nauc_ndcg_at_10_max": 0.22704,
+ "nauc_ndcg_at_10_std": 0.004029,
+ "nauc_ndcg_at_10_diff1": 0.39805,
+ "nauc_ndcg_at_20_max": 0.22455,
+ "nauc_ndcg_at_20_std": 0.007623,
+ "nauc_ndcg_at_20_diff1": 0.399887,
+ "nauc_ndcg_at_100_max": 0.241274,
+ "nauc_ndcg_at_100_std": 0.028305,
+ "nauc_ndcg_at_100_diff1": 0.392149,
+ "nauc_ndcg_at_1000_max": 0.243305,
+ "nauc_ndcg_at_1000_std": 0.026057,
+ "nauc_ndcg_at_1000_diff1": 0.399486,
+ "nauc_map_at_1_max": 0.194457,
+ "nauc_map_at_1_std": -0.029163,
+ "nauc_map_at_1_diff1": 0.464586,
+ "nauc_map_at_3_max": 0.219104,
+ "nauc_map_at_3_std": -0.014757,
+ "nauc_map_at_3_diff1": 0.41706,
+ "nauc_map_at_5_max": 0.217197,
+ "nauc_map_at_5_std": -0.010729,
+ "nauc_map_at_5_diff1": 0.414383,
+ "nauc_map_at_10_max": 0.221064,
+ "nauc_map_at_10_std": -0.00629,
+ "nauc_map_at_10_diff1": 0.413215,
+ "nauc_map_at_20_max": 0.222274,
+ "nauc_map_at_20_std": -0.003858,
+ "nauc_map_at_20_diff1": 0.412926,
+ "nauc_map_at_100_max": 0.225751,
+ "nauc_map_at_100_std": -0.000506,
+ "nauc_map_at_100_diff1": 0.411824,
+ "nauc_map_at_1000_max": 0.225916,
+ "nauc_map_at_1000_std": -0.000343,
+ "nauc_map_at_1000_diff1": 0.412476,
+ "nauc_recall_at_1_max": 0.194457,
+ "nauc_recall_at_1_std": -0.029163,
+ "nauc_recall_at_1_diff1": 0.464586,
+ "nauc_recall_at_3_max": 0.221923,
+ "nauc_recall_at_3_std": 0.006092,
+ "nauc_recall_at_3_diff1": 0.359814,
+ "nauc_recall_at_5_max": 0.209515,
+ "nauc_recall_at_5_std": 0.016082,
+ "nauc_recall_at_5_diff1": 0.351145,
+ "nauc_recall_at_10_max": 0.220189,
+ "nauc_recall_at_10_std": 0.039772,
+ "nauc_recall_at_10_diff1": 0.366773,
+ "nauc_recall_at_20_max": 0.199075,
+ "nauc_recall_at_20_std": 0.046774,
+ "nauc_recall_at_20_diff1": 0.35724,
+ "nauc_recall_at_100_max": 0.286606,
+ "nauc_recall_at_100_std": 0.207008,
+ "nauc_recall_at_100_diff1": 0.283799,
+ "nauc_recall_at_1000_max": 0.466474,
+ "nauc_recall_at_1000_std": 0.514592,
+ "nauc_recall_at_1000_diff1": 0.275613,
+ "nauc_precision_at_1_max": 0.21934,
+ "nauc_precision_at_1_std": -0.037296,
+ "nauc_precision_at_1_diff1": 0.437247,
+ "nauc_precision_at_3_max": 0.248541,
+ "nauc_precision_at_3_std": 0.002739,
+ "nauc_precision_at_3_diff1": 0.285546,
+ "nauc_precision_at_5_max": 0.237357,
+ "nauc_precision_at_5_std": 0.029241,
+ "nauc_precision_at_5_diff1": 0.249387,
+ "nauc_precision_at_10_max": 0.226386,
+ "nauc_precision_at_10_std": 0.045204,
+ "nauc_precision_at_10_diff1": 0.201322,
+ "nauc_precision_at_20_max": 0.201011,
+ "nauc_precision_at_20_std": 0.072134,
+ "nauc_precision_at_20_diff1": 0.17055,
+ "nauc_precision_at_100_max": 0.163244,
+ "nauc_precision_at_100_std": 0.10021,
+ "nauc_precision_at_100_diff1": 0.029483,
+ "nauc_precision_at_1000_max": 0.031723,
+ "nauc_precision_at_1000_std": 0.002467,
+ "nauc_precision_at_1000_diff1": -0.073872,
+ "nauc_mrr_at_1_max": 0.21934,
+ "nauc_mrr_at_1_std": -0.037296,
+ "nauc_mrr_at_1_diff1": 0.437247,
+ "nauc_mrr_at_3_max": 0.223847,
+ "nauc_mrr_at_3_std": -0.019766,
+ "nauc_mrr_at_3_diff1": 0.386231,
+ "nauc_mrr_at_5_max": 0.223417,
+ "nauc_mrr_at_5_std": -0.017655,
+ "nauc_mrr_at_5_diff1": 0.386434,
+ "nauc_mrr_at_10_max": 0.226307,
+ "nauc_mrr_at_10_std": -0.01445,
+ "nauc_mrr_at_10_diff1": 0.391388,
+ "nauc_mrr_at_20_max": 0.224602,
+ "nauc_mrr_at_20_std": -0.015176,
+ "nauc_mrr_at_20_diff1": 0.392945,
+ "nauc_mrr_at_100_max": 0.226187,
+ "nauc_mrr_at_100_std": -0.012802,
+ "nauc_mrr_at_100_diff1": 0.391643,
+ "nauc_mrr_at_1000_max": 0.226344,
+ "nauc_mrr_at_1000_std": -0.012829,
+ "nauc_mrr_at_1000_diff1": 0.391642,
+ "hit_rate_at_1": 0.26896,
+ "hit_rate_at_3": 0.42775,
+ "hit_rate_at_5": 0.4907,
+ "hit_rate_at_10": 0.56652,
+ "hit_rate_at_20": 0.65808,
+ "hit_rate_at_100": 0.81688,
+ "hit_rate_at_1000": 0.95708,
+ "main_score": 0.37282,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 298.76985144615173,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/CQADupstackEnglishRetrieval.json ADDED
@@ -0,0 +1,167 @@
+ {
+ "dataset_revision": "ad9991cb51e31e31e430383c75ffb2885547b5f0",
+ "task_name": "CQADupstackEnglishRetrieval",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "ndcg_at_1": 0.2707,
+ "ndcg_at_3": 0.30808,
+ "ndcg_at_5": 0.32637,
+ "ndcg_at_10": 0.3491,
+ "ndcg_at_20": 0.3676,
+ "ndcg_at_100": 0.3962,
+ "ndcg_at_1000": 0.42306,
+ "map_at_1": 0.22119,
+ "map_at_3": 0.27326,
+ "map_at_5": 0.28708,
+ "map_at_10": 0.2989,
+ "map_at_20": 0.30552,
+ "map_at_100": 0.31094,
+ "map_at_1000": 0.3124,
+ "recall_at_1": 0.22119,
+ "recall_at_3": 0.3233,
+ "recall_at_5": 0.37392,
+ "recall_at_10": 0.44085,
+ "recall_at_20": 0.50817,
+ "recall_at_100": 0.64225,
+ "recall_at_1000": 0.81938,
+ "accuracy": 0.22119,
+ "precision_at_1": 0.2707,
+ "precision_at_3": 0.14989,
+ "precision_at_5": 0.10764,
+ "precision_at_10": 0.0686,
+ "precision_at_20": 0.04175,
+ "precision_at_100": 0.01196,
+ "precision_at_1000": 0.00174,
+ "mrr_at_1": 0.270701,
+ "mrr_at_3": 0.326327,
+ "mrr_at_5": 0.339098,
+ "mrr_at_10": 0.348762,
+ "mrr_at_20": 0.353325,
+ "mrr_at_100": 0.356546,
+ "mrr_at_1000": 0.357173,
+ "nauc_ndcg_at_1_max": 0.278391,
+ "nauc_ndcg_at_1_std": -0.023432,
+ "nauc_ndcg_at_1_diff1": 0.497471,
+ "nauc_ndcg_at_3_max": 0.263421,
+ "nauc_ndcg_at_3_std": -0.035728,
+ "nauc_ndcg_at_3_diff1": 0.447855,
+ "nauc_ndcg_at_5_max": 0.261476,
+ "nauc_ndcg_at_5_std": -0.023415,
+ "nauc_ndcg_at_5_diff1": 0.437137,
+ "nauc_ndcg_at_10_max": 0.257132,
+ "nauc_ndcg_at_10_std": -0.013266,
+ "nauc_ndcg_at_10_diff1": 0.42762,
+ "nauc_ndcg_at_20_max": 0.263594,
+ "nauc_ndcg_at_20_std": 0.003264,
+ "nauc_ndcg_at_20_diff1": 0.418788,
+ "nauc_ndcg_at_100_max": 0.268999,
+ "nauc_ndcg_at_100_std": 0.017193,
+ "nauc_ndcg_at_100_diff1": 0.415324,
+ "nauc_ndcg_at_1000_max": 0.273253,
+ "nauc_ndcg_at_1000_std": 0.027357,
+ "nauc_ndcg_at_1000_diff1": 0.416564,
+ "nauc_map_at_1_max": 0.222147,
+ "nauc_map_at_1_std": -0.074908,
+ "nauc_map_at_1_diff1": 0.518671,
+ "nauc_map_at_3_max": 0.241117,
+ "nauc_map_at_3_std": -0.063584,
+ "nauc_map_at_3_diff1": 0.476904,
+ "nauc_map_at_5_max": 0.245778,
+ "nauc_map_at_5_std": -0.050759,
+ "nauc_map_at_5_diff1": 0.467307,
+ "nauc_map_at_10_max": 0.248244,
+ "nauc_map_at_10_std": -0.042424,
+ "nauc_map_at_10_diff1": 0.460769,
+ "nauc_map_at_20_max": 0.253165,
+ "nauc_map_at_20_std": -0.034215,
+ "nauc_map_at_20_diff1": 0.456869,
+ "nauc_map_at_100_max": 0.256515,
+ "nauc_map_at_100_std": -0.028885,
+ "nauc_map_at_100_diff1": 0.45596,
+ "nauc_map_at_1000_max": 0.257429,
+ "nauc_map_at_1000_std": -0.027405,
+ "nauc_map_at_1000_diff1": 0.455742,
+ "nauc_recall_at_1_max": 0.222147,
+ "nauc_recall_at_1_std": -0.074908,
+ "nauc_recall_at_1_diff1": 0.518671,
+ "nauc_recall_at_3_max": 0.230059,
+ "nauc_recall_at_3_std": -0.061923,
+ "nauc_recall_at_3_diff1": 0.422938,
+ "nauc_recall_at_5_max": 0.228254,
+ "nauc_recall_at_5_std": -0.027754,
+ "nauc_recall_at_5_diff1": 0.382135,
+ "nauc_recall_at_10_max": 0.214181,
+ "nauc_recall_at_10_std": -0.003202,
+ "nauc_recall_at_10_diff1": 0.345884,
+ "nauc_recall_at_20_max": 0.232225,
+ "nauc_recall_at_20_std": 0.05161,
+ "nauc_recall_at_20_diff1": 0.307041,
+ "nauc_recall_at_100_max": 0.24417,
+ "nauc_recall_at_100_std": 0.121121,
+ "nauc_recall_at_100_diff1": 0.267987,
+ "nauc_recall_at_1000_max": 0.261539,
+ "nauc_recall_at_1000_std": 0.266321,
+ "nauc_recall_at_1000_diff1": 0.236047,
+ "nauc_precision_at_1_max": 0.278391,
+ "nauc_precision_at_1_std": -0.023432,
+ "nauc_precision_at_1_diff1": 0.497471,
+ "nauc_precision_at_3_max": 0.310097,
+ "nauc_precision_at_3_std": 0.039325,
+ "nauc_precision_at_3_diff1": 0.328002,
+ "nauc_precision_at_5_max": 0.317761,
+ "nauc_precision_at_5_std": 0.10569,
+ "nauc_precision_at_5_diff1": 0.26306,
+ "nauc_precision_at_10_max": 0.307016,
+ "nauc_precision_at_10_std": 0.18193,
+ "nauc_precision_at_10_diff1": 0.164312,
+ "nauc_precision_at_20_max": 0.317536,
+ "nauc_precision_at_20_std": 0.262772,
+ "nauc_precision_at_20_diff1": 0.086915,
+ "nauc_precision_at_100_max": 0.30113,
+ "nauc_precision_at_100_std": 0.356688,
+ "nauc_precision_at_100_diff1": -0.017214,
+ "nauc_precision_at_1000_max": 0.220063,
+ "nauc_precision_at_1000_std": 0.347846,
+ "nauc_precision_at_1000_diff1": -0.095717,
+ "nauc_mrr_at_1_max": 0.278391,
+ "nauc_mrr_at_1_std": -0.023432,
+ "nauc_mrr_at_1_diff1": 0.497471,
+ "nauc_mrr_at_3_max": 0.274012,
+ "nauc_mrr_at_3_std": -0.018136,
+ "nauc_mrr_at_3_diff1": 0.446471,
+ "nauc_mrr_at_5_max": 0.275599,
+ "nauc_mrr_at_5_std": -0.007801,
+ "nauc_mrr_at_5_diff1": 0.438547,
+ "nauc_mrr_at_10_max": 0.274549,
+ "nauc_mrr_at_10_std": -0.003157,
+ "nauc_mrr_at_10_diff1": 0.435689,
+ "nauc_mrr_at_20_max": 0.275377,
+ "nauc_mrr_at_20_std": 0.000233,
+ "nauc_mrr_at_20_diff1": 0.434157,
+ "nauc_mrr_at_100_max": 0.275219,
+ "nauc_mrr_at_100_std": 0.000153,
+ "nauc_mrr_at_100_diff1": 0.434122,
+ "nauc_mrr_at_1000_max": 0.275293,
+ "nauc_mrr_at_1000_std": 0.000173,
+ "nauc_mrr_at_1000_diff1": 0.43439,
+ "hit_rate_at_1": 0.2707,
+ "hit_rate_at_3": 0.39873,
+ "hit_rate_at_5": 0.45478,
+ "hit_rate_at_10": 0.52611,
+ "hit_rate_at_20": 0.59236,
+ "hit_rate_at_100": 0.7172,
+ "hit_rate_at_1000": 0.86815,
+ "main_score": 0.3491,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 506.6996204853058,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/CQADupstackGamingRetrieval.json ADDED
@@ -0,0 +1,167 @@
+ {
+ "dataset_revision": "4885aa143210c98657558c04aaf3dc47cfb54340",
+ "task_name": "CQADupstackGamingRetrieval",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "ndcg_at_1": 0.32853,
+ "ndcg_at_3": 0.39425,
+ "ndcg_at_5": 0.41798,
+ "ndcg_at_10": 0.44909,
+ "ndcg_at_20": 0.47186,
+ "ndcg_at_100": 0.49888,
+ "ndcg_at_1000": 0.51606,
+ "map_at_1": 0.2882,
+ "map_at_3": 0.36297,
+ "map_at_5": 0.37802,
+ "map_at_10": 0.39255,
+ "map_at_20": 0.39985,
+ "map_at_100": 0.40434,
+ "map_at_1000": 0.40513,
+ "recall_at_1": 0.2882,
+ "recall_at_3": 0.43793,
+ "recall_at_5": 0.49662,
+ "recall_at_10": 0.58729,
+ "recall_at_20": 0.67151,
+ "recall_at_100": 0.80384,
+ "recall_at_1000": 0.92908,
+ "accuracy": 0.2882,
+ "precision_at_1": 0.32853,
+ "precision_at_3": 0.1768,
+ "precision_at_5": 0.12288,
+ "precision_at_10": 0.07511,
+ "precision_at_20": 0.04404,
+ "precision_at_100": 0.01097,
+ "precision_at_1000": 0.0013,
+ "mrr_at_1": 0.328527,
+ "mrr_at_3": 0.397074,
+ "mrr_at_5": 0.411494,
+ "mrr_at_10": 0.423731,
+ "mrr_at_20": 0.429557,
+ "mrr_at_100": 0.43252,
+ "mrr_at_1000": 0.432966,
+ "nauc_ndcg_at_1_max": 0.305617,
+ "nauc_ndcg_at_1_std": -0.071554,
+ "nauc_ndcg_at_1_diff1": 0.49715,
+ "nauc_ndcg_at_3_max": 0.301733,
+ "nauc_ndcg_at_3_std": -0.0479,
+ "nauc_ndcg_at_3_diff1": 0.437762,
+ "nauc_ndcg_at_5_max": 0.302835,
+ "nauc_ndcg_at_5_std": -0.038215,
+ "nauc_ndcg_at_5_diff1": 0.430194,
+ "nauc_ndcg_at_10_max": 0.305094,
+ "nauc_ndcg_at_10_std": -0.031369,
+ "nauc_ndcg_at_10_diff1": 0.433188,
+ "nauc_ndcg_at_20_max": 0.314716,
+ "nauc_ndcg_at_20_std": -0.026657,
+ "nauc_ndcg_at_20_diff1": 0.431479,
+ "nauc_ndcg_at_100_max": 0.322878,
+ "nauc_ndcg_at_100_std": -0.017131,
+ "nauc_ndcg_at_100_diff1": 0.435832,
+ "nauc_ndcg_at_1000_max": 0.324541,
+ "nauc_ndcg_at_1000_std": -0.014379,
+ "nauc_ndcg_at_1000_diff1": 0.440453,
+ "nauc_map_at_1_max": 0.267479,
+ "nauc_map_at_1_std": -0.091279,
+ "nauc_map_at_1_diff1": 0.491916,
+ "nauc_map_at_3_max": 0.288167,
+ "nauc_map_at_3_std": -0.067217,
+ "nauc_map_at_3_diff1": 0.453735,
+ "nauc_map_at_5_max": 0.289668,
+ "nauc_map_at_5_std": -0.05955,
+ "nauc_map_at_5_diff1": 0.44787,
+ "nauc_map_at_10_max": 0.292675,
+ "nauc_map_at_10_std": -0.055183,
+ "nauc_map_at_10_diff1": 0.449042,
+ "nauc_map_at_20_max": 0.297732,
+ "nauc_map_at_20_std": -0.051455,
+ "nauc_map_at_20_diff1": 0.447447,
+ "nauc_map_at_100_max": 0.299631,
+ "nauc_map_at_100_std": -0.049237,
+ "nauc_map_at_100_diff1": 0.447939,
+ "nauc_map_at_1000_max": 0.300132,
+ "nauc_map_at_1000_std": -0.04874,
+ "nauc_map_at_1000_diff1": 0.448188,
+ "nauc_recall_at_1_max": 0.267479,
+ "nauc_recall_at_1_std": -0.091279,
+ "nauc_recall_at_1_diff1": 0.491916,
+ "nauc_recall_at_3_max": 0.284466,
+ "nauc_recall_at_3_std": -0.037115,
+ "nauc_recall_at_3_diff1": 0.395438,
+ "nauc_recall_at_5_max": 0.286261,
+ "nauc_recall_at_5_std": -0.011679,
+ "nauc_recall_at_5_diff1": 0.372866,
+ "nauc_recall_at_10_max": 0.28412,
+ "nauc_recall_at_10_std": 0.010226,
+ "nauc_recall_at_10_diff1": 0.371927,
+ "nauc_recall_at_20_max": 0.32005,
+ "nauc_recall_at_20_std": 0.036578,
+ "nauc_recall_at_20_diff1": 0.355122,
+ "nauc_recall_at_100_max": 0.379673,
+ "nauc_recall_at_100_std": 0.128787,
+ "nauc_recall_at_100_diff1": 0.356563,
+ "nauc_recall_at_1000_max": 0.444678,
+ "nauc_recall_at_1000_std": 0.34169,
+ "nauc_recall_at_1000_diff1": 0.39469,
+ "nauc_precision_at_1_max": 0.305617,
+ "nauc_precision_at_1_std": -0.071554,
+ "nauc_precision_at_1_diff1": 0.49715,
+ "nauc_precision_at_3_max": 0.336032,
+ "nauc_precision_at_3_std": 0.014341,
+ "nauc_precision_at_3_diff1": 0.346681,
+ "nauc_precision_at_5_max": 0.335561,
+ "nauc_precision_at_5_std": 0.064425,
+ "nauc_precision_at_5_diff1": 0.293209,
+ "nauc_precision_at_10_max": 0.335572,
+ "nauc_precision_at_10_std": 0.118777,
+ "nauc_precision_at_10_diff1": 0.23389,
+ "nauc_precision_at_20_max": 0.355328,
+ "nauc_precision_at_20_std": 0.180976,
+ "nauc_precision_at_20_diff1": 0.155829,
+ "nauc_precision_at_100_max": 0.363407,
+ "nauc_precision_at_100_std": 0.269872,
+ "nauc_precision_at_100_diff1": 0.086922,
+ "nauc_precision_at_1000_max": 0.326957,
+ "nauc_precision_at_1000_std": 0.312661,
+ "nauc_precision_at_1000_diff1": 0.021796,
+ "nauc_mrr_at_1_max": 0.305617,
+ "nauc_mrr_at_1_std": -0.071554,
+ "nauc_mrr_at_1_diff1": 0.49715,
+ "nauc_mrr_at_3_max": 0.318173,
+ "nauc_mrr_at_3_std": -0.045833,
+ "nauc_mrr_at_3_diff1": 0.450074,
+ "nauc_mrr_at_5_max": 0.319705,
+ "nauc_mrr_at_5_std": -0.04069,
+ "nauc_mrr_at_5_diff1": 0.448169,
+ "nauc_mrr_at_10_max": 0.319536,
+ "nauc_mrr_at_10_std": -0.03804,
+ "nauc_mrr_at_10_diff1": 0.449888,
+ "nauc_mrr_at_20_max": 0.320657,
+ "nauc_mrr_at_20_std": -0.038708,
+ "nauc_mrr_at_20_diff1": 0.450192,
+ "nauc_mrr_at_100_max": 0.320716,
+ "nauc_mrr_at_100_std": -0.038168,
+ "nauc_mrr_at_100_diff1": 0.450446,
+ "nauc_mrr_at_1000_max": 0.32065,
+ "nauc_mrr_at_1000_std": -0.038121,
+ "nauc_mrr_at_1000_diff1": 0.450542,
+ "hit_rate_at_1": 0.32853,
+ "hit_rate_at_3": 0.48276,
+ "hit_rate_at_5": 0.54671,
+ "hit_rate_at_10": 0.63887,
+ "hit_rate_at_20": 0.72163,
+ "hit_rate_at_100": 0.84075,
+ "hit_rate_at_1000": 0.94608,
+ "main_score": 0.44909,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 496.17574644088745,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/CQADupstackGisRetrieval.json ADDED
@@ -0,0 +1,167 @@
+ {
+ "dataset_revision": "5003b3064772da1887988e05400cf3806fe491f2",
+ "task_name": "CQADupstackGisRetrieval",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "ndcg_at_1": 0.20565,
+ "ndcg_at_3": 0.25069,
+ "ndcg_at_5": 0.27756,
+ "ndcg_at_10": 0.2965,
+ "ndcg_at_20": 0.31206,
+ "ndcg_at_100": 0.34952,
+ "ndcg_at_1000": 0.3764,
+ "map_at_1": 0.19004,
+ "map_at_3": 0.23229,
+ "map_at_5": 0.24757,
+ "map_at_10": 0.25568,
+ "map_at_20": 0.26038,
+ "map_at_100": 0.26562,
+ "map_at_1000": 0.26664,
+ "recall_at_1": 0.19004,
+ "recall_at_3": 0.28447,
+ "recall_at_5": 0.35038,
+ "recall_at_10": 0.40579,
+ "recall_at_20": 0.46277,
+ "recall_at_100": 0.65846,
+ "recall_at_1000": 0.86218,
+ "accuracy": 0.19004,
+ "precision_at_1": 0.20565,
+ "precision_at_3": 0.10508,
+ "precision_at_5": 0.07819,
+ "precision_at_10": 0.04621,
+ "precision_at_20": 0.02695,
+ "precision_at_100": 0.00774,
+ "precision_at_1000": 0.00105,
+ "mrr_at_1": 0.20565,
+ "mrr_at_3": 0.249718,
+ "mrr_at_5": 0.265085,
+ "mrr_at_10": 0.273603,
+ "mrr_at_20": 0.277552,
+ "mrr_at_100": 0.282518,
+ "mrr_at_1000": 0.283353,
+ "nauc_ndcg_at_1_max": 0.205467,
+ "nauc_ndcg_at_1_std": -0.060025,
+ "nauc_ndcg_at_1_diff1": 0.418451,
+ "nauc_ndcg_at_3_max": 0.203368,
+ "nauc_ndcg_at_3_std": -0.067418,
+ "nauc_ndcg_at_3_diff1": 0.366786,
+ "nauc_ndcg_at_5_max": 0.197986,
+ "nauc_ndcg_at_5_std": -0.062017,
+ "nauc_ndcg_at_5_diff1": 0.344412,
+ "nauc_ndcg_at_10_max": 0.204828,
+ "nauc_ndcg_at_10_std": -0.04495,
+ "nauc_ndcg_at_10_diff1": 0.328104,
+ "nauc_ndcg_at_20_max": 0.206858,
+ "nauc_ndcg_at_20_std": -0.037766,
+ "nauc_ndcg_at_20_diff1": 0.332071,
+ "nauc_ndcg_at_100_max": 0.211259,
+ "nauc_ndcg_at_100_std": -0.017416,
+ "nauc_ndcg_at_100_diff1": 0.330108,
+ "nauc_ndcg_at_1000_max": 0.214965,
+ "nauc_ndcg_at_1000_std": -0.018841,
+ "nauc_ndcg_at_1000_diff1": 0.330601,
+ "nauc_map_at_1_max": 0.201732,
+ "nauc_map_at_1_std": -0.073813,
+ "nauc_map_at_1_diff1": 0.4373,
+ "nauc_map_at_3_max": 0.202604,
+ "nauc_map_at_3_std": -0.072065,
+ "nauc_map_at_3_diff1": 0.385648,
+ "nauc_map_at_5_max": 0.200841,
+ "nauc_map_at_5_std": -0.067971,
+ "nauc_map_at_5_diff1": 0.372142,
+ "nauc_map_at_10_max": 0.204125,
+ "nauc_map_at_10_std": -0.060041,
+ "nauc_map_at_10_diff1": 0.365397,
+ "nauc_map_at_20_max": 0.205076,
+ "nauc_map_at_20_std": -0.05836,
+ "nauc_map_at_20_diff1": 0.367157,
+ "nauc_map_at_100_max": 0.205905,
+ "nauc_map_at_100_std": -0.05471,
+ "nauc_map_at_100_diff1": 0.366412,
+ "nauc_map_at_1000_max": 0.206137,
+ "nauc_map_at_1000_std": -0.054563,
+ "nauc_map_at_1000_diff1": 0.366341,
+ "nauc_recall_at_1_max": 0.201732,
+ "nauc_recall_at_1_std": -0.073813,
+ "nauc_recall_at_1_diff1": 0.4373,
+ "nauc_recall_at_3_max": 0.196646,
+ "nauc_recall_at_3_std": -0.062757,
+ "nauc_recall_at_3_diff1": 0.327816,
+ "nauc_recall_at_5_max": 0.183653,
+ "nauc_recall_at_5_std": -0.05594,
+ "nauc_recall_at_5_diff1": 0.276552,
+ "nauc_recall_at_10_max": 0.197735,
+ "nauc_recall_at_10_std": -0.01277,
+ "nauc_recall_at_10_diff1": 0.231384,
+ "nauc_recall_at_20_max": 0.200918,
+ "nauc_recall_at_20_std": 0.012395,
+ "nauc_recall_at_20_diff1": 0.24116,
+ "nauc_recall_at_100_max": 0.21523,
+ "nauc_recall_at_100_std": 0.125155,
+ "nauc_recall_at_100_diff1": 0.212078,
+ "nauc_recall_at_1000_max": 0.282007,
+ "nauc_recall_at_1000_std": 0.243346,
+ "nauc_recall_at_1000_diff1": 0.126783,
+ "nauc_precision_at_1_max": 0.205467,
+ "nauc_precision_at_1_std": -0.060025,
+ "nauc_precision_at_1_diff1": 0.418451,
+ "nauc_precision_at_3_max": 0.210966,
+ "nauc_precision_at_3_std": -0.051242,
+ "nauc_precision_at_3_diff1": 0.306892,
+ "nauc_precision_at_5_max": 0.194482,
+ "nauc_precision_at_5_std": -0.02911,
+ "nauc_precision_at_5_diff1": 0.249566,
+ "nauc_precision_at_10_max": 0.216654,
+ "nauc_precision_at_10_std": 0.017501,
+ "nauc_precision_at_10_diff1": 0.200545,
+ "nauc_precision_at_20_max": 0.221337,
+ "nauc_precision_at_20_std": 0.038209,
+ "nauc_precision_at_20_diff1": 0.199848,
+ "nauc_precision_at_100_max": 0.2239,
+ "nauc_precision_at_100_std": 0.154223,
+ "nauc_precision_at_100_diff1": 0.129406,
+ "nauc_precision_at_1000_max": 0.189711,
+ "nauc_precision_at_1000_std": 0.167058,
+ "nauc_precision_at_1000_diff1": -0.013446,
+ "nauc_mrr_at_1_max": 0.205467,
+ "nauc_mrr_at_1_std": -0.060025,
+ "nauc_mrr_at_1_diff1": 0.418451,
+ "nauc_mrr_at_3_max": 0.204972,
+ "nauc_mrr_at_3_std": -0.057449,
+ "nauc_mrr_at_3_diff1": 0.36957,
+ "nauc_mrr_at_5_max": 0.201969,
+ "nauc_mrr_at_5_std": -0.054649,
+ "nauc_mrr_at_5_diff1": 0.356704,
+ "nauc_mrr_at_10_max": 0.205387,
+ "nauc_mrr_at_10_std": -0.048696,
+ "nauc_mrr_at_10_diff1": 0.349052,
+ "nauc_mrr_at_20_max": 0.205448,
+ "nauc_mrr_at_20_std": -0.045792,
+ "nauc_mrr_at_20_diff1": 0.349683,
+ "nauc_mrr_at_100_max": 0.206059,
+ "nauc_mrr_at_100_std": -0.042883,
+ "nauc_mrr_at_100_diff1": 0.349667,
+ "nauc_mrr_at_1000_max": 0.206081,
+ "nauc_mrr_at_1000_std": -0.043094,
+ "nauc_mrr_at_1000_diff1": 0.349704,
+ "hit_rate_at_1": 0.20565,
+ "hit_rate_at_3": 0.30734,
+ "hit_rate_at_5": 0.37627,
+ "hit_rate_at_10": 0.43842,
+ "hit_rate_at_20": 0.49379,
+ "hit_rate_at_100": 0.6904,
+ "hit_rate_at_1000": 0.88475,
+ "main_score": 0.2965,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 410.3554427623749,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/CQADupstackMathematicaRetrieval.json ADDED
@@ -0,0 +1,167 @@
+ {
+ "dataset_revision": "90fceea13679c63fe563ded68f3b6f06e50061de",
+ "task_name": "CQADupstackMathematicaRetrieval",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "ndcg_at_1": 0.16915,
+ "ndcg_at_3": 0.19522,
+ "ndcg_at_5": 0.21939,
+ "ndcg_at_10": 0.24905,
+ "ndcg_at_20": 0.26501,
+ "ndcg_at_100": 0.30033,
+ "ndcg_at_1000": 0.33451,
+ "map_at_1": 0.13545,
+ "map_at_3": 0.17225,
+ "map_at_5": 0.18711,
+ "map_at_10": 0.19999,
+ "map_at_20": 0.20463,
+ "map_at_100": 0.20997,
+ "map_at_1000": 0.21133,
+ "recall_at_1": 0.13545,
+ "recall_at_3": 0.2155,
+ "recall_at_5": 0.27544,
+ "recall_at_10": 0.36282,
+ "recall_at_20": 0.41967,
+ "recall_at_100": 0.59295,
+ "recall_at_1000": 0.83921,
+ "accuracy": 0.13545,
+ "precision_at_1": 0.16915,
+ "precision_at_3": 0.09121,
+ "precision_at_5": 0.07164,
+ "precision_at_10": 0.04876,
+ "precision_at_20": 0.02873,
+ "precision_at_100": 0.00852,
+ "precision_at_1000": 0.00129,
+ "mrr_at_1": 0.169154,
+ "mrr_at_3": 0.211443,
+ "mrr_at_5": 0.226803,
+ "mrr_at_10": 0.24021,
+ "mrr_at_20": 0.24467,
+ "mrr_at_100": 0.24893,
+ "mrr_at_1000": 0.249807,
+ "nauc_ndcg_at_1_max": 0.100895,
+ "nauc_ndcg_at_1_std": -0.020172,
+ "nauc_ndcg_at_1_diff1": 0.354507,
+ "nauc_ndcg_at_3_max": 0.111368,
+ "nauc_ndcg_at_3_std": -0.004085,
+ "nauc_ndcg_at_3_diff1": 0.300611,
+ "nauc_ndcg_at_5_max": 0.128843,
+ "nauc_ndcg_at_5_std": 0.010245,
+ "nauc_ndcg_at_5_diff1": 0.283146,
+ "nauc_ndcg_at_10_max": 0.111696,
+ "nauc_ndcg_at_10_std": 0.019335,
+ "nauc_ndcg_at_10_diff1": 0.260047,
+ "nauc_ndcg_at_20_max": 0.108871,
+ "nauc_ndcg_at_20_std": 0.023958,
+ "nauc_ndcg_at_20_diff1": 0.245936,
+ "nauc_ndcg_at_100_max": 0.114971,
+ "nauc_ndcg_at_100_std": 0.050727,
+ "nauc_ndcg_at_100_diff1": 0.243885,
+ "nauc_ndcg_at_1000_max": 0.121995,
+ "nauc_ndcg_at_1000_std": 0.048508,
+ "nauc_ndcg_at_1000_diff1": 0.253634,
+ "nauc_map_at_1_max": 0.123875,
+ "nauc_map_at_1_std": 0.017193,
+ "nauc_map_at_1_diff1": 0.346272,
+ "nauc_map_at_3_max": 0.114166,
+ "nauc_map_at_3_std": 0.008453,
+ "nauc_map_at_3_diff1": 0.31069,
+ "nauc_map_at_5_max": 0.122123,
+ "nauc_map_at_5_std": 0.012286,
+ "nauc_map_at_5_diff1": 0.300469,
+ "nauc_map_at_10_max": 0.115676,
+ "nauc_map_at_10_std": 0.01686,
+ "nauc_map_at_10_diff1": 0.289273,
+ "nauc_map_at_20_max": 0.11454,
+ "nauc_map_at_20_std": 0.018289,
+ "nauc_map_at_20_diff1": 0.284773,
+ "nauc_map_at_100_max": 0.115481,
+ "nauc_map_at_100_std": 0.022958,
+ "nauc_map_at_100_diff1": 0.284186,
+ "nauc_map_at_1000_max": 0.115656,
+ "nauc_map_at_1000_std": 0.022811,
+ "nauc_map_at_1000_diff1": 0.284445,
+ "nauc_recall_at_1_max": 0.123875,
+ "nauc_recall_at_1_std": 0.017193,
+ "nauc_recall_at_1_diff1": 0.346272,
+ "nauc_recall_at_3_max": 0.102178,
+ "nauc_recall_at_3_std": -0.001789,
+ "nauc_recall_at_3_diff1": 0.268311,
+ "nauc_recall_at_5_max": 0.134527,
+ "nauc_recall_at_5_std": 0.021453,
+ "nauc_recall_at_5_diff1": 0.228817,
+ "nauc_recall_at_10_max": 0.092912,
+ "nauc_recall_at_10_std": 0.038118,
+ "nauc_recall_at_10_diff1": 0.171339,
+ "nauc_recall_at_20_max": 0.082824,
+ "nauc_recall_at_20_std": 0.050012,
+ "nauc_recall_at_20_diff1": 0.128957,
+ "nauc_recall_at_100_max": 0.105922,
+ "nauc_recall_at_100_std": 0.175272,
+ "nauc_recall_at_100_diff1": 0.10447,
+ "nauc_recall_at_1000_max": 0.215997,
+ "nauc_recall_at_1000_std": 0.290842,
+ "nauc_recall_at_1000_diff1": 0.131729,
+ "nauc_precision_at_1_max": 0.100895,
+ "nauc_precision_at_1_std": -0.020172,
+ "nauc_precision_at_1_diff1": 0.354507,
+ "nauc_precision_at_3_max": 0.103967,
+ "nauc_precision_at_3_std": -0.02477,
+ "nauc_precision_at_3_diff1": 0.269381,
+ "nauc_precision_at_5_max": 0.141522,
+ "nauc_precision_at_5_std": -0.001901,
+ "nauc_precision_at_5_diff1": 0.224426,
+ "nauc_precision_at_10_max": 0.088026,
+ "nauc_precision_at_10_std": 0.008489,
+ "nauc_precision_at_10_diff1": 0.145186,
+ "nauc_precision_at_20_max": 0.071465,
+ "nauc_precision_at_20_std": 0.024018,
+ "nauc_precision_at_20_diff1": 0.085269,
+ "nauc_precision_at_100_max": 0.065434,
+ "nauc_precision_at_100_std": 0.091729,
+ "nauc_precision_at_100_diff1": 0.008164,
+ "nauc_precision_at_1000_max": 0.027449,
+ "nauc_precision_at_1000_std": 0.022966,
+ "nauc_precision_at_1000_diff1": -0.059975,
+ "nauc_mrr_at_1_max": 0.100895,
+ "nauc_mrr_at_1_std": -0.020172,
+ "nauc_mrr_at_1_diff1": 0.354507,
+ "nauc_mrr_at_3_max": 0.112467,
+ "nauc_mrr_at_3_std": -0.012036,
+ "nauc_mrr_at_3_diff1": 0.313472,
+ "nauc_mrr_at_5_max": 0.121921,
+ "nauc_mrr_at_5_std": -0.007063,
+ "nauc_mrr_at_5_diff1": 0.305985,
+ "nauc_mrr_at_10_max": 0.113586,
+ "nauc_mrr_at_10_std": -0.005322,
+ "nauc_mrr_at_10_diff1": 0.297007,
+ "nauc_mrr_at_20_max": 0.113501,
+ "nauc_mrr_at_20_std": -0.004042,
+ "nauc_mrr_at_20_diff1": 0.293217,
+ "nauc_mrr_at_100_max": 0.113325,
+ "nauc_mrr_at_100_std": -0.001549,
+ "nauc_mrr_at_100_diff1": 0.293701,
+ "nauc_mrr_at_1000_max": 0.113302,
+ "nauc_mrr_at_1000_std": -0.001718,
+ "nauc_mrr_at_1000_diff1": 0.294076,
+ "hit_rate_at_1": 0.16915,
+ "hit_rate_at_3": 0.26493,
+ "hit_rate_at_5": 0.33333,
+ "hit_rate_at_10": 0.43657,
+ "hit_rate_at_20": 0.49876,
+ "hit_rate_at_100": 0.66915,
+ "hit_rate_at_1000": 0.88184,
+ "main_score": 0.24905,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 190.05853271484375,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/CQADupstackPhysicsRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "79531abbd1fb92d06c6d6315a0cbbbf5bb247ea4",
3
+ "task_name": "CQADupstackPhysicsRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.24158,
9
+ "ndcg_at_3": 0.29383,
10
+ "ndcg_at_5": 0.31385,
11
+ "ndcg_at_10": 0.34235,
12
+ "ndcg_at_20": 0.36829,
13
+ "ndcg_at_100": 0.40562,
14
+ "ndcg_at_1000": 0.43046,
15
+ "map_at_1": 0.20178,
16
+ "map_at_3": 0.26126,
17
+ "map_at_5": 0.2746,
18
+ "map_at_10": 0.28807,
19
+ "map_at_20": 0.29622,
20
+ "map_at_100": 0.30252,
21
+ "map_at_1000": 0.30371,
22
+ "recall_at_1": 0.20178,
23
+ "recall_at_3": 0.32738,
24
+ "recall_at_5": 0.37788,
25
+ "recall_at_10": 0.46084,
26
+ "recall_at_20": 0.55386,
27
+ "recall_at_100": 0.72744,
28
+ "recall_at_1000": 0.89712,
29
+ "accuracy": 0.20178,
30
+ "precision_at_1": 0.24158,
31
+ "precision_at_3": 0.14084,
32
+ "precision_at_5": 0.10067,
33
+ "precision_at_10": 0.06487,
34
+ "precision_at_20": 0.03994,
35
+ "precision_at_100": 0.01146,
36
+ "precision_at_1000": 0.00153,
37
+ "mrr_at_1": 0.241578,
38
+ "mrr_at_3": 0.305422,
39
+ "mrr_at_5": 0.318415,
40
+ "mrr_at_10": 0.330999,
41
+ "mrr_at_20": 0.337648,
42
+ "mrr_at_100": 0.341864,
43
+ "mrr_at_1000": 0.342421,
44
+ "nauc_ndcg_at_1_max": 0.22923,
45
+ "nauc_ndcg_at_1_std": -0.069274,
46
+ "nauc_ndcg_at_1_diff1": 0.514119,
47
+ "nauc_ndcg_at_3_max": 0.216371,
48
+ "nauc_ndcg_at_3_std": -0.047753,
49
+ "nauc_ndcg_at_3_diff1": 0.451638,
50
+ "nauc_ndcg_at_5_max": 0.221911,
51
+ "nauc_ndcg_at_5_std": -0.042828,
52
+ "nauc_ndcg_at_5_diff1": 0.440762,
53
+ "nauc_ndcg_at_10_max": 0.220246,
54
+ "nauc_ndcg_at_10_std": -0.033718,
55
+ "nauc_ndcg_at_10_diff1": 0.433089,
56
+ "nauc_ndcg_at_20_max": 0.208805,
57
+ "nauc_ndcg_at_20_std": -0.028319,
58
+ "nauc_ndcg_at_20_diff1": 0.428688,
59
+ "nauc_ndcg_at_100_max": 0.22807,
60
+ "nauc_ndcg_at_100_std": -0.007863,
61
+ "nauc_ndcg_at_100_diff1": 0.430369,
62
+ "nauc_ndcg_at_1000_max": 0.231993,
63
+ "nauc_ndcg_at_1000_std": -0.005655,
64
+ "nauc_ndcg_at_1000_diff1": 0.432466,
65
+ "nauc_map_at_1_max": 0.199767,
66
+ "nauc_map_at_1_std": -0.092946,
67
+ "nauc_map_at_1_diff1": 0.548382,
68
+ "nauc_map_at_3_max": 0.206223,
69
+ "nauc_map_at_3_std": -0.067461,
70
+ "nauc_map_at_3_diff1": 0.480606,
71
+ "nauc_map_at_5_max": 0.21469,
72
+ "nauc_map_at_5_std": -0.060353,
73
+ "nauc_map_at_5_diff1": 0.469854,
74
+ "nauc_map_at_10_max": 0.216693,
75
+ "nauc_map_at_10_std": -0.052822,
76
+ "nauc_map_at_10_diff1": 0.465072,
77
+ "nauc_map_at_20_max": 0.214553,
78
+ "nauc_map_at_20_std": -0.048859,
79
+ "nauc_map_at_20_diff1": 0.463081,
80
+ "nauc_map_at_100_max": 0.218223,
81
+ "nauc_map_at_100_std": -0.04477,
82
+ "nauc_map_at_100_diff1": 0.46296,
83
+ "nauc_map_at_1000_max": 0.218744,
84
+ "nauc_map_at_1000_std": -0.04427,
85
+ "nauc_map_at_1000_diff1": 0.46295,
86
+ "nauc_recall_at_1_max": 0.199767,
87
+ "nauc_recall_at_1_std": -0.092946,
88
+ "nauc_recall_at_1_diff1": 0.548382,
89
+ "nauc_recall_at_3_max": 0.186755,
90
+ "nauc_recall_at_3_std": -0.051939,
91
+ "nauc_recall_at_3_diff1": 0.415502,
92
+ "nauc_recall_at_5_max": 0.199772,
93
+ "nauc_recall_at_5_std": -0.03885,
94
+ "nauc_recall_at_5_diff1": 0.378626,
95
+ "nauc_recall_at_10_max": 0.194928,
96
+ "nauc_recall_at_10_std": -0.009361,
97
+ "nauc_recall_at_10_diff1": 0.342736,
98
+ "nauc_recall_at_20_max": 0.139555,
99
+ "nauc_recall_at_20_std": 0.004315,
100
+ "nauc_recall_at_20_diff1": 0.316148,
101
+ "nauc_recall_at_100_max": 0.214416,
102
+ "nauc_recall_at_100_std": 0.12435,
103
+ "nauc_recall_at_100_diff1": 0.299645,
104
+ "nauc_recall_at_1000_max": 0.268208,
105
+ "nauc_recall_at_1000_std": 0.297148,
106
+ "nauc_recall_at_1000_diff1": 0.260845,
107
+ "nauc_precision_at_1_max": 0.22923,
108
+ "nauc_precision_at_1_std": -0.069274,
109
+ "nauc_precision_at_1_diff1": 0.514119,
110
+ "nauc_precision_at_3_max": 0.257747,
111
+ "nauc_precision_at_3_std": 0.035761,
112
+ "nauc_precision_at_3_diff1": 0.338342,
113
+ "nauc_precision_at_5_max": 0.284549,
114
+ "nauc_precision_at_5_std": 0.075212,
115
+ "nauc_precision_at_5_diff1": 0.27549,
116
+ "nauc_precision_at_10_max": 0.267505,
117
+ "nauc_precision_at_10_std": 0.123163,
118
+ "nauc_precision_at_10_diff1": 0.206667,
119
+ "nauc_precision_at_20_max": 0.224159,
120
+ "nauc_precision_at_20_std": 0.15824,
121
+ "nauc_precision_at_20_diff1": 0.141054,
122
+ "nauc_precision_at_100_max": 0.240232,
123
+ "nauc_precision_at_100_std": 0.232688,
124
+ "nauc_precision_at_100_diff1": 0.013049,
125
+ "nauc_precision_at_1000_max": 0.144113,
126
+ "nauc_precision_at_1000_std": 0.187898,
127
+ "nauc_precision_at_1000_diff1": -0.101707,
128
+ "nauc_mrr_at_1_max": 0.22923,
129
+ "nauc_mrr_at_1_std": -0.069274,
130
+ "nauc_mrr_at_1_diff1": 0.514119,
131
+ "nauc_mrr_at_3_max": 0.224142,
132
+ "nauc_mrr_at_3_std": -0.047477,
133
+ "nauc_mrr_at_3_diff1": 0.454186,
134
+ "nauc_mrr_at_5_max": 0.227117,
135
+ "nauc_mrr_at_5_std": -0.044585,
136
+ "nauc_mrr_at_5_diff1": 0.446953,
137
+ "nauc_mrr_at_10_max": 0.228146,
138
+ "nauc_mrr_at_10_std": -0.03916,
139
+ "nauc_mrr_at_10_diff1": 0.443104,
140
+ "nauc_mrr_at_20_max": 0.224568,
141
+ "nauc_mrr_at_20_std": -0.039608,
142
+ "nauc_mrr_at_20_diff1": 0.442529,
143
+ "nauc_mrr_at_100_max": 0.22645,
144
+ "nauc_mrr_at_100_std": -0.038528,
145
+ "nauc_mrr_at_100_diff1": 0.443425,
146
+ "nauc_mrr_at_1000_max": 0.22656,
147
+ "nauc_mrr_at_1000_std": -0.038372,
148
+ "nauc_mrr_at_1000_diff1": 0.443516,
149
+ "hit_rate_at_1": 0.24158,
150
+ "hit_rate_at_3": 0.38787,
151
+ "hit_rate_at_5": 0.44466,
152
+ "hit_rate_at_10": 0.53802,
153
+ "hit_rate_at_20": 0.63234,
154
+ "hit_rate_at_100": 0.79596,
155
+ "hit_rate_at_1000": 0.92974,
156
+ "main_score": 0.34235,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 414.5701470375061,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackProgrammersRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "6184bc1440d2dbc7612be22b50686b8826d22b32",
3
+ "task_name": "CQADupstackProgrammersRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.25,
9
+ "ndcg_at_3": 0.2956,
10
+ "ndcg_at_5": 0.31064,
11
+ "ndcg_at_10": 0.3353,
12
+ "ndcg_at_20": 0.35855,
13
+ "ndcg_at_100": 0.39643,
14
+ "ndcg_at_1000": 0.42614,
15
+ "map_at_1": 0.20257,
16
+ "map_at_3": 0.26097,
17
+ "map_at_5": 0.27295,
18
+ "map_at_10": 0.28466,
19
+ "map_at_20": 0.29204,
20
+ "map_at_100": 0.29823,
21
+ "map_at_1000": 0.2996,
22
+ "recall_at_1": 0.20257,
23
+ "recall_at_3": 0.32658,
24
+ "recall_at_5": 0.37032,
25
+ "recall_at_10": 0.44086,
26
+ "recall_at_20": 0.52196,
27
+ "recall_at_100": 0.70334,
28
+ "recall_at_1000": 0.90798,
29
+ "accuracy": 0.20257,
30
+ "precision_at_1": 0.25,
31
+ "precision_at_3": 0.14155,
32
+ "precision_at_5": 0.09863,
33
+ "precision_at_10": 0.06153,
34
+ "precision_at_20": 0.03801,
35
+ "precision_at_100": 0.01074,
36
+ "precision_at_1000": 0.00152,
37
+ "mrr_at_1": 0.25,
38
+ "mrr_at_3": 0.310693,
39
+ "mrr_at_5": 0.322051,
40
+ "mrr_at_10": 0.332699,
41
+ "mrr_at_20": 0.338434,
42
+ "mrr_at_100": 0.342982,
43
+ "mrr_at_1000": 0.34372,
44
+ "nauc_ndcg_at_1_max": 0.324805,
45
+ "nauc_ndcg_at_1_std": -0.027334,
46
+ "nauc_ndcg_at_1_diff1": 0.441793,
47
+ "nauc_ndcg_at_3_max": 0.285088,
48
+ "nauc_ndcg_at_3_std": -0.054547,
49
+ "nauc_ndcg_at_3_diff1": 0.39759,
50
+ "nauc_ndcg_at_5_max": 0.292995,
51
+ "nauc_ndcg_at_5_std": -0.037491,
52
+ "nauc_ndcg_at_5_diff1": 0.393983,
53
+ "nauc_ndcg_at_10_max": 0.300657,
54
+ "nauc_ndcg_at_10_std": -0.02649,
55
+ "nauc_ndcg_at_10_diff1": 0.378899,
56
+ "nauc_ndcg_at_20_max": 0.309296,
57
+ "nauc_ndcg_at_20_std": -0.012752,
58
+ "nauc_ndcg_at_20_diff1": 0.376906,
59
+ "nauc_ndcg_at_100_max": 0.320962,
60
+ "nauc_ndcg_at_100_std": 0.014914,
61
+ "nauc_ndcg_at_100_diff1": 0.373243,
62
+ "nauc_ndcg_at_1000_max": 0.320363,
63
+ "nauc_ndcg_at_1000_std": 0.015294,
64
+ "nauc_ndcg_at_1000_diff1": 0.378516,
65
+ "nauc_map_at_1_max": 0.272369,
66
+ "nauc_map_at_1_std": -0.059931,
67
+ "nauc_map_at_1_diff1": 0.47015,
68
+ "nauc_map_at_3_max": 0.276393,
69
+ "nauc_map_at_3_std": -0.062672,
70
+ "nauc_map_at_3_diff1": 0.417012,
71
+ "nauc_map_at_5_max": 0.286385,
72
+ "nauc_map_at_5_std": -0.049921,
73
+ "nauc_map_at_5_diff1": 0.413977,
74
+ "nauc_map_at_10_max": 0.292044,
75
+ "nauc_map_at_10_std": -0.041998,
76
+ "nauc_map_at_10_diff1": 0.406553,
77
+ "nauc_map_at_20_max": 0.29505,
78
+ "nauc_map_at_20_std": -0.037604,
79
+ "nauc_map_at_20_diff1": 0.405125,
80
+ "nauc_map_at_100_max": 0.298342,
81
+ "nauc_map_at_100_std": -0.032401,
82
+ "nauc_map_at_100_diff1": 0.404164,
83
+ "nauc_map_at_1000_max": 0.298531,
84
+ "nauc_map_at_1000_std": -0.031908,
85
+ "nauc_map_at_1000_diff1": 0.404317,
86
+ "nauc_recall_at_1_max": 0.272369,
87
+ "nauc_recall_at_1_std": -0.059931,
88
+ "nauc_recall_at_1_diff1": 0.47015,
89
+ "nauc_recall_at_3_max": 0.250147,
90
+ "nauc_recall_at_3_std": -0.075139,
91
+ "nauc_recall_at_3_diff1": 0.358525,
92
+ "nauc_recall_at_5_max": 0.268922,
93
+ "nauc_recall_at_5_std": -0.031596,
94
+ "nauc_recall_at_5_diff1": 0.337474,
95
+ "nauc_recall_at_10_max": 0.279233,
96
+ "nauc_recall_at_10_std": -0.006019,
97
+ "nauc_recall_at_10_diff1": 0.295428,
98
+ "nauc_recall_at_20_max": 0.306878,
99
+ "nauc_recall_at_20_std": 0.041206,
100
+ "nauc_recall_at_20_diff1": 0.285753,
101
+ "nauc_recall_at_100_max": 0.350074,
102
+ "nauc_recall_at_100_std": 0.202419,
103
+ "nauc_recall_at_100_diff1": 0.244626,
104
+ "nauc_recall_at_1000_max": 0.4222,
105
+ "nauc_recall_at_1000_std": 0.46996,
106
+ "nauc_recall_at_1000_diff1": 0.21531,
107
+ "nauc_precision_at_1_max": 0.324805,
108
+ "nauc_precision_at_1_std": -0.027334,
109
+ "nauc_precision_at_1_diff1": 0.441793,
110
+ "nauc_precision_at_3_max": 0.308426,
111
+ "nauc_precision_at_3_std": -0.023667,
112
+ "nauc_precision_at_3_diff1": 0.313257,
113
+ "nauc_precision_at_5_max": 0.340048,
114
+ "nauc_precision_at_5_std": 0.028194,
115
+ "nauc_precision_at_5_diff1": 0.287226,
116
+ "nauc_precision_at_10_max": 0.335668,
117
+ "nauc_precision_at_10_std": 0.069652,
118
+ "nauc_precision_at_10_diff1": 0.198088,
119
+ "nauc_precision_at_20_max": 0.307085,
120
+ "nauc_precision_at_20_std": 0.10034,
121
+ "nauc_precision_at_20_diff1": 0.148041,
122
+ "nauc_precision_at_100_max": 0.252671,
123
+ "nauc_precision_at_100_std": 0.183485,
124
+ "nauc_precision_at_100_diff1": 0.025992,
125
+ "nauc_precision_at_1000_max": 0.0652,
126
+ "nauc_precision_at_1000_std": 0.125647,
127
+ "nauc_precision_at_1000_diff1": -0.065083,
128
+ "nauc_mrr_at_1_max": 0.324805,
129
+ "nauc_mrr_at_1_std": -0.027334,
130
+ "nauc_mrr_at_1_diff1": 0.441793,
131
+ "nauc_mrr_at_3_max": 0.311536,
132
+ "nauc_mrr_at_3_std": -0.035323,
133
+ "nauc_mrr_at_3_diff1": 0.407277,
134
+ "nauc_mrr_at_5_max": 0.315657,
135
+ "nauc_mrr_at_5_std": -0.023003,
136
+ "nauc_mrr_at_5_diff1": 0.400069,
137
+ "nauc_mrr_at_10_max": 0.31773,
138
+ "nauc_mrr_at_10_std": -0.020922,
139
+ "nauc_mrr_at_10_diff1": 0.39409,
140
+ "nauc_mrr_at_20_max": 0.318732,
141
+ "nauc_mrr_at_20_std": -0.017088,
142
+ "nauc_mrr_at_20_diff1": 0.394256,
143
+ "nauc_mrr_at_100_max": 0.319804,
144
+ "nauc_mrr_at_100_std": -0.013964,
145
+ "nauc_mrr_at_100_diff1": 0.394048,
146
+ "nauc_mrr_at_1000_max": 0.319858,
147
+ "nauc_mrr_at_1000_std": -0.014089,
148
+ "nauc_mrr_at_1000_diff1": 0.394367,
149
+ "hit_rate_at_1": 0.25,
150
+ "hit_rate_at_3": 0.38927,
151
+ "hit_rate_at_5": 0.43836,
152
+ "hit_rate_at_10": 0.51712,
153
+ "hit_rate_at_20": 0.59817,
154
+ "hit_rate_at_100": 0.77055,
155
+ "hit_rate_at_1000": 0.9395,
156
+ "main_score": 0.3353,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 349.29484486579895,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackRetrieval.json ADDED
@@ -0,0 +1,20 @@
1
+ {
2
+ "dataset_revision": "1",
3
+ "task_name": "CQADupstackRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_10": 0.310504,
9
+ "main_score": 0.310504,
10
+ "hf_subset": "default",
11
+ "languages": [
12
+ "eng-Latn"
13
+ ]
14
+ }
15
+ ]
16
+ },
17
+ "evaluation_time": 5117.2366778850555,
18
+ "kg_co2_emissions": NaN,
19
+ "date": 1775129174.61369
20
+ }
results/CQADupstackStatsRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "65ac3a16b8e91f9cee4c9828cc7c335575432a2a",
3
+ "task_name": "CQADupstackStatsRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.19939,
9
+ "ndcg_at_3": 0.23295,
10
+ "ndcg_at_5": 0.24714,
11
+ "ndcg_at_10": 0.26658,
12
+ "ndcg_at_20": 0.28099,
13
+ "ndcg_at_100": 0.30745,
14
+ "ndcg_at_1000": 0.3375,
15
+ "map_at_1": 0.1779,
16
+ "map_at_3": 0.21501,
17
+ "map_at_5": 0.2238,
18
+ "map_at_10": 0.23258,
19
+ "map_at_20": 0.23672,
20
+ "map_at_100": 0.24033,
21
+ "map_at_1000": 0.24142,
22
+ "recall_at_1": 0.1779,
23
+ "recall_at_3": 0.25936,
24
+ "recall_at_5": 0.29372,
25
+ "recall_at_10": 0.35086,
26
+ "recall_at_20": 0.40441,
27
+ "recall_at_100": 0.54095,
28
+ "recall_at_1000": 0.7666,
29
+ "accuracy": 0.1779,
30
+ "precision_at_1": 0.19939,
31
+ "precision_at_3": 0.09969,
32
+ "precision_at_5": 0.06994,
33
+ "precision_at_10": 0.04233,
34
+ "precision_at_20": 0.02477,
35
+ "precision_at_100": 0.00679,
36
+ "precision_at_1000": 0.00102,
37
+ "mrr_at_1": 0.199387,
38
+ "mrr_at_3": 0.236963,
39
+ "mrr_at_5": 0.245092,
40
+ "mrr_at_10": 0.253289,
41
+ "mrr_at_20": 0.257134,
42
+ "mrr_at_100": 0.260601,
43
+ "mrr_at_1000": 0.261461,
44
+ "nauc_ndcg_at_1_max": 0.317712,
45
+ "nauc_ndcg_at_1_std": -0.000928,
46
+ "nauc_ndcg_at_1_diff1": 0.593413,
47
+ "nauc_ndcg_at_3_max": 0.289033,
48
+ "nauc_ndcg_at_3_std": 0.014542,
49
+ "nauc_ndcg_at_3_diff1": 0.532802,
50
+ "nauc_ndcg_at_5_max": 0.281033,
51
+ "nauc_ndcg_at_5_std": 0.031589,
52
+ "nauc_ndcg_at_5_diff1": 0.501139,
53
+ "nauc_ndcg_at_10_max": 0.2812,
54
+ "nauc_ndcg_at_10_std": 0.052378,
55
+ "nauc_ndcg_at_10_diff1": 0.478516,
56
+ "nauc_ndcg_at_20_max": 0.284778,
57
+ "nauc_ndcg_at_20_std": 0.059559,
58
+ "nauc_ndcg_at_20_diff1": 0.462788,
59
+ "nauc_ndcg_at_100_max": 0.280638,
60
+ "nauc_ndcg_at_100_std": 0.075719,
61
+ "nauc_ndcg_at_100_diff1": 0.45779,
62
+ "nauc_ndcg_at_1000_max": 0.279537,
63
+ "nauc_ndcg_at_1000_std": 0.078197,
64
+ "nauc_ndcg_at_1000_diff1": 0.4587,
65
+ "nauc_map_at_1_max": 0.317489,
66
+ "nauc_map_at_1_std": -0.029635,
67
+ "nauc_map_at_1_diff1": 0.602172,
68
+ "nauc_map_at_3_max": 0.293764,
69
+ "nauc_map_at_3_std": -0.004741,
70
+ "nauc_map_at_3_diff1": 0.550511,
71
+ "nauc_map_at_5_max": 0.288052,
72
+ "nauc_map_at_5_std": 0.00658,
73
+ "nauc_map_at_5_diff1": 0.530348,
74
+ "nauc_map_at_10_max": 0.289485,
75
+ "nauc_map_at_10_std": 0.018463,
76
+ "nauc_map_at_10_diff1": 0.5192,
77
+ "nauc_map_at_20_max": 0.290112,
78
+ "nauc_map_at_20_std": 0.021243,
79
+ "nauc_map_at_20_diff1": 0.513921,
80
+ "nauc_map_at_100_max": 0.289861,
81
+ "nauc_map_at_100_std": 0.023636,
82
+ "nauc_map_at_100_diff1": 0.513217,
83
+ "nauc_map_at_1000_max": 0.289854,
84
+ "nauc_map_at_1000_std": 0.023939,
85
+ "nauc_map_at_1000_diff1": 0.513306,
86
+ "nauc_recall_at_1_max": 0.317489,
87
+ "nauc_recall_at_1_std": -0.029635,
88
+ "nauc_recall_at_1_diff1": 0.602172,
89
+ "nauc_recall_at_3_max": 0.26738,
90
+ "nauc_recall_at_3_std": 0.027758,
91
+ "nauc_recall_at_3_diff1": 0.488059,
92
+ "nauc_recall_at_5_max": 0.250209,
93
+ "nauc_recall_at_5_std": 0.068913,
94
+ "nauc_recall_at_5_diff1": 0.418075,
95
+ "nauc_recall_at_10_max": 0.2458,
96
+ "nauc_recall_at_10_std": 0.115391,
97
+ "nauc_recall_at_10_diff1": 0.357596,
98
+ "nauc_recall_at_20_max": 0.254415,
99
+ "nauc_recall_at_20_std": 0.134942,
100
+ "nauc_recall_at_20_diff1": 0.306501,
101
+ "nauc_recall_at_100_max": 0.228532,
102
+ "nauc_recall_at_100_std": 0.218078,
103
+ "nauc_recall_at_100_diff1": 0.263275,
104
+ "nauc_recall_at_1000_max": 0.179482,
105
+ "nauc_recall_at_1000_std": 0.300629,
106
+ "nauc_recall_at_1000_diff1": 0.161043,
107
+ "nauc_precision_at_1_max": 0.317712,
108
+ "nauc_precision_at_1_std": -0.000928,
109
+ "nauc_precision_at_1_diff1": 0.593413,
110
+ "nauc_precision_at_3_max": 0.277551,
111
+ "nauc_precision_at_3_std": 0.072235,
112
+ "nauc_precision_at_3_diff1": 0.465332,
113
+ "nauc_precision_at_5_max": 0.255268,
114
+ "nauc_precision_at_5_std": 0.119395,
115
+ "nauc_precision_at_5_diff1": 0.37565,
116
+ "nauc_precision_at_10_max": 0.257556,
117
+ "nauc_precision_at_10_std": 0.178955,
118
+ "nauc_precision_at_10_diff1": 0.314717,
119
+ "nauc_precision_at_20_max": 0.267482,
120
+ "nauc_precision_at_20_std": 0.201016,
121
+ "nauc_precision_at_20_diff1": 0.24753,
122
+ "nauc_precision_at_100_max": 0.20883,
123
+ "nauc_precision_at_100_std": 0.254214,
124
+ "nauc_precision_at_100_diff1": 0.188949,
125
+ "nauc_precision_at_1000_max": 0.144061,
126
+ "nauc_precision_at_1000_std": 0.235026,
127
+ "nauc_precision_at_1000_diff1": 0.081085,
128
+ "nauc_mrr_at_1_max": 0.317712,
129
+ "nauc_mrr_at_1_std": -0.000928,
130
+ "nauc_mrr_at_1_diff1": 0.593413,
131
+ "nauc_mrr_at_3_max": 0.297264,
132
+ "nauc_mrr_at_3_std": 0.023877,
133
+ "nauc_mrr_at_3_diff1": 0.547307,
134
+ "nauc_mrr_at_5_max": 0.294441,
135
+ "nauc_mrr_at_5_std": 0.033034,
136
+ "nauc_mrr_at_5_diff1": 0.529222,
137
+ "nauc_mrr_at_10_max": 0.292899,
138
+ "nauc_mrr_at_10_std": 0.042312,
139
+ "nauc_mrr_at_10_diff1": 0.518927,
140
+ "nauc_mrr_at_20_max": 0.294498,
141
+ "nauc_mrr_at_20_std": 0.043485,
142
+ "nauc_mrr_at_20_diff1": 0.515377,
143
+ "nauc_mrr_at_100_max": 0.29365,
144
+ "nauc_mrr_at_100_std": 0.044924,
145
+ "nauc_mrr_at_100_diff1": 0.515111,
146
+ "nauc_mrr_at_1000_max": 0.293399,
147
+ "nauc_mrr_at_1000_std": 0.044993,
148
+ "nauc_mrr_at_1000_diff1": 0.514998,
149
+ "hit_rate_at_1": 0.19939,
150
+ "hit_rate_at_3": 0.28681,
151
+ "hit_rate_at_5": 0.32362,
152
+ "hit_rate_at_10": 0.38344,
153
+ "hit_rate_at_20": 0.44018,
154
+ "hit_rate_at_100": 0.58282,
155
+ "hit_rate_at_1000": 0.79755,
156
+ "main_score": 0.26658,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 454.00575137138367,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackTexRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "46989137a86843e03a6195de44b09deda022eec7",
3
+ "task_name": "CQADupstackTexRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.15141,
9
+ "ndcg_at_3": 0.18441,
10
+ "ndcg_at_5": 0.19844,
11
+ "ndcg_at_10": 0.21772,
12
+ "ndcg_at_20": 0.2356,
13
+ "ndcg_at_100": 0.26789,
14
+ "ndcg_at_1000": 0.30123,
15
+ "map_at_1": 0.12781,
16
+ "map_at_3": 0.16452,
17
+ "map_at_5": 0.17308,
18
+ "map_at_10": 0.18139,
19
+ "map_at_20": 0.18656,
20
+ "map_at_100": 0.19128,
21
+ "map_at_1000": 0.19258,
22
+ "recall_at_1": 0.12781,
23
+ "recall_at_3": 0.20751,
24
+ "recall_at_5": 0.24226,
25
+ "recall_at_10": 0.29955,
26
+ "recall_at_20": 0.36569,
27
+ "recall_at_100": 0.52778,
28
+ "recall_at_1000": 0.7687,
29
+ "accuracy": 0.12781,
30
+ "precision_at_1": 0.15141,
31
+ "precision_at_3": 0.08534,
32
+ "precision_at_5": 0.06167,
33
+ "precision_at_10": 0.03957,
34
+ "precision_at_20": 0.02479,
35
+ "precision_at_100": 0.00756,
36
+ "precision_at_1000": 0.00121,
37
+ "mrr_at_1": 0.151411,
38
+ "mrr_at_3": 0.1915,
39
+ "mrr_at_5": 0.20148,
40
+ "mrr_at_10": 0.210094,
41
+ "mrr_at_20": 0.215271,
42
+ "mrr_at_100": 0.219333,
43
+ "mrr_at_1000": 0.22026,
44
+ "nauc_ndcg_at_1_max": 0.188101,
45
+ "nauc_ndcg_at_1_std": -0.059808,
46
+ "nauc_ndcg_at_1_diff1": 0.44279,
47
+ "nauc_ndcg_at_3_max": 0.169839,
48
+ "nauc_ndcg_at_3_std": -0.042406,
49
+ "nauc_ndcg_at_3_diff1": 0.363346,
50
+ "nauc_ndcg_at_5_max": 0.172625,
51
+ "nauc_ndcg_at_5_std": -0.033595,
52
+ "nauc_ndcg_at_5_diff1": 0.347585,
53
+ "nauc_ndcg_at_10_max": 0.18353,
54
+ "nauc_ndcg_at_10_std": -0.021714,
55
+ "nauc_ndcg_at_10_diff1": 0.343985,
56
+ "nauc_ndcg_at_20_max": 0.187462,
57
+ "nauc_ndcg_at_20_std": -0.007311,
58
+ "nauc_ndcg_at_20_diff1": 0.332957,
59
+ "nauc_ndcg_at_100_max": 0.192158,
60
+ "nauc_ndcg_at_100_std": 0.015166,
61
+ "nauc_ndcg_at_100_diff1": 0.325311,
62
+ "nauc_ndcg_at_1000_max": 0.196488,
63
+ "nauc_ndcg_at_1000_std": 0.021498,
64
+ "nauc_ndcg_at_1000_diff1": 0.328248,
65
+ "nauc_map_at_1_max": 0.177569,
66
+ "nauc_map_at_1_std": -0.044058,
67
+ "nauc_map_at_1_diff1": 0.446587,
68
+ "nauc_map_at_3_max": 0.168933,
69
+ "nauc_map_at_3_std": -0.041864,
70
+ "nauc_map_at_3_diff1": 0.382165,
71
+ "nauc_map_at_5_max": 0.17127,
72
+ "nauc_map_at_5_std": -0.037645,
73
+ "nauc_map_at_5_diff1": 0.371836,
74
+ "nauc_map_at_10_max": 0.176201,
75
+ "nauc_map_at_10_std": -0.031853,
76
+ "nauc_map_at_10_diff1": 0.369451,
77
+ "nauc_map_at_20_max": 0.177908,
78
+ "nauc_map_at_20_std": -0.027149,
79
+ "nauc_map_at_20_diff1": 0.365457,
80
+ "nauc_map_at_100_max": 0.179159,
81
+ "nauc_map_at_100_std": -0.023351,
82
+ "nauc_map_at_100_diff1": 0.364243,
83
+ "nauc_map_at_1000_max": 0.179262,
84
+ "nauc_map_at_1000_std": -0.023082,
85
+ "nauc_map_at_1000_diff1": 0.364127,
86
+ "nauc_recall_at_1_max": 0.177569,
87
+ "nauc_recall_at_1_std": -0.044058,
88
+ "nauc_recall_at_1_diff1": 0.446587,
89
+ "nauc_recall_at_3_max": 0.154393,
90
+ "nauc_recall_at_3_std": -0.036288,
91
+ "nauc_recall_at_3_diff1": 0.320667,
92
+ "nauc_recall_at_5_max": 0.159413,
93
+ "nauc_recall_at_5_std": -0.022595,
94
+ "nauc_recall_at_5_diff1": 0.290007,
95
+ "nauc_recall_at_10_max": 0.183948,
96
+ "nauc_recall_at_10_std": 0.006838,
97
+ "nauc_recall_at_10_diff1": 0.281372,
98
+ "nauc_recall_at_20_max": 0.189193,
99
+ "nauc_recall_at_20_std": 0.048,
100
+ "nauc_recall_at_20_diff1": 0.244202,
101
+ "nauc_recall_at_100_max": 0.200693,
102
+ "nauc_recall_at_100_std": 0.140801,
103
+ "nauc_recall_at_100_diff1": 0.20056,
104
+ "nauc_recall_at_1000_max": 0.251608,
105
+ "nauc_recall_at_1000_std": 0.275474,
106
+ "nauc_recall_at_1000_diff1": 0.185376,
107
+ "nauc_precision_at_1_max": 0.188101,
108
+ "nauc_precision_at_1_std": -0.059808,
109
+ "nauc_precision_at_1_diff1": 0.44279,
110
+ "nauc_precision_at_3_max": 0.17356,
111
+ "nauc_precision_at_3_std": -0.044117,
112
+ "nauc_precision_at_3_diff1": 0.315808,
113
+ "nauc_precision_at_5_max": 0.180729,
114
+ "nauc_precision_at_5_std": -0.019843,
115
+ "nauc_precision_at_5_diff1": 0.271323,
116
+ "nauc_precision_at_10_max": 0.207514,
117
+ "nauc_precision_at_10_std": 0.008349,
118
+ "nauc_precision_at_10_diff1": 0.25316,
119
+ "nauc_precision_at_20_max": 0.220315,
120
+ "nauc_precision_at_20_std": 0.042415,
121
+ "nauc_precision_at_20_diff1": 0.209044,
122
+ "nauc_precision_at_100_max": 0.215987,
123
+ "nauc_precision_at_100_std": 0.111507,
124
+ "nauc_precision_at_100_diff1": 0.122919,
125
+ "nauc_precision_at_1000_max": 0.188892,
126
+ "nauc_precision_at_1000_std": 0.110421,
127
+ "nauc_precision_at_1000_diff1": 0.016761,
128
+ "nauc_mrr_at_1_max": 0.188101,
129
+ "nauc_mrr_at_1_std": -0.059808,
130
+ "nauc_mrr_at_1_diff1": 0.44279,
131
+ "nauc_mrr_at_3_max": 0.178068,
132
+ "nauc_mrr_at_3_std": -0.049746,
133
+ "nauc_mrr_at_3_diff1": 0.374799,
134
+ "nauc_mrr_at_5_max": 0.180579,
135
+ "nauc_mrr_at_5_std": -0.043448,
136
+ "nauc_mrr_at_5_diff1": 0.365276,
137
+ "nauc_mrr_at_10_max": 0.185298,
138
+ "nauc_mrr_at_10_std": -0.037397,
139
+ "nauc_mrr_at_10_diff1": 0.363307,
140
+ "nauc_mrr_at_20_max": 0.186362,
141
+ "nauc_mrr_at_20_std": -0.033146,
142
+ "nauc_mrr_at_20_diff1": 0.360414,
143
+ "nauc_mrr_at_100_max": 0.186694,
144
+ "nauc_mrr_at_100_std": -0.030608,
145
+ "nauc_mrr_at_100_diff1": 0.359288,
146
+ "nauc_mrr_at_1000_max": 0.186659,
147
+ "nauc_mrr_at_1000_std": -0.030698,
148
+ "nauc_mrr_at_1000_diff1": 0.359486,
149
+ "hit_rate_at_1": 0.15141,
150
+ "hit_rate_at_3": 0.24329,
151
+ "hit_rate_at_5": 0.2863,
152
+ "hit_rate_at_10": 0.35237,
153
+ "hit_rate_at_20": 0.4267,
154
+ "hit_rate_at_100": 0.59326,
155
+ "hit_rate_at_1000": 0.81624,
156
+ "main_score": 0.21772,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 769.9449524879456,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackUnixRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "6c6430d3a6d36f8d2a829195bc5dc94d7e063e53",
3
+ "task_name": "CQADupstackUnixRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.21828,
9
+ "ndcg_at_3": 0.2542,
10
+ "ndcg_at_5": 0.2744,
11
+ "ndcg_at_10": 0.29569,
12
+ "ndcg_at_20": 0.31654,
13
+ "ndcg_at_100": 0.35134,
14
+ "ndcg_at_1000": 0.38127,
15
+ "map_at_1": 0.1882,
16
+ "map_at_3": 0.23073,
17
+ "map_at_5": 0.24373,
18
+ "map_at_10": 0.25331,
19
+ "map_at_20": 0.25935,
20
+ "map_at_100": 0.2645,
21
+ "map_at_1000": 0.26567,
22
+ "recall_at_1": 0.1882,
23
+ "recall_at_3": 0.28051,
24
+ "recall_at_5": 0.33123,
25
+ "recall_at_10": 0.39411,
26
+ "recall_at_20": 0.47019,
27
+ "recall_at_100": 0.64248,
28
+ "recall_at_1000": 0.85699,
29
+ "accuracy": 0.1882,
30
+ "precision_at_1": 0.21828,
31
+ "precision_at_3": 0.11412,
32
+ "precision_at_5": 0.08209,
33
+ "precision_at_10": 0.04963,
34
+ "precision_at_20": 0.03022,
35
+ "precision_at_100": 0.00873,
36
+ "precision_at_1000": 0.00126,
37
+ "mrr_at_1": 0.218284,
38
+ "mrr_at_3": 0.264303,
39
+ "mrr_at_5": 0.277037,
40
+ "mrr_at_10": 0.285644,
41
+ "mrr_at_20": 0.291422,
42
+ "mrr_at_100": 0.29577,
43
+ "mrr_at_1000": 0.296572,
44
+ "nauc_ndcg_at_1_max": 0.281543,
45
+ "nauc_ndcg_at_1_std": -0.040024,
46
+ "nauc_ndcg_at_1_diff1": 0.482605,
47
+ "nauc_ndcg_at_3_max": 0.280481,
48
+ "nauc_ndcg_at_3_std": -0.030218,
49
+ "nauc_ndcg_at_3_diff1": 0.40564,
50
+ "nauc_ndcg_at_5_max": 0.267505,
51
+ "nauc_ndcg_at_5_std": -0.017274,
52
+ "nauc_ndcg_at_5_diff1": 0.395204,
53
+ "nauc_ndcg_at_10_max": 0.257295,
54
+ "nauc_ndcg_at_10_std": -0.014622,
55
+ "nauc_ndcg_at_10_diff1": 0.377198,
56
+ "nauc_ndcg_at_20_max": 0.263194,
57
+ "nauc_ndcg_at_20_std": 0.000184,
58
+ "nauc_ndcg_at_20_diff1": 0.378256,
59
+ "nauc_ndcg_at_100_max": 0.278469,
60
+ "nauc_ndcg_at_100_std": 0.02012,
61
+ "nauc_ndcg_at_100_diff1": 0.378415,
62
+ "nauc_ndcg_at_1000_max": 0.285602,
63
+ "nauc_ndcg_at_1000_std": 0.025091,
64
+ "nauc_ndcg_at_1000_diff1": 0.38129,
65
+ "nauc_map_at_1_max": 0.292304,
66
+ "nauc_map_at_1_std": -0.039585,
67
+ "nauc_map_at_1_diff1": 0.50601,
68
+ "nauc_map_at_3_max": 0.288302,
69
+ "nauc_map_at_3_std": -0.031367,
70
+ "nauc_map_at_3_diff1": 0.438735,
71
+ "nauc_map_at_5_max": 0.281071,
72
+ "nauc_map_at_5_std": -0.023007,
73
+ "nauc_map_at_5_diff1": 0.430747,
74
+ "nauc_map_at_10_max": 0.276116,
75
+ "nauc_map_at_10_std": -0.022191,
76
+ "nauc_map_at_10_diff1": 0.422077,
77
+ "nauc_map_at_20_max": 0.277337,
78
+ "nauc_map_at_20_std": -0.018386,
79
+ "nauc_map_at_20_diff1": 0.421592,
80
+ "nauc_map_at_100_max": 0.279568,
81
+ "nauc_map_at_100_std": -0.015354,
82
+ "nauc_map_at_100_diff1": 0.421515,
83
+ "nauc_map_at_1000_max": 0.279963,
84
+ "nauc_map_at_1000_std": -0.014972,
85
+ "nauc_map_at_1000_diff1": 0.421573,
86
+ "nauc_recall_at_1_max": 0.292304,
87
+ "nauc_recall_at_1_std": -0.039585,
88
+ "nauc_recall_at_1_diff1": 0.50601,
89
+ "nauc_recall_at_3_max": 0.266313,
90
+ "nauc_recall_at_3_std": -0.022458,
91
+ "nauc_recall_at_3_diff1": 0.347169,
92
+ "nauc_recall_at_5_max": 0.232998,
93
+ "nauc_recall_at_5_std": 0.001873,
94
+ "nauc_recall_at_5_diff1": 0.314671,
95
+ "nauc_recall_at_10_max": 0.201628,
96
+ "nauc_recall_at_10_std": 0.0071,
97
+ "nauc_recall_at_10_diff1": 0.267848,
98
+ "nauc_recall_at_20_max": 0.214548,
99
+ "nauc_recall_at_20_std": 0.055337,
100
+ "nauc_recall_at_20_diff1": 0.271335,
101
+ "nauc_recall_at_100_max": 0.28066,
102
+ "nauc_recall_at_100_std": 0.170716,
103
+ "nauc_recall_at_100_diff1": 0.253935,
104
+ "nauc_recall_at_1000_max": 0.39281,
105
+ "nauc_recall_at_1000_std": 0.398256,
106
+ "nauc_recall_at_1000_diff1": 0.211879,
107
+ "nauc_precision_at_1_max": 0.281543,
108
+ "nauc_precision_at_1_std": -0.040024,
109
+ "nauc_precision_at_1_diff1": 0.482605,
110
+ "nauc_precision_at_3_max": 0.253646,
111
+ "nauc_precision_at_3_std": -0.017114,
112
+ "nauc_precision_at_3_diff1": 0.317777,
113
+ "nauc_precision_at_5_max": 0.217351,
114
+ "nauc_precision_at_5_std": 0.007167,
115
+ "nauc_precision_at_5_diff1": 0.280379,
116
+ "nauc_precision_at_10_max": 0.184707,
117
+ "nauc_precision_at_10_std": 0.010177,
118
+ "nauc_precision_at_10_diff1": 0.210426,
119
+ "nauc_precision_at_20_max": 0.202904,
120
+ "nauc_precision_at_20_std": 0.068057,
121
+ "nauc_precision_at_20_diff1": 0.17969,
122
+ "nauc_precision_at_100_max": 0.194589,
123
+ "nauc_precision_at_100_std": 0.13256,
124
+ "nauc_precision_at_100_diff1": 0.070972,
125
+ "nauc_precision_at_1000_max": 0.103915,
126
+ "nauc_precision_at_1000_std": 0.09756,
127
+ "nauc_precision_at_1000_diff1": -0.075323,
128
+ "nauc_mrr_at_1_max": 0.281543,
129
+ "nauc_mrr_at_1_std": -0.040024,
130
+ "nauc_mrr_at_1_diff1": 0.482605,
131
+ "nauc_mrr_at_3_max": 0.275238,
132
+ "nauc_mrr_at_3_std": -0.032196,
133
+ "nauc_mrr_at_3_diff1": 0.408081,
134
+ "nauc_mrr_at_5_max": 0.267431,
135
+ "nauc_mrr_at_5_std": -0.025143,
136
+ "nauc_mrr_at_5_diff1": 0.400962,
137
+ "nauc_mrr_at_10_max": 0.26402,
138
+ "nauc_mrr_at_10_std": -0.024469,
139
+ "nauc_mrr_at_10_diff1": 0.393913,
140
+ "nauc_mrr_at_20_max": 0.266899,
141
+ "nauc_mrr_at_20_std": -0.019789,
142
+ "nauc_mrr_at_20_diff1": 0.39483,
143
+ "nauc_mrr_at_100_max": 0.269335,
144
+ "nauc_mrr_at_100_std": -0.017582,
145
+ "nauc_mrr_at_100_diff1": 0.39559,
146
+ "nauc_mrr_at_1000_max": 0.269478,
147
+ "nauc_mrr_at_1000_std": -0.017441,
148
+ "nauc_mrr_at_1000_diff1": 0.395897,
149
+ "hit_rate_at_1": 0.21828,
150
+ "hit_rate_at_3": 0.32369,
151
+ "hit_rate_at_5": 0.37873,
152
+ "hit_rate_at_10": 0.4431,
153
+ "hit_rate_at_20": 0.52425,
154
+ "hit_rate_at_100": 0.70522,
155
+ "hit_rate_at_1000": 0.90205,
156
+ "main_score": 0.29569,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 517.7233333587646,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackWebmastersRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "160c094312a0e1facb97e55eeddb698c0abe3571",
3
+ "task_name": "CQADupstackWebmastersRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.21937,
9
+ "ndcg_at_3": 0.27274,
10
+ "ndcg_at_5": 0.28587,
11
+ "ndcg_at_10": 0.31327,
12
+ "ndcg_at_20": 0.3329,
13
+ "ndcg_at_100": 0.37491,
14
+ "ndcg_at_1000": 0.40721,
15
+ "map_at_1": 0.18622,
16
+ "map_at_3": 0.23995,
17
+ "map_at_5": 0.24959,
18
+ "map_at_10": 0.26261,
19
+ "map_at_20": 0.27003,
20
+ "map_at_100": 0.27806,
21
+ "map_at_1000": 0.28035,
22
+ "recall_at_1": 0.18622,
23
+ "recall_at_3": 0.29546,
24
+ "recall_at_5": 0.332,
25
+ "recall_at_10": 0.41501,
26
+ "recall_at_20": 0.48812,
27
+ "recall_at_100": 0.69388,
28
+ "recall_at_1000": 0.90904,
29
+ "accuracy": 0.18622,
30
+ "precision_at_1": 0.21937,
31
+ "precision_at_3": 0.1278,
32
+ "precision_at_5": 0.09091,
33
+ "precision_at_10": 0.06146,
34
+ "precision_at_20": 0.04032,
35
+ "precision_at_100": 0.01381,
36
+ "precision_at_1000": 0.00233,
37
+ "mrr_at_1": 0.219368,
38
+ "mrr_at_3": 0.278327,
39
+ "mrr_at_5": 0.28722,
40
+ "mrr_at_10": 0.298878,
41
+ "mrr_at_20": 0.304274,
42
+ "mrr_at_100": 0.309291,
43
+ "mrr_at_1000": 0.309969,
44
+ "nauc_ndcg_at_1_max": 0.205806,
45
+ "nauc_ndcg_at_1_std": -0.022055,
46
+ "nauc_ndcg_at_1_diff1": 0.41167,
47
+ "nauc_ndcg_at_3_max": 0.214123,
48
+ "nauc_ndcg_at_3_std": 0.065781,
49
+ "nauc_ndcg_at_3_diff1": 0.348382,
50
+ "nauc_ndcg_at_5_max": 0.212249,
51
+ "nauc_ndcg_at_5_std": 0.069996,
52
+ "nauc_ndcg_at_5_diff1": 0.356753,
53
+ "nauc_ndcg_at_10_max": 0.211098,
54
+ "nauc_ndcg_at_10_std": 0.084688,
55
+ "nauc_ndcg_at_10_diff1": 0.370725,
56
+ "nauc_ndcg_at_20_max": 0.219452,
57
+ "nauc_ndcg_at_20_std": 0.092902,
58
+ "nauc_ndcg_at_20_diff1": 0.372046,
59
+ "nauc_ndcg_at_100_max": 0.228064,
60
+ "nauc_ndcg_at_100_std": 0.11123,
61
+ "nauc_ndcg_at_100_diff1": 0.36461,
62
+ "nauc_ndcg_at_1000_max": 0.230216,
63
+ "nauc_ndcg_at_1000_std": 0.106767,
64
+ "nauc_ndcg_at_1000_diff1": 0.355181,
65
+ "nauc_map_at_1_max": 0.214091,
66
+ "nauc_map_at_1_std": -0.050987,
67
+ "nauc_map_at_1_diff1": 0.431585,
68
+ "nauc_map_at_3_max": 0.215742,
69
+ "nauc_map_at_3_std": 0.023616,
70
+ "nauc_map_at_3_diff1": 0.370705,
71
+ "nauc_map_at_5_max": 0.215695,
72
+ "nauc_map_at_5_std": 0.026131,
73
+ "nauc_map_at_5_diff1": 0.376614,
74
+ "nauc_map_at_10_max": 0.214956,
75
+ "nauc_map_at_10_std": 0.035824,
76
+ "nauc_map_at_10_diff1": 0.382804,
77
+ "nauc_map_at_20_max": 0.220058,
78
+ "nauc_map_at_20_std": 0.046152,
79
+ "nauc_map_at_20_diff1": 0.382941,
80
+ "nauc_map_at_100_max": 0.220966,
81
+ "nauc_map_at_100_std": 0.055565,
82
+ "nauc_map_at_100_diff1": 0.380704,
83
+ "nauc_map_at_1000_max": 0.219475,
84
+ "nauc_map_at_1000_std": 0.05742,
85
+ "nauc_map_at_1000_diff1": 0.378371,
86
+ "nauc_recall_at_1_max": 0.214091,
87
+ "nauc_recall_at_1_std": -0.050987,
88
+ "nauc_recall_at_1_diff1": 0.431585,
89
+ "nauc_recall_at_3_max": 0.211035,
90
+ "nauc_recall_at_3_std": 0.089839,
91
+ "nauc_recall_at_3_diff1": 0.31531,
92
+ "nauc_recall_at_5_max": 0.20504,
93
+ "nauc_recall_at_5_std": 0.093634,
94
+ "nauc_recall_at_5_diff1": 0.328744,
95
+ "nauc_recall_at_10_max": 0.200175,
96
+ "nauc_recall_at_10_std": 0.140686,
97
+ "nauc_recall_at_10_diff1": 0.35188,
98
+ "nauc_recall_at_20_max": 0.221197,
99
+ "nauc_recall_at_20_std": 0.176191,
100
+ "nauc_recall_at_20_diff1": 0.359699,
101
+ "nauc_recall_at_100_max": 0.233904,
102
+ "nauc_recall_at_100_std": 0.293603,
103
+ "nauc_recall_at_100_diff1": 0.311297,
104
+ "nauc_recall_at_1000_max": 0.334517,
105
+ "nauc_recall_at_1000_std": 0.401122,
106
+ "nauc_recall_at_1000_diff1": 0.135304,
107
+ "nauc_precision_at_1_max": 0.205806,
108
+ "nauc_precision_at_1_std": -0.022055,
109
+ "nauc_precision_at_1_diff1": 0.41167,
110
+ "nauc_precision_at_3_max": 0.188473,
111
+ "nauc_precision_at_3_std": 0.131795,
112
+ "nauc_precision_at_3_diff1": 0.269462,
113
+ "nauc_precision_at_5_max": 0.172083,
114
+ "nauc_precision_at_5_std": 0.157546,
115
+ "nauc_precision_at_5_diff1": 0.255519,
116
+ "nauc_precision_at_10_max": 0.160468,
117
+ "nauc_precision_at_10_std": 0.234025,
118
+ "nauc_precision_at_10_diff1": 0.24435,
119
+ "nauc_precision_at_20_max": 0.160228,
120
+ "nauc_precision_at_20_std": 0.296571,
121
+ "nauc_precision_at_20_diff1": 0.180054,
122
+ "nauc_precision_at_100_max": 0.019627,
123
+ "nauc_precision_at_100_std": 0.305683,
124
+ "nauc_precision_at_100_diff1": -0.015081,
125
+ "nauc_precision_at_1000_max": -0.114629,
126
+ "nauc_precision_at_1000_std": 0.170659,
127
+ "nauc_precision_at_1000_diff1": -0.173047,
128
+ "nauc_mrr_at_1_max": 0.205806,
129
+ "nauc_mrr_at_1_std": -0.022055,
130
+ "nauc_mrr_at_1_diff1": 0.41167,
131
+ "nauc_mrr_at_3_max": 0.212607,
132
+ "nauc_mrr_at_3_std": 0.053805,
133
+ "nauc_mrr_at_3_diff1": 0.357305,
134
+ "nauc_mrr_at_5_max": 0.21078,
135
+ "nauc_mrr_at_5_std": 0.058757,
136
+ "nauc_mrr_at_5_diff1": 0.357453,
137
+ "nauc_mrr_at_10_max": 0.211923,
138
+ "nauc_mrr_at_10_std": 0.064362,
139
+ "nauc_mrr_at_10_diff1": 0.362376,
140
+ "nauc_mrr_at_20_max": 0.21316,
141
+ "nauc_mrr_at_20_std": 0.066068,
142
+ "nauc_mrr_at_20_diff1": 0.364287,
143
+ "nauc_mrr_at_100_max": 0.214309,
144
+ "nauc_mrr_at_100_std": 0.068451,
145
+ "nauc_mrr_at_100_diff1": 0.364175,
146
+ "nauc_mrr_at_1000_max": 0.214358,
147
+ "nauc_mrr_at_1000_std": 0.068295,
148
+ "nauc_mrr_at_1000_diff1": 0.364238,
149
+ "hit_rate_at_1": 0.21937,
150
+ "hit_rate_at_3": 0.34783,
151
+ "hit_rate_at_5": 0.38735,
152
+ "hit_rate_at_10": 0.47826,
153
+ "hit_rate_at_20": 0.55731,
154
+ "hit_rate_at_100": 0.75494,
155
+ "hit_rate_at_1000": 0.92885,
156
+ "main_score": 0.31327,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 188.55257272720337,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackWordpressRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "4ffe81d471b1924886b33c7567bfb200e9eec5c4",
3
+ "task_name": "CQADupstackWordpressRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.14048,
9
+ "ndcg_at_3": 0.19363,
10
+ "ndcg_at_5": 0.21368,
11
+ "ndcg_at_10": 0.23858,
12
+ "ndcg_at_20": 0.25819,
13
+ "ndcg_at_100": 0.29815,
14
+ "ndcg_at_1000": 0.32505,
15
+ "map_at_1": 0.13001,
16
+ "map_at_3": 0.17555,
17
+ "map_at_5": 0.1875,
18
+ "map_at_10": 0.19808,
19
+ "map_at_20": 0.20357,
20
+ "map_at_100": 0.20904,
21
+ "map_at_1000": 0.21005,
22
+ "recall_at_1": 0.13001,
23
+ "recall_at_3": 0.23031,
24
+ "recall_at_5": 0.27813,
25
+ "recall_at_10": 0.35228,
26
+ "recall_at_20": 0.42609,
27
+ "recall_at_100": 0.63368,
28
+ "recall_at_1000": 0.83963,
29
+ "accuracy": 0.13001,
30
+ "precision_at_1": 0.14048,
31
+ "precision_at_3": 0.08564,
32
+ "precision_at_5": 0.06322,
33
+ "precision_at_10": 0.04067,
34
+ "precision_at_20": 0.02514,
35
+ "precision_at_100": 0.00782,
36
+ "precision_at_1000": 0.00113,
37
+ "mrr_at_1": 0.140481,
38
+ "mrr_at_3": 0.189464,
39
+ "mrr_at_5": 0.200277,
40
+ "mrr_at_10": 0.210398,
41
+ "mrr_at_20": 0.215774,
42
+ "mrr_at_100": 0.221138,
43
+ "mrr_at_1000": 0.221818,
44
+ "nauc_ndcg_at_1_max": 0.115435,
45
+ "nauc_ndcg_at_1_std": -0.067595,
46
+ "nauc_ndcg_at_1_diff1": 0.442807,
47
+ "nauc_ndcg_at_3_max": 0.104581,
48
+ "nauc_ndcg_at_3_std": -0.032052,
49
+ "nauc_ndcg_at_3_diff1": 0.322978,
50
+ "nauc_ndcg_at_5_max": 0.099955,
51
+ "nauc_ndcg_at_5_std": -0.023424,
52
+ "nauc_ndcg_at_5_diff1": 0.304527,
53
+ "nauc_ndcg_at_10_max": 0.107684,
54
+ "nauc_ndcg_at_10_std": -0.020334,
55
+ "nauc_ndcg_at_10_diff1": 0.292103,
56
+ "nauc_ndcg_at_20_max": 0.104147,
57
+ "nauc_ndcg_at_20_std": -0.008628,
58
+ "nauc_ndcg_at_20_diff1": 0.279934,
59
+ "nauc_ndcg_at_100_max": 0.097113,
60
+ "nauc_ndcg_at_100_std": 0.027827,
61
+ "nauc_ndcg_at_100_diff1": 0.280887,
62
+ "nauc_ndcg_at_1000_max": 0.105455,
63
+ "nauc_ndcg_at_1000_std": 0.034688,
64
+ "nauc_ndcg_at_1000_diff1": 0.286275,
65
+ "nauc_map_at_1_max": 0.115885,
66
+ "nauc_map_at_1_std": -0.068062,
67
+ "nauc_map_at_1_diff1": 0.461963,
68
+ "nauc_map_at_3_max": 0.103883,
69
+ "nauc_map_at_3_std": -0.038976,
70
+ "nauc_map_at_3_diff1": 0.353757,
71
+ "nauc_map_at_5_max": 0.100721,
72
+ "nauc_map_at_5_std": -0.034099,
73
+ "nauc_map_at_5_diff1": 0.339688,
74
+ "nauc_map_at_10_max": 0.10739,
75
+ "nauc_map_at_10_std": -0.033386,
76
+ "nauc_map_at_10_diff1": 0.334644,
77
+ "nauc_map_at_20_max": 0.105904,
78
+ "nauc_map_at_20_std": -0.029572,
79
+ "nauc_map_at_20_diff1": 0.331057,
80
+ "nauc_map_at_100_max": 0.104304,
81
+ "nauc_map_at_100_std": -0.023991,
82
+ "nauc_map_at_100_diff1": 0.329825,
83
+ "nauc_map_at_1000_max": 0.104635,
84
+ "nauc_map_at_1000_std": -0.023596,
85
+ "nauc_map_at_1000_diff1": 0.329705,
86
+ "nauc_recall_at_1_max": 0.115885,
87
+ "nauc_recall_at_1_std": -0.068062,
88
+ "nauc_recall_at_1_diff1": 0.461963,
89
+ "nauc_recall_at_3_max": 0.087868,
90
+ "nauc_recall_at_3_std": -0.01842,
91
+ "nauc_recall_at_3_diff1": 0.257933,
92
+ "nauc_recall_at_5_max": 0.086799,
93
+ "nauc_recall_at_5_std": -0.005774,
94
+ "nauc_recall_at_5_diff1": 0.231168,
95
+ "nauc_recall_at_10_max": 0.097835,
96
+ "nauc_recall_at_10_std": 0.004477,
97
+ "nauc_recall_at_10_diff1": 0.200731,
98
+ "nauc_recall_at_20_max": 0.089292,
99
+ "nauc_recall_at_20_std": 0.035362,
100
+ "nauc_recall_at_20_diff1": 0.16365,
101
+ "nauc_recall_at_100_max": 0.048353,
102
+ "nauc_recall_at_100_std": 0.218318,
103
+ "nauc_recall_at_100_diff1": 0.156905,
104
+ "nauc_recall_at_1000_max": 0.105813,
105
+ "nauc_recall_at_1000_std": 0.45983,
106
+ "nauc_recall_at_1000_diff1": 0.159645,
107
+ "nauc_precision_at_1_max": 0.115435,
108
+ "nauc_precision_at_1_std": -0.067595,
109
+ "nauc_precision_at_1_diff1": 0.442807,
110
+ "nauc_precision_at_3_max": 0.11624,
111
+ "nauc_precision_at_3_std": -0.020353,
112
+ "nauc_precision_at_3_diff1": 0.232359,
113
+ "nauc_precision_at_5_max": 0.105391,
114
+ "nauc_precision_at_5_std": 0.011659,
115
+ "nauc_precision_at_5_diff1": 0.184504,
116
+ "nauc_precision_at_10_max": 0.138607,
117
+ "nauc_precision_at_10_std": 0.015981,
118
+ "nauc_precision_at_10_diff1": 0.152522,
119
+ "nauc_precision_at_20_max": 0.117224,
120
+ "nauc_precision_at_20_std": 0.067797,
121
+ "nauc_precision_at_20_diff1": 0.099646,
122
+ "nauc_precision_at_100_max": 0.068607,
123
+ "nauc_precision_at_100_std": 0.212263,
124
+ "nauc_precision_at_100_diff1": 0.010535,
125
+ "nauc_precision_at_1000_max": 0.027033,
126
+ "nauc_precision_at_1000_std": 0.24365,
127
+ "nauc_precision_at_1000_diff1": -0.134684,
128
+ "nauc_mrr_at_1_max": 0.115435,
129
+ "nauc_mrr_at_1_std": -0.067595,
130
+ "nauc_mrr_at_1_diff1": 0.442807,
131
+ "nauc_mrr_at_3_max": 0.111199,
132
+ "nauc_mrr_at_3_std": -0.036077,
133
+ "nauc_mrr_at_3_diff1": 0.338118,
134
+ "nauc_mrr_at_5_max": 0.111595,
135
+ "nauc_mrr_at_5_std": -0.028045,
136
+ "nauc_mrr_at_5_diff1": 0.328242,
137
+ "nauc_mrr_at_10_max": 0.11266,
138
+ "nauc_mrr_at_10_std": -0.027461,
139
+ "nauc_mrr_at_10_diff1": 0.321551,
140
+ "nauc_mrr_at_20_max": 0.11168,
141
+ "nauc_mrr_at_20_std": -0.023874,
142
+ "nauc_mrr_at_20_diff1": 0.31842,
143
+ "nauc_mrr_at_100_max": 0.11104,
144
+ "nauc_mrr_at_100_std": -0.019073,
145
+ "nauc_mrr_at_100_diff1": 0.319179,
146
+ "nauc_mrr_at_1000_max": 0.111201,
147
+ "nauc_mrr_at_1000_std": -0.019292,
148
+ "nauc_mrr_at_1000_diff1": 0.319309,
149
+ "hit_rate_at_1": 0.14048,
150
+ "hit_rate_at_3": 0.24954,
151
+ "hit_rate_at_5": 0.2976,
152
+ "hit_rate_at_10": 0.37523,
153
+ "hit_rate_at_20": 0.45471,
154
+ "hit_rate_at_100": 0.67283,
155
+ "hit_rate_at_1000": 0.86137,
156
+ "main_score": 0.23858,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 521.0858821868896,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/ClimateFEVER.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "47f2ac6acb640fc46020b02a5b59fdda04d39380",
3
+ "task_name": "ClimateFEVER",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.25733,
9
+ "ndcg_at_3": 0.23106,
10
+ "ndcg_at_5": 0.24946,
11
+ "ndcg_at_10": 0.28514,
12
+ "ndcg_at_20": 0.30884,
13
+ "ndcg_at_100": 0.34748,
14
+ "ndcg_at_1000": 0.37958,
15
+ "map_at_1": 0.10913,
16
+ "map_at_3": 0.1651,
17
+ "map_at_5": 0.18323,
18
+ "map_at_10": 0.20082,
19
+ "map_at_20": 0.20926,
20
+ "map_at_100": 0.21658,
21
+ "map_at_1000": 0.21827,
22
+ "recall_at_1": 0.10913,
23
+ "recall_at_3": 0.21324,
24
+ "recall_at_5": 0.26821,
25
+ "recall_at_10": 0.34938,
26
+ "recall_at_20": 0.41724,
27
+ "recall_at_100": 0.56403,
28
+ "recall_at_1000": 0.74556,
29
+ "accuracy": 0.10913,
30
+ "precision_at_1": 0.25733,
31
+ "precision_at_3": 0.17633,
32
+ "precision_at_5": 0.1355,
33
+ "precision_at_10": 0.09055,
34
+ "precision_at_20": 0.05518,
35
+ "precision_at_100": 0.01577,
36
+ "precision_at_1000": 0.00217,
37
+ "mrr_at_1": 0.257329,
38
+ "mrr_at_3": 0.34658,
39
+ "mrr_at_5": 0.365375,
40
+ "mrr_at_10": 0.378489,
41
+ "mrr_at_20": 0.383489,
42
+ "mrr_at_100": 0.386851,
43
+ "mrr_at_1000": 0.387302,
44
+ "nauc_ndcg_at_1_max": 0.360605,
45
+ "nauc_ndcg_at_1_std": 0.288479,
46
+ "nauc_ndcg_at_1_diff1": 0.226486,
47
+ "nauc_ndcg_at_3_max": 0.36133,
48
+ "nauc_ndcg_at_3_std": 0.276617,
49
+ "nauc_ndcg_at_3_diff1": 0.155586,
50
+ "nauc_ndcg_at_5_max": 0.377214,
51
+ "nauc_ndcg_at_5_std": 0.305233,
52
+ "nauc_ndcg_at_5_diff1": 0.146417,
53
+ "nauc_ndcg_at_10_max": 0.384467,
54
+ "nauc_ndcg_at_10_std": 0.330077,
55
+ "nauc_ndcg_at_10_diff1": 0.141628,
56
+ "nauc_ndcg_at_20_max": 0.389584,
57
+ "nauc_ndcg_at_20_std": 0.346458,
58
+ "nauc_ndcg_at_20_diff1": 0.134582,
59
+ "nauc_ndcg_at_100_max": 0.399907,
60
+ "nauc_ndcg_at_100_std": 0.363444,
61
+ "nauc_ndcg_at_100_diff1": 0.137893,
62
+ "nauc_ndcg_at_1000_max": 0.403099,
63
+ "nauc_ndcg_at_1000_std": 0.368894,
64
+ "nauc_ndcg_at_1000_diff1": 0.139714,
65
+ "nauc_map_at_1_max": 0.341173,
66
+ "nauc_map_at_1_std": 0.236647,
67
+ "nauc_map_at_1_diff1": 0.217515,
68
+ "nauc_map_at_3_max": 0.35406,
69
+ "nauc_map_at_3_std": 0.255367,
70
+ "nauc_map_at_3_diff1": 0.160133,
71
+ "nauc_map_at_5_max": 0.364905,
72
+ "nauc_map_at_5_std": 0.278874,
73
+ "nauc_map_at_5_diff1": 0.156267,
74
+ "nauc_map_at_10_max": 0.371907,
75
+ "nauc_map_at_10_std": 0.299761,
76
+ "nauc_map_at_10_diff1": 0.15101,
77
+ "nauc_map_at_20_max": 0.374329,
78
+ "nauc_map_at_20_std": 0.307949,
79
+ "nauc_map_at_20_diff1": 0.148187,
80
+ "nauc_map_at_100_max": 0.377414,
81
+ "nauc_map_at_100_std": 0.313176,
82
+ "nauc_map_at_100_diff1": 0.148796,
83
+ "nauc_map_at_1000_max": 0.377881,
84
+ "nauc_map_at_1000_std": 0.313881,
85
+ "nauc_map_at_1000_diff1": 0.148826,
86
+ "nauc_recall_at_1_max": 0.341173,
87
+ "nauc_recall_at_1_std": 0.236647,
88
+ "nauc_recall_at_1_diff1": 0.217515,
89
+ "nauc_recall_at_3_max": 0.325457,
90
+ "nauc_recall_at_3_std": 0.235244,
91
+ "nauc_recall_at_3_diff1": 0.108169,
92
+ "nauc_recall_at_5_max": 0.337877,
93
+ "nauc_recall_at_5_std": 0.276172,
94
+ "nauc_recall_at_5_diff1": 0.101689,
95
+ "nauc_recall_at_10_max": 0.329629,
96
+ "nauc_recall_at_10_std": 0.301548,
97
+ "nauc_recall_at_10_diff1": 0.091553,
98
+ "nauc_recall_at_20_max": 0.324971,
99
+ "nauc_recall_at_20_std": 0.323171,
100
+ "nauc_recall_at_20_diff1": 0.067873,
101
+ "nauc_recall_at_100_max": 0.329571,
102
+ "nauc_recall_at_100_std": 0.352092,
103
+ "nauc_recall_at_100_diff1": 0.069738,
104
+ "nauc_recall_at_1000_max": 0.333373,
105
+ "nauc_recall_at_1000_std": 0.392714,
106
+ "nauc_recall_at_1000_diff1": 0.067146,
107
+ "nauc_precision_at_1_max": 0.360605,
108
+ "nauc_precision_at_1_std": 0.288479,
109
+ "nauc_precision_at_1_diff1": 0.226486,
110
+ "nauc_precision_at_3_max": 0.366436,
111
+ "nauc_precision_at_3_std": 0.325001,
112
+ "nauc_precision_at_3_diff1": 0.119797,
113
+ "nauc_precision_at_5_max": 0.366645,
114
+ "nauc_precision_at_5_std": 0.364172,
115
+ "nauc_precision_at_5_diff1": 0.097449,
116
+ "nauc_precision_at_10_max": 0.341723,
117
+ "nauc_precision_at_10_std": 0.382059,
118
+ "nauc_precision_at_10_diff1": 0.072077,
119
+ "nauc_precision_at_20_max": 0.321855,
120
+ "nauc_precision_at_20_std": 0.391539,
121
+ "nauc_precision_at_20_diff1": 0.04599,
122
+ "nauc_precision_at_100_max": 0.265359,
123
+ "nauc_precision_at_100_std": 0.357649,
124
+ "nauc_precision_at_100_diff1": 0.02463,
125
+ "nauc_precision_at_1000_max": 0.183648,
126
+ "nauc_precision_at_1000_std": 0.288102,
127
+ "nauc_precision_at_1000_diff1": -0.005633,
128
+ "nauc_mrr_at_1_max": 0.360605,
129
+ "nauc_mrr_at_1_std": 0.288479,
130
+ "nauc_mrr_at_1_diff1": 0.226486,
131
+ "nauc_mrr_at_3_max": 0.377359,
132
+ "nauc_mrr_at_3_std": 0.307539,
133
+ "nauc_mrr_at_3_diff1": 0.172497,
134
+ "nauc_mrr_at_5_max": 0.387292,
135
+ "nauc_mrr_at_5_std": 0.323401,
136
+ "nauc_mrr_at_5_diff1": 0.171436,
137
+ "nauc_mrr_at_10_max": 0.385426,
138
+ "nauc_mrr_at_10_std": 0.323793,
139
+ "nauc_mrr_at_10_diff1": 0.171854,
140
+ "nauc_mrr_at_20_max": 0.385513,
141
+ "nauc_mrr_at_20_std": 0.324528,
142
+ "nauc_mrr_at_20_diff1": 0.171713,
143
+ "nauc_mrr_at_100_max": 0.386086,
144
+ "nauc_mrr_at_100_std": 0.324761,
145
+ "nauc_mrr_at_100_diff1": 0.172499,
146
+ "nauc_mrr_at_1000_max": 0.38589,
147
+ "nauc_mrr_at_1000_std": 0.324467,
148
+ "nauc_mrr_at_1000_diff1": 0.17253,
149
+ "hit_rate_at_1": 0.25733,
150
+ "hit_rate_at_3": 0.45928,
151
+ "hit_rate_at_5": 0.54137,
152
+ "hit_rate_at_10": 0.63779,
153
+ "hit_rate_at_20": 0.70814,
154
+ "hit_rate_at_100": 0.83518,
155
+ "hit_rate_at_1000": 0.93225,
156
+ "main_score": 0.28514,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 52955.644976615906,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/DBPedia.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "c0f706b76e590d620bd6618b3ca8efdd34e2d659",
3
+ "task_name": "DBPedia",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.52375,
9
+ "ndcg_at_3": 0.42295,
10
+ "ndcg_at_5": 0.39132,
11
+ "ndcg_at_10": 0.36321,
12
+ "ndcg_at_20": 0.35645,
13
+ "ndcg_at_100": 0.39374,
14
+ "ndcg_at_1000": 0.46294,
15
+ "map_at_1": 0.08318,
16
+ "map_at_3": 0.12425,
17
+ "map_at_5": 0.14232,
18
+ "map_at_10": 0.16492,
19
+ "map_at_20": 0.18852,
20
+ "map_at_100": 0.22322,
21
+ "map_at_1000": 0.23796,
22
+ "recall_at_1": 0.08318,
23
+ "recall_at_3": 0.13521,
24
+ "recall_at_5": 0.16385,
25
+ "recall_at_10": 0.20657,
26
+ "recall_at_20": 0.2686,
27
+ "recall_at_100": 0.42316,
28
+ "recall_at_1000": 0.6542,
29
+ "accuracy": 0.08318,
30
+ "precision_at_1": 0.64,
31
+ "precision_at_3": 0.45,
32
+ "precision_at_5": 0.374,
33
+ "precision_at_10": 0.28025,
34
+ "precision_at_20": 0.212,
35
+ "precision_at_100": 0.08922,
36
+ "precision_at_1000": 0.01898,
37
+ "mrr_at_1": 0.64,
38
+ "mrr_at_3": 0.702917,
39
+ "mrr_at_5": 0.710792,
40
+ "mrr_at_10": 0.715509,
41
+ "mrr_at_20": 0.718732,
42
+ "mrr_at_100": 0.719758,
43
+ "mrr_at_1000": 0.719917,
44
+ "nauc_ndcg_at_1_max": 0.396345,
45
+ "nauc_ndcg_at_1_std": 0.215978,
46
+ "nauc_ndcg_at_1_diff1": 0.418898,
47
+ "nauc_ndcg_at_3_max": 0.402449,
48
+ "nauc_ndcg_at_3_std": 0.291379,
49
+ "nauc_ndcg_at_3_diff1": 0.328256,
50
+ "nauc_ndcg_at_5_max": 0.419725,
51
+ "nauc_ndcg_at_5_std": 0.313651,
52
+ "nauc_ndcg_at_5_diff1": 0.303062,
53
+ "nauc_ndcg_at_10_max": 0.407395,
54
+ "nauc_ndcg_at_10_std": 0.32899,
55
+ "nauc_ndcg_at_10_diff1": 0.297736,
56
+ "nauc_ndcg_at_20_max": 0.397298,
57
+ "nauc_ndcg_at_20_std": 0.330641,
58
+ "nauc_ndcg_at_20_diff1": 0.307387,
59
+ "nauc_ndcg_at_100_max": 0.429779,
60
+ "nauc_ndcg_at_100_std": 0.394533,
61
+ "nauc_ndcg_at_100_diff1": 0.299408,
62
+ "nauc_ndcg_at_1000_max": 0.467477,
63
+ "nauc_ndcg_at_1000_std": 0.458531,
64
+ "nauc_ndcg_at_1000_diff1": 0.270737,
65
+ "nauc_map_at_1_max": 0.074408,
66
+ "nauc_map_at_1_std": -0.081985,
67
+ "nauc_map_at_1_diff1": 0.534275,
68
+ "nauc_map_at_3_max": 0.110116,
69
+ "nauc_map_at_3_std": -0.027696,
70
+ "nauc_map_at_3_diff1": 0.475113,
71
+ "nauc_map_at_5_max": 0.150132,
72
+ "nauc_map_at_5_std": 0.024017,
73
+ "nauc_map_at_5_diff1": 0.425374,
74
+ "nauc_map_at_10_max": 0.191633,
75
+ "nauc_map_at_10_std": 0.103971,
76
+ "nauc_map_at_10_diff1": 0.373191,
77
+ "nauc_map_at_20_max": 0.244923,
78
+ "nauc_map_at_20_std": 0.183035,
79
+ "nauc_map_at_20_diff1": 0.335678,
80
+ "nauc_map_at_100_max": 0.31194,
81
+ "nauc_map_at_100_std": 0.307782,
82
+ "nauc_map_at_100_diff1": 0.2882,
83
+ "nauc_map_at_1000_max": 0.325713,
84
+ "nauc_map_at_1000_std": 0.335612,
85
+ "nauc_map_at_1000_diff1": 0.272904,
86
+ "nauc_recall_at_1_max": 0.074408,
87
+ "nauc_recall_at_1_std": -0.081985,
88
+ "nauc_recall_at_1_diff1": 0.534275,
89
+ "nauc_recall_at_3_max": 0.083943,
90
+ "nauc_recall_at_3_std": -0.024473,
91
+ "nauc_recall_at_3_diff1": 0.433208,
92
+ "nauc_recall_at_5_max": 0.120595,
93
+ "nauc_recall_at_5_std": 0.025828,
94
+ "nauc_recall_at_5_diff1": 0.356409,
95
+ "nauc_recall_at_10_max": 0.180242,
96
+ "nauc_recall_at_10_std": 0.12402,
97
+ "nauc_recall_at_10_diff1": 0.298175,
98
+ "nauc_recall_at_20_max": 0.229174,
99
+ "nauc_recall_at_20_std": 0.201622,
100
+ "nauc_recall_at_20_diff1": 0.246315,
101
+ "nauc_recall_at_100_max": 0.360152,
102
+ "nauc_recall_at_100_std": 0.427468,
103
+ "nauc_recall_at_100_diff1": 0.181991,
104
+ "nauc_recall_at_1000_max": 0.381268,
105
+ "nauc_recall_at_1000_std": 0.519384,
106
+ "nauc_recall_at_1000_diff1": 0.083461,
107
+ "nauc_precision_at_1_max": 0.518202,
108
+ "nauc_precision_at_1_std": 0.270924,
109
+ "nauc_precision_at_1_diff1": 0.465657,
110
+ "nauc_precision_at_3_max": 0.438625,
111
+ "nauc_precision_at_3_std": 0.38575,
112
+ "nauc_precision_at_3_diff1": 0.113674,
113
+ "nauc_precision_at_5_max": 0.446871,
114
+ "nauc_precision_at_5_std": 0.439663,
115
+ "nauc_precision_at_5_diff1": -0.007433,
116
+ "nauc_precision_at_10_max": 0.421686,
117
+ "nauc_precision_at_10_std": 0.495747,
118
+ "nauc_precision_at_10_diff1": -0.091661,
119
+ "nauc_precision_at_20_max": 0.408199,
120
+ "nauc_precision_at_20_std": 0.511673,
121
+ "nauc_precision_at_20_diff1": -0.133374,
122
+ "nauc_precision_at_100_max": 0.307229,
123
+ "nauc_precision_at_100_std": 0.479434,
124
+ "nauc_precision_at_100_diff1": -0.178404,
125
+ "nauc_precision_at_1000_max": 0.075643,
126
+ "nauc_precision_at_1000_std": 0.167266,
127
+ "nauc_precision_at_1000_diff1": -0.252182,
128
+ "nauc_mrr_at_1_max": 0.518202,
129
+ "nauc_mrr_at_1_std": 0.270924,
130
+ "nauc_mrr_at_1_diff1": 0.465657,
131
+ "nauc_mrr_at_3_max": 0.558503,
132
+ "nauc_mrr_at_3_std": 0.343433,
133
+ "nauc_mrr_at_3_diff1": 0.445504,
134
+ "nauc_mrr_at_5_max": 0.568679,
135
+ "nauc_mrr_at_5_std": 0.346367,
136
+ "nauc_mrr_at_5_diff1": 0.444988,
137
+ "nauc_mrr_at_10_max": 0.569613,
138
+ "nauc_mrr_at_10_std": 0.349981,
139
+ "nauc_mrr_at_10_diff1": 0.44308,
140
+ "nauc_mrr_at_20_max": 0.567014,
141
+ "nauc_mrr_at_20_std": 0.346014,
142
+ "nauc_mrr_at_20_diff1": 0.441504,
143
+ "nauc_mrr_at_100_max": 0.567072,
144
+ "nauc_mrr_at_100_std": 0.346064,
145
+ "nauc_mrr_at_100_diff1": 0.441791,
146
+ "nauc_mrr_at_1000_max": 0.56689,
147
+ "nauc_mrr_at_1000_std": 0.345785,
148
+ "nauc_mrr_at_1000_diff1": 0.441806,
149
+ "hit_rate_at_1": 0.64,
150
+ "hit_rate_at_3": 0.78,
151
+ "hit_rate_at_5": 0.815,
152
+ "hit_rate_at_10": 0.85,
153
+ "hit_rate_at_20": 0.895,
154
+ "hit_rate_at_100": 0.94,
155
+ "hit_rate_at_1000": 0.9775,
156
+ "main_score": 0.36321,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 28168.432690382004,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/EmotionClassification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "4f58c6b202a23cf9a4da393831edf4f9183cad37",
3
+ "task_name": "EmotionClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.5285,
11
+ "f1": 0.459817,
12
+ "f1_weighted": 0.548466,
13
+ "precision": 0.458764,
14
+ "precision_weighted": 0.596847,
15
+ "recall": 0.509035,
16
+ "recall_weighted": 0.5285,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.4505,
22
+ "f1": 0.394824,
23
+ "f1_weighted": 0.459863,
24
+ "precision": 0.404337,
25
+ "precision_weighted": 0.535639,
26
+ "recall": 0.45209,
27
+ "recall_weighted": 0.4505,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.4425,
33
+ "f1": 0.39019,
34
+ "f1_weighted": 0.468485,
35
+ "precision": 0.39696,
36
+ "precision_weighted": 0.529667,
37
+ "recall": 0.437328,
38
+ "recall_weighted": 0.4425,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.4925,
44
+ "f1": 0.432607,
45
+ "f1_weighted": 0.516797,
46
+ "precision": 0.441713,
47
+ "precision_weighted": 0.573401,
48
+ "recall": 0.487958,
49
+ "recall_weighted": 0.4925,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.4735,
55
+ "f1": 0.430306,
56
+ "f1_weighted": 0.497706,
57
+ "precision": 0.441574,
58
+ "precision_weighted": 0.586105,
59
+ "recall": 0.489526,
60
+ "recall_weighted": 0.4735,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.506,
66
+ "f1": 0.433923,
67
+ "f1_weighted": 0.524617,
68
+ "precision": 0.430331,
69
+ "precision_weighted": 0.569599,
70
+ "recall": 0.476008,
71
+ "recall_weighted": 0.506,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.48,
77
+ "f1": 0.418001,
78
+ "f1_weighted": 0.499944,
79
+ "precision": 0.423246,
80
+ "precision_weighted": 0.560687,
81
+ "recall": 0.469641,
82
+ "recall_weighted": 0.48,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.434,
88
+ "f1": 0.394441,
89
+ "f1_weighted": 0.451256,
90
+ "precision": 0.397807,
91
+ "precision_weighted": 0.522779,
92
+ "recall": 0.457403,
93
+ "recall_weighted": 0.434,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.494,
99
+ "f1": 0.449458,
100
+ "f1_weighted": 0.51762,
101
+ "precision": 0.453661,
102
+ "precision_weighted": 0.585973,
103
+ "recall": 0.504015,
104
+ "recall_weighted": 0.494,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.4675,
110
+ "f1": 0.410275,
111
+ "f1_weighted": 0.491923,
112
+ "precision": 0.415913,
113
+ "precision_weighted": 0.561735,
114
+ "recall": 0.460759,
115
+ "recall_weighted": 0.4675,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.4769,
121
+ "f1": 0.421384,
122
+ "f1_weighted": 0.497668,
123
+ "precision": 0.426431,
124
+ "precision_weighted": 0.562243,
125
+ "recall": 0.474376,
126
+ "recall_weighted": 0.4769,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.4769,
130
+ "hf_subset": "default",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 32.58931565284729,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/FEVER.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "bea83ef9e8fb933d90a2f1d5515737465d613e12",
3
+ "task_name": "FEVER",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.46925,
9
+ "ndcg_at_3": 0.55071,
10
+ "ndcg_at_5": 0.57948,
11
+ "ndcg_at_10": 0.60273,
12
+ "ndcg_at_20": 0.61745,
13
+ "ndcg_at_100": 0.6322,
14
+ "ndcg_at_1000": 0.63999,
15
+ "map_at_1": 0.43338,
16
+ "map_at_3": 0.51533,
17
+ "map_at_5": 0.53199,
18
+ "map_at_10": 0.54191,
19
+ "map_at_20": 0.54617,
20
+ "map_at_100": 0.54833,
21
+ "map_at_1000": 0.54866,
22
+ "recall_at_1": 0.43338,
23
+ "recall_at_3": 0.61345,
24
+ "recall_at_5": 0.68416,
25
+ "recall_at_10": 0.75497,
26
+ "recall_at_20": 0.81112,
27
+ "recall_at_100": 0.88743,
28
+ "recall_at_1000": 0.94505,
29
+ "accuracy": 0.43338,
30
+ "precision_at_1": 0.46925,
31
+ "precision_at_3": 0.22262,
32
+ "precision_at_5": 0.14902,
33
+ "precision_at_10": 0.08254,
34
+ "precision_at_20": 0.04449,
35
+ "precision_at_100": 0.0098,
36
+ "precision_at_1000": 0.00106,
37
+ "mrr_at_1": 0.469247,
38
+ "mrr_at_3": 0.554305,
39
+ "mrr_at_5": 0.571287,
40
+ "mrr_at_10": 0.581008,
41
+ "mrr_at_20": 0.584916,
42
+ "mrr_at_100": 0.586831,
43
+ "mrr_at_1000": 0.587075,
44
+ "nauc_ndcg_at_1_max": 0.307935,
45
+ "nauc_ndcg_at_1_std": -0.044565,
46
+ "nauc_ndcg_at_1_diff1": 0.498164,
47
+ "nauc_ndcg_at_3_max": 0.32875,
48
+ "nauc_ndcg_at_3_std": -0.00628,
49
+ "nauc_ndcg_at_3_diff1": 0.42294,
50
+ "nauc_ndcg_at_5_max": 0.333097,
51
+ "nauc_ndcg_at_5_std": 0.001191,
52
+ "nauc_ndcg_at_5_diff1": 0.426939,
53
+ "nauc_ndcg_at_10_max": 0.337506,
54
+ "nauc_ndcg_at_10_std": 0.013112,
55
+ "nauc_ndcg_at_10_diff1": 0.427317,
56
+ "nauc_ndcg_at_20_max": 0.343402,
57
+ "nauc_ndcg_at_20_std": 0.021344,
58
+ "nauc_ndcg_at_20_diff1": 0.430682,
59
+ "nauc_ndcg_at_100_max": 0.340458,
60
+ "nauc_ndcg_at_100_std": 0.026275,
61
+ "nauc_ndcg_at_100_diff1": 0.430694,
62
+ "nauc_ndcg_at_1000_max": 0.337595,
63
+ "nauc_ndcg_at_1000_std": 0.020189,
64
+ "nauc_ndcg_at_1000_diff1": 0.4305,
65
+ "nauc_map_at_1_max": 0.281822,
66
+ "nauc_map_at_1_std": -0.038833,
67
+ "nauc_map_at_1_diff1": 0.462555,
68
+ "nauc_map_at_3_max": 0.311792,
69
+ "nauc_map_at_3_std": -0.014057,
70
+ "nauc_map_at_3_diff1": 0.426413,
71
+ "nauc_map_at_5_max": 0.315029,
72
+ "nauc_map_at_5_std": -0.010006,
73
+ "nauc_map_at_5_diff1": 0.429154,
74
+ "nauc_map_at_10_max": 0.316792,
75
+ "nauc_map_at_10_std": -0.005429,
76
+ "nauc_map_at_10_diff1": 0.429398,
77
+ "nauc_map_at_20_max": 0.318321,
78
+ "nauc_map_at_20_std": -0.00355,
79
+ "nauc_map_at_20_diff1": 0.430533,
80
+ "nauc_map_at_100_max": 0.318088,
81
+ "nauc_map_at_100_std": -0.002679,
82
+ "nauc_map_at_100_diff1": 0.430622,
83
+ "nauc_map_at_1000_max": 0.318016,
84
+ "nauc_map_at_1000_std": -0.002829,
85
+ "nauc_map_at_1000_diff1": 0.430619,
86
+ "nauc_recall_at_1_max": 0.281822,
87
+ "nauc_recall_at_1_std": -0.038833,
88
+ "nauc_recall_at_1_diff1": 0.462555,
89
+ "nauc_recall_at_3_max": 0.337929,
90
+ "nauc_recall_at_3_std": 0.024712,
91
+ "nauc_recall_at_3_diff1": 0.36574,
92
+ "nauc_recall_at_5_max": 0.348748,
93
+ "nauc_recall_at_5_std": 0.048767,
94
+ "nauc_recall_at_5_diff1": 0.363043,
95
+ "nauc_recall_at_10_max": 0.366064,
96
+ "nauc_recall_at_10_std": 0.105414,
97
+ "nauc_recall_at_10_diff1": 0.347255,
98
+ "nauc_recall_at_20_max": 0.401989,
99
+ "nauc_recall_at_20_std": 0.177834,
100
+ "nauc_recall_at_20_diff1": 0.340058,
101
+ "nauc_recall_at_100_max": 0.38542,
102
+ "nauc_recall_at_100_std": 0.316377,
103
+ "nauc_recall_at_100_diff1": 0.276212,
104
+ "nauc_recall_at_1000_max": 0.318521,
105
+ "nauc_recall_at_1000_std": 0.366042,
106
+ "nauc_recall_at_1000_diff1": 0.126058,
107
+ "nauc_precision_at_1_max": 0.307935,
108
+ "nauc_precision_at_1_std": -0.044565,
109
+ "nauc_precision_at_1_diff1": 0.498164,
110
+ "nauc_precision_at_3_max": 0.384549,
111
+ "nauc_precision_at_3_std": 0.024043,
112
+ "nauc_precision_at_3_diff1": 0.398976,
113
+ "nauc_precision_at_5_max": 0.406047,
114
+ "nauc_precision_at_5_std": 0.051316,
115
+ "nauc_precision_at_5_diff1": 0.401622,
116
+ "nauc_precision_at_10_max": 0.424385,
117
+ "nauc_precision_at_10_std": 0.116081,
118
+ "nauc_precision_at_10_diff1": 0.367981,
119
+ "nauc_precision_at_20_max": 0.451158,
120
+ "nauc_precision_at_20_std": 0.189419,
121
+ "nauc_precision_at_20_diff1": 0.339756,
122
+ "nauc_precision_at_100_max": 0.37159,
123
+ "nauc_precision_at_100_std": 0.289856,
124
+ "nauc_precision_at_100_diff1": 0.21004,
125
+ "nauc_precision_at_1000_max": 0.218426,
126
+ "nauc_precision_at_1000_std": 0.204942,
127
+ "nauc_precision_at_1000_diff1": 0.024451,
128
+ "nauc_mrr_at_1_max": 0.307935,
129
+ "nauc_mrr_at_1_std": -0.044565,
130
+ "nauc_mrr_at_1_diff1": 0.498164,
131
+ "nauc_mrr_at_3_max": 0.33891,
132
+ "nauc_mrr_at_3_std": -0.021385,
133
+ "nauc_mrr_at_3_diff1": 0.459468,
134
+ "nauc_mrr_at_5_max": 0.34134,
135
+ "nauc_mrr_at_5_std": -0.018379,
136
+ "nauc_mrr_at_5_diff1": 0.462922,
137
+ "nauc_mrr_at_10_max": 0.342712,
138
+ "nauc_mrr_at_10_std": -0.015148,
139
+ "nauc_mrr_at_10_diff1": 0.464056,
140
+ "nauc_mrr_at_20_max": 0.343729,
141
+ "nauc_mrr_at_20_std": -0.013733,
142
+ "nauc_mrr_at_20_diff1": 0.464887,
143
+ "nauc_mrr_at_100_max": 0.343275,
144
+ "nauc_mrr_at_100_std": -0.013219,
145
+ "nauc_mrr_at_100_diff1": 0.46497,
146
+ "nauc_mrr_at_1000_max": 0.34312,
147
+ "nauc_mrr_at_1000_std": -0.013459,
148
+ "nauc_mrr_at_1000_diff1": 0.464961,
149
+ "hit_rate_at_1": 0.46925,
150
+ "hit_rate_at_3": 0.65962,
151
+ "hit_rate_at_5": 0.73342,
152
+ "hit_rate_at_10": 0.80588,
153
+ "hit_rate_at_20": 0.86124,
154
+ "hit_rate_at_100": 0.93474,
155
+ "hit_rate_at_1000": 0.9838,
156
+ "main_score": 0.60273,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 33480.48275399208,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/FiQA2018.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "27a168819829fe9bcd655c2df245fb19452e8e06",
3
+ "task_name": "FiQA2018",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.31944,
9
+ "ndcg_at_3": 0.2865,
10
+ "ndcg_at_5": 0.30158,
11
+ "ndcg_at_10": 0.32593,
12
+ "ndcg_at_20": 0.35214,
13
+ "ndcg_at_100": 0.39475,
14
+ "ndcg_at_1000": 0.43158,
15
+ "map_at_1": 0.15915,
16
+ "map_at_3": 0.21783,
17
+ "map_at_5": 0.23849,
18
+ "map_at_10": 0.25479,
19
+ "map_at_20": 0.26427,
20
+ "map_at_100": 0.27227,
21
+ "map_at_1000": 0.2744,
22
+ "recall_at_1": 0.15915,
23
+ "recall_at_3": 0.257,
24
+ "recall_at_5": 0.31698,
25
+ "recall_at_10": 0.3926,
26
+ "recall_at_20": 0.47247,
27
+ "recall_at_100": 0.64958,
28
+ "recall_at_1000": 0.8714,
29
+ "accuracy": 0.15915,
30
+ "precision_at_1": 0.31944,
31
+ "precision_at_3": 0.18981,
32
+ "precision_at_5": 0.14537,
33
+ "precision_at_10": 0.09198,
34
+ "precision_at_20": 0.05702,
35
+ "precision_at_100": 0.01623,
36
+ "precision_at_1000": 0.00226,
37
+ "mrr_at_1": 0.319444,
38
+ "mrr_at_3": 0.3732,
39
+ "mrr_at_5": 0.390329,
40
+ "mrr_at_10": 0.400348,
41
+ "mrr_at_20": 0.405828,
42
+ "mrr_at_100": 0.409798,
43
+ "mrr_at_1000": 0.410371,
44
+ "nauc_ndcg_at_1_max": 0.345238,
45
+ "nauc_ndcg_at_1_std": 0.007052,
46
+ "nauc_ndcg_at_1_diff1": 0.500574,
47
+ "nauc_ndcg_at_3_max": 0.32813,
48
+ "nauc_ndcg_at_3_std": 0.011587,
49
+ "nauc_ndcg_at_3_diff1": 0.426475,
50
+ "nauc_ndcg_at_5_max": 0.321198,
51
+ "nauc_ndcg_at_5_std": 0.024825,
52
+ "nauc_ndcg_at_5_diff1": 0.42222,
53
+ "nauc_ndcg_at_10_max": 0.327739,
54
+ "nauc_ndcg_at_10_std": 0.044576,
55
+ "nauc_ndcg_at_10_diff1": 0.428329,
56
+ "nauc_ndcg_at_20_max": 0.338695,
57
+ "nauc_ndcg_at_20_std": 0.057883,
58
+ "nauc_ndcg_at_20_diff1": 0.438937,
59
+ "nauc_ndcg_at_100_max": 0.346525,
60
+ "nauc_ndcg_at_100_std": 0.061508,
61
+ "nauc_ndcg_at_100_diff1": 0.430515,
62
+ "nauc_ndcg_at_1000_max": 0.363556,
63
+ "nauc_ndcg_at_1000_std": 0.081105,
64
+ "nauc_ndcg_at_1000_diff1": 0.43335,
65
+ "nauc_map_at_1_max": 0.279773,
66
+ "nauc_map_at_1_std": -0.010044,
67
+ "nauc_map_at_1_diff1": 0.466738,
68
+ "nauc_map_at_3_max": 0.287616,
69
+ "nauc_map_at_3_std": 0.002155,
70
+ "nauc_map_at_3_diff1": 0.411578,
71
+ "nauc_map_at_5_max": 0.301459,
72
+ "nauc_map_at_5_std": 0.01419,
73
+ "nauc_map_at_5_diff1": 0.411602,
74
+ "nauc_map_at_10_max": 0.311257,
75
+ "nauc_map_at_10_std": 0.027047,
76
+ "nauc_map_at_10_diff1": 0.415549,
77
+ "nauc_map_at_20_max": 0.318525,
78
+ "nauc_map_at_20_std": 0.034326,
79
+ "nauc_map_at_20_diff1": 0.419144,
80
+ "nauc_map_at_100_max": 0.321806,
81
+ "nauc_map_at_100_std": 0.035403,
82
+ "nauc_map_at_100_diff1": 0.418287,
83
+ "nauc_map_at_1000_max": 0.323425,
84
+ "nauc_map_at_1000_std": 0.037161,
85
+ "nauc_map_at_1000_diff1": 0.418893,
86
+ "nauc_recall_at_1_max": 0.279773,
87
+ "nauc_recall_at_1_std": -0.010044,
88
+ "nauc_recall_at_1_diff1": 0.466738,
89
+ "nauc_recall_at_3_max": 0.259396,
90
+ "nauc_recall_at_3_std": 0.003882,
91
+ "nauc_recall_at_3_diff1": 0.363897,
92
+ "nauc_recall_at_5_max": 0.251172,
93
+ "nauc_recall_at_5_std": 0.021787,
94
+ "nauc_recall_at_5_diff1": 0.351891,
95
+ "nauc_recall_at_10_max": 0.269328,
96
+ "nauc_recall_at_10_std": 0.068258,
97
+ "nauc_recall_at_10_diff1": 0.348113,
98
+ "nauc_recall_at_20_max": 0.289797,
99
+ "nauc_recall_at_20_std": 0.103081,
100
+ "nauc_recall_at_20_diff1": 0.372974,
101
+ "nauc_recall_at_100_max": 0.27662,
102
+ "nauc_recall_at_100_std": 0.11291,
103
+ "nauc_recall_at_100_diff1": 0.306131,
104
+ "nauc_recall_at_1000_max": 0.440731,
105
+ "nauc_recall_at_1000_std": 0.398616,
106
+ "nauc_recall_at_1000_diff1": 0.293247,
107
+ "nauc_precision_at_1_max": 0.345238,
108
+ "nauc_precision_at_1_std": 0.007052,
109
+ "nauc_precision_at_1_diff1": 0.500574,
110
+ "nauc_precision_at_3_max": 0.332331,
111
+ "nauc_precision_at_3_std": 0.027053,
112
+ "nauc_precision_at_3_diff1": 0.348633,
113
+ "nauc_precision_at_5_max": 0.360081,
114
+ "nauc_precision_at_5_std": 0.083072,
115
+ "nauc_precision_at_5_diff1": 0.31374,
116
+ "nauc_precision_at_10_max": 0.355507,
117
+ "nauc_precision_at_10_std": 0.119351,
118
+ "nauc_precision_at_10_diff1": 0.293529,
119
+ "nauc_precision_at_20_max": 0.349144,
120
+ "nauc_precision_at_20_std": 0.144261,
121
+ "nauc_precision_at_20_diff1": 0.27247,
122
+ "nauc_precision_at_100_max": 0.295806,
123
+ "nauc_precision_at_100_std": 0.135977,
124
+ "nauc_precision_at_100_diff1": 0.155543,
125
+ "nauc_precision_at_1000_max": 0.255735,
126
+ "nauc_precision_at_1000_std": 0.159669,
127
+ "nauc_precision_at_1000_diff1": 0.053056,
128
+ "nauc_mrr_at_1_max": 0.345238,
129
+ "nauc_mrr_at_1_std": 0.007052,
130
+ "nauc_mrr_at_1_diff1": 0.500574,
131
+ "nauc_mrr_at_3_max": 0.345872,
132
+ "nauc_mrr_at_3_std": 0.014442,
133
+ "nauc_mrr_at_3_diff1": 0.484271,
134
+ "nauc_mrr_at_5_max": 0.346781,
135
+ "nauc_mrr_at_5_std": 0.020887,
136
+ "nauc_mrr_at_5_diff1": 0.482372,
137
+ "nauc_mrr_at_10_max": 0.351169,
138
+ "nauc_mrr_at_10_std": 0.02806,
139
+ "nauc_mrr_at_10_diff1": 0.480463,
140
+ "nauc_mrr_at_20_max": 0.351288,
141
+ "nauc_mrr_at_20_std": 0.027916,
142
+ "nauc_mrr_at_20_diff1": 0.482601,
143
+ "nauc_mrr_at_100_max": 0.352855,
144
+ "nauc_mrr_at_100_std": 0.028999,
145
+ "nauc_mrr_at_100_diff1": 0.482233,
146
+ "nauc_mrr_at_1000_max": 0.352828,
147
+ "nauc_mrr_at_1000_std": 0.029155,
148
+ "nauc_mrr_at_1000_diff1": 0.482291,
149
+ "hit_rate_at_1": 0.31944,
150
+ "hit_rate_at_3": 0.44136,
151
+ "hit_rate_at_5": 0.51698,
152
+ "hit_rate_at_10": 0.59105,
153
+ "hit_rate_at_20": 0.66975,
154
+ "hit_rate_at_100": 0.82099,
155
+ "hit_rate_at_1000": 0.9429,
156
+ "main_score": 0.32593,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 607.3594682216644,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/HotpotQA.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "ab518f4d6fcca38d87c25209f94beba119d02014",
3
+ "task_name": "HotpotQA",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.63835,
9
+ "ndcg_at_3": 0.48441,
10
+ "ndcg_at_5": 0.50487,
11
+ "ndcg_at_10": 0.52429,
12
+ "ndcg_at_20": 0.53733,
13
+ "ndcg_at_100": 0.55731,
14
+ "ndcg_at_1000": 0.57458,
15
+ "map_at_1": 0.31918,
16
+ "map_at_3": 0.40873,
17
+ "map_at_5": 0.42316,
18
+ "map_at_10": 0.43348,
19
+ "map_at_20": 0.4382,
20
+ "map_at_100": 0.44191,
21
+ "map_at_1000": 0.4427,
22
+ "recall_at_1": 0.31918,
23
+ "recall_at_3": 0.44781,
24
+ "recall_at_5": 0.48845,
25
+ "recall_at_10": 0.53707,
26
+ "recall_at_20": 0.57914,
27
+ "recall_at_100": 0.66766,
28
+ "recall_at_1000": 0.78305,
29
+ "accuracy": 0.31918,
30
+ "precision_at_1": 0.63835,
31
+ "precision_at_3": 0.29854,
32
+ "precision_at_5": 0.19538,
33
+ "precision_at_10": 0.10741,
34
+ "precision_at_20": 0.05791,
35
+ "precision_at_100": 0.01335,
36
+ "precision_at_1000": 0.00157,
37
+ "mrr_at_1": 0.638352,
38
+ "mrr_at_3": 0.689444,
39
+ "mrr_at_5": 0.697486,
40
+ "mrr_at_10": 0.703267,
41
+ "mrr_at_20": 0.705479,
42
+ "mrr_at_100": 0.706888,
43
+ "mrr_at_1000": 0.707126,
44
+ "nauc_ndcg_at_1_max": 0.537509,
45
+ "nauc_ndcg_at_1_std": 0.115111,
46
+ "nauc_ndcg_at_1_diff1": 0.708495,
47
+ "nauc_ndcg_at_3_max": 0.412821,
48
+ "nauc_ndcg_at_3_std": 0.132522,
49
+ "nauc_ndcg_at_3_diff1": 0.443424,
50
+ "nauc_ndcg_at_5_max": 0.395649,
51
+ "nauc_ndcg_at_5_std": 0.141626,
52
+ "nauc_ndcg_at_5_diff1": 0.41914,
53
+ "nauc_ndcg_at_10_max": 0.382152,
54
+ "nauc_ndcg_at_10_std": 0.147435,
55
+ "nauc_ndcg_at_10_diff1": 0.398775,
56
+ "nauc_ndcg_at_20_max": 0.376045,
57
+ "nauc_ndcg_at_20_std": 0.154319,
58
+ "nauc_ndcg_at_20_diff1": 0.390656,
59
+ "nauc_ndcg_at_100_max": 0.370357,
60
+ "nauc_ndcg_at_100_std": 0.16199,
61
+ "nauc_ndcg_at_100_diff1": 0.383854,
62
+ "nauc_ndcg_at_1000_max": 0.37346,
63
+ "nauc_ndcg_at_1000_std": 0.163825,
64
+ "nauc_ndcg_at_1000_diff1": 0.386519,
65
+ "nauc_map_at_1_max": 0.537509,
66
+ "nauc_map_at_1_std": 0.115111,
67
+ "nauc_map_at_1_diff1": 0.708495,
68
+ "nauc_map_at_3_max": 0.371432,
69
+ "nauc_map_at_3_std": 0.12356,
70
+ "nauc_map_at_3_diff1": 0.386634,
71
+ "nauc_map_at_5_max": 0.359829,
72
+ "nauc_map_at_5_std": 0.129606,
73
+ "nauc_map_at_5_diff1": 0.37095,
74
+ "nauc_map_at_10_max": 0.352849,
75
+ "nauc_map_at_10_std": 0.132969,
76
+ "nauc_map_at_10_diff1": 0.360652,
77
+ "nauc_map_at_20_max": 0.351009,
78
+ "nauc_map_at_20_std": 0.135431,
79
+ "nauc_map_at_20_diff1": 0.358033,
80
+ "nauc_map_at_100_max": 0.350033,
81
+ "nauc_map_at_100_std": 0.136961,
82
+ "nauc_map_at_100_diff1": 0.356825,
83
+ "nauc_map_at_1000_max": 0.350206,
84
+ "nauc_map_at_1000_std": 0.137136,
85
+ "nauc_map_at_1000_diff1": 0.356948,
86
+ "nauc_recall_at_1_max": 0.537509,
87
+ "nauc_recall_at_1_std": 0.115111,
88
+ "nauc_recall_at_1_diff1": 0.708495,
89
+ "nauc_recall_at_3_max": 0.349218,
90
+ "nauc_recall_at_3_std": 0.140444,
91
+ "nauc_recall_at_3_diff1": 0.317772,
92
+ "nauc_recall_at_5_max": 0.302927,
93
+ "nauc_recall_at_5_std": 0.152663,
94
+ "nauc_recall_at_5_diff1": 0.260239,
95
+ "nauc_recall_at_10_max": 0.260198,
96
+ "nauc_recall_at_10_std": 0.163131,
97
+ "nauc_recall_at_10_diff1": 0.200352,
98
+ "nauc_recall_at_20_max": 0.222573,
99
+ "nauc_recall_at_20_std": 0.176937,
100
+ "nauc_recall_at_20_diff1": 0.156207,
101
+ "nauc_recall_at_100_max": 0.170449,
102
+ "nauc_recall_at_100_std": 0.205193,
103
+ "nauc_recall_at_100_diff1": 0.093175,
104
+ "nauc_recall_at_1000_max": 0.132043,
105
+ "nauc_recall_at_1000_std": 0.225767,
106
+ "nauc_recall_at_1000_diff1": 0.027016,
107
+ "nauc_precision_at_1_max": 0.537509,
108
+ "nauc_precision_at_1_std": 0.115111,
109
+ "nauc_precision_at_1_diff1": 0.708495,
110
+ "nauc_precision_at_3_max": 0.349218,
111
+ "nauc_precision_at_3_std": 0.140444,
112
+ "nauc_precision_at_3_diff1": 0.317772,
113
+ "nauc_precision_at_5_max": 0.302927,
114
+ "nauc_precision_at_5_std": 0.152663,
115
+ "nauc_precision_at_5_diff1": 0.260239,
116
+ "nauc_precision_at_10_max": 0.260198,
117
+ "nauc_precision_at_10_std": 0.163131,
118
+ "nauc_precision_at_10_diff1": 0.200352,
119
+ "nauc_precision_at_20_max": 0.222573,
120
+ "nauc_precision_at_20_std": 0.176937,
121
+ "nauc_precision_at_20_diff1": 0.156207,
122
+ "nauc_precision_at_100_max": 0.170449,
123
+ "nauc_precision_at_100_std": 0.205193,
124
+ "nauc_precision_at_100_diff1": 0.093175,
125
+ "nauc_precision_at_1000_max": 0.132043,
126
+ "nauc_precision_at_1000_std": 0.225767,
127
+ "nauc_precision_at_1000_diff1": 0.027016,
128
+ "nauc_mrr_at_1_max": 0.537509,
129
+ "nauc_mrr_at_1_std": 0.115111,
130
+ "nauc_mrr_at_1_diff1": 0.708495,
131
+ "nauc_mrr_at_3_max": 0.550128,
132
+ "nauc_mrr_at_3_std": 0.135964,
133
+ "nauc_mrr_at_3_diff1": 0.680447,
134
+ "nauc_mrr_at_5_max": 0.549019,
135
+ "nauc_mrr_at_5_std": 0.141093,
136
+ "nauc_mrr_at_5_diff1": 0.677593,
137
+ "nauc_mrr_at_10_max": 0.548094,
138
+ "nauc_mrr_at_10_std": 0.14227,
139
+ "nauc_mrr_at_10_diff1": 0.676347,
140
+ "nauc_mrr_at_20_max": 0.547985,
141
+ "nauc_mrr_at_20_std": 0.143459,
142
+ "nauc_mrr_at_20_diff1": 0.676727,
143
+ "nauc_mrr_at_100_max": 0.547904,
144
+ "nauc_mrr_at_100_std": 0.143642,
145
+ "nauc_mrr_at_100_diff1": 0.677009,
146
+ "nauc_mrr_at_1000_max": 0.547914,
147
+ "nauc_mrr_at_1000_std": 0.14353,
148
+ "nauc_mrr_at_1000_diff1": 0.677092,
149
+ "hit_rate_at_1": 0.63835,
150
+ "hit_rate_at_3": 0.75327,
151
+ "hit_rate_at_5": 0.78852,
152
+ "hit_rate_at_10": 0.83038,
153
+ "hit_rate_at_20": 0.86212,
154
+ "hit_rate_at_100": 0.91654,
155
+ "hit_rate_at_1000": 0.96826,
156
+ "main_score": 0.52429,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 32011.393527507782,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/ImdbClassification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "3d86128a09e091d6018b6d26cad27f2739fc2db7",
3
+ "task_name": "ImdbClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.77208,
11
+ "f1": 0.769924,
12
+ "f1_weighted": 0.769924,
13
+ "precision": 0.782675,
14
+ "precision_weighted": 0.782675,
15
+ "recall": 0.77208,
16
+ "recall_weighted": 0.77208,
17
+ "ap": 0.72784,
18
+ "ap_weighted": 0.72784
19
+ },
20
+ {
21
+ "accuracy": 0.78644,
22
+ "f1": 0.786238,
23
+ "f1_weighted": 0.786238,
24
+ "precision": 0.787528,
25
+ "precision_weighted": 0.787528,
26
+ "recall": 0.78644,
27
+ "recall_weighted": 0.78644,
28
+ "ap": 0.720513,
29
+ "ap_weighted": 0.720513
30
+ },
31
+ {
32
+ "accuracy": 0.68544,
33
+ "f1": 0.683891,
34
+ "f1_weighted": 0.683891,
35
+ "precision": 0.689147,
36
+ "precision_weighted": 0.689147,
37
+ "recall": 0.68544,
38
+ "recall_weighted": 0.68544,
39
+ "ap": 0.622885,
40
+ "ap_weighted": 0.622885
41
+ },
42
+ {
43
+ "accuracy": 0.78456,
44
+ "f1": 0.78456,
45
+ "f1_weighted": 0.78456,
46
+ "precision": 0.784561,
47
+ "precision_weighted": 0.784561,
48
+ "recall": 0.78456,
49
+ "recall_weighted": 0.78456,
50
+ "ap": 0.723397,
51
+ "ap_weighted": 0.723397
52
+ },
53
+ {
54
+ "accuracy": 0.75688,
55
+ "f1": 0.75677,
56
+ "f1_weighted": 0.75677,
57
+ "precision": 0.757346,
58
+ "precision_weighted": 0.757346,
59
+ "recall": 0.75688,
60
+ "recall_weighted": 0.75688,
61
+ "ap": 0.697361,
62
+ "ap_weighted": 0.697361
63
+ },
64
+ {
65
+ "accuracy": 0.69136,
66
+ "f1": 0.688794,
67
+ "f1_weighted": 0.688794,
68
+ "precision": 0.697886,
69
+ "precision_weighted": 0.697886,
70
+ "recall": 0.69136,
71
+ "recall_weighted": 0.69136,
72
+ "ap": 0.626671,
73
+ "ap_weighted": 0.626671
74
+ },
75
+ {
76
+ "accuracy": 0.69864,
77
+ "f1": 0.698639,
78
+ "f1_weighted": 0.698639,
79
+ "precision": 0.698642,
80
+ "precision_weighted": 0.698642,
81
+ "recall": 0.69864,
82
+ "recall_weighted": 0.69864,
83
+ "ap": 0.638892,
84
+ "ap_weighted": 0.638892
85
+ },
86
+ {
87
+ "accuracy": 0.7386,
88
+ "f1": 0.736985,
89
+ "f1_weighted": 0.736985,
90
+ "precision": 0.744608,
91
+ "precision_weighted": 0.744608,
92
+ "recall": 0.7386,
93
+ "recall_weighted": 0.7386,
94
+ "ap": 0.668517,
95
+ "ap_weighted": 0.668517
96
+ },
97
+ {
98
+ "accuracy": 0.71912,
99
+ "f1": 0.718852,
100
+ "f1_weighted": 0.718852,
101
+ "precision": 0.719959,
102
+ "precision_weighted": 0.719959,
103
+ "recall": 0.71912,
104
+ "recall_weighted": 0.71912,
105
+ "ap": 0.654781,
106
+ "ap_weighted": 0.654781
107
+ },
108
+ {
109
+ "accuracy": 0.72596,
110
+ "f1": 0.725762,
111
+ "f1_weighted": 0.725762,
112
+ "precision": 0.726613,
113
+ "precision_weighted": 0.726613,
114
+ "recall": 0.72596,
115
+ "recall_weighted": 0.72596,
116
+ "ap": 0.661437,
117
+ "ap_weighted": 0.661437
118
+ }
119
+ ],
120
+ "accuracy": 0.735908,
121
+ "f1": 0.735042,
122
+ "f1_weighted": 0.735042,
123
+ "precision": 0.738896,
124
+ "precision_weighted": 0.738896,
125
+ "recall": 0.735908,
126
+ "recall_weighted": 0.735908,
127
+ "ap": 0.674229,
128
+ "ap_weighted": 0.674229,
129
+ "main_score": 0.735908,
130
+ "hf_subset": "default",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 339.2887156009674,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/MSMARCO.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "c5a29a104738b98a9e76336939199e264163d4a0",
3
+ "task_name": "MSMARCO",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "dev": [
7
+ {
8
+ "ndcg_at_1": 0.19126,
9
+ "ndcg_at_3": 0.28969,
10
+ "ndcg_at_5": 0.32332,
11
+ "ndcg_at_10": 0.35861,
12
+ "ndcg_at_20": 0.38386,
13
+ "ndcg_at_100": 0.41849,
14
+ "ndcg_at_1000": 0.43481,
15
+ "map_at_1": 0.18635,
16
+ "map_at_3": 0.26304,
17
+ "map_at_5": 0.28186,
18
+ "map_at_10": 0.29662,
19
+ "map_at_20": 0.30367,
20
+ "map_at_100": 0.30863,
21
+ "map_at_1000": 0.30927,
22
+ "recall_at_1": 0.18635,
23
+ "recall_at_3": 0.36175,
24
+ "recall_at_5": 0.44219,
25
+ "recall_at_10": 0.54981,
26
+ "recall_at_20": 0.64793,
27
+ "recall_at_100": 0.83052,
28
+ "recall_at_1000": 0.95587,
29
+ "accuracy": 0.18635,
30
+ "precision_at_1": 0.19126,
31
+ "precision_at_3": 0.12464,
32
+ "precision_at_5": 0.09183,
33
+ "precision_at_10": 0.05725,
34
+ "precision_at_20": 0.03391,
35
+ "precision_at_100": 0.00877,
36
+ "precision_at_1000": 0.00102,
37
+ "mrr_at_1": 0.191261,
38
+ "mrr_at_3": 0.268147,
39
+ "mrr_at_5": 0.28708,
40
+ "mrr_at_10": 0.301613,
41
+ "mrr_at_20": 0.30851,
42
+ "mrr_at_100": 0.313219,
43
+ "mrr_at_1000": 0.313797,
44
+ "nauc_ndcg_at_1_max": 0.073533,
45
+ "nauc_ndcg_at_1_std": -0.183548,
46
+ "nauc_ndcg_at_1_diff1": 0.409257,
47
+ "nauc_ndcg_at_3_max": 0.088434,
48
+ "nauc_ndcg_at_3_std": -0.207699,
49
+ "nauc_ndcg_at_3_diff1": 0.355482,
50
+ "nauc_ndcg_at_5_max": 0.095154,
51
+ "nauc_ndcg_at_5_std": -0.206944,
52
+ "nauc_ndcg_at_5_diff1": 0.353274,
53
+ "nauc_ndcg_at_10_max": 0.101285,
54
+ "nauc_ndcg_at_10_std": -0.196867,
55
+ "nauc_ndcg_at_10_diff1": 0.346436,
56
+ "nauc_ndcg_at_20_max": 0.108454,
57
+ "nauc_ndcg_at_20_std": -0.179766,
58
+ "nauc_ndcg_at_20_diff1": 0.343694,
59
+ "nauc_ndcg_at_100_max": 0.117666,
60
+ "nauc_ndcg_at_100_std": -0.154333,
61
+ "nauc_ndcg_at_100_diff1": 0.343248,
62
+ "nauc_ndcg_at_1000_max": 0.112419,
63
+ "nauc_ndcg_at_1000_std": -0.164204,
64
+ "nauc_ndcg_at_1000_diff1": 0.349564,
65
+ "nauc_map_at_1_max": 0.073235,
66
+ "nauc_map_at_1_std": -0.185024,
67
+ "nauc_map_at_1_diff1": 0.411661,
68
+ "nauc_map_at_3_max": 0.084078,
69
+ "nauc_map_at_3_std": -0.204314,
70
+ "nauc_map_at_3_diff1": 0.368146,
71
+ "nauc_map_at_5_max": 0.087838,
72
+ "nauc_map_at_5_std": -0.204284,
73
+ "nauc_map_at_5_diff1": 0.366716,
74
+ "nauc_map_at_10_max": 0.090386,
75
+ "nauc_map_at_10_std": -0.200325,
76
+ "nauc_map_at_10_diff1": 0.363849,
77
+ "nauc_map_at_20_max": 0.092181,
78
+ "nauc_map_at_20_std": -0.195751,
79
+ "nauc_map_at_20_diff1": 0.363213,
80
+ "nauc_map_at_100_max": 0.093411,
81
+ "nauc_map_at_100_std": -0.192163,
82
+ "nauc_map_at_100_diff1": 0.363337,
83
+ "nauc_map_at_1000_max": 0.093245,
84
+ "nauc_map_at_1000_std": -0.192417,
85
+ "nauc_map_at_1000_diff1": 0.363579,
86
+ "nauc_recall_at_1_max": 0.073235,
87
+ "nauc_recall_at_1_std": -0.185024,
88
+ "nauc_recall_at_1_diff1": 0.411661,
89
+ "nauc_recall_at_3_max": 0.098408,
90
+ "nauc_recall_at_3_std": -0.215396,
91
+ "nauc_recall_at_3_diff1": 0.322494,
92
+ "nauc_recall_at_5_max": 0.112873,
93
+ "nauc_recall_at_5_std": -0.213101,
94
+ "nauc_recall_at_5_diff1": 0.317112,
95
+ "nauc_recall_at_10_max": 0.131444,
96
+ "nauc_recall_at_10_std": -0.18301,
97
+ "nauc_recall_at_10_diff1": 0.29515,
98
+ "nauc_recall_at_20_max": 0.165191,
99
+ "nauc_recall_at_20_std": -0.11006,
100
+ "nauc_recall_at_20_diff1": 0.276758,
101
+ "nauc_recall_at_100_max": 0.291662,
102
+ "nauc_recall_at_100_std": 0.170799,
103
+ "nauc_recall_at_100_diff1": 0.222325,
104
+ "nauc_recall_at_1000_max": 0.487362,
105
+ "nauc_recall_at_1000_std": 0.516501,
106
+ "nauc_recall_at_1000_diff1": 0.216115,
107
+ "nauc_precision_at_1_max": 0.073533,
108
+ "nauc_precision_at_1_std": -0.183548,
109
+ "nauc_precision_at_1_diff1": 0.409257,
110
+ "nauc_precision_at_3_max": 0.100097,
111
+ "nauc_precision_at_3_std": -0.216368,
112
+ "nauc_precision_at_3_diff1": 0.319404,
113
+ "nauc_precision_at_5_max": 0.114707,
114
+ "nauc_precision_at_5_std": -0.212358,
115
+ "nauc_precision_at_5_diff1": 0.312857,
116
+ "nauc_precision_at_10_max": 0.133205,
117
+ "nauc_precision_at_10_std": -0.174567,
118
+ "nauc_precision_at_10_diff1": 0.280249,
119
+ "nauc_precision_at_20_max": 0.166456,
120
+ "nauc_precision_at_20_std": -0.095043,
121
+ "nauc_precision_at_20_diff1": 0.249302,
122
+ "nauc_precision_at_100_max": 0.253166,
123
+ "nauc_precision_at_100_std": 0.156058,
124
+ "nauc_precision_at_100_diff1": 0.139283,
125
+ "nauc_precision_at_1000_max": 0.215924,
126
+ "nauc_precision_at_1000_std": 0.216391,
127
+ "nauc_precision_at_1000_diff1": 0.011633,
128
+ "nauc_mrr_at_1_max": 0.073533,
129
+ "nauc_mrr_at_1_std": -0.183548,
130
+ "nauc_mrr_at_1_diff1": 0.409257,
131
+ "nauc_mrr_at_3_max": 0.084663,
132
+ "nauc_mrr_at_3_std": -0.202154,
133
+ "nauc_mrr_at_3_diff1": 0.365847,
134
+ "nauc_mrr_at_5_max": 0.088569,
135
+ "nauc_mrr_at_5_std": -0.201258,
136
+ "nauc_mrr_at_5_diff1": 0.364466,
137
+ "nauc_mrr_at_10_max": 0.091005,
138
+ "nauc_mrr_at_10_std": -0.197293,
139
+ "nauc_mrr_at_10_diff1": 0.36159,
140
+ "nauc_mrr_at_20_max": 0.092937,
141
+ "nauc_mrr_at_20_std": -0.192804,
142
+ "nauc_mrr_at_20_diff1": 0.361095,
143
+ "nauc_mrr_at_100_max": 0.093768,
144
+ "nauc_mrr_at_100_std": -0.189633,
145
+ "nauc_mrr_at_100_diff1": 0.361338,
146
+ "nauc_mrr_at_1000_max": 0.093565,
147
+ "nauc_mrr_at_1000_std": -0.189949,
148
+ "nauc_mrr_at_1000_diff1": 0.361571,
149
+ "hit_rate_at_1": 0.19126,
150
+ "hit_rate_at_3": 0.36934,
151
+ "hit_rate_at_5": 0.45143,
152
+ "hit_rate_at_10": 0.55974,
153
+ "hit_rate_at_20": 0.65874,
154
+ "hit_rate_at_100": 0.83897,
155
+ "hit_rate_at_1000": 0.95888,
156
+ "main_score": 0.35861,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 64852.122379779816,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/MTOPDomainClassification.json ADDED
@@ -0,0 +1,270 @@
1
+ {
2
+ "dataset_revision": "a76d16fae880597b9c73047b50159220a441cb54",
3
+ "task_name": "MTOPDomainClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "validation": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.902908,
11
+ "f1": 0.900791,
12
+ "f1_weighted": 0.902103,
13
+ "precision": 0.900503,
14
+ "precision_weighted": 0.903387,
15
+ "recall": 0.903448,
16
+ "recall_weighted": 0.902908,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.909172,
22
+ "f1": 0.9079,
23
+ "f1_weighted": 0.908296,
24
+ "precision": 0.904925,
25
+ "precision_weighted": 0.908647,
26
+ "recall": 0.912257,
27
+ "recall_weighted": 0.909172,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.896644,
33
+ "f1": 0.894483,
34
+ "f1_weighted": 0.895718,
35
+ "precision": 0.892419,
36
+ "precision_weighted": 0.896436,
37
+ "recall": 0.898041,
38
+ "recall_weighted": 0.896644,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.923043,
44
+ "f1": 0.921789,
45
+ "f1_weighted": 0.923832,
46
+ "precision": 0.920316,
47
+ "precision_weighted": 0.928671,
48
+ "recall": 0.927134,
49
+ "recall_weighted": 0.923043,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.893512,
55
+ "f1": 0.892056,
56
+ "f1_weighted": 0.894244,
57
+ "precision": 0.891172,
58
+ "precision_weighted": 0.899376,
59
+ "recall": 0.897475,
60
+ "recall_weighted": 0.893512,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.90962,
66
+ "f1": 0.909962,
67
+ "f1_weighted": 0.908563,
68
+ "precision": 0.908526,
69
+ "precision_weighted": 0.90911,
70
+ "recall": 0.913004,
71
+ "recall_weighted": 0.90962,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.898881,
77
+ "f1": 0.896682,
78
+ "f1_weighted": 0.898179,
79
+ "precision": 0.893531,
80
+ "precision_weighted": 0.900732,
81
+ "recall": 0.902812,
82
+ "recall_weighted": 0.898881,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.906488,
88
+ "f1": 0.902956,
89
+ "f1_weighted": 0.906964,
90
+ "precision": 0.898799,
91
+ "precision_weighted": 0.91045,
92
+ "recall": 0.910028,
93
+ "recall_weighted": 0.906488,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.902908,
99
+ "f1": 0.901409,
100
+ "f1_weighted": 0.902582,
101
+ "precision": 0.897412,
102
+ "precision_weighted": 0.905612,
103
+ "recall": 0.908384,
104
+ "recall_weighted": 0.902908,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.914094,
110
+ "f1": 0.910454,
111
+ "f1_weighted": 0.913827,
112
+ "precision": 0.910336,
113
+ "precision_weighted": 0.914038,
114
+ "recall": 0.911126,
115
+ "recall_weighted": 0.914094,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.905727,
121
+ "f1": 0.903848,
122
+ "f1_weighted": 0.905431,
123
+ "precision": 0.901794,
124
+ "precision_weighted": 0.907646,
125
+ "recall": 0.908371,
126
+ "recall_weighted": 0.905727,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.905727,
130
+ "hf_subset": "en",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ],
136
+ "test": [
137
+ {
138
+ "scores_per_experiment": [
139
+ {
140
+ "accuracy": 0.896261,
141
+ "f1": 0.891879,
142
+ "f1_weighted": 0.895796,
143
+ "precision": 0.89013,
144
+ "precision_weighted": 0.899221,
145
+ "recall": 0.897905,
146
+ "recall_weighted": 0.896261,
147
+ "ap": null,
148
+ "ap_weighted": null
149
+ },
150
+ {
151
+ "accuracy": 0.909713,
152
+ "f1": 0.905596,
153
+ "f1_weighted": 0.909103,
154
+ "precision": 0.902125,
155
+ "precision_weighted": 0.909515,
156
+ "recall": 0.910141,
157
+ "recall_weighted": 0.909713,
158
+ "ap": null,
159
+ "ap_weighted": null
160
+ },
161
+ {
162
+ "accuracy": 0.902873,
163
+ "f1": 0.899876,
164
+ "f1_weighted": 0.902263,
165
+ "precision": 0.895825,
166
+ "precision_weighted": 0.903344,
167
+ "recall": 0.905678,
168
+ "recall_weighted": 0.902873,
169
+ "ap": null,
170
+ "ap_weighted": null
171
+ },
172
+ {
173
+ "accuracy": 0.910853,
174
+ "f1": 0.907062,
175
+ "f1_weighted": 0.912115,
176
+ "precision": 0.904627,
177
+ "precision_weighted": 0.917423,
178
+ "recall": 0.91348,
179
+ "recall_weighted": 0.910853,
180
+ "ap": null,
181
+ "ap_weighted": null
182
+ },
183
+ {
184
+ "accuracy": 0.895121,
185
+ "f1": 0.890889,
186
+ "f1_weighted": 0.896078,
187
+ "precision": 0.887702,
188
+ "precision_weighted": 0.901834,
189
+ "recall": 0.899267,
190
+ "recall_weighted": 0.895121,
191
+ "ap": null,
192
+ "ap_weighted": null
193
+ },
194
+ {
195
+ "accuracy": 0.907205,
196
+ "f1": 0.905379,
197
+ "f1_weighted": 0.906611,
198
+ "precision": 0.901529,
199
+ "precision_weighted": 0.908071,
200
+ "recall": 0.911083,
201
+ "recall_weighted": 0.907205,
202
+ "ap": null,
203
+ "ap_weighted": null
204
+ },
205
+ {
206
+ "accuracy": 0.897629,
207
+ "f1": 0.892732,
208
+ "f1_weighted": 0.896618,
209
+ "precision": 0.889894,
210
+ "precision_weighted": 0.898725,
211
+ "recall": 0.898253,
212
+ "recall_weighted": 0.897629,
213
+ "ap": null,
214
+ "ap_weighted": null
215
+ },
216
+ {
217
+ "accuracy": 0.906749,
218
+ "f1": 0.900573,
219
+ "f1_weighted": 0.907855,
220
+ "precision": 0.896525,
221
+ "precision_weighted": 0.912968,
222
+ "recall": 0.910568,
223
+ "recall_weighted": 0.906749,
224
+ "ap": null,
225
+ "ap_weighted": null
226
+ },
227
+ {
228
+ "accuracy": 0.899681,
229
+ "f1": 0.896724,
230
+ "f1_weighted": 0.899914,
231
+ "precision": 0.891026,
232
+ "precision_weighted": 0.90383,
233
+ "recall": 0.905696,
234
+ "recall_weighted": 0.899681,
235
+ "ap": null,
236
+ "ap_weighted": null
237
+ },
238
+ {
239
+ "accuracy": 0.910853,
240
+ "f1": 0.903988,
241
+ "f1_weighted": 0.910806,
242
+ "precision": 0.901604,
243
+ "precision_weighted": 0.911768,
244
+ "recall": 0.907601,
245
+ "recall_weighted": 0.910853,
246
+ "ap": null,
247
+ "ap_weighted": null
248
+ }
249
+ ],
250
+ "accuracy": 0.903694,
251
+ "f1": 0.89947,
252
+ "f1_weighted": 0.903716,
253
+ "precision": 0.896099,
254
+ "precision_weighted": 0.90667,
255
+ "recall": 0.905967,
256
+ "recall_weighted": 0.903694,
257
+ "ap": NaN,
258
+ "ap_weighted": NaN,
259
+ "main_score": 0.903694,
260
+ "hf_subset": "en",
261
+ "languages": [
262
+ "eng-Latn"
263
+ ]
264
+ }
265
+ ]
266
+ },
267
+ "evaluation_time": 94.56007814407349,
268
+ "kg_co2_emissions": null,
269
+ "date": null
270
+ }
results/MTOPIntentClassification.json ADDED
@@ -0,0 +1,270 @@
1
+ {
2
+ "dataset_revision": "2992d820f31312593c49a4890430aadadb0f0039",
3
+ "task_name": "MTOPIntentClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "validation": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.611633,
11
+ "f1": 0.409673,
12
+ "f1_weighted": 0.633875,
13
+ "precision": 0.398318,
14
+ "precision_weighted": 0.829544,
15
+ "recall": 0.602015,
16
+ "recall_weighted": 0.611633,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.641163,
22
+ "f1": 0.432864,
23
+ "f1_weighted": 0.665308,
24
+ "precision": 0.425859,
25
+ "precision_weighted": 0.823211,
26
+ "recall": 0.622044,
27
+ "recall_weighted": 0.641163,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.64519,
33
+ "f1": 0.434012,
34
+ "f1_weighted": 0.663065,
35
+ "precision": 0.411199,
36
+ "precision_weighted": 0.819446,
37
+ "recall": 0.617317,
38
+ "recall_weighted": 0.64519,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.630872,
44
+ "f1": 0.430861,
45
+ "f1_weighted": 0.652042,
46
+ "precision": 0.412139,
47
+ "precision_weighted": 0.823488,
48
+ "recall": 0.606452,
49
+ "recall_weighted": 0.630872,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.630425,
55
+ "f1": 0.449798,
56
+ "f1_weighted": 0.64842,
57
+ "precision": 0.440425,
58
+ "precision_weighted": 0.820239,
59
+ "recall": 0.626703,
60
+ "recall_weighted": 0.630425,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.624609,
66
+ "f1": 0.4352,
67
+ "f1_weighted": 0.644347,
68
+ "precision": 0.430907,
69
+ "precision_weighted": 0.833457,
70
+ "recall": 0.595863,
71
+ "recall_weighted": 0.624609,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.619239,
77
+ "f1": 0.433608,
78
+ "f1_weighted": 0.647686,
79
+ "precision": 0.421929,
80
+ "precision_weighted": 0.829609,
81
+ "recall": 0.612615,
82
+ "recall_weighted": 0.619239,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.655928,
88
+ "f1": 0.444966,
89
+ "f1_weighted": 0.67959,
90
+ "precision": 0.428875,
91
+ "precision_weighted": 0.841773,
92
+ "recall": 0.62983,
93
+ "recall_weighted": 0.655928,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.657718,
99
+ "f1": 0.426842,
100
+ "f1_weighted": 0.671419,
101
+ "precision": 0.416557,
102
+ "precision_weighted": 0.829213,
103
+ "recall": 0.605491,
104
+ "recall_weighted": 0.657718,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.660403,
110
+ "f1": 0.450082,
111
+ "f1_weighted": 0.684754,
112
+ "precision": 0.438556,
113
+ "precision_weighted": 0.831814,
114
+ "recall": 0.612658,
115
+ "recall_weighted": 0.660403,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.637718,
121
+ "f1": 0.434791,
122
+ "f1_weighted": 0.659051,
123
+ "precision": 0.422477,
124
+ "precision_weighted": 0.828179,
125
+ "recall": 0.613099,
126
+ "recall_weighted": 0.637718,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.637718,
130
+ "hf_subset": "en",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ],
136
+ "test": [
137
+ {
138
+ "scores_per_experiment": [
139
+ {
140
+ "accuracy": 0.601459,
141
+ "f1": 0.42908,
142
+ "f1_weighted": 0.62617,
143
+ "precision": 0.412977,
144
+ "precision_weighted": 0.822319,
145
+ "recall": 0.643194,
146
+ "recall_weighted": 0.601459,
147
+ "ap": null,
148
+ "ap_weighted": null
149
+ },
150
+ {
151
+ "accuracy": 0.628591,
152
+ "f1": 0.420223,
153
+ "f1_weighted": 0.650183,
154
+ "precision": 0.405049,
155
+ "precision_weighted": 0.824838,
156
+ "recall": 0.642639,
157
+ "recall_weighted": 0.628591,
158
+ "ap": null,
159
+ "ap_weighted": null
160
+ },
161
+ {
162
+ "accuracy": 0.645919,
163
+ "f1": 0.453776,
164
+ "f1_weighted": 0.667763,
165
+ "precision": 0.424753,
166
+ "precision_weighted": 0.825284,
167
+ "recall": 0.657699,
168
+ "recall_weighted": 0.645919,
169
+ "ap": null,
170
+ "ap_weighted": null
171
+ },
172
+ {
173
+ "accuracy": 0.622663,
174
+ "f1": 0.442072,
175
+ "f1_weighted": 0.643417,
176
+ "precision": 0.42229,
177
+ "precision_weighted": 0.821174,
178
+ "recall": 0.656686,
179
+ "recall_weighted": 0.622663,
180
+ "ap": null,
181
+ "ap_weighted": null
182
+ },
183
+ {
184
+ "accuracy": 0.613771,
185
+ "f1": 0.444827,
186
+ "f1_weighted": 0.630261,
187
+ "precision": 0.423994,
188
+ "precision_weighted": 0.796384,
189
+ "recall": 0.659946,
190
+ "recall_weighted": 0.613771,
191
+ "ap": null,
192
+ "ap_weighted": null
193
+ },
194
+ {
195
+ "accuracy": 0.615595,
196
+ "f1": 0.43974,
197
+ "f1_weighted": 0.635122,
198
+ "precision": 0.421639,
199
+ "precision_weighted": 0.820269,
200
+ "recall": 0.654356,
201
+ "recall_weighted": 0.615595,
202
+ "ap": null,
203
+ "ap_weighted": null
204
+ },
205
+ {
206
+ "accuracy": 0.606703,
207
+ "f1": 0.441633,
208
+ "f1_weighted": 0.630369,
209
+ "precision": 0.427797,
210
+ "precision_weighted": 0.817602,
211
+ "recall": 0.641123,
212
+ "recall_weighted": 0.606703,
213
+ "ap": null,
214
+ "ap_weighted": null
215
+ },
216
+ {
217
+ "accuracy": 0.638851,
218
+ "f1": 0.439061,
219
+ "f1_weighted": 0.657051,
220
+ "precision": 0.415251,
221
+ "precision_weighted": 0.827372,
222
+ "recall": 0.657952,
223
+ "recall_weighted": 0.638851,
224
+ "ap": null,
225
+ "ap_weighted": null
226
+ },
227
+ {
228
+ "accuracy": 0.641587,
229
+ "f1": 0.444679,
230
+ "f1_weighted": 0.66043,
231
+ "precision": 0.421675,
232
+ "precision_weighted": 0.829789,
233
+ "recall": 0.661222,
234
+ "recall_weighted": 0.641587,
235
+ "ap": null,
236
+ "ap_weighted": null
237
+ },
238
+ {
239
+ "accuracy": 0.635659,
240
+ "f1": 0.438398,
241
+ "f1_weighted": 0.655304,
242
+ "precision": 0.420978,
243
+ "precision_weighted": 0.81612,
244
+ "recall": 0.647443,
245
+ "recall_weighted": 0.635659,
246
+ "ap": null,
247
+ "ap_weighted": null
248
+ }
249
+ ],
250
+ "accuracy": 0.62508,
251
+ "f1": 0.439349,
252
+ "f1_weighted": 0.645607,
253
+ "precision": 0.41964,
254
+ "precision_weighted": 0.820115,
255
+ "recall": 0.652226,
256
+ "recall_weighted": 0.62508,
257
+ "ap": NaN,
258
+ "ap_weighted": NaN,
259
+ "main_score": 0.62508,
260
+ "hf_subset": "en",
261
+ "languages": [
262
+ "eng-Latn"
263
+ ]
264
+ }
265
+ ]
266
+ },
267
+ "evaluation_time": 226.89173936843872,
268
+ "kg_co2_emissions": null,
269
+ "date": null
270
+ }
results/MassiveIntentClassification.json ADDED
@@ -0,0 +1,270 @@
1
+ {
2
+ "dataset_revision": "4672e20407010da34463acc759c162ca9734bca6",
3
+ "task_name": "MassiveIntentClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "validation": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.688637,
11
+ "f1": 0.65826,
12
+ "f1_weighted": 0.675358,
13
+ "precision": 0.644146,
14
+ "precision_weighted": 0.728667,
15
+ "recall": 0.737583,
16
+ "recall_weighted": 0.688637,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.727988,
22
+ "f1": 0.69197,
23
+ "f1_weighted": 0.719752,
24
+ "precision": 0.668949,
25
+ "precision_weighted": 0.757076,
26
+ "recall": 0.766966,
27
+ "recall_weighted": 0.727988,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.683719,
33
+ "f1": 0.635064,
34
+ "f1_weighted": 0.668446,
35
+ "precision": 0.634182,
36
+ "precision_weighted": 0.754225,
37
+ "recall": 0.725714,
38
+ "recall_weighted": 0.683719,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.697,
44
+ "f1": 0.660249,
45
+ "f1_weighted": 0.680987,
46
+ "precision": 0.646752,
47
+ "precision_weighted": 0.730108,
48
+ "recall": 0.74281,
49
+ "recall_weighted": 0.697,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.689129,
55
+ "f1": 0.653315,
56
+ "f1_weighted": 0.670445,
57
+ "precision": 0.633679,
58
+ "precision_weighted": 0.70499,
59
+ "recall": 0.747506,
60
+ "recall_weighted": 0.689129,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.664535,
66
+ "f1": 0.648287,
67
+ "f1_weighted": 0.648055,
68
+ "precision": 0.639366,
69
+ "precision_weighted": 0.72562,
70
+ "recall": 0.738753,
71
+ "recall_weighted": 0.664535,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.67634,
77
+ "f1": 0.642027,
78
+ "f1_weighted": 0.665006,
79
+ "precision": 0.632578,
80
+ "precision_weighted": 0.718234,
81
+ "recall": 0.725251,
82
+ "recall_weighted": 0.67634,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.661584,
88
+ "f1": 0.636182,
89
+ "f1_weighted": 0.635877,
90
+ "precision": 0.628559,
91
+ "precision_weighted": 0.709769,
92
+ "recall": 0.736664,
93
+ "recall_weighted": 0.661584,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.664043,
99
+ "f1": 0.649996,
100
+ "f1_weighted": 0.637349,
101
+ "precision": 0.65272,
102
+ "precision_weighted": 0.734402,
103
+ "recall": 0.738577,
104
+ "recall_weighted": 0.664043,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.703394,
110
+ "f1": 0.670103,
111
+ "f1_weighted": 0.693934,
112
+ "precision": 0.656116,
113
+ "precision_weighted": 0.738481,
114
+ "recall": 0.744836,
115
+ "recall_weighted": 0.703394,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.685637,
121
+ "f1": 0.654545,
122
+ "f1_weighted": 0.669521,
123
+ "precision": 0.643705,
124
+ "precision_weighted": 0.730157,
125
+ "recall": 0.740466,
126
+ "recall_weighted": 0.685637,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.685637,
130
+ "hf_subset": "en",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ],
136
+ "test": [
137
+ {
138
+ "scores_per_experiment": [
139
+ {
140
+ "accuracy": 0.682919,
141
+ "f1": 0.667346,
142
+ "f1_weighted": 0.669055,
143
+ "precision": 0.658192,
144
+ "precision_weighted": 0.727625,
145
+ "recall": 0.760362,
146
+ "recall_weighted": 0.682919,
147
+ "ap": null,
148
+ "ap_weighted": null
149
+ },
150
+ {
151
+ "accuracy": 0.720242,
152
+ "f1": 0.700709,
153
+ "f1_weighted": 0.710797,
154
+ "precision": 0.674699,
155
+ "precision_weighted": 0.751818,
156
+ "recall": 0.78048,
157
+ "recall_weighted": 0.720242,
158
+ "ap": null,
159
+ "ap_weighted": null
160
+ },
161
+ {
162
+ "accuracy": 0.669805,
163
+ "f1": 0.647837,
164
+ "f1_weighted": 0.656717,
165
+ "precision": 0.630074,
166
+ "precision_weighted": 0.720754,
167
+ "recall": 0.761146,
168
+ "recall_weighted": 0.669805,
169
+ "ap": null,
170
+ "ap_weighted": null
171
+ },
172
+ {
173
+ "accuracy": 0.704438,
174
+ "f1": 0.674874,
175
+ "f1_weighted": 0.691806,
176
+ "precision": 0.661647,
177
+ "precision_weighted": 0.753333,
178
+ "recall": 0.76496,
179
+ "recall_weighted": 0.704438,
180
+ "ap": null,
181
+ "ap_weighted": null
182
+ },
183
+ {
184
+ "accuracy": 0.685945,
185
+ "f1": 0.667464,
186
+ "f1_weighted": 0.667001,
187
+ "precision": 0.659033,
188
+ "precision_weighted": 0.731593,
189
+ "recall": 0.765488,
190
+ "recall_weighted": 0.685945,
191
+ "ap": null,
192
+ "ap_weighted": null
193
+ },
194
+ {
195
+ "accuracy": 0.66308,
196
+ "f1": 0.659005,
197
+ "f1_weighted": 0.648718,
198
+ "precision": 0.645573,
199
+ "precision_weighted": 0.724147,
200
+ "recall": 0.763081,
201
+ "recall_weighted": 0.66308,
202
+ "ap": null,
203
+ "ap_weighted": null
204
+ },
205
+ {
206
+ "accuracy": 0.666443,
207
+ "f1": 0.654203,
208
+ "f1_weighted": 0.653545,
209
+ "precision": 0.637705,
210
+ "precision_weighted": 0.712659,
211
+ "recall": 0.760928,
212
+ "recall_weighted": 0.666443,
213
+ "ap": null,
214
+ "ap_weighted": null
215
+ },
216
+ {
217
+ "accuracy": 0.671486,
218
+ "f1": 0.6545,
219
+ "f1_weighted": 0.652806,
220
+ "precision": 0.646503,
221
+ "precision_weighted": 0.733291,
222
+ "recall": 0.760764,
223
+ "recall_weighted": 0.671486,
224
+ "ap": null,
225
+ "ap_weighted": null
226
+ },
227
+ {
228
+ "accuracy": 0.659718,
229
+ "f1": 0.660322,
230
+ "f1_weighted": 0.630015,
231
+ "precision": 0.657448,
232
+ "precision_weighted": 0.721007,
233
+ "recall": 0.765019,
234
+ "recall_weighted": 0.659718,
235
+ "ap": null,
236
+ "ap_weighted": null
237
+ },
238
+ {
239
+ "accuracy": 0.694687,
240
+ "f1": 0.687019,
241
+ "f1_weighted": 0.68426,
242
+ "precision": 0.670248,
243
+ "precision_weighted": 0.733429,
244
+ "recall": 0.77893,
245
+ "recall_weighted": 0.694687,
246
+ "ap": null,
247
+ "ap_weighted": null
248
+ }
249
+ ],
250
+ "accuracy": 0.681876,
251
+ "f1": 0.667328,
252
+ "f1_weighted": 0.666472,
253
+ "precision": 0.654112,
254
+ "precision_weighted": 0.730966,
255
+ "recall": 0.766116,
256
+ "recall_weighted": 0.681876,
257
+ "ap": NaN,
258
+ "ap_weighted": NaN,
259
+ "main_score": 0.681876,
260
+ "hf_subset": "en",
261
+ "languages": [
262
+ "eng-Latn"
263
+ ]
264
+ }
265
+ ]
266
+ },
267
+ "evaluation_time": 169.6871576309204,
268
+ "kg_co2_emissions": null,
269
+ "date": null
270
+ }
results/MassiveScenarioClassification.json ADDED
@@ -0,0 +1,270 @@
1
+ {
2
+ "dataset_revision": "fad2c6e8459f9e1c45d9315f4953d921437d70f8",
3
+ "task_name": "MassiveScenarioClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "validation": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.750615,
11
+ "f1": 0.736523,
12
+ "f1_weighted": 0.749691,
13
+ "precision": 0.716925,
14
+ "precision_weighted": 0.774467,
15
+ "recall": 0.788157,
16
+ "recall_weighted": 0.750615,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.751599,
22
+ "f1": 0.735477,
23
+ "f1_weighted": 0.751625,
24
+ "precision": 0.716377,
25
+ "precision_weighted": 0.791761,
26
+ "recall": 0.801039,
27
+ "recall_weighted": 0.751599,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.727988,
33
+ "f1": 0.710607,
34
+ "f1_weighted": 0.728347,
35
+ "precision": 0.698403,
36
+ "precision_weighted": 0.773187,
37
+ "recall": 0.776368,
38
+ "recall_weighted": 0.727988,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.729464,
44
+ "f1": 0.717596,
45
+ "f1_weighted": 0.727545,
46
+ "precision": 0.70483,
47
+ "precision_weighted": 0.764252,
48
+ "recall": 0.776949,
49
+ "recall_weighted": 0.729464,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.737334,
55
+ "f1": 0.714665,
56
+ "f1_weighted": 0.728783,
57
+ "precision": 0.703063,
58
+ "precision_weighted": 0.768826,
59
+ "recall": 0.781151,
60
+ "recall_weighted": 0.737334,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.701426,
66
+ "f1": 0.687642,
67
+ "f1_weighted": 0.695884,
68
+ "precision": 0.677127,
69
+ "precision_weighted": 0.752077,
70
+ "recall": 0.756963,
71
+ "recall_weighted": 0.701426,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.730939,
77
+ "f1": 0.716494,
78
+ "f1_weighted": 0.734151,
79
+ "precision": 0.703237,
80
+ "precision_weighted": 0.772655,
81
+ "recall": 0.770101,
82
+ "recall_weighted": 0.730939,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.721102,
88
+ "f1": 0.708714,
89
+ "f1_weighted": 0.722325,
90
+ "precision": 0.693417,
91
+ "precision_weighted": 0.761375,
92
+ "recall": 0.766744,
93
+ "recall_weighted": 0.721102,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.739793,
99
+ "f1": 0.73448,
100
+ "f1_weighted": 0.738802,
101
+ "precision": 0.723706,
102
+ "precision_weighted": 0.775928,
103
+ "recall": 0.789352,
104
+ "recall_weighted": 0.739793,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.732415,
110
+ "f1": 0.721252,
111
+ "f1_weighted": 0.733595,
112
+ "precision": 0.709303,
113
+ "precision_weighted": 0.770662,
114
+ "recall": 0.779172,
115
+ "recall_weighted": 0.732415,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.732268,
121
+ "f1": 0.718345,
122
+ "f1_weighted": 0.731075,
123
+ "precision": 0.704639,
124
+ "precision_weighted": 0.770519,
125
+ "recall": 0.7786,
126
+ "recall_weighted": 0.732268,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.732268,
130
+ "hf_subset": "en",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ],
136
+ "test": [
137
+ {
138
+ "scores_per_experiment": [
139
+ {
140
+ "accuracy": 0.745124,
141
+ "f1": 0.734323,
142
+ "f1_weighted": 0.74268,
143
+ "precision": 0.713857,
144
+ "precision_weighted": 0.77044,
145
+ "recall": 0.79113,
146
+ "recall_weighted": 0.745124,
147
+ "ap": null,
148
+ "ap_weighted": null
149
+ },
150
+ {
151
+ "accuracy": 0.748823,
152
+ "f1": 0.731421,
153
+ "f1_weighted": 0.74421,
154
+ "precision": 0.709751,
155
+ "precision_weighted": 0.777606,
156
+ "recall": 0.795404,
157
+ "recall_weighted": 0.748823,
158
+ "ap": null,
159
+ "ap_weighted": null
160
+ },
161
+ {
162
+ "accuracy": 0.742771,
163
+ "f1": 0.724449,
164
+ "f1_weighted": 0.740021,
165
+ "precision": 0.705501,
166
+ "precision_weighted": 0.770977,
167
+ "recall": 0.781541,
168
+ "recall_weighted": 0.742771,
169
+ "ap": null,
170
+ "ap_weighted": null
171
+ },
172
+ {
173
+ "accuracy": 0.737054,
174
+ "f1": 0.724753,
175
+ "f1_weighted": 0.73541,
176
+ "precision": 0.71329,
177
+ "precision_weighted": 0.772376,
178
+ "recall": 0.780801,
179
+ "recall_weighted": 0.737054,
180
+ "ap": null,
181
+ "ap_weighted": null
182
+ },
183
+ {
184
+ "accuracy": 0.728985,
185
+ "f1": 0.715537,
186
+ "f1_weighted": 0.722318,
187
+ "precision": 0.702076,
188
+ "precision_weighted": 0.7627,
189
+ "recall": 0.779263,
190
+ "recall_weighted": 0.728985,
191
+ "ap": null,
192
+ "ap_weighted": null
193
+ },
194
+ {
195
+ "accuracy": 0.710155,
196
+ "f1": 0.688617,
197
+ "f1_weighted": 0.70397,
198
+ "precision": 0.679415,
199
+ "precision_weighted": 0.755518,
200
+ "recall": 0.753396,
201
+ "recall_weighted": 0.710155,
202
+ "ap": null,
203
+ "ap_weighted": null
204
+ },
205
+ {
206
+ "accuracy": 0.72495,
207
+ "f1": 0.710935,
208
+ "f1_weighted": 0.725437,
209
+ "precision": 0.69525,
210
+ "precision_weighted": 0.77026,
211
+ "recall": 0.775474,
212
+ "recall_weighted": 0.72495,
213
+ "ap": null,
214
+ "ap_weighted": null
215
+ },
216
+ {
217
+ "accuracy": 0.714862,
218
+ "f1": 0.706409,
219
+ "f1_weighted": 0.716233,
220
+ "precision": 0.695636,
221
+ "precision_weighted": 0.75582,
222
+ "recall": 0.76044,
223
+ "recall_weighted": 0.714862,
224
+ "ap": null,
225
+ "ap_weighted": null
226
+ },
227
+ {
228
+ "accuracy": 0.73302,
229
+ "f1": 0.728273,
230
+ "f1_weighted": 0.72994,
231
+ "precision": 0.714064,
232
+ "precision_weighted": 0.766445,
233
+ "recall": 0.785538,
234
+ "recall_weighted": 0.73302,
235
+ "ap": null,
236
+ "ap_weighted": null
237
+ },
238
+ {
239
+ "accuracy": 0.721587,
240
+ "f1": 0.713442,
241
+ "f1_weighted": 0.72304,
242
+ "precision": 0.703141,
243
+ "precision_weighted": 0.765228,
244
+ "recall": 0.771423,
245
+ "recall_weighted": 0.721587,
246
+ "ap": null,
247
+ "ap_weighted": null
248
+ }
249
+ ],
250
+ "accuracy": 0.730733,
251
+ "f1": 0.717816,
252
+ "f1_weighted": 0.728326,
253
+ "precision": 0.703198,
254
+ "precision_weighted": 0.766737,
255
+ "recall": 0.777441,
256
+ "recall_weighted": 0.730733,
257
+ "ap": NaN,
258
+ "ap_weighted": NaN,
259
+ "main_score": 0.730733,
260
+ "hf_subset": "en",
261
+ "languages": [
262
+ "eng-Latn"
263
+ ]
264
+ }
265
+ ]
266
+ },
267
+ "evaluation_time": 77.37549757957458,
268
+ "kg_co2_emissions": null,
269
+ "date": null
270
+ }
results/MedrxivClusteringP2P.json ADDED
@@ -0,0 +1,33 @@
1
+ {
2
+ "dataset_revision": "e7a26af6f3ae46b30dde8737f02c07b1505bcc73",
3
+ "task_name": "MedrxivClusteringP2P",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "v_measure": 0.320189,
9
+ "v_measure_std": 0.016559,
10
+ "v_measures": [
11
+ 0.296823,
12
+ 0.304752,
13
+ 0.305142,
14
+ 0.305147,
15
+ 0.312206,
16
+ 0.329949,
17
+ 0.331446,
18
+ 0.346814,
19
+ 0.341258,
20
+ 0.328349
21
+ ],
22
+ "main_score": 0.320189,
23
+ "hf_subset": "default",
24
+ "languages": [
25
+ "eng-Latn"
26
+ ]
27
+ }
28
+ ]
29
+ },
30
+ "evaluation_time": 239.82811045646667,
31
+ "kg_co2_emissions": null,
32
+ "date": null
33
+ }
results/MedrxivClusteringS2S.json ADDED
@@ -0,0 +1,33 @@
1
+ {
2
+ "dataset_revision": "35191c8c0dca72d8ff3efcd72aa802307d469663",
3
+ "task_name": "MedrxivClusteringS2S",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "v_measure": 0.292197,
9
+ "v_measure_std": 0.016323,
10
+ "v_measures": [
11
+ 0.282038,
12
+ 0.273404,
13
+ 0.272545,
14
+ 0.277393,
15
+ 0.278717,
16
+ 0.313876,
17
+ 0.299438,
18
+ 0.306332,
19
+ 0.317065,
20
+ 0.301159
21
+ ],
22
+ "main_score": 0.292197,
23
+ "hf_subset": "default",
24
+ "languages": [
25
+ "eng-Latn"
26
+ ]
27
+ }
28
+ ]
29
+ },
30
+ "evaluation_time": 230.06558442115784,
31
+ "kg_co2_emissions": null,
32
+ "date": null
33
+ }
results/MindSmallReranking.json ADDED
@@ -0,0 +1,252 @@
1
+ {
2
+ "dataset_revision": "227478e3235572039f4f7661840e059f31ef6eb1",
3
+ "task_name": "MindSmallReranking",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.1327,
9
+ "ndcg_at_3": 0.20927,
10
+ "ndcg_at_5": 0.25639,
11
+ "ndcg_at_10": 0.31829,
12
+ "ndcg_at_20": 0.3708,
13
+ "ndcg_at_100": 0.43283,
14
+ "ndcg_at_1000": 0.43619,
15
+ "map_at_1": 0.0999,
16
+ "map_at_3": 0.17119,
17
+ "map_at_5": 0.19926,
18
+ "map_at_10": 0.22717,
19
+ "map_at_20": 0.24442,
20
+ "map_at_100": 0.25782,
21
+ "map_at_1000": 0.25824,
22
+ "recall_at_1": 0.0999,
23
+ "recall_at_3": 0.25983,
24
+ "recall_at_5": 0.37223,
25
+ "recall_at_10": 0.54684,
26
+ "recall_at_20": 0.72545,
27
+ "recall_at_100": 0.98374,
28
+ "recall_at_1000": 1.0,
29
+ "accuracy": 0.0999,
30
+ "precision_at_1": 0.1327,
31
+ "precision_at_3": 0.1165,
32
+ "precision_at_5": 0.10256,
33
+ "precision_at_10": 0.07944,
34
+ "precision_at_20": 0.05634,
35
+ "precision_at_100": 0.01767,
36
+ "precision_at_1000": 0.00183,
37
+ "mrr_at_1": 0.132698,
38
+ "mrr_at_3": 0.217654,
39
+ "mrr_at_5": 0.247864,
40
+ "mrr_at_10": 0.273388,
41
+ "mrr_at_20": 0.285332,
42
+ "mrr_at_100": 0.290515,
43
+ "mrr_at_1000": 0.29056,
44
+ "nauc_ndcg_at_1_max": -0.081708,
45
+ "nauc_ndcg_at_1_std": 0.02064,
46
+ "nauc_ndcg_at_1_diff1": 0.102634,
47
+ "nauc_ndcg_at_3_max": -0.17665,
48
+ "nauc_ndcg_at_3_std": 0.001483,
49
+ "nauc_ndcg_at_3_diff1": 0.123902,
50
+ "nauc_ndcg_at_5_max": -0.208195,
51
+ "nauc_ndcg_at_5_std": 0.005442,
52
+ "nauc_ndcg_at_5_diff1": 0.126805,
53
+ "nauc_ndcg_at_10_max": -0.239379,
54
+ "nauc_ndcg_at_10_std": 0.008363,
55
+ "nauc_ndcg_at_10_diff1": 0.125178,
56
+ "nauc_ndcg_at_20_max": -0.249666,
57
+ "nauc_ndcg_at_20_std": 0.013012,
58
+ "nauc_ndcg_at_20_diff1": 0.122677,
59
+ "nauc_ndcg_at_100_max": -0.192326,
60
+ "nauc_ndcg_at_100_std": 0.013436,
61
+ "nauc_ndcg_at_100_diff1": 0.116273,
62
+ "nauc_ndcg_at_1000_max": -0.181592,
63
+ "nauc_ndcg_at_1000_std": 0.011769,
64
+ "nauc_ndcg_at_1000_diff1": 0.115098,
65
+ "nauc_map_at_1_max": -0.146464,
66
+ "nauc_map_at_1_std": -0.003813,
67
+ "nauc_map_at_1_diff1": 0.124138,
68
+ "nauc_map_at_3_max": -0.189751,
69
+ "nauc_map_at_3_std": -0.005117,
70
+ "nauc_map_at_3_diff1": 0.130425,
71
+ "nauc_map_at_5_max": -0.205115,
72
+ "nauc_map_at_5_std": -0.000647,
73
+ "nauc_map_at_5_diff1": 0.130707,
74
+ "nauc_map_at_10_max": -0.217777,
75
+ "nauc_map_at_10_std": 0.002196,
76
+ "nauc_map_at_10_diff1": 0.128912,
77
+ "nauc_map_at_20_max": -0.219413,
78
+ "nauc_map_at_20_std": 0.004418,
79
+ "nauc_map_at_20_diff1": 0.127419,
80
+ "nauc_map_at_100_max": -0.20762,
81
+ "nauc_map_at_100_std": 0.005188,
82
+ "nauc_map_at_100_diff1": 0.125842,
83
+ "nauc_map_at_1000_max": -0.206501,
84
+ "nauc_map_at_1000_std": 0.005022,
85
+ "nauc_map_at_1000_diff1": 0.125725,
86
+ "nauc_recall_at_1_max": -0.146464,
87
+ "nauc_recall_at_1_std": -0.003813,
88
+ "nauc_recall_at_1_diff1": 0.124138,
89
+ "nauc_recall_at_3_max": -0.215306,
90
+ "nauc_recall_at_3_std": -0.006464,
91
+ "nauc_recall_at_3_diff1": 0.128655,
92
+ "nauc_recall_at_5_max": -0.258214,
93
+ "nauc_recall_at_5_std": 0.001474,
94
+ "nauc_recall_at_5_diff1": 0.128479,
95
+ "nauc_recall_at_10_max": -0.333419,
96
+ "nauc_recall_at_10_std": 0.005467,
97
+ "nauc_recall_at_10_diff1": 0.123698,
98
+ "nauc_recall_at_20_max": -0.426621,
99
+ "nauc_recall_at_20_std": 0.018412,
100
+ "nauc_recall_at_20_diff1": 0.120755,
101
+ "nauc_recall_at_100_max": -0.747826,
102
+ "nauc_recall_at_100_std": 0.10962,
103
+ "nauc_recall_at_100_diff1": 0.146903,
104
+ "nauc_recall_at_1000_max": -0.510551,
105
+ "nauc_recall_at_1000_std": -0.038002,
106
+ "nauc_recall_at_1000_diff1": 0.228245,
107
+ "nauc_precision_at_1_max": -0.081708,
108
+ "nauc_precision_at_1_std": 0.02064,
109
+ "nauc_precision_at_1_diff1": 0.102634,
110
+ "nauc_precision_at_3_max": -0.134086,
111
+ "nauc_precision_at_3_std": 0.026447,
112
+ "nauc_precision_at_3_diff1": 0.102671,
113
+ "nauc_precision_at_5_max": -0.15012,
114
+ "nauc_precision_at_5_std": 0.03958,
115
+ "nauc_precision_at_5_diff1": 0.09172,
116
+ "nauc_precision_at_10_max": -0.130677,
117
+ "nauc_precision_at_10_std": 0.0493,
118
+ "nauc_precision_at_10_diff1": 0.054265,
119
+ "nauc_precision_at_20_max": -0.040719,
120
+ "nauc_precision_at_20_std": 0.055028,
121
+ "nauc_precision_at_20_diff1": 0.004641,
122
+ "nauc_precision_at_100_max": 0.22898,
123
+ "nauc_precision_at_100_std": 0.027213,
124
+ "nauc_precision_at_100_diff1": -0.065963,
125
+ "nauc_precision_at_1000_max": 0.256892,
126
+ "nauc_precision_at_1000_std": 0.019808,
127
+ "nauc_precision_at_1000_diff1": -0.070053,
128
+ "nauc_mrr_at_1_max": -0.081708,
129
+ "nauc_mrr_at_1_std": 0.02064,
130
+ "nauc_mrr_at_1_diff1": 0.102634,
131
+ "nauc_mrr_at_3_max": -0.123748,
132
+ "nauc_mrr_at_3_std": 0.015187,
133
+ "nauc_mrr_at_3_diff1": 0.107211,
134
+ "nauc_mrr_at_5_max": -0.1372,
135
+ "nauc_mrr_at_5_std": 0.017112,
136
+ "nauc_mrr_at_5_diff1": 0.107824,
137
+ "nauc_mrr_at_10_max": -0.146242,
138
+ "nauc_mrr_at_10_std": 0.017397,
139
+ "nauc_mrr_at_10_diff1": 0.107443,
140
+ "nauc_mrr_at_20_max": -0.146814,
141
+ "nauc_mrr_at_20_std": 0.0176,
142
+ "nauc_mrr_at_20_diff1": 0.107332,
143
+ "nauc_mrr_at_100_max": -0.143361,
144
+ "nauc_mrr_at_100_std": 0.017502,
145
+ "nauc_mrr_at_100_diff1": 0.107325,
146
+ "nauc_mrr_at_1000_max": -0.143258,
147
+ "nauc_mrr_at_1000_std": 0.017483,
148
+ "nauc_mrr_at_1000_diff1": 0.10732,
149
+ "hit_rate_at_1": 0.1327,
150
+ "hit_rate_at_3": 0.3305,
151
+ "hit_rate_at_5": 0.46334,
152
+ "hit_rate_at_10": 0.65409,
153
+ "hit_rate_at_20": 0.82372,
154
+ "hit_rate_at_100": 0.9945,
155
+ "hit_rate_at_1000": 1.0,
156
+ "max_over_subqueries_ndcg_at_1": 0.1628,
157
+ "max_over_subqueries_ndcg_at_3": 0.26314,
158
+ "max_over_subqueries_ndcg_at_5": 0.3157,
159
+ "max_over_subqueries_ndcg_at_10": 0.37719,
160
+ "max_over_subqueries_ndcg_at_20": 0.42295,
161
+ "max_over_subqueries_ndcg_at_100": 0.46808,
162
+ "max_over_subqueries_ndcg_at_1000": 0.46993,
163
+ "max_over_subqueries_map_at_1": 0.13509,
164
+ "max_over_subqueries_map_at_3": 0.22342,
165
+ "max_over_subqueries_map_at_5": 0.25415,
166
+ "max_over_subqueries_map_at_10": 0.28175,
167
+ "max_over_subqueries_map_at_20": 0.29665,
168
+ "max_over_subqueries_map_at_100": 0.30602,
169
+ "max_over_subqueries_map_at_1000": 0.30621,
170
+ "max_over_subqueries_recall_at_1": 0.13509,
171
+ "max_over_subqueries_recall_at_3": 0.33317,
172
+ "max_over_subqueries_recall_at_5": 0.45724,
173
+ "max_over_subqueries_recall_at_10": 0.63271,
174
+ "max_over_subqueries_recall_at_20": 0.79259,
175
+ "max_over_subqueries_recall_at_100": 0.99017,
176
+ "max_over_subqueries_recall_at_1000": 0.99999,
177
+ "max_over_subqueries_accuracy": 0.13509,
178
+ "max_over_subqueries_precision_at_1": 0.1628,
179
+ "max_over_subqueries_precision_at_3": 0.13639,
180
+ "max_over_subqueries_precision_at_5": 0.11474,
181
+ "max_over_subqueries_precision_at_10": 0.08312,
182
+ "max_over_subqueries_precision_at_20": 0.0548,
183
+ "max_over_subqueries_precision_at_100": 0.01494,
184
+ "max_over_subqueries_precision_at_1000": 0.00152,
185
+ "max_over_subqueries_mrr_at_1_max": -0.095877,
186
+ "max_over_subqueries_mrr_at_1_std": 0.023798,
187
+ "max_over_subqueries_mrr_at_1_diff1": 0.11729,
188
+ "max_over_subqueries_mrr_at_3_max": -0.166541,
189
+ "max_over_subqueries_mrr_at_3_std": 0.010753,
190
+ "max_over_subqueries_mrr_at_3_diff1": 0.100723,
191
+ "max_over_subqueries_mrr_at_5_max": -0.172377,
192
+ "max_over_subqueries_mrr_at_5_std": 0.015844,
193
+ "max_over_subqueries_mrr_at_5_diff1": 0.086746,
194
+ "max_over_subqueries_mrr_at_10_max": -0.106299,
195
+ "max_over_subqueries_mrr_at_10_std": 0.030596,
196
+ "max_over_subqueries_mrr_at_10_diff1": 0.04075,
197
+ "max_over_subqueries_mrr_at_20_max": 0.049465,
198
+ "max_over_subqueries_mrr_at_20_std": 0.071834,
199
+ "max_over_subqueries_mrr_at_20_diff1": -0.00227,
200
+ "max_over_subqueries_mrr_at_100_max": 0.367398,
201
+ "max_over_subqueries_mrr_at_100_std": 0.120132,
202
+ "max_over_subqueries_mrr_at_100_diff1": -0.035885,
203
+ "max_over_subqueries_mrr_at_1000_max": 0.388349,
204
+ "max_over_subqueries_mrr_at_1000_std": 0.113675,
205
+ "max_over_subqueries_mrr_at_1000_diff1": -0.035731,
206
+ "max_over_subqueries_mrr_at_1": 0.162804,
207
+ "max_over_subqueries_mrr_at_3": 0.261047,
208
+ "max_over_subqueries_mrr_at_5": 0.292386,
209
+ "max_over_subqueries_mrr_at_10": 0.31674,
210
+ "max_over_subqueries_mrr_at_20": 0.327097,
211
+ "max_over_subqueries_mrr_at_100": 0.331358,
212
+ "max_over_subqueries_mrr_at_1000": 0.331393,
213
+ "max_over_subqueries_nauc_mrr_at_1_max": -0.095877,
214
+ "max_over_subqueries_nauc_mrr_at_1_std": 0.023798,
215
+ "max_over_subqueries_nauc_mrr_at_1_diff1": 0.11729,
216
+ "max_over_subqueries_nauc_mrr_at_3_max": -0.147125,
217
+ "max_over_subqueries_nauc_mrr_at_3_std": 0.008596,
218
+ "max_over_subqueries_nauc_mrr_at_3_diff1": 0.113741,
219
+ "max_over_subqueries_nauc_mrr_at_5_max": -0.15863,
220
+ "max_over_subqueries_nauc_mrr_at_5_std": 0.007266,
221
+ "max_over_subqueries_nauc_mrr_at_5_diff1": 0.113937,
222
+ "max_over_subqueries_nauc_mrr_at_10_max": -0.163444,
223
+ "max_over_subqueries_nauc_mrr_at_10_std": 0.006491,
224
+ "max_over_subqueries_nauc_mrr_at_10_diff1": 0.112549,
225
+ "max_over_subqueries_nauc_mrr_at_20_max": -0.161817,
226
+ "max_over_subqueries_nauc_mrr_at_20_std": 0.007606,
227
+ "max_over_subqueries_nauc_mrr_at_20_diff1": 0.112904,
228
+ "max_over_subqueries_nauc_mrr_at_100_max": -0.159176,
229
+ "max_over_subqueries_nauc_mrr_at_100_std": 0.008021,
230
+ "max_over_subqueries_nauc_mrr_at_100_diff1": 0.113261,
231
+ "max_over_subqueries_nauc_mrr_at_1000_max": -0.15912,
232
+ "max_over_subqueries_nauc_mrr_at_1000_std": 0.008007,
233
+ "max_over_subqueries_nauc_mrr_at_1000_diff1": 0.113271,
234
+ "max_over_subqueries_hit_rate_at_1": 0.1628,
235
+ "max_over_subqueries_hit_rate_at_3": 0.39051,
236
+ "max_over_subqueries_hit_rate_at_5": 0.52791,
237
+ "max_over_subqueries_hit_rate_at_10": 0.70977,
238
+ "max_over_subqueries_hit_rate_at_20": 0.85624,
239
+ "max_over_subqueries_hit_rate_at_100": 0.99562,
240
+ "max_over_subqueries_hit_rate_at_1000": 0.99999,
241
+ "main_score": 0.30621,
242
+ "hf_subset": "default",
243
+ "languages": [
244
+ "eng-Latn"
245
+ ]
246
+ }
247
+ ]
248
+ },
249
+ "evaluation_time": 20952.249630451202,
250
+ "kg_co2_emissions": null,
251
+ "date": null
252
+ }
results/NFCorpus.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "ec0fa4fe99da2ff19ca1214b7966684033a58814",
3
+ "task_name": "NFCorpus",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.37616,
9
+ "ndcg_at_3": 0.33585,
10
+ "ndcg_at_5": 0.32471,
11
+ "ndcg_at_10": 0.30352,
12
+ "ndcg_at_20": 0.28323,
13
+ "ndcg_at_100": 0.28527,
14
+ "ndcg_at_1000": 0.37648,
15
+ "map_at_1": 0.03986,
16
+ "map_at_3": 0.06674,
17
+ "map_at_5": 0.08213,
18
+ "map_at_10": 0.10094,
19
+ "map_at_20": 0.11347,
20
+ "map_at_100": 0.13336,
21
+ "map_at_1000": 0.14815,
22
+ "recall_at_1": 0.03986,
23
+ "recall_at_3": 0.07646,
24
+ "recall_at_5": 0.10717,
25
+ "recall_at_10": 0.14684,
26
+ "recall_at_20": 0.18805,
27
+ "recall_at_100": 0.30525,
28
+ "recall_at_1000": 0.62948,
29
+ "accuracy": 0.03986,
30
+ "precision_at_1": 0.39319,
31
+ "precision_at_3": 0.32301,
32
+ "precision_at_5": 0.28978,
33
+ "precision_at_10": 0.23777,
34
+ "precision_at_20": 0.17585,
35
+ "precision_at_100": 0.07963,
36
+ "precision_at_1000": 0.02108,
37
+ "mrr_at_1": 0.393189,
38
+ "mrr_at_3": 0.457688,
39
+ "mrr_at_5": 0.472394,
40
+ "mrr_at_10": 0.483244,
41
+ "mrr_at_20": 0.487134,
42
+ "mrr_at_100": 0.489953,
43
+ "mrr_at_1000": 0.490448,
44
+ "nauc_ndcg_at_1_max": 0.529601,
45
+ "nauc_ndcg_at_1_std": 0.23544,
46
+ "nauc_ndcg_at_1_diff1": 0.348504,
47
+ "nauc_ndcg_at_3_max": 0.52427,
48
+ "nauc_ndcg_at_3_std": 0.261429,
49
+ "nauc_ndcg_at_3_diff1": 0.218641,
50
+ "nauc_ndcg_at_5_max": 0.514829,
51
+ "nauc_ndcg_at_5_std": 0.269241,
52
+ "nauc_ndcg_at_5_diff1": 0.178952,
53
+ "nauc_ndcg_at_10_max": 0.520771,
54
+ "nauc_ndcg_at_10_std": 0.29523,
55
+ "nauc_ndcg_at_10_diff1": 0.161803,
56
+ "nauc_ndcg_at_20_max": 0.519382,
57
+ "nauc_ndcg_at_20_std": 0.318275,
58
+ "nauc_ndcg_at_20_diff1": 0.154467,
59
+ "nauc_ndcg_at_100_max": 0.514363,
60
+ "nauc_ndcg_at_100_std": 0.348981,
61
+ "nauc_ndcg_at_100_diff1": 0.169456,
62
+ "nauc_ndcg_at_1000_max": 0.556651,
63
+ "nauc_ndcg_at_1000_std": 0.402799,
64
+ "nauc_ndcg_at_1000_diff1": 0.180451,
65
+ "nauc_map_at_1_max": 0.214318,
66
+ "nauc_map_at_1_std": -0.074962,
67
+ "nauc_map_at_1_diff1": 0.456975,
68
+ "nauc_map_at_3_max": 0.275051,
69
+ "nauc_map_at_3_std": 0.000398,
70
+ "nauc_map_at_3_diff1": 0.336704,
71
+ "nauc_map_at_5_max": 0.279643,
72
+ "nauc_map_at_5_std": 0.014286,
73
+ "nauc_map_at_5_diff1": 0.264257,
74
+ "nauc_map_at_10_max": 0.32991,
75
+ "nauc_map_at_10_std": 0.056649,
76
+ "nauc_map_at_10_diff1": 0.233588,
77
+ "nauc_map_at_20_max": 0.362824,
78
+ "nauc_map_at_20_std": 0.101464,
79
+ "nauc_map_at_20_diff1": 0.21873,
80
+ "nauc_map_at_100_max": 0.407759,
81
+ "nauc_map_at_100_std": 0.183005,
82
+ "nauc_map_at_100_diff1": 0.198675,
83
+ "nauc_map_at_1000_max": 0.427592,
84
+ "nauc_map_at_1000_std": 0.229093,
85
+ "nauc_map_at_1000_diff1": 0.186305,
86
+ "nauc_recall_at_1_max": 0.214318,
87
+ "nauc_recall_at_1_std": -0.074962,
88
+ "nauc_recall_at_1_diff1": 0.456975,
89
+ "nauc_recall_at_3_max": 0.264126,
90
+ "nauc_recall_at_3_std": 0.005225,
91
+ "nauc_recall_at_3_diff1": 0.289297,
92
+ "nauc_recall_at_5_max": 0.206584,
93
+ "nauc_recall_at_5_std": -0.008071,
94
+ "nauc_recall_at_5_diff1": 0.170944,
95
+ "nauc_recall_at_10_max": 0.254833,
96
+ "nauc_recall_at_10_std": 0.032072,
97
+ "nauc_recall_at_10_diff1": 0.166475,
98
+ "nauc_recall_at_20_max": 0.255375,
99
+ "nauc_recall_at_20_std": 0.076962,
100
+ "nauc_recall_at_20_diff1": 0.14674,
101
+ "nauc_recall_at_100_max": 0.264781,
102
+ "nauc_recall_at_100_std": 0.201009,
103
+ "nauc_recall_at_100_diff1": 0.097425,
104
+ "nauc_recall_at_1000_max": 0.249207,
105
+ "nauc_recall_at_1000_std": 0.301,
106
+ "nauc_recall_at_1000_diff1": 0.067165,
107
+ "nauc_precision_at_1_max": 0.541908,
108
+ "nauc_precision_at_1_std": 0.237884,
109
+ "nauc_precision_at_1_diff1": 0.341778,
110
+ "nauc_precision_at_3_max": 0.515139,
111
+ "nauc_precision_at_3_std": 0.301521,
112
+ "nauc_precision_at_3_diff1": 0.116484,
113
+ "nauc_precision_at_5_max": 0.50138,
114
+ "nauc_precision_at_5_std": 0.324079,
115
+ "nauc_precision_at_5_diff1": 0.049677,
116
+ "nauc_precision_at_10_max": 0.498338,
117
+ "nauc_precision_at_10_std": 0.364754,
118
+ "nauc_precision_at_10_diff1": 0.020622,
119
+ "nauc_precision_at_20_max": 0.486238,
120
+ "nauc_precision_at_20_std": 0.438763,
121
+ "nauc_precision_at_20_diff1": -0.024567,
122
+ "nauc_precision_at_100_max": 0.373518,
123
+ "nauc_precision_at_100_std": 0.505594,
124
+ "nauc_precision_at_100_diff1": -0.078045,
125
+ "nauc_precision_at_1000_max": 0.256391,
126
+ "nauc_precision_at_1000_std": 0.425971,
127
+ "nauc_precision_at_1000_diff1": -0.118528,
128
+ "nauc_mrr_at_1_max": 0.541908,
129
+ "nauc_mrr_at_1_std": 0.237884,
130
+ "nauc_mrr_at_1_diff1": 0.341778,
131
+ "nauc_mrr_at_3_max": 0.57666,
132
+ "nauc_mrr_at_3_std": 0.284277,
133
+ "nauc_mrr_at_3_diff1": 0.306146,
134
+ "nauc_mrr_at_5_max": 0.580285,
135
+ "nauc_mrr_at_5_std": 0.284516,
136
+ "nauc_mrr_at_5_diff1": 0.304687,
137
+ "nauc_mrr_at_10_max": 0.584743,
138
+ "nauc_mrr_at_10_std": 0.297015,
139
+ "nauc_mrr_at_10_diff1": 0.300633,
140
+ "nauc_mrr_at_20_max": 0.586217,
141
+ "nauc_mrr_at_20_std": 0.299881,
142
+ "nauc_mrr_at_20_diff1": 0.301003,
143
+ "nauc_mrr_at_100_max": 0.58601,
144
+ "nauc_mrr_at_100_std": 0.300557,
145
+ "nauc_mrr_at_100_diff1": 0.300382,
146
+ "nauc_mrr_at_1000_max": 0.585767,
147
+ "nauc_mrr_at_1000_std": 0.300093,
148
+ "nauc_mrr_at_1000_diff1": 0.300522,
149
+ "hit_rate_at_1": 0.39319,
150
+ "hit_rate_at_3": 0.5418,
151
+ "hit_rate_at_5": 0.60372,
152
+ "hit_rate_at_10": 0.68421,
153
+ "hit_rate_at_20": 0.73684,
154
+ "hit_rate_at_100": 0.83901,
155
+ "hit_rate_at_1000": 0.95356,
156
+ "main_score": 0.30352,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 33.98207950592041,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/NQ.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "b774495ed302d8c44a3a7ea25c90dbce03968f31",
3
+ "task_name": "NQ",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.33951,
9
+ "ndcg_at_3": 0.43457,
10
+ "ndcg_at_5": 0.47221,
11
+ "ndcg_at_10": 0.50709,
12
+ "ndcg_at_20": 0.52863,
13
+ "ndcg_at_100": 0.55254,
14
+ "ndcg_at_1000": 0.56244,
15
+ "map_at_1": 0.30453,
16
+ "map_at_3": 0.39928,
17
+ "map_at_5": 0.42211,
18
+ "map_at_10": 0.43813,
19
+ "map_at_20": 0.4448,
20
+ "map_at_100": 0.44863,
21
+ "map_at_1000": 0.44905,
22
+ "recall_at_1": 0.30453,
23
+ "recall_at_3": 0.50403,
24
+ "recall_at_5": 0.5905,
25
+ "recall_at_10": 0.69153,
26
+ "recall_at_20": 0.77091,
27
+ "recall_at_100": 0.89098,
28
+ "recall_at_1000": 0.9641,
29
+ "accuracy": 0.30453,
30
+ "precision_at_1": 0.33951,
31
+ "precision_at_3": 0.19409,
32
+ "precision_at_5": 0.13841,
33
+ "precision_at_10": 0.08204,
34
+ "precision_at_20": 0.04612,
35
+ "precision_at_100": 0.01077,
36
+ "precision_at_1000": 0.00117,
37
+ "mrr_at_1": 0.339803,
38
+ "mrr_at_3": 0.432165,
39
+ "mrr_at_5": 0.450835,
40
+ "mrr_at_10": 0.463697,
41
+ "mrr_at_20": 0.469005,
42
+ "mrr_at_100": 0.471857,
43
+ "mrr_at_1000": 0.472159,
44
+ "nauc_ndcg_at_1_max": 0.270025,
45
+ "nauc_ndcg_at_1_std": -0.026849,
46
+ "nauc_ndcg_at_1_diff1": 0.349592,
47
+ "nauc_ndcg_at_3_max": 0.280558,
48
+ "nauc_ndcg_at_3_std": -0.022572,
49
+ "nauc_ndcg_at_3_diff1": 0.310399,
50
+ "nauc_ndcg_at_5_max": 0.29647,
51
+ "nauc_ndcg_at_5_std": -0.009819,
52
+ "nauc_ndcg_at_5_diff1": 0.310482,
53
+ "nauc_ndcg_at_10_max": 0.311006,
54
+ "nauc_ndcg_at_10_std": 0.004916,
55
+ "nauc_ndcg_at_10_diff1": 0.315014,
56
+ "nauc_ndcg_at_20_max": 0.316541,
57
+ "nauc_ndcg_at_20_std": 0.019375,
58
+ "nauc_ndcg_at_20_diff1": 0.310826,
59
+ "nauc_ndcg_at_100_max": 0.317248,
60
+ "nauc_ndcg_at_100_std": 0.025329,
61
+ "nauc_ndcg_at_100_diff1": 0.309193,
62
+ "nauc_ndcg_at_1000_max": 0.313143,
63
+ "nauc_ndcg_at_1000_std": 0.019623,
64
+ "nauc_ndcg_at_1000_diff1": 0.311716,
65
+ "nauc_map_at_1_max": 0.251462,
66
+ "nauc_map_at_1_std": -0.049018,
67
+ "nauc_map_at_1_diff1": 0.354785,
68
+ "nauc_map_at_3_max": 0.273375,
69
+ "nauc_map_at_3_std": -0.032462,
70
+ "nauc_map_at_3_diff1": 0.319255,
71
+ "nauc_map_at_5_max": 0.283607,
72
+ "nauc_map_at_5_std": -0.024037,
73
+ "nauc_map_at_5_diff1": 0.318974,
74
+ "nauc_map_at_10_max": 0.290281,
75
+ "nauc_map_at_10_std": -0.01741,
76
+ "nauc_map_at_10_diff1": 0.321299,
77
+ "nauc_map_at_20_max": 0.291684,
78
+ "nauc_map_at_20_std": -0.013166,
79
+ "nauc_map_at_20_diff1": 0.320553,
80
+ "nauc_map_at_100_max": 0.291961,
81
+ "nauc_map_at_100_std": -0.012082,
82
+ "nauc_map_at_100_diff1": 0.320184,
83
+ "nauc_map_at_1000_max": 0.291836,
84
+ "nauc_map_at_1000_std": -0.012211,
85
+ "nauc_map_at_1000_diff1": 0.320286,
86
+ "nauc_recall_at_1_max": 0.251462,
87
+ "nauc_recall_at_1_std": -0.049018,
88
+ "nauc_recall_at_1_diff1": 0.354785,
89
+ "nauc_recall_at_3_max": 0.281935,
90
+ "nauc_recall_at_3_std": -0.015814,
91
+ "nauc_recall_at_3_diff1": 0.278599,
92
+ "nauc_recall_at_5_max": 0.316856,
93
+ "nauc_recall_at_5_std": 0.011386,
94
+ "nauc_recall_at_5_diff1": 0.274985,
95
+ "nauc_recall_at_10_max": 0.369207,
96
+ "nauc_recall_at_10_std": 0.063566,
97
+ "nauc_recall_at_10_diff1": 0.282917,
98
+ "nauc_recall_at_20_max": 0.414031,
99
+ "nauc_recall_at_20_std": 0.155072,
100
+ "nauc_recall_at_20_diff1": 0.250797,
101
+ "nauc_recall_at_100_max": 0.516506,
102
+ "nauc_recall_at_100_std": 0.343967,
103
+ "nauc_recall_at_100_diff1": 0.198863,
104
+ "nauc_recall_at_1000_max": 0.689949,
105
+ "nauc_recall_at_1000_std": 0.621901,
106
+ "nauc_recall_at_1000_diff1": 0.139196,
107
+ "nauc_precision_at_1_max": 0.270025,
108
+ "nauc_precision_at_1_std": -0.026849,
109
+ "nauc_precision_at_1_diff1": 0.349592,
110
+ "nauc_precision_at_3_max": 0.299539,
111
+ "nauc_precision_at_3_std": 0.033226,
112
+ "nauc_precision_at_3_diff1": 0.242101,
113
+ "nauc_precision_at_5_max": 0.309114,
114
+ "nauc_precision_at_5_std": 0.073494,
115
+ "nauc_precision_at_5_diff1": 0.205136,
116
+ "nauc_precision_at_10_max": 0.306248,
117
+ "nauc_precision_at_10_std": 0.126028,
118
+ "nauc_precision_at_10_diff1": 0.169771,
119
+ "nauc_precision_at_20_max": 0.288905,
120
+ "nauc_precision_at_20_std": 0.186661,
121
+ "nauc_precision_at_20_diff1": 0.110903,
122
+ "nauc_precision_at_100_max": 0.221255,
123
+ "nauc_precision_at_100_std": 0.235258,
124
+ "nauc_precision_at_100_diff1": 0.006968,
125
+ "nauc_precision_at_1000_max": 0.121641,
126
+ "nauc_precision_at_1000_std": 0.195476,
127
+ "nauc_precision_at_1000_diff1": -0.048779,
128
+ "nauc_mrr_at_1_max": 0.269532,
129
+ "nauc_mrr_at_1_std": -0.02756,
130
+ "nauc_mrr_at_1_diff1": 0.34872,
131
+ "nauc_mrr_at_3_max": 0.285746,
132
+ "nauc_mrr_at_3_std": -0.009905,
133
+ "nauc_mrr_at_3_diff1": 0.318684,
134
+ "nauc_mrr_at_5_max": 0.291567,
135
+ "nauc_mrr_at_5_std": -0.004671,
136
+ "nauc_mrr_at_5_diff1": 0.319689,
137
+ "nauc_mrr_at_10_max": 0.295961,
138
+ "nauc_mrr_at_10_std": -0.000471,
139
+ "nauc_mrr_at_10_diff1": 0.321211,
140
+ "nauc_mrr_at_20_max": 0.296766,
141
+ "nauc_mrr_at_20_std": 0.001583,
142
+ "nauc_mrr_at_20_diff1": 0.320293,
143
+ "nauc_mrr_at_100_max": 0.296661,
144
+ "nauc_mrr_at_100_std": 0.001831,
145
+ "nauc_mrr_at_100_diff1": 0.320293,
146
+ "nauc_mrr_at_1000_max": 0.296562,
147
+ "nauc_mrr_at_1000_std": 0.001724,
148
+ "nauc_mrr_at_1000_diff1": 0.320366,
149
+ "hit_rate_at_1": 0.33951,
150
+ "hit_rate_at_3": 0.54838,
151
+ "hit_rate_at_5": 0.63007,
152
+ "hit_rate_at_10": 0.72538,
153
+ "hit_rate_at_20": 0.79954,
154
+ "hit_rate_at_100": 0.90759,
155
+ "hit_rate_at_1000": 0.97016,
156
+ "main_score": 0.50709,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 30726.536249637604,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }