Commit e405320 by Antreas (verified) · 1 parent: 0e8f50f

Initial upload: ogma-small embedding model

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. README.md +927 -0
  2. config.json +37 -0
  3. config.py +161 -0
  4. config.yaml +19 -0
  5. embeddings.py +143 -0
  6. model.pt +3 -0
  7. model.safetensors +3 -0
  8. ogma_model.py +203 -0
  9. pooling.py +99 -0
  10. results/AmazonCounterfactualClassification.json +268 -0
  11. results/AmazonPolarityClassification.json +140 -0
  12. results/AmazonReviewsClassification.json +140 -0
  13. results/ArXivHierarchicalClusteringP2P.json +47 -0
  14. results/ArXivHierarchicalClusteringS2S.json +47 -0
  15. results/ArguAna.json +167 -0
  16. results/AskUbuntuDupQuestions.json +167 -0
  17. results/BIOSSES.json +27 -0
  18. results/Banking77Classification.json +140 -0
  19. results/BiorxivClusteringP2P.json +33 -0
  20. results/BiorxivClusteringS2S.json +33 -0
  21. results/CQADupstackAndroidRetrieval.json +167 -0
  22. results/CQADupstackEnglishRetrieval.json +167 -0
  23. results/CQADupstackGamingRetrieval.json +167 -0
  24. results/CQADupstackGisRetrieval.json +167 -0
  25. results/CQADupstackMathematicaRetrieval.json +167 -0
  26. results/CQADupstackPhysicsRetrieval.json +167 -0
  27. results/CQADupstackProgrammersRetrieval.json +167 -0
  28. results/CQADupstackRetrieval.json +20 -0
  29. results/CQADupstackStatsRetrieval.json +167 -0
  30. results/CQADupstackTexRetrieval.json +167 -0
  31. results/CQADupstackUnixRetrieval.json +167 -0
  32. results/CQADupstackWebmastersRetrieval.json +167 -0
  33. results/CQADupstackWordpressRetrieval.json +167 -0
  34. results/ClimateFEVER.json +167 -0
  35. results/DBPedia.json +167 -0
  36. results/EmotionClassification.json +140 -0
  37. results/FEVER.json +167 -0
  38. results/FiQA2018.json +167 -0
  39. results/HotpotQA.json +167 -0
  40. results/ImdbClassification.json +140 -0
  41. results/MSMARCO.json +167 -0
  42. results/MTOPDomainClassification.json +140 -0
  43. results/MTOPIntentClassification.json +140 -0
  44. results/MassiveIntentClassification.json +140 -0
  45. results/MassiveScenarioClassification.json +140 -0
  46. results/MedrxivClusteringP2P.json +33 -0
  47. results/MedrxivClusteringS2S.json +33 -0
  48. results/MindSmallReranking.json +252 -0
  49. results/NFCorpus.json +167 -0
  50. results/NQ.json +167 -0
README.md ADDED
@@ -0,0 +1,927 @@
+ ---
+ language:
+ - en
+ license: mit
+ tags:
+ - mteb
+ - sentence-transformers
+ - embedding
+ - text-embedding
+ - ogma
+ - axiotic
+ - matryoshka
+ - small-model
+ model-index:
+ - name: ogma-small
+   results:
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/AmazonCounterfactualClassification
+       name: MTEB AmazonCounterfactualClassification
+       config: default
+       split: test
+       revision: 1f7e6a9d6fa6e64c53d146e428565640410c0df1
+     metrics:
+     - type: accuracy
+       value: 71.85
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/AmazonPolarityClassification
+       name: MTEB AmazonPolarityClassification
+       config: default
+       split: test
+       revision: e2d317d38cd51312af73b3d32a06d1a08b442046
+     metrics:
+     - type: accuracy
+       value: 76.72
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/AmazonReviewsClassification
+       name: MTEB AmazonReviewsClassification
+       config: default
+       split: test
+       revision: 6b5d328eaae8ef408dd7d775040245cf86f92e9d
+     metrics:
+     - type: accuracy
+       value: 39.0
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/ArXivHierarchicalClusteringP2P
+       name: MTEB ArXivHierarchicalClusteringP2P
+       config: default
+       split: test
+       revision: 0bbdb47bcbe3a90093699aefeed338a0f28a7ee8
+     metrics:
+     - type: v_measure
+       value: 55.54
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/ArXivHierarchicalClusteringS2S
+       name: MTEB ArXivHierarchicalClusteringS2S
+       config: default
+       split: test
+       revision: b73bd54100e5abfa6e3a23dcafb46fe4d2438dc3
+     metrics:
+     - type: v_measure
+       value: 52.12
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/ArguAna
+       name: MTEB ArguAna
+       config: default
+       split: test
+       revision: c22ab2a51041ffd869aaddef7af8d8215647e41a
+     metrics:
+     - type: ndcg_at_10
+       value: 42.32
+   - task:
+       type: Reranking
+     dataset:
+       type: mteb/AskUbuntuDupQuestions
+       name: MTEB AskUbuntuDupQuestions
+       config: default
+       split: test
+       revision: c5691e3c48741d5f83b5cc8e630653d7a8cfc048
+     metrics:
+     - type: map
+       value: 55.08
+   - task:
+       type: STS
+     dataset:
+       type: mteb/BIOSSES
+       name: MTEB BIOSSES
+       config: default
+       split: test
+       revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
+     metrics:
+     - type: cosine_spearman
+       value: 83.81
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/Banking77Classification
+       name: MTEB Banking77Classification
+       config: default
+       split: test
+       revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
+     metrics:
+     - type: accuracy
+       value: 77.38
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/BiorxivClusteringP2P
+       name: MTEB BiorxivClusteringP2P
+       config: default
+       split: test
+       revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
+     metrics:
+     - type: v_measure
+       value: 33.28
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/BiorxivClusteringS2S
+       name: MTEB BiorxivClusteringS2S
+       config: default
+       split: test
+       revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
+     metrics:
+     - type: v_measure
+       value: 25.41
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackAndroidRetrieval
+       name: MTEB CQADupstackAndroidRetrieval
+       config: default
+       split: test
+       revision: 9be4c0e46342e8e3aff577a89b9a1ec9bc6b4af3
+     metrics:
+     - type: ndcg_at_10
+       value: 36.8
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackEnglishRetrieval
+       name: MTEB CQADupstackEnglishRetrieval
+       config: default
+       split: test
+       revision: ad9991cb51e31e31e430383c75ffb2885547b5f0
+     metrics:
+     - type: ndcg_at_10
+       value: 33.01
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackGamingRetrieval
+       name: MTEB CQADupstackGamingRetrieval
+       config: default
+       split: test
+       revision: 4885aa143210c98657558c04aaf3dc47cfb54340
+     metrics:
+     - type: ndcg_at_10
+       value: 45.01
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackGisRetrieval
+       name: MTEB CQADupstackGisRetrieval
+       config: default
+       split: test
+       revision: 5003b3064772da1887988e05400cf3806fe491f2
+     metrics:
+     - type: ndcg_at_10
+       value: 28.19
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackMathematicaRetrieval
+       name: MTEB CQADupstackMathematicaRetrieval
+       config: default
+       split: test
+       revision: 90fceea13679c63fe563ded68f3b6f06e50061de
+     metrics:
+     - type: ndcg_at_10
+       value: 21.65
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackPhysicsRetrieval
+       name: MTEB CQADupstackPhysicsRetrieval
+       config: default
+       split: test
+       revision: 79531abbd1fb92d06c6d6315a0cbbbf5bb247ea4
+     metrics:
+     - type: ndcg_at_10
+       value: 33.83
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackProgrammersRetrieval
+       name: MTEB CQADupstackProgrammersRetrieval
+       config: default
+       split: test
+       revision: 6184bc1440d2dbc7612be22b50686b8826d22b32
+     metrics:
+     - type: ndcg_at_10
+       value: 33.07
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackRetrieval
+       name: MTEB CQADupstackRetrieval
+       config: default
+       split: test
+       revision: '1'
+     metrics:
+     - type: ndcg_at_10
+       value: 30.04
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackStatsRetrieval
+       name: MTEB CQADupstackStatsRetrieval
+       config: default
+       split: test
+       revision: 65ac3a16b8e91f9cee4c9828cc7c335575432a2a
+     metrics:
+     - type: ndcg_at_10
+       value: 26.3
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackTexRetrieval
+       name: MTEB CQADupstackTexRetrieval
+       config: default
+       split: test
+       revision: 46989137a86843e03a6195de44b09deda022eec7
+     metrics:
+     - type: ndcg_at_10
+       value: 20.57
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackUnixRetrieval
+       name: MTEB CQADupstackUnixRetrieval
+       config: default
+       split: test
+       revision: 6c6430d3a6d36f8d2a829195bc5dc94d7e063e53
+     metrics:
+     - type: ndcg_at_10
+       value: 28.73
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackWebmastersRetrieval
+       name: MTEB CQADupstackWebmastersRetrieval
+       config: default
+       split: test
+       revision: 160c094312a0e1facb97e55eeddb698c0abe3571
+     metrics:
+     - type: ndcg_at_10
+       value: 30.25
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackWordpressRetrieval
+       name: MTEB CQADupstackWordpressRetrieval
+       config: default
+       split: test
+       revision: 4ffe81d471b1924886b33c7567bfb200e9eec5c4
+     metrics:
+     - type: ndcg_at_10
+       value: 23.09
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/ClimateFEVER
+       name: MTEB ClimateFEVER
+       config: default
+       split: test
+       revision: 47f2ac6acb640fc46020b02a5b59fdda04d39380
+     metrics:
+     - type: ndcg_at_10
+       value: 28.61
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/DBPedia
+       name: MTEB DBPedia
+       config: default
+       split: test
+       revision: c0f706b76e590d620bd6618b3ca8efdd34e2d659
+     metrics:
+     - type: ndcg_at_10
+       value: 35.94
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/EmotionClassification
+       name: MTEB EmotionClassification
+       config: default
+       split: test
+       revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
+     metrics:
+     - type: accuracy
+       value: 45.22
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/FEVER
+       name: MTEB FEVER
+       config: default
+       split: test
+       revision: bea83ef9e8fb933d90a2f1d5515737465d613e12
+     metrics:
+     - type: ndcg_at_10
+       value: 68.8
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/FiQA2018
+       name: MTEB FiQA2018
+       config: default
+       split: test
+       revision: 27a168819829fe9bcd655c2df245fb19452e8e06
+     metrics:
+     - type: ndcg_at_10
+       value: 30.05
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/HotpotQA
+       name: MTEB HotpotQA
+       config: default
+       split: test
+       revision: ab518f4d6fcca38d87c25209f94beba119d02014
+     metrics:
+     - type: ndcg_at_10
+       value: 51.57
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/ImdbClassification
+       name: MTEB ImdbClassification
+       config: default
+       split: test
+       revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
+     metrics:
+     - type: accuracy
+       value: 72.49
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/MSMARCO
+       name: MTEB MSMARCO
+       config: default
+       split: test
+       revision: c5a29a104738b98a9e76336939199e264163d4a0
+     metrics:
+     - type: ndcg_at_10
+       value: 0
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/MTOPDomainClassification
+       name: MTEB MTOPDomainClassification
+       config: default
+       split: test
+       revision: a76d16fae880597b9c73047b50159220a441cb54
+     metrics:
+     - type: accuracy
+       value: 90.65
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/MTOPIntentClassification
+       name: MTEB MTOPIntentClassification
+       config: default
+       split: test
+       revision: 2992d820f31312593c49a4890430aadadb0f0039
+     metrics:
+     - type: accuracy
+       value: 60.81
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/MassiveIntentClassification
+       name: MTEB MassiveIntentClassification
+       config: default
+       split: test
+       revision: 4672e20407010da34463acc759c162ca9734bca6
+     metrics:
+     - type: accuracy
+       value: 66.36
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/MassiveScenarioClassification
+       name: MTEB MassiveScenarioClassification
+       config: default
+       split: test
+       revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8
+     metrics:
+     - type: accuracy
+       value: 72.78
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/MedrxivClusteringP2P
+       name: MTEB MedrxivClusteringP2P
+       config: default
+       split: test
+       revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
+     metrics:
+     - type: v_measure
+       value: 31.96
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/MedrxivClusteringS2S
+       name: MTEB MedrxivClusteringS2S
+       config: default
+       split: test
+       revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
+     metrics:
+     - type: v_measure
+       value: 28.59
+   - task:
+       type: Reranking
+     dataset:
+       type: mteb/MindSmallReranking
+       name: MTEB MindSmallReranking
+       config: default
+       split: test
+       revision: 227478e3235572039f4f7661840e059f31ef6eb1
+     metrics:
+     - type: map
+       value: 30.55
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/NFCorpus
+       name: MTEB NFCorpus
+       config: default
+       split: test
+       revision: ec0fa4fe99da2ff19ca1214b7966684033a58814
+     metrics:
+     - type: ndcg_at_10
+       value: 30.12
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/NQ
+       name: MTEB NQ
+       config: default
+       split: test
+       revision: b774495ed302d8c44a3a7ea25c90dbce03968f31
+     metrics:
+     - type: ndcg_at_10
+       value: 46.72
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/QuoraRetrieval
+       name: MTEB QuoraRetrieval
+       config: default
+       split: test
+       revision: e4e08e0b7dbe3c8700f0daef558ff32256715259
+     metrics:
+     - type: ndcg_at_10
+       value: 60.55
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/RedditClustering
+       name: MTEB RedditClustering
+       config: default
+       split: test
+       revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
+     metrics:
+     - type: v_measure
+       value: 43.94
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/RedditClusteringP2P
+       name: MTEB RedditClusteringP2P
+       config: default
+       split: test
+       revision: 385e3cb46b4cfa89021f56c4380204149d0efe33
+     metrics:
+     - type: v_measure
+       value: 52.6
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/SCIDOCS
+       name: MTEB SCIDOCS
+       config: default
+       split: test
+       revision: f8c2fcf00f625baaa80f62ec5bd9e1fff3b8ae88
+     metrics:
+     - type: ndcg_at_10
+       value: 15.87
+   - task:
+       type: STS
+     dataset:
+       type: mteb/SICK-R
+       name: MTEB SICK-R
+       config: default
+       split: test
+       revision: 20a6d6f312dd54037fe07a32d58e5e168867909d
+     metrics:
+     - type: cosine_spearman
+       value: 78.75
+   - task:
+       type: STS
+     dataset:
+       type: mteb/STS12
+       name: MTEB STS12
+       config: default
+       split: test
+       revision: a0d554a64d88156834ff5ae9920b964011b16384
+     metrics:
+     - type: cosine_spearman
+       value: 75.62
+   - task:
+       type: STS
+     dataset:
+       type: mteb/STS13
+       name: MTEB STS13
+       config: default
+       split: test
+       revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
+     metrics:
+     - type: cosine_spearman
+       value: 84.04
+   - task:
+       type: STS
+     dataset:
+       type: mteb/STS14
+       name: MTEB STS14
+       config: default
+       split: test
+       revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
+     metrics:
+     - type: cosine_spearman
+       value: 79.94
+   - task:
+       type: STS
+     dataset:
+       type: mteb/STS15
+       name: MTEB STS15
+       config: default
+       split: test
+       revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
+     metrics:
+     - type: cosine_spearman
+       value: 85.77
+   - task:
+       type: STS
+     dataset:
+       type: mteb/STS16
+       name: MTEB STS16
+       config: default
+       split: test
+       revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
+     metrics:
+     - type: cosine_spearman
+       value: 82.52
+   - task:
+       type: STS
+     dataset:
+       type: mteb/STSBenchmark
+       name: MTEB STSBenchmark
+       config: default
+       split: test
+       revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
+     metrics:
+     - type: cosine_spearman
+       value: 85.54
+   - task:
+       type: Reranking
+     dataset:
+       type: mteb/SciDocsRR
+       name: MTEB SciDocsRR
+       config: default
+       split: test
+       revision: 39b8377811871075eed9de3b8a7e21aaa6acb3d8
+     metrics:
+     - type: map
+       value: 73.55
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/SciFact
+       name: MTEB SciFact
+       config: default
+       split: test
+       revision: d56462d0e63a25450459c4f213e49ffdb866f7f9
+     metrics:
+     - type: ndcg_at_10
+       value: 60.04
+   - task:
+       type: PairClassification
+     dataset:
+       type: mteb/SprintDuplicateQuestions
+       name: MTEB SprintDuplicateQuestions
+       config: default
+       split: test
+       revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
+     metrics:
+     - type: cosine_ap
+       value: 95.3
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/StackExchangeClustering
+       name: MTEB StackExchangeClustering
+       config: default
+       split: test
+       revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
+     metrics:
+     - type: v_measure
+       value: 50.22
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/StackExchangeClusteringP2P
+       name: MTEB StackExchangeClusteringP2P
+       config: default
+       split: test
+       revision: 815ca46b2622cec33ccafc3735d572c266efdb44
+     metrics:
+     - type: v_measure
+       value: 34.08
+   - task:
+       type: Reranking
+     dataset:
+       type: mteb/StackOverflowDupQuestions
+       name: MTEB StackOverflowDupQuestions
+       config: default
+       split: test
+       revision: 5debda000fe8e27ebb5c123d38081f92e1847a59
+     metrics:
+     - type: map
+       value: 42.85
+   - task:
+       type: Summarization
+     dataset:
+       type: mteb/SummEval
+       name: MTEB SummEval
+       config: default
+       split: test
+       revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
+     metrics:
+     - type: cosine_spearman
+       value: 29.59
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/TRECCOVID
+       name: MTEB TRECCOVID
+       config: default
+       split: test
+       revision: bb9466bac8153a0349341eb1b22e06409e78ef4e
+     metrics:
+     - type: ndcg_at_10
+       value: 69.06
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/Touche2020
+       name: MTEB Touche2020
+       config: default
+       split: test
+       revision: a34f9a33db75fa0cbb21bb5cfc3dae8dc8bec93f
+     metrics:
+     - type: ndcg_at_10
+       value: 26.76
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/ToxicConversationsClassification
+       name: MTEB ToxicConversationsClassification
+       config: default
+       split: test
+       revision: edfaf9da55d3dd50d43143d90c1ac476895ae6de
+     metrics:
+     - type: accuracy
+       value: 65.58
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/TweetSentimentExtractionClassification
+       name: MTEB TweetSentimentExtractionClassification
+       config: default
+       split: test
+       revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
+     metrics:
+     - type: accuracy
+       value: 61.19
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/TwentyNewsgroupsClustering
+       name: MTEB TwentyNewsgroupsClustering
+       config: default
+       split: test
+       revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
+     metrics:
+     - type: v_measure
+       value: 39.86
+   - task:
+       type: PairClassification
+     dataset:
+       type: mteb/TwitterSemEval2015
+       name: MTEB TwitterSemEval2015
+       config: default
+       split: test
+       revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
+     metrics:
+     - type: cosine_ap
+       value: 68.49
+   - task:
+       type: PairClassification
+     dataset:
+       type: mteb/TwitterURLCorpus
+       name: MTEB TwitterURLCorpus
+       config: default
+       split: test
+       revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
+     metrics:
+     - type: cosine_ap
+       value: 84.94
+ ---
+
+ # ogma-small
+
+ **8.6M parameter text embedding model** by [Axiotic AI](https://axiotic.ai), achieving **55.79 average** on MTEB English v1 (54/54 tasks).
+
+ 6-layer deep transformer, 256 hidden dim, mean pooling; best sub-10M model.
+
+ ## Highlights
+
+ - **8.6M parameters**: small enough for CPU inference, edge deployment, and resource-constrained environments
+ - **55.79 MTEB average**: outperforms Potion-32M (51.22) despite being significantly smaller
+ - **Matryoshka embeddings**: use dimensions [32, 64, 128, 256] for flexible storage/compute tradeoffs
+ - **Asymmetric encoding**: dedicated `[QRY]`, `[DOC]`, `[SYM]` task tokens for query-document and symmetric tasks
+ - **1024 token context**: handles longer passages than typical small models (Potion: 512)
+ - **Pure PyTorch**: no external transformer library dependencies
+
+ ## Architecture
+
+ | Component | Details |
+ |-----------|---------|
+ | Parameters | 8.6M |
+ | Layers | 6 |
+ | Hidden dim (d_model) | 256 |
+ | Embedding dim (d_embed) | 128 |
+ | Output dim (d_output) | 256 |
+ | Attention heads | 4 |
+ | Max sequence length | 1024 |
+ | Matryoshka dims | [32, 64, 128, 256] |
+ | Pooling | Mean (mask-aware) |
+ | Position encoding | RoPE |
+ | FFN | SwiGLU |
+ | Normalization | Pre-LayerNorm |
+ | Tokenizer | SentencePiece Unigram (30K vocab) |
+ | Training | Knowledge distillation from teacher model |
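
The architecture table lists mask-aware mean pooling. As a minimal sketch of what that usually means, padded positions are left out of the average. This is not the code in `pooling.py`, which this view does not show; the function name `masked_mean_pool` and the zero-vector fallback are assumptions made for illustration.

```python
def masked_mean_pool(hidden, mask):
    """Average token vectors over positions where mask == 1.

    hidden: list of token vectors (each a list of floats), one per position.
    mask:   list of 0/1 ints, 1 for real tokens and 0 for padding.
    """
    if len(hidden) != len(mask):
        raise ValueError("hidden and mask must have the same length")
    kept = [vec for vec, m in zip(hidden, mask) if m]
    if not kept:
        # No real tokens: return a zero vector rather than dividing by zero
        # (an assumed convention, not taken from the repository).
        dim = len(hidden[0]) if hidden else 0
        return [0.0] * dim
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Two real tokens followed by one padding position.
pooled = masked_mean_pool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0])
print(pooled)  # [2.0, 3.0]
```

The padding vector `[9.0, 9.0]` does not affect the result, which is the point of making the mean mask-aware.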
+
+ ## MTEB Results
+
+ ### Category-Level Scores
+
+ | Category | ogma-small | Potion-32M | Potion-8M | vs Potion-32M |
+ |----------|------------|------------|-----------|---------------|
+ | Classification | **66.48** | 66.01 | 64.46 | +0.47 |
+ | Clustering | **40.69** | 39.24 | 36.88 | +1.45 |
+ | PairClassification | **82.91** | 78.17 | 76.62 | +4.74 |
+ | Reranking | **50.51** | 50.92 | 49.73 | -0.41 |
+ | Retrieval | **42.05** | 32.21 | 30.43 | +9.84 |
+ | STS | **82.0** | 73.86 | 72.93 | +8.14 |
+ | Summarization | **29.59** | 29.77 | 29.26 | -0.18 |
+ | **Overall** | **55.79** | 51.22 | 49.58 | **+4.57** |
+
+ > **Potion scores are locally reproduced** using the same evaluation pipeline and hardware for fair head-to-head comparison. These are not self-reported numbers from the Potion model card.
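
The "vs Potion-32M" column in the category table is the ogma-small score minus the Potion-32M score. A small sketch that recomputes it from the table values; the dictionary name `scores` is illustrative only.

```python
# (ogma-small, Potion-32M) scores copied from the category table above.
scores = {
    "Classification": (66.48, 66.01),
    "Clustering": (40.69, 39.24),
    "PairClassification": (82.91, 78.17),
    "Reranking": (50.51, 50.92),
    "Retrieval": (42.05, 32.21),
    "STS": (82.0, 73.86),
    "Summarization": (29.59, 29.77),
    "Overall": (55.79, 51.22),
}

# Difference per category, rounded to the two decimals used in the table.
deltas = {name: round(ogma - potion, 2) for name, (ogma, potion) in scores.items()}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
```

A negative value (Reranking, Summarization) means Potion-32M scored higher in that category.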
+
+ ## Usage
+
+ ### Quick Start
+
+ ```python
+ import torch
+ import numpy as np
+ from pathlib import Path
+
+ # Load model
+ from ogma_model import OgmaModel
+ from config import OgmaConfig
+ from tokenizer import OgmaTokenizer
+
+ # Load from checkpoint directory
+ model = OgmaModel.from_checkpoint("path/to/ogma-small", device="cpu")
+ model.eval()
+
+ # Load tokenizer (uses the SentencePiece model embedded in tokenizer.json)
+ # The tokenizer needs the .model file; extract it from tokenizer.json or use:
+ tokenizer = OgmaTokenizer("path/to/tokenizer.model")
+
+ # Encode text
+ texts = ["This is a query", "This is a document"]
+ encoded = tokenizer.batch_encode(texts, max_length=1024)
+
+ token_ids = torch.tensor(encoded["input_ids"])
+ attention_mask = torch.tensor(encoded["attention_mask"])
+
+ # Use task tokens for asymmetric encoding
+ from config import TaskToken
+
+ with torch.no_grad():
+     # For symmetric tasks (STS, clustering, classification)
+     embeddings = model.encode(token_ids, attention_mask, task=TaskToken.SYM)
+
+     # For retrieval: encode queries and documents separately
+     query_embs = model.encode(token_ids[:1], attention_mask[:1], task=TaskToken.QRY)
+     doc_embs = model.encode(token_ids[1:], attention_mask[1:], task=TaskToken.DOC)
+
+ print(f"Embedding shape: {embeddings.shape}")  # (2, 256)
+ ```
+
+ ### Matryoshka Dimensionality Reduction
+
+ ```python
+ # Full embeddings: 256d
+ full_embs = model.encode(token_ids, attention_mask, task=TaskToken.SYM)
+
+ # Reduce to any Matryoshka dimension: [32, 64, 128, 256]
+ dim = 64
+ reduced_embs = torch.nn.functional.normalize(full_embs[:, :dim], p=2, dim=-1)
+ # These reduced embeddings are trained to be effective at lower dims
+ ```
+
+ ### Loading with safetensors
+
+ ```python
+ import torch
+ import yaml
+ from safetensors.torch import load_file
+ from ogma_model import OgmaModel
+ from config import OgmaConfig
+
+ # Load config
+ with open("path/to/ogma-small/config.json") as f:
+     import json
+     config_dict = json.load(f)
+
+ config = OgmaConfig.from_dict(config_dict)
+ model = OgmaModel(config)
+
+ # Load weights from safetensors
+ state_dict = load_file("path/to/ogma-small/model.safetensors")
+ model.load_state_dict(state_dict)
+ model.eval()
+ ```
+
+ ## Task Tokens
+
+ Ogma uses task-specific prefix tokens for asymmetric encoding:
+
+ | Token | ID | Use Case |
+ |-------|-----|----------|
+ | `[QRY]` | 4 | Query encoding for retrieval |
+ | `[DOC]` | 5 | Document/passage encoding for retrieval |
+ | `[SYM]` | 6 | Symmetric tasks (STS, classification, clustering) |
+
+ For retrieval tasks, encode queries with `[QRY]` and documents with `[DOC]`. For all other tasks, use `[SYM]`.
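
To make the prefix idea concrete, here is a rough sketch of attaching a task token to a sequence of token IDs. It assumes the task token goes at position 0; how `model.encode` actually places it (for example, relative to a BOS token) is not shown in this view. The helper `add_task_prefix` and the sample IDs `101, 57, 892` are hypothetical.

```python
# Task-token IDs as listed in the Task Tokens table (and in config.json).
TASK_TOKEN_IDS = {"QRY": 4, "DOC": 5, "SYM": 6}


def add_task_prefix(token_ids, task):
    """Return a new ID list with the task token placed first (assumed position)."""
    if task not in TASK_TOKEN_IDS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(TASK_TOKEN_IDS)}")
    return [TASK_TOKEN_IDS[task]] + list(token_ids)


# The same text gets a different prefix depending on its role in retrieval.
query_ids = add_task_prefix([101, 57, 892], "QRY")
doc_ids = add_task_prefix([101, 57, 892], "DOC")
print(query_ids)  # [4, 101, 57, 892]
print(doc_ids)    # [5, 101, 57, 892]
```

Because the prefix differs, identical text can map to different embeddings as a query and as a document, which is what makes the encoding asymmetric.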
+
+ ## Training
+
+ Ogma is trained via **knowledge distillation** from a larger teacher embedding model. The training pipeline:
+
+ 1. **Tokenizer**: SentencePiece Unigram model trained on the distillation corpus (30K vocab)
+ 2. **Token embeddings**: PCA-reduced embeddings from the teacher model, providing a strong initialization
+ 3. **Distillation**: MSE loss between student and teacher embeddings, with Matryoshka loss at multiple dimensions
+ 4. **Architecture**: Standard transformer encoder with RoPE positional encoding and SwiGLU FFN
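
The distillation step above can be illustrated with one plausible form of the objective: an MSE term on each Matryoshka prefix, summed with equal weights. The training code is not part of this view, so the weighting, the assumption that the teacher vector has the same width as the student output, and the names `mse` and `matryoshka_mse` are all illustrative guesses rather than the authors' implementation.

```python
MATRYOSHKA_DIMS = [32, 64, 128, 256]  # from the model config


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def matryoshka_mse(student, teacher, dims=MATRYOSHKA_DIMS):
    """Sum of MSE terms over the leading `d` coordinates for each d in dims.

    Assumption: equal weight per dimension and a teacher vector of the
    same width as the student; the real objective may differ.
    """
    return sum(mse(student[:d], teacher[:d]) for d in dims)


student = [0.5] * 256
teacher = [0.0] * 256
# Each prefix contributes 0.5 ** 2 = 0.25, and there are four prefixes.
print(matryoshka_mse(student, teacher))  # 1.0
```

Penalising every prefix is what encourages the leading coordinates to carry most of the information, so that truncated embeddings stay useful.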
+
+ ## Files
+
+ | File | Description |
+ |------|-------------|
+ | `model.safetensors` | Model weights (safetensors format) |
+ | `model.pt` | Model weights (PyTorch format) |
+ | `config.json` | Model configuration |
+ | `config.yaml` | Original training config |
+ | `tokenizer.json` | HuggingFace tokenizer |
+ | `tokenizer_config.json` | Tokenizer configuration |
+ | `token_embeds_128d.npy` | Pre-computed token embeddings (30K × 128, float16) |
+ | `ogma_model.py` | OgmaModel class |
+ | `config.py` | OgmaConfig dataclass |
+ | `embeddings.py` | Token embedding + RoPE |
+ | `pooling.py` | Pooling strategies |
+ | `variants/transformer.py` | Transformer encoder variant |
+ | `tokenizer.py` | OgmaTokenizer wrapper |
+ | `results/` | MTEB result JSONs |
+
+ ## Citation
+
+ ```bibtex
+ @misc{ogma2026,
+   title={Ogma: Small High-Performance Text Embeddings},
+   author={Axiotic AI},
+   year={2026},
+   url={https://huggingface.co/axiotic/ogma-small}
+ }
+ ```
+
+ ## License
+
+ MIT
config.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "architectures": [
+     "OgmaModel"
+   ],
+   "model_type": "ogma",
+   "auto_map": {
+     "AutoModel": "ogma_model.OgmaModel"
+   },
+   "variant": "transformer",
+   "d_embed": 128,
+   "d_model": 256,
+   "d_output": 256,
+   "n_layers": 6,
+   "n_heads": 4,
+   "vocab_size": 30000,
+   "max_seq_len": 1024,
+   "matryoshka_dims": [
+     32,
+     64,
+     128,
+     256
+   ],
+   "pooling": "mean",
+   "ffn_mult": 2.6666666666666665,
+   "conv_kernel_size": 7,
+   "spatial_rank": 32,
+   "n_random_features": 128,
+   "dropout": 0.0,
+   "pad_id": 0,
+   "unk_id": 1,
+   "bos_id": 2,
+   "eos_id": 3,
+   "qry_id": 4,
+   "doc_id": 5,
+   "sym_id": 6,
+   "n_special_tokens": 7
+ }
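
A few quantities follow directly from the values in `config.json`. The sketch below derives the per-head width and the SwiGLU hidden width and checks the Matryoshka dimensions. The JSON is an inlined subset of the file above; rounding the FFN width to the nearest integer is an assumption (the model code may round differently), and the variable names are illustrative.

```python
import json

# Subset of config.json from this commit, inlined so the sketch is self-contained.
CONFIG_JSON = """
{
  "d_embed": 128,
  "d_model": 256,
  "d_output": 256,
  "n_layers": 6,
  "n_heads": 4,
  "matryoshka_dims": [32, 64, 128, 256],
  "ffn_mult": 2.6666666666666665
}
"""

cfg = json.loads(CONFIG_JSON)

# Per-head width: d_model must divide evenly across the attention heads.
assert cfg["d_model"] % cfg["n_heads"] == 0
head_dim = cfg["d_model"] // cfg["n_heads"]

# SwiGLU hidden width implied by ffn_mult (the rounding rule is an assumption).
ffn_hidden = round(cfg["d_model"] * cfg["ffn_mult"])

# Matryoshka prefixes must be increasing and fit inside the output vector.
dims = cfg["matryoshka_dims"]
assert dims == sorted(dims) and dims[-1] <= cfg["d_output"]

print(head_dim, ffn_hidden)  # 64 683
```

The value 683 agrees with the comment on `ffn_mult` in `config.py`, which gives 8/3 × d_model ≈ 683 for d_model = 256.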
config.py ADDED
@@ -0,0 +1,161 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
+ """Model configuration for Ogma."""
+
+ from __future__ import annotations
+
+ from dataclasses import dataclass, field, fields
+ from enum import StrEnum  # StrEnum requires Python 3.11+
+ from typing import Any
+
+ __all__ = ["OgmaConfig", "VariantType", "PoolingType", "TaskToken"]
+
+
+ class VariantType(StrEnum):
+     """Architecture variant identifiers."""
+
+     TRANSFORMER = "transformer"
+     DEEP_NARROW = "deep_narrow"
+     CONV = "conv"
+     LINEAR_ATTENTION = "linear_attention"
+     MLP_MIXER = "mlp_mixer"
+     TRANSFORMER_RESA = "transformer_resa"
+     GLA = "gla"
+
+
+ class PoolingType(StrEnum):
+     """Pooling strategy identifiers."""
+
+     TASK_TOKEN = "task_token"
+     LATENT_ATTENTION = "latent_attention"
+     MEAN = "mean"
+
+
+ class TaskToken(StrEnum):
+     """Task token identifiers for asymmetric encoding."""
+
+     QRY = "QRY"
+     DOC = "DOC"
+     SYM = "SYM"
+
+
+ @dataclass
+ class OgmaConfig:
+     """Configuration for an Ogma model instance.
+
+     Args:
+         variant: Architecture variant to use.
+         d_embed: Token embedding dimension (from teacher PCA).
+         d_model: Internal model dimension after projection.
+         n_layers: Number of fusion layers/blocks.
+         n_heads: Number of attention heads (attention variants only).
+         vocab_size: Vocabulary size for embedding table.
+         max_seq_len: Maximum sequence length.
+         matryoshka_dims: Nested output dimensions for Matryoshka.
+         pooling: Pooling strategy.
+         d_output: Final output dimension.
+         ffn_mult: SwiGLU FFN hidden dimension multiplier.
+         conv_kernel_size: Kernel size for conv variant.
+         spatial_rank: Rank of spatial mixing in MLP mixer.
+         n_random_features: Random features for linear attention.
+         dropout: Dropout rate (0 for inference).
+     """
+
+     variant: VariantType = VariantType.TRANSFORMER
+     d_embed: int = 128
+     d_model: int = 256
+     n_layers: int = 1
+     n_heads: int = 4
+     vocab_size: int = 30_000
+     max_seq_len: int = 512
+     matryoshka_dims: list[int] = field(
+         default_factory=lambda: [32, 64, 128, 256]
+     )
+     pooling: PoolingType = PoolingType.TASK_TOKEN
+     d_output: int = 256
+     # SwiGLU: 8/3 * d_model ≈ 682.67 for d=256; ffn_hidden truncates this to 682
+     ffn_mult: float = 8 / 3
+     conv_kernel_size: int = 7
+     spatial_rank: int = 32
+     n_random_features: int = 128
+     dropout: float = 0.0
+
+     # ReSA scorer settings
+     scorer_type: str = "dot"
+     scorer_alpha_init: float = 0.1
+     scorer_hidden: int = 0  # 0 defaults to d_head
+
+     # GLA (Gated Linear Attention) settings
+     gla_expand_k: float = 0.5  # key dim expansion (key_dim = d_model * expand_k)
+     gla_expand_v: float = 1.0  # value dim expansion (value_dim = d_model * expand_v)
+     gla_gate_low_rank_dim: int = 16  # low-rank dim for gating projection
+     gla_gate_logit_normalizer: int = 16  # normalizer for gate logits
+     gla_use_short_conv: bool = True  # whether to use short conv on Q,K,V
+     gla_conv_size: int = 4  # short conv kernel size
+
+     # Special token IDs
+     pad_id: int = 0
+     unk_id: int = 1
+     bos_id: int = 2
+     eos_id: int = 3
+     qry_id: int = 4
+     doc_id: int = 5
+     sym_id: int = 6
+     n_special_tokens: int = 7
+
+     @property
+     def d_head(self) -> int:
+         """Per-head dimension."""
+         return self.d_model // self.n_heads
+
+     @property
+     def ffn_hidden(self) -> int:
+         """SwiGLU FFN hidden dimension (truncated to an integer)."""
+         return int(self.d_model * self.ffn_mult)
+
+     def task_token_id(self, task: TaskToken) -> int:
+         """Return token ID for a task token."""
+         mapping = {
+             TaskToken.QRY: self.qry_id,
+             TaskToken.DOC: self.doc_id,
+             TaskToken.SYM: self.sym_id,
+         }
+         return mapping[task]
+
+     def to_dict(self) -> dict[str, Any]:
+         """Serialize config to dictionary.
+
+         The special token IDs are not serialized; from_dict falls back to
+         their defaults when they are absent.
+         """
+         return {
+             "variant": self.variant.value,
+             "d_embed": self.d_embed,
+             "d_model": self.d_model,
+             "n_layers": self.n_layers,
+             "n_heads": self.n_heads,
+             "vocab_size": self.vocab_size,
+             "max_seq_len": self.max_seq_len,
+             "matryoshka_dims": self.matryoshka_dims,
+             "pooling": self.pooling.value,
+             "d_output": self.d_output,
+             "ffn_mult": self.ffn_mult,
+             "conv_kernel_size": self.conv_kernel_size,
+             "spatial_rank": self.spatial_rank,
+             "n_random_features": self.n_random_features,
+             "dropout": self.dropout,
+             "scorer_type": self.scorer_type,
+             "scorer_alpha_init": self.scorer_alpha_init,
+             "scorer_hidden": self.scorer_hidden,
+             "gla_expand_k": self.gla_expand_k,
+             "gla_expand_v": self.gla_expand_v,
+             "gla_gate_low_rank_dim": self.gla_gate_low_rank_dim,
+             "gla_gate_logit_normalizer": self.gla_gate_logit_normalizer,
+             "gla_use_short_conv": self.gla_use_short_conv,
+             "gla_conv_size": self.gla_conv_size,
+         }
+
+     @classmethod
+     def from_dict(cls, data: dict[str, Any]) -> OgmaConfig:
+         """Deserialize config from dictionary, ignoring unknown keys."""
+         data = dict(data)
+         if "variant" in data:
+             data["variant"] = VariantType(data["variant"])
+         if "pooling" in data:
+             data["pooling"] = PoolingType(data["pooling"])
+         # Keep only keys that match dataclass fields
+         known = {f.name for f in fields(cls)}
+         filtered = {k: v for k, v in data.items() if k in known}
+         return cls(**filtered)
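A minimal sketch of the key-filtering behaviour in `from_dict`, using a hypothetical `MiniConfig` stand-in that mirrors only three of the fields above (it is not the real `OgmaConfig`). Under that assumption it illustrates that extra keys such as `architectures` from config.json are silently dropped, and that `ffn_hidden` truncates rather than rounds.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass
class MiniConfig:
    """Hypothetical stand-in mirroring a few OgmaConfig fields."""

    d_model: int = 256
    n_heads: int = 4
    ffn_mult: float = 8 / 3

    @property
    def d_head(self) -> int:
        # Integer division, as in OgmaConfig.d_head
        return self.d_model // self.n_heads

    @property
    def ffn_hidden(self) -> int:
        # int() truncates toward zero, so 682.67 becomes 682
        return int(self.d_model * self.ffn_mult)

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "MiniConfig":
        # Keep only keys that are dataclass fields; everything else is ignored
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})


cfg = MiniConfig.from_dict(
    {"architectures": ["OgmaModel"], "d_model": 256, "n_heads": 4}
)
print(cfg.d_head, cfg.ffn_hidden)
```

Silently ignoring unknown keys keeps the loader tolerant of Hugging Face metadata, at the cost of not flagging misspelled field names.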
config.yaml ADDED
@@ -0,0 +1,19 @@
+ conv_kernel_size: 7
+ d_embed: 128
+ d_model: 256
+ d_output: 256
+ dropout: 0.0
+ ffn_mult: 2.6666666666666665
+ matryoshka_dims:
+ - 32
+ - 64
+ - 128
+ - 256
+ max_seq_len: 1024
+ n_heads: 4
+ n_layers: 6
+ n_random_features: 128
+ pooling: mean
+ spatial_rank: 32
+ variant: transformer
+ vocab_size: 30000
embeddings.py ADDED
@@ -0,0 +1,143 @@
+ """Token embeddings, task token embeddings, and RoPE for Ogma."""
+
+ from __future__ import annotations
+
+ import torch
+ import torch.nn as nn
+
+ from ogma.model.config import OgmaConfig
+
+ __all__ = ["TokenEmbedding", "RotaryPositionalEncoding", "apply_rope"]
+
+
+ class TokenEmbedding(nn.Module):
+     """Token embedding with optional linear projection.
+
+     Holds a (vocab_size + n_special_tokens) x d_embed embedding table and
+     projects to d_model when the two dimensions differ.
+     Includes 3 learnable task token embeddings ([QRY], [DOC], [SYM]).
+     """
+
+     def __init__(self, config: OgmaConfig) -> None:
+         super().__init__()
+         self.config = config
+         self.embed = nn.Embedding(
+             config.vocab_size + config.n_special_tokens,
+             config.d_embed,
+             padding_idx=config.pad_id,
+         )
+         if config.d_embed != config.d_model:
+             self.proj = nn.Linear(config.d_embed, config.d_model)
+         else:
+             self.proj = nn.Identity()  # type: ignore[assignment]
+
+         # Task token embeddings are learned separately at d_model
+         self.task_tokens = nn.Embedding(3, config.d_model)
+
+     def forward(
+         self,
+         token_ids: torch.Tensor,
+         task_token_ids: torch.Tensor,
+     ) -> torch.Tensor:
+         """Embed tokens and prepend task token.
+
+         Args:
+             token_ids: (B, S) token IDs.
+             task_token_ids: (B,) task token IDs (4=QRY, 5=DOC, 6=SYM).
+
+         Returns:
+             (B, S+1, d_model) embeddings with task token prepended.
+         """
+         # Embed and project regular tokens
+         x = self.embed(token_ids)  # (B, S, d_embed)
+         x = self.proj(x)  # (B, S, d_model)
+
+         # Get task token embeddings (map 4,5,6 -> 0,1,2)
+         task_idx = task_token_ids - self.config.qry_id  # (B,)
+         task_emb = self.task_tokens(task_idx)  # (B, d_model)
+         task_emb = task_emb.unsqueeze(1)  # (B, 1, d_model)
+
+         # Prepend task token
+         return torch.cat([task_emb, x], dim=1)  # (B, S+1, d_model)
+
+     def load_pretrained_embeddings(
+         self, embeddings: torch.Tensor
+     ) -> None:
+         """Load pre-computed token embeddings (e.g., from teacher PCA).
+
+         Rows are written after the special-token rows, so the first
+         n_special_tokens entries of the table are left untouched.
+
+         Args:
+             embeddings: (vocab_size, d_embed) tensor.
+         """
+         with torch.no_grad():
+             n = min(embeddings.shape[0], self.config.vocab_size)
+             start = self.config.n_special_tokens
+             self.embed.weight[start : n + start] = embeddings[:n]
+
+
+ class RotaryPositionalEncoding(nn.Module):
+     """Rotary Position Embedding (RoPE). Zero trainable parameters."""
+
+     def __init__(self, dim: int, max_seq_len: int = 512) -> None:
+         super().__init__()
+         inv_freq = 1.0 / (
+             10000.0 ** (torch.arange(0, dim, 2).float() / dim)
+         )
+         self.register_buffer("inv_freq", inv_freq)
+         self._build_cache(max_seq_len)
+
+     def _build_cache(self, seq_len: int) -> None:
+         inv_freq: torch.Tensor = self.inv_freq  # type: ignore[assignment]
+         # Build positions on the same device as inv_freq so that a cache
+         # rebuild after the module has been moved (e.g. to GPU) does not
+         # mix devices in torch.outer.
+         t = torch.arange(
+             seq_len, device=inv_freq.device, dtype=inv_freq.dtype
+         )
+         freqs = torch.outer(t, inv_freq)
+         cos_cached = freqs.cos()
+         sin_cached = freqs.sin()
+         self.register_buffer("cos_cached", cos_cached, persistent=False)
+         self.register_buffer("sin_cached", sin_cached, persistent=False)
+
+     def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
+         """Return cos and sin for sequence length of x.
+
+         Args:
+             x: (B, S, ...) tensor to determine sequence length.
+
+         Returns:
+             Tuple of (cos, sin) each of shape (S, d_head//2).
+         """
+         seq_len = x.shape[1]
+         cos: torch.Tensor = self.cos_cached  # type: ignore[assignment]
+         sin: torch.Tensor = self.sin_cached  # type: ignore[assignment]
+         if seq_len > cos.shape[0]:
+             # Grow the cache when the input is longer than what was cached
+             self._build_cache(seq_len)
+             cos = self.cos_cached  # type: ignore[assignment]
+             sin = self.sin_cached  # type: ignore[assignment]
+         return cos[:seq_len], sin[:seq_len]
+
+
+ def apply_rope(
+     q: torch.Tensor,
+     k: torch.Tensor,
+     cos: torch.Tensor,
+     sin: torch.Tensor,
+ ) -> tuple[torch.Tensor, torch.Tensor]:
+     """Apply rotary embeddings to query and key tensors.
+
+     Args:
+         q: (B, n_heads, S, d_head) query tensor.
+         k: (B, n_heads, S, d_head) key tensor.
+         cos: (S, d_head//2) cosine cache.
+         sin: (S, d_head//2) sine cache.
+
+     Returns:
+         Rotated (q, k) tensors.
+     """
+
+     def _rotate(x: torch.Tensor) -> torch.Tensor:
+         # Pair coordinate i with coordinate i + d_head//2 and rotate each pair
+         x1 = x[..., : x.shape[-1] // 2]
+         x2 = x[..., x.shape[-1] // 2 :]
+         cos_exp = cos.unsqueeze(0).unsqueeze(0)  # (1, 1, S, d_head//2)
+         sin_exp = sin.unsqueeze(0).unsqueeze(0)
+         return torch.cat(
+             [x1 * cos_exp - x2 * sin_exp, x2 * cos_exp + x1 * sin_exp],
+             dim=-1,
+         )
+
+     return _rotate(q), _rotate(k)
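The rotation in `apply_rope` can be checked on a single head vector without PyTorch. The sketch below is a pure-Python re-implementation written for illustration only: `rope_rotate` and `dot` are hypothetical helpers, and the vectors and positions are made up. It uses the same half-split pairing and the same frequencies `1 / 10000 ** (2i / d)`, and it shows two properties one would expect from RoPE: the norm is preserved, and the query-key score depends only on the offset between the two positions.

```python
import math


def rope_rotate(x: list[float], pos: int, base: float = 10000.0) -> list[float]:
    """Rotate one head vector at position `pos`, pairing i with i + d/2."""
    d = len(x)
    half = d // 2
    out = [0.0] * d
    for i in range(half):
        # Angle for pair i: pos * inv_freq[i], with inv_freq[i] = base ** (-2i / d)
        theta = pos / (base ** (2 * i / d))
        c, s = math.cos(theta), math.sin(theta)
        x1, x2 = x[i], x[i + half]
        out[i] = x1 * c - x2 * s
        out[i + half] = x2 * c + x1 * s
    return out


def dot(a: list[float], b: list[float]) -> float:
    return sum(p * q for p, q in zip(a, b))


q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.3, -0.7, 1.0]

# Both pairs of positions are 3 apart, so the scores should agree.
score_a = dot(rope_rotate(q, 5), rope_rotate(k, 2))
score_b = dot(rope_rotate(q, 13), rope_rotate(k, 10))

# A rotation should leave the squared norm unchanged.
norm_before = dot(q, q)
norm_after = dot(rope_rotate(q, 7), rope_rotate(q, 7))
```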
model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b1e6d94ca13f26d03aa1e1d465bdb3f8b1fa83447a4e36a49653c0aecd2f2cfb
+ size 34408948
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6142f9f6637006e944bf436f872df1b69076a7ceb62a39b4d65a011be3126097
+ size 34393728
ogma_model.py ADDED
@@ -0,0 +1,203 @@
+ """OgmaModel: top-level model wrapping any architecture variant."""
+
+ from __future__ import annotations
+
+ import torch
+ import torch.nn as nn
+ import torch.nn.functional as F
+
+ from ogma.model.config import OgmaConfig, TaskToken, VariantType
+ from ogma.model.embeddings import TokenEmbedding
+ from ogma.model.pooling import create_pooling
+ from ogma.model.variants.conv import ConvVariant
+ from ogma.model.variants.deep_narrow import DeepNarrowVariant
+ from ogma.model.variants.linear_attention import LinearAttentionVariant
+ from ogma.model.variants.mlp_mixer import MLPMixerVariant
+ from ogma.model.variants.transformer import TransformerVariant
+ from ogma.model.variants.transformer_resa import TransformerReSAVariant
+ from ogma.model.variants.gla import GLAVariant
+
+ __all__ = ["OgmaModel"]
+
+ MAX_PARAMS = 10_000_000
+
+
+ def _build_variant(config: OgmaConfig) -> nn.Module:
+     """Instantiate the appropriate architecture variant."""
+     if config.variant == VariantType.TRANSFORMER:
+         return TransformerVariant(config)
+     elif config.variant == VariantType.DEEP_NARROW:
+         return DeepNarrowVariant(config)
+     elif config.variant == VariantType.CONV:
+         return ConvVariant(config)
+     elif config.variant == VariantType.LINEAR_ATTENTION:
+         return LinearAttentionVariant(config)
+     elif config.variant == VariantType.MLP_MIXER:
+         return MLPMixerVariant(config)
+     elif config.variant == VariantType.TRANSFORMER_RESA:
+         return TransformerReSAVariant(config)
+     elif config.variant == VariantType.GLA:
+         return GLAVariant(config)
+     raise ValueError(f"Unknown variant: {config.variant}")
+
+
+ class OgmaModel(nn.Module):
+     """Ogma embedding model.
+
+     Wraps any architecture variant with shared embedding, pooling, and
+     normalization. Produces L2-normalized embeddings at d_output dimensions,
+     Matryoshka-compatible at configured sub-dimensions.
+     """
+
+     def __init__(self, config: OgmaConfig) -> None:
+         super().__init__()
+         self.config = config
+         self.embedding = TokenEmbedding(config)
+         self.variant = _build_variant(config)
+         self.pooling = create_pooling(config)
+
+         # Output projection if variant output != d_output.
+         # DeepNarrowVariant already has its own output_proj, so no extra
+         # projection is added for it here.
+         variant_projects = (
+             config.variant == VariantType.DEEP_NARROW
+             and config.d_model != config.d_output
+         )
+         if not variant_projects and config.d_model != config.d_output:
+             self.output_proj: nn.Module = nn.Linear(
+                 config.d_model, config.d_output
+             )
+         else:
+             self.output_proj = nn.Identity()
+
+     def forward(
+         self,
+         token_ids: torch.Tensor,
+         attention_mask: torch.Tensor,
+         task_token_ids: torch.Tensor,
+     ) -> torch.Tensor:
+         """Forward pass producing L2-normalized embeddings.
+
+         Args:
+             token_ids: (B, S) token IDs.
+             attention_mask: (B, S) attention mask (1=valid, 0=pad).
+             task_token_ids: (B,) task token IDs (4=QRY, 5=DOC, 6=SYM).
+
+         Returns:
+             (B, d_output) L2-normalized embeddings.
+         """
+         # Embed tokens with task token prepended -> (B, S+1, d_model)
+         x = self.embedding(token_ids, task_token_ids)
+
+         # Extend attention mask for prepended task token (always valid)
+         task_mask = torch.ones(
+             attention_mask.shape[0], 1,
+             device=attention_mask.device,
+             dtype=attention_mask.dtype,
+         )
+         extended_mask = torch.cat([task_mask, attention_mask], dim=1)
+
+         # Run through variant
+         x = self.variant(x, extended_mask)
+
+         # Pool
+         x = self.pooling(x, extended_mask)
+
+         # Project if needed
+         x = self.output_proj(x)
+
+         # L2 normalize
+         return F.normalize(x, p=2, dim=-1)
+
+     def encode(
+         self,
+         token_ids: torch.Tensor,
+         attention_mask: torch.Tensor,
+         task: TaskToken = TaskToken.SYM,
+     ) -> torch.Tensor:
+         """Encode tokens with a specified task mode.
+
+         Args:
+             token_ids: (B, S) token IDs.
+             attention_mask: (B, S) attention mask.
+             task: Task token to use.
+
+         Returns:
+             (B, d_output) L2-normalized embeddings.
+         """
+         task_ids = torch.full(
+             (token_ids.shape[0],),
+             self.config.task_token_id(task),
+             device=token_ids.device,
+             dtype=torch.long,
+         )
+         return self.forward(token_ids, attention_mask, task_ids)
+
+     def param_count(self) -> int:
+         """Count total trainable parameters."""
+         return sum(p.numel() for p in self.parameters() if p.requires_grad)
+
+     def assert_param_budget(self) -> None:
+         """Assert model is under the 10M parameter budget."""
+         count = self.param_count()
+         assert count < MAX_PARAMS, (
+             f"Model has {count:,} params, exceeds {MAX_PARAMS:,} budget"
+         )
+
+     @classmethod
+     def from_config(cls, config: OgmaConfig) -> OgmaModel:
+         """Factory method to build a model from config."""
+         model = cls(config)
+         model.assert_param_budget()
+         return model
+
+     @classmethod
+     def from_checkpoint(
+         cls,
+         path: str,
+         device: str = "cpu",
+     ) -> OgmaModel:
+         """Load model from a checkpoint directory.
+
+         Args:
+             path: Path to checkpoint directory containing config.yaml
+                 and model.pt.
+             device: Device to load model to.
+
+         Returns:
+             Loaded OgmaModel in eval mode.
+         """
+         from pathlib import Path
+
+         import yaml
+
+         ckpt_path = Path(path)
+         with open(ckpt_path / "config.yaml") as f:
+             config_dict = yaml.safe_load(f)
+         config = OgmaConfig.from_dict(config_dict)
+
+         model = cls(config)
+         state_dict = torch.load(
+             ckpt_path / "model.pt",
+             map_location=device,
+             weights_only=True,
+         )
+         model.load_state_dict(state_dict)
+         model.to(device)
+         model.eval()
+         return model
+
+     def save_checkpoint(self, path: str) -> None:
+         """Save model checkpoint.
+
+         Args:
+             path: Directory to save config.yaml and model.pt.
+         """
+         from pathlib import Path
+
+         import yaml
+
+         ckpt_path = Path(path)
+         ckpt_path.mkdir(parents=True, exist_ok=True)
+         with open(ckpt_path / "config.yaml", "w") as f:
+             yaml.dump(self.config.to_dict(), f, default_flow_style=False)
+         torch.save(self.state_dict(), ckpt_path / "model.pt")
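The `OgmaModel` docstring describes the output as Matryoshka-compatible at the configured sub-dimensions. One common way to consume such embeddings is to keep the leading coordinates and re-normalize; the sketch below assumes that convention applies here, which the diff itself does not confirm. `truncate_embedding` is a hypothetical helper and the input vector is synthetic, not model output.

```python
import math

# Sub-dimensions listed under "matryoshka_dims" in config.json
MATRYOSHKA_DIMS = (32, 64, 128, 256)


def truncate_embedding(vec: list[float], dim: int) -> list[float]:
    """Keep the first `dim` coordinates and re-apply L2 normalization."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"dim must be one of {MATRYOSHKA_DIMS}, got {dim}")
    head = vec[:dim]
    norm = math.sqrt(sum(v * v for v in head))
    if norm == 0.0:
        # Nothing to normalize; return the zero vector unchanged
        return head
    return [v / norm for v in head]


# Synthetic 256-dimensional vector standing in for a model embedding
full = [math.sin(i + 1.0) for i in range(256)]
small = truncate_embedding(full, 64)
```

Re-normalizing after truncation keeps dot products interpretable as cosine similarities at the smaller dimension.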
pooling.py ADDED
@@ -0,0 +1,99 @@
+ """Pooling strategies for Ogma."""
+
+ from __future__ import annotations
+
+ import torch
+ import torch.nn as nn
+ import torch.nn.functional as F
+
+ from ogma.model.config import OgmaConfig, PoolingType
+
+ __all__ = [
+     "create_pooling",
+     "TaskTokenPooling",
+     "LatentAttentionPooling",
+     "MeanPooling",
+ ]
+
+
+ def create_pooling(config: OgmaConfig) -> nn.Module:
+     """Factory for pooling layers."""
+     if config.pooling == PoolingType.TASK_TOKEN:
+         return TaskTokenPooling()
+     elif config.pooling == PoolingType.LATENT_ATTENTION:
+         return LatentAttentionPooling(config.d_model)
+     elif config.pooling == PoolingType.MEAN:
+         return MeanPooling()
+     raise ValueError(f"Unknown pooling type: {config.pooling}")
+
+
+ class TaskTokenPooling(nn.Module):
+     """Use the output at position 0 (task token) as the sentence embedding."""
+
+     def forward(
+         self,
+         x: torch.Tensor,
+         attention_mask: torch.Tensor | None = None,
+     ) -> torch.Tensor:
+         """Extract task token output.
+
+         Args:
+             x: (B, S, D) sequence outputs.
+             attention_mask: unused, for interface compatibility.
+
+         Returns:
+             (B, D) pooled output.
+         """
+         return x[:, 0, :]
+
+
+ class LatentAttentionPooling(nn.Module):
+     """Learned query vector attends over all token outputs."""
+
+     def __init__(self, d_model: int) -> None:
+         super().__init__()
+         self.query = nn.Parameter(torch.randn(d_model))
+
+     def forward(
+         self,
+         x: torch.Tensor,
+         attention_mask: torch.Tensor | None = None,
+     ) -> torch.Tensor:
+         """Attend over sequence with learned query.
+
+         Args:
+             x: (B, S, D) sequence outputs.
+             attention_mask: (B, S) mask where 1=valid, 0=pad.
+
+         Returns:
+             (B, D) pooled output.
+         """
+         # (B, S)
+         scores = torch.matmul(x, self.query) / (x.shape[-1] ** 0.5)
+         if attention_mask is not None:
+             scores = scores.masked_fill(attention_mask == 0, float("-inf"))
+         weights = F.softmax(scores, dim=-1)  # (B, S)
+         return torch.bmm(weights.unsqueeze(1), x).squeeze(1)  # (B, D)
+
+
+ class MeanPooling(nn.Module):
+     """Average all token outputs (excluding padding)."""
+
+     def forward(
+         self,
+         x: torch.Tensor,
+         attention_mask: torch.Tensor | None = None,
+     ) -> torch.Tensor:
+         """Mean pool over valid tokens.
+
+         Args:
+             x: (B, S, D) sequence outputs.
+             attention_mask: (B, S) mask where 1=valid, 0=pad.
+
+         Returns:
+             (B, D) pooled output.
+         """
+         if attention_mask is None:
+             return x.mean(dim=1)
+         mask = attention_mask.unsqueeze(-1).float()  # (B, S, 1)
+         return (x * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
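The masked mean in `MeanPooling` can be illustrated for a single sequence in plain Python. This is a sketch with made-up numbers, and `masked_mean` is a hypothetical helper rather than part of the repo. Note that `OgmaModel.forward` passes the extended mask to the pooling layer, so with `pooling: mean` the prepended task token at position 0 is averaged in along with the regular tokens.

```python
def masked_mean(tokens: list[list[float]], mask: list[int]) -> list[float]:
    """Average the token vectors whose mask entry is 1 (0 marks padding)."""
    d = len(tokens[0])
    # Same guard as clamp(min=1e-9): avoids division by zero for all-pad input
    count = max(sum(mask), 1e-9)
    return [
        sum(tok[j] * m for tok, m in zip(tokens, mask)) / count
        for j in range(d)
    ]


tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]  # the third position is padding and must not contribute

pooled = masked_mean(tokens, mask)
all_pad = masked_mean(tokens, [0, 0, 0])
print(pooled, all_pad)
```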
results/AmazonCounterfactualClassification.json ADDED
@@ -0,0 +1,268 @@
+ {
+   "dataset_revision": "1f7e6a9d6fa6e64c53d146e428565640410c0df1",
+   "task_name": "AmazonCounterfactualClassification",
+   "mteb_version": "2.10.7",
+   "scores": {
+     "test": [
+       {
+         "scores_per_experiment": [
+           {
+             "accuracy": 0.77961,
+             "f1": 0.655834,
+             "f1_weighted": 0.819218,
+             "precision": 0.641782,
+             "precision_weighted": 0.909331,
+             "recall": 0.816592,
+             "recall_weighted": 0.77961,
+             "ap": 0.276514,
+             "ap_weighted": 0.276514
+           },
+           {
+             "accuracy": 0.6994,
+             "f1": 0.580126,
+             "f1_weighted": 0.757276,
+             "precision": 0.598199,
+             "precision_weighted": 0.89019,
+             "recall": 0.743214,
+             "recall_weighted": 0.6994,
+             "ap": 0.204131,
+             "ap_weighted": 0.204131
+           },
+           {
+             "accuracy": 0.717391,
+             "f1": 0.598927,
+             "f1_weighted": 0.771476,
+             "precision": 0.610094,
+             "precision_weighted": 0.897672,
+             "recall": 0.769149,
+             "recall_weighted": 0.717391,
+             "ap": 0.223211,
+             "ap_weighted": 0.223211
+           },
+           {
+             "accuracy": 0.625187,
+             "f1": 0.515825,
+             "f1_weighted": 0.697981,
+             "precision": 0.5639,
+             "precision_weighted": 0.86918,
+             "recall": 0.666825,
+             "recall_weighted": 0.625187,
+             "ap": 0.157475,
+             "ap_weighted": 0.157475
+           },
+           {
+             "accuracy": 0.734633,
+             "f1": 0.6081,
+             "f1_weighted": 0.784378,
+             "precision": 0.611395,
+             "precision_weighted": 0.894306,
+             "recall": 0.762879,
+             "recall_weighted": 0.734633,
+             "ap": 0.223828,
+             "ap_weighted": 0.223828
+           },
+           {
+             "accuracy": 0.718141,
+             "f1": 0.599527,
+             "f1_weighted": 0.772056,
+             "precision": 0.610373,
+             "precision_weighted": 0.897755,
+             "recall": 0.769567,
+             "recall_weighted": 0.718141,
+             "ap": 0.22365,
+             "ap_weighted": 0.22365
+           },
+           {
+             "accuracy": 0.703898,
+             "f1": 0.583608,
+             "f1_weighted": 0.760771,
+             "precision": 0.599754,
+             "precision_weighted": 0.890697,
+             "recall": 0.745724,
+             "recall_weighted": 0.703898,
+             "ap": 0.206429,
+             "ap_weighted": 0.206429
+           },
+           {
+             "accuracy": 0.706897,
+             "f1": 0.590597,
+             "f1_weighted": 0.763329,
+             "precision": 0.606305,
+             "precision_weighted": 0.896536,
+             "recall": 0.763291,
+             "recall_weighted": 0.706897,
+             "ap": 0.217253,
+             "ap_weighted": 0.217253
+           },
+           {
+             "accuracy": 0.76087,
+             "f1": 0.63221,
+             "f1_weighted": 0.804408,
+             "precision": 0.625154,
+             "precision_weighted": 0.899705,
+             "recall": 0.783881,
+             "recall_weighted": 0.76087,
+             "ap": 0.245755,
+             "ap_weighted": 0.245755
+           },
+           {
+             "accuracy": 0.73913,
+             "f1": 0.611801,
+             "f1_weighted": 0.787796,
+             "precision": 0.613266,
+             "precision_weighted": 0.89486,
+             "recall": 0.765389,
+             "recall_weighted": 0.73913,
+             "ap": 0.226651,
+             "ap_weighted": 0.226651
+           }
+         ],
+         "accuracy": 0.718516,
+         "f1": 0.597656,
+         "f1_weighted": 0.771869,
+         "precision": 0.608022,
+         "precision_weighted": 0.894023,
+         "recall": 0.758651,
+         "recall_weighted": 0.718516,
+         "ap": 0.22049,
+         "ap_weighted": 0.22049,
+         "main_score": 0.718516,
+         "hf_subset": "en-ext",
+         "languages": [
+           "eng-Latn"
+         ]
+       },
+       {
+         "scores_per_experiment": [
+           {
+             "accuracy": 0.650746,
+             "f1": 0.604091,
+             "f1_weighted": 0.686853,
+             "precision": 0.627886,
+             "precision_weighted": 0.807255,
+             "recall": 0.702035,
+             "recall_weighted": 0.650746,
+             "ap": 0.303878,
+             "ap_weighted": 0.303878
+           },
+           {
+             "accuracy": 0.679104,
+             "f1": 0.618523,
+             "f1_weighted": 0.711097,
+             "precision": 0.626312,
+             "precision_weighted": 0.79726,
+             "recall": 0.693658,
+             "recall_weighted": 0.679104,
+             "ap": 0.303203,
+             "ap_weighted": 0.303203
+           },
+           {
+             "accuracy": 0.644776,
+             "f1": 0.596381,
+             "f1_weighted": 0.681489,
+             "precision": 0.620167,
+             "precision_weighted": 0.799464,
+             "recall": 0.689657,
+             "recall_weighted": 0.644776,
+             "ap": 0.29492,
+             "ap_weighted": 0.29492
+           },
+           {
+             "accuracy": 0.685075,
+             "f1": 0.628735,
+             "f1_weighted": 0.716806,
+             "precision": 0.637241,
+             "precision_weighted": 0.808434,
+             "recall": 0.711814,
+             "recall_weighted": 0.685075,
+             "ap": 0.316886,
+             "ap_weighted": 0.316886
+           },
+           {
+             "accuracy": 0.68209,
+             "f1": 0.632087,
+             "f1_weighted": 0.714682,
+             "precision": 0.645403,
+             "precision_weighted": 0.819451,
+             "recall": 0.727294,
+             "recall_weighted": 0.68209,
+             "ap": 0.327026,
+             "ap_weighted": 0.327026
+           },
+           {
+             "accuracy": 0.707463,
+             "f1": 0.646624,
+             "f1_weighted": 0.735912,
+             "precision": 0.647397,
+             "precision_weighted": 0.812983,
+             "recall": 0.72284,
+             "recall_weighted": 0.707463,
+             "ap": 0.330146,
+             "ap_weighted": 0.330146
+           },
+           {
+             "accuracy": 0.729851,
+             "f1": 0.655658,
+             "f1_weighted": 0.752991,
+             "precision": 0.647664,
+             "precision_weighted": 0.804356,
+             "recall": 0.710752,
+             "recall_weighted": 0.729851,
+             "ap": 0.327887,
+             "ap_weighted": 0.327887
+           },
+           {
+             "accuracy": 0.714925,
+             "f1": 0.64637,
+             "f1_weighted": 0.741186,
+             "precision": 0.64275,
+             "precision_weighted": 0.80455,
+             "recall": 0.710143,
+             "recall_weighted": 0.714925,
+             "ap": 0.323007,
+             "ap_weighted": 0.323007
+           },
+           {
+             "accuracy": 0.631343,
+             "f1": 0.58258,
+             "f1_weighted": 0.66946,
+             "precision": 0.609229,
+             "precision_weighted": 0.789679,
+             "recall": 0.672641,
+             "recall_weighted": 0.631343,
+             "ap": 0.282438,
+             "ap_weighted": 0.282438
+           },
+           {
+             "accuracy": 0.61791,
+             "f1": 0.569828,
+             "f1_weighted": 0.657407,
+             "precision": 0.600089,
+             "precision_weighted": 0.781864,
+             "recall": 0.658514,
+             "recall_weighted": 0.61791,
+             "ap": 0.27244,
+             "ap_weighted": 0.27244
+           }
+         ],
+         "accuracy": 0.674328,
+         "f1": 0.618088,
+         "f1_weighted": 0.706788,
+         "precision": 0.630414,
+         "precision_weighted": 0.80253,
+         "recall": 0.699935,
+         "recall_weighted": 0.674328,
+         "ap": 0.308183,
+         "ap_weighted": 0.308183,
+         "main_score": 0.674328,
+         "hf_subset": "en",
+         "languages": [
+           "eng-Latn"
+         ]
+       }
+     ]
+   },
+   "evaluation_time": 24.336832523345947,
+   "kg_co2_emissions": null,
+   "date": null
+ }
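A quick consistency check on the numbers above: for the `en-ext` subset, the reported aggregate `accuracy` (which is also the `main_score`) matches the arithmetic mean of the ten per-experiment accuracies. The sketch below only reproduces that arithmetic from the values in the file; that the evaluation tool aggregates by simple mean is an inference from these numbers, not something the file states.

```python
# Per-experiment accuracies copied from the "en-ext" block above
en_ext_accuracy = [
    0.77961, 0.6994, 0.717391, 0.625187, 0.734633,
    0.718141, 0.703898, 0.706897, 0.76087, 0.73913,
]

reported_main_score = 0.718516  # "main_score" for hf_subset "en-ext"

mean_accuracy = sum(en_ext_accuracy) / len(en_ext_accuracy)
spread = max(en_ext_accuracy) - min(en_ext_accuracy)

print(f"mean={mean_accuracy:.6f} spread={spread:.6f}")
```

The spread between the best and worst experiment is roughly 15 points of accuracy, so single-experiment scores for this task should be read with some caution.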
results/AmazonPolarityClassification.json ADDED
@@ -0,0 +1,140 @@
+ {
+   "dataset_revision": "e2d317d38cd51312af73b3d32a06d1a08b442046",
+   "task_name": "AmazonPolarityClassification",
+   "mteb_version": "2.10.7",
+   "scores": {
+     "test": [
+       {
+         "scores_per_experiment": [
+           {
+             "accuracy": 0.774242,
+             "f1": 0.773955,
+             "f1_weighted": 0.773955,
+             "precision": 0.775647,
+             "precision_weighted": 0.775647,
+             "recall": 0.774242,
+             "recall_weighted": 0.774242,
+             "ap": 0.718111,
+             "ap_weighted": 0.718111
+           },
+           {
+             "accuracy": 0.712098,
+             "f1": 0.712096,
+             "f1_weighted": 0.712096,
+             "precision": 0.712101,
+             "precision_weighted": 0.712101,
+             "recall": 0.712098,
+             "recall_weighted": 0.712098,
+             "ap": 0.650841,
+             "ap_weighted": 0.650841
+           },
+           {
+             "accuracy": 0.800255,
+             "f1": 0.800027,
+             "f1_weighted": 0.800027,
+             "precision": 0.801633,
+             "precision_weighted": 0.801633,
+             "recall": 0.800255,
+             "recall_weighted": 0.800255,
+             "ap": 0.734574,
+             "ap_weighted": 0.734574
+           },
+           {
+             "accuracy": 0.770332,
+             "f1": 0.766838,
+             "f1_weighted": 0.766838,
+             "precision": 0.787571,
+             "precision_weighted": 0.787571,
+             "recall": 0.770333,
+             "recall_weighted": 0.770332,
+             "ap": 0.731939,
+             "ap_weighted": 0.731939
+           },
+           {
+             "accuracy": 0.79084,
+             "f1": 0.790832,
+             "f1_weighted": 0.790832,
+             "precision": 0.790883,
+             "precision_weighted": 0.790883,
+             "recall": 0.79084,
+             "recall_weighted": 0.79084,
+             "ap": 0.731053,
+             "ap_weighted": 0.731053
+           },
+           {
+             "accuracy": 0.731045,
+             "f1": 0.729318,
+             "f1_weighted": 0.729318,
+             "precision": 0.737096,
+             "precision_weighted": 0.737096,
+             "recall": 0.731045,
+             "recall_weighted": 0.731045,
+             "ap": 0.679053,
+             "ap_weighted": 0.679053
+           },
+           {
+             "accuracy": 0.753105,
+             "f1": 0.751956,
+             "f1_weighted": 0.751956,
+             "precision": 0.757883,
+             "precision_weighted": 0.757883,
+             "recall": 0.753105,
+             "recall_weighted": 0.753105,
+             "ap": 0.682939,
+             "ap_weighted": 0.682939
+           },
+           {
+             "accuracy": 0.816152,
+             "f1": 0.815148,
+             "f1_weighted": 0.815148,
+             "precision": 0.823176,
+             "precision_weighted": 0.823176,
+             "recall": 0.816153,
+             "recall_weighted": 0.816152,
+             "ap": 0.775312,
+             "ap_weighted": 0.775312
+           },
+           {
+             "accuracy": 0.731452,
+             "f1": 0.72609,
+             "f1_weighted": 0.72609,
+             "precision": 0.751118,
+             "precision_weighted": 0.751118,
+             "recall": 0.731453,
+             "recall_weighted": 0.731452,
+             "ap": 0.657583,
+             "ap_weighted": 0.657583
+           },
+           {
+             "accuracy": 0.792623,
+             "f1": 0.791771,
+             "f1_weighted": 0.791771,
+             "precision": 0.797491,
+             "precision_weighted": 0.797491,
+             "recall": 0.792623,
+             "recall_weighted": 0.792623,
+             "ap": 0.722228,
+             "ap_weighted": 0.722228
+           }
+         ],
+         "accuracy": 0.767215,
+         "f1": 0.765803,
+         "f1_weighted": 0.765803,
+         "precision": 0.77346,
+         "precision_weighted": 0.77346,
+         "recall": 0.767215,
+         "recall_weighted": 0.767215,
+         "ap": 0.708363,
+         "ap_weighted": 0.708363,
+         "main_score": 0.767215,
+         "hf_subset": "default",
+         "languages": [
+           "eng-Latn"
+         ]
+       }
+     ]
+   },
+   "evaluation_time": 5941.902814865112,
+   "kg_co2_emissions": null,
+   "date": null
+ }
results/AmazonReviewsClassification.json ADDED
@@ -0,0 +1,140 @@
+ {
+   "dataset_revision": "6b5d328eaae8ef408dd7d775040245cf86f92e9d",
+   "task_name": "AmazonReviewsClassification",
+   "mteb_version": "2.10.7",
+   "scores": {
+     "test": [
+       {
+         "scores_per_experiment": [
+           {
+             "accuracy": 0.4108,
+             "f1": 0.393756,
+             "f1_weighted": 0.393756,
+             "precision": 0.394797,
+             "precision_weighted": 0.394797,
+             "recall": 0.4108,
+             "recall_weighted": 0.4108,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.412,
+             "f1": 0.401409,
+             "f1_weighted": 0.401409,
+             "precision": 0.404834,
+             "precision_weighted": 0.404834,
+             "recall": 0.412,
+             "recall_weighted": 0.412,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.3964,
+             "f1": 0.39079,
+             "f1_weighted": 0.39079,
+             "precision": 0.388085,
+             "precision_weighted": 0.388085,
+             "recall": 0.3964,
+             "recall_weighted": 0.3964,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.3852,
+             "f1": 0.387035,
+             "f1_weighted": 0.387035,
+             "precision": 0.389871,
+             "precision_weighted": 0.389871,
+             "recall": 0.3852,
+             "recall_weighted": 0.3852,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.4038,
+             "f1": 0.38943,
+             "f1_weighted": 0.38943,
+             "precision": 0.397404,
+             "precision_weighted": 0.397404,
+             "recall": 0.4038,
+             "recall_weighted": 0.4038,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.3654,
+             "f1": 0.356646,
+             "f1_weighted": 0.356646,
+             "precision": 0.369934,
+             "precision_weighted": 0.369934,
+             "recall": 0.3654,
+             "recall_weighted": 0.3654,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.3604,
+             "f1": 0.356647,
+             "f1_weighted": 0.356647,
+             "precision": 0.354729,
+             "precision_weighted": 0.354729,
+             "recall": 0.3604,
+             "recall_weighted": 0.3604,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.414,
+             "f1": 0.410875,
+             "f1_weighted": 0.410875,
+             "precision": 0.40967,
+             "precision_weighted": 0.40967,
+             "recall": 0.414,
+             "recall_weighted": 0.414,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.3752,
+             "f1": 0.358417,
100
+ "f1_weighted": 0.358417,
101
+ "precision": 0.370493,
102
+ "precision_weighted": 0.370493,
103
+ "recall": 0.3752,
104
+ "recall_weighted": 0.3752,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.3772,
110
+ "f1": 0.359868,
111
+ "f1_weighted": 0.359868,
112
+ "precision": 0.36441,
113
+ "precision_weighted": 0.36441,
114
+ "recall": 0.3772,
115
+ "recall_weighted": 0.3772,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.39004,
121
+ "f1": 0.380487,
122
+ "f1_weighted": 0.380487,
123
+ "precision": 0.384423,
124
+ "precision_weighted": 0.384423,
125
+ "recall": 0.39004,
126
+ "recall_weighted": 0.39004,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.39004,
130
+ "hf_subset": "en",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 86.87519001960754,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/ArXivHierarchicalClusteringP2P.json ADDED
@@ -0,0 +1,47 @@
+ {
+ "dataset_revision": "0bbdb47bcbe3a90093699aefeed338a0f28a7ee8",
+ "task_name": "ArXivHierarchicalClusteringP2P",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "v_measures": {
+ "Level 0": [
+ 0.520154,
+ 0.515828,
+ 0.535515,
+ 0.533674,
+ 0.474025,
+ 0.490379,
+ 0.541541,
+ 0.534128,
+ 0.558393,
+ 0.515761
+ ],
+ "Level 1": [
+ 0.610618,
+ 0.570487,
+ 0.57927,
+ 0.58846,
+ 0.60162,
+ 0.580356,
+ 0.576163,
+ 0.59404,
+ 0.589051,
+ 0.597647
+ ]
+ },
+ "v_measure": 0.555355,
+ "v_measure_std": 0.038269,
+ "main_score": 0.555355,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 2.509460210800171,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/ArXivHierarchicalClusteringS2S.json ADDED
@@ -0,0 +1,47 @@
+ {
+ "dataset_revision": "b73bd54100e5abfa6e3a23dcafb46fe4d2438dc3",
+ "task_name": "ArXivHierarchicalClusteringS2S",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "v_measures": {
+ "Level 0": [
+ 0.469153,
+ 0.483904,
+ 0.526902,
+ 0.469214,
+ 0.47674,
+ 0.457316,
+ 0.464642,
+ 0.485136,
+ 0.457321,
+ 0.478392
+ ],
+ "Level 1": [
+ 0.574796,
+ 0.542969,
+ 0.580082,
+ 0.582003,
+ 0.58335,
+ 0.547177,
+ 0.540052,
+ 0.574295,
+ 0.547945,
+ 0.582625
+ ]
+ },
+ "v_measure": 0.521201,
+ "v_measure_std": 0.047967,
+ "main_score": 0.521201,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 2.479525566101074,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/ArguAna.json ADDED
@@ -0,0 +1,167 @@
+ {
+ "dataset_revision": "c22ab2a51041ffd869aaddef7af8d8215647e41a",
+ "task_name": "ArguAna",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "ndcg_at_1": 0.18421,
+ "ndcg_at_3": 0.31658,
+ "ndcg_at_5": 0.3627,
+ "ndcg_at_10": 0.42323,
+ "ndcg_at_20": 0.4614,
+ "ndcg_at_100": 0.48216,
+ "ndcg_at_1000": 0.48544,
+ "map_at_1": 0.18421,
+ "map_at_3": 0.28307,
+ "map_at_5": 0.30857,
+ "map_at_10": 0.33384,
+ "map_at_20": 0.34462,
+ "map_at_100": 0.34769,
+ "map_at_1000": 0.34782,
+ "recall_at_1": 0.18421,
+ "recall_at_3": 0.41394,
+ "recall_at_5": 0.52632,
+ "recall_at_10": 0.71195,
+ "recall_at_20": 0.8606,
+ "recall_at_100": 0.96942,
+ "recall_at_1000": 0.99502,
+ "accuracy": 0.18421,
+ "precision_at_1": 0.18421,
+ "precision_at_3": 0.13798,
+ "precision_at_5": 0.10526,
+ "precision_at_10": 0.07119,
+ "precision_at_20": 0.04303,
+ "precision_at_100": 0.00969,
+ "precision_at_1000": 0.001,
+ "mrr_at_1": 0.187055,
+ "mrr_at_3": 0.284258,
+ "mrr_at_5": 0.309471,
+ "mrr_at_10": 0.33498,
+ "mrr_at_20": 0.345595,
+ "mrr_at_100": 0.348699,
+ "mrr_at_1000": 0.348824,
+ "nauc_ndcg_at_1_max": -0.040756,
+ "nauc_ndcg_at_1_std": -0.117154,
+ "nauc_ndcg_at_1_diff1": 0.170308,
+ "nauc_ndcg_at_3_max": -0.030666,
+ "nauc_ndcg_at_3_std": -0.080075,
+ "nauc_ndcg_at_3_diff1": 0.084019,
+ "nauc_ndcg_at_5_max": -0.01313,
+ "nauc_ndcg_at_5_std": -0.084901,
+ "nauc_ndcg_at_5_diff1": 0.080777,
+ "nauc_ndcg_at_10_max": 0.015636,
+ "nauc_ndcg_at_10_std": -0.060906,
+ "nauc_ndcg_at_10_diff1": 0.090183,
+ "nauc_ndcg_at_20_max": 0.014795,
+ "nauc_ndcg_at_20_std": -0.075479,
+ "nauc_ndcg_at_20_diff1": 0.094634,
+ "nauc_ndcg_at_100_max": 0.006286,
+ "nauc_ndcg_at_100_std": -0.066076,
+ "nauc_ndcg_at_100_diff1": 0.103047,
+ "nauc_ndcg_at_1000_max": -0.00161,
+ "nauc_ndcg_at_1000_std": -0.076239,
+ "nauc_ndcg_at_1000_diff1": 0.100072,
+ "nauc_map_at_1_max": -0.040756,
+ "nauc_map_at_1_std": -0.117154,
+ "nauc_map_at_1_diff1": 0.170308,
+ "nauc_map_at_3_max": -0.032612,
+ "nauc_map_at_3_std": -0.089076,
+ "nauc_map_at_3_diff1": 0.101665,
+ "nauc_map_at_5_max": -0.022874,
+ "nauc_map_at_5_std": -0.091874,
+ "nauc_map_at_5_diff1": 0.09999,
+ "nauc_map_at_10_max": -0.01117,
+ "nauc_map_at_10_std": -0.082613,
+ "nauc_map_at_10_diff1": 0.104829,
+ "nauc_map_at_20_max": -0.01173,
+ "nauc_map_at_20_std": -0.087093,
+ "nauc_map_at_20_diff1": 0.106293,
+ "nauc_map_at_100_max": -0.012778,
+ "nauc_map_at_100_std": -0.086179,
+ "nauc_map_at_100_diff1": 0.107626,
+ "nauc_map_at_1000_max": -0.013034,
+ "nauc_map_at_1000_std": -0.086486,
+ "nauc_map_at_1000_diff1": 0.107523,
+ "nauc_recall_at_1_max": -0.040756,
+ "nauc_recall_at_1_std": -0.117154,
+ "nauc_recall_at_1_diff1": 0.170308,
+ "nauc_recall_at_3_max": -0.025834,
+ "nauc_recall_at_3_std": -0.057019,
+ "nauc_recall_at_3_diff1": 0.03984,
+ "nauc_recall_at_5_max": 0.015047,
+ "nauc_recall_at_5_std": -0.066905,
+ "nauc_recall_at_5_diff1": 0.030192,
+ "nauc_recall_at_10_max": 0.124208,
+ "nauc_recall_at_10_std": 0.025652,
+ "nauc_recall_at_10_diff1": 0.043643,
+ "nauc_recall_at_20_max": 0.205014,
+ "nauc_recall_at_20_std": -0.006456,
+ "nauc_recall_at_20_diff1": 0.038373,
+ "nauc_recall_at_100_max": 0.459125,
+ "nauc_recall_at_100_std": 0.637786,
+ "nauc_recall_at_100_diff1": 0.166582,
+ "nauc_recall_at_1000_max": -0.148905,
+ "nauc_recall_at_1000_std": 0.417181,
+ "nauc_recall_at_1000_diff1": -0.58964,
+ "nauc_precision_at_1_max": -0.040756,
+ "nauc_precision_at_1_std": -0.117154,
+ "nauc_precision_at_1_diff1": 0.170308,
+ "nauc_precision_at_3_max": -0.025834,
+ "nauc_precision_at_3_std": -0.057019,
+ "nauc_precision_at_3_diff1": 0.03984,
+ "nauc_precision_at_5_max": 0.015047,
+ "nauc_precision_at_5_std": -0.066905,
+ "nauc_precision_at_5_diff1": 0.030192,
+ "nauc_precision_at_10_max": 0.124208,
+ "nauc_precision_at_10_std": 0.025652,
+ "nauc_precision_at_10_diff1": 0.043643,
+ "nauc_precision_at_20_max": 0.205014,
+ "nauc_precision_at_20_std": -0.006456,
+ "nauc_precision_at_20_diff1": 0.038373,
+ "nauc_precision_at_100_max": 0.459125,
+ "nauc_precision_at_100_std": 0.637786,
+ "nauc_precision_at_100_diff1": 0.166582,
+ "nauc_precision_at_1000_max": -0.148905,
+ "nauc_precision_at_1000_std": 0.417181,
+ "nauc_precision_at_1000_diff1": -0.58964,
+ "nauc_mrr_at_1_max": -0.041155,
+ "nauc_mrr_at_1_std": -0.121635,
+ "nauc_mrr_at_1_diff1": 0.157903,
+ "nauc_mrr_at_3_max": -0.034367,
+ "nauc_mrr_at_3_std": -0.089098,
+ "nauc_mrr_at_3_diff1": 0.093949,
+ "nauc_mrr_at_5_max": -0.026883,
+ "nauc_mrr_at_5_std": -0.09347,
+ "nauc_mrr_at_5_diff1": 0.091462,
+ "nauc_mrr_at_10_max": -0.014492,
+ "nauc_mrr_at_10_std": -0.084178,
+ "nauc_mrr_at_10_diff1": 0.096409,
+ "nauc_mrr_at_20_max": -0.015278,
+ "nauc_mrr_at_20_std": -0.088586,
+ "nauc_mrr_at_20_diff1": 0.097639,
+ "nauc_mrr_at_100_max": -0.01629,
+ "nauc_mrr_at_100_std": -0.087496,
+ "nauc_mrr_at_100_diff1": 0.098877,
+ "nauc_mrr_at_1000_max": -0.016546,
+ "nauc_mrr_at_1000_std": -0.087802,
+ "nauc_mrr_at_1000_diff1": 0.098771,
+ "hit_rate_at_1": 0.18421,
+ "hit_rate_at_3": 0.41394,
+ "hit_rate_at_5": 0.52632,
+ "hit_rate_at_10": 0.71195,
+ "hit_rate_at_20": 0.8606,
+ "hit_rate_at_100": 0.96942,
+ "hit_rate_at_1000": 0.99502,
+ "main_score": 0.42323,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 28.33542037010193,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/AskUbuntuDupQuestions.json ADDED
@@ -0,0 +1,167 @@
+ {
+ "dataset_revision": "c5691e3c48741d5f83b5cc8e630653d7a8cfc048",
+ "task_name": "AskUbuntuDupQuestions",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "ndcg_at_1": 0.52078,
+ "ndcg_at_3": 0.51458,
+ "ndcg_at_5": 0.53232,
+ "ndcg_at_10": 0.60543,
+ "ndcg_at_20": 0.71998,
+ "ndcg_at_100": 0.71998,
+ "ndcg_at_1000": 0.71998,
+ "map_at_1": 0.1313,
+ "map_at_3": 0.25533,
+ "map_at_5": 0.3283,
+ "map_at_10": 0.44283,
+ "map_at_20": 0.55081,
+ "map_at_100": 0.55081,
+ "map_at_1000": 0.55081,
+ "recall_at_1": 0.1313,
+ "recall_at_3": 0.31087,
+ "recall_at_5": 0.45426,
+ "recall_at_10": 0.7067,
+ "recall_at_20": 1.0,
+ "recall_at_100": 1.0,
+ "recall_at_1000": 1.0,
+ "accuracy": 0.1313,
+ "precision_at_1": 0.52078,
+ "precision_at_3": 0.4626,
+ "precision_at_5": 0.4205,
+ "precision_at_10": 0.35679,
+ "precision_at_20": 0.27355,
+ "precision_at_100": 0.05471,
+ "precision_at_1000": 0.00547,
+ "mrr_at_1": 0.520776,
+ "mrr_at_3": 0.637119,
+ "mrr_at_5": 0.661496,
+ "mrr_at_10": 0.669663,
+ "mrr_at_20": 0.67308,
+ "mrr_at_100": 0.67308,
+ "mrr_at_1000": 0.67308,
+ "nauc_ndcg_at_1_max": 0.283561,
+ "nauc_ndcg_at_1_std": 0.226528,
+ "nauc_ndcg_at_1_diff1": 0.112504,
+ "nauc_ndcg_at_3_max": 0.215275,
+ "nauc_ndcg_at_3_std": 0.159846,
+ "nauc_ndcg_at_3_diff1": 0.089294,
+ "nauc_ndcg_at_5_max": 0.206483,
+ "nauc_ndcg_at_5_std": 0.176641,
+ "nauc_ndcg_at_5_diff1": 0.092346,
+ "nauc_ndcg_at_10_max": 0.202064,
+ "nauc_ndcg_at_10_std": 0.202791,
+ "nauc_ndcg_at_10_diff1": 0.060188,
+ "nauc_ndcg_at_20_max": 0.267742,
+ "nauc_ndcg_at_20_std": 0.185069,
+ "nauc_ndcg_at_20_diff1": 0.106487,
+ "nauc_ndcg_at_100_max": 0.267742,
+ "nauc_ndcg_at_100_std": 0.185069,
+ "nauc_ndcg_at_100_diff1": 0.106487,
+ "nauc_ndcg_at_1000_max": 0.267742,
+ "nauc_ndcg_at_1000_std": 0.185069,
+ "nauc_ndcg_at_1000_diff1": 0.106487,
+ "nauc_map_at_1_max": 0.058718,
+ "nauc_map_at_1_std": 0.098439,
+ "nauc_map_at_1_diff1": 0.146192,
+ "nauc_map_at_3_max": 0.043917,
+ "nauc_map_at_3_std": 0.111986,
+ "nauc_map_at_3_diff1": 0.091431,
+ "nauc_map_at_5_max": 0.080912,
+ "nauc_map_at_5_std": 0.152408,
+ "nauc_map_at_5_diff1": 0.083909,
+ "nauc_map_at_10_max": 0.148512,
+ "nauc_map_at_10_std": 0.184537,
+ "nauc_map_at_10_diff1": 0.067553,
+ "nauc_map_at_20_max": 0.225526,
+ "nauc_map_at_20_std": 0.174553,
+ "nauc_map_at_20_diff1": 0.093769,
+ "nauc_map_at_100_max": 0.225526,
+ "nauc_map_at_100_std": 0.174553,
+ "nauc_map_at_100_diff1": 0.093769,
+ "nauc_map_at_1000_max": 0.225526,
+ "nauc_map_at_1000_std": 0.174553,
+ "nauc_map_at_1000_diff1": 0.093769,
+ "nauc_recall_at_1_max": 0.058718,
+ "nauc_recall_at_1_std": 0.098439,
+ "nauc_recall_at_1_diff1": 0.146192,
+ "nauc_recall_at_3_max": -0.01651,
+ "nauc_recall_at_3_std": 0.061092,
+ "nauc_recall_at_3_diff1": 0.072252,
+ "nauc_recall_at_5_max": -0.001038,
+ "nauc_recall_at_5_std": 0.104026,
+ "nauc_recall_at_5_diff1": 0.044219,
+ "nauc_recall_at_10_max": -0.017375,
+ "nauc_recall_at_10_std": 0.159011,
+ "nauc_recall_at_10_diff1": -0.053415,
+ "nauc_recall_at_20_max": NaN,
+ "nauc_recall_at_20_std": NaN,
+ "nauc_recall_at_20_diff1": NaN,
+ "nauc_recall_at_100_max": NaN,
+ "nauc_recall_at_100_std": NaN,
+ "nauc_recall_at_100_diff1": NaN,
+ "nauc_recall_at_1000_max": NaN,
+ "nauc_recall_at_1000_std": NaN,
+ "nauc_recall_at_1000_diff1": NaN,
+ "nauc_precision_at_1_max": 0.283561,
+ "nauc_precision_at_1_std": 0.226528,
+ "nauc_precision_at_1_diff1": 0.112504,
+ "nauc_precision_at_3_max": 0.227009,
+ "nauc_precision_at_3_std": 0.157928,
+ "nauc_precision_at_3_diff1": 0.032997,
+ "nauc_precision_at_5_max": 0.262271,
+ "nauc_precision_at_5_std": 0.171762,
+ "nauc_precision_at_5_diff1": 0.028237,
+ "nauc_precision_at_10_max": 0.266197,
+ "nauc_precision_at_10_std": 0.114503,
+ "nauc_precision_at_10_diff1": 0.002259,
+ "nauc_precision_at_20_max": 0.237252,
+ "nauc_precision_at_20_std": 0.054517,
+ "nauc_precision_at_20_diff1": 0.034271,
+ "nauc_precision_at_100_max": 0.237252,
+ "nauc_precision_at_100_std": 0.054517,
+ "nauc_precision_at_100_diff1": 0.034271,
+ "nauc_precision_at_1000_max": 0.237252,
+ "nauc_precision_at_1000_std": 0.054517,
+ "nauc_precision_at_1000_diff1": 0.034271,
+ "nauc_mrr_at_1_max": 0.283561,
+ "nauc_mrr_at_1_std": 0.226528,
+ "nauc_mrr_at_1_diff1": 0.112504,
+ "nauc_mrr_at_3_max": 0.273495,
+ "nauc_mrr_at_3_std": 0.205782,
+ "nauc_mrr_at_3_diff1": 0.118475,
+ "nauc_mrr_at_5_max": 0.284715,
+ "nauc_mrr_at_5_std": 0.20563,
+ "nauc_mrr_at_5_diff1": 0.119816,
+ "nauc_mrr_at_10_max": 0.281396,
+ "nauc_mrr_at_10_std": 0.212961,
+ "nauc_mrr_at_10_diff1": 0.117308,
+ "nauc_mrr_at_20_max": 0.280871,
+ "nauc_mrr_at_20_std": 0.210843,
+ "nauc_mrr_at_20_diff1": 0.117422,
+ "nauc_mrr_at_100_max": 0.280871,
+ "nauc_mrr_at_100_std": 0.210843,
+ "nauc_mrr_at_100_diff1": 0.117422,
+ "nauc_mrr_at_1000_max": 0.280871,
+ "nauc_mrr_at_1000_std": 0.210843,
+ "nauc_mrr_at_1000_diff1": 0.117422,
+ "hit_rate_at_1": 0.52078,
+ "hit_rate_at_3": 0.78393,
+ "hit_rate_at_5": 0.89197,
+ "hit_rate_at_10": 0.95291,
+ "hit_rate_at_20": 1.0,
+ "hit_rate_at_100": 1.0,
+ "hit_rate_at_1000": 1.0,
+ "main_score": 0.55081,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 19.16975712776184,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/BIOSSES.json ADDED
@@ -0,0 +1,27 @@
+ {
+ "dataset_revision": "d3fb88f8f02e40887cd149695127462bbcf29b4a",
+ "task_name": "BIOSSES",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "pearson": 0.844406,
+ "spearman": 0.838091,
+ "cosine_pearson": 0.844406,
+ "cosine_spearman": 0.838091,
+ "manhattan_pearson": 0.822617,
+ "manhattan_spearman": 0.82729,
+ "euclidean_pearson": 0.826264,
+ "euclidean_spearman": 0.838091,
+ "main_score": 0.838091,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 0.7556891441345215,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/Banking77Classification.json ADDED
@@ -0,0 +1,140 @@
+ {
+ "dataset_revision": "0fd18e25b25c072e09e0d92ab615fda904d66300",
+ "task_name": "Banking77Classification",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "scores_per_experiment": [
+ {
+ "accuracy": 0.773377,
+ "f1": 0.763021,
+ "f1_weighted": 0.763021,
+ "precision": 0.78969,
+ "precision_weighted": 0.78969,
+ "recall": 0.773377,
+ "recall_weighted": 0.773377,
+ "ap": null,
+ "ap_weighted": null
+ },
+ {
+ "accuracy": 0.786039,
+ "f1": 0.779442,
+ "f1_weighted": 0.779442,
+ "precision": 0.805212,
+ "precision_weighted": 0.805212,
+ "recall": 0.786039,
+ "recall_weighted": 0.786039,
+ "ap": null,
+ "ap_weighted": null
+ },
+ {
+ "accuracy": 0.776623,
+ "f1": 0.768612,
+ "f1_weighted": 0.768612,
+ "precision": 0.799428,
+ "precision_weighted": 0.799428,
+ "recall": 0.776623,
+ "recall_weighted": 0.776623,
+ "ap": null,
+ "ap_weighted": null
+ },
+ {
+ "accuracy": 0.778896,
+ "f1": 0.773521,
+ "f1_weighted": 0.773521,
+ "precision": 0.795165,
+ "precision_weighted": 0.795165,
+ "recall": 0.778896,
+ "recall_weighted": 0.778896,
+ "ap": null,
+ "ap_weighted": null
+ },
+ {
+ "accuracy": 0.771104,
+ "f1": 0.763493,
+ "f1_weighted": 0.763493,
+ "precision": 0.795884,
+ "precision_weighted": 0.795884,
+ "recall": 0.771104,
+ "recall_weighted": 0.771104,
+ "ap": null,
+ "ap_weighted": null
+ },
+ {
+ "accuracy": 0.774675,
+ "f1": 0.770592,
+ "f1_weighted": 0.770592,
+ "precision": 0.795213,
+ "precision_weighted": 0.795213,
+ "recall": 0.774675,
+ "recall_weighted": 0.774675,
+ "ap": null,
+ "ap_weighted": null
+ },
+ {
+ "accuracy": 0.773701,
+ "f1": 0.766316,
+ "f1_weighted": 0.766316,
+ "precision": 0.789125,
+ "precision_weighted": 0.789125,
+ "recall": 0.773701,
+ "recall_weighted": 0.773701,
+ "ap": null,
+ "ap_weighted": null
+ },
+ {
+ "accuracy": 0.768831,
+ "f1": 0.760985,
+ "f1_weighted": 0.760985,
+ "precision": 0.784371,
+ "precision_weighted": 0.784371,
+ "recall": 0.768831,
+ "recall_weighted": 0.768831,
+ "ap": null,
+ "ap_weighted": null
+ },
+ {
+ "accuracy": 0.759091,
+ "f1": 0.750893,
+ "f1_weighted": 0.750893,
+ "precision": 0.785353,
+ "precision_weighted": 0.785353,
+ "recall": 0.759091,
+ "recall_weighted": 0.759091,
+ "ap": null,
+ "ap_weighted": null
+ },
+ {
+ "accuracy": 0.775649,
+ "f1": 0.769507,
+ "f1_weighted": 0.769507,
+ "precision": 0.79415,
+ "precision_weighted": 0.79415,
+ "recall": 0.775649,
+ "recall_weighted": 0.775649,
+ "ap": null,
+ "ap_weighted": null
+ }
+ ],
+ "accuracy": 0.773799,
+ "f1": 0.766638,
+ "f1_weighted": 0.766638,
+ "precision": 0.793359,
+ "precision_weighted": 0.793359,
+ "recall": 0.773799,
+ "recall_weighted": 0.773799,
+ "ap": NaN,
+ "ap_weighted": NaN,
+ "main_score": 0.773799,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 61.292359590530396,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/BiorxivClusteringP2P.json ADDED
@@ -0,0 +1,33 @@
+ {
+ "dataset_revision": "65b79d1d13f80053f67aca9498d9402c2d9f1f40",
+ "task_name": "BiorxivClusteringP2P",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "v_measure": 0.332819,
+ "v_measure_std": 0.00929,
+ "v_measures": [
+ 0.329148,
+ 0.344644,
+ 0.342251,
+ 0.310465,
+ 0.328389,
+ 0.332113,
+ 0.341108,
+ 0.333053,
+ 0.337463,
+ 0.329554
+ ],
+ "main_score": 0.332819,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 196.8920702934265,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/BiorxivClusteringS2S.json ADDED
@@ -0,0 +1,33 @@
+ {
+ "dataset_revision": "258694dd0231531bc1fd9de6ceb52a0853c6d908",
+ "task_name": "BiorxivClusteringS2S",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "v_measure": 0.254135,
+ "v_measure_std": 0.010097,
+ "v_measures": [
+ 0.243601,
+ 0.262892,
+ 0.25074,
+ 0.248503,
+ 0.235993,
+ 0.25293,
+ 0.251425,
+ 0.272283,
+ 0.258935,
+ 0.264044
+ ],
+ "main_score": 0.254135,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 181.23420882225037,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/CQADupstackAndroidRetrieval.json ADDED
@@ -0,0 +1,167 @@
+ {
+ "dataset_revision": "9be4c0e46342e8e3aff577a89b9a1ec9bc6b4af3",
+ "task_name": "CQADupstackAndroidRetrieval",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "ndcg_at_1": 0.27325,
+ "ndcg_at_3": 0.32725,
+ "ndcg_at_5": 0.34873,
+ "ndcg_at_10": 0.36801,
+ "ndcg_at_20": 0.38887,
+ "ndcg_at_100": 0.42454,
+ "ndcg_at_1000": 0.45455,
+ "map_at_1": 0.23147,
+ "map_at_3": 0.29181,
+ "map_at_5": 0.30738,
+ "map_at_10": 0.31748,
+ "map_at_20": 0.32441,
+ "map_at_100": 0.33035,
+ "map_at_1000": 0.33195,
+ "recall_at_1": 0.23147,
+ "recall_at_3": 0.35425,
+ "recall_at_5": 0.40952,
+ "recall_at_10": 0.46966,
+ "recall_at_20": 0.54627,
+ "recall_at_100": 0.71602,
+ "recall_at_1000": 0.91233,
+ "accuracy": 0.23147,
+ "precision_at_1": 0.27325,
+ "precision_at_3": 0.15498,
+ "precision_at_5": 0.11359,
+ "precision_at_10": 0.06838,
+ "precision_at_20": 0.04127,
+ "precision_at_100": 0.01177,
+ "precision_at_1000": 0.00175,
+ "mrr_at_1": 0.273247,
+ "mrr_at_3": 0.33691,
+ "mrr_at_5": 0.352861,
+ "mrr_at_10": 0.361644,
+ "mrr_at_20": 0.366881,
+ "mrr_at_100": 0.37096,
+ "mrr_at_1000": 0.37163,
+ "nauc_ndcg_at_1_max": 0.191269,
+ "nauc_ndcg_at_1_std": -0.081532,
+ "nauc_ndcg_at_1_diff1": 0.445361,
+ "nauc_ndcg_at_3_max": 0.189527,
+ "nauc_ndcg_at_3_std": -0.03902,
+ "nauc_ndcg_at_3_diff1": 0.418706,
+ "nauc_ndcg_at_5_max": 0.196867,
+ "nauc_ndcg_at_5_std": -0.033222,
+ "nauc_ndcg_at_5_diff1": 0.41318,
+ "nauc_ndcg_at_10_max": 0.204271,
+ "nauc_ndcg_at_10_std": -0.009138,
+ "nauc_ndcg_at_10_diff1": 0.412094,
+ "nauc_ndcg_at_20_max": 0.214548,
+ "nauc_ndcg_at_20_std": 0.002806,
+ "nauc_ndcg_at_20_diff1": 0.409864,
+ "nauc_ndcg_at_100_max": 0.221849,
+ "nauc_ndcg_at_100_std": 0.011525,
+ "nauc_ndcg_at_100_diff1": 0.411306,
+ "nauc_ndcg_at_1000_max": 0.220773,
+ "nauc_ndcg_at_1000_std": 0.009238,
+ "nauc_ndcg_at_1000_diff1": 0.411122,
+ "nauc_map_at_1_max": 0.164655,
+ "nauc_map_at_1_std": -0.062543,
+ "nauc_map_at_1_diff1": 0.477909,
+ "nauc_map_at_3_max": 0.182996,
+ "nauc_map_at_3_std": -0.045929,
+ "nauc_map_at_3_diff1": 0.437227,
+ "nauc_map_at_5_max": 0.190009,
+ "nauc_map_at_5_std": -0.040515,
+ "nauc_map_at_5_diff1": 0.430601,
+ "nauc_map_at_10_max": 0.194091,
+ "nauc_map_at_10_std": -0.029762,
+ "nauc_map_at_10_diff1": 0.427398,
+ "nauc_map_at_20_max": 0.198465,
+ "nauc_map_at_20_std": -0.025707,
+ "nauc_map_at_20_diff1": 0.426071,
+ "nauc_map_at_100_max": 0.199775,
+ "nauc_map_at_100_std": -0.024717,
+ "nauc_map_at_100_diff1": 0.426101,
+ "nauc_map_at_1000_max": 0.199774,
+ "nauc_map_at_1000_std": -0.024556,
+ "nauc_map_at_1000_diff1": 0.426137,
+ "nauc_recall_at_1_max": 0.164655,
+ "nauc_recall_at_1_std": -0.062543,
+ "nauc_recall_at_1_diff1": 0.477909,
+ "nauc_recall_at_3_max": 0.182829,
+ "nauc_recall_at_3_std": -0.016419,
+ "nauc_recall_at_3_diff1": 0.398849,
+ "nauc_recall_at_5_max": 0.200567,
+ "nauc_recall_at_5_std": -0.002313,
+ "nauc_recall_at_5_diff1": 0.378741,
+ "nauc_recall_at_10_max": 0.216364,
+ "nauc_recall_at_10_std": 0.067073,
+ "nauc_recall_at_10_diff1": 0.365825,
+ "nauc_recall_at_20_max": 0.244567,
+ "nauc_recall_at_20_std": 0.112261,
+ "nauc_recall_at_20_diff1": 0.344649,
+ "nauc_recall_at_100_max": 0.293471,
+ "nauc_recall_at_100_std": 0.209091,
+ "nauc_recall_at_100_diff1": 0.336814,
+ "nauc_recall_at_1000_max": 0.429681,
+ "nauc_recall_at_1000_std": 0.503093,
+ "nauc_recall_at_1000_diff1": 0.285465,
+ "nauc_precision_at_1_max": 0.191269,
+ "nauc_precision_at_1_std": -0.081532,
+ "nauc_precision_at_1_diff1": 0.445361,
+ "nauc_precision_at_3_max": 0.201968,
+ "nauc_precision_at_3_std": -0.028264,
+ "nauc_precision_at_3_diff1": 0.313341,
+ "nauc_precision_at_5_max": 0.206313,
+ "nauc_precision_at_5_std": -0.002351,
+ "nauc_precision_at_5_diff1": 0.255312,
+ "nauc_precision_at_10_max": 0.192068,
+ "nauc_precision_at_10_std": 0.050436,
+ "nauc_precision_at_10_diff1": 0.198217,
+ "nauc_precision_at_20_max": 0.189118,
+ "nauc_precision_at_20_std": 0.074088,
+ "nauc_precision_at_20_diff1": 0.152343,
+ "nauc_precision_at_100_max": 0.130064,
+ "nauc_precision_at_100_std": 0.070885,
+ "nauc_precision_at_100_diff1": 0.060724,
+ "nauc_precision_at_1000_max": -0.01841,
+ "nauc_precision_at_1000_std": -0.028442,
+ "nauc_precision_at_1000_diff1": -0.028636,
+ "nauc_mrr_at_1_max": 0.191269,
+ "nauc_mrr_at_1_std": -0.081532,
+ "nauc_mrr_at_1_diff1": 0.445361,
+ "nauc_mrr_at_3_max": 0.198708,
+ "nauc_mrr_at_3_std": -0.05805,
+ "nauc_mrr_at_3_diff1": 0.412522,
+ "nauc_mrr_at_5_max": 0.202005,
+ "nauc_mrr_at_5_std": -0.054669,
+ "nauc_mrr_at_5_diff1": 0.409534,
+ "nauc_mrr_at_10_max": 0.203933,
+ "nauc_mrr_at_10_std": -0.045568,
+ "nauc_mrr_at_10_diff1": 0.41045,
+ "nauc_mrr_at_20_max": 0.206089,
+ "nauc_mrr_at_20_std": -0.04366,
+ "nauc_mrr_at_20_diff1": 0.411157,
+ "nauc_mrr_at_100_max": 0.205686,
+ "nauc_mrr_at_100_std": -0.043721,
+ "nauc_mrr_at_100_diff1": 0.410602,
+ "nauc_mrr_at_1000_max": 0.205823,
+ "nauc_mrr_at_1000_std": -0.043815,
+ "nauc_mrr_at_1000_diff1": 0.410594,
+ "hit_rate_at_1": 0.27325,
+ "hit_rate_at_3": 0.41917,
+ "hit_rate_at_5": 0.48927,
+ "hit_rate_at_10": 0.55508,
+ "hit_rate_at_20": 0.63233,
+ "hit_rate_at_100": 0.79542,
+ "hit_rate_at_1000": 0.94564,
+ "main_score": 0.36801,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 60.65263891220093,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/CQADupstackEnglishRetrieval.json ADDED
@@ -0,0 +1,167 @@
+ {
+ "dataset_revision": "ad9991cb51e31e31e430383c75ffb2885547b5f0",
+ "task_name": "CQADupstackEnglishRetrieval",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "ndcg_at_1": 0.26306,
+ "ndcg_at_3": 0.29527,
+ "ndcg_at_5": 0.31199,
+ "ndcg_at_10": 0.33008,
+ "ndcg_at_20": 0.34991,
+ "ndcg_at_100": 0.37615,
+ "ndcg_at_1000": 0.40468,
+ "map_at_1": 0.2107,
+ "map_at_3": 0.26094,
+ "map_at_5": 0.27411,
+ "map_at_10": 0.28383,
+ "map_at_20": 0.29053,
+ "map_at_100": 0.29554,
+ "map_at_1000": 0.29702,
+ "recall_at_1": 0.2107,
+ "recall_at_3": 0.30788,
+ "recall_at_5": 0.35456,
+ "recall_at_10": 0.41092,
+ "recall_at_20": 0.48302,
+ "recall_at_100": 0.60831,
+ "recall_at_1000": 0.80037,
+ "accuracy": 0.2107,
+ "precision_at_1": 0.26306,
+ "precision_at_3": 0.1431,
+ "precision_at_5": 0.10331,
+ "precision_at_10": 0.06395,
+ "precision_at_20": 0.03994,
+ "precision_at_100": 0.01126,
+ "precision_at_1000": 0.00168,
+ "mrr_at_1": 0.263057,
+ "mrr_at_3": 0.313376,
+ "mrr_at_5": 0.324586,
+ "mrr_at_10": 0.333392,
+ "mrr_at_20": 0.339062,
+ "mrr_at_100": 0.341833,
+ "mrr_at_1000": 0.342509,
+ "nauc_ndcg_at_1_max": 0.291716,
+ "nauc_ndcg_at_1_std": 0.013269,
+ "nauc_ndcg_at_1_diff1": 0.504039,
+ "nauc_ndcg_at_3_max": 0.251007,
+ "nauc_ndcg_at_3_std": -0.001913,
+ "nauc_ndcg_at_3_diff1": 0.437945,
+ "nauc_ndcg_at_5_max": 0.252196,
+ "nauc_ndcg_at_5_std": 0.002045,
+ "nauc_ndcg_at_5_diff1": 0.429674,
+ "nauc_ndcg_at_10_max": 0.249708,
+ "nauc_ndcg_at_10_std": 0.005478,
+ "nauc_ndcg_at_10_diff1": 0.424842,
+ "nauc_ndcg_at_20_max": 0.249471,
+ "nauc_ndcg_at_20_std": 0.011919,
+ "nauc_ndcg_at_20_diff1": 0.416657,
+ "nauc_ndcg_at_100_max": 0.257433,
+ "nauc_ndcg_at_100_std": 0.028522,
+ "nauc_ndcg_at_100_diff1": 0.420342,
+ "nauc_ndcg_at_1000_max": 0.26413,
+ "nauc_ndcg_at_1000_std": 0.041954,
+ "nauc_ndcg_at_1000_diff1": 0.415154,
+ "nauc_map_at_1_max": 0.248571,
+ "nauc_map_at_1_std": -0.027072,
+ "nauc_map_at_1_diff1": 0.539663,
+ "nauc_map_at_3_max": 0.237047,
+ "nauc_map_at_3_std": -0.026754,
+ "nauc_map_at_3_diff1": 0.476703,
+ "nauc_map_at_5_max": 0.24069,
+ "nauc_map_at_5_std": -0.019902,
+ "nauc_map_at_5_diff1": 0.468007,
+ "nauc_map_at_10_max": 0.244946,
+ "nauc_map_at_10_std": -0.01445,
+ "nauc_map_at_10_diff1": 0.462874,
+ "nauc_map_at_20_max": 0.24768,
+ "nauc_map_at_20_std": -0.009102,
+ "nauc_map_at_20_diff1": 0.459081,
+ "nauc_map_at_100_max": 0.251077,
+ "nauc_map_at_100_std": -0.003816,
+ "nauc_map_at_100_diff1": 0.45858,
+ "nauc_map_at_1000_max": 0.252101,
+ "nauc_map_at_1000_std": -0.002101,
+ "nauc_map_at_1000_diff1": 0.457664,
+ "nauc_recall_at_1_max": 0.248571,
+ "nauc_recall_at_1_std": -0.027072,
+ "nauc_recall_at_1_diff1": 0.539663,
+ "nauc_recall_at_3_max": 0.201544,
+ "nauc_recall_at_3_std": -0.030379,
+ "nauc_recall_at_3_diff1": 0.412651,
+ "nauc_recall_at_5_max": 0.201675,
+ "nauc_recall_at_5_std": -0.014002,
+ "nauc_recall_at_5_diff1": 0.378035,
+ "nauc_recall_at_10_max": 0.203024,
+ "nauc_recall_at_10_std": 0.001326,
+ "nauc_recall_at_10_diff1": 0.349493,
+ "nauc_recall_at_20_max": 0.197157,
+ "nauc_recall_at_20_std": 0.0217,
+ "nauc_recall_at_20_diff1": 0.306041,
+ "nauc_recall_at_100_max": 0.225177,
+ "nauc_recall_at_100_std": 0.099068,
+ "nauc_recall_at_100_diff1": 0.310556,
+ "nauc_recall_at_1000_max": 0.264082,
+ "nauc_recall_at_1000_std": 0.232027,
+ "nauc_recall_at_1000_diff1": 0.253516,
+ "nauc_precision_at_1_max": 0.291716,
+ "nauc_precision_at_1_std": 0.013269,
+ "nauc_precision_at_1_diff1": 0.504039,
+ "nauc_precision_at_3_max": 0.262759,
111
+ "nauc_precision_at_3_std": 0.057721,
112
+ "nauc_precision_at_3_diff1": 0.304756,
113
+ "nauc_precision_at_5_max": 0.279442,
114
+ "nauc_precision_at_5_std": 0.100679,
115
+ "nauc_precision_at_5_diff1": 0.234081,
116
+ "nauc_precision_at_10_max": 0.287459,
117
+ "nauc_precision_at_10_std": 0.144557,
118
+ "nauc_precision_at_10_diff1": 0.156109,
119
+ "nauc_precision_at_20_max": 0.296862,
120
+ "nauc_precision_at_20_std": 0.206718,
121
+ "nauc_precision_at_20_diff1": 0.066229,
122
+ "nauc_precision_at_100_max": 0.293532,
123
+ "nauc_precision_at_100_std": 0.288593,
124
+ "nauc_precision_at_100_diff1": -0.038353,
125
+ "nauc_precision_at_1000_max": 0.227376,
126
+ "nauc_precision_at_1000_std": 0.325348,
127
+ "nauc_precision_at_1000_diff1": -0.171045,
128
+ "nauc_mrr_at_1_max": 0.291716,
129
+ "nauc_mrr_at_1_std": 0.013269,
130
+ "nauc_mrr_at_1_diff1": 0.504039,
131
+ "nauc_mrr_at_3_max": 0.267701,
132
+ "nauc_mrr_at_3_std": 0.011158,
133
+ "nauc_mrr_at_3_diff1": 0.442983,
134
+ "nauc_mrr_at_5_max": 0.269357,
135
+ "nauc_mrr_at_5_std": 0.015544,
136
+ "nauc_mrr_at_5_diff1": 0.434879,
137
+ "nauc_mrr_at_10_max": 0.268916,
138
+ "nauc_mrr_at_10_std": 0.018307,
139
+ "nauc_mrr_at_10_diff1": 0.430385,
140
+ "nauc_mrr_at_20_max": 0.26927,
141
+ "nauc_mrr_at_20_std": 0.019146,
142
+ "nauc_mrr_at_20_diff1": 0.428644,
143
+ "nauc_mrr_at_100_max": 0.269891,
144
+ "nauc_mrr_at_100_std": 0.020217,
145
+ "nauc_mrr_at_100_diff1": 0.429509,
146
+ "nauc_mrr_at_1000_max": 0.269966,
147
+ "nauc_mrr_at_1000_std": 0.020452,
148
+ "nauc_mrr_at_1000_diff1": 0.429617,
149
+ "hit_rate_at_1": 0.26306,
150
+ "hit_rate_at_3": 0.37771,
151
+ "hit_rate_at_5": 0.42739,
152
+ "hit_rate_at_10": 0.49299,
153
+ "hit_rate_at_20": 0.57261,
154
+ "hit_rate_at_100": 0.68344,
155
+ "hit_rate_at_1000": 0.85223,
156
+ "main_score": 0.33008,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 104.83354067802429,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackGamingRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "4885aa143210c98657558c04aaf3dc47cfb54340",
3
+ "task_name": "CQADupstackGamingRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.33166,
9
+ "ndcg_at_3": 0.40134,
10
+ "ndcg_at_5": 0.42304,
11
+ "ndcg_at_10": 0.45007,
12
+ "ndcg_at_20": 0.46718,
13
+ "ndcg_at_100": 0.49618,
14
+ "ndcg_at_1000": 0.51402,
15
+ "map_at_1": 0.28781,
16
+ "map_at_3": 0.36766,
17
+ "map_at_5": 0.38184,
18
+ "map_at_10": 0.3943,
19
+ "map_at_20": 0.4001,
20
+ "map_at_100": 0.40492,
21
+ "map_at_1000": 0.40569,
22
+ "recall_at_1": 0.28781,
23
+ "recall_at_3": 0.45151,
24
+ "recall_at_5": 0.50364,
25
+ "recall_at_10": 0.58407,
26
+ "recall_at_20": 0.64667,
27
+ "recall_at_100": 0.78808,
28
+ "recall_at_1000": 0.91902,
29
+ "accuracy": 0.28781,
30
+ "precision_at_1": 0.33166,
31
+ "precision_at_3": 0.18077,
32
+ "precision_at_5": 0.12464,
33
+ "precision_at_10": 0.07429,
34
+ "precision_at_20": 0.04232,
35
+ "precision_at_100": 0.01077,
36
+ "precision_at_1000": 0.00129,
37
+ "mrr_at_1": 0.331661,
38
+ "mrr_at_3": 0.404075,
39
+ "mrr_at_5": 0.416144,
40
+ "mrr_at_10": 0.427083,
41
+ "mrr_at_20": 0.431337,
42
+ "mrr_at_100": 0.434685,
43
+ "mrr_at_1000": 0.435156,
44
+ "nauc_ndcg_at_1_max": 0.300718,
45
+ "nauc_ndcg_at_1_std": -0.048042,
46
+ "nauc_ndcg_at_1_diff1": 0.490193,
47
+ "nauc_ndcg_at_3_max": 0.268874,
48
+ "nauc_ndcg_at_3_std": -0.069336,
49
+ "nauc_ndcg_at_3_diff1": 0.43476,
50
+ "nauc_ndcg_at_5_max": 0.282918,
51
+ "nauc_ndcg_at_5_std": -0.056988,
52
+ "nauc_ndcg_at_5_diff1": 0.426014,
53
+ "nauc_ndcg_at_10_max": 0.279594,
54
+ "nauc_ndcg_at_10_std": -0.053439,
55
+ "nauc_ndcg_at_10_diff1": 0.42593,
56
+ "nauc_ndcg_at_20_max": 0.289306,
57
+ "nauc_ndcg_at_20_std": -0.039443,
58
+ "nauc_ndcg_at_20_diff1": 0.42625,
59
+ "nauc_ndcg_at_100_max": 0.301556,
60
+ "nauc_ndcg_at_100_std": -0.021085,
61
+ "nauc_ndcg_at_100_diff1": 0.424096,
62
+ "nauc_ndcg_at_1000_max": 0.304947,
63
+ "nauc_ndcg_at_1000_std": -0.022049,
64
+ "nauc_ndcg_at_1000_diff1": 0.43089,
65
+ "nauc_map_at_1_max": 0.259938,
66
+ "nauc_map_at_1_std": -0.072908,
67
+ "nauc_map_at_1_diff1": 0.491451,
68
+ "nauc_map_at_3_max": 0.261373,
69
+ "nauc_map_at_3_std": -0.080522,
70
+ "nauc_map_at_3_diff1": 0.449579,
71
+ "nauc_map_at_5_max": 0.269659,
72
+ "nauc_map_at_5_std": -0.070982,
73
+ "nauc_map_at_5_diff1": 0.443534,
74
+ "nauc_map_at_10_max": 0.269735,
75
+ "nauc_map_at_10_std": -0.067827,
76
+ "nauc_map_at_10_diff1": 0.443509,
77
+ "nauc_map_at_20_max": 0.274598,
78
+ "nauc_map_at_20_std": -0.061974,
79
+ "nauc_map_at_20_diff1": 0.443367,
80
+ "nauc_map_at_100_max": 0.277786,
81
+ "nauc_map_at_100_std": -0.058075,
82
+ "nauc_map_at_100_diff1": 0.442864,
83
+ "nauc_map_at_1000_max": 0.278279,
84
+ "nauc_map_at_1000_std": -0.057844,
85
+ "nauc_map_at_1000_diff1": 0.443131,
86
+ "nauc_recall_at_1_max": 0.259938,
87
+ "nauc_recall_at_1_std": -0.072908,
88
+ "nauc_recall_at_1_diff1": 0.491451,
89
+ "nauc_recall_at_3_max": 0.238132,
90
+ "nauc_recall_at_3_std": -0.08786,
91
+ "nauc_recall_at_3_diff1": 0.388728,
92
+ "nauc_recall_at_5_max": 0.268101,
93
+ "nauc_recall_at_5_std": -0.056521,
94
+ "nauc_recall_at_5_diff1": 0.36559,
95
+ "nauc_recall_at_10_max": 0.25348,
96
+ "nauc_recall_at_10_std": -0.045808,
97
+ "nauc_recall_at_10_diff1": 0.355903,
98
+ "nauc_recall_at_20_max": 0.293821,
99
+ "nauc_recall_at_20_std": 0.018654,
100
+ "nauc_recall_at_20_diff1": 0.346775,
101
+ "nauc_recall_at_100_max": 0.364392,
102
+ "nauc_recall_at_100_std": 0.163059,
103
+ "nauc_recall_at_100_diff1": 0.292505,
104
+ "nauc_recall_at_1000_max": 0.49132,
105
+ "nauc_recall_at_1000_std": 0.334752,
106
+ "nauc_recall_at_1000_diff1": 0.328464,
107
+ "nauc_precision_at_1_max": 0.300718,
108
+ "nauc_precision_at_1_std": -0.048042,
109
+ "nauc_precision_at_1_diff1": 0.490193,
110
+ "nauc_precision_at_3_max": 0.28472,
111
+ "nauc_precision_at_3_std": -0.029571,
112
+ "nauc_precision_at_3_diff1": 0.340208,
113
+ "nauc_precision_at_5_max": 0.307124,
114
+ "nauc_precision_at_5_std": 0.027012,
115
+ "nauc_precision_at_5_diff1": 0.274712,
116
+ "nauc_precision_at_10_max": 0.3068,
117
+ "nauc_precision_at_10_std": 0.077365,
118
+ "nauc_precision_at_10_diff1": 0.227848,
119
+ "nauc_precision_at_20_max": 0.32311,
120
+ "nauc_precision_at_20_std": 0.151326,
121
+ "nauc_precision_at_20_diff1": 0.169852,
122
+ "nauc_precision_at_100_max": 0.354862,
123
+ "nauc_precision_at_100_std": 0.276512,
124
+ "nauc_precision_at_100_diff1": 0.07489,
125
+ "nauc_precision_at_1000_max": 0.330395,
126
+ "nauc_precision_at_1000_std": 0.30934,
127
+ "nauc_precision_at_1000_diff1": 0.016778,
128
+ "nauc_mrr_at_1_max": 0.300718,
129
+ "nauc_mrr_at_1_std": -0.048042,
130
+ "nauc_mrr_at_1_diff1": 0.490193,
131
+ "nauc_mrr_at_3_max": 0.289955,
132
+ "nauc_mrr_at_3_std": -0.052334,
133
+ "nauc_mrr_at_3_diff1": 0.45044,
134
+ "nauc_mrr_at_5_max": 0.299462,
135
+ "nauc_mrr_at_5_std": -0.045425,
136
+ "nauc_mrr_at_5_diff1": 0.447534,
137
+ "nauc_mrr_at_10_max": 0.298138,
138
+ "nauc_mrr_at_10_std": -0.043489,
139
+ "nauc_mrr_at_10_diff1": 0.446233,
140
+ "nauc_mrr_at_20_max": 0.299536,
141
+ "nauc_mrr_at_20_std": -0.04066,
142
+ "nauc_mrr_at_20_diff1": 0.446589,
143
+ "nauc_mrr_at_100_max": 0.300286,
144
+ "nauc_mrr_at_100_std": -0.038961,
145
+ "nauc_mrr_at_100_diff1": 0.446264,
146
+ "nauc_mrr_at_1000_max": 0.300254,
147
+ "nauc_mrr_at_1000_std": -0.039159,
148
+ "nauc_mrr_at_1000_diff1": 0.446424,
149
+ "hit_rate_at_1": 0.33166,
150
+ "hit_rate_at_3": 0.49718,
151
+ "hit_rate_at_5": 0.54984,
152
+ "hit_rate_at_10": 0.63386,
153
+ "hit_rate_at_20": 0.69342,
154
+ "hit_rate_at_100": 0.82571,
155
+ "hit_rate_at_1000": 0.9373,
156
+ "main_score": 0.45007,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 117.63552784919739,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackGisRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "5003b3064772da1887988e05400cf3806fe491f2",
3
+ "task_name": "CQADupstackGisRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.19435,
9
+ "ndcg_at_3": 0.24102,
10
+ "ndcg_at_5": 0.26205,
11
+ "ndcg_at_10": 0.28192,
12
+ "ndcg_at_20": 0.297,
13
+ "ndcg_at_100": 0.33343,
14
+ "ndcg_at_1000": 0.36233,
15
+ "map_at_1": 0.17856,
16
+ "map_at_3": 0.22225,
17
+ "map_at_5": 0.23428,
18
+ "map_at_10": 0.24294,
19
+ "map_at_20": 0.24727,
20
+ "map_at_100": 0.25209,
21
+ "map_at_1000": 0.25316,
22
+ "recall_at_1": 0.17856,
23
+ "recall_at_3": 0.27564,
24
+ "recall_at_5": 0.32628,
25
+ "recall_at_10": 0.38586,
26
+ "recall_at_20": 0.44217,
27
+ "recall_at_100": 0.63596,
28
+ "recall_at_1000": 0.85697,
29
+ "accuracy": 0.17856,
30
+ "precision_at_1": 0.19435,
31
+ "precision_at_3": 0.10169,
32
+ "precision_at_5": 0.07367,
33
+ "precision_at_10": 0.04418,
34
+ "precision_at_20": 0.02571,
35
+ "precision_at_100": 0.00754,
36
+ "precision_at_1000": 0.00104,
37
+ "mrr_at_1": 0.19435,
38
+ "mrr_at_3": 0.240301,
39
+ "mrr_at_5": 0.252618,
40
+ "mrr_at_10": 0.260556,
41
+ "mrr_at_20": 0.264596,
42
+ "mrr_at_100": 0.269163,
43
+ "mrr_at_1000": 0.270043,
44
+ "nauc_ndcg_at_1_max": 0.214314,
45
+ "nauc_ndcg_at_1_std": -0.080803,
46
+ "nauc_ndcg_at_1_diff1": 0.481837,
47
+ "nauc_ndcg_at_3_max": 0.229502,
48
+ "nauc_ndcg_at_3_std": -0.063393,
49
+ "nauc_ndcg_at_3_diff1": 0.41777,
50
+ "nauc_ndcg_at_5_max": 0.235571,
51
+ "nauc_ndcg_at_5_std": -0.046246,
52
+ "nauc_ndcg_at_5_diff1": 0.411079,
53
+ "nauc_ndcg_at_10_max": 0.240693,
54
+ "nauc_ndcg_at_10_std": -0.031234,
55
+ "nauc_ndcg_at_10_diff1": 0.392186,
56
+ "nauc_ndcg_at_20_max": 0.23942,
57
+ "nauc_ndcg_at_20_std": -0.019616,
58
+ "nauc_ndcg_at_20_diff1": 0.387374,
59
+ "nauc_ndcg_at_100_max": 0.244919,
60
+ "nauc_ndcg_at_100_std": -0.014102,
61
+ "nauc_ndcg_at_100_diff1": 0.384012,
62
+ "nauc_ndcg_at_1000_max": 0.24158,
63
+ "nauc_ndcg_at_1000_std": -0.014094,
64
+ "nauc_ndcg_at_1000_diff1": 0.383793,
65
+ "nauc_map_at_1_max": 0.219188,
66
+ "nauc_map_at_1_std": -0.086031,
67
+ "nauc_map_at_1_diff1": 0.511723,
68
+ "nauc_map_at_3_max": 0.226807,
69
+ "nauc_map_at_3_std": -0.070458,
70
+ "nauc_map_at_3_diff1": 0.443863,
71
+ "nauc_map_at_5_max": 0.231897,
72
+ "nauc_map_at_5_std": -0.05929,
73
+ "nauc_map_at_5_diff1": 0.43947,
74
+ "nauc_map_at_10_max": 0.234361,
75
+ "nauc_map_at_10_std": -0.052612,
76
+ "nauc_map_at_10_diff1": 0.431597,
77
+ "nauc_map_at_20_max": 0.234238,
78
+ "nauc_map_at_20_std": -0.049538,
79
+ "nauc_map_at_20_diff1": 0.429714,
80
+ "nauc_map_at_100_max": 0.235372,
81
+ "nauc_map_at_100_std": -0.048465,
82
+ "nauc_map_at_100_diff1": 0.429225,
83
+ "nauc_map_at_1000_max": 0.235292,
84
+ "nauc_map_at_1000_std": -0.048304,
85
+ "nauc_map_at_1000_diff1": 0.429096,
86
+ "nauc_recall_at_1_max": 0.219188,
87
+ "nauc_recall_at_1_std": -0.086031,
88
+ "nauc_recall_at_1_diff1": 0.511723,
89
+ "nauc_recall_at_3_max": 0.231377,
90
+ "nauc_recall_at_3_std": -0.050092,
91
+ "nauc_recall_at_3_diff1": 0.369393,
92
+ "nauc_recall_at_5_max": 0.240148,
93
+ "nauc_recall_at_5_std": -0.013678,
94
+ "nauc_recall_at_5_diff1": 0.345244,
95
+ "nauc_recall_at_10_max": 0.252531,
96
+ "nauc_recall_at_10_std": 0.022088,
97
+ "nauc_recall_at_10_diff1": 0.295485,
98
+ "nauc_recall_at_20_max": 0.246192,
99
+ "nauc_recall_at_20_std": 0.064745,
100
+ "nauc_recall_at_20_diff1": 0.279541,
101
+ "nauc_recall_at_100_max": 0.267756,
102
+ "nauc_recall_at_100_std": 0.104313,
103
+ "nauc_recall_at_100_diff1": 0.236633,
104
+ "nauc_recall_at_1000_max": 0.242463,
105
+ "nauc_recall_at_1000_std": 0.216985,
106
+ "nauc_recall_at_1000_diff1": 0.111449,
107
+ "nauc_precision_at_1_max": 0.214314,
108
+ "nauc_precision_at_1_std": -0.080803,
109
+ "nauc_precision_at_1_diff1": 0.481837,
110
+ "nauc_precision_at_3_max": 0.247693,
111
+ "nauc_precision_at_3_std": -0.038822,
112
+ "nauc_precision_at_3_diff1": 0.348361,
113
+ "nauc_precision_at_5_max": 0.266184,
114
+ "nauc_precision_at_5_std": 0.003505,
115
+ "nauc_precision_at_5_diff1": 0.322782,
116
+ "nauc_precision_at_10_max": 0.269063,
117
+ "nauc_precision_at_10_std": 0.039732,
118
+ "nauc_precision_at_10_diff1": 0.262571,
119
+ "nauc_precision_at_20_max": 0.252032,
120
+ "nauc_precision_at_20_std": 0.076884,
121
+ "nauc_precision_at_20_diff1": 0.231765,
122
+ "nauc_precision_at_100_max": 0.246091,
123
+ "nauc_precision_at_100_std": 0.105348,
124
+ "nauc_precision_at_100_diff1": 0.142117,
125
+ "nauc_precision_at_1000_max": 0.136316,
126
+ "nauc_precision_at_1000_std": 0.131657,
127
+ "nauc_precision_at_1000_diff1": -0.030724,
128
+ "nauc_mrr_at_1_max": 0.214314,
129
+ "nauc_mrr_at_1_std": -0.080803,
130
+ "nauc_mrr_at_1_diff1": 0.481837,
131
+ "nauc_mrr_at_3_max": 0.22246,
132
+ "nauc_mrr_at_3_std": -0.062401,
133
+ "nauc_mrr_at_3_diff1": 0.419952,
134
+ "nauc_mrr_at_5_max": 0.223774,
135
+ "nauc_mrr_at_5_std": -0.05427,
136
+ "nauc_mrr_at_5_diff1": 0.414294,
137
+ "nauc_mrr_at_10_max": 0.226437,
138
+ "nauc_mrr_at_10_std": -0.048616,
139
+ "nauc_mrr_at_10_diff1": 0.40566,
140
+ "nauc_mrr_at_20_max": 0.225949,
141
+ "nauc_mrr_at_20_std": -0.046243,
142
+ "nauc_mrr_at_20_diff1": 0.404128,
143
+ "nauc_mrr_at_100_max": 0.226785,
144
+ "nauc_mrr_at_100_std": -0.045525,
145
+ "nauc_mrr_at_100_diff1": 0.404635,
146
+ "nauc_mrr_at_1000_max": 0.226639,
147
+ "nauc_mrr_at_1000_std": -0.045693,
148
+ "nauc_mrr_at_1000_diff1": 0.40464,
149
+ "hit_rate_at_1": 0.19435,
150
+ "hit_rate_at_3": 0.29944,
151
+ "hit_rate_at_5": 0.3548,
152
+ "hit_rate_at_10": 0.41469,
153
+ "hit_rate_at_20": 0.47232,
154
+ "hit_rate_at_100": 0.66893,
155
+ "hit_rate_at_1000": 0.87571,
156
+ "main_score": 0.28192,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 100.23941278457642,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackMathematicaRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "90fceea13679c63fe563ded68f3b6f06e50061de",
3
+ "task_name": "CQADupstackMathematicaRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.14055,
9
+ "ndcg_at_3": 0.1754,
10
+ "ndcg_at_5": 0.19358,
11
+ "ndcg_at_10": 0.21649,
12
+ "ndcg_at_20": 0.2365,
13
+ "ndcg_at_100": 0.27532,
14
+ "ndcg_at_1000": 0.30858,
15
+ "map_at_1": 0.10754,
16
+ "map_at_3": 0.15072,
17
+ "map_at_5": 0.16265,
18
+ "map_at_10": 0.17224,
19
+ "map_at_20": 0.17819,
20
+ "map_at_100": 0.1839,
21
+ "map_at_1000": 0.18525,
22
+ "recall_at_1": 0.10754,
23
+ "recall_at_3": 0.20052,
24
+ "recall_at_5": 0.24722,
25
+ "recall_at_10": 0.3149,
26
+ "recall_at_20": 0.3874,
27
+ "recall_at_100": 0.57837,
28
+ "recall_at_1000": 0.81753,
29
+ "accuracy": 0.10754,
30
+ "precision_at_1": 0.14055,
31
+ "precision_at_3": 0.08665,
32
+ "precision_at_5": 0.06493,
33
+ "precision_at_10": 0.04229,
34
+ "precision_at_20": 0.02643,
35
+ "precision_at_100": 0.00828,
36
+ "precision_at_1000": 0.00125,
37
+ "mrr_at_1": 0.140547,
38
+ "mrr_at_3": 0.189469,
39
+ "mrr_at_5": 0.201161,
40
+ "mrr_at_10": 0.212456,
41
+ "mrr_at_20": 0.217626,
42
+ "mrr_at_100": 0.222675,
43
+ "mrr_at_1000": 0.223498,
44
+ "nauc_ndcg_at_1_max": 0.159996,
45
+ "nauc_ndcg_at_1_std": 0.023565,
46
+ "nauc_ndcg_at_1_diff1": 0.426586,
47
+ "nauc_ndcg_at_3_max": 0.146019,
48
+ "nauc_ndcg_at_3_std": 0.023706,
49
+ "nauc_ndcg_at_3_diff1": 0.328195,
50
+ "nauc_ndcg_at_5_max": 0.14388,
51
+ "nauc_ndcg_at_5_std": 0.017239,
52
+ "nauc_ndcg_at_5_diff1": 0.311204,
53
+ "nauc_ndcg_at_10_max": 0.139401,
54
+ "nauc_ndcg_at_10_std": 0.024374,
55
+ "nauc_ndcg_at_10_diff1": 0.288464,
56
+ "nauc_ndcg_at_20_max": 0.134018,
57
+ "nauc_ndcg_at_20_std": 0.031681,
58
+ "nauc_ndcg_at_20_diff1": 0.289273,
59
+ "nauc_ndcg_at_100_max": 0.149382,
60
+ "nauc_ndcg_at_100_std": 0.064106,
61
+ "nauc_ndcg_at_100_diff1": 0.286003,
62
+ "nauc_ndcg_at_1000_max": 0.157415,
63
+ "nauc_ndcg_at_1000_std": 0.063598,
64
+ "nauc_ndcg_at_1000_diff1": 0.282199,
65
+ "nauc_map_at_1_max": 0.181098,
66
+ "nauc_map_at_1_std": 0.059489,
67
+ "nauc_map_at_1_diff1": 0.436485,
68
+ "nauc_map_at_3_max": 0.149361,
69
+ "nauc_map_at_3_std": 0.035824,
70
+ "nauc_map_at_3_diff1": 0.348899,
71
+ "nauc_map_at_5_max": 0.146888,
72
+ "nauc_map_at_5_std": 0.028052,
73
+ "nauc_map_at_5_diff1": 0.337494,
74
+ "nauc_map_at_10_max": 0.145458,
75
+ "nauc_map_at_10_std": 0.030086,
76
+ "nauc_map_at_10_diff1": 0.326403,
77
+ "nauc_map_at_20_max": 0.144015,
78
+ "nauc_map_at_20_std": 0.033069,
79
+ "nauc_map_at_20_diff1": 0.326062,
80
+ "nauc_map_at_100_max": 0.146576,
81
+ "nauc_map_at_100_std": 0.038624,
82
+ "nauc_map_at_100_diff1": 0.325176,
83
+ "nauc_map_at_1000_max": 0.146983,
84
+ "nauc_map_at_1000_std": 0.03893,
85
+ "nauc_map_at_1000_diff1": 0.324735,
86
+ "nauc_recall_at_1_max": 0.181098,
87
+ "nauc_recall_at_1_std": 0.059489,
88
+ "nauc_recall_at_1_diff1": 0.436485,
89
+ "nauc_recall_at_3_max": 0.129728,
90
+ "nauc_recall_at_3_std": 0.023485,
91
+ "nauc_recall_at_3_diff1": 0.275661,
92
+ "nauc_recall_at_5_max": 0.128204,
93
+ "nauc_recall_at_5_std": 0.006688,
94
+ "nauc_recall_at_5_diff1": 0.241675,
95
+ "nauc_recall_at_10_max": 0.113973,
96
+ "nauc_recall_at_10_std": 0.021342,
97
+ "nauc_recall_at_10_diff1": 0.191007,
98
+ "nauc_recall_at_20_max": 0.094993,
99
+ "nauc_recall_at_20_std": 0.04139,
100
+ "nauc_recall_at_20_diff1": 0.195022,
101
+ "nauc_recall_at_100_max": 0.150974,
102
+ "nauc_recall_at_100_std": 0.175349,
103
+ "nauc_recall_at_100_diff1": 0.171555,
104
+ "nauc_recall_at_1000_max": 0.248646,
105
+ "nauc_recall_at_1000_std": 0.252241,
106
+ "nauc_recall_at_1000_diff1": 0.066258,
107
+ "nauc_precision_at_1_max": 0.159996,
108
+ "nauc_precision_at_1_std": 0.023565,
109
+ "nauc_precision_at_1_diff1": 0.426586,
110
+ "nauc_precision_at_3_max": 0.126416,
111
+ "nauc_precision_at_3_std": 0.011468,
112
+ "nauc_precision_at_3_diff1": 0.271509,
113
+ "nauc_precision_at_5_max": 0.108273,
114
+ "nauc_precision_at_5_std": -0.020556,
115
+ "nauc_precision_at_5_diff1": 0.231173,
116
+ "nauc_precision_at_10_max": 0.115649,
117
+ "nauc_precision_at_10_std": 0.011115,
118
+ "nauc_precision_at_10_diff1": 0.17711,
119
+ "nauc_precision_at_20_max": 0.099879,
120
+ "nauc_precision_at_20_std": 0.021326,
121
+ "nauc_precision_at_20_diff1": 0.170156,
122
+ "nauc_precision_at_100_max": 0.113928,
123
+ "nauc_precision_at_100_std": 0.099222,
124
+ "nauc_precision_at_100_diff1": 0.082136,
125
+ "nauc_precision_at_1000_max": 0.086783,
126
+ "nauc_precision_at_1000_std": 0.044841,
127
+ "nauc_precision_at_1000_diff1": -0.015553,
128
+ "nauc_mrr_at_1_max": 0.159996,
129
+ "nauc_mrr_at_1_std": 0.023565,
130
+ "nauc_mrr_at_1_diff1": 0.426586,
131
+ "nauc_mrr_at_3_max": 0.15158,
132
+ "nauc_mrr_at_3_std": 0.015076,
133
+ "nauc_mrr_at_3_diff1": 0.346451,
134
+ "nauc_mrr_at_5_max": 0.149913,
135
+ "nauc_mrr_at_5_std": 0.00817,
136
+ "nauc_mrr_at_5_diff1": 0.335999,
137
+ "nauc_mrr_at_10_max": 0.148225,
138
+ "nauc_mrr_at_10_std": 0.010656,
139
+ "nauc_mrr_at_10_diff1": 0.325405,
140
+ "nauc_mrr_at_20_max": 0.146763,
141
+ "nauc_mrr_at_20_std": 0.013221,
142
+ "nauc_mrr_at_20_diff1": 0.3254,
143
+ "nauc_mrr_at_100_max": 0.148254,
144
+ "nauc_mrr_at_100_std": 0.016903,
145
+ "nauc_mrr_at_100_diff1": 0.325536,
146
+ "nauc_mrr_at_1000_max": 0.148116,
147
+ "nauc_mrr_at_1000_std": 0.016666,
148
+ "nauc_mrr_at_1000_diff1": 0.325482,
149
+ "hit_rate_at_1": 0.14055,
150
+ "hit_rate_at_3": 0.25,
151
+ "hit_rate_at_5": 0.301,
152
+ "hit_rate_at_10": 0.38557,
153
+ "hit_rate_at_20": 0.45896,
154
+ "hit_rate_at_100": 0.65796,
155
+ "hit_rate_at_1000": 0.8607,
156
+ "main_score": 0.21649,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 47.62054395675659,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackPhysicsRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "79531abbd1fb92d06c6d6315a0cbbbf5bb247ea4",
3
+ "task_name": "CQADupstackPhysicsRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.2435,
9
+ "ndcg_at_3": 0.29094,
10
+ "ndcg_at_5": 0.31219,
11
+ "ndcg_at_10": 0.33827,
12
+ "ndcg_at_20": 0.3603,
13
+ "ndcg_at_100": 0.39754,
14
+ "ndcg_at_1000": 0.42392,
15
+ "map_at_1": 0.2027,
16
+ "map_at_3": 0.25895,
17
+ "map_at_5": 0.27241,
18
+ "map_at_10": 0.28474,
19
+ "map_at_20": 0.29181,
20
+ "map_at_100": 0.29795,
21
+ "map_at_1000": 0.29918,
22
+ "recall_at_1": 0.2027,
23
+ "recall_at_3": 0.3239,
24
+ "recall_at_5": 0.37587,
25
+ "recall_at_10": 0.45487,
26
+ "recall_at_20": 0.5312,
27
+ "recall_at_100": 0.7068,
28
+ "recall_at_1000": 0.8865,
29
+ "accuracy": 0.2027,
30
+ "precision_at_1": 0.2435,
31
+ "precision_at_3": 0.13795,
32
+ "precision_at_5": 0.10067,
33
+ "precision_at_10": 0.06304,
34
+ "precision_at_20": 0.03864,
35
+ "precision_at_100": 0.01117,
36
+ "precision_at_1000": 0.00153,
37
+ "mrr_at_1": 0.243503,
38
+ "mrr_at_3": 0.301732,
39
+ "mrr_at_5": 0.31718,
40
+ "mrr_at_10": 0.328751,
41
+ "mrr_at_20": 0.334086,
42
+ "mrr_at_100": 0.338196,
43
+ "mrr_at_1000": 0.338854,
44
+ "nauc_ndcg_at_1_max": 0.181596,
45
+ "nauc_ndcg_at_1_std": -0.048648,
46
+ "nauc_ndcg_at_1_diff1": 0.491961,
47
+ "nauc_ndcg_at_3_max": 0.184608,
48
+ "nauc_ndcg_at_3_std": -0.047382,
49
+ "nauc_ndcg_at_3_diff1": 0.4463,
50
+ "nauc_ndcg_at_5_max": 0.191397,
51
+ "nauc_ndcg_at_5_std": -0.035138,
52
+ "nauc_ndcg_at_5_diff1": 0.435931,
53
+ "nauc_ndcg_at_10_max": 0.19675,
54
+ "nauc_ndcg_at_10_std": -0.020216,
55
+ "nauc_ndcg_at_10_diff1": 0.430616,
56
+ "nauc_ndcg_at_20_max": 0.194664,
57
+ "nauc_ndcg_at_20_std": -0.006382,
58
+ "nauc_ndcg_at_20_diff1": 0.433119,
59
+ "nauc_ndcg_at_100_max": 0.207914,
60
+ "nauc_ndcg_at_100_std": 0.010853,
61
+ "nauc_ndcg_at_100_diff1": 0.429275,
62
+ "nauc_ndcg_at_1000_max": 0.210852,
63
+ "nauc_ndcg_at_1000_std": 0.013282,
64
+ "nauc_ndcg_at_1000_diff1": 0.428582,
65
+ "nauc_map_at_1_max": 0.156015,
66
+ "nauc_map_at_1_std": -0.081615,
67
+ "nauc_map_at_1_diff1": 0.518613,
68
+ "nauc_map_at_3_max": 0.177424,
69
+ "nauc_map_at_3_std": -0.058999,
70
+ "nauc_map_at_3_diff1": 0.473341,
71
+ "nauc_map_at_5_max": 0.182866,
72
+ "nauc_map_at_5_std": -0.05001,
73
+ "nauc_map_at_5_diff1": 0.463195,
74
+ "nauc_map_at_10_max": 0.186324,
75
+ "nauc_map_at_10_std": -0.041233,
76
+ "nauc_map_at_10_diff1": 0.458376,
77
+ "nauc_map_at_20_max": 0.187148,
78
+ "nauc_map_at_20_std": -0.035269,
79
+ "nauc_map_at_20_diff1": 0.459109,
80
+ "nauc_map_at_100_max": 0.190097,
81
+ "nauc_map_at_100_std": -0.031785,
82
+ "nauc_map_at_100_diff1": 0.458227,
83
+ "nauc_map_at_1000_max": 0.190482,
84
+ "nauc_map_at_1000_std": -0.031341,
85
+ "nauc_map_at_1000_diff1": 0.458154,
86
+ "nauc_recall_at_1_max": 0.156015,
87
+ "nauc_recall_at_1_std": -0.081615,
88
+ "nauc_recall_at_1_diff1": 0.518613,
89
+ "nauc_recall_at_3_max": 0.173379,
90
+ "nauc_recall_at_3_std": -0.054999,
91
+ "nauc_recall_at_3_diff1": 0.416448,
92
+ "nauc_recall_at_5_max": 0.184946,
93
+ "nauc_recall_at_5_std": -0.032813,
94
+ "nauc_recall_at_5_diff1": 0.388239,
95
+ "nauc_recall_at_10_max": 0.195083,
96
+ "nauc_recall_at_10_std": 0.015458,
97
+ "nauc_recall_at_10_diff1": 0.354658,
98
+ "nauc_recall_at_20_max": 0.178003,
99
+ "nauc_recall_at_20_std": 0.060382,
100
+ "nauc_recall_at_20_diff1": 0.355312,
101
+ "nauc_recall_at_100_max": 0.228025,
102
+ "nauc_recall_at_100_std": 0.162362,
103
+ "nauc_recall_at_100_diff1": 0.314496,
104
+ "nauc_recall_at_1000_max": 0.295882,
105
+ "nauc_recall_at_1000_std": 0.354066,
106
+ "nauc_recall_at_1000_diff1": 0.237593,
107
+ "nauc_precision_at_1_max": 0.181596,
108
+ "nauc_precision_at_1_std": -0.048648,
109
+ "nauc_precision_at_1_diff1": 0.491961,
110
+ "nauc_precision_at_3_max": 0.212694,
111
+ "nauc_precision_at_3_std": 0.00495,
112
+ "nauc_precision_at_3_diff1": 0.351785,
113
+ "nauc_precision_at_5_max": 0.244781,
114
+ "nauc_precision_at_5_std": 0.060882,
115
+ "nauc_precision_at_5_diff1": 0.287823,
116
+ "nauc_precision_at_10_max": 0.256429,
117
+ "nauc_precision_at_10_std": 0.122018,
118
+ "nauc_precision_at_10_diff1": 0.227145,
119
+ "nauc_precision_at_20_max": 0.227199,
120
+ "nauc_precision_at_20_std": 0.162562,
121
+ "nauc_precision_at_20_diff1": 0.170318,
122
+ "nauc_precision_at_100_max": 0.202895,
123
+ "nauc_precision_at_100_std": 0.204177,
124
+ "nauc_precision_at_100_diff1": 0.010253,
125
+ "nauc_precision_at_1000_max": 0.121503,
126
+ "nauc_precision_at_1000_std": 0.166102,
127
+ "nauc_precision_at_1000_diff1": -0.130242,
128
+ "nauc_mrr_at_1_max": 0.181596,
129
+ "nauc_mrr_at_1_std": -0.048648,
130
+ "nauc_mrr_at_1_diff1": 0.491961,
131
+ "nauc_mrr_at_3_max": 0.191301,
132
+ "nauc_mrr_at_3_std": -0.042027,
133
+ "nauc_mrr_at_3_diff1": 0.445373,
134
+ "nauc_mrr_at_5_max": 0.197178,
135
+ "nauc_mrr_at_5_std": -0.031312,
136
+ "nauc_mrr_at_5_diff1": 0.437029,
137
+ "nauc_mrr_at_10_max": 0.198721,
138
+ "nauc_mrr_at_10_std": -0.025837,
139
+ "nauc_mrr_at_10_diff1": 0.435513,
140
+ "nauc_mrr_at_20_max": 0.197212,
141
+ "nauc_mrr_at_20_std": -0.023355,
142
+ "nauc_mrr_at_20_diff1": 0.43601,
143
+ "nauc_mrr_at_100_max": 0.198343,
144
+ "nauc_mrr_at_100_std": -0.022112,
145
+ "nauc_mrr_at_100_diff1": 0.43601,
146
+ "nauc_mrr_at_1000_max": 0.198318,
147
+ "nauc_mrr_at_1000_std": -0.022138,
148
+ "nauc_mrr_at_1000_diff1": 0.436109,
149
+ "hit_rate_at_1": 0.2435,
150
+ "hit_rate_at_3": 0.37921,
151
+ "hit_rate_at_5": 0.44755,
152
+ "hit_rate_at_10": 0.53513,
153
+ "hit_rate_at_20": 0.61116,
154
+ "hit_rate_at_100": 0.77286,
155
+ "hit_rate_at_1000": 0.92204,
156
+ "main_score": 0.33827,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 100.5362138748169,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackProgrammersRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "6184bc1440d2dbc7612be22b50686b8826d22b32",
3
+ "task_name": "CQADupstackProgrammersRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.25571,
9
+ "ndcg_at_3": 0.28717,
10
+ "ndcg_at_5": 0.30829,
11
+ "ndcg_at_10": 0.33073,
12
+ "ndcg_at_20": 0.35531,
13
+ "ndcg_at_100": 0.39398,
14
+ "ndcg_at_1000": 0.42209,
15
+ "map_at_1": 0.20709,
16
+ "map_at_3": 0.2565,
17
+ "map_at_5": 0.27151,
18
+ "map_at_10": 0.28198,
19
+ "map_at_20": 0.28925,
20
+ "map_at_100": 0.29564,
21
+ "map_at_1000": 0.29684,
22
+ "recall_at_1": 0.20709,
23
+ "recall_at_3": 0.30817,
24
+ "recall_at_5": 0.36263,
25
+ "recall_at_10": 0.42857,
26
+ "recall_at_20": 0.51959,
27
+ "recall_at_100": 0.70798,
28
+ "recall_at_1000": 0.90351,
29
+ "accuracy": 0.20709,
30
+ "precision_at_1": 0.25571,
31
+ "precision_at_3": 0.13584,
32
+ "precision_at_5": 0.09954,
33
+ "precision_at_10": 0.0613,
34
+ "precision_at_20": 0.03761,
35
+ "precision_at_100": 0.0108,
36
+ "precision_at_1000": 0.00151,
37
+ "mrr_at_1": 0.255708,
38
+ "mrr_at_3": 0.306507,
39
+ "mrr_at_5": 0.320548,
40
+ "mrr_at_10": 0.330034,
41
+ "mrr_at_20": 0.33628,
42
+ "mrr_at_100": 0.340411,
43
+ "mrr_at_1000": 0.341135,
44
+ "nauc_ndcg_at_1_max": 0.353371,
45
+ "nauc_ndcg_at_1_std": 0.022743,
46
+ "nauc_ndcg_at_1_diff1": 0.416417,
47
+ "nauc_ndcg_at_3_max": 0.319183,
48
+ "nauc_ndcg_at_3_std": 0.008533,
49
+ "nauc_ndcg_at_3_diff1": 0.381189,
50
+ "nauc_ndcg_at_5_max": 0.326986,
51
+ "nauc_ndcg_at_5_std": 0.014322,
52
+ "nauc_ndcg_at_5_diff1": 0.368202,
53
+ "nauc_ndcg_at_10_max": 0.327268,
54
+ "nauc_ndcg_at_10_std": 0.011862,
55
+ "nauc_ndcg_at_10_diff1": 0.362156,
56
+ "nauc_ndcg_at_20_max": 0.329261,
57
+ "nauc_ndcg_at_20_std": 0.027592,
58
+ "nauc_ndcg_at_20_diff1": 0.353135,
59
+ "nauc_ndcg_at_100_max": 0.336528,
60
+ "nauc_ndcg_at_100_std": 0.055207,
61
+ "nauc_ndcg_at_100_diff1": 0.348969,
62
+ "nauc_ndcg_at_1000_max": 0.342207,
63
+ "nauc_ndcg_at_1000_std": 0.054119,
64
+ "nauc_ndcg_at_1000_diff1": 0.359885,
65
+ "nauc_map_at_1_max": 0.301285,
66
+ "nauc_map_at_1_std": -0.0149,
67
+ "nauc_map_at_1_diff1": 0.41649,
68
+ "nauc_map_at_3_max": 0.307608,
69
+ "nauc_map_at_3_std": -0.003504,
70
+ "nauc_map_at_3_diff1": 0.391008,
71
+ "nauc_map_at_5_max": 0.317756,
72
+ "nauc_map_at_5_std": 0.004136,
73
+ "nauc_map_at_5_diff1": 0.382235,
74
+ "nauc_map_at_10_max": 0.32044,
75
+ "nauc_map_at_10_std": 0.005674,
76
+ "nauc_map_at_10_diff1": 0.38056,
77
+ "nauc_map_at_20_max": 0.32157,
78
+ "nauc_map_at_20_std": 0.010327,
79
+ "nauc_map_at_20_diff1": 0.377702,
80
+ "nauc_map_at_100_max": 0.322949,
81
+ "nauc_map_at_100_std": 0.014455,
82
+ "nauc_map_at_100_diff1": 0.376234,
83
+ "nauc_map_at_1000_max": 0.323396,
84
+ "nauc_map_at_1000_std": 0.014803,
85
+ "nauc_map_at_1000_diff1": 0.376456,
86
+ "nauc_recall_at_1_max": 0.301285,
87
+ "nauc_recall_at_1_std": -0.0149,
88
+ "nauc_recall_at_1_diff1": 0.41649,
89
+ "nauc_recall_at_3_max": 0.287159,
90
+ "nauc_recall_at_3_std": -0.005501,
91
+ "nauc_recall_at_3_diff1": 0.352614,
92
+ "nauc_recall_at_5_max": 0.310621,
93
+ "nauc_recall_at_5_std": 0.013225,
94
+ "nauc_recall_at_5_diff1": 0.316179,
95
+ "nauc_recall_at_10_max": 0.301185,
96
+ "nauc_recall_at_10_std": 0.004468,
97
+ "nauc_recall_at_10_diff1": 0.29493,
98
+ "nauc_recall_at_20_max": 0.294671,
99
+ "nauc_recall_at_20_std": 0.056175,
100
+ "nauc_recall_at_20_diff1": 0.255216,
101
+ "nauc_recall_at_100_max": 0.320555,
102
+ "nauc_recall_at_100_std": 0.224865,
103
+ "nauc_recall_at_100_diff1": 0.208752,
104
+ "nauc_recall_at_1000_max": 0.456414,
105
+ "nauc_recall_at_1000_std": 0.44558,
106
+ "nauc_recall_at_1000_diff1": 0.303243,
107
+ "nauc_precision_at_1_max": 0.353371,
108
+ "nauc_precision_at_1_std": 0.022743,
109
+ "nauc_precision_at_1_diff1": 0.416417,
110
+ "nauc_precision_at_3_max": 0.349937,
111
+ "nauc_precision_at_3_std": 0.04923,
112
+ "nauc_precision_at_3_diff1": 0.343446,
113
+ "nauc_precision_at_5_max": 0.364381,
114
+ "nauc_precision_at_5_std": 0.076029,
115
+ "nauc_precision_at_5_diff1": 0.291103,
116
+ "nauc_precision_at_10_max": 0.3434,
117
+ "nauc_precision_at_10_std": 0.079722,
118
+ "nauc_precision_at_10_diff1": 0.234901,
119
+ "nauc_precision_at_20_max": 0.319063,
120
+ "nauc_precision_at_20_std": 0.123481,
121
+ "nauc_precision_at_20_diff1": 0.158236,
122
+ "nauc_precision_at_100_max": 0.225154,
123
+ "nauc_precision_at_100_std": 0.179363,
124
+ "nauc_precision_at_100_diff1": 0.028036,
125
+ "nauc_precision_at_1000_max": 0.069617,
126
+ "nauc_precision_at_1000_std": 0.103029,
127
+ "nauc_precision_at_1000_diff1": -0.067505,
128
+ "nauc_mrr_at_1_max": 0.353371,
129
+ "nauc_mrr_at_1_std": 0.022743,
130
+ "nauc_mrr_at_1_diff1": 0.416417,
131
+ "nauc_mrr_at_3_max": 0.34354,
132
+ "nauc_mrr_at_3_std": 0.026017,
133
+ "nauc_mrr_at_3_diff1": 0.393517,
134
+ "nauc_mrr_at_5_max": 0.350767,
135
+ "nauc_mrr_at_5_std": 0.030475,
136
+ "nauc_mrr_at_5_diff1": 0.384408,
137
+ "nauc_mrr_at_10_max": 0.348525,
138
+ "nauc_mrr_at_10_std": 0.027043,
139
+ "nauc_mrr_at_10_diff1": 0.38089,
140
+ "nauc_mrr_at_20_max": 0.348142,
141
+ "nauc_mrr_at_20_std": 0.030765,
142
+ "nauc_mrr_at_20_diff1": 0.378142,
143
+ "nauc_mrr_at_100_max": 0.347878,
144
+ "nauc_mrr_at_100_std": 0.033748,
145
+ "nauc_mrr_at_100_diff1": 0.377537,
146
+ "nauc_mrr_at_1000_max": 0.348009,
147
+ "nauc_mrr_at_1000_std": 0.033478,
148
+ "nauc_mrr_at_1000_diff1": 0.37806,
149
+ "hit_rate_at_1": 0.25571,
150
+ "hit_rate_at_3": 0.37329,
151
+ "hit_rate_at_5": 0.43493,
152
+ "hit_rate_at_10": 0.50799,
153
+ "hit_rate_at_20": 0.60046,
154
+ "hit_rate_at_100": 0.76598,
155
+ "hit_rate_at_1000": 0.94064,
156
+ "main_score": 0.33073,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 85.61160373687744,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackRetrieval.json ADDED
@@ -0,0 +1,20 @@
1
+ {
2
+ "dataset_revision": "1",
3
+ "task_name": "CQADupstackRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_10": 0.300415,
9
+ "main_score": 0.300415,
10
+ "hf_subset": "default",
11
+ "languages": [
12
+ "eng-Latn"
13
+ ]
14
+ }
15
+ ]
16
+ },
17
+ "evaluation_time": 1223.3310227394104,
18
+ "kg_co2_emissions": NaN,
19
+ "date": 1775502984.226138
20
+ }
results/CQADupstackStatsRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "65ac3a16b8e91f9cee4c9828cc7c335575432a2a",
3
+ "task_name": "CQADupstackStatsRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.19939,
9
+ "ndcg_at_3": 0.22601,
10
+ "ndcg_at_5": 0.24338,
11
+ "ndcg_at_10": 0.26304,
12
+ "ndcg_at_20": 0.27657,
13
+ "ndcg_at_100": 0.30302,
14
+ "ndcg_at_1000": 0.33156,
15
+ "map_at_1": 0.17693,
16
+ "map_at_3": 0.20983,
17
+ "map_at_5": 0.22042,
18
+ "map_at_10": 0.2293,
19
+ "map_at_20": 0.23328,
20
+ "map_at_100": 0.23675,
21
+ "map_at_1000": 0.23779,
22
+ "recall_at_1": 0.17693,
23
+ "recall_at_3": 0.24628,
24
+ "recall_at_5": 0.28837,
25
+ "recall_at_10": 0.34653,
26
+ "recall_at_20": 0.39704,
27
+ "recall_at_100": 0.53374,
28
+ "recall_at_1000": 0.74777,
29
+ "accuracy": 0.17693,
30
+ "precision_at_1": 0.19939,
31
+ "precision_at_3": 0.09407,
32
+ "precision_at_5": 0.0681,
33
+ "precision_at_10": 0.04187,
34
+ "precision_at_20": 0.02416,
35
+ "precision_at_100": 0.00669,
36
+ "precision_at_1000": 0.00099,
37
+ "mrr_at_1": 0.199387,
38
+ "mrr_at_3": 0.232106,
39
+ "mrr_at_5": 0.242689,
40
+ "mrr_at_10": 0.250669,
41
+ "mrr_at_20": 0.254426,
42
+ "mrr_at_100": 0.258133,
43
+ "mrr_at_1000": 0.258933,
44
+ "nauc_ndcg_at_1_max": 0.34239,
45
+ "nauc_ndcg_at_1_std": 0.053912,
46
+ "nauc_ndcg_at_1_diff1": 0.54593,
47
+ "nauc_ndcg_at_3_max": 0.287571,
48
+ "nauc_ndcg_at_3_std": 0.030092,
49
+ "nauc_ndcg_at_3_diff1": 0.513483,
50
+ "nauc_ndcg_at_5_max": 0.286978,
51
+ "nauc_ndcg_at_5_std": 0.070714,
52
+ "nauc_ndcg_at_5_diff1": 0.496357,
53
+ "nauc_ndcg_at_10_max": 0.286388,
54
+ "nauc_ndcg_at_10_std": 0.08719,
55
+ "nauc_ndcg_at_10_diff1": 0.479912,
56
+ "nauc_ndcg_at_20_max": 0.277857,
57
+ "nauc_ndcg_at_20_std": 0.085656,
58
+ "nauc_ndcg_at_20_diff1": 0.4654,
59
+ "nauc_ndcg_at_100_max": 0.289598,
60
+ "nauc_ndcg_at_100_std": 0.108399,
61
+ "nauc_ndcg_at_100_diff1": 0.455575,
62
+ "nauc_ndcg_at_1000_max": 0.297222,
63
+ "nauc_ndcg_at_1000_std": 0.115509,
64
+ "nauc_ndcg_at_1000_diff1": 0.457413,
65
+ "nauc_map_at_1_max": 0.33093,
66
+ "nauc_map_at_1_std": 0.009692,
67
+ "nauc_map_at_1_diff1": 0.563824,
68
+ "nauc_map_at_3_max": 0.29155,
69
+ "nauc_map_at_3_std": 0.013803,
70
+ "nauc_map_at_3_diff1": 0.526676,
71
+ "nauc_map_at_5_max": 0.292056,
72
+ "nauc_map_at_5_std": 0.041113,
73
+ "nauc_map_at_5_diff1": 0.515763,
74
+ "nauc_map_at_10_max": 0.292077,
75
+ "nauc_map_at_10_std": 0.049991,
76
+ "nauc_map_at_10_diff1": 0.508472,
77
+ "nauc_map_at_20_max": 0.289244,
78
+ "nauc_map_at_20_std": 0.050419,
79
+ "nauc_map_at_20_diff1": 0.503322,
80
+ "nauc_map_at_100_max": 0.291686,
81
+ "nauc_map_at_100_std": 0.054083,
82
+ "nauc_map_at_100_diff1": 0.501732,
83
+ "nauc_map_at_1000_max": 0.292147,
84
+ "nauc_map_at_1000_std": 0.054493,
85
+ "nauc_map_at_1000_diff1": 0.501673,
86
+ "nauc_recall_at_1_max": 0.33093,
87
+ "nauc_recall_at_1_std": 0.009692,
88
+ "nauc_recall_at_1_diff1": 0.563824,
89
+ "nauc_recall_at_3_max": 0.252838,
90
+ "nauc_recall_at_3_std": 0.014645,
91
+ "nauc_recall_at_3_diff1": 0.482003,
92
+ "nauc_recall_at_5_max": 0.251224,
93
+ "nauc_recall_at_5_std": 0.10284,
94
+ "nauc_recall_at_5_diff1": 0.446269,
95
+ "nauc_recall_at_10_max": 0.248578,
96
+ "nauc_recall_at_10_std": 0.145866,
97
+ "nauc_recall_at_10_diff1": 0.399785,
98
+ "nauc_recall_at_20_max": 0.223837,
99
+ "nauc_recall_at_20_std": 0.138939,
100
+ "nauc_recall_at_20_diff1": 0.35209,
101
+ "nauc_recall_at_100_max": 0.261487,
102
+ "nauc_recall_at_100_std": 0.244132,
103
+ "nauc_recall_at_100_diff1": 0.293158,
104
+ "nauc_recall_at_1000_max": 0.292956,
105
+ "nauc_recall_at_1000_std": 0.360413,
106
+ "nauc_recall_at_1000_diff1": 0.241864,
107
+ "nauc_precision_at_1_max": 0.34239,
108
+ "nauc_precision_at_1_std": 0.053912,
109
+ "nauc_precision_at_1_diff1": 0.54593,
110
+ "nauc_precision_at_3_max": 0.264329,
111
+ "nauc_precision_at_3_std": 0.073249,
112
+ "nauc_precision_at_3_diff1": 0.452344,
113
+ "nauc_precision_at_5_max": 0.28006,
114
+ "nauc_precision_at_5_std": 0.189603,
115
+ "nauc_precision_at_5_diff1": 0.409172,
116
+ "nauc_precision_at_10_max": 0.271046,
117
+ "nauc_precision_at_10_std": 0.23617,
118
+ "nauc_precision_at_10_diff1": 0.348251,
119
+ "nauc_precision_at_20_max": 0.233673,
120
+ "nauc_precision_at_20_std": 0.224729,
121
+ "nauc_precision_at_20_diff1": 0.292413,
122
+ "nauc_precision_at_100_max": 0.273339,
123
+ "nauc_precision_at_100_std": 0.320007,
124
+ "nauc_precision_at_100_diff1": 0.227479,
125
+ "nauc_precision_at_1000_max": 0.234037,
126
+ "nauc_precision_at_1000_std": 0.294253,
127
+ "nauc_precision_at_1000_diff1": 0.083338,
128
+ "nauc_mrr_at_1_max": 0.34239,
129
+ "nauc_mrr_at_1_std": 0.053912,
130
+ "nauc_mrr_at_1_diff1": 0.54593,
131
+ "nauc_mrr_at_3_max": 0.308662,
132
+ "nauc_mrr_at_3_std": 0.060212,
133
+ "nauc_mrr_at_3_diff1": 0.513656,
134
+ "nauc_mrr_at_5_max": 0.307116,
135
+ "nauc_mrr_at_5_std": 0.083493,
136
+ "nauc_mrr_at_5_diff1": 0.502983,
137
+ "nauc_mrr_at_10_max": 0.306032,
138
+ "nauc_mrr_at_10_std": 0.087963,
139
+ "nauc_mrr_at_10_diff1": 0.495289,
140
+ "nauc_mrr_at_20_max": 0.303188,
141
+ "nauc_mrr_at_20_std": 0.086541,
142
+ "nauc_mrr_at_20_diff1": 0.492665,
143
+ "nauc_mrr_at_100_max": 0.305259,
144
+ "nauc_mrr_at_100_std": 0.089207,
145
+ "nauc_mrr_at_100_diff1": 0.491325,
146
+ "nauc_mrr_at_1000_max": 0.305481,
147
+ "nauc_mrr_at_1000_std": 0.089425,
148
+ "nauc_mrr_at_1000_diff1": 0.491205,
149
+ "hit_rate_at_1": 0.19939,
150
+ "hit_rate_at_3": 0.27147,
151
+ "hit_rate_at_5": 0.31748,
152
+ "hit_rate_at_10": 0.3773,
153
+ "hit_rate_at_20": 0.43098,
154
+ "hit_rate_at_100": 0.58282,
155
+ "hit_rate_at_1000": 0.78374,
156
+ "main_score": 0.26304,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 110.71832537651062,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackTexRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "46989137a86843e03a6195de44b09deda022eec7",
3
+ "task_name": "CQADupstackTexRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.14797,
9
+ "ndcg_at_3": 0.17433,
10
+ "ndcg_at_5": 0.18774,
11
+ "ndcg_at_10": 0.20565,
12
+ "ndcg_at_20": 0.22284,
13
+ "ndcg_at_100": 0.25573,
14
+ "ndcg_at_1000": 0.28913,
15
+ "map_at_1": 0.12244,
16
+ "map_at_3": 0.15457,
17
+ "map_at_5": 0.1633,
18
+ "map_at_10": 0.17105,
19
+ "map_at_20": 0.17611,
20
+ "map_at_100": 0.18076,
21
+ "map_at_1000": 0.18204,
22
+ "recall_at_1": 0.12244,
23
+ "recall_at_3": 0.19341,
24
+ "recall_at_5": 0.22778,
25
+ "recall_at_10": 0.28047,
26
+ "recall_at_20": 0.34418,
27
+ "recall_at_100": 0.51073,
28
+ "recall_at_1000": 0.75379,
29
+ "accuracy": 0.12244,
30
+ "precision_at_1": 0.14797,
31
+ "precision_at_3": 0.08052,
32
+ "precision_at_5": 0.05836,
33
+ "precision_at_10": 0.03744,
34
+ "precision_at_20": 0.02345,
35
+ "precision_at_100": 0.00736,
36
+ "precision_at_1000": 0.00119,
37
+ "mrr_at_1": 0.14797,
38
+ "mrr_at_3": 0.185306,
39
+ "mrr_at_5": 0.193892,
40
+ "mrr_at_10": 0.20229,
41
+ "mrr_at_20": 0.207239,
42
+ "mrr_at_100": 0.211374,
43
+ "mrr_at_1000": 0.212275,
44
+ "nauc_ndcg_at_1_max": 0.155807,
45
+ "nauc_ndcg_at_1_std": -0.024993,
46
+ "nauc_ndcg_at_1_diff1": 0.393798,
47
+ "nauc_ndcg_at_3_max": 0.145912,
48
+ "nauc_ndcg_at_3_std": -0.030579,
49
+ "nauc_ndcg_at_3_diff1": 0.324371,
50
+ "nauc_ndcg_at_5_max": 0.140763,
51
+ "nauc_ndcg_at_5_std": -0.022035,
52
+ "nauc_ndcg_at_5_diff1": 0.313873,
53
+ "nauc_ndcg_at_10_max": 0.14338,
54
+ "nauc_ndcg_at_10_std": -0.016675,
55
+ "nauc_ndcg_at_10_diff1": 0.306491,
56
+ "nauc_ndcg_at_20_max": 0.146305,
57
+ "nauc_ndcg_at_20_std": -0.000771,
58
+ "nauc_ndcg_at_20_diff1": 0.293395,
59
+ "nauc_ndcg_at_100_max": 0.151494,
60
+ "nauc_ndcg_at_100_std": 0.021626,
61
+ "nauc_ndcg_at_100_diff1": 0.287014,
62
+ "nauc_ndcg_at_1000_max": 0.160739,
63
+ "nauc_ndcg_at_1000_std": 0.028774,
64
+ "nauc_ndcg_at_1000_diff1": 0.286397,
65
+ "nauc_map_at_1_max": 0.138954,
66
+ "nauc_map_at_1_std": -0.018631,
67
+ "nauc_map_at_1_diff1": 0.39995,
68
+ "nauc_map_at_3_max": 0.137714,
69
+ "nauc_map_at_3_std": -0.028376,
70
+ "nauc_map_at_3_diff1": 0.341232,
71
+ "nauc_map_at_5_max": 0.135899,
72
+ "nauc_map_at_5_std": -0.023753,
73
+ "nauc_map_at_5_diff1": 0.335552,
74
+ "nauc_map_at_10_max": 0.138319,
75
+ "nauc_map_at_10_std": -0.021774,
76
+ "nauc_map_at_10_diff1": 0.332066,
77
+ "nauc_map_at_20_max": 0.139679,
78
+ "nauc_map_at_20_std": -0.016602,
79
+ "nauc_map_at_20_diff1": 0.327615,
80
+ "nauc_map_at_100_max": 0.140515,
81
+ "nauc_map_at_100_std": -0.012958,
82
+ "nauc_map_at_100_diff1": 0.326161,
83
+ "nauc_map_at_1000_max": 0.141,
84
+ "nauc_map_at_1000_std": -0.012567,
85
+ "nauc_map_at_1000_diff1": 0.325984,
86
+ "nauc_recall_at_1_max": 0.138954,
87
+ "nauc_recall_at_1_std": -0.018631,
88
+ "nauc_recall_at_1_diff1": 0.39995,
89
+ "nauc_recall_at_3_max": 0.139341,
90
+ "nauc_recall_at_3_std": -0.030361,
91
+ "nauc_recall_at_3_diff1": 0.283086,
92
+ "nauc_recall_at_5_max": 0.129933,
93
+ "nauc_recall_at_5_std": -0.012577,
94
+ "nauc_recall_at_5_diff1": 0.259801,
95
+ "nauc_recall_at_10_max": 0.137034,
96
+ "nauc_recall_at_10_std": -0.001981,
97
+ "nauc_recall_at_10_diff1": 0.246985,
98
+ "nauc_recall_at_20_max": 0.14314,
99
+ "nauc_recall_at_20_std": 0.045758,
100
+ "nauc_recall_at_20_diff1": 0.206429,
101
+ "nauc_recall_at_100_max": 0.158292,
102
+ "nauc_recall_at_100_std": 0.134625,
103
+ "nauc_recall_at_100_diff1": 0.178534,
104
+ "nauc_recall_at_1000_max": 0.231397,
105
+ "nauc_recall_at_1000_std": 0.254449,
106
+ "nauc_recall_at_1000_diff1": 0.137398,
107
+ "nauc_precision_at_1_max": 0.155807,
108
+ "nauc_precision_at_1_std": -0.024993,
109
+ "nauc_precision_at_1_diff1": 0.393798,
110
+ "nauc_precision_at_3_max": 0.164192,
111
+ "nauc_precision_at_3_std": -0.040173,
112
+ "nauc_precision_at_3_diff1": 0.279641,
113
+ "nauc_precision_at_5_max": 0.154864,
114
+ "nauc_precision_at_5_std": -0.025184,
115
+ "nauc_precision_at_5_diff1": 0.251833,
116
+ "nauc_precision_at_10_max": 0.15883,
117
+ "nauc_precision_at_10_std": -0.00548,
118
+ "nauc_precision_at_10_diff1": 0.217341,
119
+ "nauc_precision_at_20_max": 0.159998,
120
+ "nauc_precision_at_20_std": 0.028167,
121
+ "nauc_precision_at_20_diff1": 0.165511,
122
+ "nauc_precision_at_100_max": 0.170826,
123
+ "nauc_precision_at_100_std": 0.098782,
124
+ "nauc_precision_at_100_diff1": 0.099011,
125
+ "nauc_precision_at_1000_max": 0.174437,
126
+ "nauc_precision_at_1000_std": 0.100415,
127
+ "nauc_precision_at_1000_diff1": 0.021104,
128
+ "nauc_mrr_at_1_max": 0.155807,
129
+ "nauc_mrr_at_1_std": -0.024993,
130
+ "nauc_mrr_at_1_diff1": 0.393798,
131
+ "nauc_mrr_at_3_max": 0.152887,
132
+ "nauc_mrr_at_3_std": -0.031589,
133
+ "nauc_mrr_at_3_diff1": 0.33324,
134
+ "nauc_mrr_at_5_max": 0.151109,
135
+ "nauc_mrr_at_5_std": -0.02687,
136
+ "nauc_mrr_at_5_diff1": 0.326086,
137
+ "nauc_mrr_at_10_max": 0.150897,
138
+ "nauc_mrr_at_10_std": -0.023405,
139
+ "nauc_mrr_at_10_diff1": 0.321458,
140
+ "nauc_mrr_at_20_max": 0.15147,
141
+ "nauc_mrr_at_20_std": -0.018521,
142
+ "nauc_mrr_at_20_diff1": 0.317529,
143
+ "nauc_mrr_at_100_max": 0.152248,
144
+ "nauc_mrr_at_100_std": -0.015871,
145
+ "nauc_mrr_at_100_diff1": 0.316934,
146
+ "nauc_mrr_at_1000_max": 0.152538,
147
+ "nauc_mrr_at_1000_std": -0.015814,
148
+ "nauc_mrr_at_1000_diff1": 0.317058,
149
+ "hit_rate_at_1": 0.14797,
150
+ "hit_rate_at_3": 0.23331,
151
+ "hit_rate_at_5": 0.27116,
152
+ "hit_rate_at_10": 0.33414,
153
+ "hit_rate_at_20": 0.40399,
154
+ "hit_rate_at_100": 0.57915,
155
+ "hit_rate_at_1000": 0.80007,
156
+ "main_score": 0.20565,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 194.65334963798523,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackUnixRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "6c6430d3a6d36f8d2a829195bc5dc94d7e063e53",
3
+ "task_name": "CQADupstackUnixRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.21642,
9
+ "ndcg_at_3": 0.25066,
10
+ "ndcg_at_5": 0.26658,
11
+ "ndcg_at_10": 0.28733,
12
+ "ndcg_at_20": 0.30645,
13
+ "ndcg_at_100": 0.3386,
14
+ "ndcg_at_1000": 0.3713,
15
+ "map_at_1": 0.18771,
16
+ "map_at_3": 0.22856,
17
+ "map_at_5": 0.23871,
18
+ "map_at_10": 0.24771,
19
+ "map_at_20": 0.25314,
20
+ "map_at_100": 0.25778,
21
+ "map_at_1000": 0.25905,
22
+ "recall_at_1": 0.18771,
23
+ "recall_at_3": 0.27613,
24
+ "recall_at_5": 0.31762,
25
+ "recall_at_10": 0.37816,
26
+ "recall_at_20": 0.44911,
27
+ "recall_at_100": 0.60946,
28
+ "recall_at_1000": 0.84511,
29
+ "accuracy": 0.18771,
30
+ "precision_at_1": 0.21642,
31
+ "precision_at_3": 0.11132,
32
+ "precision_at_5": 0.07761,
33
+ "precision_at_10": 0.04739,
34
+ "precision_at_20": 0.02868,
35
+ "precision_at_100": 0.00816,
36
+ "precision_at_1000": 0.00124,
37
+ "mrr_at_1": 0.216418,
38
+ "mrr_at_3": 0.260106,
39
+ "mrr_at_5": 0.27046,
40
+ "mrr_at_10": 0.279593,
41
+ "mrr_at_20": 0.284582,
42
+ "mrr_at_100": 0.288729,
43
+ "mrr_at_1000": 0.289584,
44
+ "nauc_ndcg_at_1_max": 0.272642,
45
+ "nauc_ndcg_at_1_std": -0.065343,
46
+ "nauc_ndcg_at_1_diff1": 0.432844,
47
+ "nauc_ndcg_at_3_max": 0.266708,
48
+ "nauc_ndcg_at_3_std": -0.033573,
49
+ "nauc_ndcg_at_3_diff1": 0.375249,
50
+ "nauc_ndcg_at_5_max": 0.262256,
51
+ "nauc_ndcg_at_5_std": -0.02073,
52
+ "nauc_ndcg_at_5_diff1": 0.358351,
53
+ "nauc_ndcg_at_10_max": 0.262991,
54
+ "nauc_ndcg_at_10_std": -0.009598,
55
+ "nauc_ndcg_at_10_diff1": 0.348228,
56
+ "nauc_ndcg_at_20_max": 0.268275,
57
+ "nauc_ndcg_at_20_std": 0.000742,
58
+ "nauc_ndcg_at_20_diff1": 0.338186,
59
+ "nauc_ndcg_at_100_max": 0.26905,
60
+ "nauc_ndcg_at_100_std": 0.012228,
61
+ "nauc_ndcg_at_100_diff1": 0.341533,
62
+ "nauc_ndcg_at_1000_max": 0.280964,
63
+ "nauc_ndcg_at_1000_std": 0.017772,
64
+ "nauc_ndcg_at_1000_diff1": 0.346544,
65
+ "nauc_map_at_1_max": 0.300503,
66
+ "nauc_map_at_1_std": -0.066139,
67
+ "nauc_map_at_1_diff1": 0.450556,
68
+ "nauc_map_at_3_max": 0.275569,
69
+ "nauc_map_at_3_std": -0.044332,
70
+ "nauc_map_at_3_diff1": 0.394534,
71
+ "nauc_map_at_5_max": 0.273614,
72
+ "nauc_map_at_5_std": -0.035012,
73
+ "nauc_map_at_5_diff1": 0.383999,
74
+ "nauc_map_at_10_max": 0.273702,
75
+ "nauc_map_at_10_std": -0.029923,
76
+ "nauc_map_at_10_diff1": 0.37825,
77
+ "nauc_map_at_20_max": 0.275303,
78
+ "nauc_map_at_20_std": -0.02717,
79
+ "nauc_map_at_20_diff1": 0.375299,
80
+ "nauc_map_at_100_max": 0.274922,
81
+ "nauc_map_at_100_std": -0.026014,
82
+ "nauc_map_at_100_diff1": 0.375934,
83
+ "nauc_map_at_1000_max": 0.275515,
84
+ "nauc_map_at_1000_std": -0.025387,
85
+ "nauc_map_at_1000_diff1": 0.37605,
86
+ "nauc_recall_at_1_max": 0.300503,
87
+ "nauc_recall_at_1_std": -0.066139,
88
+ "nauc_recall_at_1_diff1": 0.450556,
89
+ "nauc_recall_at_3_max": 0.250636,
90
+ "nauc_recall_at_3_std": -0.015563,
91
+ "nauc_recall_at_3_diff1": 0.337269,
92
+ "nauc_recall_at_5_max": 0.233709,
93
+ "nauc_recall_at_5_std": 0.01322,
94
+ "nauc_recall_at_5_diff1": 0.292095,
95
+ "nauc_recall_at_10_max": 0.234079,
96
+ "nauc_recall_at_10_std": 0.04199,
97
+ "nauc_recall_at_10_diff1": 0.270756,
98
+ "nauc_recall_at_20_max": 0.242039,
99
+ "nauc_recall_at_20_std": 0.076241,
100
+ "nauc_recall_at_20_diff1": 0.233256,
101
+ "nauc_recall_at_100_max": 0.246044,
102
+ "nauc_recall_at_100_std": 0.156422,
103
+ "nauc_recall_at_100_diff1": 0.232147,
104
+ "nauc_recall_at_1000_max": 0.385948,
105
+ "nauc_recall_at_1000_std": 0.361926,
106
+ "nauc_recall_at_1000_diff1": 0.238452,
107
+ "nauc_precision_at_1_max": 0.272642,
108
+ "nauc_precision_at_1_std": -0.065343,
109
+ "nauc_precision_at_1_diff1": 0.432844,
110
+ "nauc_precision_at_3_max": 0.229907,
111
+ "nauc_precision_at_3_std": -0.004613,
112
+ "nauc_precision_at_3_diff1": 0.30632,
113
+ "nauc_precision_at_5_max": 0.215841,
114
+ "nauc_precision_at_5_std": 0.024235,
115
+ "nauc_precision_at_5_diff1": 0.254839,
116
+ "nauc_precision_at_10_max": 0.214978,
117
+ "nauc_precision_at_10_std": 0.05216,
118
+ "nauc_precision_at_10_diff1": 0.216263,
119
+ "nauc_precision_at_20_max": 0.228239,
120
+ "nauc_precision_at_20_std": 0.085482,
121
+ "nauc_precision_at_20_diff1": 0.155654,
122
+ "nauc_precision_at_100_max": 0.167922,
123
+ "nauc_precision_at_100_std": 0.107264,
124
+ "nauc_precision_at_100_diff1": 0.11087,
125
+ "nauc_precision_at_1000_max": 0.093998,
126
+ "nauc_precision_at_1000_std": 0.07508,
127
+ "nauc_precision_at_1000_diff1": -0.041364,
128
+ "nauc_mrr_at_1_max": 0.272642,
129
+ "nauc_mrr_at_1_std": -0.065343,
130
+ "nauc_mrr_at_1_diff1": 0.432844,
131
+ "nauc_mrr_at_3_max": 0.264813,
132
+ "nauc_mrr_at_3_std": -0.037417,
133
+ "nauc_mrr_at_3_diff1": 0.379349,
134
+ "nauc_mrr_at_5_max": 0.260002,
135
+ "nauc_mrr_at_5_std": -0.030665,
136
+ "nauc_mrr_at_5_diff1": 0.368757,
137
+ "nauc_mrr_at_10_max": 0.25947,
138
+ "nauc_mrr_at_10_std": -0.026415,
139
+ "nauc_mrr_at_10_diff1": 0.364459,
140
+ "nauc_mrr_at_20_max": 0.261223,
141
+ "nauc_mrr_at_20_std": -0.024026,
142
+ "nauc_mrr_at_20_diff1": 0.362216,
143
+ "nauc_mrr_at_100_max": 0.261118,
144
+ "nauc_mrr_at_100_std": -0.023111,
145
+ "nauc_mrr_at_100_diff1": 0.362313,
146
+ "nauc_mrr_at_1000_max": 0.261584,
147
+ "nauc_mrr_at_1000_std": -0.022876,
148
+ "nauc_mrr_at_1000_diff1": 0.362583,
149
+ "hit_rate_at_1": 0.21642,
150
+ "hit_rate_at_3": 0.31716,
151
+ "hit_rate_at_5": 0.36287,
152
+ "hit_rate_at_10": 0.43097,
153
+ "hit_rate_at_20": 0.5056,
154
+ "hit_rate_at_100": 0.67257,
155
+ "hit_rate_at_1000": 0.88806,
156
+ "main_score": 0.28733,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 127.21370887756348,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackWebmastersRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "160c094312a0e1facb97e55eeddb698c0abe3571",
3
+ "task_name": "CQADupstackWebmastersRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.22134,
9
+ "ndcg_at_3": 0.26211,
10
+ "ndcg_at_5": 0.2788,
11
+ "ndcg_at_10": 0.30248,
12
+ "ndcg_at_20": 0.32386,
13
+ "ndcg_at_100": 0.36086,
14
+ "ndcg_at_1000": 0.3931,
15
+ "map_at_1": 0.18366,
16
+ "map_at_3": 0.22916,
17
+ "map_at_5": 0.24131,
18
+ "map_at_10": 0.25398,
19
+ "map_at_20": 0.26192,
20
+ "map_at_100": 0.26913,
21
+ "map_at_1000": 0.27134,
22
+ "recall_at_1": 0.18366,
23
+ "recall_at_3": 0.27537,
24
+ "recall_at_5": 0.3206,
25
+ "recall_at_10": 0.39205,
26
+ "recall_at_20": 0.47546,
27
+ "recall_at_100": 0.6574,
28
+ "recall_at_1000": 0.87259,
29
+ "accuracy": 0.18366,
30
+ "precision_at_1": 0.22134,
31
+ "precision_at_3": 0.12319,
32
+ "precision_at_5": 0.0913,
33
+ "precision_at_10": 0.06028,
34
+ "precision_at_20": 0.03943,
35
+ "precision_at_100": 0.01324,
36
+ "precision_at_1000": 0.00226,
37
+ "mrr_at_1": 0.221344,
38
+ "mrr_at_3": 0.270092,
39
+ "mrr_at_5": 0.281653,
40
+ "mrr_at_10": 0.292026,
41
+ "mrr_at_20": 0.298255,
42
+ "mrr_at_100": 0.30299,
43
+ "mrr_at_1000": 0.30372,
44
+ "nauc_ndcg_at_1_max": 0.178493,
45
+ "nauc_ndcg_at_1_std": 0.023765,
46
+ "nauc_ndcg_at_1_diff1": 0.418198,
47
+ "nauc_ndcg_at_3_max": 0.187413,
48
+ "nauc_ndcg_at_3_std": 0.051171,
49
+ "nauc_ndcg_at_3_diff1": 0.375683,
50
+ "nauc_ndcg_at_5_max": 0.186653,
51
+ "nauc_ndcg_at_5_std": 0.043239,
52
+ "nauc_ndcg_at_5_diff1": 0.373942,
53
+ "nauc_ndcg_at_10_max": 0.184562,
54
+ "nauc_ndcg_at_10_std": 0.071817,
55
+ "nauc_ndcg_at_10_diff1": 0.361516,
56
+ "nauc_ndcg_at_20_max": 0.191904,
57
+ "nauc_ndcg_at_20_std": 0.095907,
58
+ "nauc_ndcg_at_20_diff1": 0.360312,
59
+ "nauc_ndcg_at_100_max": 0.199862,
60
+ "nauc_ndcg_at_100_std": 0.112067,
61
+ "nauc_ndcg_at_100_diff1": 0.361154,
62
+ "nauc_ndcg_at_1000_max": 0.201186,
63
+ "nauc_ndcg_at_1000_std": 0.106915,
64
+ "nauc_ndcg_at_1000_diff1": 0.365619,
65
+ "nauc_map_at_1_max": 0.204154,
66
+ "nauc_map_at_1_std": -0.006417,
67
+ "nauc_map_at_1_diff1": 0.420492,
68
+ "nauc_map_at_3_max": 0.199085,
69
+ "nauc_map_at_3_std": 0.023741,
70
+ "nauc_map_at_3_diff1": 0.38389,
71
+ "nauc_map_at_5_max": 0.197771,
72
+ "nauc_map_at_5_std": 0.018024,
73
+ "nauc_map_at_5_diff1": 0.382059,
74
+ "nauc_map_at_10_max": 0.197015,
75
+ "nauc_map_at_10_std": 0.035706,
76
+ "nauc_map_at_10_diff1": 0.376587,
77
+ "nauc_map_at_20_max": 0.198904,
78
+ "nauc_map_at_20_std": 0.047267,
79
+ "nauc_map_at_20_diff1": 0.377131,
80
+ "nauc_map_at_100_max": 0.19885,
81
+ "nauc_map_at_100_std": 0.055077,
82
+ "nauc_map_at_100_diff1": 0.378871,
83
+ "nauc_map_at_1000_max": 0.197346,
84
+ "nauc_map_at_1000_std": 0.056368,
85
+ "nauc_map_at_1000_diff1": 0.379304,
86
+ "nauc_recall_at_1_max": 0.204154,
87
+ "nauc_recall_at_1_std": -0.006417,
88
+ "nauc_recall_at_1_diff1": 0.420492,
89
+ "nauc_recall_at_3_max": 0.182451,
90
+ "nauc_recall_at_3_std": 0.051838,
91
+ "nauc_recall_at_3_diff1": 0.338178,
92
+ "nauc_recall_at_5_max": 0.181388,
93
+ "nauc_recall_at_5_std": 0.042869,
94
+ "nauc_recall_at_5_diff1": 0.340121,
95
+ "nauc_recall_at_10_max": 0.172174,
96
+ "nauc_recall_at_10_std": 0.122086,
97
+ "nauc_recall_at_10_diff1": 0.301817,
98
+ "nauc_recall_at_20_max": 0.187429,
99
+ "nauc_recall_at_20_std": 0.201298,
100
+ "nauc_recall_at_20_diff1": 0.29608,
101
+ "nauc_recall_at_100_max": 0.207613,
102
+ "nauc_recall_at_100_std": 0.311317,
103
+ "nauc_recall_at_100_diff1": 0.284505,
104
+ "nauc_recall_at_1000_max": 0.273464,
105
+ "nauc_recall_at_1000_std": 0.410027,
106
+ "nauc_recall_at_1000_diff1": 0.276375,
107
+ "nauc_precision_at_1_max": 0.178493,
108
+ "nauc_precision_at_1_std": 0.023765,
109
+ "nauc_precision_at_1_diff1": 0.418198,
110
+ "nauc_precision_at_3_max": 0.156274,
111
+ "nauc_precision_at_3_std": 0.09332,
112
+ "nauc_precision_at_3_diff1": 0.328823,
113
+ "nauc_precision_at_5_max": 0.138814,
114
+ "nauc_precision_at_5_std": 0.078812,
115
+ "nauc_precision_at_5_diff1": 0.28963,
116
+ "nauc_precision_at_10_max": 0.112561,
117
+ "nauc_precision_at_10_std": 0.15428,
118
+ "nauc_precision_at_10_diff1": 0.243124,
119
+ "nauc_precision_at_20_max": 0.10464,
120
+ "nauc_precision_at_20_std": 0.270382,
121
+ "nauc_precision_at_20_diff1": 0.238329,
122
+ "nauc_precision_at_100_max": -0.030835,
123
+ "nauc_precision_at_100_std": 0.276277,
124
+ "nauc_precision_at_100_diff1": 0.142989,
125
+ "nauc_precision_at_1000_max": -0.144315,
126
+ "nauc_precision_at_1000_std": 0.124791,
127
+ "nauc_precision_at_1000_diff1": 0.025244,
128
+ "nauc_mrr_at_1_max": 0.178493,
129
+ "nauc_mrr_at_1_std": 0.023765,
130
+ "nauc_mrr_at_1_diff1": 0.418198,
131
+ "nauc_mrr_at_3_max": 0.178987,
132
+ "nauc_mrr_at_3_std": 0.050853,
133
+ "nauc_mrr_at_3_diff1": 0.380415,
134
+ "nauc_mrr_at_5_max": 0.183008,
135
+ "nauc_mrr_at_5_std": 0.048254,
136
+ "nauc_mrr_at_5_diff1": 0.380948,
137
+ "nauc_mrr_at_10_max": 0.179507,
138
+ "nauc_mrr_at_10_std": 0.061629,
139
+ "nauc_mrr_at_10_diff1": 0.374216,
140
+ "nauc_mrr_at_20_max": 0.182849,
141
+ "nauc_mrr_at_20_std": 0.067356,
142
+ "nauc_mrr_at_20_diff1": 0.375427,
143
+ "nauc_mrr_at_100_max": 0.182661,
144
+ "nauc_mrr_at_100_std": 0.067239,
145
+ "nauc_mrr_at_100_diff1": 0.376557,
146
+ "nauc_mrr_at_1000_max": 0.182542,
147
+ "nauc_mrr_at_1000_std": 0.067028,
148
+ "nauc_mrr_at_1000_diff1": 0.37654,
149
+ "hit_rate_at_1": 0.22134,
150
+ "hit_rate_at_3": 0.33004,
151
+ "hit_rate_at_5": 0.38142,
152
+ "hit_rate_at_10": 0.45455,
153
+ "hit_rate_at_20": 0.54545,
154
+ "hit_rate_at_100": 0.72134,
155
+ "hit_rate_at_1000": 0.89921,
156
+ "main_score": 0.30248,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 45.840434551239014,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackWordpressRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "4ffe81d471b1924886b33c7567bfb200e9eec5c4",
3
+ "task_name": "CQADupstackWordpressRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.15157,
9
+ "ndcg_at_3": 0.19469,
10
+ "ndcg_at_5": 0.21324,
11
+ "ndcg_at_10": 0.23091,
12
+ "ndcg_at_20": 0.25211,
13
+ "ndcg_at_100": 0.28814,
14
+ "ndcg_at_1000": 0.31786,
15
+ "map_at_1": 0.13956,
16
+ "map_at_3": 0.1773,
17
+ "map_at_5": 0.18814,
18
+ "map_at_10": 0.19596,
19
+ "map_at_20": 0.20184,
20
+ "map_at_100": 0.20689,
21
+ "map_at_1000": 0.20814,
22
+ "recall_at_1": 0.13956,
23
+ "recall_at_3": 0.22785,
24
+ "recall_at_5": 0.27197,
25
+ "recall_at_10": 0.32372,
26
+ "recall_at_20": 0.40388,
27
+ "recall_at_100": 0.58994,
28
+ "recall_at_1000": 0.80734,
29
+ "accuracy": 0.13956,
30
+ "precision_at_1": 0.15157,
31
+ "precision_at_3": 0.08503,
32
+ "precision_at_5": 0.06174,
33
+ "precision_at_10": 0.03697,
34
+ "precision_at_20": 0.02338,
35
+ "precision_at_100": 0.00725,
36
+ "precision_at_1000": 0.00108,
37
+ "mrr_at_1": 0.151571,
38
+ "mrr_at_3": 0.193161,
39
+ "mrr_at_5": 0.204251,
40
+ "mrr_at_10": 0.211534,
41
+ "mrr_at_20": 0.21777,
42
+ "mrr_at_100": 0.222356,
43
+ "mrr_at_1000": 0.223307,
44
+ "nauc_ndcg_at_1_max": 0.200608,
45
+ "nauc_ndcg_at_1_std": -0.101298,
46
+ "nauc_ndcg_at_1_diff1": 0.372787,
47
+ "nauc_ndcg_at_3_max": 0.21147,
48
+ "nauc_ndcg_at_3_std": -0.024249,
49
+ "nauc_ndcg_at_3_diff1": 0.300933,
50
+ "nauc_ndcg_at_5_max": 0.20131,
51
+ "nauc_ndcg_at_5_std": -0.015126,
52
+ "nauc_ndcg_at_5_diff1": 0.275209,
53
+ "nauc_ndcg_at_10_max": 0.193596,
54
+ "nauc_ndcg_at_10_std": -0.021784,
55
+ "nauc_ndcg_at_10_diff1": 0.259783,
56
+ "nauc_ndcg_at_20_max": 0.20253,
57
+ "nauc_ndcg_at_20_std": -0.007344,
58
+ "nauc_ndcg_at_20_diff1": 0.260696,
59
+ "nauc_ndcg_at_100_max": 0.193511,
60
+ "nauc_ndcg_at_100_std": 0.024032,
61
+ "nauc_ndcg_at_100_diff1": 0.242661,
62
+ "nauc_ndcg_at_1000_max": 0.197246,
63
+ "nauc_ndcg_at_1000_std": 0.035055,
64
+ "nauc_ndcg_at_1000_diff1": 0.243718,
65
+ "nauc_map_at_1_max": 0.226679,
66
+ "nauc_map_at_1_std": -0.103613,
67
+ "nauc_map_at_1_diff1": 0.39661,
68
+ "nauc_map_at_3_max": 0.217148,
69
+ "nauc_map_at_3_std": -0.042586,
70
+ "nauc_map_at_3_diff1": 0.321894,
71
+ "nauc_map_at_5_max": 0.211393,
72
+ "nauc_map_at_5_std": -0.036031,
73
+ "nauc_map_at_5_diff1": 0.304229,
74
+ "nauc_map_at_10_max": 0.207786,
75
+ "nauc_map_at_10_std": -0.03914,
76
+ "nauc_map_at_10_diff1": 0.295886,
77
+ "nauc_map_at_20_max": 0.210869,
78
+ "nauc_map_at_20_std": -0.034689,
79
+ "nauc_map_at_20_diff1": 0.296085,
80
+ "nauc_map_at_100_max": 0.208742,
81
+ "nauc_map_at_100_std": -0.030291,
82
+ "nauc_map_at_100_diff1": 0.293044,
83
+ "nauc_map_at_1000_max": 0.208814,
84
+ "nauc_map_at_1000_std": -0.029741,
85
+ "nauc_map_at_1000_diff1": 0.292955,
86
+ "nauc_recall_at_1_max": 0.226679,
87
+ "nauc_recall_at_1_std": -0.103613,
88
+ "nauc_recall_at_1_diff1": 0.39661,
89
+ "nauc_recall_at_3_max": 0.193489,
90
+ "nauc_recall_at_3_std": 0.006385,
91
+ "nauc_recall_at_3_diff1": 0.26264,
92
+ "nauc_recall_at_5_max": 0.178748,
93
+ "nauc_recall_at_5_std": 0.028513,
94
+ "nauc_recall_at_5_diff1": 0.216429,
95
+ "nauc_recall_at_10_max": 0.153293,
96
+ "nauc_recall_at_10_std": 0.007644,
97
+ "nauc_recall_at_10_diff1": 0.176287,
98
+ "nauc_recall_at_20_max": 0.178418,
99
+ "nauc_recall_at_20_std": 0.047086,
100
+ "nauc_recall_at_20_diff1": 0.183082,
101
+ "nauc_recall_at_100_max": 0.14828,
102
+ "nauc_recall_at_100_std": 0.196014,
103
+ "nauc_recall_at_100_diff1": 0.097947,
104
+ "nauc_recall_at_1000_max": 0.159086,
105
+ "nauc_recall_at_1000_std": 0.418362,
106
+ "nauc_recall_at_1000_diff1": 0.025671,
107
+ "nauc_precision_at_1_max": 0.200608,
108
+ "nauc_precision_at_1_std": -0.101298,
109
+ "nauc_precision_at_1_diff1": 0.372787,
110
+ "nauc_precision_at_3_max": 0.203567,
111
+ "nauc_precision_at_3_std": 0.023521,
112
+ "nauc_precision_at_3_diff1": 0.236986,
113
+ "nauc_precision_at_5_max": 0.179039,
114
+ "nauc_precision_at_5_std": 0.0456,
115
+ "nauc_precision_at_5_diff1": 0.177251,
116
+ "nauc_precision_at_10_max": 0.166347,
117
+ "nauc_precision_at_10_std": 0.029963,
118
+ "nauc_precision_at_10_diff1": 0.139187,
119
+ "nauc_precision_at_20_max": 0.19965,
120
+ "nauc_precision_at_20_std": 0.087433,
121
+ "nauc_precision_at_20_diff1": 0.127039,
122
+ "nauc_precision_at_100_max": 0.107708,
123
+ "nauc_precision_at_100_std": 0.227818,
124
+ "nauc_precision_at_100_diff1": -0.022958,
125
+ "nauc_precision_at_1000_max": 0.038302,
126
+ "nauc_precision_at_1000_std": 0.271083,
127
+ "nauc_precision_at_1000_diff1": -0.155681,
128
+ "nauc_mrr_at_1_max": 0.200608,
129
+ "nauc_mrr_at_1_std": -0.101298,
130
+ "nauc_mrr_at_1_diff1": 0.372787,
131
+ "nauc_mrr_at_3_max": 0.203131,
132
+ "nauc_mrr_at_3_std": -0.034366,
133
+ "nauc_mrr_at_3_diff1": 0.303909,
134
+ "nauc_mrr_at_5_max": 0.198854,
135
+ "nauc_mrr_at_5_std": -0.03008,
136
+ "nauc_mrr_at_5_diff1": 0.287446,
137
+ "nauc_mrr_at_10_max": 0.197469,
138
+ "nauc_mrr_at_10_std": -0.030856,
139
+ "nauc_mrr_at_10_diff1": 0.283163,
140
+ "nauc_mrr_at_20_max": 0.199878,
141
+ "nauc_mrr_at_20_std": -0.0259,
142
+ "nauc_mrr_at_20_diff1": 0.282693,
143
+ "nauc_mrr_at_100_max": 0.198253,
144
+ "nauc_mrr_at_100_std": -0.022159,
145
+ "nauc_mrr_at_100_diff1": 0.280719,
146
+ "nauc_mrr_at_1000_max": 0.198212,
147
+ "nauc_mrr_at_1000_std": -0.022136,
148
+ "nauc_mrr_at_1000_diff1": 0.280987,
149
+ "hit_rate_at_1": 0.15157,
150
+ "hit_rate_at_3": 0.24954,
151
+ "hit_rate_at_5": 0.2976,
152
+ "hit_rate_at_10": 0.34935,
153
+ "hit_rate_at_20": 0.43808,
154
+ "hit_rate_at_100": 0.62662,
155
+ "hit_rate_at_1000": 0.8281,
156
+ "main_score": 0.23091,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 127.77572250366211,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/ClimateFEVER.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "47f2ac6acb640fc46020b02a5b59fdda04d39380",
3
+ "task_name": "ClimateFEVER",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.27166,
9
+ "ndcg_at_3": 0.24007,
10
+ "ndcg_at_5": 0.25492,
11
+ "ndcg_at_10": 0.28605,
12
+ "ndcg_at_20": 0.30978,
13
+ "ndcg_at_100": 0.34753,
14
+ "ndcg_at_1000": 0.37957,
15
+ "map_at_1": 0.12365,
16
+ "map_at_3": 0.17699,
17
+ "map_at_5": 0.19141,
18
+ "map_at_10": 0.20665,
19
+ "map_at_20": 0.21508,
20
+ "map_at_100": 0.22226,
21
+ "map_at_1000": 0.22391,
22
+ "recall_at_1": 0.12365,
23
+ "recall_at_3": 0.2221,
24
+ "recall_at_5": 0.26655,
25
+ "recall_at_10": 0.33526,
26
+ "recall_at_20": 0.40185,
27
+ "recall_at_100": 0.54678,
28
+ "recall_at_1000": 0.72823,
29
+ "accuracy": 0.12365,
30
+ "precision_at_1": 0.27166,
31
+ "precision_at_3": 0.17785,
32
+ "precision_at_5": 0.13199,
33
+ "precision_at_10": 0.08684,
34
+ "precision_at_20": 0.05345,
35
+ "precision_at_100": 0.01522,
36
+ "precision_at_1000": 0.00212,
37
+ "mrr_at_1": 0.271661,
38
+ "mrr_at_3": 0.356026,
39
+ "mrr_at_5": 0.373453,
40
+ "mrr_at_10": 0.385454,
41
+ "mrr_at_20": 0.390603,
42
+ "mrr_at_100": 0.39385,
43
+ "mrr_at_1000": 0.394272,
44
+ "nauc_ndcg_at_1_max": 0.404484,
45
+ "nauc_ndcg_at_1_std": 0.251546,
46
+ "nauc_ndcg_at_1_diff1": 0.241109,
47
+ "nauc_ndcg_at_3_max": 0.40682,
48
+ "nauc_ndcg_at_3_std": 0.26695,
49
+ "nauc_ndcg_at_3_diff1": 0.218201,
50
+ "nauc_ndcg_at_5_max": 0.42415,
51
+ "nauc_ndcg_at_5_std": 0.288247,
52
+ "nauc_ndcg_at_5_diff1": 0.220472,
53
+ "nauc_ndcg_at_10_max": 0.41644,
54
+ "nauc_ndcg_at_10_std": 0.304935,
55
+ "nauc_ndcg_at_10_diff1": 0.212308,
56
+ "nauc_ndcg_at_20_max": 0.425169,
57
+ "nauc_ndcg_at_20_std": 0.31979,
58
+ "nauc_ndcg_at_20_diff1": 0.211163,
59
+ "nauc_ndcg_at_100_max": 0.428409,
60
+ "nauc_ndcg_at_100_std": 0.341799,
61
+ "nauc_ndcg_at_100_diff1": 0.203146,
62
+ "nauc_ndcg_at_1000_max": 0.427482,
63
+ "nauc_ndcg_at_1000_std": 0.343238,
64
+ "nauc_ndcg_at_1000_diff1": 0.201552,
65
+ "nauc_map_at_1_max": 0.428368,
66
+ "nauc_map_at_1_std": 0.22614,
67
+ "nauc_map_at_1_diff1": 0.292095,
68
+ "nauc_map_at_3_max": 0.418182,
69
+ "nauc_map_at_3_std": 0.250222,
70
+ "nauc_map_at_3_diff1": 0.243764,
71
+ "nauc_map_at_5_max": 0.425811,
72
+ "nauc_map_at_5_std": 0.26602,
73
+ "nauc_map_at_5_diff1": 0.240949,
74
+ "nauc_map_at_10_max": 0.421675,
75
+ "nauc_map_at_10_std": 0.27694,
76
+ "nauc_map_at_10_diff1": 0.235757,
77
+ "nauc_map_at_20_max": 0.424048,
78
+ "nauc_map_at_20_std": 0.285006,
79
+ "nauc_map_at_20_diff1": 0.234597,
80
+ "nauc_map_at_100_max": 0.426017,
81
+ "nauc_map_at_100_std": 0.292322,
82
+ "nauc_map_at_100_diff1": 0.232227,
83
+ "nauc_map_at_1000_max": 0.425918,
84
+ "nauc_map_at_1000_std": 0.292526,
85
+ "nauc_map_at_1000_diff1": 0.231959,
86
+ "nauc_recall_at_1_max": 0.428368,
87
+ "nauc_recall_at_1_std": 0.22614,
88
+ "nauc_recall_at_1_diff1": 0.292095,
89
+ "nauc_recall_at_3_max": 0.390738,
90
+ "nauc_recall_at_3_std": 0.253726,
91
+ "nauc_recall_at_3_diff1": 0.205184,
92
+ "nauc_recall_at_5_max": 0.401014,
93
+ "nauc_recall_at_5_std": 0.280105,
94
+ "nauc_recall_at_5_diff1": 0.19672,
95
+ "nauc_recall_at_10_max": 0.363871,
96
+ "nauc_recall_at_10_std": 0.293503,
97
+ "nauc_recall_at_10_diff1": 0.171939,
98
+ "nauc_recall_at_20_max": 0.368313,
99
+ "nauc_recall_at_20_std": 0.312865,
100
+ "nauc_recall_at_20_diff1": 0.163465,
101
+ "nauc_recall_at_100_max": 0.339612,
102
+ "nauc_recall_at_100_std": 0.352721,
103
+ "nauc_recall_at_100_diff1": 0.121591,
104
+ "nauc_recall_at_1000_max": 0.324113,
105
+ "nauc_recall_at_1000_std": 0.372466,
106
+ "nauc_recall_at_1000_diff1": 0.101186,
107
+ "nauc_precision_at_1_max": 0.404484,
108
+ "nauc_precision_at_1_std": 0.251546,
109
+ "nauc_precision_at_1_diff1": 0.241109,
110
+ "nauc_precision_at_3_max": 0.359823,
111
+ "nauc_precision_at_3_std": 0.298083,
112
+ "nauc_precision_at_3_diff1": 0.142555,
113
+ "nauc_precision_at_5_max": 0.358761,
114
+ "nauc_precision_at_5_std": 0.32681,
115
+ "nauc_precision_at_5_diff1": 0.123607,
116
+ "nauc_precision_at_10_max": 0.290333,
117
+ "nauc_precision_at_10_std": 0.330787,
118
+ "nauc_precision_at_10_diff1": 0.083615,
119
+ "nauc_precision_at_20_max": 0.279626,
120
+ "nauc_precision_at_20_std": 0.341307,
121
+ "nauc_precision_at_20_diff1": 0.063896,
122
+ "nauc_precision_at_100_max": 0.211091,
123
+ "nauc_precision_at_100_std": 0.340348,
124
+ "nauc_precision_at_100_diff1": -0.001064,
125
+ "nauc_precision_at_1000_max": 0.08748,
126
+ "nauc_precision_at_1000_std": 0.250078,
127
+ "nauc_precision_at_1000_diff1": -0.068066,
128
+ "nauc_mrr_at_1_max": 0.404484,
129
+ "nauc_mrr_at_1_std": 0.251546,
130
+ "nauc_mrr_at_1_diff1": 0.241109,
131
+ "nauc_mrr_at_3_max": 0.406483,
132
+ "nauc_mrr_at_3_std": 0.285934,
133
+ "nauc_mrr_at_3_diff1": 0.205992,
134
+ "nauc_mrr_at_5_max": 0.414143,
135
+ "nauc_mrr_at_5_std": 0.297012,
136
+ "nauc_mrr_at_5_diff1": 0.205702,
137
+ "nauc_mrr_at_10_max": 0.411174,
138
+ "nauc_mrr_at_10_std": 0.300685,
139
+ "nauc_mrr_at_10_diff1": 0.20337,
140
+ "nauc_mrr_at_20_max": 0.4138,
141
+ "nauc_mrr_at_20_std": 0.299742,
142
+ "nauc_mrr_at_20_diff1": 0.203857,
143
+ "nauc_mrr_at_100_max": 0.41325,
144
+ "nauc_mrr_at_100_std": 0.299777,
145
+ "nauc_mrr_at_100_diff1": 0.204159,
146
+ "nauc_mrr_at_1000_max": 0.41323,
147
+ "nauc_mrr_at_1000_std": 0.299558,
148
+ "nauc_mrr_at_1000_diff1": 0.204386,
149
+ "hit_rate_at_1": 0.27166,
150
+ "hit_rate_at_3": 0.46384,
151
+ "hit_rate_at_5": 0.53941,
152
+ "hit_rate_at_10": 0.62866,
153
+ "hit_rate_at_20": 0.69902,
154
+ "hit_rate_at_100": 0.82736,
155
+ "hit_rate_at_1000": 0.92313,
156
+ "main_score": 0.28605,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 42900.58644294739,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/DBPedia.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "c0f706b76e590d620bd6618b3ca8efdd34e2d659",
3
+ "task_name": "DBPedia",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.5025,
9
+ "ndcg_at_3": 0.41046,
10
+ "ndcg_at_5": 0.38473,
11
+ "ndcg_at_10": 0.35937,
12
+ "ndcg_at_20": 0.34902,
13
+ "ndcg_at_100": 0.3843,
14
+ "ndcg_at_1000": 0.45195,
15
+ "map_at_1": 0.08139,
16
+ "map_at_3": 0.12263,
17
+ "map_at_5": 0.14164,
18
+ "map_at_10": 0.16405,
19
+ "map_at_20": 0.18559,
20
+ "map_at_100": 0.21788,
21
+ "map_at_1000": 0.23162,
22
+ "recall_at_1": 0.08139,
23
+ "recall_at_3": 0.13669,
24
+ "recall_at_5": 0.16722,
25
+ "recall_at_10": 0.21133,
26
+ "recall_at_20": 0.2671,
27
+ "recall_at_100": 0.41492,
28
+ "recall_at_1000": 0.63956,
29
+ "accuracy": 0.08139,
30
+ "precision_at_1": 0.63,
31
+ "precision_at_3": 0.44,
32
+ "precision_at_5": 0.373,
33
+ "precision_at_10": 0.28175,
34
+ "precision_at_20": 0.20625,
35
+ "precision_at_100": 0.08528,
36
+ "precision_at_1000": 0.01855,
37
+ "mrr_at_1": 0.63,
38
+ "mrr_at_3": 0.689167,
39
+ "mrr_at_5": 0.697292,
40
+ "mrr_at_10": 0.704438,
41
+ "mrr_at_20": 0.706955,
42
+ "mrr_at_100": 0.708539,
43
+ "mrr_at_1000": 0.708709,
44
+ "nauc_ndcg_at_1_max": 0.441705,
45
+ "nauc_ndcg_at_1_std": 0.227258,
46
+ "nauc_ndcg_at_1_diff1": 0.461762,
47
+ "nauc_ndcg_at_3_max": 0.436047,
48
+ "nauc_ndcg_at_3_std": 0.289915,
49
+ "nauc_ndcg_at_3_diff1": 0.345201,
50
+ "nauc_ndcg_at_5_max": 0.433506,
51
+ "nauc_ndcg_at_5_std": 0.309515,
52
+ "nauc_ndcg_at_5_diff1": 0.323497,
53
+ "nauc_ndcg_at_10_max": 0.410831,
54
+ "nauc_ndcg_at_10_std": 0.321822,
55
+ "nauc_ndcg_at_10_diff1": 0.313376,
56
+ "nauc_ndcg_at_20_max": 0.374823,
57
+ "nauc_ndcg_at_20_std": 0.329585,
58
+ "nauc_ndcg_at_20_diff1": 0.310113,
59
+ "nauc_ndcg_at_100_max": 0.392714,
60
+ "nauc_ndcg_at_100_std": 0.390049,
61
+ "nauc_ndcg_at_100_diff1": 0.30267,
62
+ "nauc_ndcg_at_1000_max": 0.444436,
63
+ "nauc_ndcg_at_1000_std": 0.456919,
64
+ "nauc_ndcg_at_1000_diff1": 0.284393,
65
+ "nauc_map_at_1_max": 0.097788,
66
+ "nauc_map_at_1_std": -0.065405,
67
+ "nauc_map_at_1_diff1": 0.462808,
68
+ "nauc_map_at_3_max": 0.127401,
69
+ "nauc_map_at_3_std": -0.009296,
70
+ "nauc_map_at_3_diff1": 0.385789,
71
+ "nauc_map_at_5_max": 0.15133,
72
+ "nauc_map_at_5_std": 0.028016,
73
+ "nauc_map_at_5_diff1": 0.368005,
74
+ "nauc_map_at_10_max": 0.192608,
75
+ "nauc_map_at_10_std": 0.105175,
76
+ "nauc_map_at_10_diff1": 0.33768,
77
+ "nauc_map_at_20_max": 0.235011,
78
+ "nauc_map_at_20_std": 0.194548,
79
+ "nauc_map_at_20_diff1": 0.310178,
80
+ "nauc_map_at_100_max": 0.293586,
81
+ "nauc_map_at_100_std": 0.319851,
82
+ "nauc_map_at_100_diff1": 0.27551,
83
+ "nauc_map_at_1000_max": 0.309119,
84
+ "nauc_map_at_1000_std": 0.347481,
85
+ "nauc_map_at_1000_diff1": 0.266596,
86
+ "nauc_recall_at_1_max": 0.097788,
87
+ "nauc_recall_at_1_std": -0.065405,
88
+ "nauc_recall_at_1_diff1": 0.462808,
89
+ "nauc_recall_at_3_max": 0.07969,
90
+ "nauc_recall_at_3_std": -0.006795,
91
+ "nauc_recall_at_3_diff1": 0.311553,
92
+ "nauc_recall_at_5_max": 0.100784,
93
+ "nauc_recall_at_5_std": 0.028521,
94
+ "nauc_recall_at_5_diff1": 0.29621,
95
+ "nauc_recall_at_10_max": 0.138185,
96
+ "nauc_recall_at_10_std": 0.111493,
97
+ "nauc_recall_at_10_diff1": 0.259055,
98
+ "nauc_recall_at_20_max": 0.179596,
99
+ "nauc_recall_at_20_std": 0.221873,
100
+ "nauc_recall_at_20_diff1": 0.223454,
101
+ "nauc_recall_at_100_max": 0.28468,
102
+ "nauc_recall_at_100_std": 0.422788,
103
+ "nauc_recall_at_100_diff1": 0.181994,
104
+ "nauc_recall_at_1000_max": 0.332402,
105
+ "nauc_recall_at_1000_std": 0.509843,
106
+ "nauc_recall_at_1000_diff1": 0.136201,
107
+ "nauc_precision_at_1_max": 0.534496,
108
+ "nauc_precision_at_1_std": 0.29313,
109
+ "nauc_precision_at_1_diff1": 0.466393,
110
+ "nauc_precision_at_3_max": 0.456164,
111
+ "nauc_precision_at_3_std": 0.369089,
112
+ "nauc_precision_at_3_diff1": 0.137968,
113
+ "nauc_precision_at_5_max": 0.453404,
114
+ "nauc_precision_at_5_std": 0.414697,
115
+ "nauc_precision_at_5_diff1": 0.075475,
116
+ "nauc_precision_at_10_max": 0.415729,
117
+ "nauc_precision_at_10_std": 0.464763,
118
+ "nauc_precision_at_10_diff1": -0.001324,
119
+ "nauc_precision_at_20_max": 0.389501,
120
+ "nauc_precision_at_20_std": 0.526699,
121
+ "nauc_precision_at_20_diff1": -0.054929,
122
+ "nauc_precision_at_100_max": 0.312581,
123
+ "nauc_precision_at_100_std": 0.481248,
124
+ "nauc_precision_at_100_diff1": -0.093958,
125
+ "nauc_precision_at_1000_max": 0.088633,
126
+ "nauc_precision_at_1000_std": 0.14462,
127
+ "nauc_precision_at_1000_diff1": -0.133815,
128
+ "nauc_mrr_at_1_max": 0.534496,
129
+ "nauc_mrr_at_1_std": 0.29313,
130
+ "nauc_mrr_at_1_diff1": 0.466393,
131
+ "nauc_mrr_at_3_max": 0.56829,
132
+ "nauc_mrr_at_3_std": 0.354724,
133
+ "nauc_mrr_at_3_diff1": 0.4422,
134
+ "nauc_mrr_at_5_max": 0.56845,
135
+ "nauc_mrr_at_5_std": 0.358733,
136
+ "nauc_mrr_at_5_diff1": 0.442479,
137
+ "nauc_mrr_at_10_max": 0.567141,
138
+ "nauc_mrr_at_10_std": 0.35573,
139
+ "nauc_mrr_at_10_diff1": 0.438848,
140
+ "nauc_mrr_at_20_max": 0.566839,
141
+ "nauc_mrr_at_20_std": 0.356222,
142
+ "nauc_mrr_at_20_diff1": 0.437184,
143
+ "nauc_mrr_at_100_max": 0.566705,
144
+ "nauc_mrr_at_100_std": 0.355308,
145
+ "nauc_mrr_at_100_diff1": 0.438661,
146
+ "nauc_mrr_at_1000_max": 0.566607,
147
+ "nauc_mrr_at_1000_std": 0.355096,
148
+ "nauc_mrr_at_1000_diff1": 0.438683,
149
+ "hit_rate_at_1": 0.63,
150
+ "hit_rate_at_3": 0.76,
151
+ "hit_rate_at_5": 0.795,
152
+ "hit_rate_at_10": 0.8475,
153
+ "hit_rate_at_20": 0.8825,
154
+ "hit_rate_at_100": 0.94,
155
+ "hit_rate_at_1000": 0.975,
156
+ "main_score": 0.35937,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 36278.173233270645,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/EmotionClassification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "4f58c6b202a23cf9a4da393831edf4f9183cad37",
3
+ "task_name": "EmotionClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.506,
11
+ "f1": 0.449103,
12
+ "f1_weighted": 0.525392,
13
+ "precision": 0.44738,
14
+ "precision_weighted": 0.580676,
15
+ "recall": 0.502032,
16
+ "recall_weighted": 0.506,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.4055,
22
+ "f1": 0.359579,
23
+ "f1_weighted": 0.409011,
24
+ "precision": 0.370037,
25
+ "precision_weighted": 0.486783,
26
+ "recall": 0.423523,
27
+ "recall_weighted": 0.4055,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.4295,
33
+ "f1": 0.380588,
34
+ "f1_weighted": 0.455239,
35
+ "precision": 0.388068,
36
+ "precision_weighted": 0.513807,
37
+ "recall": 0.428664,
38
+ "recall_weighted": 0.4295,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.461,
44
+ "f1": 0.406482,
45
+ "f1_weighted": 0.485626,
46
+ "precision": 0.419851,
47
+ "precision_weighted": 0.552188,
48
+ "recall": 0.45917,
49
+ "recall_weighted": 0.461,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.4675,
55
+ "f1": 0.425349,
56
+ "f1_weighted": 0.493169,
57
+ "precision": 0.438314,
58
+ "precision_weighted": 0.577194,
59
+ "recall": 0.483948,
60
+ "recall_weighted": 0.4675,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.4695,
66
+ "f1": 0.408054,
67
+ "f1_weighted": 0.494595,
68
+ "precision": 0.412384,
69
+ "precision_weighted": 0.552737,
70
+ "recall": 0.453167,
71
+ "recall_weighted": 0.4695,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.456,
77
+ "f1": 0.40207,
78
+ "f1_weighted": 0.47492,
79
+ "precision": 0.407834,
80
+ "precision_weighted": 0.534999,
81
+ "recall": 0.456004,
82
+ "recall_weighted": 0.456,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.402,
88
+ "f1": 0.369415,
89
+ "f1_weighted": 0.419511,
90
+ "precision": 0.375487,
91
+ "precision_weighted": 0.496676,
92
+ "recall": 0.434631,
93
+ "recall_weighted": 0.402,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.4795,
99
+ "f1": 0.425036,
100
+ "f1_weighted": 0.502268,
101
+ "precision": 0.429169,
102
+ "precision_weighted": 0.559932,
103
+ "recall": 0.461785,
104
+ "recall_weighted": 0.4795,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.4455,
110
+ "f1": 0.399398,
111
+ "f1_weighted": 0.468938,
112
+ "precision": 0.408432,
113
+ "precision_weighted": 0.550323,
114
+ "recall": 0.458149,
115
+ "recall_weighted": 0.4455,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.4522,
121
+ "f1": 0.402507,
122
+ "f1_weighted": 0.472867,
123
+ "precision": 0.409695,
124
+ "precision_weighted": 0.540531,
125
+ "recall": 0.456107,
126
+ "recall_weighted": 0.4522,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.4522,
130
+ "hf_subset": "default",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 21.581204652786255,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/FEVER.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "bea83ef9e8fb933d90a2f1d5515737465d613e12",
3
+ "task_name": "FEVER",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.56871,
9
+ "ndcg_at_3": 0.646,
10
+ "ndcg_at_5": 0.67074,
11
+ "ndcg_at_10": 0.68797,
12
+ "ndcg_at_20": 0.69893,
13
+ "ndcg_at_100": 0.70956,
14
+ "ndcg_at_1000": 0.71442,
15
+ "map_at_1": 0.52593,
16
+ "map_at_3": 0.6093,
17
+ "map_at_5": 0.62385,
18
+ "map_at_10": 0.63133,
19
+ "map_at_20": 0.63452,
20
+ "map_at_100": 0.63614,
21
+ "map_at_1000": 0.63634,
22
+ "recall_at_1": 0.52593,
23
+ "recall_at_3": 0.70784,
24
+ "recall_at_5": 0.76885,
25
+ "recall_at_10": 0.82124,
26
+ "recall_at_20": 0.86285,
27
+ "recall_at_100": 0.9172,
28
+ "recall_at_1000": 0.95243,
29
+ "accuracy": 0.52593,
30
+ "precision_at_1": 0.56871,
31
+ "precision_at_3": 0.25663,
32
+ "precision_at_5": 0.16745,
33
+ "precision_at_10": 0.08972,
34
+ "precision_at_20": 0.0473,
35
+ "precision_at_100": 0.01012,
36
+ "precision_at_1000": 0.00107,
37
+ "mrr_at_1": 0.568707,
38
+ "mrr_at_3": 0.65379,
39
+ "mrr_at_5": 0.668139,
40
+ "mrr_at_10": 0.675324,
41
+ "mrr_at_20": 0.678219,
42
+ "mrr_at_100": 0.679586,
43
+ "mrr_at_1000": 0.679706,
44
+ "nauc_ndcg_at_1_max": 0.307272,
45
+ "nauc_ndcg_at_1_std": -0.071659,
46
+ "nauc_ndcg_at_1_diff1": 0.57292,
47
+ "nauc_ndcg_at_3_max": 0.330102,
48
+ "nauc_ndcg_at_3_std": -0.054024,
49
+ "nauc_ndcg_at_3_diff1": 0.478683,
50
+ "nauc_ndcg_at_5_max": 0.328291,
51
+ "nauc_ndcg_at_5_std": -0.041522,
52
+ "nauc_ndcg_at_5_diff1": 0.467141,
53
+ "nauc_ndcg_at_10_max": 0.328765,
54
+ "nauc_ndcg_at_10_std": -0.030284,
55
+ "nauc_ndcg_at_10_diff1": 0.468659,
56
+ "nauc_ndcg_at_20_max": 0.331535,
57
+ "nauc_ndcg_at_20_std": -0.020337,
58
+ "nauc_ndcg_at_20_diff1": 0.467534,
59
+ "nauc_ndcg_at_100_max": 0.326263,
60
+ "nauc_ndcg_at_100_std": -0.018799,
61
+ "nauc_ndcg_at_100_diff1": 0.468106,
62
+ "nauc_ndcg_at_1000_max": 0.324176,
63
+ "nauc_ndcg_at_1000_std": -0.023459,
64
+ "nauc_ndcg_at_1000_diff1": 0.470391,
65
+ "nauc_map_at_1_max": 0.276183,
66
+ "nauc_map_at_1_std": -0.059912,
67
+ "nauc_map_at_1_diff1": 0.512305,
68
+ "nauc_map_at_3_max": 0.308915,
69
+ "nauc_map_at_3_std": -0.052841,
70
+ "nauc_map_at_3_diff1": 0.476222,
71
+ "nauc_map_at_5_max": 0.308588,
72
+ "nauc_map_at_5_std": -0.046431,
73
+ "nauc_map_at_5_diff1": 0.471374,
74
+ "nauc_map_at_10_max": 0.309031,
75
+ "nauc_map_at_10_std": -0.042456,
76
+ "nauc_map_at_10_diff1": 0.472355,
77
+ "nauc_map_at_20_max": 0.309637,
78
+ "nauc_map_at_20_std": -0.039993,
79
+ "nauc_map_at_20_diff1": 0.472099,
80
+ "nauc_map_at_100_max": 0.309201,
81
+ "nauc_map_at_100_std": -0.03961,
82
+ "nauc_map_at_100_diff1": 0.472269,
83
+ "nauc_map_at_1000_max": 0.309147,
84
+ "nauc_map_at_1000_std": -0.039724,
85
+ "nauc_map_at_1000_diff1": 0.472342,
86
+ "nauc_recall_at_1_max": 0.276183,
87
+ "nauc_recall_at_1_std": -0.059912,
88
+ "nauc_recall_at_1_diff1": 0.512305,
89
+ "nauc_recall_at_3_max": 0.341813,
90
+ "nauc_recall_at_3_std": -0.042091,
91
+ "nauc_recall_at_3_diff1": 0.40008,
92
+ "nauc_recall_at_5_max": 0.338868,
93
+ "nauc_recall_at_5_std": -0.001615,
94
+ "nauc_recall_at_5_diff1": 0.348356,
95
+ "nauc_recall_at_10_max": 0.341614,
96
+ "nauc_recall_at_10_std": 0.058066,
97
+ "nauc_recall_at_10_diff1": 0.32087,
98
+ "nauc_recall_at_20_max": 0.359773,
99
+ "nauc_recall_at_20_std": 0.149934,
100
+ "nauc_recall_at_20_diff1": 0.269446,
101
+ "nauc_recall_at_100_max": 0.284984,
102
+ "nauc_recall_at_100_std": 0.265406,
103
+ "nauc_recall_at_100_diff1": 0.148358,
104
+ "nauc_recall_at_1000_max": 0.185364,
105
+ "nauc_recall_at_1000_std": 0.286873,
106
+ "nauc_recall_at_1000_diff1": 0.021222,
107
+ "nauc_precision_at_1_max": 0.307272,
108
+ "nauc_precision_at_1_std": -0.071659,
109
+ "nauc_precision_at_1_diff1": 0.57292,
110
+ "nauc_precision_at_3_max": 0.39507,
111
+ "nauc_precision_at_3_std": -0.050833,
112
+ "nauc_precision_at_3_diff1": 0.449301,
113
+ "nauc_precision_at_5_max": 0.393233,
114
+ "nauc_precision_at_5_std": -0.006871,
115
+ "nauc_precision_at_5_diff1": 0.387131,
116
+ "nauc_precision_at_10_max": 0.381568,
117
+ "nauc_precision_at_10_std": 0.066299,
118
+ "nauc_precision_at_10_diff1": 0.331526,
119
+ "nauc_precision_at_20_max": 0.370347,
120
+ "nauc_precision_at_20_std": 0.156228,
121
+ "nauc_precision_at_20_diff1": 0.238405,
122
+ "nauc_precision_at_100_max": 0.229863,
123
+ "nauc_precision_at_100_std": 0.232952,
124
+ "nauc_precision_at_100_diff1": 0.051149,
125
+ "nauc_precision_at_1000_max": 0.084204,
126
+ "nauc_precision_at_1000_std": 0.150754,
127
+ "nauc_precision_at_1000_diff1": -0.059337,
128
+ "nauc_mrr_at_1_max": 0.307272,
129
+ "nauc_mrr_at_1_std": -0.071659,
130
+ "nauc_mrr_at_1_diff1": 0.57292,
131
+ "nauc_mrr_at_3_max": 0.346525,
132
+ "nauc_mrr_at_3_std": -0.066881,
133
+ "nauc_mrr_at_3_diff1": 0.540973,
134
+ "nauc_mrr_at_5_max": 0.346909,
135
+ "nauc_mrr_at_5_std": -0.061135,
136
+ "nauc_mrr_at_5_diff1": 0.538091,
137
+ "nauc_mrr_at_10_max": 0.347144,
138
+ "nauc_mrr_at_10_std": -0.05796,
139
+ "nauc_mrr_at_10_diff1": 0.540921,
140
+ "nauc_mrr_at_20_max": 0.347624,
141
+ "nauc_mrr_at_20_std": -0.056116,
142
+ "nauc_mrr_at_20_diff1": 0.541099,
143
+ "nauc_mrr_at_100_max": 0.346841,
144
+ "nauc_mrr_at_100_std": -0.056289,
145
+ "nauc_mrr_at_100_diff1": 0.541319,
146
+ "nauc_mrr_at_1000_max": 0.346716,
147
+ "nauc_mrr_at_1000_std": -0.056497,
148
+ "nauc_mrr_at_1000_diff1": 0.54137,
149
+ "hit_rate_at_1": 0.56871,
150
+ "hit_rate_at_3": 0.75923,
151
+ "hit_rate_at_5": 0.82163,
152
+ "hit_rate_at_10": 0.87459,
153
+ "hit_rate_at_20": 0.91524,
154
+ "hit_rate_at_100": 0.96595,
155
+ "hit_rate_at_1000": 0.9916,
156
+ "main_score": 0.68797,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 47118.720616579056,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/FiQA2018.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "27a168819829fe9bcd655c2df245fb19452e8e06",
3
+ "task_name": "FiQA2018",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.28086,
9
+ "ndcg_at_3": 0.25988,
10
+ "ndcg_at_5": 0.27223,
11
+ "ndcg_at_10": 0.30046,
12
+ "ndcg_at_20": 0.32565,
13
+ "ndcg_at_100": 0.3662,
14
+ "ndcg_at_1000": 0.40401,
15
+ "map_at_1": 0.14028,
16
+ "map_at_3": 0.19592,
17
+ "map_at_5": 0.21323,
18
+ "map_at_10": 0.2303,
19
+ "map_at_20": 0.23877,
20
+ "map_at_100": 0.24645,
21
+ "map_at_1000": 0.24854,
22
+ "recall_at_1": 0.14028,
23
+ "recall_at_3": 0.23787,
24
+ "recall_at_5": 0.28788,
25
+ "recall_at_10": 0.36944,
26
+ "recall_at_20": 0.45049,
27
+ "recall_at_100": 0.61727,
28
+ "recall_at_1000": 0.84505,
29
+ "accuracy": 0.14028,
30
+ "precision_at_1": 0.28086,
31
+ "precision_at_3": 0.17181,
32
+ "precision_at_5": 0.13056,
33
+ "precision_at_10": 0.08688,
34
+ "precision_at_20": 0.05316,
35
+ "precision_at_100": 0.01522,
36
+ "precision_at_1000": 0.00219,
37
+ "mrr_at_1": 0.280864,
38
+ "mrr_at_3": 0.340021,
39
+ "mrr_at_5": 0.355607,
40
+ "mrr_at_10": 0.366696,
41
+ "mrr_at_20": 0.372785,
42
+ "mrr_at_100": 0.376569,
43
+ "mrr_at_1000": 0.377169,
44
+ "nauc_ndcg_at_1_max": 0.282182,
45
+ "nauc_ndcg_at_1_std": -0.005919,
46
+ "nauc_ndcg_at_1_diff1": 0.434698,
47
+ "nauc_ndcg_at_3_max": 0.285436,
48
+ "nauc_ndcg_at_3_std": 0.024304,
49
+ "nauc_ndcg_at_3_diff1": 0.371713,
50
+ "nauc_ndcg_at_5_max": 0.284284,
51
+ "nauc_ndcg_at_5_std": 0.028341,
52
+ "nauc_ndcg_at_5_diff1": 0.372482,
53
+ "nauc_ndcg_at_10_max": 0.278889,
54
+ "nauc_ndcg_at_10_std": 0.052332,
55
+ "nauc_ndcg_at_10_diff1": 0.364608,
56
+ "nauc_ndcg_at_20_max": 0.291439,
57
+ "nauc_ndcg_at_20_std": 0.067484,
58
+ "nauc_ndcg_at_20_diff1": 0.359704,
59
+ "nauc_ndcg_at_100_max": 0.302294,
60
+ "nauc_ndcg_at_100_std": 0.090976,
61
+ "nauc_ndcg_at_100_diff1": 0.350494,
62
+ "nauc_ndcg_at_1000_max": 0.315943,
63
+ "nauc_ndcg_at_1000_std": 0.098928,
64
+ "nauc_ndcg_at_1000_diff1": 0.348637,
65
+ "nauc_map_at_1_max": 0.249055,
66
+ "nauc_map_at_1_std": -0.022283,
67
+ "nauc_map_at_1_diff1": 0.416121,
68
+ "nauc_map_at_3_max": 0.2593,
69
+ "nauc_map_at_3_std": 0.005654,
70
+ "nauc_map_at_3_diff1": 0.367743,
71
+ "nauc_map_at_5_max": 0.271928,
72
+ "nauc_map_at_5_std": 0.010838,
73
+ "nauc_map_at_5_diff1": 0.367681,
74
+ "nauc_map_at_10_max": 0.278414,
75
+ "nauc_map_at_10_std": 0.03289,
76
+ "nauc_map_at_10_diff1": 0.363794,
77
+ "nauc_map_at_20_max": 0.284905,
78
+ "nauc_map_at_20_std": 0.040127,
79
+ "nauc_map_at_20_diff1": 0.361574,
80
+ "nauc_map_at_100_max": 0.287581,
81
+ "nauc_map_at_100_std": 0.045517,
82
+ "nauc_map_at_100_diff1": 0.360065,
83
+ "nauc_map_at_1000_max": 0.288699,
84
+ "nauc_map_at_1000_std": 0.046204,
85
+ "nauc_map_at_1000_diff1": 0.360119,
86
+ "nauc_recall_at_1_max": 0.249055,
87
+ "nauc_recall_at_1_std": -0.022283,
88
+ "nauc_recall_at_1_diff1": 0.416121,
89
+ "nauc_recall_at_3_max": 0.229926,
90
+ "nauc_recall_at_3_std": 0.024338,
91
+ "nauc_recall_at_3_diff1": 0.321113,
92
+ "nauc_recall_at_5_max": 0.238645,
93
+ "nauc_recall_at_5_std": 0.029092,
94
+ "nauc_recall_at_5_diff1": 0.313235,
95
+ "nauc_recall_at_10_max": 0.214676,
96
+ "nauc_recall_at_10_std": 0.076826,
97
+ "nauc_recall_at_10_diff1": 0.283965,
98
+ "nauc_recall_at_20_max": 0.235089,
99
+ "nauc_recall_at_20_std": 0.116213,
100
+ "nauc_recall_at_20_diff1": 0.253209,
101
+ "nauc_recall_at_100_max": 0.250115,
102
+ "nauc_recall_at_100_std": 0.207467,
103
+ "nauc_recall_at_100_diff1": 0.196793,
104
+ "nauc_recall_at_1000_max": 0.348401,
105
+ "nauc_recall_at_1000_std": 0.399427,
106
+ "nauc_recall_at_1000_diff1": 0.072174,
107
+ "nauc_precision_at_1_max": 0.282182,
108
+ "nauc_precision_at_1_std": -0.005919,
109
+ "nauc_precision_at_1_diff1": 0.434698,
110
+ "nauc_precision_at_3_max": 0.288645,
111
+ "nauc_precision_at_3_std": 0.055596,
112
+ "nauc_precision_at_3_diff1": 0.308554,
113
+ "nauc_precision_at_5_max": 0.304058,
114
+ "nauc_precision_at_5_std": 0.082317,
115
+ "nauc_precision_at_5_diff1": 0.29534,
116
+ "nauc_precision_at_10_max": 0.292602,
117
+ "nauc_precision_at_10_std": 0.137439,
118
+ "nauc_precision_at_10_diff1": 0.234054,
119
+ "nauc_precision_at_20_max": 0.300958,
120
+ "nauc_precision_at_20_std": 0.15644,
121
+ "nauc_precision_at_20_diff1": 0.19526,
122
+ "nauc_precision_at_100_max": 0.285515,
123
+ "nauc_precision_at_100_std": 0.20098,
124
+ "nauc_precision_at_100_diff1": 0.104872,
125
+ "nauc_precision_at_1000_max": 0.237252,
126
+ "nauc_precision_at_1000_std": 0.169333,
127
+ "nauc_precision_at_1000_diff1": 0.019882,
128
+ "nauc_mrr_at_1_max": 0.282182,
129
+ "nauc_mrr_at_1_std": -0.005919,
130
+ "nauc_mrr_at_1_diff1": 0.434698,
131
+ "nauc_mrr_at_3_max": 0.291531,
132
+ "nauc_mrr_at_3_std": 0.025346,
133
+ "nauc_mrr_at_3_diff1": 0.40537,
134
+ "nauc_mrr_at_5_max": 0.288307,
135
+ "nauc_mrr_at_5_std": 0.027858,
136
+ "nauc_mrr_at_5_diff1": 0.404072,
137
+ "nauc_mrr_at_10_max": 0.283973,
138
+ "nauc_mrr_at_10_std": 0.028988,
139
+ "nauc_mrr_at_10_diff1": 0.400819,
140
+ "nauc_mrr_at_20_max": 0.286636,
141
+ "nauc_mrr_at_20_std": 0.031924,
142
+ "nauc_mrr_at_20_diff1": 0.401876,
143
+ "nauc_mrr_at_100_max": 0.288009,
144
+ "nauc_mrr_at_100_std": 0.03306,
145
+ "nauc_mrr_at_100_diff1": 0.401813,
146
+ "nauc_mrr_at_1000_max": 0.288299,
147
+ "nauc_mrr_at_1000_std": 0.033152,
148
+ "nauc_mrr_at_1000_diff1": 0.401918,
149
+ "hit_rate_at_1": 0.28086,
150
+ "hit_rate_at_3": 0.41358,
151
+ "hit_rate_at_5": 0.47994,
152
+ "hit_rate_at_10": 0.56481,
153
+ "hit_rate_at_20": 0.65278,
154
+ "hit_rate_at_100": 0.79784,
155
+ "hit_rate_at_1000": 0.92747,
156
+ "main_score": 0.30046,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 147.02466917037964,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/HotpotQA.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "ab518f4d6fcca38d87c25209f94beba119d02014",
3
+ "task_name": "HotpotQA",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.62984,
9
+ "ndcg_at_3": 0.47679,
10
+ "ndcg_at_5": 0.49652,
11
+ "ndcg_at_10": 0.51573,
12
+ "ndcg_at_20": 0.52787,
13
+ "ndcg_at_100": 0.54883,
14
+ "ndcg_at_1000": 0.56562,
15
+ "map_at_1": 0.31492,
16
+ "map_at_3": 0.40117,
17
+ "map_at_5": 0.41488,
18
+ "map_at_10": 0.42506,
19
+ "map_at_20": 0.42947,
20
+ "map_at_100": 0.4333,
21
+ "map_at_1000": 0.43408,
22
+ "recall_at_1": 0.31492,
23
+ "recall_at_3": 0.44024,
24
+ "recall_at_5": 0.4792,
25
+ "recall_at_10": 0.52755,
26
+ "recall_at_20": 0.56644,
27
+ "recall_at_100": 0.65942,
28
+ "recall_at_1000": 0.7713,
29
+ "accuracy": 0.31492,
30
+ "precision_at_1": 0.62984,
31
+ "precision_at_3": 0.2935,
32
+ "precision_at_5": 0.19168,
33
+ "precision_at_10": 0.10551,
34
+ "precision_at_20": 0.05664,
35
+ "precision_at_100": 0.01319,
36
+ "precision_at_1000": 0.00154,
37
+ "mrr_at_1": 0.629845,
38
+ "mrr_at_3": 0.681567,
39
+ "mrr_at_5": 0.68998,
40
+ "mrr_at_10": 0.695554,
41
+ "mrr_at_20": 0.697744,
42
+ "mrr_at_100": 0.699295,
43
+ "mrr_at_1000": 0.699524,
44
+ "nauc_ndcg_at_1_max": 0.534041,
45
+ "nauc_ndcg_at_1_std": 0.117229,
46
+ "nauc_ndcg_at_1_diff1": 0.707139,
47
+ "nauc_ndcg_at_3_max": 0.409243,
48
+ "nauc_ndcg_at_3_std": 0.121794,
49
+ "nauc_ndcg_at_3_diff1": 0.448046,
50
+ "nauc_ndcg_at_5_max": 0.39382,
51
+ "nauc_ndcg_at_5_std": 0.1281,
52
+ "nauc_ndcg_at_5_diff1": 0.423221,
53
+ "nauc_ndcg_at_10_max": 0.380286,
54
+ "nauc_ndcg_at_10_std": 0.136941,
55
+ "nauc_ndcg_at_10_diff1": 0.400257,
56
+ "nauc_ndcg_at_20_max": 0.372918,
57
+ "nauc_ndcg_at_20_std": 0.143157,
58
+ "nauc_ndcg_at_20_diff1": 0.392407,
59
+ "nauc_ndcg_at_100_max": 0.364336,
60
+ "nauc_ndcg_at_100_std": 0.150263,
61
+ "nauc_ndcg_at_100_diff1": 0.381477,
62
+ "nauc_ndcg_at_1000_max": 0.367327,
63
+ "nauc_ndcg_at_1000_std": 0.152501,
64
+ "nauc_ndcg_at_1000_diff1": 0.384615,
65
+ "nauc_map_at_1_max": 0.534041,
66
+ "nauc_map_at_1_std": 0.117229,
67
+ "nauc_map_at_1_diff1": 0.707139,
68
+ "nauc_map_at_3_max": 0.367765,
69
+ "nauc_map_at_3_std": 0.112282,
70
+ "nauc_map_at_3_diff1": 0.392886,
71
+ "nauc_map_at_5_max": 0.358514,
72
+ "nauc_map_at_5_std": 0.117568,
73
+ "nauc_map_at_5_diff1": 0.37662,
74
+ "nauc_map_at_10_max": 0.351397,
75
+ "nauc_map_at_10_std": 0.122102,
76
+ "nauc_map_at_10_diff1": 0.36496,
77
+ "nauc_map_at_20_max": 0.348738,
78
+ "nauc_map_at_20_std": 0.124337,
79
+ "nauc_map_at_20_diff1": 0.362358,
80
+ "nauc_map_at_100_max": 0.347188,
81
+ "nauc_map_at_100_std": 0.125824,
82
+ "nauc_map_at_100_diff1": 0.360522,
83
+ "nauc_map_at_1000_max": 0.347386,
84
+ "nauc_map_at_1000_std": 0.125968,
85
+ "nauc_map_at_1000_diff1": 0.360715,
86
+ "nauc_recall_at_1_max": 0.534041,
87
+ "nauc_recall_at_1_std": 0.117229,
88
+ "nauc_recall_at_1_diff1": 0.707139,
89
+ "nauc_recall_at_3_max": 0.343494,
90
+ "nauc_recall_at_3_std": 0.122367,
91
+ "nauc_recall_at_3_diff1": 0.321633,
92
+ "nauc_recall_at_5_max": 0.300377,
93
+ "nauc_recall_at_5_std": 0.130005,
94
+ "nauc_recall_at_5_diff1": 0.26136,
95
+ "nauc_recall_at_10_max": 0.259191,
96
+ "nauc_recall_at_10_std": 0.148957,
97
+ "nauc_recall_at_10_diff1": 0.197694,
98
+ "nauc_recall_at_20_max": 0.219669,
99
+ "nauc_recall_at_20_std": 0.162044,
100
+ "nauc_recall_at_20_diff1": 0.155806,
101
+ "nauc_recall_at_100_max": 0.15039,
102
+ "nauc_recall_at_100_std": 0.185146,
103
+ "nauc_recall_at_100_diff1": 0.069017,
104
+ "nauc_recall_at_1000_max": 0.111545,
105
+ "nauc_recall_at_1000_std": 0.207346,
106
+ "nauc_recall_at_1000_diff1": 0.006229,
107
+ "nauc_precision_at_1_max": 0.534041,
108
+ "nauc_precision_at_1_std": 0.117229,
109
+ "nauc_precision_at_1_diff1": 0.707139,
110
+ "nauc_precision_at_3_max": 0.343494,
111
+ "nauc_precision_at_3_std": 0.122367,
112
+ "nauc_precision_at_3_diff1": 0.321633,
113
+ "nauc_precision_at_5_max": 0.300377,
114
+ "nauc_precision_at_5_std": 0.130005,
115
+ "nauc_precision_at_5_diff1": 0.26136,
116
+ "nauc_precision_at_10_max": 0.259191,
117
+ "nauc_precision_at_10_std": 0.148957,
118
+ "nauc_precision_at_10_diff1": 0.197694,
119
+ "nauc_precision_at_20_max": 0.219669,
120
+ "nauc_precision_at_20_std": 0.162044,
121
+ "nauc_precision_at_20_diff1": 0.155806,
122
+ "nauc_precision_at_100_max": 0.15039,
123
+ "nauc_precision_at_100_std": 0.185146,
124
+ "nauc_precision_at_100_diff1": 0.069017,
125
+ "nauc_precision_at_1000_max": 0.111545,
126
+ "nauc_precision_at_1000_std": 0.207346,
127
+ "nauc_precision_at_1000_diff1": 0.006229,
128
+ "nauc_mrr_at_1_max": 0.534041,
129
+ "nauc_mrr_at_1_std": 0.117229,
130
+ "nauc_mrr_at_1_diff1": 0.707139,
131
+ "nauc_mrr_at_3_max": 0.545955,
132
+ "nauc_mrr_at_3_std": 0.134605,
133
+ "nauc_mrr_at_3_diff1": 0.677299,
134
+ "nauc_mrr_at_5_max": 0.543093,
135
+ "nauc_mrr_at_5_std": 0.136752,
136
+ "nauc_mrr_at_5_diff1": 0.674116,
137
+ "nauc_mrr_at_10_max": 0.542856,
138
+ "nauc_mrr_at_10_std": 0.13956,
139
+ "nauc_mrr_at_10_diff1": 0.672568,
140
+ "nauc_mrr_at_20_max": 0.542904,
141
+ "nauc_mrr_at_20_std": 0.140699,
142
+ "nauc_mrr_at_20_diff1": 0.672927,
143
+ "nauc_mrr_at_100_max": 0.542702,
144
+ "nauc_mrr_at_100_std": 0.140804,
145
+ "nauc_mrr_at_100_diff1": 0.673047,
146
+ "nauc_mrr_at_1000_max": 0.542759,
147
+ "nauc_mrr_at_1000_std": 0.140716,
148
+ "nauc_mrr_at_1000_diff1": 0.673225,
149
+ "hit_rate_at_1": 0.62984,
150
+ "hit_rate_at_3": 0.74517,
151
+ "hit_rate_at_5": 0.78136,
152
+ "hit_rate_at_10": 0.82228,
153
+ "hit_rate_at_20": 0.85267,
154
+ "hit_rate_at_100": 0.91465,
155
+ "hit_rate_at_1000": 0.96543,
156
+ "main_score": 0.51573,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 44493.19784450531,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/ImdbClassification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "3d86128a09e091d6018b6d26cad27f2739fc2db7",
3
+ "task_name": "ImdbClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.76784,
11
+ "f1": 0.766234,
12
+ "f1_weighted": 0.766234,
13
+ "precision": 0.775407,
14
+ "precision_weighted": 0.775407,
15
+ "recall": 0.76784,
16
+ "recall_weighted": 0.76784,
17
+ "ap": 0.719912,
18
+ "ap_weighted": 0.719912
19
+ },
20
+ {
21
+ "accuracy": 0.76716,
22
+ "f1": 0.76716,
23
+ "f1_weighted": 0.76716,
24
+ "precision": 0.767161,
25
+ "precision_weighted": 0.767161,
26
+ "recall": 0.76716,
27
+ "recall_weighted": 0.76716,
28
+ "ap": 0.705098,
29
+ "ap_weighted": 0.705098
30
+ },
31
+ {
32
+ "accuracy": 0.67568,
33
+ "f1": 0.674903,
34
+ "f1_weighted": 0.674903,
35
+ "precision": 0.677375,
36
+ "precision_weighted": 0.677375,
37
+ "recall": 0.67568,
38
+ "recall_weighted": 0.67568,
39
+ "ap": 0.615955,
40
+ "ap_weighted": 0.615955
41
+ },
42
+ {
43
+ "accuracy": 0.77256,
44
+ "f1": 0.77255,
45
+ "f1_weighted": 0.77255,
46
+ "precision": 0.772606,
47
+ "precision_weighted": 0.772606,
48
+ "recall": 0.77256,
49
+ "recall_weighted": 0.77256,
50
+ "ap": 0.711544,
51
+ "ap_weighted": 0.711544
52
+ },
53
+ {
54
+ "accuracy": 0.7206,
55
+ "f1": 0.72051,
56
+ "f1_weighted": 0.72051,
57
+ "precision": 0.720885,
58
+ "precision_weighted": 0.720885,
59
+ "recall": 0.7206,
60
+ "recall_weighted": 0.7206,
61
+ "ap": 0.660778,
62
+ "ap_weighted": 0.660778
63
+ },
64
+ {
65
+ "accuracy": 0.69748,
66
+ "f1": 0.696681,
67
+ "f1_weighted": 0.696681,
68
+ "precision": 0.699583,
69
+ "precision_weighted": 0.699583,
70
+ "recall": 0.69748,
71
+ "recall_weighted": 0.69748,
72
+ "ap": 0.634108,
73
+ "ap_weighted": 0.634108
74
+ },
75
+ {
76
+ "accuracy": 0.69976,
77
+ "f1": 0.698724,
78
+ "f1_weighted": 0.698724,
79
+ "precision": 0.702546,
80
+ "precision_weighted": 0.702546,
81
+ "recall": 0.69976,
82
+ "recall_weighted": 0.69976,
83
+ "ap": 0.635595,
84
+ "ap_weighted": 0.635595
85
+ },
86
+ {
87
+ "accuracy": 0.72864,
88
+ "f1": 0.727201,
89
+ "f1_weighted": 0.727201,
90
+ "precision": 0.73357,
91
+ "precision_weighted": 0.73357,
92
+ "recall": 0.72864,
93
+ "recall_weighted": 0.72864,
94
+ "ap": 0.659965,
95
+ "ap_weighted": 0.659965
96
+ },
97
+ {
98
+ "accuracy": 0.71084,
99
+ "f1": 0.710808,
100
+ "f1_weighted": 0.710808,
101
+ "precision": 0.710932,
102
+ "precision_weighted": 0.710932,
103
+ "recall": 0.71084,
104
+ "recall_weighted": 0.71084,
105
+ "ap": 0.648964,
106
+ "ap_weighted": 0.648964
107
+ },
108
+ {
109
+ "accuracy": 0.70844,
110
+ "f1": 0.707814,
111
+ "f1_weighted": 0.707814,
112
+ "precision": 0.710241,
113
+ "precision_weighted": 0.710241,
114
+ "recall": 0.70844,
115
+ "recall_weighted": 0.70844,
116
+ "ap": 0.643986,
117
+ "ap_weighted": 0.643986
118
+ }
119
+ ],
120
+ "accuracy": 0.7249,
121
+ "f1": 0.724259,
122
+ "f1_weighted": 0.724259,
123
+ "precision": 0.727031,
124
+ "precision_weighted": 0.727031,
125
+ "recall": 0.7249,
126
+ "recall_weighted": 0.7249,
127
+ "ap": 0.663591,
128
+ "ap_weighted": 0.663591,
129
+ "main_score": 0.7249,
130
+ "hf_subset": "default",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 239.4080970287323,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/MSMARCO.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "c5a29a104738b98a9e76336939199e264163d4a0",
3
+ "task_name": "MSMARCO",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "dev": [
7
+ {
8
+ "ndcg_at_1": 0.18352,
9
+ "ndcg_at_3": 0.27647,
10
+ "ndcg_at_5": 0.31043,
11
+ "ndcg_at_10": 0.34306,
12
+ "ndcg_at_20": 0.37093,
13
+ "ndcg_at_100": 0.4049,
14
+ "ndcg_at_1000": 0.4225,
15
+ "map_at_1": 0.17927,
16
+ "map_at_3": 0.2514,
17
+ "map_at_5": 0.27035,
18
+ "map_at_10": 0.28399,
19
+ "map_at_20": 0.29183,
20
+ "map_at_100": 0.29663,
21
+ "map_at_1000": 0.2973,
22
+ "recall_at_1": 0.17927,
23
+ "recall_at_3": 0.34451,
24
+ "recall_at_5": 0.42605,
25
+ "recall_at_10": 0.5253,
26
+ "recall_at_20": 0.63357,
27
+ "recall_at_100": 0.81353,
28
+ "recall_at_1000": 0.95008,
29
+ "accuracy": 0.17927,
30
+ "precision_at_1": 0.18352,
31
+ "precision_at_3": 0.11848,
32
+ "precision_at_5": 0.08825,
33
+ "precision_at_10": 0.05471,
34
+ "precision_at_20": 0.03312,
35
+ "precision_at_100": 0.00858,
36
+ "precision_at_1000": 0.00101,
37
+ "mrr_at_1": 0.183524,
38
+ "mrr_at_3": 0.255946,
39
+ "mrr_at_5": 0.274964,
40
+ "mrr_at_10": 0.288474,
41
+ "mrr_at_20": 0.296043,
42
+ "mrr_at_100": 0.300633,
43
+ "mrr_at_1000": 0.30124,
44
+ "nauc_ndcg_at_1_max": 0.097826,
45
+ "nauc_ndcg_at_1_std": -0.15536,
46
+ "nauc_ndcg_at_1_diff1": 0.394587,
47
+ "nauc_ndcg_at_3_max": 0.111719,
48
+ "nauc_ndcg_at_3_std": -0.17947,
49
+ "nauc_ndcg_at_3_diff1": 0.338187,
50
+ "nauc_ndcg_at_5_max": 0.121826,
51
+ "nauc_ndcg_at_5_std": -0.177252,
52
+ "nauc_ndcg_at_5_diff1": 0.335629,
53
+ "nauc_ndcg_at_10_max": 0.122196,
54
+ "nauc_ndcg_at_10_std": -0.17097,
55
+ "nauc_ndcg_at_10_diff1": 0.328234,
56
+ "nauc_ndcg_at_20_max": 0.129861,
57
+ "nauc_ndcg_at_20_std": -0.148546,
58
+ "nauc_ndcg_at_20_diff1": 0.323021,
59
+ "nauc_ndcg_at_100_max": 0.141991,
60
+ "nauc_ndcg_at_100_std": -0.122136,
61
+ "nauc_ndcg_at_100_diff1": 0.327261,
62
+ "nauc_ndcg_at_1000_max": 0.13643,
63
+ "nauc_ndcg_at_1000_std": -0.133134,
64
+ "nauc_ndcg_at_1000_diff1": 0.333515,
65
+ "nauc_map_at_1_max": 0.100003,
66
+ "nauc_map_at_1_std": -0.152663,
67
+ "nauc_map_at_1_diff1": 0.397579,
68
+ "nauc_map_at_3_max": 0.108434,
69
+ "nauc_map_at_3_std": -0.174675,
70
+ "nauc_map_at_3_diff1": 0.352186,
71
+ "nauc_map_at_5_max": 0.114191,
72
+ "nauc_map_at_5_std": -0.173608,
73
+ "nauc_map_at_5_diff1": 0.350296,
74
+ "nauc_map_at_10_max": 0.114444,
75
+ "nauc_map_at_10_std": -0.170924,
76
+ "nauc_map_at_10_diff1": 0.347235,
77
+ "nauc_map_at_20_max": 0.116522,
78
+ "nauc_map_at_20_std": -0.164583,
79
+ "nauc_map_at_20_diff1": 0.345848,
80
+ "nauc_map_at_100_max": 0.118198,
81
+ "nauc_map_at_100_std": -0.161072,
82
+ "nauc_map_at_100_diff1": 0.346733,
83
+ "nauc_map_at_1000_max": 0.118039,
84
+ "nauc_map_at_1000_std": -0.161343,
85
+ "nauc_map_at_1000_diff1": 0.346952,
86
+ "nauc_recall_at_1_max": 0.100003,
87
+ "nauc_recall_at_1_std": -0.152663,
88
+ "nauc_recall_at_1_diff1": 0.397579,
89
+ "nauc_recall_at_3_max": 0.121155,
90
+ "nauc_recall_at_3_std": -0.189539,
91
+ "nauc_recall_at_3_diff1": 0.303116,
92
+ "nauc_recall_at_5_max": 0.142036,
93
+ "nauc_recall_at_5_std": -0.184207,
94
+ "nauc_recall_at_5_diff1": 0.297834,
95
+ "nauc_recall_at_10_max": 0.141496,
96
+ "nauc_recall_at_10_std": -0.168341,
97
+ "nauc_recall_at_10_diff1": 0.274339,
98
+ "nauc_recall_at_20_max": 0.175482,
99
+ "nauc_recall_at_20_std": -0.07786,
100
+ "nauc_recall_at_20_diff1": 0.246519,
101
+ "nauc_recall_at_100_max": 0.303663,
102
+ "nauc_recall_at_100_std": 0.189521,
103
+ "nauc_recall_at_100_diff1": 0.228804,
104
+ "nauc_recall_at_1000_max": 0.457624,
105
+ "nauc_recall_at_1000_std": 0.460749,
106
+ "nauc_recall_at_1000_diff1": 0.259203,
107
+ "nauc_precision_at_1_max": 0.097826,
108
+ "nauc_precision_at_1_std": -0.15536,
109
+ "nauc_precision_at_1_diff1": 0.394587,
110
+ "nauc_precision_at_3_max": 0.118029,
111
+ "nauc_precision_at_3_std": -0.191784,
112
+ "nauc_precision_at_3_diff1": 0.30004,
113
+ "nauc_precision_at_5_max": 0.139008,
114
+ "nauc_precision_at_5_std": -0.185008,
115
+ "nauc_precision_at_5_diff1": 0.292145,
116
+ "nauc_precision_at_10_max": 0.140984,
117
+ "nauc_precision_at_10_std": -0.161441,
118
+ "nauc_precision_at_10_diff1": 0.263404,
119
+ "nauc_precision_at_20_max": 0.171223,
120
+ "nauc_precision_at_20_std": -0.066617,
121
+ "nauc_precision_at_20_diff1": 0.225704,
122
+ "nauc_precision_at_100_max": 0.260124,
123
+ "nauc_precision_at_100_std": 0.175427,
124
+ "nauc_precision_at_100_diff1": 0.16645,
125
+ "nauc_precision_at_1000_max": 0.210277,
126
+ "nauc_precision_at_1000_std": 0.217386,
127
+ "nauc_precision_at_1000_diff1": 0.054492,
128
+ "nauc_mrr_at_1_max": 0.097826,
129
+ "nauc_mrr_at_1_std": -0.15536,
130
+ "nauc_mrr_at_1_diff1": 0.394587,
131
+ "nauc_mrr_at_3_max": 0.106522,
132
+ "nauc_mrr_at_3_std": -0.175168,
133
+ "nauc_mrr_at_3_diff1": 0.34909,
134
+ "nauc_mrr_at_5_max": 0.112892,
135
+ "nauc_mrr_at_5_std": -0.173536,
136
+ "nauc_mrr_at_5_diff1": 0.347709,
137
+ "nauc_mrr_at_10_max": 0.113668,
138
+ "nauc_mrr_at_10_std": -0.1702,
139
+ "nauc_mrr_at_10_diff1": 0.344778,
140
+ "nauc_mrr_at_20_max": 0.115344,
141
+ "nauc_mrr_at_20_std": -0.164495,
142
+ "nauc_mrr_at_20_diff1": 0.343361,
143
+ "nauc_mrr_at_100_max": 0.116744,
144
+ "nauc_mrr_at_100_std": -0.161294,
145
+ "nauc_mrr_at_100_diff1": 0.344159,
146
+ "nauc_mrr_at_1000_max": 0.116545,
147
+ "nauc_mrr_at_1000_std": -0.161614,
148
+ "nauc_mrr_at_1000_diff1": 0.344419,
149
+ "hit_rate_at_1": 0.18352,
150
+ "hit_rate_at_3": 0.35143,
151
+ "hit_rate_at_5": 0.43453,
152
+ "hit_rate_at_10": 0.5351,
153
+ "hit_rate_at_20": 0.64312,
154
+ "hit_rate_at_100": 0.82178,
155
+ "hit_rate_at_1000": 0.95401,
156
+ "main_score": 0.34306,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 21226.186717748642,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/MTOPDomainClassification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "a76d16fae880597b9c73047b50159220a441cb54",
3
+ "task_name": "MTOPDomainClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.900821,
11
+ "f1": 0.894414,
12
+ "f1_weighted": 0.900212,
13
+ "precision": 0.894371,
14
+ "precision_weighted": 0.902882,
15
+ "recall": 0.898187,
16
+ "recall_weighted": 0.900821,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.907889,
22
+ "f1": 0.902631,
23
+ "f1_weighted": 0.907373,
24
+ "precision": 0.899159,
25
+ "precision_weighted": 0.907826,
26
+ "recall": 0.907207,
27
+ "recall_weighted": 0.907889,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.906521,
33
+ "f1": 0.901483,
34
+ "f1_weighted": 0.906194,
35
+ "precision": 0.897922,
36
+ "precision_weighted": 0.90713,
37
+ "recall": 0.906383,
38
+ "recall_weighted": 0.906521,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.912221,
44
+ "f1": 0.907049,
45
+ "f1_weighted": 0.913244,
46
+ "precision": 0.904719,
47
+ "precision_weighted": 0.919115,
48
+ "recall": 0.913999,
49
+ "recall_weighted": 0.912221,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.898769,
55
+ "f1": 0.892104,
56
+ "f1_weighted": 0.899325,
57
+ "precision": 0.889255,
58
+ "precision_weighted": 0.903113,
59
+ "recall": 0.898813,
60
+ "recall_weighted": 0.898769,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.914501,
66
+ "f1": 0.91,
67
+ "f1_weighted": 0.913992,
68
+ "precision": 0.907744,
69
+ "precision_weighted": 0.914529,
70
+ "recall": 0.913379,
71
+ "recall_weighted": 0.914501,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.904697,
77
+ "f1": 0.897702,
78
+ "f1_weighted": 0.903523,
79
+ "precision": 0.896038,
80
+ "precision_weighted": 0.904804,
81
+ "recall": 0.901848,
82
+ "recall_weighted": 0.904697,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.911081,
88
+ "f1": 0.903115,
89
+ "f1_weighted": 0.912288,
90
+ "precision": 0.898983,
91
+ "precision_weighted": 0.917029,
92
+ "recall": 0.912515,
93
+ "recall_weighted": 0.911081,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.897173,
99
+ "f1": 0.890582,
100
+ "f1_weighted": 0.897482,
101
+ "precision": 0.884795,
102
+ "precision_weighted": 0.901172,
103
+ "recall": 0.899448,
104
+ "recall_weighted": 0.897173,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.911537,
110
+ "f1": 0.903535,
111
+ "f1_weighted": 0.911786,
112
+ "precision": 0.900866,
113
+ "precision_weighted": 0.912908,
114
+ "recall": 0.907346,
115
+ "recall_weighted": 0.911537,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.906521,
121
+ "f1": 0.900261,
122
+ "f1_weighted": 0.906542,
123
+ "precision": 0.897385,
124
+ "precision_weighted": 0.909051,
125
+ "recall": 0.905913,
126
+ "recall_weighted": 0.906521,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.906521,
130
+ "hf_subset": "en",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 39.02710032463074,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/MTOPIntentClassification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "2992d820f31312593c49a4890430aadadb0f0039",
3
+ "task_name": "MTOPIntentClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.581395,
11
+ "f1": 0.405436,
12
+ "f1_weighted": 0.608488,
13
+ "precision": 0.39225,
14
+ "precision_weighted": 0.817559,
15
+ "recall": 0.627798,
16
+ "recall_weighted": 0.581395,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.593023,
22
+ "f1": 0.414719,
23
+ "f1_weighted": 0.615382,
24
+ "precision": 0.402968,
25
+ "precision_weighted": 0.818255,
26
+ "recall": 0.641342,
27
+ "recall_weighted": 0.593023,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.639991,
33
+ "f1": 0.451057,
34
+ "f1_weighted": 0.663545,
35
+ "precision": 0.421202,
36
+ "precision_weighted": 0.8179,
37
+ "recall": 0.658777,
38
+ "recall_weighted": 0.639991,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.594391,
44
+ "f1": 0.418811,
45
+ "f1_weighted": 0.618191,
46
+ "precision": 0.405412,
47
+ "precision_weighted": 0.815189,
48
+ "recall": 0.637559,
49
+ "recall_weighted": 0.594391,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.591427,
55
+ "f1": 0.435491,
56
+ "f1_weighted": 0.602733,
57
+ "precision": 0.419357,
58
+ "precision_weighted": 0.799612,
59
+ "recall": 0.651045,
60
+ "recall_weighted": 0.591427,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.600547,
66
+ "f1": 0.429205,
67
+ "f1_weighted": 0.618274,
68
+ "precision": 0.406865,
69
+ "precision_weighted": 0.806025,
70
+ "recall": 0.646415,
71
+ "recall_weighted": 0.600547,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.588007,
77
+ "f1": 0.421838,
78
+ "f1_weighted": 0.613199,
79
+ "precision": 0.406351,
80
+ "precision_weighted": 0.807821,
81
+ "recall": 0.631906,
82
+ "recall_weighted": 0.588007,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.635659,
88
+ "f1": 0.435009,
89
+ "f1_weighted": 0.658551,
90
+ "precision": 0.415655,
91
+ "precision_weighted": 0.830892,
92
+ "recall": 0.646068,
93
+ "recall_weighted": 0.635659,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.626995,
99
+ "f1": 0.426097,
100
+ "f1_weighted": 0.653042,
101
+ "precision": 0.405923,
102
+ "precision_weighted": 0.824495,
103
+ "recall": 0.646228,
104
+ "recall_weighted": 0.626995,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.629275,
110
+ "f1": 0.433398,
111
+ "f1_weighted": 0.649904,
112
+ "precision": 0.408811,
113
+ "precision_weighted": 0.803445,
114
+ "recall": 0.640411,
115
+ "recall_weighted": 0.629275,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.608071,
121
+ "f1": 0.427106,
122
+ "f1_weighted": 0.630131,
123
+ "precision": 0.408479,
124
+ "precision_weighted": 0.814119,
125
+ "recall": 0.642755,
126
+ "recall_weighted": 0.608071,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.608071,
130
+ "hf_subset": "en",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 74.10429525375366,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/MassiveIntentClassification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "4672e20407010da34463acc759c162ca9734bca6",
3
+ "task_name": "MassiveIntentClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.667451,
11
+ "f1": 0.648678,
12
+ "f1_weighted": 0.655716,
13
+ "precision": 0.643185,
14
+ "precision_weighted": 0.717259,
15
+ "recall": 0.744907,
16
+ "recall_weighted": 0.667451,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.699395,
22
+ "f1": 0.678374,
23
+ "f1_weighted": 0.687472,
24
+ "precision": 0.653341,
25
+ "precision_weighted": 0.733456,
26
+ "recall": 0.760127,
27
+ "recall_weighted": 0.699395,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.660726,
33
+ "f1": 0.635524,
34
+ "f1_weighted": 0.647047,
35
+ "precision": 0.612503,
36
+ "precision_weighted": 0.682605,
37
+ "recall": 0.740933,
38
+ "recall_weighted": 0.660726,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.676866,
44
+ "f1": 0.647704,
45
+ "f1_weighted": 0.66291,
46
+ "precision": 0.634801,
47
+ "precision_weighted": 0.721557,
48
+ "recall": 0.737105,
49
+ "recall_weighted": 0.676866,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.659045,
55
+ "f1": 0.632608,
56
+ "f1_weighted": 0.64493,
57
+ "precision": 0.635989,
58
+ "precision_weighted": 0.711611,
59
+ "recall": 0.728085,
60
+ "recall_weighted": 0.659045,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.637861,
66
+ "f1": 0.624383,
67
+ "f1_weighted": 0.62512,
68
+ "precision": 0.614889,
69
+ "precision_weighted": 0.698777,
70
+ "recall": 0.728036,
71
+ "recall_weighted": 0.637861,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.644586,
77
+ "f1": 0.630637,
78
+ "f1_weighted": 0.630088,
79
+ "precision": 0.6187,
80
+ "precision_weighted": 0.681501,
81
+ "recall": 0.732731,
82
+ "recall_weighted": 0.644586,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.658373,
88
+ "f1": 0.643129,
89
+ "f1_weighted": 0.639615,
90
+ "precision": 0.640523,
91
+ "precision_weighted": 0.725897,
92
+ "recall": 0.747192,
93
+ "recall_weighted": 0.658373,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.654674,
99
+ "f1": 0.655173,
100
+ "f1_weighted": 0.625944,
101
+ "precision": 0.651018,
102
+ "precision_weighted": 0.708277,
103
+ "recall": 0.760025,
104
+ "recall_weighted": 0.654674,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.677202,
110
+ "f1": 0.664569,
111
+ "f1_weighted": 0.669185,
112
+ "precision": 0.650802,
113
+ "precision_weighted": 0.723547,
114
+ "recall": 0.75546,
115
+ "recall_weighted": 0.677202,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.663618,
121
+ "f1": 0.646078,
122
+ "f1_weighted": 0.648803,
123
+ "precision": 0.635575,
124
+ "precision_weighted": 0.710449,
125
+ "recall": 0.74346,
126
+ "recall_weighted": 0.663618,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.663618,
130
+ "hf_subset": "en",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 56.978368520736694,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/MassiveScenarioClassification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "fad2c6e8459f9e1c45d9315f4953d921437d70f8",
3
+ "task_name": "MassiveScenarioClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.739744,
11
+ "f1": 0.732875,
12
+ "f1_weighted": 0.737816,
13
+ "precision": 0.712032,
14
+ "precision_weighted": 0.766673,
15
+ "recall": 0.790948,
16
+ "recall_weighted": 0.739744,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.740081,
22
+ "f1": 0.724906,
23
+ "f1_weighted": 0.73434,
24
+ "precision": 0.701874,
25
+ "precision_weighted": 0.765691,
26
+ "recall": 0.788278,
27
+ "recall_weighted": 0.740081,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.735037,
33
+ "f1": 0.719232,
34
+ "f1_weighted": 0.73226,
35
+ "precision": 0.697464,
36
+ "precision_weighted": 0.762755,
37
+ "recall": 0.779451,
38
+ "recall_weighted": 0.735037,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.740417,
44
+ "f1": 0.730607,
45
+ "f1_weighted": 0.740344,
46
+ "precision": 0.715476,
47
+ "precision_weighted": 0.771636,
48
+ "recall": 0.782892,
49
+ "recall_weighted": 0.740417,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.731338,
55
+ "f1": 0.71571,
56
+ "f1_weighted": 0.724592,
57
+ "precision": 0.705682,
58
+ "precision_weighted": 0.766852,
59
+ "recall": 0.779016,
60
+ "recall_weighted": 0.731338,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.717552,
66
+ "f1": 0.701985,
67
+ "f1_weighted": 0.716721,
68
+ "precision": 0.69143,
69
+ "precision_weighted": 0.768825,
70
+ "recall": 0.763497,
71
+ "recall_weighted": 0.717552,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.723268,
77
+ "f1": 0.712177,
78
+ "f1_weighted": 0.723512,
79
+ "precision": 0.697841,
80
+ "precision_weighted": 0.767755,
81
+ "recall": 0.774967,
82
+ "recall_weighted": 0.723268,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.709818,
88
+ "f1": 0.703785,
89
+ "f1_weighted": 0.711193,
90
+ "precision": 0.691228,
91
+ "precision_weighted": 0.750309,
92
+ "recall": 0.760539,
93
+ "recall_weighted": 0.709818,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.726967,
99
+ "f1": 0.719436,
100
+ "f1_weighted": 0.723845,
101
+ "precision": 0.704581,
102
+ "precision_weighted": 0.762673,
103
+ "recall": 0.780024,
104
+ "recall_weighted": 0.726967,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.713853,
110
+ "f1": 0.708783,
111
+ "f1_weighted": 0.714044,
112
+ "precision": 0.695919,
113
+ "precision_weighted": 0.753502,
114
+ "recall": 0.769142,
115
+ "recall_weighted": 0.713853,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.727808,
121
+ "f1": 0.71695,
122
+ "f1_weighted": 0.725867,
123
+ "precision": 0.701353,
124
+ "precision_weighted": 0.763667,
125
+ "recall": 0.776875,
126
+ "recall_weighted": 0.727808,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.727808,
130
+ "hf_subset": "en",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 27.185521364212036,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/MedrxivClusteringP2P.json ADDED
@@ -0,0 +1,33 @@
1
+ {
2
+ "dataset_revision": "e7a26af6f3ae46b30dde8737f02c07b1505bcc73",
3
+ "task_name": "MedrxivClusteringP2P",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "v_measure": 0.319608,
9
+ "v_measure_std": 0.013704,
10
+ "v_measures": [
11
+ 0.309803,
12
+ 0.309477,
13
+ 0.309381,
14
+ 0.301519,
15
+ 0.3042,
16
+ 0.327714,
17
+ 0.334626,
18
+ 0.343796,
19
+ 0.328003,
20
+ 0.327557
21
+ ],
22
+ "main_score": 0.319608,
23
+ "hf_subset": "default",
24
+ "languages": [
25
+ "eng-Latn"
26
+ ]
27
+ }
28
+ ]
29
+ },
30
+ "evaluation_time": 100.74085593223572,
31
+ "kg_co2_emissions": null,
32
+ "date": null
33
+ }
results/MedrxivClusteringS2S.json ADDED
@@ -0,0 +1,33 @@
+ {
+ "dataset_revision": "35191c8c0dca72d8ff3efcd72aa802307d469663",
+ "task_name": "MedrxivClusteringS2S",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "v_measure": 0.285863,
+ "v_measure_std": 0.017834,
+ "v_measures": [
+ 0.277872,
+ 0.267676,
+ 0.268935,
+ 0.264688,
+ 0.268748,
+ 0.30648,
+ 0.291251,
+ 0.316271,
+ 0.304149,
+ 0.292562
+ ],
+ "main_score": 0.285863,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 91.14747357368469,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/MindSmallReranking.json ADDED
@@ -0,0 +1,252 @@
+ {
+ "dataset_revision": "227478e3235572039f4f7661840e059f31ef6eb1",
+ "task_name": "MindSmallReranking",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "ndcg_at_1": 0.13289,
+ "ndcg_at_3": 0.20961,
+ "ndcg_at_5": 0.25543,
+ "ndcg_at_10": 0.31778,
+ "ndcg_at_20": 0.37062,
+ "ndcg_at_100": 0.43274,
+ "ndcg_at_1000": 0.43605,
+ "map_at_1": 0.1,
+ "map_at_3": 0.1715,
+ "map_at_5": 0.19884,
+ "map_at_10": 0.22688,
+ "map_at_20": 0.24425,
+ "map_at_100": 0.25767,
+ "map_at_1000": 0.25809,
+ "recall_at_1": 0.1,
+ "recall_at_3": 0.26003,
+ "recall_at_5": 0.36952,
+ "recall_at_10": 0.54558,
+ "recall_at_20": 0.72533,
+ "recall_at_100": 0.98398,
+ "recall_at_1000": 1.0,
+ "accuracy": 0.1,
+ "precision_at_1": 0.13289,
+ "precision_at_3": 0.11654,
+ "precision_at_5": 0.10187,
+ "precision_at_10": 0.07927,
+ "precision_at_20": 0.05635,
+ "precision_at_100": 0.01768,
+ "precision_at_1000": 0.00183,
+ "mrr_at_1": 0.132893,
+ "mrr_at_3": 0.218151,
+ "mrr_at_5": 0.247594,
+ "mrr_at_10": 0.273322,
+ "mrr_at_20": 0.285348,
+ "mrr_at_100": 0.29053,
+ "mrr_at_1000": 0.290573,
+ "nauc_ndcg_at_1_max": -0.081096,
+ "nauc_ndcg_at_1_std": 0.022047,
+ "nauc_ndcg_at_1_diff1": 0.106432,
+ "nauc_ndcg_at_3_max": -0.184694,
+ "nauc_ndcg_at_3_std": -0.005789,
+ "nauc_ndcg_at_3_diff1": 0.128101,
+ "nauc_ndcg_at_5_max": -0.216364,
+ "nauc_ndcg_at_5_std": -0.005153,
+ "nauc_ndcg_at_5_diff1": 0.129003,
+ "nauc_ndcg_at_10_max": -0.248543,
+ "nauc_ndcg_at_10_std": -0.003319,
+ "nauc_ndcg_at_10_diff1": 0.127177,
+ "nauc_ndcg_at_20_max": -0.25796,
+ "nauc_ndcg_at_20_std": 0.001561,
+ "nauc_ndcg_at_20_diff1": 0.124312,
+ "nauc_ndcg_at_100_max": -0.198169,
+ "nauc_ndcg_at_100_std": 0.006065,
+ "nauc_ndcg_at_100_diff1": 0.11918,
+ "nauc_ndcg_at_1000_max": -0.187316,
+ "nauc_ndcg_at_1000_std": 0.004893,
+ "nauc_ndcg_at_1000_diff1": 0.118321,
+ "nauc_map_at_1_max": -0.149328,
+ "nauc_map_at_1_std": -0.005372,
+ "nauc_map_at_1_diff1": 0.127279,
+ "nauc_map_at_3_max": -0.198154,
+ "nauc_map_at_3_std": -0.012539,
+ "nauc_map_at_3_diff1": 0.134628,
+ "nauc_map_at_5_max": -0.213323,
+ "nauc_map_at_5_std": -0.009984,
+ "nauc_map_at_5_diff1": 0.133635,
+ "nauc_map_at_10_max": -0.226266,
+ "nauc_map_at_10_std": -0.007496,
+ "nauc_map_at_10_diff1": 0.131835,
+ "nauc_map_at_20_max": -0.22749,
+ "nauc_map_at_20_std": -0.005076,
+ "nauc_map_at_20_diff1": 0.130237,
+ "nauc_map_at_100_max": -0.21506,
+ "nauc_map_at_100_std": -0.003351,
+ "nauc_map_at_100_diff1": 0.128925,
+ "nauc_map_at_1000_max": -0.21392,
+ "nauc_map_at_1000_std": -0.003461,
+ "nauc_map_at_1000_diff1": 0.128844,
+ "nauc_recall_at_1_max": -0.149328,
+ "nauc_recall_at_1_std": -0.005372,
+ "nauc_recall_at_1_diff1": 0.127279,
+ "nauc_recall_at_3_max": -0.226985,
+ "nauc_recall_at_3_std": -0.018404,
+ "nauc_recall_at_3_diff1": 0.132289,
+ "nauc_recall_at_5_max": -0.26921,
+ "nauc_recall_at_5_std": -0.015077,
+ "nauc_recall_at_5_diff1": 0.128272,
+ "nauc_recall_at_10_max": -0.347051,
+ "nauc_recall_at_10_std": -0.014033,
+ "nauc_recall_at_10_diff1": 0.122558,
+ "nauc_recall_at_20_max": -0.440311,
+ "nauc_recall_at_20_std": -0.004843,
+ "nauc_recall_at_20_diff1": 0.116763,
+ "nauc_recall_at_100_max": -0.763943,
+ "nauc_recall_at_100_std": 0.07822,
+ "nauc_recall_at_100_diff1": 0.131796,
+ "nauc_recall_at_1000_max": -0.598786,
+ "nauc_recall_at_1000_std": -0.022316,
+ "nauc_recall_at_1000_diff1": 0.401447,
+ "nauc_precision_at_1_max": -0.081096,
+ "nauc_precision_at_1_std": 0.022047,
+ "nauc_precision_at_1_diff1": 0.106432,
+ "nauc_precision_at_3_max": -0.141504,
+ "nauc_precision_at_3_std": 0.018994,
+ "nauc_precision_at_3_diff1": 0.105888,
+ "nauc_precision_at_5_max": -0.155582,
+ "nauc_precision_at_5_std": 0.029229,
+ "nauc_precision_at_5_diff1": 0.092438,
+ "nauc_precision_at_10_max": -0.133501,
+ "nauc_precision_at_10_std": 0.042204,
+ "nauc_precision_at_10_diff1": 0.055072,
+ "nauc_precision_at_20_max": -0.036075,
+ "nauc_precision_at_20_std": 0.055399,
+ "nauc_precision_at_20_diff1": 0.004799,
+ "nauc_precision_at_100_max": 0.241571,
+ "nauc_precision_at_100_std": 0.043082,
+ "nauc_precision_at_100_diff1": -0.063059,
+ "nauc_precision_at_1000_max": 0.269487,
+ "nauc_precision_at_1000_std": 0.036656,
+ "nauc_precision_at_1000_diff1": -0.066302,
+ "nauc_mrr_at_1_max": -0.081096,
+ "nauc_mrr_at_1_std": 0.022047,
+ "nauc_mrr_at_1_diff1": 0.106432,
+ "nauc_mrr_at_3_max": -0.127994,
+ "nauc_mrr_at_3_std": 0.011958,
+ "nauc_mrr_at_3_diff1": 0.111549,
+ "nauc_mrr_at_5_max": -0.141091,
+ "nauc_mrr_at_5_std": 0.012583,
+ "nauc_mrr_at_5_diff1": 0.111249,
+ "nauc_mrr_at_10_max": -0.150418,
+ "nauc_mrr_at_10_std": 0.012671,
+ "nauc_mrr_at_10_diff1": 0.110822,
+ "nauc_mrr_at_20_max": -0.150848,
+ "nauc_mrr_at_20_std": 0.013038,
+ "nauc_mrr_at_20_diff1": 0.110722,
+ "nauc_mrr_at_100_max": -0.147291,
+ "nauc_mrr_at_100_std": 0.013155,
+ "nauc_mrr_at_100_diff1": 0.110815,
+ "nauc_mrr_at_1000_max": -0.147192,
+ "nauc_mrr_at_1000_std": 0.013141,
+ "nauc_mrr_at_1000_diff1": 0.110811,
+ "hit_rate_at_1": 0.13289,
+ "hit_rate_at_3": 0.33093,
+ "hit_rate_at_5": 0.46052,
+ "hit_rate_at_10": 0.65304,
+ "hit_rate_at_20": 0.82379,
+ "hit_rate_at_100": 0.99464,
+ "hit_rate_at_1000": 1.0,
+ "max_over_subqueries_ndcg_at_1": 0.16352,
+ "max_over_subqueries_ndcg_at_3": 0.26308,
+ "max_over_subqueries_ndcg_at_5": 0.3138,
+ "max_over_subqueries_ndcg_at_10": 0.37595,
+ "max_over_subqueries_ndcg_at_20": 0.42204,
+ "max_over_subqueries_ndcg_at_100": 0.46764,
+ "max_over_subqueries_ndcg_at_1000": 0.46943,
+ "max_over_subqueries_map_at_1": 0.13468,
+ "max_over_subqueries_map_at_3": 0.22322,
+ "max_over_subqueries_map_at_5": 0.253,
+ "max_over_subqueries_map_at_10": 0.28089,
+ "max_over_subqueries_map_at_20": 0.29587,
+ "max_over_subqueries_map_at_100": 0.30532,
+ "max_over_subqueries_map_at_1000": 0.30552,
+ "max_over_subqueries_recall_at_1": 0.13468,
+ "max_over_subqueries_recall_at_3": 0.33169,
+ "max_over_subqueries_recall_at_5": 0.45161,
+ "max_over_subqueries_recall_at_10": 0.62924,
+ "max_over_subqueries_recall_at_20": 0.79068,
+ "max_over_subqueries_recall_at_100": 0.99049,
+ "max_over_subqueries_recall_at_1000": 0.99999,
+ "max_over_subqueries_accuracy": 0.13468,
+ "max_over_subqueries_precision_at_1": 0.16352,
+ "max_over_subqueries_precision_at_3": 0.13625,
+ "max_over_subqueries_precision_at_5": 0.11394,
+ "max_over_subqueries_precision_at_10": 0.08279,
+ "max_over_subqueries_precision_at_20": 0.05464,
+ "max_over_subqueries_precision_at_100": 0.01495,
+ "max_over_subqueries_precision_at_1000": 0.00152,
+ "max_over_subqueries_mrr_at_1_max": -0.092325,
+ "max_over_subqueries_mrr_at_1_std": 0.020512,
+ "max_over_subqueries_mrr_at_1_diff1": 0.12738,
+ "max_over_subqueries_mrr_at_3_max": -0.168712,
+ "max_over_subqueries_mrr_at_3_std": -0.002239,
+ "max_over_subqueries_mrr_at_3_diff1": 0.117414,
+ "max_over_subqueries_mrr_at_5_max": -0.167807,
+ "max_over_subqueries_mrr_at_5_std": 0.00871,
+ "max_over_subqueries_mrr_at_5_diff1": 0.097714,
+ "max_over_subqueries_mrr_at_10_max": -0.10325,
+ "max_over_subqueries_mrr_at_10_std": 0.024524,
+ "max_over_subqueries_mrr_at_10_diff1": 0.050076,
+ "max_over_subqueries_mrr_at_20_max": 0.05542,
+ "max_over_subqueries_mrr_at_20_std": 0.072283,
+ "max_over_subqueries_mrr_at_20_diff1": 0.004324,
+ "max_over_subqueries_mrr_at_100_max": 0.378665,
+ "max_over_subqueries_mrr_at_100_std": 0.145863,
+ "max_over_subqueries_mrr_at_100_diff1": -0.035885,
+ "max_over_subqueries_mrr_at_1000_max": 0.399349,
+ "max_over_subqueries_mrr_at_1000_std": 0.142234,
+ "max_over_subqueries_mrr_at_1000_diff1": -0.036702,
+ "max_over_subqueries_mrr_at_1": 0.163523,
+ "max_over_subqueries_mrr_at_3": 0.262332,
+ "max_over_subqueries_mrr_at_5": 0.292531,
+ "max_over_subqueries_mrr_at_10": 0.317226,
+ "max_over_subqueries_mrr_at_20": 0.327661,
+ "max_over_subqueries_mrr_at_100": 0.331954,
+ "max_over_subqueries_mrr_at_1000": 0.331988,
+ "max_over_subqueries_nauc_mrr_at_1_max": -0.092325,
+ "max_over_subqueries_nauc_mrr_at_1_std": 0.020512,
+ "max_over_subqueries_nauc_mrr_at_1_diff1": 0.12738,
+ "max_over_subqueries_nauc_mrr_at_3_max": -0.148287,
+ "max_over_subqueries_nauc_mrr_at_3_std": -6.1e-05,
+ "max_over_subqueries_nauc_mrr_at_3_diff1": 0.129012,
+ "max_over_subqueries_nauc_mrr_at_5_max": -0.158819,
+ "max_over_subqueries_nauc_mrr_at_5_std": -0.000714,
+ "max_over_subqueries_nauc_mrr_at_5_diff1": 0.127788,
+ "max_over_subqueries_nauc_mrr_at_10_max": -0.164296,
+ "max_over_subqueries_nauc_mrr_at_10_std": -0.002603,
+ "max_over_subqueries_nauc_mrr_at_10_diff1": 0.126336,
+ "max_over_subqueries_nauc_mrr_at_20_max": -0.163048,
+ "max_over_subqueries_nauc_mrr_at_20_std": -0.0018,
+ "max_over_subqueries_nauc_mrr_at_20_diff1": 0.126687,
+ "max_over_subqueries_nauc_mrr_at_100_max": -0.160286,
+ "max_over_subqueries_nauc_mrr_at_100_std": -0.001026,
+ "max_over_subqueries_nauc_mrr_at_100_diff1": 0.127016,
+ "max_over_subqueries_nauc_mrr_at_1000_max": -0.160227,
+ "max_over_subqueries_nauc_mrr_at_1000_std": -0.00103,
+ "max_over_subqueries_nauc_mrr_at_1000_diff1": 0.127017,
+ "max_over_subqueries_hit_rate_at_1": 0.16352,
+ "max_over_subqueries_hit_rate_at_3": 0.39068,
+ "max_over_subqueries_hit_rate_at_5": 0.52344,
+ "max_over_subqueries_hit_rate_at_10": 0.70731,
+ "max_over_subqueries_hit_rate_at_20": 0.85486,
+ "max_over_subqueries_hit_rate_at_100": 0.99586,
+ "max_over_subqueries_hit_rate_at_1000": 0.99999,
+ "main_score": 0.30552,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 6225.677058696747,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/NFCorpus.json ADDED
@@ -0,0 +1,167 @@
+ {
+ "dataset_revision": "ec0fa4fe99da2ff19ca1214b7966684033a58814",
+ "task_name": "NFCorpus",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "ndcg_at_1": 0.37307,
+ "ndcg_at_3": 0.34244,
+ "ndcg_at_5": 0.32082,
+ "ndcg_at_10": 0.30117,
+ "ndcg_at_20": 0.28224,
+ "ndcg_at_100": 0.28017,
+ "ndcg_at_1000": 0.36998,
+ "map_at_1": 0.04024,
+ "map_at_3": 0.07388,
+ "map_at_5": 0.08641,
+ "map_at_10": 0.10347,
+ "map_at_20": 0.1155,
+ "map_at_100": 0.13404,
+ "map_at_1000": 0.14871,
+ "recall_at_1": 0.04024,
+ "recall_at_3": 0.08984,
+ "recall_at_5": 0.10928,
+ "recall_at_10": 0.1429,
+ "recall_at_20": 0.18095,
+ "recall_at_100": 0.29105,
+ "recall_at_1000": 0.60654,
+ "accuracy": 0.04024,
+ "precision_at_1": 0.39319,
+ "precision_at_3": 0.32611,
+ "precision_at_5": 0.2805,
+ "precision_at_10": 0.2322,
+ "precision_at_20": 0.1743,
+ "precision_at_100": 0.07752,
+ "precision_at_1000": 0.02071,
+ "mrr_at_1": 0.396285,
+ "mrr_at_3": 0.46646,
+ "mrr_at_5": 0.475593,
+ "mrr_at_10": 0.484588,
+ "mrr_at_20": 0.488838,
+ "mrr_at_100": 0.491774,
+ "mrr_at_1000": 0.492301,
+ "nauc_ndcg_at_1_max": 0.475653,
+ "nauc_ndcg_at_1_std": 0.248391,
+ "nauc_ndcg_at_1_diff1": 0.253194,
+ "nauc_ndcg_at_3_max": 0.461832,
+ "nauc_ndcg_at_3_std": 0.2457,
+ "nauc_ndcg_at_3_diff1": 0.147745,
+ "nauc_ndcg_at_5_max": 0.478577,
+ "nauc_ndcg_at_5_std": 0.26071,
+ "nauc_ndcg_at_5_diff1": 0.145068,
+ "nauc_ndcg_at_10_max": 0.48321,
+ "nauc_ndcg_at_10_std": 0.291876,
+ "nauc_ndcg_at_10_diff1": 0.148375,
+ "nauc_ndcg_at_20_max": 0.476063,
+ "nauc_ndcg_at_20_std": 0.304541,
+ "nauc_ndcg_at_20_diff1": 0.151692,
+ "nauc_ndcg_at_100_max": 0.491943,
+ "nauc_ndcg_at_100_std": 0.336851,
+ "nauc_ndcg_at_100_diff1": 0.181995,
+ "nauc_ndcg_at_1000_max": 0.521277,
+ "nauc_ndcg_at_1000_std": 0.392306,
+ "nauc_ndcg_at_1000_diff1": 0.165772,
+ "nauc_map_at_1_max": 0.28237,
+ "nauc_map_at_1_std": -0.039818,
+ "nauc_map_at_1_diff1": 0.533464,
+ "nauc_map_at_3_max": 0.241516,
+ "nauc_map_at_3_std": -0.038714,
+ "nauc_map_at_3_diff1": 0.350527,
+ "nauc_map_at_5_max": 0.276295,
+ "nauc_map_at_5_std": -0.005185,
+ "nauc_map_at_5_diff1": 0.32186,
+ "nauc_map_at_10_max": 0.309433,
+ "nauc_map_at_10_std": 0.030775,
+ "nauc_map_at_10_diff1": 0.284326,
+ "nauc_map_at_20_max": 0.33109,
+ "nauc_map_at_20_std": 0.059453,
+ "nauc_map_at_20_diff1": 0.272491,
+ "nauc_map_at_100_max": 0.378686,
+ "nauc_map_at_100_std": 0.142614,
+ "nauc_map_at_100_diff1": 0.253981,
+ "nauc_map_at_1000_max": 0.393113,
+ "nauc_map_at_1000_std": 0.186464,
+ "nauc_map_at_1000_diff1": 0.238002,
+ "nauc_recall_at_1_max": 0.28237,
+ "nauc_recall_at_1_std": -0.039818,
+ "nauc_recall_at_1_diff1": 0.533464,
+ "nauc_recall_at_3_max": 0.187927,
+ "nauc_recall_at_3_std": -0.05598,
+ "nauc_recall_at_3_diff1": 0.265839,
+ "nauc_recall_at_5_max": 0.234823,
+ "nauc_recall_at_5_std": -0.006788,
+ "nauc_recall_at_5_diff1": 0.254623,
+ "nauc_recall_at_10_max": 0.270799,
+ "nauc_recall_at_10_std": 0.040193,
+ "nauc_recall_at_10_diff1": 0.222662,
+ "nauc_recall_at_20_max": 0.269393,
+ "nauc_recall_at_20_std": 0.060182,
+ "nauc_recall_at_20_diff1": 0.192001,
+ "nauc_recall_at_100_max": 0.319542,
+ "nauc_recall_at_100_std": 0.215602,
+ "nauc_recall_at_100_diff1": 0.141138,
+ "nauc_recall_at_1000_max": 0.287135,
+ "nauc_recall_at_1000_std": 0.309128,
+ "nauc_recall_at_1000_diff1": 0.068053,
+ "nauc_precision_at_1_max": 0.473818,
+ "nauc_precision_at_1_std": 0.244862,
+ "nauc_precision_at_1_diff1": 0.255377,
+ "nauc_precision_at_3_max": 0.453339,
+ "nauc_precision_at_3_std": 0.272983,
+ "nauc_precision_at_3_diff1": 0.067804,
+ "nauc_precision_at_5_max": 0.472984,
+ "nauc_precision_at_5_std": 0.312177,
+ "nauc_precision_at_5_diff1": 0.033883,
+ "nauc_precision_at_10_max": 0.460687,
+ "nauc_precision_at_10_std": 0.370329,
+ "nauc_precision_at_10_diff1": 0.009234,
+ "nauc_precision_at_20_max": 0.431938,
+ "nauc_precision_at_20_std": 0.416193,
+ "nauc_precision_at_20_diff1": -0.019298,
+ "nauc_precision_at_100_max": 0.353294,
+ "nauc_precision_at_100_std": 0.509973,
+ "nauc_precision_at_100_diff1": -0.061963,
+ "nauc_precision_at_1000_max": 0.226786,
+ "nauc_precision_at_1000_std": 0.422205,
+ "nauc_precision_at_1000_diff1": -0.105499,
+ "nauc_mrr_at_1_max": 0.482932,
+ "nauc_mrr_at_1_std": 0.245975,
+ "nauc_mrr_at_1_diff1": 0.24646,
+ "nauc_mrr_at_3_max": 0.497902,
+ "nauc_mrr_at_3_std": 0.295904,
+ "nauc_mrr_at_3_diff1": 0.191546,
+ "nauc_mrr_at_5_max": 0.507897,
+ "nauc_mrr_at_5_std": 0.298306,
+ "nauc_mrr_at_5_diff1": 0.2045,
+ "nauc_mrr_at_10_max": 0.516799,
+ "nauc_mrr_at_10_std": 0.307095,
+ "nauc_mrr_at_10_diff1": 0.20988,
+ "nauc_mrr_at_20_max": 0.517387,
+ "nauc_mrr_at_20_std": 0.309826,
+ "nauc_mrr_at_20_diff1": 0.205706,
+ "nauc_mrr_at_100_max": 0.518001,
+ "nauc_mrr_at_100_std": 0.311274,
+ "nauc_mrr_at_100_diff1": 0.205342,
+ "nauc_mrr_at_1000_max": 0.517538,
+ "nauc_mrr_at_1000_std": 0.310776,
+ "nauc_mrr_at_1000_diff1": 0.205633,
+ "hit_rate_at_1": 0.39319,
+ "hit_rate_at_3": 0.54799,
+ "hit_rate_at_5": 0.58824,
+ "hit_rate_at_10": 0.65635,
+ "hit_rate_at_20": 0.71827,
+ "hit_rate_at_100": 0.82353,
+ "hit_rate_at_1000": 0.94118,
+ "main_score": 0.30117,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 12.046990871429443,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/NQ.json ADDED
@@ -0,0 +1,167 @@
+ {
+ "dataset_revision": "b774495ed302d8c44a3a7ea25c90dbce03968f31",
+ "task_name": "NQ",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "ndcg_at_1": 0.29925,
+ "ndcg_at_3": 0.39497,
+ "ndcg_at_5": 0.4312,
+ "ndcg_at_10": 0.46718,
+ "ndcg_at_20": 0.48805,
+ "ndcg_at_100": 0.51555,
+ "ndcg_at_1000": 0.52682,
+ "map_at_1": 0.2675,
+ "map_at_3": 0.36018,
+ "map_at_5": 0.38202,
+ "map_at_10": 0.39829,
+ "map_at_20": 0.40466,
+ "map_at_100": 0.40894,
+ "map_at_1000": 0.40941,
+ "recall_at_1": 0.2675,
+ "recall_at_3": 0.46599,
+ "recall_at_5": 0.5491,
+ "recall_at_10": 0.65397,
+ "recall_at_20": 0.73103,
+ "recall_at_100": 0.87027,
+ "recall_at_1000": 0.95401,
+ "accuracy": 0.2675,
+ "precision_at_1": 0.29925,
+ "precision_at_3": 0.17903,
+ "precision_at_5": 0.12839,
+ "precision_at_10": 0.07738,
+ "precision_at_20": 0.04364,
+ "precision_at_100": 0.01049,
+ "precision_at_1000": 0.00116,
+ "mrr_at_1": 0.299247,
+ "mrr_at_3": 0.390015,
+ "mrr_at_5": 0.408628,
+ "mrr_at_10": 0.422079,
+ "mrr_at_20": 0.427242,
+ "mrr_at_100": 0.430589,
+ "mrr_at_1000": 0.430944,
+ "nauc_ndcg_at_1_max": 0.274179,
+ "nauc_ndcg_at_1_std": 0.008329,
+ "nauc_ndcg_at_1_diff1": 0.3698,
+ "nauc_ndcg_at_3_max": 0.281464,
+ "nauc_ndcg_at_3_std": 0.010636,
+ "nauc_ndcg_at_3_diff1": 0.315137,
+ "nauc_ndcg_at_5_max": 0.291231,
+ "nauc_ndcg_at_5_std": 0.017749,
+ "nauc_ndcg_at_5_diff1": 0.318275,
+ "nauc_ndcg_at_10_max": 0.306565,
+ "nauc_ndcg_at_10_std": 0.034593,
+ "nauc_ndcg_at_10_diff1": 0.321969,
+ "nauc_ndcg_at_20_max": 0.311039,
+ "nauc_ndcg_at_20_std": 0.045824,
+ "nauc_ndcg_at_20_diff1": 0.322021,
+ "nauc_ndcg_at_100_max": 0.316535,
+ "nauc_ndcg_at_100_std": 0.058909,
+ "nauc_ndcg_at_100_diff1": 0.321677,
+ "nauc_ndcg_at_1000_max": 0.31242,
+ "nauc_ndcg_at_1000_std": 0.053254,
+ "nauc_ndcg_at_1000_diff1": 0.321267,
+ "nauc_map_at_1_max": 0.250967,
+ "nauc_map_at_1_std": -0.01364,
+ "nauc_map_at_1_diff1": 0.371368,
+ "nauc_map_at_3_max": 0.273661,
+ "nauc_map_at_3_std": 0.000658,
+ "nauc_map_at_3_diff1": 0.327572,
+ "nauc_map_at_5_max": 0.280928,
+ "nauc_map_at_5_std": 0.005325,
+ "nauc_map_at_5_diff1": 0.330102,
+ "nauc_map_at_10_max": 0.287339,
+ "nauc_map_at_10_std": 0.012743,
+ "nauc_map_at_10_diff1": 0.331855,
+ "nauc_map_at_20_max": 0.288623,
+ "nauc_map_at_20_std": 0.016084,
+ "nauc_map_at_20_diff1": 0.331954,
+ "nauc_map_at_100_max": 0.289777,
+ "nauc_map_at_100_std": 0.01841,
+ "nauc_map_at_100_diff1": 0.33161,
+ "nauc_map_at_1000_max": 0.289639,
+ "nauc_map_at_1000_std": 0.018246,
+ "nauc_map_at_1000_diff1": 0.331604,
+ "nauc_recall_at_1_max": 0.250967,
+ "nauc_recall_at_1_std": -0.01364,
+ "nauc_recall_at_1_diff1": 0.371368,
+ "nauc_recall_at_3_max": 0.280043,
+ "nauc_recall_at_3_std": 0.016089,
+ "nauc_recall_at_3_diff1": 0.272695,
+ "nauc_recall_at_5_max": 0.297242,
+ "nauc_recall_at_5_std": 0.02949,
+ "nauc_recall_at_5_diff1": 0.275279,
+ "nauc_recall_at_10_max": 0.349979,
+ "nauc_recall_at_10_std": 0.084632,
+ "nauc_recall_at_10_diff1": 0.28052,
+ "nauc_recall_at_20_max": 0.382423,
+ "nauc_recall_at_20_std": 0.146651,
+ "nauc_recall_at_20_diff1": 0.27823,
+ "nauc_recall_at_100_max": 0.498437,
+ "nauc_recall_at_100_std": 0.363445,
+ "nauc_recall_at_100_diff1": 0.257089,
+ "nauc_recall_at_1000_max": 0.642525,
+ "nauc_recall_at_1000_std": 0.619219,
+ "nauc_recall_at_1000_diff1": 0.144255,
+ "nauc_precision_at_1_max": 0.274179,
+ "nauc_precision_at_1_std": 0.008329,
+ "nauc_precision_at_1_diff1": 0.3698,
+ "nauc_precision_at_3_max": 0.301649,
+ "nauc_precision_at_3_std": 0.051351,
+ "nauc_precision_at_3_diff1": 0.244373,
+ "nauc_precision_at_5_max": 0.304441,
+ "nauc_precision_at_5_std": 0.076308,
+ "nauc_precision_at_5_diff1": 0.224232,
+ "nauc_precision_at_10_max": 0.313808,
+ "nauc_precision_at_10_std": 0.126563,
+ "nauc_precision_at_10_diff1": 0.195936,
+ "nauc_precision_at_20_max": 0.297482,
+ "nauc_precision_at_20_std": 0.170993,
+ "nauc_precision_at_20_diff1": 0.150609,
+ "nauc_precision_at_100_max": 0.264982,
+ "nauc_precision_at_100_std": 0.255758,
+ "nauc_precision_at_100_diff1": 0.053377,
+ "nauc_precision_at_1000_max": 0.163476,
+ "nauc_precision_at_1000_std": 0.217805,
+ "nauc_precision_at_1000_diff1": -0.034564,
+ "nauc_mrr_at_1_max": 0.274179,
+ "nauc_mrr_at_1_std": 0.008329,
+ "nauc_mrr_at_1_diff1": 0.3698,
+ "nauc_mrr_at_3_max": 0.28824,
+ "nauc_mrr_at_3_std": 0.024325,
+ "nauc_mrr_at_3_diff1": 0.326124,
+ "nauc_mrr_at_5_max": 0.292101,
+ "nauc_mrr_at_5_std": 0.027477,
+ "nauc_mrr_at_5_diff1": 0.327826,
+ "nauc_mrr_at_10_max": 0.296935,
+ "nauc_mrr_at_10_std": 0.033023,
+ "nauc_mrr_at_10_diff1": 0.329649,
+ "nauc_mrr_at_20_max": 0.297432,
+ "nauc_mrr_at_20_std": 0.034817,
+ "nauc_mrr_at_20_diff1": 0.329688,
+ "nauc_mrr_at_100_max": 0.297651,
+ "nauc_mrr_at_100_std": 0.035628,
+ "nauc_mrr_at_100_diff1": 0.329859,
+ "nauc_mrr_at_1000_max": 0.297493,
+ "nauc_mrr_at_1000_std": 0.035405,
+ "nauc_mrr_at_1000_diff1": 0.329851,
+ "hit_rate_at_1": 0.29925,
+ "hit_rate_at_3": 0.50637,
+ "hit_rate_at_5": 0.58749,
+ "hit_rate_at_10": 0.68743,
+ "hit_rate_at_20": 0.76014,
+ "hit_rate_at_100": 0.88905,
+ "hit_rate_at_1000": 0.96176,
+ "main_score": 0.46718,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 6630.378222703934,
+ "kg_co2_emissions": null,
+ "date": null
+ }