Snorkeler committed
Commit cb607f7 · verified · 1 Parent(s): a91a9ae

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
  "word_embedding_dimension": 768,
  "pooling_mode_cls_token": true,
  "pooling_mode_mean_tokens": false,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
README.md ADDED
@@ -0,0 +1,894 @@
---
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:150
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: BAAI/bge-base-en-v1.5
widget:
- source_sentence: Worldwide Sales Change By Business SegmentOrganic salesAcquisitionsDivestituresTranslationTotal
    sales changeSafety and Industrial1.0 % % %(4.2) %(3.2) %Transportation and Electronics1.2
    (0.5)(4.6)(3.9)Health Care3.2 (1.4)(3.8)(2.0)Consumer(0.9) (0.4)(2.6)(3.9)Total
    Company1.2 (0.5)(3.9)(3.2)
  sentences:
  - Has MGM Resorts paid dividends to common shareholders in FY2022?
  - If we exclude the impact of M&A, which segment has dragged down 3M's overall growth
    in 2022?
  - In 2022 Q2, which of JPM's business segments had the highest net income?
- source_sentence: 'Table of ContentsConsolidated Statement of IncomePepsiCo, Inc.
    and SubsidiariesFiscal years ended December 31, 2022, December 25, 2021 and December
    26, 2020(in millions except per share amounts)202220212020Net Revenue$86,392 $79,474
    $70,372 Cost of sales40,576 37,075 31,797 Gross profit45,816 42,399 38,575 Selling,
    general and administrative expenses34,459 31,237 28,453 Gain associated with the
    Juice Transaction (see Note 13)(3,321) Impairment of intangible assets (see Notes
    1 and 4)3,166 42 Operating Profit11,512 11,162 10,080 Other pension and retiree
    medical benefits income132 522 117 Net interest expense and other(939)(1,863)(1,128)Income
    before income taxes10,705 9,821 9,069 Provision for income taxes1,727 2,142 1,894
    Net income8,978 7,679 7,175 Less: Net income attributable to noncontrolling interests68
    61 55 Net Income Attributable to PepsiCo$8,910 $7,618 $7,120 Net Income Attributable
    to PepsiCo per Common ShareBasic$6.45 $5.51 $5.14 Diluted$6.42 $5.49 $5.12 Weighted-average
    common shares outstandingBasic1,380 1,382 1,385 Diluted1,387 1,389 1,392 See accompanying
    notes to the consolidated financial statements.60'
  sentences:
  - What is Amcor's year end FY2020 net AR (in USD millions)? Address the question
    by adopting the perspective of a financial analyst who can only use the details
    shown within the balance sheet.
  - What is the FY2022 unadjusted EBITDA less capex for PepsiCo? Define unadjusted
    EBITDA as unadjusted operating income + depreciation and amortization [from cash
    flow statement]. Answer in USD millions. Respond to the question by assuming the
    perspective of an investment analyst who can only use the details shown within
    the statement of cash flows and the income statement.
  - By how much did Pepsico increase its unsecured five year revolving credit agreement
    on May 26, 2023?
- source_sentence: Lockheed Martin CorporationConsolidated Statements of Earnings(in
    millions, except per share data) Years Ended December 31,202220212020Net salesProducts$
    55,466 $ 56,435 $ 54,928 Services 10,518 10,609 10,470 Total net sales 65,984
    67,044 65,398 Cost of salesProducts (49,577) (50,273) (48,996) Services (9,280)
    (9,463) (9,371) Severance and other charges (100) (36) (27) Other unallocated,
    net 1,260 1,789 1,650 Total cost of sales (57,697) (57,983) (56,744) Gross profit
    8,287 9,061 8,654 Other income (expense), net 61 62 (10) Operating profit 8,348
    9,123 8,644 Interest expense (623) (569) (591) Non-service FAS pension (expense)
    income (971) (1,292) 219 Other non-operating (expense) income, net (74) 288 (37)
    Earnings from continuing operations before income taxes 6,680 7,550 8,235 Income
    tax expense (948) (1,235) (1,347) Net earnings from continuing operations 5,732
    6,315 6,888 Net loss from discontinued operations (55) Net earnings$ 5,732 $
    6,315 $ 6,833 Earnings (loss) per common shareBasicContinuing operations$ 21.74
    $ 22.85 $ 24.60 Discontinued operations (0.20) Basic earnings per common share$
    21.74 $ 22.85 $ 24.40 DilutedContinuing operations$ 21.66 $ 22.76 $ 24.50 Discontinued
    operations (0.20) Diluted earnings per common share$ 21.66 $ 22.76 $ 24.30 The
    accompanying notes are an integral part of these consolidated financial statements.Table
    of Contents 63
  sentences:
  - As of Q2'2023, is Pfizer spinning off any large business segments?
  - What is Lockheed Martin's 2 year total revenue CAGR from FY2020 to FY2022 (in
    units of percents and round to one decimal place)? Provide a response to the question
    by primarily using the statement of income.
  - What are the geographies that Pepsico primarily operates in as of FY2022?
- source_sentence: 'The Kraft Heinz CompanyConsolidated Statements of Income(in millions,
    except per share data) December 28, 2019 December 29, 2018 December 30, 2017Net
    sales$24,977 $26,268 $26,076Cost of products sold16,830 17,347 17,043Gross profit8,147
    8,921 9,033Selling, general and administrative expenses, excluding impairment
    losses3,178 3,190 2,927Goodwill impairment losses1,197 7,008 Intangible asset
    impairment losses702 8,928 49Selling, general and administrative expenses5,077
    19,126 2,976Operating income/(loss)3,070 (10,205) 6,057Interest expense1,361 1,284
    1,234Other expense/(income)(952) (168) (627)Income/(loss) before income taxes2,661
    (11,321) 5,450Provision for/(benefit from) income taxes728 (1,067) (5,482)Net
    income/(loss)1,933 (10,254) 10,932Net income/(loss) attributable to noncontrolling
    interest(2) (62) (9)Net income/(loss) attributable to common shareholders$1,935
    $(10,192) $10,941Per share data applicable to common shareholders: Basic earnings/(loss)$1.59
    $(8.36) $8.98Diluted earnings/(loss)1.58 (8.36) 8.91See accompanying notes to
    the consolidated financial statements.45'
  sentences:
  - What drove gross margin change as of the FY2022 for American Express? If gross
    margin is not a useful metric for a company like this, then please state that
    and explain why.
  - How much was the Real change in Sales for AMCOR in FY 2023 vs FY 2022, if we exclude
    the impact of FX movement, passthrough costs and one-off items?
  - 'What is Kraft Heinz''s FY2019 inventory turnover ratio? Inventory turnover ratio
    is defined as: (FY2019 COGS) / (average inventory between FY2018 and FY2019).
    Round your answer to two decimal places. Please base your judgments on the information
    provided primarily in the balance sheet and the P&L statement.'
- source_sentence: 3M Company and SubsidiariesConsolidated Statement of IncomeYears
    ended December 31(Millions, except per share amounts)202220212020Net sales$34,229
    $35,355 $32,184
  sentences:
  - Is 3M a capital-intensive business based on FY2022 data?
  - What is Amazon's year-over-year change in revenue from FY2016 to FY2017 (in units
    of percents and round to one decimal place)? Calculate what was asked by utilizing
    the line items clearly shown in the statement of income.
  - Among all of the derivative instruments that Verizon used to manage the exposure
    to fluctuations of foreign currencies exchange rates or interest rates, which
    one had the highest notional value in FY 2021?
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: BGE Base - FinBench Finetuned
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 768
      type: dim_768
    metrics:
    - type: cosine_accuracy@1
      value: 0.8933333333333333
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 1.0
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 1.0
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.8933333333333333
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.33333333333333326
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.19999999999999996
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09999999999999998
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.8933333333333333
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 1.0
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 1.0
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9588867770000028
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.9444444444444444
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.9444444444444445
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 512
      type: dim_512
    metrics:
    - type: cosine_accuracy@1
      value: 0.8866666666666667
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 1.0
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 1.0
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.8866666666666667
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.33333333333333326
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.19999999999999996
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09999999999999998
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.8866666666666667
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 1.0
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 1.0
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9572991737142889
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.9422222222222221
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.9422222222222223
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 256
      type: dim_256
    metrics:
    - type: cosine_accuracy@1
      value: 0.9133333333333333
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 1.0
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 1.0
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.9133333333333333
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.33333333333333326
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.19999999999999996
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09999999999999998
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.9133333333333333
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 1.0
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 1.0
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9671410469523832
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.9555555555555554
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.9555555555555556
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 128
      type: dim_128
    metrics:
    - type: cosine_accuracy@1
      value: 0.9266666666666666
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 1.0
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 1.0
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.9266666666666666
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.33333333333333326
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.19999999999999996
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09999999999999998
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.9266666666666666
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 1.0
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 1.0
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9720619835714305
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.9622222222222221
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.9622222222222223
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 64
      type: dim_64
    metrics:
    - type: cosine_accuracy@1
      value: 0.94
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 1.0
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 1.0
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.94
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.33333333333333326
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.19999999999999996
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09999999999999998
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.94
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 1.0
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 1.0
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9769829201904777
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.9688888888888888
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.9688888888888889
      name: Cosine Map@100
---

# BGE Base - FinBench Finetuned

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
    - json
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Snorkeler/BGE-Finetuned-FinBench")
# Run inference
sentences = [
    '3M Company and SubsidiariesConsolidated Statement of IncomeYears ended December 31(Millions, except per share amounts)202220212020Net sales$34,229 $35,355 $32,184',
    'Is 3M a capital-intensive business based on FY2022 data?',
    'Among all of the derivative instruments that Verizon used to manage the exposure to fluctuations of foreign currencies exchange rates or interest rates, which one had the highest notional value in FY 2021?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
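
Because the architecture ends with a `Normalize()` layer, every embedding this model emits is unit-length, so cosine similarity reduces to a plain dot product. The following pure-Python sketch illustrates that identity with toy vectors (it does not use the model itself):

```python
import math

def normalize(v):
    """Scale a vector to unit length, as the model's Normalize() layer does."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cosine(a, b):
    """For unit vectors, cosine similarity is simply the dot product."""
    return sum(x * y for x, y in zip(a, b))

a = normalize([1.0, 2.0, 2.0])
b = normalize([2.0, 4.0, 4.0])   # same direction as a
c = normalize([-2.0, 1.0, 0.0])  # orthogonal to a

print(round(cosine(a, b), 6))  # parallel vectors -> 1.0
print(round(cosine(a, c), 6))  # orthogonal vectors -> 0.0
```

This is why `model.similarity` with the default cosine function behaves like a matrix product of the encoded embeddings.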

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Information Retrieval
* Dataset: `dim_768`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.8933     |
| cosine_accuracy@3   | 1.0        |
| cosine_accuracy@5   | 1.0        |
| cosine_accuracy@10  | 1.0        |
| cosine_precision@1  | 0.8933     |
| cosine_precision@3  | 0.3333     |
| cosine_precision@5  | 0.2        |
| cosine_precision@10 | 0.1        |
| cosine_recall@1     | 0.8933     |
| cosine_recall@3     | 1.0        |
| cosine_recall@5     | 1.0        |
| cosine_recall@10    | 1.0        |
| cosine_ndcg@10      | 0.9589     |
| cosine_mrr@10       | 0.9444     |
| **cosine_map@100**  | **0.9444** |

#### Information Retrieval
* Dataset: `dim_512`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.8867     |
| cosine_accuracy@3   | 1.0        |
| cosine_accuracy@5   | 1.0        |
| cosine_accuracy@10  | 1.0        |
| cosine_precision@1  | 0.8867     |
| cosine_precision@3  | 0.3333     |
| cosine_precision@5  | 0.2        |
| cosine_precision@10 | 0.1        |
| cosine_recall@1     | 0.8867     |
| cosine_recall@3     | 1.0        |
| cosine_recall@5     | 1.0        |
| cosine_recall@10    | 1.0        |
| cosine_ndcg@10      | 0.9573     |
| cosine_mrr@10       | 0.9422     |
| **cosine_map@100**  | **0.9422** |

#### Information Retrieval
* Dataset: `dim_256`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.9133     |
| cosine_accuracy@3   | 1.0        |
| cosine_accuracy@5   | 1.0        |
| cosine_accuracy@10  | 1.0        |
| cosine_precision@1  | 0.9133     |
| cosine_precision@3  | 0.3333     |
| cosine_precision@5  | 0.2        |
| cosine_precision@10 | 0.1        |
| cosine_recall@1     | 0.9133     |
| cosine_recall@3     | 1.0        |
| cosine_recall@5     | 1.0        |
| cosine_recall@10    | 1.0        |
| cosine_ndcg@10      | 0.9671     |
| cosine_mrr@10       | 0.9556     |
| **cosine_map@100**  | **0.9556** |

#### Information Retrieval
* Dataset: `dim_128`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.9267     |
| cosine_accuracy@3   | 1.0        |
| cosine_accuracy@5   | 1.0        |
| cosine_accuracy@10  | 1.0        |
| cosine_precision@1  | 0.9267     |
| cosine_precision@3  | 0.3333     |
| cosine_precision@5  | 0.2        |
| cosine_precision@10 | 0.1        |
| cosine_recall@1     | 0.9267     |
| cosine_recall@3     | 1.0        |
| cosine_recall@5     | 1.0        |
| cosine_recall@10    | 1.0        |
| cosine_ndcg@10      | 0.9721     |
| cosine_mrr@10       | 0.9622     |
| **cosine_map@100**  | **0.9622** |

#### Information Retrieval
* Dataset: `dim_64`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.94       |
| cosine_accuracy@3   | 1.0        |
| cosine_accuracy@5   | 1.0        |
| cosine_accuracy@10  | 1.0        |
| cosine_precision@1  | 0.94       |
| cosine_precision@3  | 0.3333     |
| cosine_precision@5  | 0.2        |
| cosine_precision@10 | 0.1        |
| cosine_recall@1     | 0.94       |
| cosine_recall@3     | 1.0        |
| cosine_recall@5     | 1.0        |
| cosine_recall@10    | 1.0        |
| cosine_ndcg@10      | 0.977      |
| cosine_mrr@10       | 0.9689     |
| **cosine_map@100**  | **0.9689** |

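
The precision@k and recall@k patterns in the tables above follow directly from the evaluation setup: each query has exactly one relevant document, so once that document appears in the top k, recall@k is 1 and precision@k is 1/k. A small sketch of that arithmetic, using hypothetical relevance flags rather than the actual evaluator:

```python
def precision_at_k(relevant_flags, k):
    """Fraction of the top-k results that are relevant."""
    return sum(relevant_flags[:k]) / k

def recall_at_k(relevant_flags, k, num_relevant=1):
    """Fraction of all relevant documents found in the top k."""
    return sum(relevant_flags[:k]) / num_relevant

# One relevant document per query, here ranked second:
flags = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(precision_at_k(flags, 3))  # 1/3, matching cosine_precision@3 ~ 0.3333
print(precision_at_k(flags, 5))  # 0.2, matching cosine_precision@5
print(recall_at_k(flags, 3))     # 1.0, matching cosine_recall@3
```

Since accuracy@3 is 1.0 at every dimension, every query's relevant document lands in the top 3, which is why precision@3, @5 and @10 sit exactly at 1/3, 1/5 and 1/10.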
594
+ <!--
595
+ ## Bias, Risks and Limitations
596
+
597
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
598
+ -->
599
+
600
+ <!--
601
+ ### Recommendations
602
+
603
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
604
+ -->
605
+
606
+ ## Training Details
607
+
608
+ ### Training Dataset
609
+
610
+ #### json
611
+
612
+ * Dataset: json
613
+ * Size: 150 training samples
614
+ * Columns: <code>context</code> and <code>question</code>
615
+ * Approximate statistics based on the first 150 samples:
616
+ | | context | question |
617
+ |:--------|:-------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
618
+ | type | string | string |
619
+ | details | <ul><li>min: 17 tokens</li><li>mean: 314.29 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 39.67 tokens</li><li>max: 175 tokens</li></ul> |
620
+ * Samples:
621
+ | context | question |
622
+ |:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
623
+ | <code>Table of Contents 3M Company and SubsidiariesConsolidated Statement of Cash Flow sYears ended December 31 (Millions) 2018 2017 2016 Cash Flows from Operating Activities Net income including noncontrolling interest $5,363 $4,869 $5,058 Adjustments to reconcile net income including noncontrolling interest to net cashprovided by operating activities Depreciation and amortization 1,488 1,544 1,474 Company pension and postretirement contributions (370) (967) (383) Company pension and postretirement expense 410 334 250 Stock-based compensation expense 302 324 298 Gain on sale of businesses (545) (586) (111) Deferred income taxes (57) 107 7 Changes in assets and liabilities Accounts receivable (305) (245) (313) Inventories (509) (387) 57 Accounts payable 408 24 148 Accrued income taxes (current and long-term) 134 967 101 Other net 120 256 76 Net cash provided by (used in) operating activities 6,439 6,240 6,662 Cash Flows from Investing Activities Purchases of property, plant and equipment (PP&E) (1,577) (1,373) (1,420) Proceeds from sale of PP&E and other assets 262 49 58 Acquisitions, net of cash acquired 13 (2,023) (16) Purchases of marketable securities and investments (1,828) (2,152) (1,410) Proceeds from maturities and sale of marketable securities and investments 2,497 1,354 1,247 Proceeds from sale of businesses, net of cash sold 846 1,065 142 Other net 9 (6) (4) Net cash provided by (used in) investing activities 222 (3,086) (1,403) Cash Flows from Financing Activities Change in short-term debt net (284) 578 (797) Repayment of debt (maturities greater than 90 days) (1,034) (962) (992) Proceeds from debt (maturities greater than 90 days) 2,251 1,987 2,832 Purchases of treasury stock (4,870) (2,068) (3,753) Proceeds from issuance of treasury stock pursuant to stock option and benefit plans 485 734 804 Dividends paid to shareholders (3,193) (2,803) (2,678) Other net (56) (121) (42) Net cash provided by (used in) financing activities (6,701) (2,655) (4,626) 
Effect of exchange rate changes on cash and cash equivalents (160) 156 (33) Net increase (decrease) in cash and cash equivalents (200) 655 600 Cash and cash equivalents at beginning of year 3,053 2,398 1,798 Cash and cash equivalents at end of period $2,853 $3,053 $2,398 The accompanying Notes to Consolidated Financial Statements are an integral part of this statement. 60</code> | <code>What is the FY2018 capital expenditure amount (in USD millions) for 3M? Give a response to the question by relying on the details shown in the cash flow statement.</code> |
+ | <code>Table of Contents 3M Company and Subsidiaries Consolidated Balance Sheet At December 31 December 31, December 31, (Dollars in millions, except per share amount) 2018 2017 Assets Current assets Cash and cash equivalents $2,853 $3,053 Marketable securities current 380 1,076 Accounts receivable net of allowances of $95 and $103 5,020 4,911 Inventories Finished goods 2,120 1,915 Work in process 1,292 1,218 Raw materials and supplies 954 901 Total inventories 4,366 4,034 Prepaids 741 937 Other current assets 349 266 Total current assets 13,709 14,277 Property, plant and equipment 24,873 24,914 Less: Accumulated depreciation (16,135) (16,048) Property, plant and equipment net 8,738 8,866 Goodwill 10,051 10,513 Intangible assets net 2,657 2,936 Other assets 1,345 1,395 Total assets $36,500 $37,987 Liabilities Current liabilities Short-term borrowings and current portion of long-term debt $1,211 $1,853 Accounts payable 2,266 1,945 Accrued payroll 749 870 Accrued income taxes 243 310 Other current liabilities 2,775 2,709 Total current liabilities 7,244 7,687 Long-term debt 13,411 12,096 Pension and postretirement benefits 2,987 3,620 Other liabilities 3,010 2,962 Total liabilities $26,652 $26,365 Commitments and contingencies (Note 16) Equity 3M Company shareholders equity: Common stock par value, $.01 par value $ 9 $ 9 Shares outstanding - 2018: 576,575,168 Shares outstanding - 2017: 594,884,237 Additional paid-in capital 5,643 5,352 Retained earnings 40,636 39,115 Treasury stock (29,626) (25,887) Accumulated other comprehensive income (loss) (6,866) (7,026) Total 3M Company shareholders equity 9,796 11,563 Noncontrolling interest 52 59 Total equity $9,848 $11,622 Total liabilities and equity $36,500 $37,987 The accompanying Notes to Consolidated Financial Statements are an integral part of this statement. 58</code> | <code>Assume that you are a public equities analyst. Answer the following question by primarily using information that is shown in the balance sheet: what is the year end FY2018 net PPNE for 3M? Answer in USD billions.</code> |
+ | <code>3M Company and Subsidiaries Consolidated Statement of Income Years ended December 31 (Millions, except per share amounts) 2022 2021 2020 Net sales $34,229 $35,355 $32,184</code> | <code>Is 3M a capital-intensive business based on FY2022 data?</code> |
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
+ ```json
+ {
+     "loss": "MultipleNegativesRankingLoss",
+     "matryoshka_dims": [768, 512, 256, 128, 64],
+     "matryoshka_weights": [1, 1, 1, 1, 1],
+     "n_dims_per_step": -1
+ }
+ ```
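With this configuration, the same MultipleNegativesRankingLoss is applied to the first 768, 512, 256, 128, and 64 dimensions of each embedding, all with weight 1, so a consumer can truncate the embedding to any of those sizes. A minimal sketch (plain Python, toy vector standing in for a real model output) of how a truncated Matryoshka embedding is used at inference:

```python
import math

# Dims from the MatryoshkaLoss config above.
MATRYOSHKA_DIMS = [768, 512, 256, 128, 64]

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components, then re-normalize to unit length,
    which is how a Matryoshka embedding is consumed at a smaller dimension."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

# Toy stand-in for a 768-d model output (not a real embedding).
full = [math.sin(i + 1) for i in range(768)]

for d in MATRYOSHKA_DIMS:
    v = truncate_and_normalize(full, d)
    assert len(v) == d
    assert abs(math.sqrt(sum(x * x for x in v)) - 1.0) < 1e-9
```

Because each prefix is trained to be useful on its own, cosine similarity on the renormalized 64-d head approximates similarity on the full 768-d vector at a fraction of the storage cost.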
+ 
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+ 
+ - `eval_strategy`: epoch
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 16
+ - `gradient_accumulation_steps`: 16
+ - `learning_rate`: 2e-05
+ - `num_train_epochs`: 50
+ - `lr_scheduler_type`: cosine
+ - `warmup_ratio`: 0.1
+ - `fp16`: True
+ - `tf32`: False
+ - `load_best_model_at_end`: True
+ - `optim`: adamw_torch_fused
+ - `batch_sampler`: no_duplicates
+ 
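Two of these settings interact: gradient accumulation multiplies the per-device batch size into a larger effective batch, which also matters for MultipleNegativesRankingLoss since in-batch examples serve as negatives. A quick sketch of the implied numbers (single GPU assumed; the card does not state the device count):

```python
# Effective batch size implied by the hyperparameters above.
per_device_train_batch_size = 32
gradient_accumulation_steps = 16
num_devices = 1  # assumption, not stated in the card

effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)
assert effective_batch_size == 512

# warmup_ratio = 0.1: the first 10% of optimizer steps linearly warm the
# learning rate up before the cosine schedule decays it. With the ~50
# optimizer steps visible in the training log, that is about 5 steps.
warmup_steps = int(0.1 * 50)
assert warmup_steps == 5
```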
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+ 
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: epoch
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 16
+ - `eval_accumulation_steps`: None
+ - `learning_rate`: 2e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 50
+ - `max_steps`: -1
+ - `lr_scheduler_type`: cosine
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: True
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: False
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: True
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch_fused
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`: 
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+ 
+ </details>
+ 
+ ### Training Logs
+ | Epoch | Step | Training Loss | dim_768_cosine_map@100 | dim_512_cosine_map@100 | dim_256_cosine_map@100 | dim_128_cosine_map@100 | dim_64_cosine_map@100 |
+ |:--------:|:------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|
+ | 0 | 0 | - | 0.4797 | 0.4762 | 0.4373 | 0.3948 | 0.2870 |
+ | 1.0 | 1 | - | 0.4796 | 0.4762 | 0.4374 | 0.3946 | 0.2869 |
+ | 2.0 | 2 | - | 0.5128 | 0.4990 | 0.4817 | 0.4673 | 0.3554 |
+ | 3.0 | 4 | - | 0.5387 | 0.5180 | 0.5362 | 0.5217 | 0.4156 |
+ | 1.0 | 1 | - | 0.5387 | 0.5180 | 0.5362 | 0.5217 | 0.4156 |
+ | 2.0 | 2 | - | 0.5509 | 0.5339 | 0.5399 | 0.5288 | 0.4394 |
+ | 3.0 | 4 | - | 0.5921 | 0.5763 | 0.5743 | 0.5709 | 0.5007 |
+ | 4.0 | 5 | - | 0.6112 | 0.6097 | 0.6068 | 0.6031 | 0.5435 |
+ | 5.0 | 6 | - | 0.6244 | 0.6383 | 0.6379 | 0.6478 | 0.5920 |
+ | 6.0 | 8 | - | 0.6763 | 0.6857 | 0.7064 | 0.7134 | 0.6909 |
+ | 7.0 | 9 | - | 0.6853 | 0.7161 | 0.7264 | 0.7463 | 0.7321 |
+ | 8.0 | 10 | 2.0247 | - | - | - | - | - |
+ | 8.2 | 11 | - | 0.7454 | 0.7757 | 0.7821 | 0.8181 | 0.7850 |
+ | 9.0 | 12 | - | 0.7661 | 0.7926 | 0.8071 | 0.8261 | 0.8165 |
+ | 10.0 | 13 | - | 0.7783 | 0.8061 | 0.8221 | 0.8396 | 0.8382 |
+ | 11.0 | 15 | - | 0.8221 | 0.8217 | 0.8600 | 0.8834 | 0.8903 |
+ | 12.0 | 16 | - | 0.8301 | 0.8393 | 0.8756 | 0.8908 | 0.9143 |
+ | 13.0 | 17 | - | 0.8454 | 0.8562 | 0.8943 | 0.9167 | 0.9261 |
+ | 14.0 | 19 | - | 0.8697 | 0.8861 | 0.9167 | 0.9311 | 0.9417 |
+ | 15.0 | 20 | 0.7200 | 0.8808 | 0.8939 | 0.9217 | 0.9344 | 0.9522 |
+ | 16.2 | 22 | - | 0.9061 | 0.9000 | 0.9439 | 0.9411 | 0.9556 |
+ | 17.0 | 23 | - | 0.9061 | 0.9061 | 0.9439 | 0.9444 | 0.9556 |
+ | 18.0 | 24 | - | 0.9111 | 0.9117 | 0.9444 | 0.9444 | 0.9589 |
+ | 19.0 | 26 | - | 0.9256 | 0.9200 | 0.9478 | 0.9522 | 0.9589 |
+ | 20.0 | 27 | - | 0.9256 | 0.9233 | 0.9478 | 0.9489 | 0.9611 |
+ | 21.0 | 28 | - | 0.9289 | 0.9311 | 0.9478 | 0.9556 | 0.9644 |
+ | 22.0 | 30 | 0.3518 | 0.9400 | 0.9344 | 0.9511 | 0.9556 | 0.9656 |
+ | 23.0 | 31 | - | 0.9411 | 0.9356 | 0.9544 | 0.9556 | 0.9656 |
+ | 24.2 | 33 | - | 0.9411 | 0.9389 | 0.9544 | 0.9589 | 0.9689 |
+ | 25.0 | 34 | - | 0.9378 | 0.9389 | 0.9556 | 0.9589 | 0.9689 |
+ | 26.0 | 35 | - | 0.9378 | 0.9389 | 0.9556 | 0.9589 | 0.9689 |
+ | 27.0 | 37 | - | 0.9444 | 0.9389 | 0.9556 | 0.9589 | 0.9689 |
+ | 28.0 | 38 | - | 0.9444 | 0.9389 | 0.9589 | 0.9589 | 0.9689 |
+ | 29.0 | 39 | - | 0.9444 | 0.9389 | 0.9589 | 0.9589 | 0.9689 |
+ | 29.4 | 40 | 0.2456 | - | - | - | - | - |
+ | 30.0 | 41 | - | 0.9444 | 0.9422 | 0.9589 | 0.9589 | 0.9689 |
+ | **31.0** | **42** | **-** | **0.9444** | **0.9422** | **0.9589** | **0.9622** | **0.9689** |
+ | 32.2 | 44 | - | 0.9444 | 0.9422 | 0.9556 | 0.9622 | 0.9689 |
+ | 33.0 | 45 | - | 0.9444 | 0.9422 | 0.9556 | 0.9622 | 0.9689 |
+ | 34.0 | 46 | - | 0.9444 | 0.9422 | 0.9556 | 0.9622 | 0.9689 |
+ | 35.0 | 48 | - | 0.9444 | 0.9422 | 0.9556 | 0.9622 | 0.9689 |
+ | 36.0 | 49 | - | 0.9444 | 0.9422 | 0.9556 | 0.9622 | 0.9689 |
+ | 37.0 | 50 | 0.2123 | 0.9444 | 0.9422 | 0.9556 | 0.9622 | 0.9689 |
+ 
+ * The bold row denotes the saved checkpoint.
+ 
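The `*_cosine_map@100` columns report mean average precision at 100: candidate passages are ranked by cosine similarity to the query embedding (truncated to the given Matryoshka dimension), per-query average precision over the top 100 is computed, and the result is averaged over all queries. A minimal sketch of the per-query AP@k term (an assumed textbook formulation mirroring the sentence-transformers `InformationRetrievalEvaluator`, not code from this repository):

```python
def average_precision_at_k(ranked_ids, relevant_ids, k=100):
    """Average precision over the top-k ranked documents for one query."""
    hits, score = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            hits += 1
            score += hits / rank  # precision at each relevant hit
    denom = min(len(relevant_ids), k)
    return score / denom if denom else 0.0

# Toy query: the two relevant docs land at ranks 1 and 3.
ap = average_precision_at_k(["d1", "d9", "d2", "d7"], {"d1", "d2"})
assert abs(ap - (1.0 + 2 / 3) / 2) < 1e-9  # AP ≈ 0.8333
```

mAP@100 is then the mean of this quantity over all evaluation queries, which is why the smaller dimensions can legitimately score higher or lower than the full 768-d column on a small eval set.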
+ ### Framework Versions
+ - Python: 3.10.12
+ - Sentence Transformers: 3.2.1
+ - Transformers: 4.41.2
+ - PyTorch: 2.1.2+cu121
+ - Accelerate: 1.1.1
+ - Datasets: 2.19.1
+ - Tokenizers: 0.19.1
+ 
+ ## Citation
+ 
+ ### BibTeX
+ 
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+ 
+ #### MatryoshkaLoss
+ ```bibtex
+ @misc{kusupati2024matryoshka,
+     title={Matryoshka Representation Learning},
+     author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
+     year={2024},
+     eprint={2205.13147},
+     archivePrefix={arXiv},
+     primaryClass={cs.LG}
+ }
+ ```
+ 
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+ 
+ <!--
+ ## Glossary
+ 
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+ 
+ <!--
+ ## Model Card Authors
+ 
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+ 
+ <!--
+ ## Model Card Contact
+ 
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "_name_or_path": "BAAI/bge-base-en-v1.5",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.41.2",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
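This is a standard BERT-base configuration (12 layers, hidden size 768, vocab 30522). A rough back-of-envelope parameter count from these numbers is consistent with the ~438 MB fp32 `model.safetensors` added below (at 4 bytes per parameter; the small remainder is the safetensors header):

```python
# Back-of-envelope BERT-base parameter count from the config above.
vocab_size, hidden, layers = 30522, 768, 12
intermediate, max_pos, type_vocab = 3072, 512, 2

# Word + position + token-type embeddings, plus the embedding LayerNorm.
embeddings = (vocab_size + max_pos + type_vocab) * hidden + 2 * hidden

per_layer = (
    4 * (hidden * hidden + hidden)            # Q, K, V and attention output projections
    + 2 * (2 * hidden)                        # attention and FFN LayerNorms
    + (hidden * intermediate + intermediate)  # FFN up-projection
    + (intermediate * hidden + hidden)        # FFN down-projection
)
pooler = hidden * hidden + hidden  # BertModel's pooler head

total = embeddings + layers * per_layer + pooler
assert total == 109_482_240  # ≈ 109M params; × 4 bytes ≈ 438 MB in fp32
```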
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.2.1",
+     "transformers": "4.41.2",
+     "pytorch": "2.1.2+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:42f6438ab2a6df0aad565f4bdfbeb2348bce0683aca55742d1c2fb496ef35765
+ size 437951328
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
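`modules.json` chains three stages: Transformer → Pooling → Normalize. With the pooling config in `1_Pooling/config.json` (`pooling_mode_cls_token: true`), the sentence embedding is the CLS token's hidden state, L2-normalized at the end. A toy sketch of the last two stages (plain Python, dummy 3-d token vectors standing in for 768-d hidden states):

```python
import math

def cls_pool(token_embeddings):
    """pooling_mode_cls_token: the sentence vector is the first token's vector."""
    return token_embeddings[0]

def l2_normalize(vec):
    """The final Normalize module scales the vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

# Dummy 4-token "Transformer output"; the CLS vector is [3, 4, 0].
tokens = [[3.0, 4.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 5.0]]
sentence_embedding = l2_normalize(cls_pool(tokens))
assert sentence_embedding == [0.6, 0.8, 0.0]
```

Because the output is unit-norm, dot product and cosine similarity coincide for downstream retrieval.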
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": true
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff