Fe2x committed on
Commit
5619cce
·
verified ·
1 Parent(s): 48b2718

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
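The pooling configuration above enables only `pooling_mode_cls_token`, so each sentence vector is taken from the hidden state of the first token, and the architecture listed in the README then applies `Normalize()`. The following is a minimal sketch of those two steps in plain Python, not the library's implementation; the helper names `cls_pool` and `l2_normalize` and the toy 4-dimensional vectors are assumptions made for illustration.

```python
import math
from typing import List

Vector = List[float]


def cls_pool(token_embeddings: List[Vector]) -> Vector:
    """Return the embedding of the first token ([CLS] in BERT-style models)."""
    return list(token_embeddings[0])


def l2_normalize(vector: Vector) -> Vector:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


# Toy input: 3 tokens with 4 dimensions each (the real model uses 768).
token_embeddings = [
    [3.0, 4.0, 0.0, 0.0],  # hypothetical [CLS] hidden state
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 2.0, 0.0, 2.0],
]

sentence_vector = l2_normalize(cls_pool(token_embeddings))
print(sentence_vector)  # [0.6, 0.8, 0.0, 0.0]
```

Because the output is unit length, a dot product between two such vectors equals their cosine similarity, which matches the `cosine` similarity function declared in `config_sentence_transformers.json`.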
README.md ADDED
@@ -0,0 +1,760 @@
+ ---
+ language:
+ - en
+ license: apache-2.0
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:6300
+ - loss:MatryoshkaLoss
+ - loss:MultipleNegativesRankingLoss
+ base_model: BAAI/bge-base-en-v1.5
+ widget:
+ - source_sentence: What year do the patent families related to DARZALEX expire in
+     the United States?
+   sentences:
+   - Amortization for owned content predominantly monetized on an individual basis
+     and accrued costs associated with participations and residuals payments are recorded
+     using the individual film forecast computation method, which recognizes the costs
+     in the same ratio as the associated ultimate revenue.
+   - The two patent families both expire in the United States in 2029.
+   - For the year ended December 31, 2022, net cash used in investing activities of
+     $371.9 million was primarily from the purchase of $247.3 million marketable securities,
+     net of sale and maturities, $62.2 million net cash used to acquire GreenCom, SolarLeadFactory
+     and ClipperCreek, $46.4 million used in purchases of test and assembly equipment
+     to expand our supply capacity, related facility improvements and information technology
+     enhancements, including capitalized costs related to internal-use software and
+     $16.0 million used to invest in private companies.
+ - source_sentence: What legal claims does Fortis Advisors LLC allege against Ethicon
+     Inc. in the lawsuit related to the acquisition of Auris Health Inc.?
+   sentences:
+   - Payments include a single lump-sum per treatment, referred to as bundled rates,
+     or in other cases separate payments for dialysis treatments and pharmaceuticals,
+     referred to as FFS rates.
+   - In October 2020, Fortis Advisors LLC filed a complaint against Ethicon Inc. and
+     others in Delaware's Court of Chancery. The lawsuit alleges breach of contract
+     and fraud related to Ethicon's acquisition of Auris Health Inc. in 2019. The case
+     underwent a partial dismissal in December 2021, and as of January 2024, the trial's
+     decision is pending.
+   - On September 5, 2023, ICE acquired 100% of Black Knight for aggregate transaction
+     consideration of approximately $11.8 billion, or $76 per share of Black Knight
+     common stock, with cash comprising 90% of the value of the aggregate transaction
+     consideration. The aggregate cash component of the transaction consideration was
+     $10.5 billion. ICE issued 10.9 million shares of its common stock to Black Knight
+     stockholders, which was based on the market price of the common stock and the
+     average of the volume weighted averages of the trading prices of the common stock
+     on each of the ten consecutive trading days ending three trading days prior to
+     the closing of the merger.
+ - source_sentence: What caused the increase in net cash provided by operating activities
+     between 2022 and 2023?
+   sentences:
+   - Net cash provided by operating activities was $712.2 million and $223.7 million
+     for the year ended December 31, 2023 and 2022, respectively. The increase was
+     primarily driven by timing of payments to vendors and timing of the receipt of
+     payments from our customers, as well as an increase in interest income.
+   - Joanne D. Smith held the position of Vice President - Marketing at Delta from
+     November 2005 to February 2007.
+   - Experienced management team with a proven track in the gaming and resort industry.
+     Mr. Robert G. Goldstein, our Chairman and Chief Executive Officer, has been an
+     integral part of our executive team from the beginning, joining our founder and
+     previous Chairman and Chief Executive Officer, Mr. Sheldon G. Adelson, before
+     The Venetian Resort Las Vegas was constructed. Mr. Goldstein is one of the most
+     respected and experienced executives in our industry today.
+ - source_sentence: What does the company believe adds significant value to its business
+     regarding intellectual property?
+   sentences:
+   - In 2022, the net interest expense on pre-acquisition-related debt was $59 million
+     and additional adjustments included costs of $30 million associated with the May
+     and June 2022 extinguishment of four series of senior notes.
+   - Fluctuations in foreign currency exchange rates decreased our consolidated net
+     operating revenues by 4%.
+   - We believe that, to varying degrees, our trademarks, trade names, copyrights,
+     proprietary processes, trade secrets, trade dress, domain names and similar intellectual
+     property add significant value to our business
+ - source_sentence: What does it mean for financial statements to be incorporated by
+     reference?
+   sentences:
+   - The consolidated financial statements are incorporated by reference in the Annual
+     Report on Form 10-K, indicating they are treated as part of the document for legal
+     and reporting purposes.
+   - The Consolidated Financial Statements, together with the Notes thereto and the
+     report thereon dated February 16, 2024, of PricewaterhouseCoopers LLP, the Firm’s
+     independent registered public accounting firm (PCAOB ID 238), appear on pages
+     163–309.
+   - 'The Goldman Sachs Group, Inc. manages and reports its activities in three business
+     segments: Global Banking & Markets, Asset & Wealth Management and Platform
+     Solutions.'
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ metrics:
+ - cosine_accuracy@1
+ - cosine_accuracy@3
+ - cosine_accuracy@5
+ - cosine_accuracy@10
+ - cosine_precision@1
+ - cosine_precision@3
+ - cosine_precision@5
+ - cosine_precision@10
+ - cosine_recall@1
+ - cosine_recall@3
+ - cosine_recall@5
+ - cosine_recall@10
+ - cosine_ndcg@10
+ - cosine_mrr@10
+ - cosine_map@100
+ model-index:
+ - name: BGE base Financial Matryoshka
+   results:
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 768
+       type: dim_768
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.7
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.8285714285714286
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.8728571428571429
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.9071428571428571
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.7
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.2761904761904762
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.17457142857142854
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.09071428571428569
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.7
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.8285714285714286
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.8728571428571429
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.9071428571428571
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.8045805359515339
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.7714971655328795
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.775178941729297
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 512
+       type: dim_512
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.7014285714285714
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.83
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.8671428571428571
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.9042857142857142
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.7014285714285714
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.27666666666666667
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.1734285714285714
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.09042857142857141
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.7014285714285714
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.83
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.8671428571428571
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.9042857142857142
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.8036464537429646
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.771175736961451
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.7751075563277001
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 256
+       type: dim_256
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.6928571428571428
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.8185714285714286
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.8628571428571429
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.8971428571428571
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.6928571428571428
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.27285714285714285
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.17257142857142854
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.0897142857142857
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.6928571428571428
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.8185714285714286
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.8628571428571429
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.8971428571428571
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.7963364154792727
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.7638741496598634
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.7683107318753077
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 128
+       type: dim_128
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.6771428571428572
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.8142857142857143
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.8514285714285714
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.8885714285714286
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.6771428571428572
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.2714285714285714
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.17028571428571426
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.08885714285714284
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.6771428571428572
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.8142857142857143
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.8514285714285714
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.8885714285714286
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.786332288682679
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.7531507936507934
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.7576033800206036
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 64
+       type: dim_64
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.6571428571428571
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.7814285714285715
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.8171428571428572
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.86
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.6571428571428571
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.2604761904761905
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.16342857142857142
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.08599999999999998
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.6571428571428571
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.7814285714285715
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.8171428571428572
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.86
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.7602042820067257
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.7281371882086165
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.7334805218687248
+       name: Cosine Map@100
+ ---
+
+ # BGE base Financial Matryoshka
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Cosine Similarity
+ - **Training Dataset:**
+     - json
+ - **Language:** en
+ - **License:** apache-2.0
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("Fe2x/bge-base-financial-matryoshka")
+ # Run inference
+ sentences = [
+     'What does it mean for financial statements to be incorporated by reference?',
+     'The consolidated financial statements are incorporated by reference in the Annual Report on Form 10-K, indicating they are treated as part of the document for legal and reporting purposes.',
+     'The Consolidated Financial Statements, together with the Notes thereto and the report thereon dated February 16, 2024, of PricewaterhouseCoopers LLP, the Firm’s independent registered public accounting firm (PCAOB ID 238), appear on pages 163–309.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Information Retrieval
+
+ * Datasets: `dim_768`, `dim_512`, `dim_256`, `dim_128` and `dim_64`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric | dim_768 | dim_512 | dim_256 | dim_128 | dim_64 |
+ |:--------------------|:-----------|:-----------|:-----------|:-----------|:-----------|
+ | cosine_accuracy@1 | 0.7 | 0.7014 | 0.6929 | 0.6771 | 0.6571 |
+ | cosine_accuracy@3 | 0.8286 | 0.83 | 0.8186 | 0.8143 | 0.7814 |
+ | cosine_accuracy@5 | 0.8729 | 0.8671 | 0.8629 | 0.8514 | 0.8171 |
+ | cosine_accuracy@10 | 0.9071 | 0.9043 | 0.8971 | 0.8886 | 0.86 |
+ | cosine_precision@1 | 0.7 | 0.7014 | 0.6929 | 0.6771 | 0.6571 |
+ | cosine_precision@3 | 0.2762 | 0.2767 | 0.2729 | 0.2714 | 0.2605 |
+ | cosine_precision@5 | 0.1746 | 0.1734 | 0.1726 | 0.1703 | 0.1634 |
+ | cosine_precision@10 | 0.0907 | 0.0904 | 0.0897 | 0.0889 | 0.086 |
+ | cosine_recall@1 | 0.7 | 0.7014 | 0.6929 | 0.6771 | 0.6571 |
+ | cosine_recall@3 | 0.8286 | 0.83 | 0.8186 | 0.8143 | 0.7814 |
+ | cosine_recall@5 | 0.8729 | 0.8671 | 0.8629 | 0.8514 | 0.8171 |
+ | cosine_recall@10 | 0.9071 | 0.9043 | 0.8971 | 0.8886 | 0.86 |
+ | **cosine_ndcg@10** | **0.8046** | **0.8036** | **0.7963** | **0.7863** | **0.7602** |
+ | cosine_mrr@10 | 0.7715 | 0.7712 | 0.7639 | 0.7532 | 0.7281 |
+ | cosine_map@100 | 0.7752 | 0.7751 | 0.7683 | 0.7576 | 0.7335 |
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
499
+
500
+ ## Training Details
501
+
502
+ ### Training Dataset
503
+
504
+ #### json
505
+
506
+ * Dataset: json
507
+ * Size: 6,300 training samples
508
+ * Columns: <code>anchor</code> and <code>positive</code>
509
+ * Approximate statistics based on the first 1000 samples:
510
+ | | anchor | positive |
511
+ |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
512
+ | type | string | string |
513
+ | details | <ul><li>min: 7 tokens</li><li>mean: 20.44 tokens</li><li>max: 45 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 45.16 tokens</li><li>max: 512 tokens</li></ul> |
514
+ * Samples:
515
+ | anchor | positive |
516
+ |:---------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
517
+ | <code>What was the amount of cash generated from operations by the company in fiscal year 2023?</code> | <code>Highlights during fiscal year 2023 include the following: We generated $18,085 million of cash from operations.</code> |
518
+ | <code>How much were unrealized losses on U.S. government and agency securities for those held for 12 months or greater as of June 30, 2023?</code> | <code>U.S. government and agency securities | $ | 7,950 | | $ | (336 | ) | $ | 45,273 | $ | (3,534 | ) | $ | 53,223 | $ | (3,870 | )</code> |
519
+ | <code>How is the impairment of assets assessed for projects still under development?</code> | <code>For assets under development, assets are grouped and assessed for impairment by estimating the undiscounted cash flows, which include remaining construction costs, over the asset's remaining useful life. If cash flows do not exceed the carrying amount, impairment based on fair value versus carrying value is considered.</code> |
520
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
521
+ ```json
522
+ {
523
+ "loss": "MultipleNegativesRankingLoss",
524
+ "matryoshka_dims": [
525
+ 768,
526
+ 512,
527
+ 256,
528
+ 128,
529
+ 64
530
+ ],
531
+ "matryoshka_weights": [
532
+ 1,
533
+ 1,
534
+ 1,
535
+ 1,
536
+ 1
537
+ ],
538
+ "n_dims_per_step": -1
539
+ }
540
+ ```
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: epoch
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 16
+ - `gradient_accumulation_steps`: 16
+ - `learning_rate`: 2e-05
+ - `num_train_epochs`: 4
+ - `lr_scheduler_type`: cosine
+ - `warmup_ratio`: 0.1
+ - `fp16`: True
+ - `tf32`: False
+ - `load_best_model_at_end`: True
+ - `optim`: adamw_torch_fused
+ - `batch_sampler`: no_duplicates
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: epoch
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 16
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 2e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 4
+ - `max_steps`: -1
+ - `lr_scheduler_type`: cosine
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: True
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: False
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: True
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch_fused
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
+
+ ### Training Logs
+ | Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
+ |:---------:|:------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|
+ | 0.8122 | 10 | 1.5872 | - | - | - | - | - |
+ | 1.0 | 13 | - | 0.7879 | 0.7860 | 0.7782 | 0.7698 | 0.7320 |
+ | 1.5685 | 20 | 0.6329 | - | - | - | - | - |
+ | 2.0 | 26 | - | 0.7988 | 0.7969 | 0.7923 | 0.7826 | 0.7520 |
+ | 2.3249 | 30 | 0.4465 | - | - | - | - | - |
+ | 3.0 | 39 | - | 0.8046 | 0.8026 | 0.7959 | 0.7855 | 0.7596 |
+ | 3.0812 | 40 | 0.349 | - | - | - | - | - |
+ | **3.731** | **48** | **-** | **0.8046** | **0.8036** | **0.7963** | **0.7863** | **0.7602** |
+
+ * The bold row denotes the saved checkpoint.
+
+ ### Framework Versions
+ - Python: 3.9.20
+ - Sentence Transformers: 3.3.1
+ - Transformers: 4.47.1
+ - PyTorch: 2.1.2+cu121
+ - Accelerate: 1.2.1
+ - Datasets: 2.19.1
+ - Tokenizers: 0.21.0
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MatryoshkaLoss
+ ```bibtex
+ @misc{kusupati2024matryoshka,
+     title={Matryoshka Representation Learning},
+     author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
+     year={2024},
+     eprint={2205.13147},
+     archivePrefix={arXiv},
+     primaryClass={cs.LG}
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "_name_or_path": "BAAI/bge-base-en-v1.5",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.47.1",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.3.1",
+     "transformers": "4.47.1",
+     "pytorch": "2.1.2+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:72b540dbcfd6a79edba1110d200d199c801293f597ac76f5207a05b1eee1f0a2
+ size 437951328
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": true
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff