filmonfeme committed on
Commit 502a6b1 · verified · 1 Parent(s): 8ab5ef8

Fine-tuned reranker for financial chatbot

README.md ADDED
@@ -0,0 +1,316 @@
+ ---
+ tags:
+ - sentence-transformers
+ - cross-encoder
+ - reranker
+ - generated_from_trainer
+ - dataset_size:2000
+ - loss:BinaryCrossEntropyLoss
+ base_model: cross-encoder/ms-marco-MiniLM-L6-v2
+ pipeline_tag: text-ranking
+ library_name: sentence-transformers
+ ---
+
+ # CrossEncoder based on cross-encoder/ms-marco-MiniLM-L6-v2
+
+ This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [cross-encoder/ms-marco-MiniLM-L6-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L6-v2) using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Cross Encoder
+ - **Base model:** [cross-encoder/ms-marco-MiniLM-L6-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L6-v2) <!-- at revision c5ee24cb16019beea0893ab7796b1df96625c6b8 -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Number of Output Labels:** 1 label
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
+ - **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import CrossEncoder
+
+ # Download from the 🤗 Hub
+ model = CrossEncoder("cross_encoder_model_id")
+ # Get scores for pairs of texts
+ pairs = [
+ ['(I) Designation The taxpayer shall designate the imputed income limitation of each unit taken into account under such clause?', '(I) Designation\nThe taxpayer shall designate the imputed income limitation of each unit taken into\naccount under such clause.\n(II) Average test\nThe average of the imputed income limitations designated under subclause (I) shall not\nexceed 60 percent of area median gross income.\n(III) 10-percent increments\nThe designated imputed income limitation of any unit under subclause (I) shall be 20\npercent, 30 percent, 40 percent, 50 percent, 60 percent, 70 percent, or 80 percent of area\nmedian gross income.\nAny election under this paragraph, once made, shall be irrevocable. For purposes of this\nparagraph, any property shall not be treated as failing to be residential rental property merely\nbecause part of the building in which such property is located is used for purposes other than\nresidential rental purposes.\n(2) Rent-restricted units\n(A) In general\nFor purposes of paragraph (1), a residential unit is rent-restricted if the gross rent with respect'],
+ ['Summarize the key points from usc26@118-78.pdf.', 'turn or other schedules. List the type and \namount of tax.\nOther taxes to be listed include the \nfollowing.\nForm 8978 adjustment. Complete the \nNegative Form 8978 Adjustment Work-\nsheet—Schedule 2 (Line 17z) if you are \nfiling Form 8978 and completed the \nworksheet in the Schedule 3, line 6l, in-\nstructions and the amount on line 3 of \nthat worksheet is negative.\n100'],
+ ['Summarize the key points from usc26@118-78.pdf.', 'program who are included in a unit of employees covered by an agreement which the Secretary of Labor finds\nto be a collective bargaining agreement between employee representatives and one or more employers, if there\nis evidence that educational assistance benefits were the subject of good faith bargaining between such\nemployee representatives and such employer or employers."\nPub. L. 99–514, §1114(b)(4), substituted "highly compensated employees (within the meaning of section\n414(q))" for "officers, owners, or highly compensated,".\nSubsec. (b)(6). Pub. L. 99–514, §1151(c)(4)(B), struck out par. (6) which read as follows: "\n.—Reasonable notification of the availability and terms of the program\nNOTIFICATION OF EMPLOYEES\nmust be provided to eligible employees."\nSubsec. (d). Pub. L. 99–514, §1162(a)(1), substituted "December 31, 1987" for "December 31, 1985".\n1984—Subsec. (a). Pub. L. 98–611, §1(b), amended subsec. generally, substituting "Exclusion from gross'],
+ ['Summarize the key points from usc26@118-78.pdf.', 'the taxpayer or by law. Taxpayers have the right to expect \nappropriate action will be taken against employees, return \npreparers, and others who wrongfully use or disclose taxpayer \nreturn information.\n9. The Right to Retain Representation\nTaxpayers have the right to retain an authorized representative \nof their choice to represent them in their dealings with the \nIRS. Taxpayers have the right to seek assistance from a Low \nIncome Taxpayer Clinic if they cannot afford representation.\n10. The Right to a Fair and Just Tax System\nTaxpayers have the right to expect the tax system to consider \nfacts and circumstances that might affect their underlying \nliabilities, ability to pay, or ability to provide information timely. \nTaxpayers have the right to receive assistance from the \nTaxpayer Advocate Service if they are experiencing financial \ndifficulty or if the IRS has not resolved their tax issues properly \nand timely through its normal channels. \n113'],
+ ['Summarize the key points from usc26@118-78.pdf.', 'ceived, include the amount withheld in \nthe total on line 25b. This should be \nshown in box 4 of Form 1099, box 6 of \nForm SSA-1099, or box 10 of Form \nRRB-1099.\nLine 25c—Other Forms\nInclude on line 25c any federal income \ntax withheld on your Form(s) W-2G. \nThe amount withheld should be shown \nin box 4. Attach Form(s) W-2G to the \nfront of your return if federal income tax \nwas withheld.\nIf you had Additional Medicare Tax \nwithheld, include the amount shown on \nForm 8959, line 24, in the total on \nline 25c. Attach Form 8959.\nInclude on line 25c any federal in-\ncome tax withheld that is shown on a \nSchedule K-1.\nAlso include on line 25c any tax \nwithheld that is shown on Form 1042-S, \nForm 8805, or Form 8288-A. You \nshould attach the form to your return to \nclaim a credit for the withholding.\nLine 26\n2023 Estimated Tax \nPayments\nEnter any estimated federal income tax \npayments you made for 2023. Include \nany overpayment that you applied to \nyour 2023 estimated tax from your 2022'],
+ ]
+ scores = model.predict(pairs)
+ print(scores.shape)
+ # (5,)
+
+ # Or rank different texts based on similarity to a single text
+ ranks = model.rank(
+ '(I) Designation The taxpayer shall designate the imputed income limitation of each unit taken into account under such clause?',
+ [
+ '(I) Designation\nThe taxpayer shall designate the imputed income limitation of each unit taken into\naccount under such clause.\n(II) Average test\nThe average of the imputed income limitations designated under subclause (I) shall not\nexceed 60 percent of area median gross income.\n(III) 10-percent increments\nThe designated imputed income limitation of any unit under subclause (I) shall be 20\npercent, 30 percent, 40 percent, 50 percent, 60 percent, 70 percent, or 80 percent of area\nmedian gross income.\nAny election under this paragraph, once made, shall be irrevocable. For purposes of this\nparagraph, any property shall not be treated as failing to be residential rental property merely\nbecause part of the building in which such property is located is used for purposes other than\nresidential rental purposes.\n(2) Rent-restricted units\n(A) In general\nFor purposes of paragraph (1), a residential unit is rent-restricted if the gross rent with respect',
+ 'turn or other schedules. List the type and \namount of tax.\nOther taxes to be listed include the \nfollowing.\nForm 8978 adjustment. Complete the \nNegative Form 8978 Adjustment Work-\nsheet—Schedule 2 (Line 17z) if you are \nfiling Form 8978 and completed the \nworksheet in the Schedule 3, line 6l, in-\nstructions and the amount on line 3 of \nthat worksheet is negative.\n100',
+ 'program who are included in a unit of employees covered by an agreement which the Secretary of Labor finds\nto be a collective bargaining agreement between employee representatives and one or more employers, if there\nis evidence that educational assistance benefits were the subject of good faith bargaining between such\nemployee representatives and such employer or employers."\nPub. L. 99–514, §1114(b)(4), substituted "highly compensated employees (within the meaning of section\n414(q))" for "officers, owners, or highly compensated,".\nSubsec. (b)(6). Pub. L. 99–514, §1151(c)(4)(B), struck out par. (6) which read as follows: "\n.—Reasonable notification of the availability and terms of the program\nNOTIFICATION OF EMPLOYEES\nmust be provided to eligible employees."\nSubsec. (d). Pub. L. 99–514, §1162(a)(1), substituted "December 31, 1987" for "December 31, 1985".\n1984—Subsec. (a). Pub. L. 98–611, §1(b), amended subsec. generally, substituting "Exclusion from gross',
+ 'the taxpayer or by law. Taxpayers have the right to expect \nappropriate action will be taken against employees, return \npreparers, and others who wrongfully use or disclose taxpayer \nreturn information.\n9. The Right to Retain Representation\nTaxpayers have the right to retain an authorized representative \nof their choice to represent them in their dealings with the \nIRS. Taxpayers have the right to seek assistance from a Low \nIncome Taxpayer Clinic if they cannot afford representation.\n10. The Right to a Fair and Just Tax System\nTaxpayers have the right to expect the tax system to consider \nfacts and circumstances that might affect their underlying \nliabilities, ability to pay, or ability to provide information timely. \nTaxpayers have the right to receive assistance from the \nTaxpayer Advocate Service if they are experiencing financial \ndifficulty or if the IRS has not resolved their tax issues properly \nand timely through its normal channels. \n113',
+ 'ceived, include the amount withheld in \nthe total on line 25b. This should be \nshown in box 4 of Form 1099, box 6 of \nForm SSA-1099, or box 10 of Form \nRRB-1099.\nLine 25c—Other Forms\nInclude on line 25c any federal income \ntax withheld on your Form(s) W-2G. \nThe amount withheld should be shown \nin box 4. Attach Form(s) W-2G to the \nfront of your return if federal income tax \nwas withheld.\nIf you had Additional Medicare Tax \nwithheld, include the amount shown on \nForm 8959, line 24, in the total on \nline 25c. Attach Form 8959.\nInclude on line 25c any federal in-\ncome tax withheld that is shown on a \nSchedule K-1.\nAlso include on line 25c any tax \nwithheld that is shown on Form 1042-S, \nForm 8805, or Form 8288-A. You \nshould attach the form to your return to \nclaim a credit for the withholding.\nLine 26\n2023 Estimated Tax \nPayments\nEnter any estimated federal income tax \npayments you made for 2023. Include \nany overpayment that you applied to \nyour 2023 estimated tax from your 2022',
+ ]
+ )
+ # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
+ ```
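Reviewer note: the usage snippet returns one score per (query, passage) pair. Because the saved config lists `activation_fn` as `torch.nn.modules.linear.Identity`, those scores are presumably raw logits rather than probabilities. The sketch below is not part of the committed README. It assumes raw logits and shows one way to map them to the 0..1 range and sort passages, mirroring the list-of-dicts shape printed by `model.rank`. The `probability` key and the example scores are hypothetical, added only for illustration.

```python
import math


def sigmoid(logit: float) -> float:
    """Numerically stable logistic function."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


def rank_by_score(raw_scores: list[float]) -> list[dict]:
    """Sort passage indices by descending raw score.

    The extra "probability" field is an illustrative addition; it is not
    something the library's rank output is known to contain.
    """
    ranked = [
        {"corpus_id": i, "score": s, "probability": sigmoid(s)}
        for i, s in enumerate(raw_scores)
    ]
    return sorted(ranked, key=lambda r: r["score"], reverse=True)


# Hypothetical raw scores for five passages (not real model output).
example_scores = [4.2, -3.1, 0.0, -1.5, 2.7]
ranking = rank_by_score(example_scores)
print([r["corpus_id"] for r in ranking])
```

Sorting on the raw score and on the sigmoid output gives the same order, since the sigmoid is monotonic; the conversion only matters when a downstream threshold expects a value between 0 and 1.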
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+ * Size: 2,000 training samples
+ * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
+ * Approximate statistics based on the first 1000 samples:
+ | | sentence_0 | sentence_1 | label |
+ |:--------|:-----------|:-----------|:------|
+ | type | string | string | float |
+ | details | <ul><li>min: 34 characters</li><li>mean: 60.95 characters</li><li>max: 160 characters</li></ul> | <ul><li>min: 109 characters</li><li>mean: 892.77 characters</li><li>max: 1000 characters</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.5</li><li>max: 1.0</li></ul> |
+ * Samples:
+ | sentence_0 | sentence_1 | label |
+ |:-------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
+ | <code>(I) Designation The taxpayer shall designate the imputed income limitation of each unit taken into account under such clause?</code> | <code>(I) Designation<br>The taxpayer shall designate the imputed income limitation of each unit taken into<br>account under such clause.<br>(II) Average test<br>The average of the imputed income limitations designated under subclause (I) shall not<br>exceed 60 percent of area median gross income.<br>(III) 10-percent increments<br>The designated imputed income limitation of any unit under subclause (I) shall be 20<br>percent, 30 percent, 40 percent, 50 percent, 60 percent, 70 percent, or 80 percent of area<br>median gross income.<br>Any election under this paragraph, once made, shall be irrevocable. For purposes of this<br>paragraph, any property shall not be treated as failing to be residential rental property merely<br>because part of the building in which such property is located is used for purposes other than<br>residential rental purposes.<br>(2) Rent-restricted units<br>(A) In general<br>For purposes of paragraph (1), a residential unit is rent-restricted if the gross rent with respect</code> | <code>1.0</code> |
+ | <code>Summarize the key points from usc26@118-78.pdf.</code> | <code>turn or other schedules. List the type and <br>amount of tax.<br>Other taxes to be listed include the <br>following.<br>Form 8978 adjustment. Complete the <br>Negative Form 8978 Adjustment Work-<br>sheet—Schedule 2 (Line 17z) if you are <br>filing Form 8978 and completed the <br>worksheet in the Schedule 3, line 6l, in-<br>structions and the amount on line 3 of <br>that worksheet is negative.<br>100</code> | <code>0.0</code> |
+ | <code>Summarize the key points from usc26@118-78.pdf.</code> | <code>program who are included in a unit of employees covered by an agreement which the Secretary of Labor finds<br>to be a collective bargaining agreement between employee representatives and one or more employers, if there<br>is evidence that educational assistance benefits were the subject of good faith bargaining between such<br>employee representatives and such employer or employers."<br>Pub. L. 99–514, §1114(b)(4), substituted "highly compensated employees (within the meaning of section<br>414(q))" for "officers, owners, or highly compensated,".<br>Subsec. (b)(6). Pub. L. 99–514, §1151(c)(4)(B), struck out par. (6) which read as follows: "<br>.—Reasonable notification of the availability and terms of the program<br>NOTIFICATION OF EMPLOYEES<br>must be provided to eligible employees."<br>Subsec. (d). Pub. L. 99–514, §1162(a)(1), substituted "December 31, 1987" for "December 31, 1985".<br>1984—Subsec. (a). Pub. L. 98–611, §1(b), amended subsec. generally, substituting "Exclusion from gross</code> | <code>1.0</code> |
+ * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
+ ```json
+ {
+ "activation_fn": "torch.nn.modules.linear.Identity",
+ "pos_weight": null
+ }
+ ```
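Reviewer note: with an identity activation and `pos_weight` set to null, the loss is, as far as I know, a plain binary cross-entropy computed directly on the model's logit with no class reweighting. The following is a minimal sketch of that per-pair quantity in pure Python, not the library's implementation. The function names and the example logits are hypothetical.

```python
import math


def bce_with_logits(logit: float, label: float) -> float:
    """Binary cross-entropy on a raw logit, in its numerically stable form.

    Equivalent to -[y * log(sigmoid(z)) + (1 - y) * log(1 - sigmoid(z))].
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def mean_batch_loss(logits: list[float], labels: list[float]) -> float:
    """Average the per-pair loss over a batch (no pos_weight applied)."""
    return sum(bce_with_logits(z, y) for z, y in zip(logits, labels)) / len(logits)


# Hypothetical logits for one relevant (1.0) and one irrelevant (0.0) pair.
print(mean_batch_loss([5.0, -5.0], [1.0, 0.0]))
```

An undecided logit of 0 costs ln 2 (about 0.693) whatever the label, which gives a rough reference point for reading the logged training loss.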
+
+ ### Training Hyperparameters
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: no
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 8
+ - `per_device_eval_batch_size`: 8
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1
+ - `num_train_epochs`: 3
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.0
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `parallelism_config`: None
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch_fused
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `project`: huggingface
+ - `trackio_space_id`: trackio
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `hub_revision`: None
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: no
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `liger_kernel_config`: None
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: True
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: proportional
+ - `router_mapping`: {}
+ - `learning_rate_mapping`: {}
+
+ </details>
+
+ ### Training Logs
+ | Epoch | Step | Training Loss |
+ |:-----:|:----:|:-------------:|
+ | 2.0 | 500 | 0.2146 |
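Reviewer note: the single logged row can be checked against the hyperparameters listed in the README. The sketch below assumes one training process (no data parallelism) and uses the README's values: 2,000 samples, batch size 8, no gradient accumulation, `dataloader_drop_last` False, 3 epochs. The helper name is hypothetical.

```python
import math


def optimizer_steps_per_epoch(
    num_samples: int,
    batch_size: int,
    grad_accum_steps: int = 1,
    drop_last: bool = False,
) -> int:
    """Number of optimizer steps in one pass over the data (single process)."""
    if drop_last:
        batches = num_samples // batch_size
    else:
        batches = math.ceil(num_samples / batch_size)
    return math.ceil(batches / grad_accum_steps)


per_epoch = optimizer_steps_per_epoch(num_samples=2000, batch_size=8)
total_steps = per_epoch * 3          # num_train_epochs = 3
epoch_at_step_500 = 500 / per_epoch  # position of the logged row

print(per_epoch, total_steps, epoch_at_step_500)
```

Under these assumptions, step 500 lands exactly at epoch 2.0, matching the table, and the run ends at step 750. That the table shows only one row would be consistent with a logging interval of 500 steps, though the README does not list that interval.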
+
+
+ ### Framework Versions
+ - Python: 3.12.12
+ - Sentence Transformers: 5.1.2
+ - Transformers: 4.57.3
+ - PyTorch: 2.9.0+cu126
+ - Accelerate: 1.12.0
+ - Datasets: 4.0.0
+ - Tokenizers: 0.22.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+ author = "Reimers, Nils and Gurevych, Iryna",
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+ month = "11",
+ year = "2019",
+ publisher = "Association for Computational Linguistics",
+ url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,35 @@
+ {
+   "architectures": [
+     "BertForSequenceClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "dtype": "float32",
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 6,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "sentence_transformers": {
+     "activation_fn": "torch.nn.modules.linear.Identity",
+     "version": "5.1.2"
+   },
+   "transformers_version": "4.57.3",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4ee5fb8931cb42590aab1cedb07759a5921f2f484cedfe67115ba9c76f1e647f
+ size 90866412
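Reviewer note: the LFS pointer size can be sanity-checked against `config.json`. The sketch below counts parameters for a standard BERT sequence-classification layout: word, position and token-type embeddings, six encoder layers, a pooler, and a single-logit classifier head. That layout is an assumption, as is storage as 4-byte `float32` values with no extra tensors. The function name is hypothetical.

```python
def bert_classifier_param_count(
    vocab_size: int,
    hidden: int,
    layers: int,
    intermediate: int,
    max_positions: int,
    type_vocab: int,
    num_labels: int,
) -> int:
    """Parameter count for an assumed standard BERT + pooler + linear head."""
    layer_norm = 2 * hidden  # scale and bias
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value and output projections, each with a bias, plus LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Two dense layers (up and down projection) with biases, plus LayerNorm.
    feed_forward = (
        (hidden * intermediate + intermediate)
        + (intermediate * hidden + hidden)
        + layer_norm
    )
    pooler = hidden * hidden + hidden
    classifier = hidden * num_labels + num_labels
    return embeddings + layers * (attention + feed_forward) + pooler + classifier


# Values taken from config.json in this commit.
params = bert_classifier_param_count(
    vocab_size=30522,
    hidden=384,
    layers=6,
    intermediate=1536,
    max_positions=512,
    type_vocab=2,
    num_labels=1,
)
weight_bytes = params * 4  # float32
print(params, weight_bytes, 90866412 - weight_bytes)
```

Under these assumptions the weights account for all but roughly 12 KB of the 90,866,412-byte file, which would be consistent with the remainder being the safetensors metadata header.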
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,59 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "truncation": true,
+   "unk_token": "[UNK]"
+ }
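Reviewer note: this tokenizer config implies the usual BERT pair layout, `[CLS] query [SEP] passage [SEP]`, capped at `model_max_length` 512. The sketch below builds that layout from already-tokenized ids, using the special-token ids listed in `added_tokens_decoder` (101 for `[CLS]`, 102 for `[SEP]`). It is a simplified illustration, not the tokenizer's code: it trims the passage first, whereas the real tokenizer's truncation strategy may differ, and the word-piece ids in the example are hypothetical.

```python
CLS_ID = 101      # "[CLS]" per added_tokens_decoder
SEP_ID = 102      # "[SEP]" per added_tokens_decoder
MAX_LENGTH = 512  # model_max_length


def build_pair_inputs(
    query_ids: list[int],
    passage_ids: list[int],
    max_length: int = MAX_LENGTH,
) -> tuple[list[int], list[int]]:
    """Assemble input ids and token type ids for one (query, passage) pair."""
    budget = max_length - 3  # room left after [CLS] and two [SEP] tokens
    query = list(query_ids)
    passage = list(passage_ids)

    overflow = len(query) + len(passage) - budget
    if overflow > 0:
        # Simplification: cut the passage tail first, then the query if needed.
        keep = max(len(passage) - overflow, 0)
        passage = passage[:keep]
        query = query[: budget - keep]

    input_ids = [CLS_ID] + query + [SEP_ID] + passage + [SEP_ID]
    # Segment 0 covers [CLS] + query + first [SEP]; segment 1 covers the rest.
    token_type_ids = [0] * (len(query) + 2) + [1] * (len(passage) + 1)
    return input_ids, token_type_ids


# Hypothetical word-piece ids, for illustration only.
ids, types = build_pair_inputs([7, 8], [9, 10, 11])
print(ids)
print(types)
```

Since the README reports passages of up to 1,000 characters, most pairs probably fit within 512 tokens, but longer inputs would be cut somewhere, so the passage tail may never reach the model.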
training_config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "base_model": "cross-encoder/ms-marco-MiniLM-L-6-v2",
+   "max_length": 512,
+   "training_samples": 2000,
+   "epochs": 3,
+   "learning_rate": 2e-05,
+   "warmup_steps": 37,
+   "original_pairs": 2000,
+   "enhanced_pairs": 11
+ }
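Reviewer note: `training_config.json` records `learning_rate` 2e-05 and `warmup_steps` 37, while the README's hyperparameter list shows `learning_rate` 5e-05 and `warmup_steps` 0. The commit does not say which pair of values the run actually used. The sketch below shows what a linear warmup followed by linear decay would look like with the `training_config.json` values. The linear scheduler comes from the README's `lr_scheduler_type`. The total of 750 steps is an assumption derived from the README's batch size of 8, which this file does not state.

```python
BASE_LR = 2e-05     # "learning_rate" in training_config.json
WARMUP_STEPS = 37   # "warmup_steps" in training_config.json
TOTAL_STEPS = 750   # assumption: 2000 samples / batch size 8 * 3 epochs


def linear_schedule_lr(
    step: int,
    base_lr: float = BASE_LR,
    warmup_steps: int = WARMUP_STEPS,
    total_steps: int = TOTAL_STEPS,
) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 10, 37, 400, 750):
    print(step, linear_schedule_lr(step))
```

With these numbers the rate climbs from zero to its peak over the first 37 steps (about 5% of the assumed run) and then falls linearly back to zero at the final step.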
vocab.txt ADDED
The diff for this file is too large to render. See raw diff