AryehRotberg committed on
Commit 3cb27d6 · verified · 1 Parent(s): cb096ee

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
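
The pooling config above enables only `pooling_mode_mean_tokens`, so the sentence vector is the average of the token vectors. A minimal sketch of that idea follows, in plain Python with toy 2-dimensional vectors instead of the real 384 dimensions. The `mean_pool` helper and the sample data are illustrative assumptions, not the library's implementation.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask is 1; padding (mask 0) is ignored."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                summed[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [s / count for s in summed]


# Hypothetical toy input: two real tokens and one padding row.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

The padding row does not affect the result because its mask entry is 0.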
README.md ADDED
@@ -0,0 +1,521 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - dense
+ - generated_from_trainer
+ - dataset_size:122856
+ - loss:MultipleNegativesRankingLoss
+ base_model: sentence-transformers/all-MiniLM-L6-v2
+ widget:
+ - source_sentence: '"To update your preferences, ask us to remove your information
+     from our marketing mailing lists or submit a request, please contact us as outlined
+     in the How To Contact Us Section below."'
+   sentences:
+   - You can opt out of promotional communications
+   - IP addresses of website visitors are not tracked
+   - If you are the target of a copyright holder's take down notice, this service gives
+     you the opportunity to defend yourself
+ - source_sentence: To ensure that disputes are dealt with soon after they arise, you
+     agree that regardless of any statute or law to the contrary, any claim or cause
+     of action you might have arising out of or related to use of our services or these
+     Terms of Use must be filed within the applicable statute of limitations or, if
+     earlier, one (1) year after the pertinent facts underlying such claim or cause
+     of action could have been discovered with reasonable diligence (or be forever
+     barred).
+   sentences:
+   - This service gives your personal data to third parties involved in its operation
+   - The service claims to be CCPA compliant for California users
+   - You have a reduced time period to take legal action against the service
+ - source_sentence: 'The privacy policy states: "To be able to offer our products and
+     services for free, we serve third-party ads of advertising companies in our products
+     for mobile devices. To enable the ad, we embed a software development kit (“SDK”)
+     provided by an advertising company into the product, which then collects Personal
+     Data in order to personalize ads for you."'
+   sentences:
+   - You are tracked via web beacons, tracking pixels, browser fingerprinting, and/or
+     device fingerprinting
+   - Your personal data may be used for marketing purposes
+   - You are tracked via web beacons, tracking pixels, browser fingerprinting, and/or
+     device fingerprinting
+ - source_sentence: The organization cannot be held responsible for the consequences
+     of negligence by the user, notably of failure by the user to secure their password.
+   sentences:
+   - Your content can be licensed to third parties
+   - Spidering, crawling, or accessing the site through any automated means is not
+     allowed
+   - You are responsible for maintaining the security of your account and for the activities
+     on your account
+ - source_sentence: The Services may contain links or connections to third party websites
+     or services that are not owned or controlled by Guilded. When you access third
+     party websites or use third party services, you accept that there are risks in
+     doing so, and that Guilded is not responsible for such risks.
+   sentences:
+   - This service assumes no responsibility and liability for the contents of links
+     to other websites
+   - Copyright license limited for the purposes of that same service but transferable
+     and sublicenseable
+   - Your content can be deleted if you violate the terms
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ metrics:
+ - cosine_accuracy
+ model-index:
+ - name: SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
+   results:
+   - task:
+       type: triplet
+       name: Triplet
+     dataset:
+       name: all nli dev
+       type: all-nli-dev
+     metrics:
+     - type: cosine_accuracy
+       value: 0.9993162751197815
+       name: Cosine Accuracy
+ ---
+
+ # SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) <!-- at revision c9745ed1d9f207416be6d2e6f8de32d1f16199bf -->
+ - **Maximum Sequence Length:** 256 tokens
+ - **Output Dimensionality:** 384 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
+   (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("AryehRotberg/ToS-Sentence-Transformers-V4")
+ # Run inference
+ sentences = [
+     'The Services may contain links or connections to third party websites or services that are not owned or controlled by Guilded. When you access third party websites or use third party services, you accept that there are risks in doing so, and that Guilded is not responsible for such risks.',
+     'This service assumes no responsibility and liability for the contents of links to other websites',
+     'Copyright license limited for the purposes of that same service but transferable and sublicenseable',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # (3, 384)
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities)
+ # tensor([[ 1.0000,  0.6397, -0.0500],
+ #         [ 0.6397,  1.0000,  0.0874],
+ #         [-0.0500,  0.0874,  1.0000]])
+ ```
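
The similarity scores in the usage snippet are cosine similarities. Because the architecture ends with a `Normalize()` module, the embeddings should already be unit length, in which case cosine similarity reduces to a plain dot product. A small sketch of that relationship, using toy 2-dimensional vectors and helper names (`l2_normalize`, `dot`, `cosine`) invented for illustration:

```python
import math
from typing import List


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity computed the long way, by normalizing both inputs first."""
    return dot(l2_normalize(a), l2_normalize(b))


# Toy vectors standing in for two sentence embeddings.
u = l2_normalize([3.0, 4.0])
v = l2_normalize([4.0, 3.0])

# For unit-length inputs the dot product and the cosine agree.
print(math.isclose(dot(u, v), cosine([3.0, 4.0], [4.0, 3.0])))  # True
```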
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Triplet
+
+ * Dataset: `all-nli-dev`
+ * Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | **cosine_accuracy** | **0.9993** |
+
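
As a rough illustration of the metric in the table above: triplet cosine accuracy is the fraction of (anchor, positive, negative) triplets in which the anchor is more similar to the positive than to the negative. The sketch below uses toy 2-dimensional vectors and a hypothetical `cosine_accuracy` helper, with a zero margin assumed. The final two lines assume the reported value was computed over all 30,714 evaluation samples, which the card does not state explicitly.

```python
import math
from typing import List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def cosine_accuracy(triplets: List[Tuple[Vector, Vector, Vector]]) -> float:
    """Share of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    correct = sum(1 for a, p, n in triplets if cosine(a, p) > cosine(a, n))
    return correct / len(triplets)


# Two toy triplets: the first is ranked correctly, the second is not.
toy = [
    ([1.0, 0.0], [1.0, 0.1], [0.0, 1.0]),
    ([1.0, 0.0], [0.0, 1.0], [1.0, 0.1]),
]
print(cosine_accuracy(toy))  # 0.5

# Under the stated assumption, the reported accuracy implies this many misranked triplets.
misranked = round(30_714 * (1 - 0.9993162751197815))
print(misranked)  # 21
```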
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+ * Size: 122,856 training samples
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
+ * Approximate statistics based on the first 1000 samples:
+ | | anchor | positive | negative |
+ |:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
+ | type | string | string | string |
+ | details | <ul><li>min: 3 tokens</li><li>mean: 48.49 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.21 tokens</li><li>max: 29 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 14.34 tokens</li><li>max: 29 tokens</li></ul> |
+ * Samples:
+ | anchor | positive | negative |
+ |:-------|:---------|:---------|
+ | <code>If you ever decide to stop using Snapchat, you can just ask us to delete your account.</code> | <code>You have the right to leave this service at any time</code> | <code>Your personal information is used for many different purposes</code> |
+ | <code>you forever waive and agree not to claim or assert any entitlement to any and all moral rights of an author in any of the User Content.</code> | <code>You waive your moral rights</code> | <code>You aren’t allowed to remove or edit user-generated content</code> |
+ | <code>You agree and shall indemnify and hold Dailymotion- harmless from and against any liability, loss, damages (including punitive damages), claim, settlement payment, cost and expense, interest, award, judgment, diminution in value, fine, fee (including reasonable attorneys’ fees), and penalty, or other charge (including reasonable attorneys’ fees and all other cost of investigating, defending or asserting any claim for indemnification under these Terms) arising from or relating to (i) Your Content, (ii) Your violation of the Terms or any other policy of Dailymotion. (iii) Your use of the Dailymotion Service. and (iv) Your violation of any third party rights, including without limitation any copyright, property, publicity or privacy rights.</code> | <code>You agree to defend, indemnify, and hold the service harmless in case of a claim related to your use of the service</code> | <code>User-generated content can be blocked or censored for any reason</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+ ```json
+ {
+     "scale": 20.0,
+     "similarity_fct": "cos_sim",
+     "gather_across_devices": false
+ }
+ ```
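
To make the loss parameters above concrete, here is a simplified sketch of the idea behind a multiple-negatives ranking loss with `scale = 20.0` and cosine similarity. Each anchor is scored against every positive and negative in the batch, and cross-entropy pushes the anchor's own positive to the top. This is plain Python on toy vectors with a hypothetical `mnr_loss` function. It is meant to illustrate the technique, not to reproduce the library's implementation.

```python
import math
from typing import List

Vector = List[float]


def cos_sim(a: Vector, b: Vector) -> float:
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def mnr_loss(anchors: List[Vector], positives: List[Vector],
             negatives: List[Vector], scale: float = 20.0) -> float:
    """Mean cross-entropy where anchor i must pick positives[i] among all candidates."""
    candidates = positives + negatives  # in-batch positives double as negatives for other anchors
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        top = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


# Toy batch of two triplets.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.1], [0.1, 1.0]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

aligned = mnr_loss(anchors, positives, negatives)
shuffled = mnr_loss(anchors, positives[::-1], negatives)  # positives paired with the wrong anchors
print(aligned < shuffled)  # True
```

The larger the scale, the more sharply the softmax separates the correct candidate from the rest.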
+
+ ### Evaluation Dataset
+
+ #### Unnamed Dataset
+
+ * Size: 30,714 evaluation samples
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
+ * Approximate statistics based on the first 1000 samples:
+ | | anchor | positive | negative |
+ |:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
+ | type | string | string | string |
+ | details | <ul><li>min: 4 tokens</li><li>mean: 49.34 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.13 tokens</li><li>max: 29 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 14.28 tokens</li><li>max: 29 tokens</li></ul> |
+ * Samples:
+ | anchor | positive | negative |
+ |:-------|:---------|:---------|
+ | <code>YOU AGREE THAT USE OF THE WEB SITE AND THE SERVICES IS AT YOUR SOLE RISK.</code> | <code>The service is provided 'as is' and to be used at your sole risk</code> | <code>The court of law governing the terms is in a jurisdiction that is friendlier to user privacy protection.</code> |
+ | <code>If you continue to use our services after the changes have taken effect, it means that you agree to the changes.</code> | <code>Terms may be changed at any time</code> | <code>The service is only available in some countries approved by its government</code> |
+ | <code>We may revise these Terms of Use or any of the other Terms from time to time. You are ,expected to check this page and our Terms from time to time to take notice of any changes</code> | <code>Terms may be changed at any time</code> | <code>Voice data is collected and shared with third-parties</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+ ```json
+ {
+     "scale": 20.0,
+     "similarity_fct": "cos_sim",
+     "gather_across_devices": false
+ }
+ ```
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `learning_rate`: 2e-05
+ - `num_train_epochs`: 1
+ - `warmup_ratio`: 0.1
+ - `fp16`: True
+ - `batch_sampler`: no_duplicates
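
The hyperparameters above determine the length of the run and the shape of the learning-rate curve. The sketch below works through that arithmetic. It assumes a single device, no gradient accumulation, and that the `no_duplicates` sampler does not change the number of batches. The `linear_lr` function is a hypothetical helper that mirrors a linear warmup followed by linear decay.

```python
import math

train_samples = 122_856
batch_size = 16
peak_lr = 2e-05
warmup_ratio = 0.1

# One epoch, so total optimizer steps equal the number of batches.
total_steps = math.ceil(train_samples / batch_size)
warmup_steps = math.ceil(total_steps * warmup_ratio)
print(total_steps, warmup_steps)  # 7679 768


def linear_lr(step: int) -> float:
    """Linear warmup to peak_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    return peak_lr * max(0, total_steps - step) / max(1, total_steps - warmup_steps)


# The epoch fraction for step 100 under this step count.
print(round(100 / total_steps, 4))  # 0.013
```

Under these assumptions the step count is consistent with the epoch column of the training logs, where step 100 is reported as epoch 0.0130.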
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 2e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 1
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `bf16`: False
+ - `fp16`: True
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `parallelism_config`: None
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch_fused
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `project`: huggingface
+ - `trackio_space_id`: trackio
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `hub_revision`: None
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: no
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `liger_kernel_config`: None
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: True
+ - `prompts`: None
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+ - `router_mapping`: {}
+ - `learning_rate_mapping`: {}
+
+ </details>
+
+ ### Training Logs
+ | Epoch | Step | Training Loss | Validation Loss | all-nli-dev_cosine_accuracy |
+ |:------:|:----:|:-------------:|:---------------:|:---------------------------:|
+ | -1 | -1 | - | - | 0.9426 |
+ | 0.0130 | 100 | 1.4227 | 1.1709 | 0.9595 |
+ | 0.0260 | 200 | 1.1178 | 0.9104 | 0.9727 |
+ | 0.0391 | 300 | 0.9473 | 0.7546 | 0.9799 |
+ | 0.0521 | 400 | 0.7559 | 0.6471 | 0.9853 |
+ | 0.0651 | 500 | 0.6617 | 0.5684 | 0.9880 |
+ | 0.0781 | 600 | 0.5857 | 0.5047 | 0.9899 |
+ | 0.0912 | 700 | 0.5768 | 0.4578 | 0.9910 |
+ | 0.1042 | 800 | 0.493 | 0.4281 | 0.9921 |
+ | 0.1172 | 900 | 0.4877 | 0.3899 | 0.9931 |
+ | 0.1302 | 1000 | 0.4315 | 0.3593 | 0.9939 |
+ | 0.1432 | 1100 | 0.3894 | 0.3458 | 0.9940 |
+ | 0.1563 | 1200 | 0.3681 | 0.3215 | 0.9945 |
+ | 0.1693 | 1300 | 0.3533 | 0.3151 | 0.9951 |
+ | 0.1823 | 1400 | 0.3242 | 0.3093 | 0.9949 |
+ | 0.1953 | 1500 | 0.346 | 0.2820 | 0.9955 |
+ | 0.2084 | 1600 | 0.3212 | 0.2637 | 0.9960 |
+ | 0.2214 | 1700 | 0.2889 | 0.2601 | 0.9960 |
+ | 0.2344 | 1800 | 0.2855 | 0.2423 | 0.9960 |
+ | 0.2474 | 1900 | 0.2621 | 0.2396 | 0.9964 |
+ | 0.2605 | 2000 | 0.265 | 0.2299 | 0.9968 |
+ | 0.2735 | 2100 | 0.2401 | 0.2191 | 0.9969 |
+ | 0.2865 | 2200 | 0.254 | 0.2166 | 0.9966 |
+ | 0.2995 | 2300 | 0.2543 | 0.2036 | 0.9971 |
+ | 0.3125 | 2400 | 0.2667 | 0.1958 | 0.9973 |
+ | 0.3256 | 2500 | 0.2236 | 0.1937 | 0.9972 |
+ | 0.3386 | 2600 | 0.232 | 0.1875 | 0.9974 |
+ | 0.3516 | 2700 | 0.2021 | 0.1806 | 0.9977 |
+ | 0.3646 | 2800 | 0.2147 | 0.1787 | 0.9974 |
+ | 0.3777 | 2900 | 0.1929 | 0.1727 | 0.9975 |
+ | 0.3907 | 3000 | 0.1778 | 0.1721 | 0.9977 |
+ | 0.4037 | 3100 | 0.2031 | 0.1678 | 0.9974 |
+ | 0.4167 | 3200 | 0.1784 | 0.1645 | 0.9978 |
+ | 0.4297 | 3300 | 0.183 | 0.1593 | 0.9977 |
+ | 0.4428 | 3400 | 0.1878 | 0.1508 | 0.9979 |
+ | 0.4558 | 3500 | 0.1915 | 0.1478 | 0.9980 |
+ | 0.4688 | 3600 | 0.1611 | 0.1448 | 0.9983 |
+ | 0.4818 | 3700 | 0.1606 | 0.1385 | 0.9983 |
+ | 0.4949 | 3800 | 0.1604 | 0.1408 | 0.9984 |
+ | 0.5079 | 3900 | 0.1733 | 0.1327 | 0.9983 |
+ | 0.5209 | 4000 | 0.159 | 0.1277 | 0.9986 |
+ | 0.5339 | 4100 | 0.1554 | 0.1255 | 0.9987 |
+ | 0.5469 | 4200 | 0.1546 | 0.1225 | 0.9985 |
+ | 0.5600 | 4300 | 0.1536 | 0.1222 | 0.9984 |
+ | 0.5730 | 4400 | 0.1253 | 0.1174 | 0.9987 |
+ | 0.5860 | 4500 | 0.151 | 0.1137 | 0.9986 |
+ | 0.5990 | 4600 | 0.1293 | 0.1116 | 0.9988 |
+ | 0.6121 | 4700 | 0.1272 | 0.1093 | 0.9986 |
+ | 0.6251 | 4800 | 0.1326 | 0.1074 | 0.9985 |
+ | 0.6381 | 4900 | 0.135 | 0.1044 | 0.9987 |
+ | 0.6511 | 5000 | 0.1253 | 0.1013 | 0.9989 |
+ | 0.6641 | 5100 | 0.1466 | 0.0995 | 0.9989 |
+ | 0.6772 | 5200 | 0.1378 | 0.0993 | 0.9991 |
+ | 0.6902 | 5300 | 0.1245 | 0.0959 | 0.9989 |
+ | 0.7032 | 5400 | 0.1124 | 0.0946 | 0.9989 |
+ | 0.7162 | 5500 | 0.0937 | 0.0926 | 0.9988 |
+ | 0.7293 | 5600 | 0.1378 | 0.0907 | 0.9990 |
+ | 0.7423 | 5700 | 0.1234 | 0.0889 | 0.9991 |
+ | 0.7553 | 5800 | 0.1153 | 0.0876 | 0.9991 |
+ | 0.7683 | 5900 | 0.1172 | 0.0865 | 0.9990 |
+ | 0.7814 | 6000 | 0.1135 | 0.0855 | 0.9992 |
+ | 0.7944 | 6100 | 0.1178 | 0.0834 | 0.9991 |
+ | 0.8074 | 6200 | 0.1195 | 0.0812 | 0.9991 |
+ | 0.8204 | 6300 | 0.1068 | 0.0795 | 0.9991 |
+ | 0.8334 | 6400 | 0.0824 | 0.0791 | 0.9992 |
+ | 0.8465 | 6500 | 0.1173 | 0.0768 | 0.9992 |
+ | 0.8595 | 6600 | 0.1166 | 0.0757 | 0.9992 |
+ | 0.8725 | 6700 | 0.1119 | 0.0755 | 0.9992 |
+ | 0.8855 | 6800 | 0.1017 | 0.0750 | 0.9993 |
+ | 0.8986 | 6900 | 0.1148 | 0.0745 | 0.9993 |
+ | 0.9116 | 7000 | 0.0976 | 0.0736 | 0.9993 |
+ | 0.9246 | 7100 | 0.0973 | 0.0728 | 0.9993 |
+ | 0.9376 | 7200 | 0.0984 | 0.0726 | 0.9993 |
+ | 0.9506 | 7300 | 0.0943 | 0.0723 | 0.9993 |
+ | 0.9637 | 7400 | 0.0825 | 0.0719 | 0.9993 |
+ | 0.9767 | 7500 | 0.0961 | 0.0716 | 0.9993 |
+ | 0.9897 | 7600 | 0.0893 | 0.0715 | 0.9993 |
+
+
+ ### Framework Versions
+ - Python: 3.12.11
+ - Sentence Transformers: 5.1.1
+ - Transformers: 4.57.0
+ - PyTorch: 2.8.0+cu126
+ - Accelerate: 1.10.1
+ - Datasets: 4.0.0
+ - Tokenizers: 0.22.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,25 @@
+ {
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "dtype": "float32",
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 6,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "transformers_version": "4.57.0",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "__version__": {
+     "sentence_transformers": "5.1.1",
+     "transformers": "4.57.0",
+     "pytorch": "2.8.0+cu126"
+   },
+   "model_type": "SentenceTransformer",
+   "prompts": {
+     "query": "",
+     "document": ""
+   },
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6df49c8222f6c438ee26598471ad64a41d7aedecc000a8e69edcc027bc8aa8dd
+ size 90864192
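
As a sanity check, the size in the LFS pointer above can be compared with a parameter count derived from the values in `config.json`. The sketch below assumes a standard BERT encoder layout (embeddings with LayerNorm, six encoder layers, and a pooler head) stored as `float32`. The `bert_param_count` helper is hypothetical and written only for this estimate.

```python
def bert_param_count(vocab_size: int, hidden: int, layers: int, intermediate: int,
                     max_positions: int, type_vocab: int, with_pooler: bool = True) -> int:
    """Estimate the parameter count of a standard BERT encoder."""
    layer_norm = 2 * hidden  # weight and bias
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value, and output projections, each with a bias, plus LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Two feed-forward projections with biases, plus LayerNorm.
    feed_forward = (hidden * intermediate + intermediate) + (intermediate * hidden + hidden) + layer_norm
    pooler = (hidden * hidden + hidden) if with_pooler else 0
    return embeddings + layers * (attention + feed_forward) + pooler


params = bert_param_count(vocab_size=30522, hidden=384, layers=6,
                          intermediate=1536, max_positions=512, type_vocab=2)
weight_bytes = params * 4  # float32 uses 4 bytes per parameter
file_size = 90_864_192     # size reported by the LFS pointer

print(params, weight_bytes, file_size - weight_bytes)  # 22713216 90852864 11328
```

The estimate is about 22.7 million parameters. The roughly 11 KB left over is presumably the safetensors header, though the diff does not show that directly.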
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 256,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,65 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": false,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "max_length": 128,
+   "model_max_length": 256,
+   "never_split": null,
+   "pad_to_multiple_of": null,
+   "pad_token": "[PAD]",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "[SEP]",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff