taido committed
Commit 11d9b4d · verified · 1 parent: c3521bb

Upload folder using huggingface_hub

Files changed (7)
  1. README.md +434 -0
  2. config.json +35 -0
  3. model.safetensors +3 -0
  4. special_tokens_map.json +37 -0
  5. tokenizer.json +0 -0
  6. tokenizer_config.json +58 -0
  7. vocab.txt +0 -0
README.md ADDED
---
tags:
- sentence-transformers
- cross-encoder
- reranker
- generated_from_trainer
- dataset_size:1485
- loss:BinaryCrossEntropyLoss
base_model: cross-encoder/ms-marco-MiniLM-L12-v2
pipeline_tag: text-ranking
library_name: sentence-transformers
metrics:
- accuracy
- accuracy_threshold
- f1
- f1_threshold
- precision
- recall
- average_precision
model-index:
- name: CrossEncoder based on cross-encoder/ms-marco-MiniLM-L12-v2
  results:
  - task:
      type: cross-encoder-classification
      name: Cross Encoder Classification
    dataset:
      name: compliance eval
      type: compliance-eval
    metrics:
    - type: accuracy
      value: 0.9636363636363636
      name: Accuracy
    - type: accuracy_threshold
      value: -1.7519245147705078
      name: Accuracy Threshold
    - type: f1
      value: 0.9662921348314608
      name: F1
    - type: f1_threshold
      value: -2.8691844940185547
      name: F1 Threshold
    - type: precision
      value: 0.9555555555555556
      name: Precision
    - type: recall
      value: 0.9772727272727273
      name: Recall
    - type: average_precision
      value: 0.9939968601076801
      name: Average Precision
---

# CrossEncoder based on cross-encoder/ms-marco-MiniLM-L12-v2

This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [cross-encoder/ms-marco-MiniLM-L12-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L12-v2) using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

## Model Details

### Model Description
- **Model Type:** Cross Encoder
- **Base model:** [cross-encoder/ms-marco-MiniLM-L12-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L12-v2) <!-- at revision 7b0235231ca2674cb8ca8f022859a6eba2b1c968 -->
- **Maximum Sequence Length:** 512 tokens
- **Number of Output Labels:** 1 label
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
- **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("cross_encoder_model_id")
# Get scores for pairs of texts
pairs = [
    ['the system must identify any risk profile that has expired and is currently marked as overdue to ensure ongoing suitability compliance.', "so, like, your portfolio risk profile is out of date, and i've got a flag here saying it needs renewal before we can do any new trades."],
    ['to identify risk misalignment trades, the system must flag a risk mismatch whenever the product risk rating exceeds the client risk profile.', "so, it's a solid choice, but i gotta mention, there's a bit of a risk mismatch between the fund's rating and your own suitability score, so it's a bit of a hurdle."],
    ['the system identifies an execution only wrapper when the order initiation confirms that this trade is performed on an execution only basis with no advice given.', "so... uh... let's just do it, but it's execution only, you know? no advice was provided, so you're on your own with the strategy on this one, i'm so rushed."],
    ['the system must identify any risk profile that has expired and is currently marked as overdue to ensure ongoing suitability compliance.', "hey, um, checking the dashboard here and it says your prp is overdue, you know, we haven't updated it in a bit and it's flagged."],
    ['to identify risk misalignment trades, the system must flag a risk mismatch whenever the product risk rating exceeds the client risk profile.', "don't worry about the specifics right now the main thing is getting the allocation because it's oversubscribed so can i confirm the trade"],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'the system must identify any risk profile that has expired and is currently marked as overdue to ensure ongoing suitability compliance.',
    [
        "so, like, your portfolio risk profile is out of date, and i've got a flag here saying it needs renewal before we can do any new trades.",
        "so, it's a solid choice, but i gotta mention, there's a bit of a risk mismatch between the fund's rating and your own suitability score, so it's a bit of a hurdle.",
        "so... uh... let's just do it, but it's execution only, you know? no advice was provided, so you're on your own with the strategy on this one, i'm so rushed.",
        "hey, um, checking the dashboard here and it says your prp is overdue, you know, we haven't updated it in a bit and it's flagged.",
        "don't worry about the specifics right now the main thing is getting the allocation because it's oversubscribed so can i confirm the trade",
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
```
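Note that `predict` returns raw logits, since this model was trained with an `Identity` activation rather than a sigmoid. A minimal sketch of turning those logits into binary compliance labels, using the accuracy-maximizing threshold of -1.7519 reported under Evaluation (the example scores are made up, not real model output):

```python
import math

# Accuracy-maximizing decision threshold reported in the Evaluation section.
ACCURACY_THRESHOLD = -1.7519245147705078

def scores_to_labels(scores, threshold=ACCURACY_THRESHOLD):
    """Map raw cross-encoder logits to 0/1 compliance labels."""
    return [int(s > threshold) for s in scores]

def scores_to_probabilities(scores):
    """Optionally squash logits into [0, 1] with a sigmoid for readability."""
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]

# Hypothetical logits standing in for the output of model.predict(pairs):
example_scores = [4.2, -3.1, 0.0]
print(scores_to_labels(example_scores))  # [1, 0, 1]
```

Because the threshold is a tuned quantity rather than 0 (i.e. a sigmoid probability of 0.5), re-tune it on your own validation data if you reuse this model on a different distribution.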

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Cross Encoder Classification

* Dataset: `compliance-eval`
* Evaluated with [<code>CrossEncoderClassificationEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderClassificationEvaluator)

| Metric                | Value     |
|:----------------------|:----------|
| accuracy              | 0.9636    |
| accuracy_threshold    | -1.7519   |
| f1                    | 0.9663    |
| f1_threshold          | -2.8692   |
| precision             | 0.9556    |
| recall                | 0.9773    |
| **average_precision** | **0.994** |

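The thresholded metrics above are mutually consistent: they can all be reproduced from one confusion matrix over the 165 evaluation samples. The matrix below (TP=86, FP=4, FN=2, TN=73, i.e. 88 positive and 77 negative pairs) is inferred from the reported values, not logged by the evaluator, but it matches every thresholded figure in the table:

```python
# Inferred confusion matrix (an assumption reconstructed from the metrics,
# not stated anywhere in the training output):
TP, FP, FN, TN = 86, 4, 2, 73
n = TP + FP + FN + TN  # 165 evaluation samples

accuracy = (TP + TN) / n
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)

print(round(accuracy, 4), round(precision, 4), round(recall, 4), round(f1, 4))
# 0.9636 0.9556 0.9773 0.9663
```

This also agrees with the evaluation dataset statistics below (165 samples, mean label 0.53 ≈ 88/165).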
<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 1,485 training samples
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
* Approximate statistics based on the first 1000 samples:
  |         | sentence1 | sentence2 | label |
  |:--------|:----------|:----------|:------|
  | type    | string    | string    | float |
  | details | <ul><li>min: 135 characters</li><li>mean: 302.95 characters</li><li>max: 725 characters</li></ul> | <ul><li>min: 97 characters</li><li>mean: 179.3 characters</li><li>max: 463 characters</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.49</li><li>max: 1.0</li></ul> |
* Samples:
  | sentence1 | sentence2 | label |
  |:----------|:----------|:------|
  | <code>the rm must use the instrument_code to identify the soft lock disclosure and inform the client that 'this fund has a soft lock-up duration of xx months. you will be subjected to an early redemption charge of x% by the fund house if you were to redeem the fund within the soft lock-up period.' and, if applicable, that 'the fund is currently still within the soft lock-up period. should you wish to proceed with the redemption, you will incur an early redemption charge of x% by the fund house.'</code> | <code>there's a bit of a soft lock on this one, you know, if you take the money out too soon there's a small charge, but it's no big deal.</code> | <code>0.0</code> |
  | <code>the system identifies an execution only wrapper when the order initiation confirms that this trade is performed on an execution only basis with no advice given.</code> | <code>i can't believe how expensive flights have become lately, it's just ridiculous. let's just go ahead with that stock buy, i'll put it through as we discussed earlier, it’s a simple execution for us.</code> | <code>0.0</code> |
  | <code>for a client initiated (ci) wrapper where the order initiation is 'client initiated', the bank must confirm that 'this trade is based on your initiated interest in underlying and product type' or 'this trade is based on your initiated interest in underlying or product type'.</code> | <code>exactly, i-i see what you mean, and since you're the one who initiated this conversation about the emerging markets fund, i'll just log that as your interest. did you ever get that classic car fixed up?</code> | <code>1.0</code> |
* Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
  ```json
  {
      "activation_fn": "torch.nn.modules.linear.Identity",
      "pos_weight": null
  }
  ```

### Evaluation Dataset

#### Unnamed Dataset

* Size: 165 evaluation samples
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
* Approximate statistics based on the first 165 samples:
  |         | sentence1 | sentence2 | label |
  |:--------|:----------|:----------|:------|
  | type    | string    | string    | float |
  | details | <ul><li>min: 135 characters</li><li>mean: 302.44 characters</li><li>max: 725 characters</li></ul> | <ul><li>min: 97 characters</li><li>mean: 178.02 characters</li><li>max: 631 characters</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.53</li><li>max: 1.0</li></ul> |
* Samples:
  | sentence1 | sentence2 | label |
  |:----------|:----------|:------|
  | <code>the system must identify any risk profile that has expired and is currently marked as overdue to ensure ongoing suitability compliance.</code> | <code>so, like, your portfolio risk profile is out of date, and i've got a flag here saying it needs renewal before we can do any new trades.</code> | <code>1.0</code> |
  | <code>to identify risk misalignment trades, the system must flag a risk mismatch whenever the product risk rating exceeds the client risk profile.</code> | <code>so, it's a solid choice, but i gotta mention, there's a bit of a risk mismatch between the fund's rating and your own suitability score, so it's a bit of a hurdle.</code> | <code>1.0</code> |
  | <code>the system identifies an execution only wrapper when the order initiation confirms that this trade is performed on an execution only basis with no advice given.</code> | <code>so... uh... let's just do it, but it's execution only, you know? no advice was provided, so you're on your own with the strategy on this one, i'm so rushed.</code> | <code>1.0</code> |
* Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
  ```json
  {
      "activation_fn": "torch.nn.modules.linear.Identity",
      "pos_weight": null
  }
  ```
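The `Identity` activation means the loss is applied directly to the raw logit (effectively `torch.nn.BCEWithLogitsLoss`), and `pos_weight: null` means positive pairs are not up-weighted. A pure-Python sketch of the per-sample loss, purely illustrative:

```python
import math

def bce_with_logits(logit, label, pos_weight=1.0):
    """Per-sample binary cross-entropy on a raw logit.

    pos_weight=1.0 mirrors "pos_weight": null above, i.e. positive
    pairs carry the same weight as negative ones.
    """
    p = 1.0 / (1.0 + math.exp(-logit))  # sigmoid
    return -(pos_weight * label * math.log(p) + (1.0 - label) * math.log(1.0 - p))

# A confidently correct prediction is cheap, a confidently wrong one expensive:
print(round(bce_with_logits(4.0, 1.0), 4))  # 0.0181
print(round(bce_with_logits(4.0, 0.0), 4))  # 4.0181
```

A non-null `pos_weight` would be the lever to pull if the training data were imbalanced; here the label mean of ~0.49 makes the default reasonable.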

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `learning_rate`: 2e-05
- `warmup_ratio`: 0.1
- `load_best_model_at_end`: True

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 3
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `parallelism_config`: None
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `project`: huggingface
- `trackio_space_id`: trackio
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: no
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: True
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}

</details>

### Training Logs
| Epoch     | Step    | Training Loss | Validation Loss | compliance-eval_average_precision |
|:---------:|:-------:|:-------------:|:---------------:|:---------------------------------:|
| 0.1075    | 10      | 1.9119        | 1.1985          | 0.6783                            |
| 0.2151    | 20      | 0.9675        | 1.0970          | 0.6914                            |
| 0.3226    | 30      | 0.7458        | 0.4725          | 0.8480                            |
| 0.4301    | 40      | 0.5308        | 0.4431          | 0.8849                            |
| 0.5376    | 50      | 0.3888        | 0.4183          | 0.9097                            |
| 0.6452    | 60      | 0.3477        | 0.3472          | 0.9325                            |
| 0.7527    | 70      | 0.3082        | 0.3005          | 0.9524                            |
| 0.8602    | 80      | 0.3364        | 0.2682          | 0.9647                            |
| 0.9677    | 90      | 0.3069        | 0.2345          | 0.9804                            |
| 1.0753    | 100     | 0.2636        | 0.1847          | 0.9886                            |
| 1.1828    | 110     | 0.2577        | 0.1793          | 0.9847                            |
| 1.2903    | 120     | 0.1793        | 0.1940          | 0.9826                            |
| 1.3978    | 130     | 0.19          | 0.2333          | 0.9794                            |
| 1.5054    | 140     | 0.1788        | 0.1615          | 0.9858                            |
| 1.6129    | 150     | 0.1277        | 0.1576          | 0.9862                            |
| 1.7204    | 160     | 0.1851        | 0.1399          | 0.9903                            |
| 1.8280    | 170     | 0.1652        | 0.1056          | 0.9947                            |
| 1.9355    | 180     | 0.085         | 0.1077          | 0.9949                            |
| **2.043** | **190** | **0.1111**    | **0.0943**      | **0.9955**                        |
| 2.1505    | 200     | 0.09          | 0.1137          | 0.9955                            |
| 2.2581    | 210     | 0.1136        | 0.1222          | 0.9934                            |
| 2.3656    | 220     | 0.0703        | 0.1155          | 0.9937                            |
| 2.4731    | 230     | 0.0866        | 0.1147          | 0.9935                            |
| 2.5806    | 240     | 0.1104        | 0.1089          | 0.9943                            |
| 2.6882    | 250     | 0.1523        | 0.1141          | 0.9940                            |
| 2.7957    | 260     | 0.1189        | 0.1297          | 0.9943                            |
| 2.9032    | 270     | 0.0479        | 0.1365          | 0.9940                            |

* The bold row denotes the saved checkpoint.

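The fractional epoch values follow directly from the dataset and batch size: 1,485 samples at a batch size of 16 give 93 optimizer steps per epoch (assuming a single device and no gradient accumulation, consistent with the hyperparameters above). A quick check:

```python
import math

# 1,485 training samples / per_device_train_batch_size of 16,
# with the last partial batch kept (dataloader_drop_last: False).
steps_per_epoch = math.ceil(1485 / 16)
print(steps_per_epoch)                 # 93
print(round(10 / steps_per_epoch, 4))  # 0.1075 -> the first logged row
print(round(190 / steps_per_epoch, 4))  # 2.043 -> the saved checkpoint
```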
### Framework Versions
- Python: 3.12.12
- Sentence Transformers: 5.2.0
- Transformers: 4.57.3
- PyTorch: 2.9.0+cu126
- Accelerate: 1.12.0
- Datasets: 4.0.0
- Tokenizers: 0.22.2

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
```json
{
  "architectures": [
    "BertForSequenceClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "dtype": "float32",
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 384,
  "id2label": {
    "0": "LABEL_0"
  },
  "initializer_range": 0.02,
  "intermediate_size": 1536,
  "label2id": {
    "LABEL_0": 0
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "sentence_transformers": {
    "activation_fn": "torch.nn.modules.linear.Identity",
    "version": "5.2.0"
  },
  "transformers_version": "4.57.3",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
```
model.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:77e20bbd161576eccc72a4009e86c4c3b2270a3bcb445253af8352ff9f145252
size 133464836
special_tokens_map.json ADDED
```json
{
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
```
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
```json
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "extra_special_tokens": {},
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "never_split": null,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}
```
vocab.txt ADDED
The diff for this file is too large to render. See raw diff