Commit 3a67d7c (verified) by Alexhuou
1 parent: 038c398

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
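With only `pooling_mode_mean_tokens` enabled, the sentence vector is the average of the token vectors. The sketch below illustrates that idea in plain Python; it is not the library's implementation, and the function name `mean_pool` and the toy inputs are hypothetical. It assumes padded positions are marked with 0 in an attention mask and are left out of the average.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping padded positions (mask value 0)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [s / count for s in summed]


# Toy input: three token vectors of width 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)
```

In this model the vectors would have width 384 (`word_embedding_dimension`) rather than 2.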
README.md ADDED
@@ -0,0 +1,425 @@
1
+ ---
2
+ tags:
3
+ - sentence-transformers
4
+ - sentence-similarity
5
+ - feature-extraction
6
+ - generated_from_trainer
7
+ - dataset_size:5700
8
+ - loss:TripletLoss
9
+ base_model: thenlper/gte-small
10
+ widget:
11
+ - source_sentence: Suppose there is a correlation of r = 0.9 between number of hours
12
+ per day students study and GPAs. Which of the following is a reasonable conclusion?
13
+ sentences:
14
+ - 'Ulcerative Colitis
15
+
16
+ '
17
+ - Given that the sample has a standard deviation of zero, which of the following
18
+ is a true statement?
19
+ - Which of the following items is not subject to the application of intraperiod
20
+ income tax allocation?
21
+ - source_sentence: The natural law fallacy is
22
+ sentences:
23
+ - The Theory of _________ posits that 3 three levels of moral reasoning exist which
24
+ an individual can engage in to assess ethical issues, dependant on their cognitive
25
+ capacity.
26
+ - Which of the following is another name for the fallacy of amphiboly?
27
+ - 'Which of the following are plausible approaches to dealing with a model that
28
+ exhibits heteroscedasticity?
29
+
30
+
31
+ i) Take logarithms of each of the variables
32
+
33
+
34
+ ii) Use suitably modified standard errors
35
+
36
+
37
+ iii) Use a generalised least squares procedure
38
+
39
+
40
+ iv) Add lagged values of the variables to the regression equation.'
41
+ - source_sentence: When the ratio of brain size to body size is compared, which species
42
+ has a proportionally larger brain?
43
+ sentences:
44
+ - 'A proposed explanation for some phenomenon that may be derived initially from
45
+ empirical observation through a process called induction is a:'
46
+ - 'Let R be a ring and let U and V be (two-sided) ideals of R. Which of the following
47
+ must also be ideals of R ?
48
+
49
+ I. {u + v : u \in and v \in V}
50
+
51
+ II. {uv : u \in U and v \in V}
52
+
53
+ III. {x : x \in U and x \in V}'
54
+ - Find 3 over 4 − 1 over 8.
55
+ - source_sentence: The AH Protocol provides source authentication and data integrity,
56
+ but not
57
+ sentences:
58
+ - 'Ethnographic research produces qualitative data because:'
59
+ - Let V be the real vector space of all real 2 x 3 matrices, and let W be the real
60
+ vector space of all real 4 x 1 column vectors. If T is a linear transformation
61
+ from V onto W, what is the dimension of the subspace kernel of T?
62
+ - Which of the following is not a block cipher operating mode?
63
+ - source_sentence: 'This question refers to the following information.
64
+
65
+ Gunpowder Weaponry: Europe vs. China
66
+
67
+ In Western Europe during the 1200s through the 1400s, early cannons, as heavy
68
+ and as slow to fire as they were, proved useful enough in the protracted sieges
69
+ that dominated warfare during this period that governments found it sufficiently
70
+ worthwhile to pay for them and for the experimentation that eventually produced
71
+ gunpowder weapons that were both more powerful and easier to move. By contrast,
72
+ China, especially after the mid-1300s, was threatened mainly by highly mobile
73
+ steppe nomads, against whom early gunpowder weapons, with their unwieldiness,
74
+ proved of little utility. It therefore devoted its efforts to the improvement
75
+ of horse archer units who could effectively combat the country''s deadliest foe.
76
+
77
+ According to this passage, why did the Chinese, despite inventing gunpowder, fail
78
+ to lead in the innovation of gunpowder weaponry?'
79
+ sentences:
80
+ - Statement 1| Maximizing the likelihood of logistic regression model yields multiple
81
+ local optimums. Statement 2| No classifier can do better than a naive Bayes classifier
82
+ if the distribution of the data is known.
83
+ - What is the term for decisions limited by human capacity to absorb and analyse
84
+ information?
85
+ - 'This question refers to the following information.
86
+
87
+ By what principle of reason then, should these foreigners send in return a poisonous
88
+ drug? Without meaning to say that the foreigners harbor such destructive intentions
89
+ in their hearts, we yet positively assert that from their inordinate thirst after
90
+ gain, they are perfectly careless about the injuries they inflict upon us! And
91
+ such being the case, we should like to ask what has become of that conscience
92
+ which heaven has implanted in the breasts of all men? We have heard that in your
93
+ own country opium is prohibited with the utmost strictness and severity. This
94
+ is a strong proof that you know full well how hurtful it is to mankind. Since
95
+ you do not permit it to injure your own country, you ought not to have this injurious
96
+ drug transferred to another country, and above all others, how much less to the
97
+ Inner Land! Of the products which China exports to your foreign countries, there
98
+ is not one which is not beneficial to mankind in some shape or other.
99
+
100
+ Lin Zexu, Chinese trade commissioner, letter to Queen Victoria, 1839
101
+
102
+ On which of the following arguments does the author of the passage principally
103
+ base his appeal?'
104
+ pipeline_tag: sentence-similarity
105
+ library_name: sentence-transformers
106
+ ---
107
+
108
+ # SentenceTransformer based on thenlper/gte-small
109
+
110
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [thenlper/gte-small](https://huggingface.co/thenlper/gte-small). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
111
+
112
+ ## Model Details
113
+
114
+ ### Model Description
115
+ - **Model Type:** Sentence Transformer
116
+ - **Base model:** [thenlper/gte-small](https://huggingface.co/thenlper/gte-small) <!-- at revision 17e1f347d17fe144873b1201da91788898c639cd -->
117
+ - **Maximum Sequence Length:** 512 tokens
118
+ - **Output Dimensionality:** 384 dimensions
119
+ - **Similarity Function:** Cosine Similarity
120
+ <!-- - **Training Dataset:** Unknown -->
121
+ <!-- - **Language:** Unknown -->
122
+ <!-- - **License:** Unknown -->
123
+
124
+ ### Model Sources
125
+
126
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
127
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
128
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
129
+
130
+ ### Full Model Architecture
131
+
132
+ ```
133
+ SentenceTransformer(
134
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
135
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
136
+ )
137
+ ```
138
+
139
+ ## Usage
140
+
141
+ ### Direct Usage (Sentence Transformers)
142
+
143
+ First install the Sentence Transformers library:
144
+
145
+ ```bash
146
+ pip install -U sentence-transformers
147
+ ```
148
+
149
+ Then you can load this model and run inference.
150
+ ```python
151
+ from sentence_transformers import SentenceTransformer
152
+
153
+ # Download from the 🤗 Hub
154
+ model = SentenceTransformer("Alexhuou/embedder_model_STxmmluV3")
155
+ # Run inference
156
+ sentences = [
157
+ "This question refers to the following information.\nGunpowder Weaponry: Europe vs. China\nIn Western Europe during the 1200s through the 1400s, early cannons, as heavy and as slow to fire as they were, proved useful enough in the protracted sieges that dominated warfare during this period that governments found it sufficiently worthwhile to pay for them and for the experimentation that eventually produced gunpowder weapons that were both more powerful and easier to move. By contrast, China, especially after the mid-1300s, was threatened mainly by highly mobile steppe nomads, against whom early gunpowder weapons, with their unwieldiness, proved of little utility. It therefore devoted its efforts to the improvement of horse archer units who could effectively combat the country's deadliest foe.\nAccording to this passage, why did the Chinese, despite inventing gunpowder, fail to lead in the innovation of gunpowder weaponry?",
158
+ 'This question refers to the following information.\nBy what principle of reason then, should these foreigners send in return a poisonous drug? Without meaning to say that the foreigners harbor such destructive intentions in their hearts, we yet positively assert that from their inordinate thirst after gain, they are perfectly careless about the injuries they inflict upon us! And such being the case, we should like to ask what has become of that conscience which heaven has implanted in the breasts of all men? We have heard that in your own country opium is prohibited with the utmost strictness and severity. This is a strong proof that you know full well how hurtful it is to mankind. Since you do not permit it to injure your own country, you ought not to have this injurious drug transferred to another country, and above all others, how much less to the Inner Land! Of the products which China exports to your foreign countries, there is not one which is not beneficial to mankind in some shape or other.\nLin Zexu, Chinese trade commissioner, letter to Queen Victoria, 1839\nOn which of the following arguments does the author of the passage principally base his appeal?',
159
+ 'What is the term for decisions limited by human capacity to absorb and analyse information?',
160
+ ]
161
+ embeddings = model.encode(sentences)
162
+ print(embeddings.shape)
163
+ # [3, 384]
164
+
165
+ # Get the similarity scores for the embeddings
166
+ similarities = model.similarity(embeddings, embeddings)
167
+ print(similarities.shape)
168
+ # [3, 3]
169
+ ```
170
+
171
+ <!--
172
+ ### Direct Usage (Transformers)
173
+
174
+ <details><summary>Click to see the direct usage in Transformers</summary>
175
+
176
+ </details>
177
+ -->
178
+
179
+ <!--
180
+ ### Downstream Usage (Sentence Transformers)
181
+
182
+ You can finetune this model on your own dataset.
183
+
184
+ <details><summary>Click to expand</summary>
185
+
186
+ </details>
187
+ -->
188
+
189
+ <!--
190
+ ### Out-of-Scope Use
191
+
192
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
193
+ -->
194
+
195
+ <!--
196
+ ## Bias, Risks and Limitations
197
+
198
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
199
+ -->
200
+
201
+ <!--
202
+ ### Recommendations
203
+
204
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
205
+ -->
206
+
207
+ ## Training Details
208
+
209
+ ### Training Dataset
210
+
211
+ #### Unnamed Dataset
212
+
213
+ * Size: 5,700 training samples
214
+ * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>sentence_2</code>
215
+ * Approximate statistics based on the first 1000 samples:
216
+ | | sentence_0 | sentence_1 | sentence_2 |
217
+ |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
218
+ | type | string | string | string |
219
+ | details | <ul><li>min: 4 tokens</li><li>mean: 47.47 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 51.06 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 47.92 tokens</li><li>max: 512 tokens</li></ul> |
220
+ * Samples:
221
+ | sentence_0 | sentence_1 | sentence_2 |
222
+ |:-----------|:-----------|:-----------|
223
+ | <code>This question refers to the following information.<br>Let us not, I beseech you sir, deceive ourselves. Sir, we have done everything that could be done, to avert the storm which is now coming on. We have petitioned; we have remonstrated; we have supplicated; we have prostrated ourselves before the throne, and have implored its interposition to arrest the tyrannical hands of the ministry and Parliament. Our petitions have been slighted; our remonstrances have produced additional violence and insult; our supplications have been disregarded; and we have been spurned, with contempt, from the foot of the throne. In vain, after these things, may we indulge the fond hope of peace and reconciliation. There is no longer any room for hope.… It is in vain, sir, to extenuate the matter. Gentlemen may cry, Peace, Peace, but there is no peace. The war is actually begun! The next gale that sweeps from the north will bring to our ears the clash of resounding arms! Our brethren are already in the field! W...</code> | <code>This question refers to the following information.<br>"In one view the slaveholders have a decided advantage over all opposition. It is well to notice this advantage—the advantage of complete organization. They are organized; and yet were not at the pains of creating their organizations. The State governments, where the system of slavery exists, are complete slavery organizations. The church organizations in those States are equally at the service of slavery; while the Federal Government, with its army and navy, from the chief magistracy in Washington, to the Supreme Court, and thence to the chief marshalship at New York, is pledged to support, defend, and propagate the crying curse of human bondage. The pen, the purse, and the sword, are united against the simple truth, preached by humble men in obscure places."<br>Frederick Douglass, 1857<br>Frederick Douglass was most influenced by which of the following social movements?</code> | <code>Replacing supply chains with _______ enhances the importance of product _______as well as a fundamental redesign of every activity a firm engages in that produces _______.</code> |
224
+ | <code>Which of the following is a true statement about program documentation?</code> | <code>The boolean expression a[i] == max || !(max != a[i]) can be simplified to</code> | <code>The insurance program for poor people of all ages is called</code> |
225
+ | <code>If both parents are affected with the same autosomal recessive disorder then the probability that each of their children will be affected equals ___.</code> | <code>Which of the following conditions shows anticipation in paternal transmission?</code> | <code>From 1988 to 1990 among heterosexuals in the US, the number of unmarried adults aged 20 to 45 who report having multiple partners has:</code> |
226
+ * Loss: [<code>TripletLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#tripletloss) with these parameters:
227
+ ```json
228
+ {
229
+ "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
230
+ "triplet_margin": 5
231
+ }
232
+ ```
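The two parameters above can be read as a hinge on Euclidean distances: the loss is zero once the negative is at least `triplet_margin` farther from the anchor than the positive. The sketch below shows that formula in plain Python for a single triplet. It assumes the three columns are ordered anchor, positive, negative, which the card does not state explicitly, and the helper names and toy vectors are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=5.0):
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0)."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)


anchor = [0.0, 0.0]
near = [3.0, 4.0]   # distance 5 from the anchor
far = [6.0, 8.0]    # distance 10 from the anchor

# Negative is farther than the positive by exactly the margin: no penalty.
loss_satisfied = triplet_loss(anchor, near, far)
# Roles swapped: the "positive" is farther away, so the loss is large.
loss_violated = triplet_loss(anchor, far, near)
print(loss_satisfied, loss_violated)
```

During training the per-triplet values would be averaged over a batch; that step is omitted here.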
233
+
234
+ ### Training Hyperparameters
235
+ #### Non-Default Hyperparameters
236
+
237
+ - `num_train_epochs`: 5
238
+ - `multi_dataset_batch_sampler`: round_robin
239
+
240
+ #### All Hyperparameters
241
+ <details><summary>Click to expand</summary>
242
+
243
+ - `overwrite_output_dir`: False
244
+ - `do_predict`: False
245
+ - `eval_strategy`: no
246
+ - `prediction_loss_only`: True
247
+ - `per_device_train_batch_size`: 8
248
+ - `per_device_eval_batch_size`: 8
249
+ - `per_gpu_train_batch_size`: None
250
+ - `per_gpu_eval_batch_size`: None
251
+ - `gradient_accumulation_steps`: 1
252
+ - `eval_accumulation_steps`: None
253
+ - `torch_empty_cache_steps`: None
254
+ - `learning_rate`: 5e-05
255
+ - `weight_decay`: 0.0
256
+ - `adam_beta1`: 0.9
257
+ - `adam_beta2`: 0.999
258
+ - `adam_epsilon`: 1e-08
259
+ - `max_grad_norm`: 1
260
+ - `num_train_epochs`: 5
261
+ - `max_steps`: -1
262
+ - `lr_scheduler_type`: linear
263
+ - `lr_scheduler_kwargs`: {}
264
+ - `warmup_ratio`: 0.0
265
+ - `warmup_steps`: 0
266
+ - `log_level`: passive
267
+ - `log_level_replica`: warning
268
+ - `log_on_each_node`: True
269
+ - `logging_nan_inf_filter`: True
270
+ - `save_safetensors`: True
271
+ - `save_on_each_node`: False
272
+ - `save_only_model`: False
273
+ - `restore_callback_states_from_checkpoint`: False
274
+ - `no_cuda`: False
275
+ - `use_cpu`: False
276
+ - `use_mps_device`: False
277
+ - `seed`: 42
278
+ - `data_seed`: None
279
+ - `jit_mode_eval`: False
280
+ - `use_ipex`: False
281
+ - `bf16`: False
282
+ - `fp16`: False
283
+ - `fp16_opt_level`: O1
284
+ - `half_precision_backend`: auto
285
+ - `bf16_full_eval`: False
286
+ - `fp16_full_eval`: False
287
+ - `tf32`: None
288
+ - `local_rank`: 0
289
+ - `ddp_backend`: None
290
+ - `tpu_num_cores`: None
291
+ - `tpu_metrics_debug`: False
292
+ - `debug`: []
293
+ - `dataloader_drop_last`: False
294
+ - `dataloader_num_workers`: 0
295
+ - `dataloader_prefetch_factor`: None
296
+ - `past_index`: -1
297
+ - `disable_tqdm`: False
298
+ - `remove_unused_columns`: True
299
+ - `label_names`: None
300
+ - `load_best_model_at_end`: False
301
+ - `ignore_data_skip`: False
302
+ - `fsdp`: []
303
+ - `fsdp_min_num_params`: 0
304
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
305
+ - `fsdp_transformer_layer_cls_to_wrap`: None
306
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
307
+ - `deepspeed`: None
308
+ - `label_smoothing_factor`: 0.0
309
+ - `optim`: adamw_torch
310
+ - `optim_args`: None
311
+ - `adafactor`: False
312
+ - `group_by_length`: False
313
+ - `length_column_name`: length
314
+ - `ddp_find_unused_parameters`: None
315
+ - `ddp_bucket_cap_mb`: None
316
+ - `ddp_broadcast_buffers`: False
317
+ - `dataloader_pin_memory`: True
318
+ - `dataloader_persistent_workers`: False
319
+ - `skip_memory_metrics`: True
320
+ - `use_legacy_prediction_loop`: False
321
+ - `push_to_hub`: False
322
+ - `resume_from_checkpoint`: None
323
+ - `hub_model_id`: None
324
+ - `hub_strategy`: every_save
325
+ - `hub_private_repo`: None
326
+ - `hub_always_push`: False
327
+ - `gradient_checkpointing`: False
328
+ - `gradient_checkpointing_kwargs`: None
329
+ - `include_inputs_for_metrics`: False
330
+ - `include_for_metrics`: []
331
+ - `eval_do_concat_batches`: True
332
+ - `fp16_backend`: auto
333
+ - `push_to_hub_model_id`: None
334
+ - `push_to_hub_organization`: None
335
+ - `mp_parameters`:
336
+ - `auto_find_batch_size`: False
337
+ - `full_determinism`: False
338
+ - `torchdynamo`: None
339
+ - `ray_scope`: last
340
+ - `ddp_timeout`: 1800
341
+ - `torch_compile`: False
342
+ - `torch_compile_backend`: None
343
+ - `torch_compile_mode`: None
344
+ - `include_tokens_per_second`: False
345
+ - `include_num_input_tokens_seen`: False
346
+ - `neftune_noise_alpha`: None
347
+ - `optim_target_modules`: None
348
+ - `batch_eval_metrics`: False
349
+ - `eval_on_start`: False
350
+ - `use_liger_kernel`: False
351
+ - `eval_use_gather_object`: False
352
+ - `average_tokens_across_devices`: False
353
+ - `prompts`: None
354
+ - `batch_sampler`: batch_sampler
355
+ - `multi_dataset_batch_sampler`: round_robin
356
+
357
+ </details>
358
+
359
+ ### Training Logs
360
+ | Epoch | Step | Training Loss |
361
+ |:------:|:----:|:-------------:|
362
+ | 0.7013 | 500 | 1.9382 |
363
+ | 1.4025 | 1000 | 1.0882 |
364
+ | 2.1038 | 1500 | 0.8478 |
365
+ | 2.8050 | 2000 | 0.5961 |
366
+ | 3.5063 | 2500 | 0.5179 |
367
+ | 4.2076 | 3000 | 0.3774 |
368
+ | 4.9088 | 3500 | 0.3646 |
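The fractional epoch values in the table follow from the dataset size and batch size reported above. A minimal sketch of the arithmetic, assuming a single training device and that the last partial batch is kept (`dataloader_drop_last` is False); the variable and function names are illustrative only:

```python
import math

train_samples = 5700   # "Size: 5,700 training samples"
batch_size = 8         # per_device_train_batch_size

# 5700 / 8 = 712.5, so the final partial batch adds one more step.
steps_per_epoch = math.ceil(train_samples / batch_size)


def epoch_at(step):
    """Fraction of epochs completed after a given optimizer step."""
    return round(step / steps_per_epoch, 4)


for step in (500, 1000, 1500, 2000, 2500, 3000, 3500):
    print(step, epoch_at(step))
```

Under these assumptions five epochs come to 5 × 713 = 3565 steps, so the last logged row at step 3500 falls shortly before the end of training.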
369
+
370
+
371
+ ### Framework Versions
372
+ - Python: 3.11.13
373
+ - Sentence Transformers: 4.1.0
374
+ - Transformers: 4.52.4
375
+ - PyTorch: 2.6.0+cu124
376
+ - Accelerate: 1.7.0
377
+ - Datasets: 3.6.0
378
+ - Tokenizers: 0.21.1
379
+
380
+ ## Citation
381
+
382
+ ### BibTeX
383
+
384
+ #### Sentence Transformers
385
+ ```bibtex
386
+ @inproceedings{reimers-2019-sentence-bert,
387
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
388
+ author = "Reimers, Nils and Gurevych, Iryna",
389
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
390
+ month = "11",
391
+ year = "2019",
392
+ publisher = "Association for Computational Linguistics",
393
+ url = "https://arxiv.org/abs/1908.10084",
394
+ }
395
+ ```
396
+
397
+ #### TripletLoss
398
+ ```bibtex
399
+ @misc{hermans2017defense,
400
+ title={In Defense of the Triplet Loss for Person Re-Identification},
401
+ author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
402
+ year={2017},
403
+ eprint={1703.07737},
404
+ archivePrefix={arXiv},
405
+ primaryClass={cs.CV}
406
+ }
407
+ ```
408
+
409
+ <!--
410
+ ## Glossary
411
+
412
+ *Clearly define terms in order to be accessible across audiences.*
413
+ -->
414
+
415
+ <!--
416
+ ## Model Card Authors
417
+
418
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
419
+ -->
420
+
421
+ <!--
422
+ ## Model Card Contact
423
+
424
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
425
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.52.4",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "4.1.0",
+     "transformers": "4.52.4",
+     "pytorch": "2.6.0+cu124"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:90720ed366b4f6f46a644f514c18e4d38124a8dcfb9b3b02a4396d9c1709d27f
+ size 133462128
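The three lines above are a Git LFS pointer, not the weights themselves: they record the spec version, a SHA-256 digest of the real file, and its size in bytes. The sketch below parses such a pointer and derives a rough count of float32 values from the size. The function name `parse_lfs_pointer` is hypothetical, and the count is only an upper bound, since a safetensors file also carries a header alongside the tensor data.

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer into its version, hash and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:90720ed366b4f6f46a644f514c18e4d38124a8dcfb9b3b02a4396d9c1709d27f
size 133462128
"""

info = parse_lfs_pointer(pointer_text)
# config.json declares torch_dtype float32, i.e. 4 bytes per value.
max_float32_values = info["size_bytes"] // 4
print(info["hash_algorithm"], info["size_bytes"], max_float32_values)
```

The result, roughly 33.4 million values, is in line with a small 12-layer BERT-style encoder with hidden size 384.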
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,65 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "max_length": 128,
+   "model_max_length": 1000000000000000019884624838656,
+   "never_split": null,
+   "pad_to_multiple_of": null,
+   "pad_token": "[PAD]",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "[SEP]",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff