Alexhuou committed on
Commit 8d194ec · verified · 1 Parent(s): a924ef6

Add new SentenceTransformer model
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
  "word_embedding_dimension": 384,
  "pooling_mode_cls_token": false,
  "pooling_mode_mean_tokens": true,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
README.md ADDED
@@ -0,0 +1,453 @@
---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:5700
- loss:TripletLoss
base_model: thenlper/gte-small
widget:
- source_sentence: Perchloric acid (HClO4) is considered one of the stronger acids
    in existence. Which of the following statements corresponds most accurately with
    strong acids?
  sentences:
  - Who argued that if an organization did not affect a public then there was no need
    for a practitioner to consider that public in its communications?
  - 'Glycogen breakdown in exercising muscle is activated by:'
  - The collision theory of reaction rates does not include
- source_sentence: 'In Aristotle’s terminology, incontinence is when:'
  sentences:
  - 'This question refers to the following information.

    The pair of excerpts below is written by explorer Christopher Columbus and the
    Dominican Bishop of Chiapas, Mexico, Bartholomew de las Casas.

    Source 1

    Indians would give whatever the seller required. . . . Thus they bartered, like
    idiots, cotton and gold for fragments of bows, glasses, bottles, and jars; which
    I forbad as being unjust, and myself gave them many beautiful and acceptable articles
    which I had brought with me, taking nothing from them in return; I did this in
    order that I might the more easily conciliate them, that they might be led to
    become Christians, and be inclined to entertain a regard for the King and Queen,
    our Princes and all Spaniards, and that I might induce them to take an interest
    in seeking out, and collecting and delivering to us such things as they possessed
    in abundance, but which we greatly needed.

    —Christopher Columbus: letter to Raphael Sanchez, 1493

    Source 2

    It was upon these gentle lambs . . . that from the very first day they clapped
    eyes on them the Spanish fell like ravening wolves upon the fold, or like tigers
    and savage lions who have not eaten meat for days. The pattern established at
    the outset has remained unchanged to this day, and the Spaniards still do nothing
    save tear the natives to shreds, murder them and inflict upon them untold misery,
    suffering and distress, tormenting, harrying and persecuting them mercilessly.
    We shall in due course describe some of the many ingenious methods of torture
    they have invented and refined for this purpose, but one can get some idea of
    the effectiveness of their methods from the figures alone. When the Spanish first
    journeyed there, the indigenous population of the island of Hispaniola stood at
    some three million; today only two hundred survive. Their reason for killing and
    destroying such an infinite number of souls is that the Christians have an ultimate
    aim, which is to acquire gold, and to swell themselves with riches in a very brief
    time and thus rise to a high estate disproportionate to their merits.

    —Bartholomew de las Casas: A Short Account of the Destruction of the Indies, 1542

    Which of the following would best account for the differences between the interactions
    of the Spaniards and the natives as described in the two accounts?'
  - 'For Plato, ordinary sensible objects exist and are knowable as examples or instances
    of Ideas or "Forms" that do not exist in our ordinary sensible world. Forms do
    not exist in the sensible world because:'
  - A solution of a weak base is titrated with a solution of a standard strong acid.
    The progress of the titration is followed with a pH meter. Which of the following
    observations would occur?
- source_sentence: Which of the following causes more deaths globally each year (as
    of 2017)?
  sentences:
  - About what percentage of survey respondents from China report having paid a bribe
    in the last year to access public services (such as education; judiciary; medical
    and health; police; registry and permit services; utilities; tax revenue and customs;
    and land service) as of 2017?
  - ' In response to the objection that it would be wrong to prohibit the manufacture
    and sale of fatty foods and tobacco products, de Marneffe argues that'
  - Which of the following about meiosis is NOT true?
- source_sentence: What is 'unilateral acts'?
  sentences:
  - Which of the following statements is true concerning the population regression
    function (PRF) and sample regression function (SRF)?
  - Find the maximum possible order for some element of Z_8 x Z_10 x Z_24.
  - What is jus cogens?
- source_sentence: 'This question refers to the following information.

    "Those whose condition is such that their function is the use of their bodies
    and nothing better can be expected of them, those, I say, are slaves of nature.
    It is better for them to be ruled thus."

    Juan de Sepulveda, Politics, 1522

    "When Latin American nations gained independence in the 19th century, those two
    strains converged, and merged with an older, more universalist, natural law tradition.
    The result was a distinctively Latin American form of rights discourse. Paolo
    Carozza traces the roots of that discourse to a distinctive application, and extension,
    of Thomistic moral philosophy to the injustices of Spanish conquests in the New
    World. The key figure in that development seems to have been Bartolomé de Las
    Casas, a 16th-century Spanish bishop who condemned slavery and championed the
    cause of Indians on the basis of a natural right to liberty grounded in their
    membership in a single common humanity. ''All the peoples of the world are humans,''
    Las Casas wrote, and ''all the races of humankind are one.'' According to Brian
    Tierney, Las Casas and other Spanish Dominican philosophers laid the groundwork
    for a doctrine of natural rights that was independent of religious revelation
    ''by drawing on a juridical tradition that derived natural rights and natural
    law from human rationality and free will, and by appealing to Aristotelian philosophy.''"

    Mary Ann Glendon, "The Forgotten Crucible: The Latin American Influence on the
    Universal Human Rights Idea,” 2003

    Which one of the following statements about the Spanish conquest of the Americas
    is most accurate?'
  sentences:
  - 'Statement 1 | If T: V -> W is a linear transformation and dim(V ) < dim(W) <
    1, then T must be injective. Statement 2 | Let dim(V) = n and suppose that T:
    V -> V is linear. If T is injective, then it is a bijection.'
  - If the finite group G contains a subgroup of order seven but no element (other
    than the identity) is its own inverse, then the order of G could be
  - 'This question refers to the following information.

    "One-half of the people of this nation to-day are utterly powerless to blot from
    the statute books an unjust law, or to write there a new and a just one. The women,
    dissatisfied as they are with this form of government, that enforces taxation
    without representation,—that compels them to obey laws to which they have never
    given their consent,—that imprisons and hangs them without a trial by a jury of
    their peers, that robs them, in marriage, of the custody of their own persons,
    wages and children,—are this half of the people left wholly at the mercy of the
    other half, in direct violation of the spirit and letter of the declarations of
    the framers of this government, every one of which was based on the immutable
    principle of equal rights to all."

    —Susan B. Anthony, "I Stand Before You Under Indictment" (speech), 1873

    Which of the following statements best represents the criticism of Andrew Carnegie
    found in this cartoon?'
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# SentenceTransformer based on thenlper/gte-small

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [thenlper/gte-small](https://huggingface.co/thenlper/gte-small). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [thenlper/gte-small](https://huggingface.co/thenlper/gte-small) <!-- at revision 17e1f347d17fe144873b1201da91788898c639cd -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
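The Pooling module in the architecture above performs masked mean pooling: token embeddings are summed over the sequence and divided by the number of non-padding tokens. A minimal numpy sketch of that operation (the toy embeddings and attention mask below are made-up values, not outputs of this model):

```python
import numpy as np

def mean_pooling(token_embeddings, attention_mask):
    """Average token embeddings, ignoring padding positions (mask == 0)."""
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # avoid division by zero
    return summed / counts

# Toy example: batch of 1, sequence length 3, dimension 2; the last token is padding.
emb = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = np.array([[1, 1, 0]])
print(mean_pooling(emb, mask))  # [[2. 3.]]
```

Note that the padding token's values are excluded entirely, which is why the result averages only the first two token vectors.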

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Alexhuou/embedder_model_FT")
# Run inference
sentences = [
    'This question refers to the following information.\n"Those whose condition is such that their function is the use of their bodies and nothing better can be expected of them, those, I say, are slaves of nature. It is better for them to be ruled thus."\nJuan de Sepulveda, Politics, 1522\n"When Latin American nations gained independence in the 19th century, those two strains converged, and merged with an older, more universalist, natural law tradition. The result was a distinctively Latin American form of rights discourse. Paolo Carozza traces the roots of that discourse to a distinctive application, and extension, of Thomistic moral philosophy to the injustices of Spanish conquests in the New World. The key figure in that development seems to have been Bartolomé de Las Casas, a 16th-century Spanish bishop who condemned slavery and championed the cause of Indians on the basis of a natural right to liberty grounded in their membership in a single common humanity. \'All the peoples of the world are humans,\' Las Casas wrote, and \'all the races of humankind are one.\' According to Brian Tierney, Las Casas and other Spanish Dominican philosophers laid the groundwork for a doctrine of natural rights that was independent of religious revelation \'by drawing on a juridical tradition that derived natural rights and natural law from human rationality and free will, and by appealing to Aristotelian philosophy.\'"\nMary Ann Glendon, "The Forgotten Crucible: The Latin American Influence on the Universal Human Rights Idea,” 2003\nWhich one of the following statements about the Spanish conquest of the Americas is most accurate?',
    'This question refers to the following information.\n"One-half of the people of this nation to-day are utterly powerless to blot from the statute books an unjust law, or to write there a new and a just one. The women, dissatisfied as they are with this form of government, that enforces taxation without representation,—that compels them to obey laws to which they have never given their consent,—that imprisons and hangs them without a trial by a jury of their peers, that robs them, in marriage, of the custody of their own persons, wages and children,—are this half of the people left wholly at the mercy of the other half, in direct violation of the spirit and letter of the declarations of the framers of this government, every one of which was based on the immutable principle of equal rights to all."\n—Susan B. Anthony, "I Stand Before You Under Indictment" (speech), 1873\nWhich of the following statements best represents the criticism of Andrew Carnegie found in this cartoon?',
    'If the finite group G contains a subgroup of order seven but no element (other than the identity) is its own inverse, then the order of G could be',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
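`model.similarity` uses cosine similarity for this model (see `similarity_fn_name` in `config_sentence_transformers.json`). The score it computes for a pair of embeddings can be sketched in plain numpy (the vectors below are illustrative toy values, not real embeddings):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors: dot product of unit-normalized vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

u = np.array([1.0, 0.0, 1.0])
v = np.array([1.0, 0.0, 0.0])
print(round(cosine_similarity(u, v), 4))  # 0.7071
```

Scores range from -1 (opposite direction) to 1 (same direction); for sentence embeddings, higher scores indicate more semantically similar texts.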

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 5,700 training samples
* Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>sentence_2</code>
* Approximate statistics based on the first 1000 samples:
  |         | sentence_0                                                                         | sentence_1                                                                         | sentence_2                                                                         |
  |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
  | type    | string                                                                              | string                                                                              | string                                                                              |
  | details | <ul><li>min: 3 tokens</li><li>mean: 44.65 tokens</li><li>max: 512 tokens</li></ul>  | <ul><li>min: 3 tokens</li><li>mean: 44.72 tokens</li><li>max: 512 tokens</li></ul>  | <ul><li>min: 3 tokens</li><li>mean: 47.14 tokens</li><li>max: 512 tokens</li></ul>  |
* Samples:
  | sentence_0 | sentence_1 | sentence_2 |
  |:-----------|:-----------|:-----------|
  | <code>This question refers to the following information.<br>"When the Portuguese go from Macao in China to Japan, they carry much white silk, gold, musk, and porcelain: and they bring from Japan nothing but silver. They have a great carrack which goes there every year and she brings from there every year about six hundred coins: and all this silver of Japan, and two hundred thousand coins more in silver which they bring yearly out of India, they employ to their great advantage in China: and they bring from there gold, musk, silk, copper, porcelains, and many other things very costly and gilded.<br>When the Portuguese come to Canton in China to traffic, they must remain there but certain days: and when they come in at the gate of the city, they must enter their names in a book, and when they go out at night they must put out their names. They may not lie in the town all night, but must lie in their boats outside of the town. And, their time expired, if any man remains there, he is imprisoned."<br>Ralp...</code> | <code>This question refers to the following information.<br>Although in Protestant Europe, [Peter the Great] was surrounded by evidence of the new civil and political rights of individual men embodied in constitutions, bills of rights and parliaments, he did not return to Russia determined to share power with his people. On the contrary, he returned not only determined to change his country but also convinced that if Russia was to be transformed, it was he who must provide both the direction and the motive force. He would try to lead; but where education and persuasion were not enough, he could drive—and if necessary flog—the backward nation forward.<br>—Robert K. Massie, Peter the Great: His Life and World<br>Based on the above passage, what kinds of reforms did Peter the Great embrace?</code> | <code>This question refers to the following information.<br>Now, we have organized a society, and we call it "Share Our Wealth Society," a society with the motto "Every Man a King."…<br>We propose to limit the wealth of big men in the country. There is an average of $15,000 in wealth to every family in America. That is right here today.<br>We do not propose to divide it up equally. We do not propose a division of wealth, but we do propose to limit poverty that we will allow to be inflicted on any man's family. We will not say we are going to try to guarantee any equality … but we do say that one third of the average is low enough for any one family to hold, that there should be a guarantee of a family wealth of around $5,000; enough for a home, an automobile, a radio, and the ordinary conveniences, and the opportunity to educate their children.…<br>We will have to limit fortunes. Our present plan is that we will allow no man to own more than $50,000,000. We think that with that limit we will be able to ...</code> |
  | <code>This question refers to the following information.<br>An Act to place certain restrictions on Immigration and to provide for the removal from the Commonwealth of Prohibited Immigrants.<br>…<br>3. The immigration into the Commonwealth of the persons described in any of the following paragraphs in this section (hereinafter called "prohibited immigrants") is prohibited, namely<br>(a) Any person who when asked to do so by an officer fails to write out at dictation and sign in the presence of the officer a passage of fifty words in length in a European language directed by the officer;<br>(b) Any person in the opinion of the Minister or of an officer to become a charge upon the public or upon any public or charitable organisation;<br>…<br>(g) Any persons under a contract or agreement to perform manual labour within the Commonwealth: Provided that this paragraph shall not apply to workmen exempted by the Minister for special skill required by Australia…<br>Immigration Restriction Act of 1901 (Australia)<br>Whereas in ...</code> | <code>This question refers to the following information.<br>"My little homestead in the city, which I recently insured for £2,000 would no doubt have shared the common fate, as the insurance companies will not make good that which is destroyed by the Queen's enemies. And although I have a farm of 50 acres close to the town, no doubt the crops and premises would have been destroyed. In fact, this has already partly been the case, and I am now suing the Government for damages done by a contingent of 1,500 natives that have recently encamped not many hundred yards from the place, who have done much damage all around."<br>Letter from a British citizen to his sister during the Anglo-Zulu War, South Africa, 1879<br>Incidents such as those described by the author of the letter were used by the British government to do which of the following?</code> | <code>If the price of a product decreases with the price of a substitute product remaining constant such that the consumer buys more of this product, this is called the</code> |
  | <code> The term 'marketing mix' describes:</code> | <code>____________________ are those who address the same target market but provide a different offering to satisfy the market need, for example Spotify, Sony, and Apple's iPod.</code> | <code>The anatomic location of the spinal canal is</code> |
* Loss: [<code>TripletLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#tripletloss) with these parameters:
  ```json
  {
      "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
      "triplet_margin": 5
  }
  ```
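With `TripletDistanceMetric.EUCLIDEAN` and `triplet_margin: 5`, the loss for a single (anchor, positive, negative) triple is `max(d(a, p) - d(a, n) + 5, 0)`: the negative must be at least 5 units farther from the anchor than the positive before the loss reaches zero. A small numpy sketch with made-up 2-D vectors (real training uses 384-dimensional embeddings):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=5.0):
    """Euclidean triplet loss: penalize when the negative is not at least
    `margin` farther from the anchor than the positive."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(d_pos - d_neg + margin, 0.0)

a = np.array([0.0, 0.0])
p = np.array([1.0, 0.0])   # distance 1 from the anchor
n = np.array([3.0, 4.0])   # distance 5 from the anchor
print(triplet_loss(a, p, n))  # 1.0  (= 1 - 5 + 5)
```

Minimizing this over the dataset's (sentence_0, sentence_1, sentence_2) triples pulls related questions together in embedding space while pushing unrelated ones apart.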

### Training Hyperparameters
#### Non-Default Hyperparameters

- `multi_dataset_batch_sampler`: round_robin

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 8
- `per_device_eval_batch_size`: 8
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 3
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin

</details>

### Training Logs
| Epoch  | Step | Training Loss |
|:------:|:----:|:-------------:|
| 1.4006 | 500  | 1.7059        |
| 2.8011 | 1000 | 0.88          |
| 0.7013 | 500  | 0.7526        |
| 1.4025 | 1000 | 0.6176        |
| 2.1038 | 1500 | 0.4602        |
| 2.8050 | 2000 | 0.3472        |


### Framework Versions
- Python: 3.11.13
- Sentence Transformers: 4.1.0
- Transformers: 4.52.4
- PyTorch: 2.6.0+cu124
- Accelerate: 1.7.0
- Datasets: 3.6.0
- Tokenizers: 0.21.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### TripletLoss
```bibtex
@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,24 @@
{
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 384,
  "initializer_range": 0.02,
  "intermediate_size": 1536,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.52.4",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "4.1.0",
+     "transformers": "4.52.4",
+     "pytorch": "2.6.0+cu124"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
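The `similarity_fn_name` above selects cosine similarity as the score between sentence embeddings. A minimal sketch of that function over two hypothetical vectors:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity: the dot product of the two L2-normalized vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0 (identical direction)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal)
```

Because the score depends only on direction, not magnitude, it is a common default for comparing embeddings of different sentence lengths.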
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:837776686b8721399a6895badb8844c97878a2ab31c0bf9f83215046f64055c7
+ size 133462128
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
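The modules.json above chains a Transformer module into a Pooling module: the transformer emits one 384-dimensional vector per token (`hidden_size` in config.json), and the pooling layer with `pooling_mode_mean_tokens` averages them into a single sentence embedding, ignoring padding. A minimal NumPy sketch of that mean-pooling step, with made-up token embeddings standing in for the transformer output:

```python
import numpy as np

# Hypothetical transformer output: 2 sentences, 4 token positions, 384 dims.
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(2, 4, 384))
# Attention mask: the second sentence ends in one padding token.
attention_mask = np.array([[1, 1, 1, 1],
                           [1, 1, 1, 0]])

def mean_pool(token_embeddings, attention_mask):
    """Average over real (non-padding) tokens, as pooling_mode_mean_tokens does."""
    mask = attention_mask[..., None]           # (batch, seq, 1), broadcast over dims
    summed = (token_embeddings * mask).sum(axis=1)
    counts = mask.sum(axis=1)                  # real-token count per sentence
    return summed / counts

sentence_embeddings = mean_pool(token_embeddings, attention_mask)
print(sentence_embeddings.shape)  # (2, 384)
```

Masking before averaging matters: without it, padding positions would drag every short sentence's embedding toward the padding vector.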
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,65 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "max_length": 128,
+   "model_max_length": 1000000000000000019884624838656,
+   "never_split": null,
+   "pad_to_multiple_of": null,
+   "pad_token": "[PAD]",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "[SEP]",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff